CN116185520A - Construction method, system, equipment and medium for Android call graph - Google Patents

Publication number
CN116185520A
Authority
CN
China
Prior art keywords
call
graph
android
statement
function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211579498.2A
Other languages
Chinese (zh)
Inventor
付才
李悦
许浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Dajia Data Technology Co ltd
Huazhong University of Science and Technology
Original Assignee
Hunan Dajia Data Technology Co ltd
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Dajia Data Technology Co ltd, Huazhong University of Science and Technology
Priority to CN202211579498.2A
Publication of CN116185520A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)
  • Stored Programmes (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses an Android call graph construction method, system, device and medium. The method comprises: obtaining an Android apk file and constructing an original call graph of the Android App from it; screening, through a function screening model, the system calls in the original call graph that take callback functions as parameters, to obtain a system call context; searching according to the system call context and a preset algorithm to obtain a data flow analysis result; constructing a virtual system call function body according to the system call context and the data flow analysis result; and generating a call sub-graph from the virtual system call function body and adding it to the original call graph to obtain the Android call graph. This solves the problem that callback functions are missing from the call graph during Android App analysis, and improves the code coverage and the precision of Android call graph construction.

Description

Construction method, system, equipment and medium for Android call graph
Technical Field
The invention relates to the technical field of software security, and in particular to an Android call graph construction method, system, device and medium.
Background
Owing to the very high market share and huge user base of the Android system, Android Apps have developed rapidly in recent years; at the same time, the explosive growth of mobile data has increased the harm caused by malicious Android Apps.
Android code analysis relies mainly on two techniques, static analysis and dynamic analysis. Dynamic analysis captures the real-time state of a running program, but its code coverage is low and its analysis precision is limited. Static analysis analyzes the executable code directly, so it can reach very high code coverage, has a global view of the program, and can observe code behavior clearly. Call-graph-based analysis is an important static analysis method for Android code: the call graph is the data structure that describes the call relations between procedures, and it makes inter-procedural data flow analysis convenient, so that code behavior can be analyzed more accurately. For an Android system call that takes a callback interface parameter, the system invokes the corresponding callback function after the system call executes, depending on the runtime conditions. The callback function is implemented by the developer, who overrides the callback interface function provided by the Android SDK to implement custom functionality.
An edge of a call graph needs to record the parent function, the statement in which the parent function calls the child function, and the child function. For a system call with a callback interface parameter, the parent function is the system call, the call statement is located in the Android kernel rather than in the code of the Android App, and the child function is the callback function. Because the original call graph generation algorithm cannot observe the Android kernel, the call graph edge from the system call to the callback function cannot be constructed, and the callback function does not appear in the call graph. Traditional Android call graph construction can therefore only reach the level of the Android system call; for an Android system call with a callback interface parameter, it cannot obtain the callback function information.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems existing in the prior art. To this end, it provides an Android call graph construction method, system, device and medium, which solve the problem that callback functions are missing from the call graph during Android App analysis, and improve the code coverage and the precision of Android call graph construction.
The invention provides an Android call graph construction method, which comprises the following steps:
acquiring an android apk file, and constructing an android App original call graph according to the android apk file;
screening system call with callback function as parameter in the android App original call graph through a function screening model to obtain system call context;
searching according to the system call context and a preset algorithm to obtain a data flow analysis result;
constructing a virtual system call function body according to the system call context and the data flow analysis result;
generating a calling sub-graph according to the virtual system calling function body, and adding the calling sub-graph into the android App original calling graph to obtain an android calling graph.
According to the embodiment of the invention, at least the following technical effects are achieved:
according to the method, an android App file is obtained, an android App original call graph is built according to the android App file, a function screening model is used for screening system calls taking callback functions as parameters in the android App original call graph, a system call context is obtained, searching is conducted according to the system call context and a preset algorithm, a data flow analysis result is obtained, a virtual system call function body is built according to the system call context and the data flow analysis result, a call sub-graph is generated according to the virtual system call function body, and the call sub-graph is added into the android App original call graph to obtain the android call graph, so that the problem that the callback functions are not in the call graph during the android App analysis is solved, and the code coverage rate and the android call graph construction accuracy of the android call graph are improved.
According to some embodiments of the present invention, the constructing an android App original call graph according to the android apk file includes:
analyzing the android apk file to obtain an android compiled version of the file;
selecting the android compression package of the corresponding android compiled version;
and constructing the Android App original call graph from the Android compressed package by means of the CHA algorithm.
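The CHA (class hierarchy analysis) step can be illustrated with a minimal sketch. The toy class table and the names `CLASSES`, `dispatch` and `cha_targets` are hypothetical and are not taken from the patent; a real implementation would read the hierarchy from the apk and the selected Android compressed package.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: class name -> (superclass, methods it declares).
CLASSES = {
    "Object": (None, {"toString"}),
    "Shape": ("Object", {"area"}),
    "Circle": ("Shape", {"area"}),
    "Square": ("Shape", set()),
}

def subclasses_of(cls):
    """Return cls together with all of its transitive subclasses."""
    children = defaultdict(list)
    for name, (parent, _) in CLASSES.items():
        if parent is not None:
            children[parent].append(name)
    result, stack = [], [cls]
    while stack:
        current = stack.pop()
        result.append(current)
        stack.extend(children[current])
    return result

def dispatch(cls, method):
    """Walk up the hierarchy to the class whose implementation cls would run."""
    while cls is not None:
        parent, methods = CLASSES[cls]
        if method in methods:
            return cls
        cls = parent
    return None

def cha_targets(declared_type, method):
    """CHA: any subtype of the declared receiver type may be the runtime receiver."""
    targets = set()
    for sub in subclasses_of(declared_type):
        owner = dispatch(sub, method)
        if owner is not None:
            targets.add(f"{owner}.{method}")
    return targets
```

CHA is conservative: it adds an edge to every implementation that some subtype of the declared type could run, which keeps coverage high at the cost of some spurious edges.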
According to some embodiments of the present invention, the filtering, by a function filtering model, a system call including a callback function as a parameter in the android App original call graph to obtain a system call context includes:
traversing all edges of the Android App original call graph through a function screening model to obtain the functions represented by the destination nodes of the edges that are system calls: $sys\_call_1, sys\_call_2, \ldots, sys\_call_i, \ldots, sys\_call_N$, wherein $sys\_call_i$ is the $i$-th system call and $N$ is the total number of system calls;
acquiring a parameter sequence of the system call;
judging whether each parameter of the parameter sequence is a callback interface type, and if so, acquiring the parent function $m_{i,j}$ of the parameter of the system call and the execution statement $stmt_{i,j}$ of the parameter of the system call, wherein $m_{i,j}$ is the parent function of the $j$-th parameter of the $i$-th system call and $stmt_{i,j}$ is the execution statement of the $j$-th parameter of the $i$-th system call;
establishing a mapping from $m_{i,j}$ to $stmt_{i,j}$ to obtain the system call context $m_{i,j} \rightarrow stmt_{i,j}$ of $sys\_call_i$, and thereby obtaining the system call context of each system call.
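The screening steps above can be sketched as follows. This is a minimal illustration only: the `Edge` record, the package-prefix test in `is_system_call` and the `CALLBACK_INTERFACES` set are assumptions standing in for the function screening model, whose internals the text does not specify.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    caller: str         # parent function m
    stmt: str           # call statement inside the parent function
    callee: str         # function represented by the edge's destination node
    param_types: tuple  # declared parameter types of the callee

# Hypothetical set of types treated as callback interfaces.
CALLBACK_INTERFACES = {"android.view.View$OnClickListener", "java.lang.Runnable"}

def is_system_call(callee):
    # Assumption: a system call is recognised by its package prefix.
    return callee.startswith(("android.", "java."))

def screen_contexts(edges):
    """Map each system call to (parameter index j, parent function, statement)."""
    contexts = {}
    for edge in edges:
        if not is_system_call(edge.callee):
            continue
        for j, param_type in enumerate(edge.param_types):
            if param_type in CALLBACK_INTERFACES:
                contexts.setdefault(edge.callee, []).append(
                    (j, edge.caller, edge.stmt))
    return contexts
```

Each recorded triple corresponds to one mapping $m_{i,j} \rightarrow stmt_{i,j}$; system calls without a callback-typed parameter are dropped.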
According to some embodiments of the present invention, the searching according to the system call context and a preset algorithm to obtain a data flow analysis result includes:
acquiring call expressions of execution statements of all parameters of the system call;
searching all parameters of the calling expression to obtain parameters of callback interface types in all parameters of the calling expression;
constructing a control flow graph according to the parent function of the parameter of the system call;
searching, through a context-sensitive alias analysis algorithm and according to the control flow graph, for variables that point to the same object as the parameter of the callback interface type, to obtain a variable set;
judging whether the variable set is a local variable generated by a new statement in the father function body, a return value obtained by executing other calling statements in the father function body, parameters of the father function, class member variables or static class member variables, and if the variable set is the local variable generated by the new statement in the father function body, the return value obtained by executing other calling statements in the father function body, parameters of the father function, class member variables or static class member variables, searching all instance types possibly pointed by the variable set by calling a preset algorithm to obtain the data stream analysis result.
According to some embodiments of the invention, the constructing a control flow graph according to the parent function of the parameter of the system call includes:
acquiring the current statement of the parent function of the parameter of the system call;
judging the type of the current statement; if the current statement is a sequentially executed statement, judging whether it has a next statement, and if so, adding the current statement to the predecessor node list of the next statement and adding the next statement to the successor node list of the current statement, wherein each node of the control flow graph is a statement of the function;
if the current statement is a branch statement, acquiring the next statement executed under each condition of the branch, adding the current statement to the predecessor node list of each such next statement, and adding each such next statement to the successor node list of the current statement;
and so on until the last statement of the parent function has been processed, to obtain the control flow graph.
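The predecessor and successor bookkeeping described above can be sketched as follows. The statement encoding (a `kind` plus explicit branch targets) is a hypothetical simplification; real bytecode statements would carry more information.

```python
def build_cfg(stmts):
    """Build predecessor and successor lists for a function body.

    stmts is a list of (kind, targets) pairs:
      'seq'    falls through to the next statement, if there is one,
      'branch' continues at every index listed in targets,
      'return' has no successor.
    """
    preds = {i: [] for i in range(len(stmts))}
    succs = {i: [] for i in range(len(stmts))}
    for i, (kind, targets) in enumerate(stmts):
        if kind == "seq":
            next_stmts = [i + 1] if i + 1 < len(stmts) else []
        elif kind == "branch":
            next_stmts = list(targets)
        else:
            next_stmts = []
        for n in next_stmts:
            succs[i].append(n)  # n is a successor of the current statement
            preds[n].append(i)  # the current statement is a predecessor of n
    return preds, succs
```

For a body such as `[("seq", ()), ("branch", (2, 3)), ("seq", ()), ("return", ())]`, statement 3 ends up with two predecessors: the branch itself and the fall-through from statement 2.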
According to some embodiments of the present invention, the searching, by the context-sensitive alias analysis algorithm, for a variable having the same direction as the parameter of the callback interface type according to the control flow graph, to obtain a variable set includes:
acquiring all statements of the control flow graph and constructing a program point state set from them, wherein each program point state comprises a statement of the control flow graph, the variable set before the statement is executed and the variable set after the statement is executed, and each element of these two variable sets is a set of variable names that point to the same object;
taking a first program point state out of the program point state set, and assigning its variable set before execution of the statement to its variable set after execution of the statement, to obtain the assigned variable set after execution;
judging whether the statement of the first program point state is an assignment statement and, if so, whether the right value of the assignment statement is a variable; if the right value is a variable, merging the sets that contain the left value and the right value in the assigned variable set after execution to obtain a first updated first program point state, and adding the first updated first program point state to the program point state set; if the right value is a constant, deleting the left value from the set that contains it in the assigned variable set after execution to obtain a second updated first program point state, and adding the second updated first program point state to the program point state set;
taking a second program point state out of the program point state set and processing it in the same way: judging whether its statement is an assignment statement and, if so, whether the right value is a variable; if the right value is a variable, merging the sets that contain the left value and the right value to obtain a first updated second program point state and adding it to the program point state set; if the right value is a constant, deleting the left value from the set that contains it to obtain a second updated second program point state and adding it to the program point state set; and so on in sequence until the program point state set is empty.
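The update rules for assignment statements can be illustrated on straight-line code. This sketch is a deliberate simplification and not the patent's algorithm: it ignores branches and calling context, and it removes the left value from its old alias set before joining it to the right value's set, which is an assumption of this sketch. The statement encoding and the function name `alias_sets` are hypothetical.

```python
def alias_sets(stmts, tracked):
    """Return the names that point to the same object as `tracked`.

    stmts is a list of (left, right, right_is_variable) assignments
    executed in order.
    """
    sets = []  # each element is a set of variable names with the same target

    def find(var):
        for group in sets:
            if var in group:
                return group
        return None

    def kill(var):
        # Remove var from the alias set it currently belongs to, if any.
        group = find(var)
        if group is not None:
            group.discard(var)
            if not group:
                sets.remove(group)

    for left, right, right_is_variable in stmts:
        if right_is_variable:
            if left == right:
                continue
            kill(left)
            group = find(right)
            if group is None:
                group = {right}
                sets.append(group)
            group.add(left)   # left now points where right points
        else:
            kill(left)        # left is overwritten by a constant
    group = find(tracked)
    return set(group) if group else {tracked}
```

With `a = listener; b = a; a = null`, the variables still aliased to `listener` are `listener` and `b`.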
According to some embodiments of the present invention, if the variable set is a local variable generated by a new statement in the parent function body, a return value obtained by executing other call statements in the parent function body, a parameter of the parent function, a class member variable or a static class member variable, the data flow analysis result is obtained by calling a preset algorithm to find all instance types to which the variable set may point, including:
If the variable set is a local variable generated by a new statement in the father function body, acquiring an instance type in an expression of the new statement to obtain the data flow analysis result;
if the variable set is a return value obtained by executing other calling sentences in the father function body, acquiring the function body called by the calling sentences, traversing the function body to acquire all the return sentences of the function body and the return variables of the return sentences, judging whether the return variables are local variables generated by new sentences in the father function body, return values obtained by executing other calling sentences in the father function body, parameters of the father function, class member variables or static class member variables, and if the return variables are local variables generated by new sentences in the father function body, return values obtained by executing other calling sentences in the father function body, parameters of the father function, class member variables or static class member variables, searching all instance types possibly pointed by the return variables by calling preset algorithms to acquire the data flow analysis result;
if the variable set is a parameter of the father function, acquiring all parameters of all sentences calling the father function according to the android App original call graph; searching parameters in the same position as all parameters of the statement of the father function in the variable set; judging whether the parameters at the same position are local variables generated by new sentences in the father function body, return values obtained by executing other calling sentences in the father function body, parameters, class member variables or static class member variables of the father function, and if the parameters at the same position are the local variables generated by new sentences in the father function body, return values obtained by executing other calling sentences in the father function body, parameters, class member variables or static class member variables of the father function body, searching all instance types to which the parameters at the same position possibly point by calling a preset algorithm to obtain the data flow analysis result;
If the variable set is the class member variable or the static class member variable, acquiring all local variables of all functions of the declaration class of the variable set, searching for local variables of which all local variables of all functions of the declaration class of the variable set are consistent with the variable set type, judging whether the local variables with the consistent type are local variables generated by new sentences in the father function body, return values obtained by executing other calling sentences in the father function body, parameters of the father function, class member variables or static class member variables, and if the local variables with the consistent type are local variables generated by new sentences in the father function body, return values obtained by executing other calling sentences in the father function body, parameters of the father function, class member variables or static class member variables, searching for all instance types possibly pointed by the local variables with the consistent type by calling preset algorithms, thereby obtaining the data flow analysis result.
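The four cases above form a recursive search, which the following sketch illustrates. The `PROGRAM` dictionary layout and the function name `resolve_types` are hypothetical; the text only calls this search a "preset algorithm". A `seen` set is added here to stop the recursion on cyclic definitions, which is an assumption of this sketch.

```python
# Hypothetical program model: how each variable of each function is defined.
PROGRAM = {
    "defs": {
        "onCreate": {"l": ("call", "makeListener")},
        "makeListener": {"x": ("new", "MyListener")},
        "register": {"cb": ("param", 0)},
        "main": {"r": ("new", "MyRunnable")},
    },
    "returns": {"makeListener": ["x"]},  # function -> returned variables
    "call_sites": {"register": [("main", ["r"]), ("onCreate", ["l"])]},
    "field_candidates": {},              # field -> [(function, local variable)]
}

def resolve_types(var, func, program, seen=None):
    """Collect every instance type that var (in func) may point to."""
    seen = set() if seen is None else seen
    if (func, var) in seen:
        return set()
    seen.add((func, var))
    kind, info = program["defs"][func][var]
    types = set()
    if kind == "new":      # local created by a new statement: type is explicit
        types.add(info)
    elif kind == "call":   # return value: inspect the callee's return variables
        for ret_var in program["returns"][info]:
            types |= resolve_types(ret_var, info, program, seen)
    elif kind == "param":  # parameter: inspect the same position at each call site
        for caller, args in program["call_sites"].get(func, []):
            types |= resolve_types(args[info], caller, program, seen)
    elif kind == "field":  # member: inspect same-typed locals of the declaring class
        for other_func, other_var in program["field_candidates"].get(info, []):
            types |= resolve_types(other_var, other_func, program, seen)
    return types
```

Here `cb` in `register` resolves through two call sites, one passing a freshly created object and one passing the return value of another function.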
According to some embodiments of the invention, the constructing a virtual system call function according to the system call context and the data flow analysis result includes:
Creating an empty function body consistent with the system call parameters;
searching original parent classes and parent interfaces of all instance types in the data stream analysis result, and recording recursive paths of the original parent classes and parent interfaces of all instance types in the data stream analysis result;
judging whether the original parent class and the parent interface are callback interface types, and if so, acquiring all instance types on the recursive paths to the original parent class and the parent interface; searching for all callback functions of these instance types; if callback functions are consistent, applying overriding along the path, back to the instance type of the current instance object;
and adding all callback functions to the empty function body to obtain the virtual system call function body.
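One possible reading of these steps is sketched below: for each instance type, walk up to a callback interface, and for each callback method pick the most derived implementation on that path. The `HIERARCHY` table and the function names are hypothetical, and the override rule is an interpretation of the text rather than a confirmed detail.

```python
# Hypothetical hierarchy: type -> (supertypes, methods it declares).
HIERARCHY = {
    "OnClickListener": ([], {"onClick"}),  # callback interface
    "BaseListener": (["OnClickListener"], {"onClick"}),
    "MyListener": (["BaseListener"], {"onClick"}),
    "QuietListener": (["BaseListener"], set()),
}
CALLBACK_INTERFACES = {"OnClickListener"}

def paths_to_callback(type_name, path=()):
    """Yield every supertype path from type_name up to a callback interface."""
    path = path + (type_name,)
    if type_name in CALLBACK_INTERFACES:
        yield path
    for supertype in HIERARCHY[type_name][0]:
        yield from paths_to_callback(supertype, path)

def virtual_body(instance_types):
    """List the callback implementations a virtual system call body would call."""
    body = []
    for type_name in sorted(instance_types):
        for path in paths_to_callback(type_name):
            interface = path[-1]
            for method in sorted(HIERARCHY[interface][1]):
                # The first declaring type on the path, starting from the
                # instance type, is the implementation that overrides the rest.
                owner = next(c for c in path if method in HIERARCHY[c][1])
                target = f"{owner}.{method}"
                if target not in body:
                    body.append(target)
    return body
```

`QuietListener` declares no `onClick` of its own, so the body calls the inherited `BaseListener.onClick` for it.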
In a second aspect of the present invention, there is provided an android call graph construction system, the android call graph construction system comprising:
the original call diagram construction module is used for acquiring an android apk file and constructing an android App original call diagram according to the android apk file;
the system call context screening module is used for screening system calls taking callback functions as parameters in the android App original call graph through a function screening model to obtain a system call context;
The data flow analysis result searching module is used for searching according to the system call context and a preset algorithm to obtain a data flow analysis result;
the virtual system call function body construction module is used for constructing a virtual system call function body according to the system call context and the data flow analysis result;
and the android call graph generation module is used for generating a call sub graph according to the virtual system call function body, and adding the call sub graph into the android App original call graph to obtain the android call graph.
According to the system, an android App file is obtained, an android App original call graph is built according to the android App file, a function screening model is used for screening system calls taking callback functions as parameters in the android App original call graph, a system call context is obtained, searching is conducted according to the system call context and a preset algorithm, a data flow analysis result is obtained, a virtual system call function body is built according to the system call context and the data flow analysis result, a call sub-graph is generated according to the virtual system call function body, and the call sub-graph is added into the android App original call graph to obtain the android call graph, so that the problem that the callback functions are not in the call graph during the android App analysis is solved, and the code coverage rate and the android call graph construction accuracy of the android call graph are improved.
In a third aspect of the present invention, there is provided an Android call graph construction device, comprising at least one control processor and a memory communicatively connected with the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the Android call graph construction method described above.
In a fourth aspect of the present invention, there is provided a computer-readable storage medium storing computer-executable instructions for causing a computer to perform the above-described android call graph construction method.
It should be noted that the advantages of the second to fourth aspects of the present invention over the prior art are the same as those of the Android call graph construction method described above over the prior art, and are not repeated here.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of an Android call graph construction method according to an embodiment of the invention;
FIG. 2 is a flow chart of an Android call graph construction system according to an embodiment of the invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In the description of the present invention, the description of first, second, etc. is for the purpose of distinguishing between technical features only and should not be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated or implicitly indicating the precedence of the technical features indicated.
In the description of the present invention, it should be understood that the direction or positional relationship indicated with respect to the description of the orientation, such as up, down, etc., is based on the direction or positional relationship shown in the drawings, is merely for convenience of describing the present invention and simplifying the description, and does not indicate or imply that the apparatus or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In the description of the present invention, unless explicitly defined otherwise, terms such as arrangement, installation, connection, etc. should be construed broadly and the specific meaning of the terms in the present invention can be determined reasonably by a person skilled in the art in combination with the specific content of the technical solution.
An edge of a call graph needs to record the parent function, the statement in which the parent function calls the child function, and the child function. For a system call with a callback interface parameter, the parent function is the system call, the call statement is located in the Android kernel rather than in the code of the Android App, and the child function is the callback function. Because the original call graph generation algorithm cannot observe the Android kernel, the call graph edge from the system call to the callback function cannot be constructed, and the callback function does not appear in the call graph. Traditional Android call graph construction can therefore only reach the level of the Android system call; for an Android system call with a callback interface parameter, it cannot obtain the callback function information.
To remedy this technical defect, referring to fig. 1, the invention provides an Android call graph construction method, which comprises the following steps:
step S101, acquiring an Android apk file, and constructing an Android App original call graph according to the Android apk file;
step S102, screening, through a function screening model, the system calls in the Android App original call graph that take a callback function as a parameter, to obtain a system call context;
step S103, searching according to the system call context and a preset algorithm to obtain a data flow analysis result;
step S104, constructing a virtual system call function body according to the system call context and the data flow analysis result;
step S105, generating a call sub-graph according to the virtual system call function body, and adding the call sub-graph into the Android App original call graph to obtain the Android call graph.
According to the method, an android App file is obtained, an android App original call graph is built according to the android App file, a function screening model is used for screening system calls taking callback functions as parameters in the android App original call graph, a system call context is obtained, searching is conducted according to the system call context and a preset algorithm, a data flow analysis result is obtained, a virtual system call function body is built according to the system call context and the data flow analysis result, a call sub-graph is generated according to the virtual system call function body, and the call sub-graph is added into the android App original call graph to obtain the android call graph, so that the problem that the callback functions are not in the call graph during the android App analysis is solved, and the code coverage rate and the android call graph construction accuracy of the android call graph are improved.
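The effect of the final step, adding the call sub-graph to the original call graph, can be illustrated with a small sketch. The adjacency-set representation and the function names `add_call_subgraph` and `reachable` are hypothetical choices made for this illustration.

```python
def add_call_subgraph(call_graph, system_call, callbacks):
    """Return a copy of call_graph with system_call -> callback edges added.

    call_graph maps each caller to the set of functions it calls.
    """
    merged = {caller: set(callees) for caller, callees in call_graph.items()}
    merged.setdefault(system_call, set()).update(callbacks)
    for callback in callbacks:
        merged.setdefault(callback, set())  # make each callback a node
    return merged

def reachable(call_graph, entry):
    """Return every function reachable from entry along call edges."""
    visited, stack = set(), [entry]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        stack.extend(call_graph.get(node, ()))
    return visited
```

Before the merge, a callback such as `MyListener.onClick` is unreachable from `onCreate`; after the merge it is reachable through the system call, which is the gain in code coverage that the text describes.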
In some embodiments, constructing an android App original call graph according to an android apk file includes:
analyzing the android apk file to obtain an android compiled version of the file;
selecting an android compression package of a corresponding android compiled version;
and constructing the Android App original call graph from the Android compressed package by means of the CHA algorithm.
In some embodiments, screening, through a function screening model, the system calls in the Android App original call graph that take a callback function as a parameter, to obtain a system call context, includes:
traversing all edges of the Android App original call graph through a function screening model to obtain the functions represented by the destination nodes of the edges that are system calls: $sys\_call_1, sys\_call_2, \ldots, sys\_call_i, \ldots, sys\_call_N$, wherein $sys\_call_i$ is the $i$-th system call and $N$ is the total number of system calls;
acquiring a parameter sequence of the system call;
judging whether each parameter of the parameter sequence is a callback interface type, and if so, acquiring the parent function $m_{i,j}$ of the parameter of the system call and the execution statement $stmt_{i,j}$ of the parameter of the system call, wherein $m_{i,j}$ is the parent function of the $j$-th parameter of the $i$-th system call and $stmt_{i,j}$ is the execution statement of the $j$-th parameter of the $i$-th system call;
establishing a mapping from $m_{i,j}$ to $stmt_{i,j}$ to obtain the system call context $m_{i,j} \rightarrow stmt_{i,j}$ of $sys\_call_i$, and thereby obtaining the system call context of each system call.
In some embodiments, searching according to the system call context and the preset algorithm to obtain the data flow analysis result includes:
acquiring the call expressions of the execution statements of the parameters of all system calls;
searching all parameters of each call expression to obtain those that are of a callback interface type;
constructing a control flow graph from the parent function of the parameter of the system call;
searching the control flow graph, through a context-sensitive alias analysis algorithm, for variables that point to the same object as the parameter of the callback interface type, to obtain a variable set;
judging whether the variable set is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the variable set may point to by invoking the preset algorithm, to obtain the data flow analysis result.
In some embodiments, constructing the control flow graph from the parent function of the parameter of the system call includes:
acquiring the current statement of the parent function of the parameter of the system call;
judging the type of the current statement; if the current statement is a sequentially executed statement, judging whether it has a next statement; if it does, adding the current statement to the predecessor node list of the next statement, and adding the next statement to the successor node list of the current statement, wherein each node is a statement of the function;
if the current statement is a branch statement, acquiring the next statement executed under every condition of the branch, adding the current statement to the predecessor node list of each such next statement, and adding each such next statement to the successor node list of the current statement;
and so on, until the last statement of the parent function has been processed, to obtain the control flow graph.
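The predecessor and successor bookkeeping above can be sketched in a few lines. The statement encoding (`'seq'`, `'branch'`, `'return'` with explicit branch targets) is a hypothetical simplification, not the representation used in the source.

```python
def build_cfg(stmts):
    """stmts: list of (kind, targets); kind is 'seq', 'branch', or 'return'.

    For 'branch', targets lists the indices of every statement that may run next.
    Returns (preds, succs), each a list of sorted index lists.
    """
    preds = [set() for _ in stmts]
    succs = [set() for _ in stmts]
    for i, (kind, targets) in enumerate(stmts):
        if kind == "seq":
            nxt = [i + 1] if i + 1 < len(stmts) else []
        elif kind == "branch":
            nxt = list(targets)
        else:  # 'return' ends the function, so it has no successor
            nxt = []
        for n in nxt:
            succs[i].add(n)   # the next statement joins the successor list of the current one
            preds[n].add(i)   # the current statement joins the predecessor list of the next one
    return [sorted(p) for p in preds], [sorted(s) for s in succs]


# 0: if (c) goto 2, else fall through to 1;  1: a = x;  2: b = y;  3: return
preds, succs = build_cfg(
    [("branch", [1, 2]), ("seq", None), ("seq", None), ("return", None)]
)
```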
In some embodiments, searching the control flow graph, through the context-sensitive alias analysis algorithm, for variables that point to the same object as the parameter of the callback interface type to obtain the variable set includes:
acquiring all statements of the control flow graph and constructing a program point state set from them, wherein each program point state comprises a statement of the control flow graph, the variable set before the statement is executed, and the variable set after the statement is executed, and each element of either variable set is a set of variable names that point to the same object;
taking a first program point state out of the program point state set, and assigning its variable set before execution of the statement to its variable set after execution of the statement, to obtain the assigned post-execution variable set;
judging whether the statement of the first program point state is an assignment statement; if so, judging whether the right value of the assignment statement is a variable; if the right value is a variable, merging the sets to which the left value and the right value belong within the assigned post-execution variable set of the first program point state, to obtain a first updated first program point state, and adding it to the program point state set; if the right value is a constant, deleting the left value from the set to which it belongs within the assigned post-execution variable set of the first program point state, to obtain a second updated first program point state, and adding it to the program point state set;
judging whether the statement of the second program point state is an assignment statement; if so, judging whether the right value of the assignment statement is a variable; if the right value is a variable, merging the sets to which the left value and the right value belong within the assigned post-execution variable set of the second program point state, to obtain a first updated second program point state, and adding it to the program point state set; if the right value is a constant, deleting the left value from the set to which it belongs within the assigned post-execution variable set of the second program point state, to obtain a second updated second program point state, and adding it to the program point state set; and so on, until the program point state set is empty.
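The two update rules (merge on variable-to-variable assignment, delete on constant assignment) can be sketched under a strong simplifying assumption: the statements form a single straight-line sequence, so one pass replaces the worklist over program point states. The tuple encoding of assignments and the function name are hypothetical.

```python
def alias_sets(stmts, tracked):
    """stmts: list of (lhs, rhs, rhs_is_var) assignments in execution order.

    Returns the set of variable names grouped with `tracked` after the last statement.
    """
    groups = []  # each element is a set of variable names that point to the same object

    def find(var):
        for g in groups:
            if var in g:
                return g
        g = {var}
        groups.append(g)
        return g

    for lhs, rhs, rhs_is_var in stmts:
        if rhs_is_var:
            left, right = find(lhs), find(rhs)
            if left is not right:
                groups.remove(left)
                right |= left      # merge the groups of the left and right values
        else:
            left = find(lhs)
            left.discard(lhs)      # a constant on the right removes lhs from its group
            if not left:
                groups.remove(left)
    return set(find(tracked))


# cb is the callback-typed parameter; a and b become aliases of it, c does not.
aliases = alias_sets([("a", "cb", True), ("b", "a", True), ("c", "5", False)], "cb")
```

Merging without first removing the left value from its old group follows the wording of the text; it over-approximates the alias relation rather than tracking it exactly.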
In some embodiments, if the variable set is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable, searching for all instance types the variable set may point to by invoking the preset algorithm to obtain the data flow analysis result includes:
if the variable set is a local variable generated by a new statement in the parent function body, acquiring the instance type in the expression of the new statement, to obtain the data flow analysis result;
if the variable set is a return value obtained by executing another call statement in the parent function body, acquiring the function body called by that call statement, and traversing the function body to acquire all of its return statements and their return variables; judging whether each return variable is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the return variable may point to by invoking the preset algorithm, to obtain the data flow analysis result;
if the variable set is a parameter of the parent function, acquiring, according to the Android App original call graph, all parameters of all statements that call the parent function; searching those parameters for the ones at the same position as the variable set; judging whether each parameter at the same position is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the parameter at the same position may point to by invoking the preset algorithm, to obtain the data flow analysis result;
if the variable set is a class member variable or a static class member variable, acquiring all local variables of all functions of the declaring class of the variable set, and searching them for local variables whose type is consistent with that of the variable set; judging whether each such type-consistent local variable is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the type-consistent local variable may point to by invoking the preset algorithm, to obtain the data flow analysis result.
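The case analysis above is recursive: only the new-statement case yields a type directly, and the other three cases redirect the search to further variables. A minimal sketch follows, assuming a hypothetical toy model in which each variable's origin is already known. The tables `ORIGINS`, `RETURNS`, `CALL_ARGS` and `SAME_TYPE_LOCALS` are invented for illustration, and the `seen` guard against cycles is an addition not described in the source.

```python
# Hypothetical toy model: each variable is described by how it was produced.
ORIGINS = {
    "listener": ("new", "MyClickListener"),
    "fromFactory": ("call", "makeListener"),
    "param0": ("param", ("onCreate", 0)),
    "field": ("field", "Activity.mListener"),
}
RETURNS = {"makeListener": ["listener"]}            # callee -> variables it returns
CALL_ARGS = {("onCreate", 0): ["fromFactory"]}      # (function, position) -> actual arguments
SAME_TYPE_LOCALS = {"Activity.mListener": ["listener", "param0"]}  # field -> type-consistent locals


def resolve_types(var, seen=None):
    """Return every instance type that var may point to in the toy model."""
    seen = set() if seen is None else seen
    if var in seen:            # guard against cyclic definitions
        return set()
    seen.add(var)
    kind, info = ORIGINS[var]
    if kind == "new":          # the type is read from the new expression
        return {info}
    if kind == "call":         # follow the return variables of the callee
        sources = RETURNS.get(info, [])
    elif kind == "param":      # follow same-position arguments at every call site
        sources = CALL_ARGS.get(info, [])
    else:                      # member variable: follow type-consistent locals
        sources = SAME_TYPE_LOCALS.get(info, [])
    types = set()
    for source in sources:
        types |= resolve_types(source, seen)
    return types


field_types = resolve_types("field")
```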
In some embodiments, constructing the virtual system call function body from the system call context and the data flow analysis result includes:
creating an empty function body consistent with the parameters of the system call;
searching for the original parent classes and parent interfaces of all instance types in the data flow analysis result, and recording the recursive paths to those original parent classes and parent interfaces;
judging whether the original parent class and parent interface are of a callback interface type; if so, for all instance types on the recursive paths to the original parent class and parent interface, searching for all callback functions of those instance types; if callback functions are consistent with one another, overriding them in turn until the instance type of the current instance object is returned to;
and adding all callback functions to the empty function body, to obtain the virtual system call function body.
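One reading of these steps is sketched below: for each instance type, walk the recorded path up to a callback interface, and for each callback function of that interface keep the declaration closest to the instance type, which is the override that would run. The type table, the dictionary used as a "function body", and all names are hypothetical; the patent does not state this resolution rule explicitly.

```python
# Hypothetical toy hierarchy: type -> (parent types, methods declared by the type).
TYPES = {
    "OnClickListener": ([], {"onClick"}),                 # callback interface
    "BaseListener": (["OnClickListener"], {"onClick"}),
    "MyListener": (["BaseListener"], {"onClick", "helper"}),
    "Quiet": (["BaseListener"], set()),
}
CALLBACK_INTERFACES = {"OnClickListener"}


def paths_to_callback(t, path=()):
    """Yield every path from type t up to a callback interface."""
    path = path + (t,)
    if t in CALLBACK_INTERFACES:
        yield path
    for parent in TYPES[t][0]:
        yield from paths_to_callback(parent, path)


def resolve_callbacks(instance_type):
    """Pick, per callback function, the declaration nearest to the instance type."""
    resolved = set()
    for path in paths_to_callback(instance_type):
        interface = path[-1]
        for method in TYPES[interface][1]:
            for t in path[:-1]:            # nearest declaration wins (override)
                if method in TYPES[t][1]:
                    resolved.add(f"{t}.{method}")
                    break
    return resolved


def build_virtual_body(sys_call, instance_types):
    """Model the virtual function body as the list of callbacks it invokes."""
    calls = []
    for t in sorted(instance_types):
        calls.extend(sorted(resolve_callbacks(t)))
    return {"name": sys_call, "calls": calls}


body = build_virtual_body("View.setOnClickListener", {"MyListener", "Quiet"})
```

Here `Quiet` declares no `onClick` of its own, so the sketch falls back to `BaseListener.onClick`, while `MyListener` contributes its own override.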
The method solves the problem that the callback function is not in the call graph during Android App analysis. Compared with the prior art, the call graph constructed by the method expands the original call graph, contains more code semantic information, is deeper in analysis depth, can mine Android code behaviors in callback functions, is favorable for finding deep malicious code behaviors, and provides more comprehensive information for Android code behavior feature extraction.
Meanwhile, the virtual system call function body construction method provided by the invention connects the original call graph with the call sub-graphs that take each callback function as an entry. By simulating the function of the system call, it explicitly expresses callback operations that are absent from the control flow of the Android App program, and solves the problem that traditional call graph construction methods analyze only the system call itself and cannot continue the analysis beyond it. Compared with the prior art, the method combines all call sub-graphs with the original call graph, so that the call graph contains the semantic information of the callback functions and provides more comprehensive and accurate code behavior features for call-graph-based Android code behavior analysis.
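The effect of connecting the sub-graph can be shown with a small sketch. It assumes that the call graph is an adjacency map and that the virtual function body is summarised as the list of callbacks it invokes; both representations and all names are hypothetical.

```python
def merge_subgraph(call_graph, virtual_body):
    """Return a new graph in which the system call node points at its callbacks."""
    merged = {caller: set(callees) for caller, callees in call_graph.items()}
    merged.setdefault(virtual_body["name"], set()).update(virtual_body["calls"])
    for callback in virtual_body["calls"]:
        merged.setdefault(callback, set())
    return merged


def reachable(call_graph, entry):
    """Collect every function reachable from entry by following call edges."""
    seen, stack = set(), [entry]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(call_graph.get(node, ()))
    return seen


original = {
    "Main.onCreate": {"View.setOnClickListener"},
    "MyListener.onClick": {"Net.send"},   # callback body, disconnected in the original graph
}
virtual = {"name": "View.setOnClickListener", "calls": ["MyListener.onClick"]}
merged = merge_subgraph(original, virtual)
```

Before the merge, `Net.send` cannot be reached from `Main.onCreate`; afterwards it can, which is the coverage gain the text describes.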
In addition, compared with dynamic analysis, the Android App call graph construction method based on callback function analysis has higher code coverage, higher code analysis precision, and finer granularity. Compared with the prior art, it extends static analysis of Android Apps by combining control flow analysis with data flow analysis, which avoids the incomplete and inaccurate results produced by relying on a single analysis method.
In addition, referring to fig. 2, an embodiment of the present invention provides an Android call graph construction system, which includes an original call graph construction module 1100, a system call context screening module 1200, a data flow analysis result searching module 1300, a virtual system call function body construction module 1400, and an Android call graph generating module 1500, wherein:
the original call graph construction module 1100 is configured to acquire an Android apk file and construct an Android App original call graph according to the Android apk file;
the system call context screening module 1200 is configured to screen, through a function screening model, the system calls that take a callback function as a parameter in the Android App original call graph, to obtain a system call context;
the data flow analysis result searching module 1300 is configured to search according to the system call context and a preset algorithm, to obtain a data flow analysis result;
the virtual system call function body construction module 1400 is configured to construct a virtual system call function body according to the system call context and the data flow analysis result;
the Android call graph generating module 1500 is configured to generate a call sub-graph according to the virtual system call function body, and add the call sub-graph to the Android App original call graph to obtain the Android call graph.
In this system, an Android apk file is acquired and an Android App original call graph is constructed from it. A function screening model screens the original call graph for system calls that take a callback function as a parameter, yielding the system call context. A search based on the system call context and a preset algorithm yields a data flow analysis result. A virtual system call function body is constructed from the system call context and the data flow analysis result, a call sub-graph is generated from the virtual system call function body, and the call sub-graph is added to the original call graph to obtain the Android call graph. This solves the problem of callback functions being absent from the call graph during Android App analysis, and improves both the code coverage of the Android call graph and the accuracy of its construction.
It should be noted that this system embodiment and the above method embodiment are based on the same inventive concept, so the relevant content of the method embodiment also applies to this system embodiment and is not repeated here.
The application also provides an electronic device for Android call graph construction, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the Android call graph construction method described above.
The processor and the memory may be connected by a bus or other means.
The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software program and instructions required to implement the android call graph construction method of the above embodiment are stored in the memory, and when executed by the processor, the android call graph construction method in the above embodiment is executed, for example, the method steps S101 to S105 in fig. 1 described above are executed.
The present application also provides a computer-readable storage medium storing computer-executable instructions for performing the Android call graph construction method described above.
The computer-readable storage medium stores computer-executable instructions that are executed by a processor or controller, for example, by a processor in the above-described electronic device embodiment, which may cause the processor to execute the android call graph construction method in the above-described embodiment, for example, to execute the method steps S101 to S105 in fig. 1 described above.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program elements or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program elements or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present invention.

Claims (10)

1. The android call graph construction method is characterized by comprising the following steps of:
acquiring an android apk file, and constructing an android App original call graph according to the android apk file;
screening system call with callback function as parameter in the android App original call graph through a function screening model to obtain system call context;
searching according to the system call context and a preset algorithm to obtain a data stream analysis result;
constructing a virtual system call function body according to the system call context and the data flow analysis result;
generating a calling sub-graph according to the virtual system calling function body, and adding the calling sub-graph into the android App original calling graph to obtain an android calling graph.
2. The Android call graph construction method according to claim 1, wherein the constructing an Android App original call graph according to the Android apk file comprises:
parsing the Android apk file to obtain the Android compiled version of the file;
selecting the Android compressed package corresponding to the Android compiled version;
and constructing the Android App original call graph from the Android compressed package through the CHA algorithm.
3. The Android call graph construction method according to claim 2, wherein the screening, through a function screening model, of system calls that take a callback function as a parameter in the Android App original call graph to obtain a system call context comprises:
traversing all edges of the Android App original call graph through the function screening model, and obtaining the edges whose destination node represents a system call: $sys\_call_1, sys\_call_2, \ldots, sys\_call_i, \ldots, sys\_call_N$, wherein $sys\_call_i$ is the ith system call and $N$ is the total number of the system calls;
acquiring the parameter sequence of the system call;
judging whether each parameter of the parameter sequence is of a callback interface type; if so, acquiring the parent function $m_{i,j}$ of that parameter of the system call and the execution statement $stmt_{i,j}$ of that parameter, wherein $m_{i,j}$ is the parent function of the jth parameter of the ith system call and $stmt_{i,j}$ is the execution statement of the jth parameter of the ith system call;
establishing a mapping from $m_{i,j}$ and $stmt_{i,j}$ to obtain the system call context $m_{i,j} \rightarrow stmt_{i,j}$ of $sys\_call_i$, thereby obtaining the system call context of each system call.
4. The Android call graph construction method according to claim 3, wherein the searching according to the system call context and a preset algorithm to obtain a data flow analysis result comprises:
acquiring the call expressions of the execution statements of the parameters of all the system calls;
searching all parameters of the call expression to obtain those that are of a callback interface type;
constructing a control flow graph from the parent function of the parameter of the system call;
searching the control flow graph, through a context-sensitive alias analysis algorithm, for variables that point to the same object as the parameter of the callback interface type, to obtain a variable set;
judging whether the variable set is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the variable set may point to by invoking the preset algorithm, to obtain the data flow analysis result.
5. The Android call graph construction method according to claim 4, wherein the constructing a control flow graph from the parent function of the parameter of the system call comprises:
acquiring the current statement of the parent function of the parameter of the system call;
judging the type of the current statement; if the current statement is a sequentially executed statement, judging whether it has a next statement; if it does, adding the current statement to the predecessor node list of the next statement, and adding the next statement to the successor node list of the current statement, wherein each node is a statement of the function;
if the current statement is a branch statement, acquiring the next statement executed under every condition of the branch, adding the current statement to the predecessor node list of each such next statement, and adding each such next statement to the successor node list of the current statement;
and so on, until the last statement of the parent function has been processed, to obtain the control flow graph.
6. The Android call graph construction method according to claim 5, wherein the searching of the control flow graph, through a context-sensitive alias analysis algorithm, for variables that point to the same object as the parameter of the callback interface type to obtain a variable set comprises:
acquiring all statements of the control flow graph and constructing a program point state set from them, wherein each program point state comprises a statement of the control flow graph, the variable set before the statement is executed, and the variable set after the statement is executed, and each element of either variable set is a set of variable names that point to the same object;
taking a first program point state out of the program point state set, and assigning its variable set before execution of the statement to its variable set after execution of the statement, to obtain the assigned post-execution variable set;
judging whether the statement of the first program point state is an assignment statement; if so, judging whether the right value of the assignment statement is a variable; if the right value is a variable, merging the sets to which the left value and the right value belong within the assigned post-execution variable set of the first program point state, to obtain a first updated first program point state, and adding it to the program point state set; if the right value is a constant, deleting the left value from the set to which it belongs within the assigned post-execution variable set of the first program point state, to obtain a second updated first program point state, and adding it to the program point state set;
judging whether the statement of the second program point state is an assignment statement; if so, judging whether the right value of the assignment statement is a variable; if the right value is a variable, merging the sets to which the left value and the right value belong within the assigned post-execution variable set of the second program point state, to obtain a first updated second program point state, and adding it to the program point state set; if the right value is a constant, deleting the left value from the set to which it belongs within the assigned post-execution variable set of the second program point state, to obtain a second updated second program point state, and adding it to the program point state set; and so on, until the program point state set is empty.
7. The Android call graph construction method according to claim 6, wherein, if the variable set is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable, the searching for all instance types the variable set may point to by invoking the preset algorithm to obtain the data flow analysis result comprises:
if the variable set is a local variable generated by a new statement in the parent function body, acquiring the instance type in the expression of the new statement, to obtain the data flow analysis result;
if the variable set is a return value obtained by executing another call statement in the parent function body, acquiring the function body called by that call statement, and traversing the function body to acquire all of its return statements and their return variables; judging whether each return variable is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the return variable may point to by invoking the preset algorithm, to obtain the data flow analysis result;
if the variable set is a parameter of the parent function, acquiring, according to the Android App original call graph, all parameters of all statements that call the parent function; searching those parameters for the ones at the same position as the variable set; judging whether each parameter at the same position is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the parameter at the same position may point to by invoking the preset algorithm, to obtain the data flow analysis result;
if the variable set is a class member variable or a static class member variable, acquiring all local variables of all functions of the declaring class of the variable set, and searching them for local variables whose type is consistent with that of the variable set; judging whether each such type-consistent local variable is a local variable generated by a new statement in the parent function body, a return value obtained by executing another call statement in the parent function body, a parameter of the parent function, a class member variable, or a static class member variable; if it is any of these, searching for all instance types the type-consistent local variable may point to by invoking the preset algorithm, to obtain the data flow analysis result.
8. The Android call graph construction method according to claim 7, wherein said constructing a virtual system call function body according to the system call context and the data flow analysis result comprises:
creating an empty function body consistent with the system call parameters;
searching for the original parent classes and parent interfaces of all instance types in the data flow analysis result, and recording the recursion paths to the original parent classes and parent interfaces of all instance types in the data flow analysis result;
judging whether the original parent class or parent interface is a callback interface type, and if so, examining all instance types on the recursion path to that original parent class or parent interface; searching for all callback functions of those instance types; where callback functions are consistent, applying overriding, until the instance type of the current instance object is reached;
and adding all callback functions to the empty function body to obtain the virtual system call function body.
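One possible reading of the steps in claim 8 is sketched below in Python. The `hierarchy` and `callback_interfaces` dictionaries, the helper names, and the listener classes are hypothetical, and the override rule used here (the most derived class on the recorded path wins) is an assumption about what the claim means by consistent callback functions being overridden.

```python
def paths_to_callbacks(type_name, hierarchy, callback_interfaces, path=()):
    """Yield each recursion path from an instance type up to a callback interface."""
    path = path + (type_name,)
    if type_name in callback_interfaces:
        yield path
    for parent in hierarchy.get(type_name, {}).get("parents", []):
        yield from paths_to_callbacks(parent, hierarchy, callback_interfaces, path)


def build_virtual_body(instance_types, hierarchy, callback_interfaces):
    """Return the callback targets that the virtual system call body would invoke."""
    body = []                                   # start from an empty function body
    for type_name in sorted(instance_types):
        for path in paths_to_callbacks(type_name, hierarchy, callback_interfaces):
            for method in sorted(callback_interfaces[path[-1]]):
                # Walk from the instance type upwards; the first class that
                # defines the method overrides any definition further up.
                for cls in path:
                    if method in hierarchy.get(cls, {}).get("methods", ()):
                        target = f"{cls}.{method}"
                        if target not in body:
                            body.append(target)
                        break
    return body


# Hypothetical hierarchy: both listeners extend BaseListener, which implements
# the callback interface OnClickListener; only MyListener overrides onClick.
hierarchy = {
    "BaseListener": {"parents": ["OnClickListener"], "methods": {"onClick"}},
    "MyListener": {"parents": ["BaseListener"], "methods": {"onClick"}},
    "PlainListener": {"parents": ["BaseListener"], "methods": set()},
}
callback_interfaces = {"OnClickListener": {"onClick"}}
virtual_body = build_virtual_body({"MyListener", "PlainListener"}, hierarchy, callback_interfaces)
```

Here `MyListener` contributes its own `onClick`, while `PlainListener` falls back to the inherited `BaseListener.onClick`, so the virtual body ends up with one call per distinct callback target.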
9. An Android call graph construction system, characterized by comprising:
an original call graph construction module, used for acquiring an Android apk file and constructing an Android App original call graph according to the Android apk file;
a system call context screening module, used for screening, through a function screening model, system calls in the Android App original call graph that take callback functions as parameters, to obtain a system call context;
a data flow analysis result searching module, used for searching according to the system call context and a preset algorithm to obtain a data flow analysis result;
a virtual system call function body construction module, used for constructing a virtual system call function body according to the system call context and the data flow analysis result;
and an Android call graph generation module, used for generating a call subgraph according to the virtual system call function body, and adding the call subgraph to the Android App original call graph to obtain the Android call graph.
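How the modules of claim 9 might fit together can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the call graph is a plain dictionary, the `takes_callback` predicate stands in for the function screening model, the `callback_targets` mapping stands in for the combined output of the data flow analysis and virtual body construction modules, and the `virtual$` node naming and edge layout are invented for the example.

```python
def augment_call_graph(call_graph, takes_callback, callback_targets):
    """Return a copy of the call graph with virtual system call subgraphs added.

    call_graph:       dict mapping a caller to the set of functions it calls
    takes_callback:   predicate, stand-in for the function screening model
    callback_targets: dict mapping a system call to the callbacks it may invoke
    """
    augmented = {caller: set(callees) for caller, callees in call_graph.items()}
    # Screening step: system calls that take a callback function as a parameter.
    system_calls = {
        callee
        for callees in call_graph.values()
        for callee in callees
        if takes_callback(callee)
    }
    for system_call in sorted(system_calls):
        virtual = f"virtual${system_call}"       # hypothetical node name
        # Subgraph: system call -> virtual body -> resolved callback functions.
        augmented.setdefault(system_call, set()).add(virtual)
        augmented.setdefault(virtual, set()).update(callback_targets.get(system_call, []))
    return augmented


original = {"MainActivity.onCreate": {"View.setOnClickListener", "Log.d"}}
augmented = augment_call_graph(
    original,
    takes_callback=lambda name: name == "View.setOnClickListener",
    callback_targets={"View.setOnClickListener": ["MyListener.onClick"]},
)
```

After the call, `MyListener.onClick` is reachable from `MainActivity.onCreate` through the virtual node, while the `original` dictionary is left unchanged.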
10. An Android call graph construction device, comprising at least one control processor and a memory communicatively connected to the at least one control processor; the memory stores instructions executable by the at least one control processor to enable the at least one control processor to perform the Android call graph construction method according to any one of claims 1 to 8.
CN202211579498.2A 2022-12-08 2022-12-08 Construction method, system, equipment and medium for Android call graph Pending CN116185520A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211579498.2A CN116185520A (en) 2022-12-08 2022-12-08 Construction method, system, equipment and medium for Android call graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211579498.2A CN116185520A (en) 2022-12-08 2022-12-08 Construction method, system, equipment and medium for Android call graph

Publications (1)

Publication Number Publication Date
CN116185520A true CN116185520A (en) 2023-05-30

Family

ID=86435393

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211579498.2A Pending CN116185520A (en) 2022-12-08 2022-12-08 Construction method, system, equipment and medium for Android call graph

Country Status (1)

Country Link
CN (1) CN116185520A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650452A (en) * 2016-12-30 2017-05-10 北京工业大学 Mining method for built-in application vulnerability of Android system
CN108875375A (en) * 2018-04-26 2018-11-23 南京大学 A kind of dynamic characteristic information extracting method towards the detection of Android system privacy compromise
KR20200060180A (en) * 2018-11-21 2020-05-29 숭실대학교산학협력단 Method of call graph extraction in android apps, recording medium and apparatus for performing the method
CN113139184A (en) * 2021-04-12 2021-07-20 南京大学 Method for detecting Binder communication overload vulnerability based on static analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
STEVEN ARZT: "FlowDroid: precise context, flow, field, object-sensitive and lifecycle-aware taint analysis for Android apps", 《ACM》, 31 December 2014 (2014-12-31), pages 259 - 269 *
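LIU XIAOJIAN: "Static detection method for Android malware with multi-context features", 《Journal of Huazhong University of Science and Technology (Natural Science Edition)》, 28 February 2020 (2020-02-28), pages 85 - 90 *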

Similar Documents

Publication Publication Date Title
CN109800175B (en) Ethereum smart contract reentrancy vulnerability detection method based on code instrumentation
CN112394942B (en) Distributed software development compiling method and software development platform based on cloud computing
CN109739494B (en) Tree-LSTM-based API (application program interface) use code generation type recommendation method
CN107622008B (en) Traversal method and device for application page
US9292281B2 (en) Identifying code that exhibits ideal logging behavior
CN111061643B (en) SDK cluster compatibility detection method and device, electronic equipment and storage medium
CN109240666B (en) Function calling code generation method and system based on call stack and dependent path
US10514898B2 (en) Method and system to develop, deploy, test, and manage platform-independent software
CN106295346B (en) Application vulnerability detection method and device and computing equipment
CN107015841B (en) Preprocessing method for program compiling and program compiling device
CN114138748A (en) Database mapping file generation method, device, equipment and storage medium
US11422917B2 (en) Deriving software application dependency trees for white-box testing
CN116737130A (en) Method, system, equipment and storage medium for compiling modal-oriented intermediate representation
CN109634569B (en) Method, device and equipment for realizing flow based on annotation and readable storage medium
CN106933642A (en) The processing method and processing unit of application program
CN116185520A (en) Construction method, system, equipment and medium for Android call graph
CN112069052A (en) Abnormal object detection method, device, equipment and storage medium
CN112395199B (en) Distributed software instance testing method based on cloud computing and software development platform
CN115421831A (en) Method, device, equipment and storage medium for generating calling relation of activity component
CN113946516A (en) Code coverage rate determining method and device and storage medium
CN110704742B (en) Feature extraction method and device
CN111736848A (en) Packet conflict positioning method and device, electronic equipment and readable storage medium
CN116360712A (en) Platform framework extension method, device and storage medium
CN112835787A (en) Application program interface skipping path correction method and device, storage medium and terminal
CN116701524B (en) Construction method and device of operation path tree, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination