CN115221051B - Program instrumentation method and device for verifying execution process of data API - Google Patents

Publication number
CN115221051B
Authority
CN
China
Prior art keywords
program
target data
data api
api
instrumentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210813432.9A
Other languages
Chinese (zh)
Other versions
CN115221051A (en
Inventor
黄罡
张溯
张颖
孙艳春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN202210813432.9A
Publication of CN115221051A
Application granted
Publication of CN115221051B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/3644 Software debugging by instrumenting at runtime
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/366 Software debugging using diagnostics
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

An embodiment of the invention provides a program instrumentation method and device for verifying the execution process of a data API. In the method, the program segment corresponding to a target data API is located in the overall executable program used for data supply; the functions in the program segment are copied, and the calling relations among the copied functions are reconstructed based on the function calling relations in the original executable program, forming a program segment to be instrumented that has the same execution effect but an independent execution process; and instrumentation code is inserted into the program segment to be instrumented to obtain the target executable program. Compared with the prior-art approach of instrumenting the executable program as a whole, the method precisely locates the program segment corresponding to the target data API and instruments only the constructed copy. It achieves the same instrumentation effect while reducing the code expansion rate by narrowing the instrumentation range, and, because only the copy is instrumented, it avoids affecting the performance of unrelated business functions.

Description

Program instrumentation method and device for verifying execution process of data API
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a program instrumentation method and device for verifying a data API (application program interface) execution process.
Background
Code instrumentation refers to a class of techniques in which specific processing logic (i.e., probes) is inserted into a program under test, and then the runtime information of the program is obtained through execution of the processing logic. In a data sharing scenario, to verify the authenticity of the shared data, a specific instrumentation of the relevant program for data provision is required to verify the integrity of its execution, i.e. the runtime control flow information of the data API requested by the data consumer is collected at the data provider via the instrumented probe and a certification report is generated by the trusted component for verification by the data consumer.
In the prior art, instrumentation is typically applied to the whole executable program in which a data API resides, so that all potentially useful program information is collected at runtime for subsequent analysis and verification. This approach results in a high code expansion rate and imposes considerable overhead on business functions in the target program that are unrelated to data supply. How to instrument the code of a target data API precisely and with few side effects is therefore a problem that needs to be solved.
Disclosure of Invention
In order to overcome the problems in the related art, the invention provides a program instrumentation method and device for verifying the execution process of a data API.
In a first aspect, the present invention provides a program instrumentation method for verifying a data API execution process, applied to a data supply end, the method comprising:
positioning a program segment corresponding to a target data API from the whole executable program of the data supply end; the target data API is used for providing shared data;
copying functions in the program segment corresponding to the target data API, and generating a program segment to be instrumented for the target data API based on calling relations among the functions in the program segment; the execution logic and the execution effect of the program segment to be instrumented are the same as those of the program segment;
inserting instrumentation code into the program segment to be instrumented to obtain a target executable program of the target data API; based on the target executable program, the data supply end measures the executable program, the execution process and the execution result of the target data API by using a trusted measurement engine.
Optionally, the locating the program segment corresponding to the target data API from the whole executable program of the data supply end includes:
executing the target data API based on the call parameters of the target data API;
acquiring an execution result of the target data API and execution process record information, wherein the execution process record information comprises names, input parameters and return values corresponding to all functions executed in the execution process of the target data API;
determining the names of the entry functions and the names of the exit functions of the program fragments corresponding to the target data API based on the calling parameters, the execution results and the execution process record information;
and determining the program fragments corresponding to the target data API in the whole executable program of the data supply end according to the names of the entry functions and the exit functions of the program fragments corresponding to the target data API.
Optionally, the determining, based on the call parameter, the execution result, and the execution process record information, a name of an entry function and a name of an exit function of a program segment corresponding to the target data API includes:
Searching an input parameter with the maximum similarity with the calling parameter in the input parameter of the execution process record information, and determining the name of the function corresponding to the searched input parameter as the name of the entry function of the program fragment corresponding to the target data API;
searching a return value with the maximum similarity with the execution result in the return value of the execution process record information, and determining the name of the function corresponding to the searched return value as the name of the exit function of the program fragment corresponding to the target data API.
Optionally, the determining, in the overall executable program of the data supply end, the program segment corresponding to the target data API according to the name of the entry function and the name of the exit function of the program segment corresponding to the target data API includes:
acquiring a function call relation diagram corresponding to the whole executable program;
locating a first function node in the function call relationship graph that matches the name of the entry function, and locating a second function node in the function call relationship graph that matches the name of the exit function;
and determining a set of functions represented by the first function node, the functions represented by the second function node and the functions represented by all function nodes with calling relations between the first function node and the second function node as program fragments corresponding to the target data API.
Optionally, the program segment includes a plurality of functions; the copying the function in the program segment corresponding to the target data API, and generating the program segment to be instrumented for the target data API based on the calling relation between the functions in the program segment, including:
copying all functions in the program fragment;
and reconstructing the calling relations among the copied functions according to the calling relations among the functions in the program segment to generate the program segment to be instrumented.
Optionally, the inserting instrumentation code into the program segment to be instrumented to obtain a target executable program of the target data API includes:
inserting first instrumentation code at the entry of the program segment to be instrumented of the target data API, the first instrumentation code being configured to trigger the trusted measurement engine to measure the executable program of the target data API;
inserting second instrumentation code at the entry of each basic block in the program segment to be instrumented of the target data API, the second instrumentation code being configured to trigger the trusted measurement engine to measure the execution process of the target data API;
inserting third instrumentation code at the exit of the program segment to be instrumented of the target data API, the third instrumentation code being configured to trigger the trusted measurement engine to measure the execution result of the target data API.
In a second aspect, the present invention provides a program instrumentation device for validating a data API execution process, for application to a data supply, the device comprising:
the positioning module is used for positioning a program segment corresponding to the target data API from the whole executable program of the data supply end; the target data API is used for providing shared data;
the copying module is used for copying the functions in the program segment corresponding to the target data API and generating a program segment to be instrumented for the target data API based on the calling relations among the functions in the program segment; the execution logic and the execution effect of the program segment to be instrumented are the same as those of the program segment;
the instrumentation module is used for inserting instrumentation codes into the program segments to be instrumented to obtain target executable programs of the target data API; the data supply end is used for measuring the executable program, the execution process and the execution result of the target data API by using a trusted measurement engine based on the target executable program.
Optionally, the positioning module includes:
the execution module is used for executing the target data API based on the calling parameters of the target data API;
the first acquisition module is used for acquiring an execution result of the target data API and execution process record information, wherein the execution process record information comprises names, input parameters and return values corresponding to all functions executed in the execution process of the target data API;
the first determining module is used for determining the names of the entry functions and the names of the exit functions of the program fragments corresponding to the target data API based on the calling parameters, the execution results and the execution process record information;
and the second determining module is used for determining the program fragments corresponding to the target data API in the whole executable program of the data supply end according to the names of the entry functions and the exit functions of the program fragments corresponding to the target data API.
The embodiments of the invention provide a program instrumentation method and device for verifying the execution process of a data API. The method locates the program segment corresponding to a target data API in the overall executable program of the data supply end; copies the functions in that program segment and generates a program segment to be instrumented for the target data API based on the calling relations among those functions; and inserts instrumentation code into the program segment to be instrumented to obtain the target executable program of the target data API. Compared with instrumenting the whole executable program, instrumenting only the copied program segment of the target data API narrows the instrumentation range and therefore reduces the code expansion rate. Because the instrumentation is applied to the copy, the original program segment is left unchanged, which avoids degrading the performance of other business functions. In addition, by measuring the executable program, the execution process and the execution result of the data API requested by the data consumer through this precise instrumentation, both the accuracy of the shared data itself and the correctness and authenticity of the process that generated it can be comprehensively checked and then verified by the data consumer.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of steps of a program instrumentation method for data API execution process validation provided by an embodiment of the present invention;
FIG. 2 is a schematic diagram of a call relationship provided in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a reconstructed call relationship according to an embodiment of the present invention;
FIG. 4 is a block diagram of a program instrumentation device for data API execution process verification according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 is a flowchart of the steps of a program instrumentation method for verifying a data API execution process according to an embodiment of the present invention. As shown in FIG. 1, the method may include:
Step 101, locating a program segment corresponding to a target data API from the whole executable program of the data supply end; the target data API is used for providing shared data requested by the data consumer.
In the embodiment of the invention, a data supply end may expose a plurality of externally accessible data APIs. The target data API is one of these data APIs. Specifically, the data acquisition request sent by the data consumer may carry data API indication information, for example an identifier of the data API, so that the data supply end can determine from the request which target data API the data consumer is requesting. The whole executable program of the data supply end refers to the executable program corresponding to the information system of the data supply end, and it contains the program segments of all data APIs of the data supply end. The program segment corresponding to the target data API is located in this executable program so that it can be instrumented in a subsequent step.
Step 102, copying functions in the program segment corresponding to the target data API, and generating a program segment to be instrumented for the target data API based on calling relations among the functions in the program segment; the execution logic and the execution effect of the program segment to be instrumented are the same as those of the program segment.
In the embodiment of the invention, after the program segment corresponding to the target data API has been located in the whole executable program of the data supply end, the functions in that program segment are copied, and the calling relations among the copied functions are reconstructed based on the calling relations among the functions in the program segment, so that a new code "bypass" is formed. Here a function may also be a method. The resulting code "bypass" is the program segment to be instrumented in the subsequent steps; its execution logic and execution effect are the same as those of the program segment corresponding to the target data API.
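The copy-and-rewire step can be illustrated with a minimal sketch. It models the program as a plain dictionary mapping each function name to the names it calls; the function `clone_fragment`, the `__probe` suffix, and the sample function names are all hypothetical and chosen only for illustration. The sketch also assumes that calls leaving the fragment keep pointing at the original functions, which the description does not specify.

```python
def clone_fragment(call_graph, fragment, suffix="__probe"):
    """Copy every function in `fragment` and rewire calls among the copies.

    `call_graph` maps a function name to the list of names it calls.
    Returns a new mapping that describes only the copied "bypass".
    """
    renamed = {name: name + suffix for name in fragment}
    bypass = {}
    for name in fragment:
        # Calls that stay inside the fragment are redirected to the copies.
        # Calls that leave the fragment still target the originals
        # (an assumption of this sketch, not stated in the source).
        bypass[renamed[name]] = [
            renamed.get(callee, callee) for callee in call_graph.get(name, [])
        ]
    return bypass


# Hypothetical program: `log` is shared with other business functions.
call_graph = {
    "handle_request": ["query_db", "log"],
    "query_db": ["format_rows"],
    "format_rows": [],
    "log": [],
}
fragment = ["handle_request", "query_db", "format_rows"]
bypass = clone_fragment(call_graph, fragment)
```

The original `call_graph` is left untouched, which mirrors the idea that the original program segment keeps serving callers that do not need measurement.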
Step 103, inserting instrumentation code into the program segment to be instrumented to obtain a target executable program of the target data API; based on the target executable program, the data supply end measures the executable program, the execution process and the execution result of the target data API by using a trusted measurement engine.
In the embodiment of the invention, instrumentation code containing the measurement logic is inserted into the program segment to be instrumented to obtain the target executable program of the target data API. In response to a data acquisition request sent by the data consumer, the data supply end then uses the trusted measurement engine, based on the target executable program, to measure the static executable program, the dynamic execution process and the execution result of the target data API, and obtains the corresponding measurement results. The target executable program refers to a file that can be loaded and executed by an operating system, together with the dynamic link libraries on which the file depends. For example, under the Windows operating system, the executable program may be a file of the .exe type, and a dynamic link library may be a file of the .dll type.
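As a rough illustration of what an instrumented copy might look like, the sketch below places probes by hand at the entry of the segment, at the entry of each basic block, and at the exit. The `MeasurementEngine` class is a hypothetical stand-in for the trusted measurement engine and simply records SHA-256 digests; the block labels `B0` to `B3` and the function `get_user_count__probe` are likewise invented for this example. A real implementation would insert such probes automatically into the executable rather than into source code.

```python
import hashlib


class MeasurementEngine:
    """Hypothetical stand-in for the trusted measurement engine."""

    def __init__(self):
        self.log = []

    def measure(self, kind, payload: bytes):
        # Record what was measured together with a digest of the payload.
        self.log.append((kind, hashlib.sha256(payload).hexdigest()))


engine = MeasurementEngine()


def get_user_count__probe(rows):
    # First probe: measure the executable code of the segment at its entry.
    engine.measure("executable", get_user_count__probe.__code__.co_code)
    engine.measure("block", b"B0")  # second probe: entry of basic block B0
    count = 0
    for row in rows:
        engine.measure("block", b"B1")  # loop body
        if row.get("active"):
            engine.measure("block", b"B2")  # taken branch
            count += 1
    engine.measure("block", b"B3")  # block after the loop
    # Third probe: measure the execution result at the exit.
    engine.measure("result", repr(count).encode())
    return count


active = get_user_count__probe([{"active": True}, {"active": False}])
```

The sequence of `"block"` entries in `engine.log` reflects the control flow actually taken, which is the kind of runtime information the second probe is meant to capture.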
In summary, the embodiments of the invention provide a program instrumentation method and device for verifying the execution process of a data API. The method locates the program segment corresponding to a target data API in the overall executable program of the data supply end; copies the functions in that program segment and generates a program segment to be instrumented for the target data API based on the calling relations among those functions; and inserts instrumentation code into the program segment to be instrumented to obtain the target executable program of the target data API. Compared with instrumenting the whole executable program, instrumenting only the copied program segment of the target data API narrows the instrumentation range and therefore reduces the code expansion rate. Because the instrumentation is applied to the copy, the original program segment is left unchanged, which avoids degrading the performance of other business functions. In addition, by measuring the executable program, the execution process and the execution result of the data API requested by the data consumer through this precise instrumentation, both the accuracy of the shared data itself and the correctness and authenticity of the process that generated it can be comprehensively checked and then verified by the data consumer.
Meanwhile, in the embodiment of the invention, the original, non-instrumented program segment can be executed when the executable program does not need to be measured, and it can likewise be used when other business functions reuse some of the methods in the target executable program. Since the original program segment is not modified, its operation is unaffected, and the running performance of other business functions that depend on it is not degraded.
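The choice between the two code paths can be sketched as a simple dispatch. Everything here is hypothetical: `get_total` stands for an original function, `get_total__probe` for its instrumented copy, and the `needs_measurement` flag for whatever signal the data supply end uses to decide that a request must be measured.

```python
measurements = []


def get_total(items):
    """Original function: no probes, no measurement overhead."""
    return sum(items)


def get_total__probe(items):
    """Instrumented copy with the same logic plus (simplified) probes."""
    measurements.append("entry")
    total = sum(items)
    measurements.append("exit")
    return total


def serve(items, needs_measurement):
    # Only requests that must be verified go through the instrumented copy;
    # every other caller keeps using the untouched original.
    handler = get_total__probe if needs_measurement else get_total
    return handler(items)
```

Both paths return the same result; only the instrumented path leaves a measurement trail.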
Optionally, step 101 may include the following steps 1011 to 1014:
Step 1011, executing the target data API based on the call parameters of the target data API.
In the embodiment of the invention, the call parameters of the target data API may be preset or supplied by a user.
Step 1012, obtaining an execution result of the target data API and execution process record information, where the execution process record information includes names, input parameters and return values corresponding to all functions executed in the execution process of the target data API.
In the embodiment of the invention, in the execution process of the target data API, the record information of the execution process of the target data API is recorded, and the output of the last function is obtained after the execution of the target data API is completed, so that the execution result of the target data API is obtained. The execution process record information may include names, input parameters and return values corresponding to all functions executed in the execution process of the target data API.
In the embodiment of the present invention, each function has a unique identifier, where the identifier has different expression modes in different programming languages, and the embodiment of the present invention is not limited thereto. In the embodiment of the invention, the identifier is denoted by a "function name".
In the embodiment of the invention, every function has input parameters and a return value, so recording the input parameters and the return value of each function captures all calls and process information that occur during execution. Each time a function is executed, the data supply end records the name, the input parameters and the return value of that function; after the execution of the target data API is completed, the resulting set of names, input parameters and return values constitutes the execution process record information.
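One simple way to collect such records in a high-level language is a tracing decorator, sketched below. The decorator `traced`, the list `trace_log`, and the three sample functions are hypothetical; the description does not say how the recording is implemented, and a binary-level tracer would work differently. Note that this sketch appends a record when a function returns, so callees appear in the log before their callers.

```python
import functools

trace_log = []


def traced(func):
    """Record name, input parameters and return value of every call."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        trace_log.append(
            {"name": func.__name__, "inputs": args, "kwargs": kwargs, "return": result}
        )
        return result

    return wrapper


@traced
def parse(query):
    return query.strip().lower()


@traced
def lookup(key):
    return {"alice": 30}.get(key)


@traced
def get_age(query):
    return lookup(parse(query))


get_age(" Alice ")
```

After the call, `trace_log` holds one record per executed function, which corresponds to the execution process record information described above.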
Step 1013, determining the name of the entry function and the name of the exit function of the program fragment corresponding to the target data API based on the call parameter, the execution result and the execution process record information.
In the embodiment of the invention, the name of the entry function and the name of the exit function of the program segment corresponding to the target data API are determined from the similarity between the call parameters of the target data API and the recorded input parameters, and between the execution result of the target data API and the recorded return values. Specifically, the name of the entry function and the name of the exit function may be determined as follows:
Step 1013a, searching an input parameter with the maximum similarity to the calling parameter in the input parameters of the execution process record information, and determining the name of the function corresponding to the searched input parameter as the name of the entry function of the program fragment corresponding to the target data API.
In the embodiment of the present invention, the execution process record information may include input parameters of a plurality of functions, an input parameter having the greatest similarity with a call parameter of the target data API is searched from the plurality of input parameters, a name of a function corresponding to the input parameter having the greatest similarity is determined, and the name of the function is determined as a name of an entry function of a program fragment corresponding to the target data API.
Step 1013b, searching a return value with the maximum similarity to the execution result in the return values of the execution process record information, and determining the name of the function corresponding to the searched return value as the name of the exit function of the program fragment corresponding to the target data API.
In the embodiment of the present invention, the execution process record information may include return values of a plurality of functions, a return value with the greatest similarity to the execution result of the target data API is searched from the plurality of return values, the name of the function with the greatest similarity is determined, and the name of the function is determined as the name of the exit function of the program fragment corresponding to the target data API.
It should be noted that, while the execution process record information is being recorded, other business functions may concurrently call some of the functions used by the data API. Such calls are unrelated to the execution of the data API itself, yet they are recorded along with it. By matching the call parameters and the execution result of the target data API against the execution process record information, functions that were recorded only because of multi-thread concurrency and are unrelated to the execution of the target data API can be excluded from the records, leaving only the functions related to the execution of the target data API.
In the embodiment of the invention, matching the call parameters and the execution result of the target data API against the execution process record information allows the entry and exit functions of the program segment corresponding to the target data API to be identified quickly among a large number of recorded functions.
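Steps 1013a and 1013b can be sketched as two maximum-similarity searches over the recorded calls. The description does not name a similarity measure, so this sketch assumes a textual one: `difflib.SequenceMatcher` from the Python standard library applied to the `repr` of each value. The helper names and the sample records are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Assumed metric: textual similarity of the printed representations.
    return SequenceMatcher(None, repr(a), repr(b)).ratio()


def find_entry_and_exit(call_params, result, records):
    """Return (entry function name, exit function name)."""
    entry = max(records, key=lambda r: similarity(r["inputs"], call_params))
    exit_ = max(records, key=lambda r: similarity(r["return"], result))
    return entry["name"], exit_["name"]


# Hypothetical execution process record information.
records = [
    {"name": "route", "inputs": ("/users/42",), "return": None},
    {"name": "get_user", "inputs": (42,), "return": {"id": 42, "name": "Ann"}},
    {"name": "load_row", "inputs": ("users", 42), "return": (42, "Ann")},
    {"name": "to_json", "inputs": ({"id": 42, "name": "Ann"},),
     "return": '{"id": 42, "name": "Ann"}'},
]

entry_name, exit_name = find_entry_and_exit(
    (42,), '{"id": 42, "name": "Ann"}', records
)
```

In this example `get_user` receives exactly the call parameters and `to_json` returns exactly the execution result, so they are selected as the entry and exit functions. Records from unrelated concurrent calls would score lower on both searches and would not be selected.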
Step 1014, determining the program fragment corresponding to the target data API in the whole executable program of the data supply terminal according to the name of the entry function and the name of the exit function of the program fragment corresponding to the target data API.
In the embodiment of the invention, according to the name of the entry function and the name of the exit function, the program segment corresponding to the target data API is determined in the whole executable program of the data supply end. Further, determining the program segment corresponding to the target data API may be implemented by:
Step 1014a, obtaining a function call relationship graph corresponding to the whole executable program.
In the embodiment of the invention, the whole executable program of the data supply end may be statically analyzed in advance, without executing the program, to generate a function call relationship graph. The function call relationship graph may be stored in a database and/or stored on a server in the form of a file. Static analysis of the whole executable program may include scanning it with techniques such as lexical analysis, syntax analysis, control flow analysis and data flow analysis to obtain the function call relationship graph; the embodiment of the invention is not limited in this respect.
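For intuition, the sketch below builds a small call graph by static analysis of source text, without running it. It swaps in Python source and the standard `ast` module for the executable-level analysis the description has in mind, and it only resolves direct calls by simple name, so it is far less complete than a production analyzer. The function `build_call_graph` and the sample source are hypothetical.

```python
import ast


def build_call_graph(source: str) -> dict:
    """Map each function defined in `source` to the defined functions it calls."""
    tree = ast.parse(source)
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    graph = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            callees = {
                child.func.id
                for child in ast.walk(node)
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name)
            }
            # Keep only calls to functions defined in the analyzed program.
            graph[node.name] = sorted(callees & defined)
    return graph


source = '''
def get_user(uid):
    row = load_row(uid)
    if row is None:
        return not_found(uid)
    return to_json(row)

def load_row(uid):
    return None

def not_found(uid):
    return "{}"

def to_json(row):
    return str(row)
'''

graph = build_call_graph(source)
```

The graph includes the edge from `get_user` to `not_found` even though a particular run might never take that branch, which is the property that makes static analysis a useful complement to recorded executions.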
In the embodiment of the present invention, the function call relationship graph may be obtained by reading the graph stored in the database or on the server, or by performing static analysis on the whole executable program of the data supply end online; the embodiment of the invention is not limited in this respect.
Step 1014b, locating a first function node in the function call relationship graph that matches the name of the entry function, and locating a second function node in the function call relationship graph that matches the name of the exit function.
In the embodiment of the invention, a first function node is positioned according to the name of the entry function in the function call relation diagram, and the first function node is used for representing the entry function of the program segment corresponding to the target data API. And locating a second function node in the function call relation diagram according to the name of the exit function, wherein the second function node is used for representing the exit function of the program segment corresponding to the target data API.
In the embodiment of the invention, the function call relationship graph comprises a plurality of function nodes, and each function node corresponds to a unique function. The information associated with each function node includes the name of the function represented by the node, which may be set during the code development stage. The names of the entry function and the exit function are compared with the names of the functions represented by the function nodes in the graph to find the first function node, corresponding to the name of the entry function, and the second function node, corresponding to the name of the exit function.
Step 1014c, determining a set of functions represented by the first function node, functions represented by the second function node, and functions represented by all function nodes having call relations between the first function node and the second function node as program fragments corresponding to the target data API.
In the embodiment of the invention, a subgraph representing the program segment corresponding to the target data API is determined in the function call relationship graph according to the first function node and the second function node. Within this subgraph, the function represented by the first function node, the function represented by the second function node, and the functions represented by all function nodes having call relationships between the first function node and the second function node together form the program segment corresponding to the target data API. The subgraph may be determined as follows: starting from the first function node and following the function execution logic, all function nodes that are connected to both the first function node and the second function node and that have a direct or indirect call relationship with the first function node are determined in sequence; this set of function nodes, together with the first function node and the second function node, forms the subgraph representing the program segment corresponding to the target data API.
In the embodiment of the invention, function calls are recorded through dynamic analysis of the target data API. However, some branches of the functions in the target data API may not be executed in a given run, so the functions called in those branches are not recorded. Extending the dynamic analysis with static analysis yields the complete program segment corresponding to the target data API, which is then available for the subsequent measurement of the executable program, execution process and execution result of the target data API. Compared with executing the target data API multiple times and dynamically analyzing it to obtain all of its functions, the method of the embodiment reduces the computational complexity and cost and improves the efficiency of determining the program segment corresponding to the target data API.
In summary, execution process record information is recorded while the target data API executes, and the names of the entry function and the exit function of the program segment corresponding to the target data API are determined based on the call parameters, the execution result and the execution process record information, so as to obtain the program segment corresponding to the target data API within the overall executable program. In this way, the methods involved in the execution of the target data API can be located accurately and conveniently, and the efficiency of determining the executable program segment is improved.
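The subgraph determination described above can be sketched as follows. This is only an illustration under stated assumptions: the call graph is modelled as a dictionary mapping each caller name to its callee names, and "function nodes having call relationships between the first and second function nodes" is interpreted as the nodes that are reachable from the entry node and from which the exit node is reachable. The names `reachable`, `reverse` and `program_segment`, and the sample graph, are hypothetical.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from start by following edges (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for callee in graph.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def reverse(graph):
    """Build the reversed graph (callee -> callers)."""
    rev = {}
    for caller, callees in graph.items():
        rev.setdefault(caller, set())
        for callee in callees:
            rev.setdefault(callee, set()).add(caller)
    return rev


def program_segment(graph, entry_name, exit_name):
    """Nodes lying on some call path from the entry function to the exit function."""
    nodes = set(graph) | {c for callees in graph.values() for c in callees}
    if entry_name not in nodes or exit_name not in nodes:
        raise KeyError("entry or exit function not found in call graph")
    forward = reachable(graph, entry_name)          # called directly or indirectly by entry
    backward = reachable(reverse(graph), exit_name)  # can directly or indirectly reach exit
    return forward & backward


# Hypothetical call graph: M2 is the entry function, M5 the exit function.
call_graph = {
    "M1": ["M2"],
    "M2": ["M3", "M5"],
    "M3": ["M5"],
    "M4": ["M5"],
    "M5": ["M6"],
}
segment = program_segment(call_graph, "M2", "M5")
```

In this sample graph, M1 and M4 are excluded because the entry function never calls them, and M6 is excluded because it lies beyond the exit function.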
Optionally, the program segment includes a plurality of functions.
Accordingly, step 102 may include the following steps 1021 through 1022:
Step 1021, copying all functions in the program segment.
In the embodiment of the invention, all functions in the program fragment of the target data API are duplicated to construct a code bypass with equivalent execution effect for instrumentation. The copied function can be copied into the class of the original function, and the function copying can be realized through a program instrumentation tool.
Step 1022, reconstructing the call relationships among the copied functions according to the call relationships among the functions in the program segment, to generate the program segment to be instrumented.
In the embodiment of the invention, copying a function copies only the function itself. After copying, the call instructions in the copied function still point to the function addresses used by the original function, that is, the original function and its copy still call the same function addresses. Therefore, the call relationships among the copied functions need to be reconstructed to generate the program segment to be instrumented, so as to ensure that the execution logic and execution effect of the program segment to be instrumented are the same as those of the program segment corresponding to the target data API.
In the embodiment of the invention, according to the call relationships among functions shown in the subgraph of the program segment corresponding to the target data API, all instructions that define function calls in that program segment may be traversed to find the instructions that define calls between original functions. For each such instruction, it is determined whether the call relationship it defines is one between functions that have been copied in the subgraph. If so, the running address of the original function in the instruction is replaced with the running address of the function obtained by copying that original function, and the copied instruction defining the function call relationship is replaced with the instruction obtained after the address replacement.
In the embodiment of the invention, current programs generally adopt a modular design, so code reuse within a program is unavoidable. If the original target code in the program were instrumented directly, the measurement operations could interfere with the execution of other business functions that share that code, and the execution of those business functions could in turn cause measurement errors during execution of the target code, degrading the measurement. Therefore, a code bypass is constructed and instrumentation is performed on the copied program segment to be instrumented of the target data API, so that the original program segment is unaffected and the performance of other business functions is not impacted by the instrumentation.
The copying and reconstructing steps are described with reference to fig. 2 and fig. 3, where fig. 2 shows a schematic diagram of a call relationship provided by an embodiment of the present invention, and fig. 3 shows a schematic diagram of the call relationship after reconstruction provided by an embodiment of the present invention. As shown in fig. 2, assume that M2, M5 and the call relationship between M2 and M5 constitute the program segment corresponding to the target data API, and that a code "bypass" is to be constructed. First, M2 and M5 are copied to obtain _M2 and _M5. At this point the copied _M2 still calls M5, and no call relationship exists between _M2 and _M5, so the call relationship between _M2 and _M5 must be reconstructed. According to the call relationship between M2 and M5 shown in the subgraph of the program segment corresponding to the target data API in the function call relationship graph, all instructions that define function calls in that program segment are traversed to find the instruction defining the call from M2 to M5; the running address of M5 in that instruction is replaced with the running address of _M5, and the result replaces the copied instruction that defines the call from _M2 to M5. Thus _M2, _M5 and the call relationship between _M2 and _M5 shown in fig. 3 are obtained, forming a new code "bypass".
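The M2/M5 example can be mimicked with a small model. The sketch below is a simplification under stated assumptions: functions are represented only by their names and lists of callee names, and "replacing the running address" is modelled as replacing a callee name. The helper `build_bypass` and the `_` prefix convention are hypothetical; a real instrumentation tool would rewrite call instructions in bytecode or binary code.

```python
def build_bypass(calls, segment, prefix="_"):
    """Copy every function in `segment` and redirect calls between the copies.

    calls:   mapping of function name -> list of callee names
    segment: names of the functions in the target data API's program segment
    """
    bypass = dict(calls)  # original functions are left untouched
    for name in segment:
        # Step 1021: copy the function; the copy initially keeps the original callees.
        copied_callees = list(calls.get(name, []))
        # Step 1022: redirect calls that target another copied function to its copy.
        bypass[prefix + name] = [
            prefix + callee if callee in segment else callee
            for callee in copied_callees
        ]
    return bypass


# Hypothetical call relationships loosely following fig. 2.
calls = {"M1": ["M2"], "M2": ["M5"], "M5": ["M6"], "M6": []}
rebuilt = build_bypass(calls, {"M2", "M5"})
```

After the rewrite, `_M2` calls `_M5` while `M2` still calls `M5`, so probes inserted into the copies do not affect other callers of the original functions.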
Optionally, step 103 may include the following steps 1031 to 1033:
Step 1031, inserting a first instrumentation code at the entry of the program segment to be instrumented of the target data API, where the first instrumentation code is configured to trigger the trusted measurement engine to measure the executable program of the target data API.
In the embodiment of the invention, before measuring the target data API, the data supply end performs instrumentation of measurement logic on the program segment to be instrumented of the target data API so as to measure the executable program, the execution process and the execution result of the target data API. Specifically, by inserting a first instrumentation code at an entry of a program segment to be instrumented of the target data API, the first instrumentation code is configured to trigger a trusted measurement engine to measure an executable program of the target data API, thereby obtaining a measurement value of the executable program.
Alternatively, the first instrumentation code may include code for initializing the run-time measurement and code for triggering the executable measurement. For example, the first instrumentation code may send a "Hello" message (start message) to the trusted measurement engine to trigger the trusted measurement engine to initialize the measurement and to measure the static executable of the target data API.
Step 1032, inserting a second instrumentation code at an entry of each basic block in the to-be-instrumented program segment of the target data API, where the second instrumentation code is configured to trigger the trusted measurement engine to measure an execution process of the target data API.
In the embodiment of the present invention, a second instrumentation code may be inserted at an entry of each basic block in a to-be-instrumented program segment of a target data API, where the second instrumentation code is configured to trigger the trusted measurement engine to measure an execution process of the target data API, so as to obtain the measurement value of the execution process, where one target data API may include multiple basic blocks.
Optionally, the entry of each basic block in the executable program of the target data API may be the position immediately before the first line of code of that basic block.
Step 1033, inserting a third instrumentation code at the exit of the program segment to be instrumented of the target data API, where the third instrumentation code is configured to trigger the trusted measurement engine to measure the execution result of the target data API.
In the embodiment of the invention, the third instrumentation code is inserted at the exit of the program segment to be instrumented of the target data API, and is used to trigger the trusted measurement engine to measure the execution result of the target data API, so as to obtain the measurement value of the execution result.
Optionally, the third instrumentation code may include code for measuring the execution result of the target data API, and code for ending the measurement, acquiring the data source authenticity proof report generated by the trusted measurement engine, and returning that report to the data use end together with the execution result of the target data API. For example, the third instrumentation code may send a "Goodbye" message (termination message) at the exit of the target data API to trigger the trusted measurement engine to measure the execution result of the target data API and then sign all of the obtained measurement results, generating a data source authenticity proof report for verification by the data use end.
Optionally, the instrumentation of the program segment to be instrumented of the target data API may be implemented by a corresponding instrumentation tool, which may provide the user with an interface for issuing instrumentation instructions. The instrumentation tool may include Pin and DynamoRIO for binary code, ASM for Java bytecode, Mono.Cecil for C# IL, and the like, which is not limited in the embodiment of the invention.
Optionally, the instrumentation may be implemented by inserting probe code representing different measurement logic at three types of instrumentation points: the entry of the program segment to be instrumented of the target data API, the entry of each basic block in that program segment, and the exit of that program segment. In one implementation, the first, second and third instrumentation codes may be inserted in response to the user triggering the three types of instrumentation points; alternatively, the instrumentation tool may perform the instrumentation automatically after receiving an instrumentation instruction. The embodiment of the invention is not limited in this respect.
In one implementation, the Java bytecode file corresponding to the target data API, obtained by compiling the Java source file, may be read by a preset bytecode processing tool, which may be the ASM tool. All methods in the Java bytecode file corresponding to the target data API are traversed, and the instrumentation API in ASM is used to insert instrumentation code at the entry and exit of the Java bytecode corresponding to the target data API and at the entry of every basic block. The instrumentation code may be provided in advance to the instrumentation API in ASM. For example, the corresponding instrumentation code is inserted before the first line of code of the entry method, before the first line of code of each basic block, and before the return statement of the exit method. It will be appreciated that the above steps are described for Java; instrumentation of data APIs developed in other programming languages is similar and is not repeated here.
In the embodiment of the invention, three instrumentation codes, namely the first instrumentation code, the second instrumentation code and the third instrumentation code, are inserted at three types of instrumentation points (the entry of the program segment to be instrumented of the target data API, the entry of each basic block in that program segment, and the exit of that program segment) to obtain the target executable program of the target data API. Based on the target executable program, the data supply end can interact with the trusted measurement engine at specific execution stages and use the trusted measurement engine to measure the executable program, the execution process and the execution result of the target data API.
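To make the three types of instrumentation points concrete, the following sketch shows what a small function might look like after the probes are in place. It is an illustration only: the probes are written by hand in source form, whereas the text describes inserting them with a tool such as ASM at the bytecode level. The class `RecordingEngine`, the function `instrumented_api` and the block numbering are hypothetical; only the "Hello" and "Goodbye" message names come from the description.

```python
class RecordingEngine:
    """Hypothetical stand-in for the trusted measurement engine; it only logs messages."""

    def __init__(self):
        self.messages = []

    def hello(self):
        # Start message sent by the first instrumentation code.
        self.messages.append(("Hello",))

    def block(self, block_id):
        # Basic block ID reported by the second instrumentation code.
        self.messages.append(("Block", block_id))

    def goodbye(self, result):
        # Termination message sent by the third instrumentation code.
        self.messages.append(("Goodbye", result))


def instrumented_api(engine, value):
    engine.hello()           # first instrumentation code: entry of the program segment
    engine.block(1)          # second instrumentation code: entry of basic block 1
    if value >= 0:
        engine.block(2)      # entry of basic block 2
        result = value * 2
    else:
        engine.block(3)      # entry of basic block 3
        result = 0
    engine.block(4)          # entry of basic block 4
    engine.goodbye(result)   # third instrumentation code: just before the return
    return result


engine = RecordingEngine()
output = instrumented_api(engine, 5)
```

The recorded message sequence reflects the path actually taken: block 3 never appears for a non-negative input.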
Further, in a data sharing scenario, the program instrumentation method for verifying the execution process of a data API allows code to be instrumented into the target data API accurately and with small side effects. To prove the authenticity of the shared data, the data supply end may, based on the target executable program, use the trusted measurement engine to measure the executable program, the execution process and the execution result of the target data API. The data use end receives the execution result and the proof report returned by the data supply end and verifies the content of the proof report against legal measurement results calculated in advance, thereby ensuring the correctness and authenticity of the obtained shared data.
Optionally, the data supply end may receive a data acquisition request sent by the data use end.
In the embodiment of the invention, the data supply end receives the data acquisition request sent by the data use end. The data acquisition request is used to acquire the shared data: it obtains the returned data result, namely the shared data requested by the data use end, by calling a data API publicly exposed by the data supply end. In other words, the data acquisition request in the present invention is an execution request for the data API. In a specific implementation of the invention, the data acquisition request may carry a relevant parameter indicating that the process and result of this data acquisition need to be proved. If the parameter is present, the data use end is requesting that the data supply end send a data source authenticity proof report for subsequent verification by the data use end; if the parameter is absent, the data supply end only needs to provide data to the data use end and does not need to send a data source authenticity proof report. The relevant parameter may specifically be a Boolean variable, which is not limited by the embodiments of the present disclosure.
In the embodiment of the invention, optionally, a random number (Nonce) may be attached to the data acquisition request. The random number attached to each data acquisition request may be used only once, which ensures that a verification result for the shared data cannot be reused and thereby defends against replay attacks.
For example, when the data use end wants to acquire authentic shared data, it sends a data acquisition request together with a random number (Nonce) to the data supply end. The data acquisition request includes the relevant parameter indicating that the process and result of this data acquisition need to be proved, so as to call the data API provided by the data supply end and require the data supply end to send a data source authenticity proof report for verification. The data use end and the data supply end each only need to remember the random numbers they have used, and the random number carried in each request must not repeat. If a request carries a previously used random number, the request is regarded as a replay attack, which prevents the data supply end from reusing a verification result.
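The one-time use of the random number can be illustrated with a minimal sketch. It assumes each party keeps an in-memory set of the nonces it has already seen; the class `NonceRegistry` and its method `accept` are hypothetical names, and a real deployment would also need persistence and expiry.

```python
import secrets


class NonceRegistry:
    """Remembers nonces that have already been used and rejects repeats."""

    def __init__(self):
        self._used = set()

    def accept(self, nonce):
        # A nonce seen before indicates a replayed request.
        if nonce in self._used:
            return False
        self._used.add(nonce)
        return True


registry = NonceRegistry()
nonce = secrets.token_hex(16)        # fresh random number attached to a request
first = registry.accept(nonce)       # first use of this nonce
replayed = registry.accept(nonce)    # same nonce presented again
```

The first request is accepted and the repeated one is rejected, which is the behaviour the replay defence relies on.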
Optionally, the data supply end may respond to the data acquisition request, execute a target data API indicated by the data acquisition request, and measure an executable program, an execution process and an execution result of the target data API by using a trusted measurement engine based on the executable program of the target data API, so as to obtain a measurement result; the target data API is configured to provide the shared data requested by the data acquisition request.
In the embodiment of the invention, a data supply end may publicly expose a plurality of accessible data APIs. The data use end initiates a data acquisition request to a target data API according to the published data API information. The target data API may be one of the plurality of data APIs. Specifically, the data acquisition request may carry data API indication information, for example an identifier of the data API, so that the data supply end can determine, based on the data acquisition request, the target data API requested by the data use end this time. The target data API is used to provide the shared data requested by the data acquisition request. In the online proving stage, the data supply end responds to the data acquisition request sent by the data use end and, based on the executable program of the target data API, uses the trusted measurement engine to measure the executable program, the dynamic execution process and the execution result of the target data API, so as to obtain the corresponding measurement result. Here, the executable program refers to the files that can be loaded and executed by an operating system and the dynamic link libraries on which those files depend. For example, under the Windows operating system, the executable program may be a file of the .exe type, and a dynamic link library may be a file of the .dll type.
It should be noted that the trusted measurement engine is a trusted component and may be any platform capable of providing a secure running environment for attestation services. It may be a Trusted Execution Environment (TEE) independent of the data use end, or a Trusted Platform Module (TPM) integrated into the data use end, which is not particularly limited in the present invention. Optionally, the trusted measurement engine may be a program that is certified under the TEE standard and deployed within an Intel Software Guard Extensions (SGX) TEE. The TEE may also be ARM TrustZone, AMD SEV, and the like, which is not limited here. The TEE is a separate secure operating environment disposed within the data supply end and logically isolated from the Rich Execution Environment (REE) of the data supply end; the REE uses a secure communication mechanism to obtain the services of the trusted measurement engine in the TEE. The TEE provides a secure execution environment for the trusted measurement engine while also guaranteeing the confidentiality, integrity and access control of the trusted measurement engine's resources and data. In other words, all operations performed within the TEE are trusted, and therefore measurements made with the trusted measurement engine are also trusted.
Specifically, the data supply end may calculate, based on the trusted measurement engine, a hash value of the executable program of the target data API, to obtain the measurement value of the executable program.
In the embodiment of the invention, based on a first instrumentation code at the entry of an executable program of a target data API, a trusted measurement engine is triggered to calculate a hash value of a static code of the target data API, and the obtained hash value is an executable program measured value. By measuring the executable program of the target data API, it can be verified in a subsequent verification whether the executable program of the target data API has been maliciously modified.
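As a rough illustration of the executable program measurement, the sketch below hashes the bytes of a list of files (the executable and the libraries it depends on). The description does not name a hash algorithm for this step, so BLAKE2s is an assumption here, as are the function name `measure_executable` and the choice to process files in sorted path order for reproducibility.

```python
import hashlib
import os
import tempfile


def measure_executable(paths):
    """Return one hex digest covering the contents of all given files."""
    digest = hashlib.blake2s()  # assumed algorithm; not specified for this step
    for path in sorted(paths):  # fixed order so the measurement is reproducible
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(65536), b""):
                digest.update(chunk)
    return digest.hexdigest()


# Demonstration with a temporary file standing in for the executable program.
with tempfile.TemporaryDirectory() as workdir:
    program = os.path.join(workdir, "api.bin")
    with open(program, "wb") as handle:
        handle.write(b"original code")
    baseline = measure_executable([program])

    with open(program, "wb") as handle:
        handle.write(b"modified code")
    tampered = measure_executable([program])
```

Any change to the file contents yields a different measurement value, which is what allows a verifier to detect a modified executable.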
The data supply end can calculate a cumulative hash value of an execution sequence of a basic block in the target data API based on the trusted measurement engine in the process of executing the target data API, so as to obtain the measurement value of the execution process.
In the embodiment of the present invention, based on the second instrumentation code at the entry of each basic block in the executable program of the target data API, the cumulative hash value of the execution path traversed by the target data API during execution is recorded. This is essentially the cumulative hash value of the execution sequence of the basic blocks in the target data API, and the resulting cumulative hash value is the execution process measurement value. A basic block is a piece of code that contains no jump instructions or jump targets in its interior. A basic block has exactly one entry and one exit: no other place in the program can bypass the entry and jump directly to a non-entry location in the basic block, and only its last instruction can transfer execution to another basic block.
Alternatively, the second instrumentation code may include code for reporting the basic block ID to the measurement engine. For example, the second instrumentation code may send the basic block ID of each basic block to the trusted measurement engine to trigger measurement of the execution of the target data API by the trusted measurement engine based on the received basic block ID.
Accordingly, calculating the cumulative hash value of the basic block execution sequence of the target data API may be achieved by:
1) The data supply end can sequentially send a unique basic block ID representing the basic block to the trusted measurement engine based on the execution sequence of the basic block in the target data API in the execution process of the target data API; the target data API comprises a plurality of basic blocks.
In the execution process of the target data API, each time a new basic block is executed, the data supply end sends a unique basic block ID corresponding to the current basic block to the trusted measurement engine, the basic block ID can be preassigned and used for representing the identification of the basic block, the basic block ID and the basic block are in one-to-one correspondence, different basic blocks correspond to different IDs, and each basic block is provided with the unique basic block ID. Therefore, by sending the unique basic block ID corresponding to the current basic block to the trusted measurement engine, the execution situation of the program can be known clearly by only recording the basic block IDs corresponding to the basic blocks in sequence in the process of recording the execution sequence of the basic blocks.
2) The data supply end may, based on the trusted measurement engine, sequentially update the accumulated hash value representing the execution sequence of the basic blocks according to the received basic block IDs, and determine the accumulated hash value as the execution process measurement value.
In the embodiment of the invention, when the trusted measurement engine receives the basic block ID of the current basic block sent by the data supply end, the accumulated hash value representing the execution sequence of the basic block is updated in sequence according to the received basic block ID, and the accumulated hash value is determined as the execution process measurement value.
It should be noted that the cumulative hash value may be calculated as: $h_{cur} = H(h_{prev}, id_{cur})$
where $h_{cur}$ is the current accumulated hash value, $h_{prev}$ is the previously accumulated hash value, and $id_{cur}$ is the current basic block ID.
Illustratively, suppose N basic blocks need to be traversed during one execution of the target data API. When execution reaches the first basic block in the target data API, the previously accumulated hash value $h_{prev}$ is taken as 0, and the trusted measurement engine performs a hash calculation on 0 and the ID of the first basic block received from the data supply end to obtain a first hash value. When execution reaches the second basic block in the target data API, the trusted measurement engine performs a hash calculation on the received ID of the second basic block and the first hash value to obtain a second hash value, which is then the current accumulated hash value. Similarly, when execution reaches the Nth basic block in the target data API, an Nth hash value is obtained. The Nth hash value is the accumulated hash value of the execution sequence of all basic blocks in the target data API, that is, the execution process measurement value.
In the embodiment of the invention, by recording the accumulated hash value of the execution sequence of the basic block in the target data API, whether the execution process of the target data API is tampered or not can be proved in the follow-up verification.
In the embodiment of the present invention, the calculation of the accumulated hash value may be performed by using a BLAKE2s library, which is not limited in the embodiment of the present invention.
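The accumulation can be sketched with the BLAKE2s implementation in Python's standard `hashlib` module. The byte encodings are assumptions, since the description does not fix them: the initial value 0 is represented as 32 zero bytes, each basic block ID is an integer encoded as 8 big-endian bytes, and H is applied to the concatenation of the two. The function names `update` and `measure_path` are hypothetical.

```python
import hashlib


def update(h_prev: bytes, block_id: int) -> bytes:
    """One step of h_cur = H(h_prev, id_cur), with H taken to be BLAKE2s."""
    return hashlib.blake2s(h_prev + block_id.to_bytes(8, "big")).digest()


def measure_path(block_ids):
    """Fold a sequence of basic block IDs into a single accumulated hash."""
    h = bytes(32)  # assumed encoding of the initial value 0
    for block_id in block_ids:
        h = update(h, block_id)
    return h


path_a = measure_path([1, 2, 4])   # one execution path
path_b = measure_path([1, 3, 4])   # a different branch taken
swapped = measure_path([2, 1, 4])  # same blocks as path_a, different order
```

Because each step feeds the previous hash into the next, both the set of executed blocks and their order affect the final value, so `path_a`, `path_b` and `swapped` all differ.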
The data supply end can calculate the hash value of the execution result based on the trusted measurement engine to obtain the measurement value of the execution result.
Based on the third instrumentation code at the exit of the executable program of the target data API, a hash value of the execution result of the target data API is calculated, and the obtained hash value is the execution result measurement value. By calculating the hash value of the execution result, it can be proved in subsequent verification whether the execution result was tampered with after the data supply end returned it and before the data use end received it.
In the embodiment of the invention, by measuring the executable program, the execution process and the execution result of the target data API, verification of the measurement result can prove whether any of the three has been tampered with. In particular, monitoring the data generation process solves the problem of data source forgery, which existing data authenticity guarantee schemes based on distributed ledgers and oracles cannot solve. This guarantees the correctness and authenticity of the process by which the data supply end provides shared data externally, and improves the authenticity and reliability of the shared data.
Optionally, the data provider may generate a data source authenticity proof report for the shared data based on the measurement result and a data signature of the trusted measurement engine.
In the embodiment of the invention, the data signature of the trusted measurement engine can be created through the private key of the trusted measurement engine, and the private key is always safely stored by the trusted measurement engine.
In the embodiment of the invention, the data source authenticity report consists of a measurement result obtained by measurement of the trusted measurement engine and a data signature of the trusted measurement engine. The data source authenticity report is used to verify the authenticity of the shared data generation process as well as the shared data itself. Wherein the data signature may also be marked with a time when the data source authenticity report was signed, and the data signature is invalidated if the data source authenticity report changes after the signature time. The invention is not limited in this regard.
Alternatively, the trusted measurement engine may sign the measurement with a hardware-protected private key in the TEE and generate a data source attestation report. Different TEEs correspond to different private keys and, accordingly, different TEEs correspond to different public keys paired with private keys, in other words, each TEE has a unique private-public key pair.
Optionally, the data supply end may return the execution result of the target data API and the data source authenticity proof report to the data use end; the execution result includes the shared data, and the data user terminal performs a verification operation based on the executable program legal measurement value and the execution process legal measurement value set acquired in advance and the data source authenticity proof report to verify the shared data.
In the embodiment of the invention, after the data supply end generates the data source authenticity report of the shared data, the data supply end returns the execution result of the target data API (namely the shared data requested by the data acquisition request) and the data source authenticity report to the data use end for verification by the data use end.
In the embodiment of the invention, after the data use end receives the data source authenticity proof report, it can compare the executable program legal measurement value and the execution process legal measurement value set, both acquired offline in advance at the data use end, with the measurement result in the data source authenticity proof report, so as to verify the authenticity of the shared data and the correctness and authenticity of the data generation process. The shared data is used only if it passes the verification.
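The comparison performed by the data use end might look like the sketch below. It covers only the measurement checks; verification of the trusted measurement engine's signature is assumed to have been done separately with the engine's public key and is omitted. The report layout (keys `executable`, `process`, `result`), the function name `verify_report`, the use of BLAKE2s for the result hash and the sample values are all hypothetical.

```python
import hashlib


def verify_report(report, result_bytes, legal_executable, legal_process_set):
    """Check a report's measurement values against pre-acquired legal values."""
    # The executable measurement must equal the single legal value.
    if report["executable"] != legal_executable:
        return False
    # The execution process measurement must be one of the legal execution paths.
    if report["process"] not in legal_process_set:
        return False
    # The received result must hash to the measured execution result.
    return report["result"] == hashlib.blake2s(result_bytes).hexdigest()


shared_data = b'{"rows": 3}'
report = {
    "executable": "exe-hash",   # placeholder measurement values
    "process": "path-A",
    "result": hashlib.blake2s(shared_data).hexdigest(),
}
legal_paths = {"path-A", "path-B"}

accepted = verify_report(report, shared_data, "exe-hash", legal_paths)
rejected = verify_report(report, b'{"rows": 4}', "exe-hash", legal_paths)
```

The execution process value is checked against a set rather than a single value because different legitimate inputs can drive the API along different basic block sequences.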
Correspondingly, optionally, the data using end may send a data obtaining request to the data supplying end; the data acquisition request is used for requesting shared data provided by a target data API and a data source authenticity proof report of the shared data.
In the embodiment of the invention, the data use end sends a data acquisition request to the data supply end when it needs to acquire shared data from the data supply end. Specifically, the data use end may initiate the data acquisition request through the data API access information contained in the data entity of an interface-type data object that the data supply end exposes externally. In a specific implementation of the invention, the data acquisition request may carry a parameter indicating that the process and the result of the data acquisition need to be verified, thereby requiring the data supply end to provide a data source authenticity proof report for the data use end to verify. The parameter may be, for example, a Boolean variable; the embodiment of the invention is not limited in this regard.
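As a minimal sketch of such a request, the following Python fragment models the Boolean verification flag. The class name `DataAcquisitionRequest`, its fields, and the API name `get_orders` are hypothetical and chosen only for illustration; the invention does not prescribe a request format.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DataAcquisitionRequest:
    """Hypothetical request sent by the data use end to the data supply end."""
    api_name: str                                   # target data API to call
    call_params: Dict[str, Any] = field(default_factory=dict)
    require_proof: bool = False                     # Boolean flag: ask for a proof report


def needs_proof_report(request: DataAcquisitionRequest) -> bool:
    """Supply-end check: must the trusted measurement engine produce a report?"""
    return request.require_proof


# Example: the data use end asks for shared data together with a proof report.
request = DataAcquisitionRequest("get_orders", {"customer_id": 7}, require_proof=True)
```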
Optionally, the data use end may receive the data source authenticity proof report and the execution result of the target data API that the data supply end returns in response to the data acquisition request. The execution result includes the shared data, and the data source authenticity proof report is generated by the data supply end based on the measurement results obtained when the trusted measurement engine measures the executable program, the execution process and the execution result of the target data API, together with the data signature of the trusted measurement engine.
In the embodiment of the invention, the data use end receives the data source authenticity proof report and the execution result of the target data API returned by the data supply end in response to the data acquisition request. The execution result of the target data API may be the shared data requested by the data use end. The processes of obtaining the measurement result and of generating the data source authenticity proof report are similar to those described above for the data supply end and are not described in detail here.
Optionally, the data use end may perform a verification operation based on the pre-acquired executable program legal measurement value, the pre-acquired execution process legal measurement value set, and the data source authenticity proof report, so as to verify the shared data.
In the embodiment of the invention, after receiving the data source authenticity proof report, the data use end may compare the executable program legal measurement value and the execution process legal measurement value set, both acquired offline in advance at the data use end, with the measurement result in the data source authenticity proof report, so as to verify the authenticity of the shared data.
Optionally, the method for verifying the shared data may further include the following steps:
1) Obtain the executable program of the target data API from the data supply end.
In the embodiment of the invention, the data use end needs to acquire the executable program of the target data API from the data supply end. A first instrumentation code may be placed at the entry of the executable program of the target data API, a second instrumentation code may be placed at the entry of each basic block, and a third instrumentation code may be placed at the exit of the executable program of the target data API.
Optionally, the data use end may instrument the original executable program of the target data API by using the foregoing instrumentation method, so as to obtain the instrumented executable program of the target data API.
2) Calculate the hash value of the executable program of the target data API to obtain the executable program legal measurement value.
In the embodiment of the invention, in the offline preprocessing stage, the data use end measures the executable program of the target data API. Specifically, the executable program legal measurement value may be obtained by calculating the hash value of the executable program. By comparing the executable program legal measurement value with the executable program measurement value in the data source authenticity proof report, it can be determined whether the executable program of the target data API has been maliciously modified.
3) Calculate the cumulative hash values of all legal execution paths in the executable program of the target data API to obtain the execution process legal measurement value set.
In the embodiment of the invention, the data use end calculates the cumulative hash value of each legal execution path from the entry to the exit of the target data API, obtaining the execution process legal measurement value set. By comparing this set with the execution process measurement value in the data source authenticity proof report, it can be determined whether the execution process of the target data API has been attacked or tampered with. To obtain all legal execution paths from the entry to the exit of the target data API, static analysis may be performed on the executable program of the target data API to generate a static control flow graph; the execution paths present in the static control flow graph are then collected, yielding all potential legal execution paths of the target data API. Specifically, the executable program of the target data API may be divided into basic blocks as follows: the executable program is divided into at least one class, each class is divided into at least one function, and each function is divided into at least one basic block. Each basic block is identified by a basic block ID, so the static control flow graph includes the ID of each basic block.
In the embodiment of the invention, the static analysis of the executable program of the target data API may include scanning the executable program by techniques such as lexical analysis, syntax analysis, control flow analysis and data flow analysis to obtain the static control flow graph; the embodiment of the invention is not limited in this regard.
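Collecting the legal execution paths from a static control flow graph can be sketched as a depth-first traversal. The sketch below assumes the graph is already available as a mapping from basic block ID to successor IDs; the names `ControlFlowGraph` and `enumerate_legal_paths` and the block IDs are hypothetical. It also assumes that a block is visited at most once per path so that the enumeration terminates; how loops are treated is not specified in this description.

```python
from typing import Dict, List

# Hypothetical static control flow graph: basic block ID -> successor block IDs.
ControlFlowGraph = Dict[str, List[str]]


def enumerate_legal_paths(cfg: ControlFlowGraph, entry: str, exit_: str) -> List[List[str]]:
    """Depth-first enumeration of all simple paths from entry to exit."""
    paths: List[List[str]] = []

    def walk(block: str, prefix: List[str]) -> None:
        path = prefix + [block]
        if block == exit_:
            paths.append(path)
            return
        for successor in cfg.get(block, []):
            if successor not in path:  # assumption: skip back edges so the walk ends
                walk(successor, path)

    walk(entry, [])
    return paths


# Example: a single if/else between entry block B0 and exit block B3.
cfg = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}
legal_paths = enumerate_legal_paths(cfg, "B0", "B3")
```

For this example graph the two branches of the conditional yield two legal execution paths.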
The data use end and the data supply end measure the same executable program. In one embodiment, the data use end measures the executable program and all legal execution paths of the target data API based on the instrumented target data API, and the data supply end likewise measures the executable program and the execution process of the target data API based on the instrumented target data API. This facilitates comparison with the measurement result obtained by the data supply end during subsequent verification, and also guards against the case where an attack alters the instrumentation code in the target data API and thereby affects the verification of the authenticity of the shared data.
It should be noted that steps 1)-4) may be performed before the data use end sends the data acquisition request to the data supply end. Analyzing the target data API in this offline stage makes it convenient to verify the data source authenticity proof report later and thus to determine the authenticity of the data source.
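The offline measurement of the executable program can be illustrated with a short Python sketch. SHA-256 is an assumption made here for illustration, since the description only requires a hash value, and the function names are hypothetical.

```python
import hashlib


def measure_executable(program_bytes: bytes) -> str:
    """Return the hash of the instrumented executable as a hex string."""
    return hashlib.sha256(program_bytes).hexdigest()  # SHA-256 is an assumed choice


def executable_unmodified(legal_value: str, reported_value: str) -> bool:
    """Compare the offline legal value with the value carried in the report."""
    return legal_value == reported_value


# Offline stage at the data use end: measure the instrumented program once.
legal_value = measure_executable(b"bytes of the instrumented target data API")
```

Because both ends hash the same instrumented program, any change to the program or to its instrumentation code changes the measured value.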
Optionally, a fourth instrumentation code is set at an entry of each basic block in the target data API in the data use end, where the fourth instrumentation code is used to provide a basic block ID of each basic block.
In this embodiment of the present invention, the fourth instrumentation code may be the same as the second instrumentation code described above, and is configured to provide a basic block ID of each basic block, so as to obtain an accumulated hash value of the legal execution path.
Accordingly, calculating the cumulative hash value of all legal execution paths in the executable program of the target data API may specifically include the following steps:
1) For any legal execution path, sequentially acquire, based on the fourth instrumentation code, the basic block IDs in the executable program that belong to the legal execution path.
In the embodiment of the invention, the target data API comprises a plurality of legal execution paths, and for any legal execution path, the basic block ID belonging to the legal execution path in the executable program of the target data API is sequentially provided based on the fourth instrumentation code according to the static control flow graph so as to provide the basic block ID for the trusted measurement engine. Wherein the fourth instrumentation code may include code for reporting the basic block ID to a trusted measurement engine. The basic block IDs of the legal execution paths may be sequentially sent by the data supply end to the trusted measurement engine according to the execution sequence of the legal execution paths.
2) Calculate the cumulative hash value of the legal execution path based on the sequentially acquired basic block IDs.
In the embodiment of the invention, for any legal execution path, based on the sequentially acquired basic block IDs, the accumulated hash value of the legal execution path is calculated according to the execution sequence of the basic blocks. All legal execution paths are calculated by the method to obtain a plurality of accumulated hash values, namely a legal measurement value set in the execution process. Since the process of calculating the cumulative hash value is similar to the process of calculating the cumulative hash value in steps 1022a to 1022b, the description thereof will be omitted.
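The two steps above can be sketched as follows. The exact cumulative hash construction is defined in the steps referenced above and is not reproduced in this part of the description, so the chaining used here (each new value is SHA-256 of the previous value concatenated with the next basic block ID, starting from 32 zero bytes) is an assumption for illustration only, as are the function names.

```python
import hashlib
from typing import Iterable, List, Set


def cumulative_hash(block_ids: Iterable[str]) -> str:
    """Fold basic block IDs, in execution order, into one hash value."""
    digest = b"\x00" * 32  # assumed initial value
    for block_id in block_ids:
        # Assumed chaining rule: h_i = SHA-256(h_{i-1} || ID_i)
        digest = hashlib.sha256(digest + block_id.encode("utf-8")).digest()
    return digest.hex()


def legal_measurement_set(paths: List[List[str]]) -> Set[str]:
    """One cumulative hash per legal execution path."""
    return {cumulative_hash(path) for path in paths}


# Example with two legal paths through hypothetical blocks B0..B3.
legal_set = legal_measurement_set([["B0", "B1", "B3"], ["B0", "B2", "B3"]])
```

Because each step depends on the previous value, the result reflects both which basic blocks were executed and the order in which they were executed.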
Optionally, the data supply end may generate the data source authenticity proof report based on the executable program measurement value, the execution process measurement value, the execution result measurement value, and the data signature of the trusted measurement engine. Based on this report, the data use end may perform a data signature verification operation, an executable program integrity verification operation, an execution process integrity verification operation, and an execution result integrity verification operation.
Optionally, the data signature of the trusted measurement engine is verified with the public key corresponding to the private key used to generate it, thereby implementing the data signature verification operation. Specifically, the data signature of the data source authenticity proof report is created with the private key of the trusted measurement engine, and the data use end checks the data signature with the public key corresponding to that private key to verify its authenticity. Successful verification proves that the data source authenticity proof report is authentic.
Optionally, the data use end performs the verification with the public key corresponding to the private key of the TEE. When the verification passes, it confirms that the data signature was produced with the private key inside the TEE, and hence that the data source authenticity proof report was generated by the trusted measurement engine.
The data use end may compare the executable program legal measurement value with the executable program measurement value to implement the executable program integrity verification operation. Specifically, the executable program legal measurement value obtained by the data use end is compared with the executable program measurement value in the data source authenticity proof report. If the two are the same, the executable program of the target data API has not been tampered with. This comparison verifies the integrity of the executable program and guards against erroneous shared data caused by a tampered executable program.
The data use end may compare the execution process legal measurement value set with the execution process measurement value to implement the execution process integrity verification operation. Specifically, the execution process legal measurement value set obtained by the data use end is compared with the execution process measurement value in the data source authenticity proof report. If the execution process measurement value equals any value in the execution process legal measurement value set, the execution process of the target data API has not been tampered with. This comparison verifies the integrity of the dynamic execution process and guards against control flow tampering caused by attacks during execution.
The data use end may compare the hash value of the execution result of the target data API with the execution result measurement value to implement the execution result integrity verification operation. Specifically, the data use end calculates the hash value of the execution result it received from the data supply end, namely the shared data it requested, and compares this hash value with the execution result measurement value in the data source authenticity proof report. If the two are the same, the execution result of the target data API was not tampered with between being sent by the data supply end and being received by the data use end. This comparison guards against the execution result being attacked and tampered with during the interaction between the data supply end and the data use end.
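The three integrity checks can be summarized in a short Python sketch. The report layout `ProofReport`, the function `verify_report` and the use of SHA-256 are assumptions for illustration. The data signature verification operation is deliberately left out, because it requires the public key of the TEE and an asymmetric signature scheme that this sketch does not model.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class ProofReport:
    """Hypothetical layout of the measurement part of the proof report."""
    executable_measurement: str
    execution_measurement: str
    result_measurement: str


def verify_report(report: ProofReport, legal_executable_value: str,
                  legal_execution_set: Set[str], received_result: bytes) -> Dict[str, bool]:
    """Run the three integrity checks (signature check not modeled here)."""
    return {
        # Executable program integrity: values must be identical.
        "executable": report.executable_measurement == legal_executable_value,
        # Execution process integrity: value must match one legal path.
        "execution": report.execution_measurement in legal_execution_set,
        # Execution result integrity: re-hash what was actually received.
        "result": report.result_measurement == hashlib.sha256(received_result).hexdigest(),
    }


# Example with placeholder measurement values.
shared_data = b'{"orders": [1, 2]}'
report = ProofReport("exe-hash", "path-hash-1", hashlib.sha256(shared_data).hexdigest())
checks = verify_report(report, "exe-hash", {"path-hash-1", "path-hash-2"}, shared_data)
usable = all(checks.values())  # shared data is used only if every check passes
```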
In summary, the data supply end inserts instrumentation code into the program segment to be instrumented to obtain the target executable program of the target data API, and, based on the target executable program, measures the executable program, the execution process and the execution result of the target data API by using the trusted measurement engine. When the data use end obtains the shared data, it verifies the execution result and the data source authenticity proof report, and can thus promptly determine whether the data generation process has been tampered with and whether the obtained execution result is correct. This ensures not only that the data provided by the data supply end is faithfully obtained by the data use end, but also that both the shared data and the process by which the data supply end generates it are correct and secure, further improving the authenticity of the shared data.
Fig. 4 is a block diagram of a program instrumentation device for verifying a data API execution process according to an embodiment of the present invention, and as shown in fig. 4, the device 30 is applied to a data supply end, and may specifically include:
The positioning module 301 is configured to locate the program segment corresponding to the target data API in the whole executable program of the data supply end.
The copying module 302 is configured to copy the functions in the program segment corresponding to the target data API, and to generate a program segment to be instrumented for the target data API based on the calling relations among the functions in the program segment.
The instrumentation module 303 is configured to insert instrumentation code into the program segment to be instrumented, so as to obtain the target executable program of the target data API; based on the target executable program, the data supply end measures the executable program, the execution process and the execution result of the target data API by using the trusted measurement engine.
The embodiment of the invention provides a program instrumentation device for verifying the execution process of a data API. The device locates the program segment corresponding to the target data API in the whole executable program of the data supply end; copies the functions in that program segment and generates a program segment to be instrumented for the target data API based on the calling relations among those functions; and inserts instrumentation code into the program segment to be instrumented to obtain the target executable program of the target data API. Compared with instrumenting the whole executable program, instrumenting only the copied program segment of the target data API narrows the instrumentation range and reduces the code expansion rate. Because the instrumentation is applied to the copy, the original program segment is unaffected, which avoids degrading the performance of other service functions. Meanwhile, this precise instrumentation allows the executable program, the execution process and the execution result of the data API requested by the data use end to be measured and then verified by the data use end, so that the correctness of the shared data itself and the correctness and authenticity of its generation process can be comprehensively verified.
Optionally, the positioning module includes:
The execution module is configured to execute the target data API based on the call parameters of the target data API.
The first acquisition module is configured to acquire the execution result of the target data API and execution process record information, wherein the execution process record information comprises the names, input parameters and return values of all functions executed during the execution of the target data API.
The first determining module is configured to determine the name of the entry function and the name of the exit function of the program segment corresponding to the target data API based on the call parameters, the execution result and the execution process record information.
The second determining module is configured to determine the program segment corresponding to the target data API in the whole executable program of the data supply end according to the name of the entry function and the name of the exit function of the program segment corresponding to the target data API.
Optionally, the first determining module includes:
The first searching module is configured to search the input parameters in the execution process record information for the input parameter with the greatest similarity to the call parameters, and to determine the name of the function corresponding to that input parameter as the name of the entry function of the program segment corresponding to the target data API.
The second searching module is configured to search the return values in the execution process record information for the return value with the greatest similarity to the execution result, and to determine the name of the function corresponding to that return value as the name of the exit function of the program segment corresponding to the target data API.
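The similarity search performed by the two searching modules can be sketched as follows. The description does not name a similarity measure, so the use of `difflib.SequenceMatcher` on string representations is an assumption, and the record format and function names in the example are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1]; an assumed stand-in for the real measure."""
    return SequenceMatcher(None, a, b).ratio()


def find_entry_and_exit(records: List[Dict[str, str]],
                        call_params: str, result: str) -> Tuple[str, str]:
    """Pick the functions whose input / return value best match the API call."""
    entry = max(records, key=lambda r: similarity(r["input"], call_params))
    exit_ = max(records, key=lambda r: similarity(r["return"], result))
    return entry["name"], exit_["name"]


# Hypothetical execution process record: name, input parameters, return value.
records = [
    {"name": "parse_request", "input": "raw bytes", "return": "customer_id=7"},
    {"name": "get_orders", "input": "customer_id=7", "return": "rows"},
    {"name": "to_json", "input": "rows", "return": '{"orders": [1, 2]}'},
]
entry_name, exit_name = find_entry_and_exit(records, "customer_id=7", '{"orders": [1, 2]}')
```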
Optionally, the second determining module includes:
The second acquisition module is configured to acquire the function call relationship graph corresponding to the whole executable program.
The first positioning sub-module is configured to locate a first function node in the function call relationship graph that matches the name of the entry function, and to locate a second function node in the function call relationship graph that matches the name of the exit function.
The first determining sub-module is configured to determine, as the program segment corresponding to the target data API, the set consisting of the function represented by the first function node, the function represented by the second function node, and the functions represented by all function nodes having calling relations between the first function node and the second function node.
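One way to read "function nodes having calling relations between the first function node and the second function node" is: every function that lies on some call path from the entry function to the exit function. Under that assumed reading, the program segment is the intersection of the nodes reachable from the entry node and the nodes from which the exit node is reachable. The graph and function names below are hypothetical.

```python
from typing import Dict, List, Set

CallGraph = Dict[str, List[str]]  # caller name -> callee names


def reachable(graph: CallGraph, start: str) -> Set[str]:
    """All nodes reachable from start, including start itself."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, []))
    return seen


def reverse(graph: CallGraph) -> CallGraph:
    """Flip every call edge so that callees point to their callers."""
    rev: CallGraph = {}
    for caller, callees in graph.items():
        rev.setdefault(caller, [])
        for callee in callees:
            rev.setdefault(callee, []).append(caller)
    return rev


def locate_fragment(graph: CallGraph, entry: str, exit_: str) -> Set[str]:
    """Functions on at least one call path from entry to exit."""
    return reachable(graph, entry) & reachable(reverse(graph), exit_)


call_graph = {
    "main": ["get_orders", "health"], "get_orders": ["query_db", "log"],
    "query_db": ["to_json"], "to_json": [], "log": [], "health": [],
}
fragment = locate_fragment(call_graph, "get_orders", "to_json")
```

In this example `log` is called by the entry function but does not lead to the exit function, so under this reading it stays outside the segment.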
Optionally, the replication module 302 includes:
The first replication sub-module is configured to replicate all functions in the program segment.
The calling relation reconstruction module is configured to reconstruct the calling relations among the copied functions according to the calling relations among the functions in the program segment, so as to generate the program segment to be instrumented.
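At the level of the call graph, copying and reconstruction can be sketched as renaming each function in the segment and redirecting the calls among the copies. The `_instr` suffix and the rule that calls to functions outside the segment keep pointing at the originals are assumptions made for this sketch; real replication would operate on the function bodies in the executable program.

```python
from typing import Dict, List, Set

CallGraph = Dict[str, List[str]]  # caller name -> callee names


def copy_fragment(graph: CallGraph, fragment: Set[str], suffix: str = "_instr") -> CallGraph:
    """Copy the segment's functions and rebuild the calls among the copies."""
    rename = {name: name + suffix for name in fragment}
    copied: CallGraph = {}
    for name in sorted(fragment):
        # Calls inside the segment go to the copies; other calls are unchanged.
        copied[rename[name]] = [rename.get(callee, callee) for callee in graph.get(name, [])]
    return copied


call_graph = {
    "get_orders": ["query_db", "log"], "query_db": ["to_json"],
    "to_json": [], "log": [],
}
to_instrument = copy_fragment(call_graph, {"get_orders", "query_db", "to_json"})
```

The original functions are left untouched, so other service functions that call them are not affected by the later instrumentation.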
Optionally, the instrumentation module 303 includes:
The first instrumentation sub-module is configured to insert first instrumentation code at the entry of the program segment to be instrumented of the target data API, the first instrumentation code being used to trigger the trusted measurement engine to measure the executable program of the target data API.
The second instrumentation sub-module is configured to insert second instrumentation code at the entry of each basic block in the program segment to be instrumented of the target data API, the second instrumentation code being used to trigger the trusted measurement engine to measure the execution process of the target data API.
The third instrumentation sub-module is configured to insert third instrumentation code at the exit of the program segment to be instrumented of the target data API, the third instrumentation code being used to trigger the trusted measurement engine to measure the execution result of the target data API.
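The effect of the three kinds of instrumentation code can be illustrated with a hand-instrumented Python function. `MockMeasurementEngine` is a hypothetical stand-in for the trusted measurement engine, which in the invention runs inside a TEE; the hash construction, the block IDs and the function `get_orders_instr` are likewise assumptions for illustration.

```python
import hashlib


class MockMeasurementEngine:
    """Hypothetical stand-in for the trusted measurement engine."""

    def __init__(self) -> None:
        self.executable_measurement = None
        self.execution_digest = b"\x00" * 32  # assumed initial value
        self.result_measurement = None

    def measure_executable(self, program_bytes: bytes) -> None:
        self.executable_measurement = hashlib.sha256(program_bytes).hexdigest()

    def measure_block(self, block_id: str) -> None:
        # Assumed chaining: fold each reported basic block ID into the digest.
        self.execution_digest = hashlib.sha256(
            self.execution_digest + block_id.encode("utf-8")).digest()

    def measure_result(self, result: bytes) -> None:
        self.result_measurement = hashlib.sha256(result).hexdigest()


def get_orders_instr(engine: MockMeasurementEngine, customer_id: int) -> bytes:
    engine.measure_executable(b"program bytes")  # first instrumentation code (entry)
    engine.measure_block("B0")                   # second instrumentation code
    if customer_id > 0:
        engine.measure_block("B1")               # second instrumentation code
        result = b"orders"
    else:
        engine.measure_block("B2")               # second instrumentation code
        result = b"none"
    engine.measure_block("B3")                   # second instrumentation code
    engine.measure_result(result)                # third instrumentation code (exit)
    return result


engine_a, engine_b = MockMeasurementEngine(), MockMeasurementEngine()
result_a = get_orders_instr(engine_a, 7)   # takes the B0, B1, B3 path
result_b = get_orders_instr(engine_b, -1)  # takes the B0, B2, B3 path
```

The two calls measure the same executable program but follow different execution paths, so their execution process measurements differ.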
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.
The algorithms and displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general-purpose systems may also be used with the teachings herein. The required structure for a construction of such a system is apparent from the description above. In addition, the present invention is not directed to any particular programming language. It will be appreciated that the teachings of the present invention described herein may be implemented in a variety of programming languages, and the above description of specific languages is provided for disclosure of enablement and best mode of the present invention.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the above description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be construed as reflecting the intention that: i.e., the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the apparatus of the embodiments may be adaptively changed and disposed in one or more apparatuses different from the embodiments. The modules or units or components of the embodiments may be combined into one module or unit or component and, furthermore, they may be divided into a plurality of sub-modules or sub-units or sub-components. Any combination of all features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or units of any method or apparatus so disclosed, may be used in combination, except insofar as at least some of such features and/or processes or units are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some or all of the components in a program instrumentation device according to the present invention may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present invention may also be implemented as an apparatus or device program for performing part or all of the methods described herein. Such a program embodying the present invention may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. do not denote any order. These words may be interpreted as names.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.

Claims (6)

1. A program instrumentation method for data API execution process validation, applied to a data supply end, the method comprising:
positioning a program segment corresponding to a target data API from the whole executable program of the data supply end; the target data API is used for providing shared data;
copying functions in the program fragments corresponding to the target data API, and generating a program fragment to be instrumented for the target data API based on calling relations among the functions in the program fragments; the execution logic and the execution effect of the program segment to be instrumented are the same as those of the program segment;
inserting instrumentation codes into the program segment to be instrumented to obtain a target executable program of the target data API; based on the target executable program, the data supply end measures the executable program, the execution process and the execution result of the target data API by using a trusted measurement engine;
the copying the function in the program segment corresponding to the target data API, and generating the program segment to be instrumented for the target data API based on the calling relation between the functions in the program segment, including:
copying all functions in the program fragment;
reconstructing the call relationship among the copied functions according to the call relationship among the functions in the program segment to generate the program segment to be instrumented;
inserting instrumentation codes into the segments of the program to be instrumented to obtain a target executable program of the target data API, wherein the method comprises the following steps:
inserting a first instrumentation code at an entry of a program segment to be instrumented of the target data API, the first instrumentation code being configured to trigger the trusted measurement engine to measure an executable of the target data API;
inserting a second instrumentation code at an entry of each basic block in a to-be-instrumented program segment of the target data API, wherein the second instrumentation code is used for triggering the trusted measurement engine to measure the execution process of the target data API;
inserting a third instrumentation code at the exit of the to-be-instrumented program segment of the target data API, wherein the third instrumentation code is used for triggering the trusted measurement engine to measure the execution result of the target data API.
2. The method according to claim 1, wherein the locating the program segment corresponding to the target data API from the whole executable program of the data supply end includes:
executing the target data API based on the call parameters of the target data API;
acquiring an execution result of the target data API and execution process record information, wherein the execution process record information comprises names, input parameters and return values corresponding to all functions executed in the execution process of the target data API;
determining the names of the entry functions and the names of the exit functions of the program fragments corresponding to the target data API based on the calling parameters, the execution results and the execution process record information;
and determining the program fragments corresponding to the target data API in the whole executable program of the data supply end according to the names of the entry functions and the exit functions of the program fragments corresponding to the target data API.
3. The method according to claim 2, wherein determining the name of the entry function and the name of the exit function of the program segment corresponding to the target data API based on the call parameters, the execution result and the execution process record information comprises:
searching the input parameters in the execution process record information for the input parameter with the greatest similarity to the call parameters, and determining the name of the function corresponding to the found input parameter as the name of the entry function of the program segment corresponding to the target data API; and
searching the return values in the execution process record information for the return value with the greatest similarity to the execution result, and determining the name of the function corresponding to the found return value as the name of the exit function of the program segment corresponding to the target data API.
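The maximum-similarity search of claim 3 can be sketched as follows. The claim does not name a similarity measure, so this example assumes a simple string similarity over `repr` text using the standard library's `difflib.SequenceMatcher`; the record layout and the function names in the sample data are likewise hypothetical.

```python
from difflib import SequenceMatcher
from typing import Any, Dict, List, Tuple


def similarity(a: Any, b: Any) -> float:
    # Assumed metric: ratio of matching characters between the textual forms.
    return SequenceMatcher(None, repr(a), repr(b)).ratio()


def find_entry_and_exit(call_params: Any, execution_result: Any,
                        records: List[Dict[str, Any]]) -> Tuple[str, str]:
    """Pick the functions whose inputs/return value best match the API call."""
    entry = max(records, key=lambda r: similarity(r["inputs"], call_params))
    exit_ = max(records, key=lambda r: similarity(r["return"], execution_result))
    return entry["name"], exit_["name"]


# Hypothetical execution process record information.
sample_records = [
    {"name": "parse_request", "inputs": {"user_id": 7}, "return": 7},
    {"name": "query_db", "inputs": {"key": 7}, "return": (7, "Ann")},
    {"name": "build_response", "inputs": {"row": (7, "Ann")},
     "return": {"id": 7, "name": "Ann"}},
]

names = find_entry_and_exit({"user_id": 7}, {"id": 7, "name": "Ann"}, sample_records)
```

Here `parse_request` is selected as the entry function and `build_response` as the exit function, because their recorded input and return value are identical to the call parameters and the execution result respectively. If several functions tie, `max` keeps the first one in call order, which is a design choice of this sketch only.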
4. The method according to claim 2, wherein determining the program segment corresponding to the target data API in the overall executable program of the data supply end according to the name of the entry function and the name of the exit function of the program segment corresponding to the target data API comprises:
acquiring a function call relationship graph corresponding to the overall executable program;
locating a first function node in the function call relationship graph that matches the name of the entry function, and locating a second function node in the function call relationship graph that matches the name of the exit function; and
determining the set consisting of the function represented by the first function node, the function represented by the second function node, and the functions represented by all function nodes having a calling relationship between the first function node and the second function node as the program segment corresponding to the target data API.
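One possible reading of claim 4 is that the program segment consists of every function lying on some call path from the entry function to the exit function. Under that assumption, the sketch below intersects the set of nodes reachable from the entry node with the set of nodes that can reach the exit node. The adjacency-dictionary representation of the call graph and all function names are hypothetical.

```python
from typing import Dict, List, Set


def reachable(graph: Dict[str, List[str]], start: str) -> Set[str]:
    """Return all nodes reachable from `start`, including `start` itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for successor in graph.get(node, []):
            if successor not in seen:
                seen.add(successor)
                stack.append(successor)
    return seen


def locate_segment(call_graph: Dict[str, List[str]], entry: str, exit_: str) -> Set[str]:
    """Functions on any call path from the entry function to the exit function."""
    # Reverse the edges so that "can reach exit_" becomes a forward search.
    reverse: Dict[str, List[str]] = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            reverse.setdefault(callee, []).append(caller)
    return reachable(call_graph, entry) & reachable(reverse, exit_)


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "main": ["parse_request", "log"],
    "parse_request": ["query_db", "audit"],
    "query_db": ["build_response"],
    "build_response": [],
    "audit": [],
    "log": [],
}

segment = locate_segment(call_graph, "parse_request", "build_response")
```

In this example `main`, `log` and `audit` are excluded: `main` and `log` are not reachable from the entry function, and `audit` cannot reach the exit function.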
5. A program instrumentation device for verifying the execution process of a data API, applied to a data supply end, the device comprising:
a positioning module, configured to locate a program segment corresponding to the target data API in the overall executable program of the data supply end, the target data API being used to provide shared data;
a replication module, configured to copy the functions in the program segment corresponding to the target data API and to generate a program segment to be instrumented for the target data API based on the calling relationships between the functions in the program segment, the execution logic and the execution effect of the program segment to be instrumented being the same as those of the program segment; and
an instrumentation module, configured to insert instrumentation code into the program segment to be instrumented to obtain a target executable program of the target data API, wherein, based on the target executable program, the data supply end uses a trusted measurement engine to measure the executable program, the execution process and the execution result of the target data API;
wherein the replication module comprises:
a first replication sub-module, configured to copy all functions in the program segment; and
a calling relationship reconstruction module, configured to reconstruct the calling relationships among the copied functions according to the calling relationships among the functions in the program segment, so as to generate the program segment to be instrumented;
and wherein the instrumentation module comprises:
a first instrumentation sub-module, configured to insert first instrumentation code at the entry of the program segment to be instrumented of the target data API, the first instrumentation code being used to trigger the trusted measurement engine to measure the executable program of the target data API;
a second instrumentation sub-module, configured to insert second instrumentation code at the entry of each basic block in the program segment to be instrumented of the target data API, the second instrumentation code being used to trigger the trusted measurement engine to measure the execution process of the target data API; and
a third instrumentation sub-module, configured to insert third instrumentation code at the exit of the program segment to be instrumented of the target data API, the third instrumentation code being used to trigger the trusted measurement engine to measure the execution result of the target data API.
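The replication module of claim 5 copies the functions of the segment and then rebuilds the calling relationships so that the copies call one another rather than the originals. A minimal Python sketch of that idea is given below, assuming the functions reference each other by global name; `replicate_segment` and the toy functions `double` and `api` are hypothetical, and the approach (rebinding names in a fresh globals dictionary via `types.FunctionType`) is only one way to model the step.

```python
import types
from typing import Callable, Dict


def replicate_segment(functions: Dict[str, Callable]) -> Dict[str, Callable]:
    """Copy every function and rewire calls so that copies call copies."""
    # Fresh namespace for the copies, seeded with what the originals could see.
    new_globals: dict = {}
    for fn in functions.values():
        new_globals.update(fn.__globals__)

    copies: Dict[str, Callable] = {}
    for name, fn in functions.items():
        # Same code object (same logic), but bound to the new namespace.
        copy = types.FunctionType(fn.__code__, new_globals, fn.__name__,
                                  fn.__defaults__, fn.__closure__)
        copy.__kwdefaults__ = fn.__kwdefaults__
        copies[name] = copy

    # Rebuild the calling relationships: each name now resolves to its copy.
    new_globals.update(copies)
    return copies


# Hypothetical two-function segment.
def double(x):
    return x * 2


def api(x):
    return double(x) + 1


copies = replicate_segment({"double": double, "api": api})
copied_result = copies["api"](3)
```

Because the copies share the original code objects, their execution logic and effect match those of the original segment, while any probes later added to the copies leave the original functions untouched.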
6. The device according to claim 5, wherein the positioning module comprises:
an execution module, configured to execute the target data API based on the call parameters of the target data API;
a first acquisition module, configured to acquire an execution result of the target data API and execution process record information, wherein the execution process record information comprises the name, input parameters and return value of each function executed during the execution of the target data API;
a first determining module, configured to determine the name of the entry function and the name of the exit function of the program segment corresponding to the target data API based on the call parameters, the execution result and the execution process record information; and
a second determining module, configured to determine the program segment corresponding to the target data API in the overall executable program of the data supply end according to the name of the entry function and the name of the exit function of the program segment corresponding to the target data API.
CN202210813432.9A 2022-07-12 2022-07-12 Program instrumentation method and device for verifying execution process of data API Active CN115221051B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210813432.9A CN115221051B (en) 2022-07-12 2022-07-12 Program instrumentation method and device for verifying execution process of data API

Publications (2)

Publication Number Publication Date
CN115221051A CN115221051A (en) 2022-10-21
CN115221051B CN115221051B (en) 2023-06-09

Family

ID=83612754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210813432.9A Active CN115221051B (en) 2022-07-12 2022-07-12 Program instrumentation method and device for verifying execution process of data API

Country Status (1)

Country Link
CN (1) CN115221051B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694320A (en) * 2018-05-15 2018-10-23 中国科学院信息工程研究所 The method and system of sensitive application dynamic measurement under a kind of more security contexts

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140006994A1 (en) * 2012-06-29 2014-01-02 Apple Inc. Device, Method, and Graphical User Interface for Displaying a Virtual Keyboard
CN103744782A (en) * 2014-01-02 2014-04-23 北京百度网讯科技有限公司 Method and device for acquiring program execution sequence
CN106649063B (en) * 2016-11-22 2020-11-17 腾讯科技(深圳)有限公司 Method and system for monitoring time-consuming data during program operation
KR102462864B1 (en) * 2017-12-22 2022-11-07 한국전자통신연구원 Apparatus and method for dynamic binary instrumentation using multi-core
CN112905184B (en) * 2021-01-08 2024-03-26 浙江大学 Pile-inserting-based reverse analysis method for industrial control protocol grammar under basic block granularity
CN114327491B (en) * 2022-03-07 2022-06-21 深圳开源互联网安全技术有限公司 Source code instrumentation method, apparatus, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant