CN116451228B - Dynamic taint tracking method, device and related online taint propagation analysis system - Google Patents

Info

Publication number
CN116451228B
Authority
CN
China
Prior art keywords
stain
taint
propagation
function
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310441463.0A
Other languages
Chinese (zh)
Other versions
CN116451228A (en
Inventor
张涛
宁戈
周雅飞
杜玉洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Anpro Information Technology Co ltd
Original Assignee
Beijing Anpro Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Anpro Information Technology Co ltd
Priority to CN202310441463.0A
Publication of CN116451228A
Application granted
Publication of CN116451228B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/362 Software debugging
    • G06F11/3644 Software debugging by instrumenting at runtime
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Virology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of the present application disclose a dynamic taint tracking method, a device, and a related online taint propagation analysis system. In the method, an Agent program is deployed and, while the target application program is running, instruments probes into a first input function, a first propagation function, and a first output function of the target application program. When the target application program processes request input, the probes monitor the execution of these functions and record the taint propagation track; after a taint convergence event is triggered, the related taint propagation path is promptly restored from the recorded track. This enables effective monitoring, immediate response, accurate analysis, and full coverage of the complex propagation of taint data within the target application program while it is running.

Description

Dynamic taint tracking method, device and related online taint propagation analysis system
Technical Field
The embodiments of the present application relate mainly to the technical field of software analysis, and more particularly to a dynamic taint tracking method, a dynamic taint tracking device, and a related online taint propagation analysis system.
Background
Taint analysis is an effective means of safeguarding information security and preventing damage to the integrity and confidentiality of information, and it is an important research direction in the field of network and information security. Taint analysis is essentially an information flow analysis technique whose key idea is to track how specific data (i.e., taint data) propagates during program execution, so as to support other analyses. Tracking of taint propagation is achieved mainly through taint propagation analysis.
Taint propagation analysis is the focus of research and practice in taint analysis. Depending on whether the program must be run during analysis, it is divided into a static taint propagation analysis mode and a dynamic taint propagation analysis mode. In the static mode, the data dependencies of program variables are analyzed on the program source code or its intermediate representation to determine the taint propagation paths in it, and in particular to detect whether taint data can propagate from a taint source to a taint convergence point. However, the core idea of static taint propagation analysis is to convert the taint propagation problem into a static data dependency analysis of the program, so it often has to traverse all of the source code or intermediate representation of the analyzed object, and it suffers from problems such as high cost and large error. In the dynamic mode, the propagation of taint data within the system is monitored in real time while the program runs, to detect whether taint data can propagate from a taint source to a taint convergence point. Among dynamic approaches, instrumentation-based dynamic taint propagation analysis is the best practice of taint propagation analysis technology: it requires no specific underlying hardware or virtual machine environment, it works closer to the source program level and can therefore support security policies with higher-level semantic logic, and it is widely used and well suited to taint propagation analysis of more abstract program logic.
However, instrumentation itself introduces overhead. Blindly distributing instrumentation points widely helps record richer details of program operation and taint propagation, but it imposes a huge overhead on the system and can make it unacceptably heavy. Conversely, some existing schemes adopt a reduced instrumentation strategy to cope with the overhead, but they only support passive tracking (and recording) of taint propagation along simple linear paths. How to track the complex propagation of taint data in an application program effectively and efficiently, in real time while the application program is running, has therefore become a technical problem to be solved in the practice of taint propagation analysis.
Disclosure of Invention
According to the embodiments of the present application, a dynamic taint tracking scheme is provided. Through reasonable instrumentation and corresponding security policies, the scheme tracks and restores the corresponding taint propagation paths, thereby achieving effective monitoring, immediate response, accurate analysis, and full coverage of the complex propagation of taint data in the target application program, so that suspected exploitable paths in the target application program can be identified quickly and efficiently.
In a first aspect of the present disclosure, a dynamic taint tracking method is provided. The method comprises: deploying an Agent program at a server running a target application program; while the target application program is running, instrumenting probes, through the Agent program, into a taint source function, a taint propagation function, and a taint convergence function of the target application program; when the target application program processes request input, monitoring the execution of the taint source function, the taint propagation function, and the taint convergence function through the probes, and recording the taint propagation track of the current request processing; and, after a taint convergence event is triggered, restoring in real time, from the taint propagation track, the taint propagation path to which the current taint convergence event belongs. The taint propagation track is recorded by having each probe acquire the event information available at its probe tangent point. Specifically, the event information may mainly include the event type of the event acquired by the current probe, the taint propagation track under the current tangent point, and function information. The event types comprise a taint source input event, a taint propagation event, and a taint convergence event, corresponding respectively to the taint source function, the taint propagation function, and the taint convergence function. The function information may include the qualified function name of the function into which the current probe is inserted.
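By way of illustration only, the event information described above could be modeled as a small record type. The following is a minimal sketch in Python, not the implementation of the disclosure; the class names `EventType`, `TangentPointTrack`, and `EventInfo`, the field names, and the function name `webapp.request.get_parameter` are hypothetical and do not appear in the source text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class EventType(Enum):
    """The three event types named in the first aspect."""
    SOURCE_INPUT = "taint source input event"
    PROPAGATION = "taint propagation event"
    CONVERGENCE = "taint convergence event"


@dataclass
class TangentPointTrack:
    """Taint propagation track observed at one probe tangent point (assumed layout)."""
    inputs: List[str] = field(default_factory=list)   # taint data entering the function
    outputs: List[str] = field(default_factory=list)  # original input or derived taint data


@dataclass
class EventInfo:
    """Event information acquired by one probe when its function is triggered."""
    event_type: EventType
    track: TangentPointTrack
    qualified_function_name: str


# A taint source input event: nothing flows in, untrusted data flows out.
source_event = EventInfo(
    event_type=EventType.SOURCE_INPUT,
    track=TangentPointTrack(outputs=["alice' OR '1'='1"]),
    qualified_function_name="webapp.request.get_parameter",  # hypothetical name
)
print(source_event.event_type.value, source_event.qualified_function_name)
```

Keeping the per-tangent-point track separate from the event type and function information mirrors the three components of the event information listed above.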
Optionally, in an implementation of the first aspect, the taint data at the probe tangent point may be acquired by the probe to describe the taint propagation track under the tangent point in the event information, where the taint data may include original input taint data/derived taint data and the corresponding propagation input taint data/convergence input taint data. The taint propagation track recording may thus further include: tracking, through the probes, the original input taint data/derived taint data/convergence input taint data at the probe tangent points of the executing taint source function, taint propagation function, and taint convergence function; capturing the taint source input event information/taint propagation event information/taint convergence event information associated with that taint data at the probe tangent point; and recording the event information and its identifier into a collection container. The taint source input event information/taint propagation event information/taint convergence event information each include the taint propagation track under the current tangent point, which may include: the original input taint data/derived taint data acquired by the probe and the corresponding propagation input taint data/convergence input taint data, or the unique feature information of the corresponding taint data. Correspondingly, the taint propagation path restoration may include: after (any) taint convergence event is triggered, backtracking the propagation of the related taint data according to the taint propagation tracks under the tangent points in the corresponding taint convergence event information, taint propagation event information, and taint source input event information in the collection container, and playing back the taint propagation path to which the currently triggered taint convergence event belongs.
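The recording and backtracking described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the disclosure: the collection container is assumed to be a dictionary keyed by the identifier of the taint data each event produced, and the names `Event`, `record`, `restore_path`, and the `webapp.*` function names are hypothetical.

```python
from collections import namedtuple

# Assumed event shape: inputs/outputs hold identifiers of taint data.
Event = namedtuple("Event", ["event_type", "function", "inputs", "outputs"])


def record(container, event):
    """Store the event under the identifier of every taint item it produced."""
    for taint_id in event.outputs:
        container[taint_id] = event


def restore_path(container, convergence_event):
    """Backtrack from a convergence event to its sources.

    Returns the function sequence with producers listed before consumers.
    """
    path, visited = [], set()

    def visit(event):
        if id(event) in visited:  # guard against revisiting shared producers
            return
        visited.add(id(event))
        for taint_id in event.inputs:
            producer = container.get(taint_id)
            if producer is not None:
                visit(producer)
        path.append(event.function)

    visit(convergence_event)
    return path


container = {}
record(container, Event("source", "webapp.get_parameter", (), ("t1",)))
record(container, Event("source", "webapp.get_header", (), ("t2",)))
record(container, Event("propagation", "webapp.concat", ("t1", "t2"), ("t3",)))
sink = Event("convergence", "webapp.execute_sql", ("t3",), ())
path = restore_path(container, sink)
print(path)
```

Because the propagation event in this example has two inputs, the restored path is not a simple linear chain, which is the kind of case the backtracking step is meant to handle.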
Further optionally, in one implementation of the foregoing implementation, the taint data tracked by the probes during taint propagation track recording may include character string data and the like.
Further optionally, in one implementation, during taint propagation track recording, the probes track the original input taint data/derived taint data/convergence input taint data at the probe tangent points of the executing taint source function, taint propagation function, and taint convergence function, and there may be one or more items of original input taint data/derived taint data/convergence input taint data at a single probe tangent point.
Further optionally, in one implementation, in the taint propagation event information recorded during taint propagation track recording, a single item of derived taint data in the taint propagation track under the tangent point may correspond to one or more items of propagation input taint data.
Further optionally, in a specific implementation of the foregoing implementation, the taint propagation track recording may further include recording into a tree data structure either the taint data (or the unique feature information of the taint data) in the taint propagation track under the tangent point, or that taint data (or unique feature information) together with the corresponding event type and/or function information.
Further optionally, in one specific implementation of the foregoing implementation, the identifier of the taint source input/taint propagation/taint convergence event information may, according to the event type, be the original input taint data/derived taint data/convergence input taint data, or the unique feature information of the corresponding taint data.
Still further optionally, in an implementation of the foregoing in which the event information identifier is the unique feature information of the taint data, the taint data (or its unique feature information) may correspondingly be omitted from the taint propagation track under the tangent point in the taint source input event information/taint propagation event information/taint convergence event information.
Further optionally, in one specific implementation of the foregoing implementation, the taint propagation path information may include the played-back function sequence of taint source functions/taint propagation functions/taint convergence functions, where the function information in the function sequence includes the qualified function name of each function.
Still further optionally, in a specific implementation of the foregoing implementation, the function information in the function sequence includes, in addition to the qualified function name, the function call stack of the function and/or the corresponding original input taint data/propagation input taint data and its derived data/convergence input taint data, where the derived data comprises the derived taint data contaminated by the propagation input taint data and/or the method parameters/return values containing that derived taint data.
With reference to the first possible implementation of the first aspect, or the third possible implementation of that first implementation, or the first possible implementation of that third implementation, in a specific implementation, the unique feature information of the taint data may include the storage address or digest information of the taint data.
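The two kinds of unique feature information mentioned above can be illustrated with a short Python sketch. It is offered under assumptions: the helper names `feature_by_address` and `feature_by_digest` are hypothetical, SHA-256 is merely one possible digest (the source does not name an algorithm), and the use of `id()` as a storage address relies on CPython behavior and on the object staying alive while it is tracked.

```python
import hashlib


def feature_by_address(value):
    """Identify taint data by object identity.

    In CPython, id() is the object's memory address; this is only stable
    while the object is alive (assumption: the tracker holds a reference).
    """
    return id(value)


def feature_by_digest(value: str) -> str:
    """Identify taint data by a digest of its content (SHA-256 chosen for illustration)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


payload = "id=1 OR 1=1"
same_content = "".join(["id=1 ", "OR 1=1"])  # equal text, built separately

# Equal content always yields an equal digest, regardless of which object holds it.
print(feature_by_digest(payload) == feature_by_digest(same_content))
```

An address-based identifier can distinguish two separate strings that happen to hold the same text, whereas a digest-based identifier treats them as the same taint data; which behavior is preferable depends on how the tracks are to be matched during backtracking.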
Optionally, in an implementation of the first aspect, while the target application program is running, and in particular during request processing, at least one taint source function/taint propagation function/taint convergence function may be triggered and have its related taint propagation track recorded by the probe.
Optionally, in an implementation of the first aspect, during request processing, a taint source function/taint propagation function/taint convergence function instrumented with a probe may be triggered multiple times, with the related taint propagation track recorded each time.
Optionally, in an implementation of the first aspect, the taint propagation functions may include encoding functions and decoding functions; in the dynamic taint propagation analysis process, encoding and decoding functions can be regarded as generalized propagation functions.
Optionally, in an implementation of the first aspect, some of the taint propagation functions may be user-defined as cleaning (sanitizing) functions, so that the related taint data is cleared of its taint at the probe tangent point and its further propagation is blocked.
Optionally, in an implementation of the first aspect, the probes are instrumented by the Agent program into the taint source function, taint propagation function, and taint convergence function of the target application program in either a dynamic instrumentation mode or a static instrumentation mode. Instrumenting probes in the dynamic instrumentation mode may include: deploying the Agent program at the server running the target application program, the Agent program inserting probes into the taint source function, taint propagation function, and taint convergence function of the target application program while it is running. Instrumenting probes in the static instrumentation mode may include: instrumenting the probes by modifying code or by compile-time instrumentation, where the Agent program contains the probe code or the probe binary file, the Agent program is deployed while the target application program is running, and the probes are instrumented into the taint source function, taint propagation function, and taint convergence function of the target application program through the Agent program.
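As a rough illustration of the dynamic instrumentation mode, a probe can be inserted at run time by replacing a function with a wrapper that records an event each time the function is triggered. The sketch below uses plain Python function wrapping as a stand-in; it is not the Agent mechanism of the disclosure, and `insert_probe`, `TRACE`, and the toy application functions are hypothetical names.

```python
import functools
import types

TRACE = []  # hypothetical collection container for recorded events


def insert_probe(owner, name, event_type):
    """Replace owner.<name> with a wrapper that records an event per call."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def probe(*args, **kwargs):
        result = original(*args, **kwargs)
        # Record what is visible at the tangent point: type, function, inputs, output.
        TRACE.append((event_type, original.__qualname__, args, result))
        return result

    setattr(owner, name, probe)
    return original  # returned so the probe could be removed again


# A toy target application (hypothetical functions).
def get_parameter(request):
    return request["name"]


def build_query(name):
    return "SELECT * FROM users WHERE name = '" + name + "'"


def execute_sql(sql):
    return "executed"


app = types.SimpleNamespace(
    get_parameter=get_parameter, build_query=build_query, execute_sql=execute_sql
)

# Probes are inserted while the "application" object already exists.
insert_probe(app, "get_parameter", "source")
insert_probe(app, "build_query", "propagation")
insert_probe(app, "execute_sql", "convergence")

app.execute_sql(app.build_query(app.get_parameter({"name": "x' OR '1'='1"})))
print([event[0] for event in TRACE])
```

Only the three selected functions are wrapped, which reflects the idea of limiting instrumentation points to source, propagation, and convergence functions rather than instrumenting the whole program.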
In a second aspect of the present disclosure, an online taint propagation analysis system is provided for performing the processes of dynamic taint tracking in the method of the first aspect and its various implementations. The system comprises a taint propagation track recording unit and a taint propagation path playback unit. The taint propagation track recording unit is used to record taint propagation tracks and is configured to: set an Agent program at a server running the target application program; instrument probes, while the target application program is running, into the taint source function, taint propagation function, and taint convergence function of the target application program to monitor the execution of those functions; and, during request processing by the target application program, monitor through the probes the execution of the taint source function, the taint propagation function, and the taint convergence function in the current request processing, so as to record the taint propagation track. The taint propagation path playback unit is used for online playback of taint propagation paths and is configured to: monitor the triggering of taint convergence events through the probes and, after a taint convergence event is triggered, immediately restore the taint propagation path to which the current taint convergence event belongs, according to the related taint propagation track recorded by the taint propagation track recording unit. The taint propagation track recording unit records the taint propagation track by acquiring, through each probe, the event information at that probe's tangent point, where the event information mainly comprises the event type of the event acquired by the current probe, the taint propagation track under the current tangent point, and function information; the event types comprise a taint source input event, a taint propagation event, and a taint convergence event; and the function information includes the qualified function name of the function into which the current probe is inserted.
In a third aspect of the present disclosure, a dynamic taint tracking device is provided. The device comprises: at least one processor, a memory coupled to the at least one processor, and a computer program stored in the memory; the processor executes the computer program to implement the dynamic taint tracking method of the first aspect.
In a fourth aspect of the present disclosure, a computer-readable storage medium is provided. The computer-readable storage medium stores computer instructions related to taint propagation analysis of software programs; when executed by a computer processor, the computer instructions implement the dynamic taint tracking method of the first aspect.
In a fifth aspect of the present disclosure, a computer program product is provided. The program product comprises a computer program which, when executed by a computer processor, implements the dynamic taint tracking method of the first aspect.
It should be understood that what is described in this summary is not intended to identify key or essential features of the embodiments of the disclosure, nor to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference numerals designate like or similar elements, wherein:
FIG. 1 illustrates a schematic diagram of a process of dynamic taint tracking as set forth in an embodiment of the present disclosure;
FIG. 2 illustrates a schematic diagram of a process of recording a taint propagation track in an embodiment of the present disclosure;
FIG. 3 illustrates a schematic diagram of a process of restoring a taint propagation path in an embodiment of the present disclosure;
FIG. 4 illustrates a block diagram of an online taint propagation analysis system as set forth in an embodiment of the present disclosure;
FIG. 5 illustrates a block diagram of a computing device capable of implementing various embodiments of the present disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
In describing embodiments of the present disclosure, the term "comprising" and its like should be taken to be open-ended, i.e., including, but not limited to. The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first," "second," and the like, may refer to different or the same object. Other explicit and implicit definitions are also possible below.
In the description of the embodiments of the present disclosure, the technical term "target application program" refers to any application program that is the target object. Implementations of the embodiments are generally applicable to software analysis and testing, where the "target application program" is the application software to be analyzed or tested. The technical term "taint source function" refers to a function that directly introduces taint data (untrusted data) into the system. "Taint convergence function" refers to a function that, as a result of taint data input and propagation, ultimately and directly performs a security-sensitive operation (violating data integrity) or leaks private data to the outside (violating data confidentiality). "Taint propagation function" refers to a function involved in propagating taint data between a "taint source function" and a "taint convergence function". In the research and practice of dynamic taint analysis, sanitization of taint data is handled differently from static taint analysis, where it is performed in step with the taint propagation analysis: in dynamic analysis, sanitization is kept independent of the dynamic taint propagation analysis and is analyzed separately after it. The "taint propagation function" in this disclosure is therefore a generalized one, referring mainly to a function that necessarily leads to taint data propagation when sanitization of the data is not considered. Furthermore, the related aspects of the present disclosure are directed at tracking the propagation of taint data at application run time, in other words mainly the propagation of taint data while the target application program processes request input. The "taint source function", "taint propagation function", and "taint convergence function" in this disclosure are therefore, more specifically, functions related to the application logic (not involving the operating system or platform beneath the application runtime environment). The technical term "probe tangent point", also referred to as "tangent point", refers to the interception point formed at the function into which a probe is inserted. Through the probe tangent point, the probe can acquire the information available there, including but not limited to the function information of the instrumented function and the method parameters (the inputs and outputs passed through the function) and return values when the function is passed through (i.e., triggered) during request processing. In software development, a "qualified function name" is a way of expressing function information in which a qualifier such as a namespace, class name, or module name restricts the function name to a specific scope, in order to avoid collisions with function names in other code. A qualified function name is generally expressed as an identifier of the scope that qualifies the function name, followed by the function (method) name. The form differs slightly between programming languages: for example, Java uses package names to qualify function names, Python uses module names, PHP uses namespaces, and so on. In summary, a qualified function name serves as a unique identifier of the function concerned and can be used across different programming languages and environments.
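As a concrete illustration of the Python case mentioned above, a qualified function name can be assembled from a function's module name and its qualified name within that module. The helper `qualified_name` is a hypothetical name introduced for this sketch; `__module__` and `__qualname__` are standard Python function attributes.

```python
import json


def qualified_name(func):
    """Return '<module>.<qualified name>' for a Python function or method."""
    return f"{func.__module__}.{func.__qualname__}"


# A module-level function and a method of a class in a submodule.
print(qualified_name(json.dumps))
print(qualified_name(json.JSONEncoder.encode))
```

Two functions that share a short name, such as an `encode` method on different classes, receive different qualified names, which is what allows the name to identify the instrumented function.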
Taint analysis has long been an effective means of safeguarding information security and preventing damage to the integrity and confidentiality of information. With the rapid development of the mobile internet and the continual iteration of Web application technology, related security problems such as vulnerability attacks on application software and leakage of private data keep emerging, and they have become hot spots and difficulties in current network security. In the practice of solving these security problems, and thanks to the development of related technologies such as program analysis, taint analysis can provide more accurate and efficient analysis and detection in the security analysis and testing of such application programs (e.g., Web applications and some mobile terminal applications).
In related practice, taint analysis mainly targets the data flows related to business logic while the target application program processes requests; in other words, the analysis is carried out mainly on the code and business-logic data running in the application runtime environment. Taint propagation analysis is the basis and key to accurate and efficient analysis and testing. As described above, taint propagation analysis is divided into a static mode and a dynamic mode. Static taint propagation analysis analyzes the data dependencies among program variables, without running or modifying the code, to detect whether (taint) data can propagate from a taint source to a taint convergence point. As also described above, the drawbacks of the static mode severely limit its use in the security analysis and testing of such application software. The dynamic mode, especially instrumentation-based dynamic taint propagation analysis, is clearly the best choice in that practice. Nevertheless, the problems listed above still have to be overcome in order to track, efficiently and at low cost, taint data that travels along complex paths during application execution.
According to embodiments of the present disclosure, a dynamic taint tracking scheme is presented to overcome the problems in tracking taint data propagation. In the scheme, an Agent program is deployed and, while the target application program is running, instruments probes into a first input function, a first propagation function, and a first output function of the target application program. When the target application program processes request input, the probes monitor the execution of these functions to record the taint propagation track, and the related taint propagation path is promptly restored from that track after a taint convergence event is triggered. Implementing the scheme helps software developers and testers monitor effectively, respond immediately to, analyze accurately, and cover comprehensively the complex propagation of taint data while the target application program is running, and on that basis quickly and efficiently find and identify suspected exploitable paths in the target application program.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of the dynamic taint tracking process proposed in an embodiment of the present disclosure. As shown in Fig. 1, the dynamic taint tracking process 100 mainly includes: 101, deploying an Agent program at the server running the target application program; while the target application program runs, inserting probes, through the Agent program, into the taint source functions, taint propagation functions and taint convergence functions of the target application program; and, when the target application program processes a request input, using the probes to monitor the execution of those functions and to record the taint propagation trajectory of the current request; and 102, using the probes at the same time to monitor for the triggering of a taint convergence event and, once one is triggered, immediately restoring the taint propagation path to which that event belongs from the taint propagation trajectory recorded so far.
At 101, in order to track taint propagation effectively along complex paths in the target application program, the taint propagation trajectory is recorded mainly by having each probe triggered by taint propagation collect the event information at its probe pointcut. Specifically, the event information may mainly include: the event type of the event captured by the current probe, the taint propagation trajectory at the current pointcut, and function information. The event types comprise a taint source input event, a taint propagation event and a taint convergence event, corresponding to the taint source function, the taint propagation function and the taint convergence function respectively. The function information may include the qualified function name of the function instrumented by the current probe; the qualified function name uniquely identifies that function. In a related embodiment, the taint propagation trajectory information is typically recorded in the context space of the same request, i.e., the context space of the current request.
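The event information described above can be pictured as a small record kept per request. The following Python sketch is only an illustration under stated assumptions: the names `EventType`, `TaintEvent`, `begin_request`, `record` and `current_events` are hypothetical and do not come from the disclosure, and a `contextvars.ContextVar` is used here as one possible stand-in for the "context space of the current request".

```python
import contextvars
from dataclasses import dataclass, field
from enum import Enum


class EventType(Enum):
    SOURCE = "taint_source_input"
    PROPAGATION = "taint_propagation"
    CONVERGENCE = "taint_convergence"


@dataclass
class TaintEvent:
    event_type: EventType
    qualified_name: str  # uniquely identifies the instrumented function
    trajectory: dict = field(default_factory=dict)  # taint data seen at this pointcut


# Assumption: one event list per request; a ContextVar keeps requests apart.
_request_events = contextvars.ContextVar("request_events")


def begin_request():
    """Open a fresh context space for the request now being processed."""
    _request_events.set([])


def record(event):
    """Append an event to the current request's context space."""
    _request_events.get().append(event)


def current_events():
    """Return a copy of the events recorded for the current request."""
    return list(_request_events.get())
```

A probe would call `record(...)` each time taint data passes through its pointcut, so that all events belonging to one request end up in the same list.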
In some embodiments, at 101, each probe acquires the taint data at its pointcut and uses it to describe the pointcut-level taint propagation trajectory in the event information it produces; the taint data may be original input taint data or derived taint data, together with the corresponding propagation input taint data or convergence input taint data. Fig. 2 shows a schematic diagram of the process of recording a taint propagation trajectory in some of these embodiments. As shown in Fig. 2, the process 200 of recording a taint propagation trajectory may include: 201, using the probes to track the original input taint data/derived taint data/convergence input taint data at the probe pointcuts of the executing taint source functions, taint propagation functions and taint convergence functions; that is, each time a taint source function/taint propagation function/taint convergence function is triggered (i.e., whenever data passes through it), inspecting the function's input and capturing and tracking the original input taint data, the propagation input taint data and its derived taint data, or the convergence input taint data at the current probe pointcut, where propagation input taint data and its derived taint data are tracked by capturing the propagation input taint data in the function's current input and then, according to a pre-configuration, obtaining the derived taint data from the corresponding method parameter or return value (i.e., the function's current output); and 202, for each such inspection, if taint data is captured and tracked, generating the taint source input event information/taint propagation event information/taint convergence event information associated with that taint data at the probe pointcut from the captured taint data information together with the other information obtained by the probe, and recording the event information and its identifier in a collection container. Each of the taint source input event information/taint propagation event information/taint convergence event information contains the taint propagation trajectory at the current pointcut, which may include: the original input taint data or derived taint data obtained by the probe, and the propagation input taint data or convergence input taint data obtained by the probe, or the unique feature information of the corresponding taint data. As to whether the pointcut-level trajectory is described by the taint data itself or by its unique feature information, the unique feature information may be preferred when the probe can capture only primitive-type data. This is because primitive-type data can be used directly without creating an object, so in practice two or more captured primitive values that are equal are hard to tell apart; when the probe captures only primitive-type data, the pointcut-level taint propagation trajectory can therefore be described by unique feature information about the taint data.
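Steps 201 and 202 can be sketched as three probe callbacks that write into a collection container. This is a minimal Python illustration, not the disclosed implementation: `on_source`, `on_propagation`, `on_convergence` and the dictionary layout are hypothetical, and Python object identity (`id`) is assumed as the event identifier.

```python
collection = {}  # identifier -> event information (the "collection container")


def ident(obj):
    # Assumption: object identity stands in for the taint data identifier.
    return id(obj)


def on_source(function_name, output):
    """Taint source pointcut: the returned value is original input taint data."""
    collection[ident(output)] = {"type": "source", "function": function_name,
                                 "inputs": [], "output": output}


def on_propagation(function_name, inputs, output):
    """Propagation pointcut: record only if some input is already tracked."""
    tainted = [ident(x) for x in inputs if ident(x) in collection]
    if tainted:
        collection[ident(output)] = {"type": "propagation", "function": function_name,
                                     "inputs": tainted, "output": output}


def on_convergence(function_name, inputs):
    """Convergence pointcut: returns True when a taint convergence event fires."""
    tainted = [ident(x) for x in inputs if ident(x) in collection]
    if tainted:
        collection[("convergence", function_name, tuple(tainted))] = {
            "type": "convergence", "function": function_name, "inputs": tainted}
    return bool(tainted)
```

Each stored event keeps a reference to its output, so the tracked objects stay alive and their identities remain valid for the duration of the request.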
Corresponding to the process 200 of recording the taint propagation trajectory in the above embodiments, in related embodiments, at 102, the associated taint propagation path may likewise be restored from the taint data in the pointcut-level taint propagation trajectories contained in the taint source input event information/taint propagation event information/taint convergence event information that the probes obtained at their pointcuts. Fig. 3 shows a schematic diagram of the process of restoring a taint propagation path in some of these embodiments. As shown in Fig. 3, the process 300 of restoring a taint propagation path may include: 301, tracking the convergence input taint data at the probe pointcut of an executing taint convergence function (i.e., capturing convergence input taint data in the taint convergence function), so that the probe monitors the triggering of taint convergence events in real time and the restoration of the taint propagation path can be initiated promptly; and 302, after the probe detects that a taint convergence event has been triggered, backtracking from the taint data in the corresponding taint convergence event information in the collection container to find the upstream taint events, continuing to backtrack through their event information and the taint data they contain, and thereby replaying the taint propagation trajectory to which the triggered taint convergence event belongs.
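The backtracking in step 302 can be illustrated as a walk from the convergence event upstream through the collection container. The sketch below is an assumption-laden example: `restore_paths`, the event dictionaries and the taint identifiers `t1` to `t3` are hypothetical, and the recorded events are assumed to form an acyclic graph.

```python
def restore_paths(collection, convergence_event):
    """Walk upstream from a convergence event and return every
    source-to-convergence function sequence that reaches it."""
    paths = []

    def walk(event, downstream):
        sequence = [event["function"]] + downstream
        if event["type"] == "source":
            paths.append(sequence)
            return
        for taint_id in event["inputs"]:
            upstream = collection.get(taint_id)
            if upstream is not None:  # untracked inputs are simply skipped
                walk(upstream, sequence)

    walk(convergence_event, [])
    return paths


# Hypothetical recorded events, keyed by taint data identifier.
collection = {
    "t1": {"type": "source", "function": "web.get_parameter", "inputs": []},
    "t2": {"type": "source", "function": "web.get_header", "inputs": []},
    "t3": {"type": "propagation", "function": "str.concat", "inputs": ["t1", "t2"]},
}
convergence = {"type": "convergence", "function": "db.execute", "inputs": ["t3"]}
paths = restore_paths(collection, convergence)
```

Because `t3` was derived from two propagation inputs, the walk yields two paths, one per taint source, which matches the idea that a single convergence event may belong to several propagation paths.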
Additionally, in specific implementations of some of the above embodiments, the taint data tracked by the probes while recording the taint propagation trajectory may include string data and the like; monitoring and propagation analysis of taint in string data objects and the like helps uncover security risks associated with string data.
Additionally, in specific implementations of some of the above embodiments, while the taint propagation trajectory is being recorded and the probes track the original input taint data/derived taint data/convergence input taint data at the probe pointcuts of the executing taint source functions, taint propagation functions and taint convergence functions, there may be one or more items of such data at a single pointcut. Multiple items of taint data at the same probe pointcut are a manifestation of taint propagating along complex paths while the target application program runs, and are also a root cause of complex taint propagation paths; capturing the taint data information at each probe pointcut as completely as possible reduces missed reports of taint propagation paths.
Additionally, in specific implementations of some of the above embodiments, in the taint propagation event information recorded as described above, a single item of derived taint data in the pointcut-level taint propagation trajectory may correspond to one or more items of propagation input taint data. Contamination of the same derived taint data by several propagation inputs is likewise a manifestation of taint propagating along complex paths while the target application program runs; capturing the taint data information at each probe pointcut as completely as possible reduces missed reports when the associated taint propagation path is restored.
Additionally, in specific implementations of the foregoing embodiments, recording the taint propagation trajectory may further include: recording the taint data, or the unique feature information of the taint data, in the pointcut-level taint propagation trajectory into a tree data structure, either on its own or together with the corresponding event type and/or function information. Recording into a tree data structure represents the hierarchy of the whole taint propagation process, and the contamination relationships between functions, more intuitively.
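One way to picture the tree data structure is a node per item of taint data, with derived taint data attached beneath the data it was derived from. The Python sketch below is illustrative only; `TaintNode`, `render` and the sample function names are hypothetical and not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class TaintNode:
    taint_id: str      # taint data or its unique feature information
    event_type: str    # "source", "propagation" or "convergence"
    function: str      # qualified name of the instrumented function
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach taint data derived from this node and return the child."""
        self.children.append(child)
        return child


def render(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + f"{node.function} [{node.event_type}] -> {node.taint_id}"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


root = TaintNode("t1", "source", "web.get_parameter")
decoded = root.add(TaintNode("t2", "propagation", "codec.url_decode"))
decoded.add(TaintNode("t3", "convergence", "db.execute"))
```

Printing `"\n".join(render(root))` shows the propagation hierarchy with one indentation level per hop.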
Additionally, in specific implementations of some of the foregoing embodiments, the identifier of the taint source input event information/taint propagation event information/taint convergence event information may, according to the event type, include: the original input taint data/derived taint data/convergence input taint data, or the unique feature information of the corresponding taint data. Such an identifier allows the relevant taint data to be looked up quickly during the subsequent restoration of the taint propagation path, improving backtracking efficiency.
Further, in specific implementations of the foregoing embodiments, when the taint data or its unique feature information serves as the identifier of the event information, the identifying taint data or unique feature information may be omitted from the pointcut-level taint propagation trajectory in the corresponding taint source input event information/taint propagation event information/taint convergence event information. Omitting it further reduces the data recorded in the collection container, lowering storage cost and making the data easier to upload, record and query.
Additionally, in specific implementations of some of the foregoing embodiments in which unique feature information of the taint data describes the pointcut-level taint propagation trajectory in the event information, or additionally serves as the identifier of the event information, the unique feature information of the taint data may include, but is not limited to: the storage address of the taint data, digest information and the like. Describing the trajectory with such information, or using it as the identifier of the event information, mainly addresses the problem that items of taint data otherwise cannot be told apart, which makes taint tracking difficult to achieve.
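The difference between a storage address and digest information can be shown with two equal values held in separate objects. This Python sketch is an analogy only: it assumes CPython, where `id()` returns the object's memory address, and the helper names `storage_identity`, `digest` and `unique_feature` are hypothetical.

```python
import hashlib


def storage_identity(obj):
    # Assumption: CPython, where id() is the address of the object in memory.
    return id(obj)


def digest(value):
    """Content digest: equal values always produce the same digest."""
    return hashlib.sha256(repr(value).encode("utf-8")).hexdigest()


def unique_feature(obj):
    """Combine both features so that equal values held in distinct
    objects can still be told apart."""
    return (storage_identity(obj), digest(obj))


first = "".join(["pay", "load"])   # built at run time: a separate object
second = "".join(["payl", "oad"])  # same value, different object
```

Here `first` and `second` share a digest but differ in storage identity, so a tracker keyed on `unique_feature` treats them as two separate items of taint data.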
Additionally, in specific implementations of some of the above embodiments, the taint propagation path information may include: the function sequence of taint source functions/taint propagation functions/taint convergence functions traversed by the replayed taint propagation trajectory, where the function information in the function sequence includes the qualified function name of each function. Presenting the taint propagation trajectory as a function sequence helps software developers and testers analyze the cause of a security problem and shows more intuitively the path an attack would use.
Further additionally, in specific implementations of some of the above embodiments, the function information in the function sequence in the taint propagation path information may further include, but is not limited to: the function call stack of the function, and the corresponding original input taint data/propagation input taint data and derived data/convergence input taint data; the derived data includes one or more of the derived taint data contaminated by the propagation input taint data, the method parameters/return values containing that derived taint data, and the like. Capturing this additional information and enriching the taint propagation path with it further helps software developers and testers analyze the cause of a security problem. For example, function call stack information helps them understand how the program behaved at lower-level positions that were not instrumented; the taint data, method parameters, return values and the like reconstruct the details of taint propagation more clearly, amount to a simulation of how the vulnerability would be exploited, and may help them find related problems more intuitively. All of this information helps software developers and testers locate security problems.
In some embodiments, at 101, while the target application program runs, and in particular while a request is processed, one or more taint source functions/taint propagation functions/taint convergence functions may be triggered and have the associated taint propagation trajectory recorded by the probes. Given the complexity of taint propagation, the taint source functions, taint propagation functions and taint convergence functions should therefore be identified thoroughly when the probes are configured and inserted, so as to avoid missed reports of taint propagation paths and related information.
In some embodiments, at 101, during request processing, a taint source function/taint propagation function/taint convergence function instrumented with a probe may be triggered, and the associated taint propagation trajectory recorded, multiple times. The data input by a single request may well pass through the same function more than once, particularly because some instrumented functions run in the runtime environment itself or belong to widely used frameworks/components and are therefore likely to be called repeatedly; recording every such trigger avoids missed reports of taint propagation paths and related information.
In some embodiments, at 101, the taint propagation functions may include encoding functions and decoding functions; in dynamic taint propagation analysis, encoding and decoding functions can be regarded as taint propagation functions in a broad sense. Treating them this way does not affect the taint analysis result, because a harmlessness analysis can subsequently be performed on the path information produced by the dynamic taint propagation analysis to ensure the accuracy of the final result.
In some embodiments, at 101, some of the taint propagation functions may be custom-designated as sanitizing functions, which remove the taint from the associated taint data at the probe pointcut and block its further propagation; designating sanitizing functions in this way reduces the data recorded in the collection container.
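The effect of designating a propagation function as a sanitizing function can be sketched as a flag on the propagation hook. The following Python example is a simplified assumption, not the disclosed mechanism: `propagate`, the `sanitizer` flag and the string identifiers are hypothetical.

```python
tainted = set()  # identifiers of the taint data currently being tracked


def propagate(inputs, output, sanitizer=False):
    """Propagation hook: mark `output` as tainted when any input is tainted,
    unless the function was configured as a sanitizing function, in which
    case the taint is dropped and nothing is recorded for the output."""
    hit = any(item in tainted for item in inputs)
    if hit and sanitizer:
        return False  # propagation blocked at this pointcut
    if hit:
        tainted.add(output)
    return hit


tainted.add("t1")                                          # from a taint source
kept = propagate(["t1"], "t2")                             # ordinary propagation
blocked = propagate(["t2"], "t3", sanitizer=True)          # sanitizing function
```

After the last call `t3` is never added to the tracked set, so downstream functions that receive it produce no further events.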
In some embodiments, at 101, the Agent program may insert the probes into the taint source functions, taint propagation functions and taint convergence functions of the target application program in a dynamic instrumentation mode. Inserting probes in the dynamic instrumentation mode mainly comprises: deploying the Agent program at the server running the target application program and, while the target application program runs, having the Agent program insert the probes into the taint source functions, taint propagation functions and taint convergence functions within the programming language runtime environment (runtime; for Java, for example, the JVM) by means such as bytecode modification.
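As a loose analogy to dynamic instrumentation, the sketch below replaces functions of an already loaded class with probe wrappers at run time. It is written in Python and uses attribute replacement rather than the bytecode modification described for the JVM; `insert_probe`, `calls` and the `App` class are hypothetical names introduced only for this illustration.

```python
import functools

calls = []  # (event type, qualified name, arguments, result) per probe hit


def insert_probe(owner, name, event_type):
    """Replace owner.name with a wrapper that delegates to the original
    function and then reports the call to the probe log."""
    original = getattr(owner, name)
    qualified = f"{owner.__name__}.{name}"

    @functools.wraps(original)
    def probed(*args, **kwargs):
        result = original(*args, **kwargs)
        calls.append((event_type, qualified, args, result))
        return result

    setattr(owner, name, staticmethod(probed))


class App:  # hypothetical target application
    @staticmethod
    def get_parameter(request):
        return request["id"]

    @staticmethod
    def build_query(value):
        return "SELECT * FROM t WHERE id=" + value

    @staticmethod
    def execute(sql):
        return "done"


# The "agent" instruments the application after it has been loaded.
insert_probe(App, "get_parameter", "source")
insert_probe(App, "build_query", "propagation")
insert_probe(App, "execute", "convergence")
```

The application code itself is unchanged; only the references held by the class are swapped, which is the property the dynamic mode relies on.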
In some embodiments, at 101, the probes may be inserted in a static instrumentation mode. A typical static mode inserts the probes by modifying the code or by compile-time instrumentation. More specifically, this mainly means that the Agent program contains the probe code or a probe binary file; the Agent program is deployed into the target application program and deploys itself automatically when the target application program runs, and the probes are inserted into the taint source functions, taint propagation functions and taint convergence functions of the target application program during the Agent program's self-deployment.
FIG. 4 illustrates a block diagram of the online taint propagation analysis system proposed in an embodiment of the present disclosure. As shown in FIG. 4, the online taint propagation analysis system 400 mainly includes a taint propagation trajectory recording unit 410 and a taint propagation path replay unit 420. The taint propagation trajectory recording unit 410 is used to record the relevant taint propagation trajectory, and is configured to: deploy an Agent program at the server running the target application program; while the target application program runs, insert probes into the taint source functions, taint propagation functions and taint convergence functions of the target application program to monitor their execution; and, while the target application program processes a request, use the probes to monitor the execution of the taint source functions, taint propagation functions and taint convergence functions in the current request and record the taint propagation trajectory. The taint propagation path replay unit 420 is used to replay taint propagation paths online, and is configured to: monitor, through the probes, the triggering of taint convergence events and, once one is triggered, immediately restore the taint propagation path to which that event belongs from the taint propagation trajectory recorded by the taint propagation trajectory recording unit 410. The taint propagation trajectory recording unit 410 is configured to record the taint propagation trajectory by having each probe obtain the event information at its probe pointcut; the event information mainly comprises the event type of the event captured by the current probe, the taint propagation trajectory at the current pointcut, and function information; the event types comprise a taint source input event, a taint propagation event and a taint convergence event; the function information includes the qualified function name of the function. In a related embodiment, the taint propagation trajectory recording unit 410 is generally further configured to record the taint propagation trajectory information in the context space of the same request, i.e., the context space of the current request.
In some embodiments, the taint propagation trajectory recording unit 410 may be configured to describe the pointcut-level taint propagation trajectory in the event information obtained by a probe using the taint data acquired at that probe's pointcut, where the taint data includes original input taint data or derived taint data, together with the corresponding propagation input taint data or convergence input taint data. To this end, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to include: using the probes to track the original input taint data/derived taint data/convergence input taint data at the probe pointcuts of the executing taint source functions, taint propagation functions and taint convergence functions; that is, each time a taint source function/taint propagation function/taint convergence function is triggered, inspecting the function's input and capturing and tracking the original input taint data, the propagation input taint data and its derived taint data, or the convergence input taint data at the current probe pointcut, where propagation input taint data and its derived taint data are tracked by capturing the propagation input taint data in the function's input and, according to a pre-configuration, tracking its derived taint data in the function's output; and, when taint data is tracked, generating the taint source input event information/taint propagation event information/taint convergence event information associated with that taint data at the probe pointcut from the captured taint data information together with the other information obtained by the probe, and recording the event information and its identifier in a collection container. Each of the taint source input event information/taint propagation event information/taint convergence event information contains the taint propagation trajectory at the current pointcut, which may include: the original input taint data or derived taint data obtained by the probe, and the propagation input taint data or convergence input taint data obtained by the probe, or the unique feature information of the corresponding taint data. In particular, when a probe captures only primitive-type data, the taint propagation trajectory recording unit 410 may be configured to describe the pointcut-level taint propagation trajectory by unique feature information about the taint data.
Corresponding to the way the taint propagation trajectory recording unit 410 records a taint propagation trajectory in the above embodiments, in related embodiments the taint propagation path replay unit 420 may be configured to restore the associated taint propagation path from the taint data in the pointcut-level taint propagation trajectories contained in the taint source input event information/taint propagation event information/taint convergence event information that the probes obtained at their pointcuts. To this end, the restoration of the taint propagation path by the taint propagation path replay unit 420 may be configured to include: relying on the tracking, by the taint propagation trajectory recording unit 410, of the convergence input taint data at the probe pointcut of an executing taint convergence function, so that the probe monitors the triggering of taint convergence events in real time; and, after the probe detects that a taint convergence event has been triggered, backtracking from the taint data in the corresponding taint convergence event information in the collection container to find the upstream taint events, continuing to backtrack through their event information and the taint data they contain, and thereby replaying the taint propagation trajectory to which the triggered taint convergence event belongs.
Additionally, in specific implementations of some of the above embodiments, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to further include: while the taint propagation trajectory is being recorded, the taint data that the probes can track also includes string data and the like, so that taint propagation involving string data can be discovered while the target application program runs.
Additionally, in specific implementations of some of the above embodiments, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to further include: while the taint propagation trajectory is being recorded and the probes track the original input taint data/derived taint data/convergence input taint data at the probe pointcuts of the executing taint source functions, taint propagation functions and taint convergence functions, tracking all of the one or more items of such data that may be present at a single probe pointcut, so as to reduce missed reports.
Additionally, in specific implementations of some of the above embodiments, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to further include: while the taint propagation trajectory is being recorded, capturing, for a single item of derived taint data in the pointcut-level taint propagation trajectory of the taint propagation event information, the one or more items of propagation input taint data that correspond to it; the taint data information is acquired as completely as possible in order to reduce missed reports during taint propagation analysis.
Additionally, in specific implementations of some of the above embodiments, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to further include: recording the taint data, or the unique feature information of the taint data, in the pointcut-level taint propagation trajectory into a tree data structure, either on its own or together with the corresponding event type and/or function information, so as to record and describe intuitively the hierarchy of the whole taint propagation process and the contamination relationships between functions.
Additionally, in specific implementations of some of the above embodiments, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to further include: the identifier of the taint source input event information/taint propagation event information/taint convergence event information includes, according to the event type, the original input taint data/derived taint data/convergence input taint data, or the unique feature information of the corresponding taint data; identifying the event information by the relevant taint data or its unique feature information improves backtracking efficiency during the subsequent restoration of the taint propagation path.
Further additionally, in specific implementations of some of the above embodiments, when the taint propagation trajectory recording unit 410 is configured to use the taint data or its unique feature information as the identifier of the event information, it may be further configured to: omit the identifying taint data or unique feature information from the pointcut-level taint propagation trajectory in the corresponding taint source input event information/taint propagation event information/taint convergence event information, so as to reduce the data that must be recorded in the collection container.
Additionally, in specific implementations of the above related embodiments, when the taint propagation trajectory recording unit 410 is configured to describe the pointcut-level taint propagation trajectory in the event information by unique feature information of the taint data, or additionally to use that unique feature information as the identifier of the event information, the unique feature information of the taint data may be configured to include, but is not limited to: the storage address of the taint data, digest information and the like.
Additionally, in specific implementations of some of the above embodiments, the taint propagation path information obtained by the taint propagation path replay unit 420 may include: the function sequence of taint source functions/taint propagation functions/taint convergence functions traversed by the replayed taint propagation trajectory, where the function information in the function sequence includes the qualified function name of each function, so that software developers and testers can more easily find the cause of a security problem.
Further, in specific implementations of some of the above embodiments, the function information in the function sequence in the taint propagation path information may, in addition to the qualified function name, include but is not limited to: the function call stack of the function, and the corresponding original input taint data/propagation input taint data and derived data/convergence input taint data; the derived data includes one or more of the derived taint data contaminated by the propagation input taint data, the method parameters/return values containing that derived taint data, and the like, so as to help software developers and testers as far as possible to analyze, find and even locate the related security problems.
In some embodiments, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to further include: while the target application program runs, one or more taint source functions/taint propagation functions/taint convergence functions may be triggered and have the associated taint propagation trajectory recorded by the probes, so as to avoid missed reports of taint propagation paths and related information.
In some embodiments, the recording of a taint propagation trajectory by the taint propagation trajectory recording unit 410 may be configured to further include: during request processing, a taint source function/taint propagation function/taint convergence function instrumented with a probe may be triggered, and the associated taint propagation trajectory recorded, multiple times, so as to avoid missed reports of taint propagation paths and related information.
In some embodiments, the stippling propagation function of the stub inserted by the stippling propagation trajectory recording unit 410 may include an encoding function, a decoding function.
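As an illustration of why encoding and decoding functions are treated as taint propagation functions, the sketch below wraps Base64 encode and decode so that taint carries over from input to derived data. It is an assumption-laden example: the helper names `digest`, `propagator`, `tainted` and `edges` are hypothetical, and a SHA-256 digest is used here as one possible form of unique feature information.

```python
import base64
import hashlib

tainted = set()   # digests of strings currently considered tainted
edges = []        # (input digest, derived digest, function name)


def digest(value: str) -> str:
    """Unique feature information of a string (assumed: SHA-256 digest)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def propagator(name, fn):
    """Wrap fn so that taint on its input is propagated to its output."""
    def wrapper(value: str) -> str:
        result = fn(value)
        if digest(value) in tainted:
            tainted.add(digest(result))
            edges.append((digest(value), digest(result), name))
        return result
    return wrapper


b64encode = propagator(
    "base64.b64encode",
    lambda s: base64.b64encode(s.encode("utf-8")).decode("ascii"),
)
b64decode = propagator(
    "base64.b64decode",
    lambda s: base64.b64decode(s.encode("ascii")).decode("utf-8"),
)

payload = "' OR 1=1 --"
tainted.add(digest(payload))        # stands in for a taint source input event
encoded = b64encode(payload)        # derived taint data
decoded = b64decode(encoded)        # taint survives the round trip
```

Without probes on such functions, the encoded form would look like fresh, untainted data and the path from source to sink would be broken at that step.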
In some embodiments, the stain propagation track recording unit 410 records a stain propagation track, and may be configured to further include: among the stain spread functions, a part of the stain spread functions can be customized as a cleaning function to reduce the relevant data recorded in the aggregate container by the determined setting of the cleaning function.
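A minimal sketch of the cleaning-function idea follows, assuming a user-supplied set of function names (`SANITIZERS`, hypothetical) and a probe callback `on_propagation` (also hypothetical). When the triggering function is configured as a cleaning function, its output is treated as clean and no event is added to the collection container. Taint is tracked by string value here purely for brevity.

```python
import html

SANITIZERS = {"html.escape"}   # hypothetical user configuration
tainted_values = set()         # taint data tracked by value, for brevity
container = []                 # recorded propagation events


def on_propagation(name: str, input_value: str, output_value: str) -> bool:
    """Probe callback for a propagation function; returns True if an event was recorded."""
    if input_value not in tainted_values:
        return False           # input is not tainted, nothing to track
    if name in SANITIZERS:
        return False           # cleaning function: output is not marked, nothing recorded
    tainted_values.add(output_value)
    container.append((name, input_value, output_value))
    return True


tainted_values.add("<script>")
recorded_upper = on_propagation("str.upper", "<script>", "<script>".upper())
recorded_escape = on_propagation("html.escape", "<script>", html.escape("<script>"))
```

Only the `str.upper` call adds an entry to the container; the escaped value is never marked as tainted, so later events derived from it are skipped as well.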
In some embodiments, the stain propagation track recording unit 410 records a stain propagation track, and may be configured to further include: stake-in the probe by adopting a dynamic stake-in mode; the stake-inserting probe of the dynamic stake-inserting mode mainly comprises: an Agent program is arranged at a server running a target application program, and a stain source function, a stain propagation function and a stain convergence function instrumentation probe of the target application program are arranged in a programming language runtime environment (runtime) in a mode of byte code modification and the like in the running process of the target application program by the Agent program.
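The dynamic instrumentation manner can be loosely illustrated in Python by replacing a function attribute at runtime with a wrapper that acts as the probe. This is only an analogy under stated assumptions: the text above describes bytecode modification performed by an Agent program in the language runtime, whereas this sketch swaps attributes on a stand-in class; `TargetApp`, `instrument` and `events` are hypothetical names.

```python
import functools

events = []   # event information recorded by the probes


class TargetApp:
    """Stand-in for the target application program."""

    @staticmethod
    def read_param(name):
        return f"value-of-{name}"

    @staticmethod
    def run_query(sql):
        return f"executed: {sql}"


def instrument(owner, func_name, event_type):
    """Insert a probe around owner.func_name while the program is running."""
    original = getattr(owner, func_name)
    qualified = f"{owner.__name__}.{func_name}"

    @functools.wraps(original)
    def probe(*args, **kwargs):
        result = original(*args, **kwargs)
        events.append((event_type, qualified, args, result))
        return result

    setattr(owner, func_name, staticmethod(probe))
    return original


# The "agent" instruments the source and sink without editing TargetApp's source.
instrument(TargetApp, "read_param", "source")
instrument(TargetApp, "run_query", "sink")

result = TargetApp.run_query("SELECT " + TargetApp.read_param("id"))
```

The application code is unchanged and still returns the same result; the probes only observe calls and record the event type, qualified function name, arguments and return value.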
In some embodiments, the stain propagation track recording unit 410 records a stain propagation track, and may be configured to further include: stake-in the probe by adopting a static stake-in mode; the probes are instrumented, for example, by modifying code or compiling instrumentation. More specifically, the instrumentation of the probe by modifying a code or compiling an instrumentation mainly means that the Agent program includes the probe code or a probe binary file; and deploying the Agent program in the running process of the target application program, automatically deploying the Agent program along with the running of the target application program, and realizing the spot source function, the spot propagation function and the spot convergence function instrumentation probes of the target application program in the self-deployment process of the Agent program.
In some embodiments, an apparatus for implementing dynamic taint tracking is also provided. In particular, the apparatus may be implemented by a computing device. FIG. 5 illustrates a block diagram of a computing device that can be used to implement some embodiments of the present disclosure. As shown in FIG. 5, the computing device 500 includes a Central Processing Unit (CPU) 501 capable of executing various appropriate operations and processes according to computer program instructions stored in a Read Only Memory (ROM) 502 or computer program instructions loaded from a storage unit 508 into a Random Access Memory (RAM) 503; the RAM 503 may also store various program code and data required for the operation of the computing device 500. The CPU 501, the ROM 502 and the RAM 503 are connected to each other by a bus 504, and an input/output (I/O) interface 505 is also connected to the bus 504. Some components of the computing device 500 are connected through the I/O interface 505, including: an input unit 506, such as a mouse; an output unit 507, such as a display; a storage unit 508, such as a magnetic disk, an optical disk or a Solid State Disk (SSD); and a communication unit 509, such as a network card or a modem. The communication unit 509 enables the computing device 500 to exchange information/data with other devices over a computer network. The CPU 501 is capable of performing the various methods and processes described in the above embodiments, such as process 100. In some embodiments, process 100 may be implemented as a computer software program that is stored on a computer-readable medium such as the storage unit 508. In some embodiments, part or all of the computer program is loaded or installed into the computing device 500. When the computer program is loaded into the RAM 503 and executed by the CPU 501, some or all of the operations of process 100 can be performed.
The functions described herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a System on a Chip (SOC), a Complex Programmable Logic Device (CPLD), etc.
Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. Such program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (14)

1. A dynamic taint tracking method, the method comprising:
deploying an Agent program at a server running a target application program;
during the running of the target application program, instrumenting probes, through the Agent program, into a taint source function, a taint propagation function and a taint convergence function of the target application program;
then, when the target application program processes request input, monitoring, through the probes, the execution of the taint source function, the taint propagation function and the taint convergence function in any request processing process, and recording the taint propagation track in the taint source function, the taint propagation function and the taint convergence function;
after a taint convergence event is triggered, restoring in real time, according to the taint propagation track, the taint propagation path to which the current taint convergence event belongs;
wherein the taint propagation track is recorded by acquiring, through each probe, the corresponding event information at the pointcut of that probe, the event information mainly comprising the event type of the event acquired by the current probe, the taint propagation track at the current pointcut, and function information; the event type comprises a taint source input event, a taint propagation event and a taint convergence event; corresponding to the event type, the taint propagation track at the pointcut in the taint source input event information / taint propagation event information / taint convergence event information comprises: the original input taint data / the derived taint data and propagation input taint data / the convergence input taint data obtained at the current probe pointcut, or the unique feature information of the corresponding taint data; and the function information comprises the qualified function name of the function.
2. The method of claim 1, wherein:
recording the taint propagation track comprises: tracking, through the probe, the original input taint data / derived taint data / convergence input taint data at the probe pointcut during execution of the taint source function, the taint propagation function and the taint convergence function, capturing the taint source input event information / taint propagation event information / taint convergence event information associated with the taint data at the probe pointcut, and recording the taint source input event information / taint propagation event information / taint convergence event information, together with its identification, into a collection container; the taint source input event information / taint propagation event information / taint convergence event information comprises the taint propagation track at the current pointcut, and the taint propagation track at the pointcut comprises: the original input taint data / the derived taint data and its propagation input taint data / the convergence input taint data obtained by the probe, or the unique feature information of the corresponding taint data;
and restoring the taint propagation path comprises: after the taint convergence event is triggered, tracing back to the source according to the corresponding taint convergence event information, the taint propagation event information, and the taint propagation track at the pointcut in the taint source input event information in the collection container, and replaying the taint propagation track to which the triggered taint convergence event belongs.
3. The method of claim 2, wherein:
in the process of recording the taint propagation track, the taint data tracked by the probe comprises string data;
and/or,
when the probe tracks the original input taint data / derived taint data / convergence input taint data at a probe pointcut during execution of the taint source function, the taint propagation function and the taint convergence function, there is at least one item of original input taint data / derived taint data / convergence input taint data at the probe pointcut;
and/or,
in the taint propagation event information, a single item of derived taint data in the taint propagation track at the pointcut corresponds to at least one item of propagation input taint data;
and/or,
recording the taint propagation track further comprises: recording the taint data / the unique feature information of the taint data in the taint propagation track at the pointcut, or that taint data / unique feature information together with the corresponding event type and/or function information, into a tree data structure;
and/or,
the identification of the taint source input / taint propagation / taint convergence event information comprises the original input taint data / derived taint data / convergence input taint data, or the unique feature information of the corresponding taint data;
and/or,
the taint propagation path information comprises a function sequence of the taint source function / taint propagation function / taint convergence function through which the replayed taint propagation track passes, wherein the function information in the function sequence comprises the qualified function name of the function.
4. The method of claim 3, wherein:
where the unique feature information of the taint data is used as the identification of the event information, the taint propagation track at the pointcut in the taint source input event information / taint propagation event information / taint convergence event information correspondingly includes, by default, the unique feature information of the taint data serving as the identification;
and/or,
when the taint propagation path information comprises a function sequence of the taint source function / taint propagation function / taint convergence function through which the replayed taint propagation track passes, the function information in the function sequence comprises, in addition to the qualified function name: the function call stack of the function, and/or the corresponding original input taint data / propagation input taint data and its derived data / convergence input taint data, wherein the derived data comprises derived taint data contaminated by the propagation input taint data and/or method parameters / return values containing the derived taint data.
5. The method according to any one of claims 2 to 4, wherein,
the unique feature information of the taint data comprises the storage address or digest information of the taint data.
6. The method of claim 1, wherein:
during request processing, at least one taint source function / taint propagation function / taint convergence function is triggered, and the related taint propagation track recorded, through the probe;
and/or,
during request processing, a taint source function / taint propagation function / taint convergence function instrumented with the probe is triggered, and the related taint propagation track recorded, multiple times;
and/or,
the taint propagation function comprises an encoding function and a decoding function;
and/or,
among the taint propagation functions, some are user-defined as cleaning functions;
and/or,
the probe is instrumented by the Agent program in a dynamic instrumentation manner / a static instrumentation manner; wherein instrumenting the probe in the dynamic instrumentation manner comprises: deploying an Agent program at the server running the target application program, the Agent program instrumenting probes into the taint source function, the taint propagation function and the taint convergence function of the target application program during the running of the target application program; and instrumenting the probe in the static instrumentation manner comprises: instrumenting the probe by modifying code or by compile-time instrumentation, wherein the Agent program contains the probe code or a probe binary file, the Agent program is deployed during the running of the target application program, and the probes are instrumented into the taint source function, the taint propagation function and the taint convergence function of the target application program through the deployment of the Agent program.
7. An online taint propagation analysis system, the system comprising:
a taint propagation track recording unit and a taint propagation path playback unit;
the taint propagation track recording unit is used for recording taint propagation tracks; the taint propagation track recording unit is configured to: deploy an Agent program at a server running the target application program; during the running of the target application program, instrument probes into the taint source function, the taint propagation function and the taint convergence function of the target application program to monitor the execution of those functions; and, during request processing by the target application program, monitor through the probes the execution of the taint source function, the taint propagation function and the taint convergence function in the current request processing process so as to record the taint propagation track;
the taint propagation path playback unit is used for replaying the taint propagation path; the taint propagation path playback unit is configured to: monitor the triggering of a taint convergence event through the probe and, after the taint convergence event is triggered, immediately restore the taint propagation path to which the current taint convergence event belongs according to the related taint propagation track recorded by the taint propagation track recording unit;
wherein the taint propagation track recording unit records the taint propagation track by acquiring, through each probe, the corresponding event information at the pointcut of that probe, the event information mainly comprising the event type of the event acquired by the current probe, the taint propagation track at the current pointcut, and function information; the event type comprises a taint source input event, a taint propagation event and a taint convergence event; corresponding to the event type, the taint propagation track at the pointcut in the taint source input event information / taint propagation event information / taint convergence event information comprises: the original input taint data / the derived taint data and propagation input taint data / the convergence input taint data obtained at the current probe pointcut, or the unique feature information of the corresponding taint data; and the function information comprises the qualified function name of the function.
8. The system of claim 7, wherein:
the taint propagation track recording unit, in recording the taint propagation track, is configured to: track, through the probe, the original input taint data / derived taint data / convergence input taint data at the probe pointcut during execution of the taint source function, the taint propagation function and the taint convergence function, capture the taint source input event information / taint propagation event information / taint convergence event information associated with the taint data at the probe pointcut, and record the taint source input event information / taint propagation event information / taint convergence event information, together with its identification, into a collection container; the taint source input event information / taint propagation event information / taint convergence event information comprises the taint propagation track at the current pointcut, and the taint propagation track at the pointcut comprises: the original input taint data / the derived taint data and its propagation input taint data / the convergence input taint data obtained by the probe, or the unique feature information of the corresponding taint data;
correspondingly, the taint propagation path playback unit, in restoring the taint propagation path, is configured to: monitor the triggering of the taint convergence event through the probe and, after the taint convergence event is triggered, trace back to the source according to the corresponding taint convergence event information, the taint propagation event information, and the taint propagation track at the pointcut in the taint source input event information recorded in the collection container, and replay the taint propagation track to which the triggered taint convergence event belongs.
9. The system of claim 7, wherein:
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: in the process of recording the taint propagation track, the taint data tracked by the probe comprises string data;
and/or,
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: when the probe tracks the original input taint data / derived taint data / convergence input taint data at a probe pointcut during execution of the taint source function, the taint propagation function and the taint convergence function, no fewer than one item of original input taint data / derived taint data / convergence input taint data present at the probe pointcut is tracked;
and/or,
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: in the taint propagation event information, for a single item of derived taint data in the taint propagation track at the pointcut, at least one corresponding item of propagation input taint data can be captured;
and/or,
the taint propagation track recording unit, in recording the taint propagation track, is further configured to: record the taint data / the unique feature information of the taint data in the taint propagation track at the pointcut, or that taint data / unique feature information together with the corresponding event type and/or function information, into a tree data structure;
and/or,
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: the identification of the taint source input / taint propagation / taint convergence event information comprises the original input taint data / derived taint data / convergence input taint data, or the unique feature information of the corresponding taint data;
and/or,
the taint propagation path information restored by the taint propagation path playback unit comprises a function sequence of the taint source function / taint propagation function / taint convergence function through which the replayed taint propagation track passes, the function information in the function sequence comprising the qualified function name of the function.
10. The system of claim 7, wherein:
the taint propagation track recording unit, where it is configured to use the taint data / the unique feature information of the taint data as the identification of the event information, is further configured such that: the taint propagation track at the pointcut in the taint source input event information / taint propagation event information / taint convergence event information correspondingly includes, by default, the unique feature information of the taint data serving as the identification;
and/or,
when the taint propagation path information restored by the taint propagation path playback unit comprises a function sequence of the taint source function / taint propagation function / taint convergence function through which the replayed taint propagation track passes, the function information in the function sequence comprises, in addition to the qualified function name: the function call stack of the function, and/or the corresponding original input taint data / propagation input taint data and its derived data / convergence input taint data, wherein the derived data comprises derived taint data contaminated by the propagation input taint data and/or method parameters / return values containing the derived taint data.
11. The system of any one of claims 8-10, wherein,
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: the unique feature information of the taint data comprises the storage address or digest information of the taint data.
12. The system of claim 7, wherein:
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: during request processing, it supports at least one taint source function / taint propagation function / taint convergence function being triggered, and the related taint propagation track being recorded, through the probe;
and/or,
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: during request processing, it supports a taint source function / taint propagation function / taint convergence function instrumented with the probe being triggered, and the related taint propagation track being recorded, multiple times;
and/or,
the taint propagation functions instrumented with probes in the taint propagation track recording unit comprise an encoding function and a decoding function;
and/or,
the taint propagation track recording unit, in recording the taint propagation track, is further configured such that: among the taint propagation functions, some can be user-defined as cleaning functions;
and/or,
the taint propagation track recording unit instruments the probe in a dynamic instrumentation manner / a static instrumentation manner; wherein instrumenting the probe in the dynamic instrumentation manner comprises: deploying an Agent program at the server running the target application program, the Agent program instrumenting probes into the taint source function, the taint propagation function and the taint convergence function of the target application program during the running of the target application program; and instrumenting the probe in the static instrumentation manner comprises: instrumenting the probe by modifying code or by compile-time instrumentation, wherein the Agent program contains the probe code or a probe binary file, the Agent program is deployed during the running of the target application program, and the probes are instrumented into the taint source function, the taint propagation function and the taint convergence function of the target application program through the deployment of the Agent program.
13. A dynamic taint tracking apparatus, the apparatus comprising:
at least one processor, a memory coupled to the at least one processor, and a computer program stored in the memory;
wherein the processor executes the computer program to implement the dynamic taint tracking method of any one of claims 1-6.
14. A computer-readable storage medium, wherein:
the computer-readable storage medium has stored thereon computer instructions related to software program taint propagation analysis; the computer instructions, when executed by a computer processor, are capable of implementing the dynamic taint tracking method of any one of claims 1-6.
CN202310441463.0A 2023-04-23 2023-04-23 Dynamic taint tracking method, device and related online taint propagation analysis system Active CN116451228B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310441463.0A CN116451228B (en) 2023-04-23 2023-04-23 Dynamic taint tracking method, device and related online taint propagation analysis system

Publications (2)

Publication Number Publication Date
CN116451228A CN116451228A (en) 2023-07-18
CN116451228B true CN116451228B (en) 2023-10-17

Family

ID=87125287

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310441463.0A Active CN116451228B (en) 2023-04-23 2023-04-23 Dynamic taint tracking method, device and related online taint propagation analysis system

Country Status (1)

Country Link
CN (1) CN116451228B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116842531B (en) * 2023-08-28 2023-11-03 北京安普诺信息技术有限公司 Code vaccine-based vulnerability real-time verification method, device, equipment and medium
CN117272331B (en) * 2023-11-23 2024-02-02 北京安普诺信息技术有限公司 Cross-thread vulnerability analysis method, device, equipment and medium based on code vaccine

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102081719A (en) * 2009-12-01 2011-06-01 王伟 Software security testing system and method based on dynamic taint propagation
CN103440201A (en) * 2013-09-05 2013-12-11 北京邮电大学 Dynamic taint analysis device and application thereof to document format reverse analysis
CN104063325A (en) * 2014-07-11 2014-09-24 电子科技大学 Automatic generation device and method for test cases of embedded software
CN104765687A (en) * 2015-04-10 2015-07-08 江西师范大学 J2EE (Java 2 Enterprise Edition) program bug detection method based on object tracking and taint analysis
CN105631340A (en) * 2015-12-17 2016-06-01 珠海市君天电子科技有限公司 XSS vulnerability detection method and device
CN106022132A (en) * 2016-05-30 2016-10-12 南京邮电大学 Real-time webpage Trojan detection method based on dynamic content analysis
CN110727598A (en) * 2019-10-16 2020-01-24 西安电子科技大学 Binary software vulnerability detection system and method based on dynamic taint tracking
CN111046396A (en) * 2020-03-13 2020-04-21 深圳开源互联网安全技术有限公司 Web application test data flow tracking method and system
CN111191244A (en) * 2019-12-11 2020-05-22 杭州孝道科技有限公司 Vulnerability repairing method
CN111737150A (en) * 2020-07-24 2020-10-02 江西师范大学 Taint analysis and verification method and device for Java EE program SQLIA vulnerability
CN112199274A (en) * 2020-09-18 2021-01-08 北京大学 JavaScript dynamic taint tracking method based on V8 engine and electronic device
CN113127884A (en) * 2021-04-28 2021-07-16 国家信息技术安全研究中心 Virtualization-based vulnerability parallel verification method and device
CN113392347A (en) * 2021-08-18 2021-09-14 北京安普诺信息技术有限公司 Instrumentation-based Web backend API (application program interface) acquisition method and device and storage medium
CN114422278A (en) * 2022-04-01 2022-04-29 奇安信科技集团股份有限公司 Method, system and server for detecting program security
CN115270131A (en) * 2022-06-14 2022-11-01 中国科学院信息工程研究所 Java anti-serialization vulnerability detection method and system
CN115827610A (en) * 2022-11-21 2023-03-21 杭州默安科技有限公司 Method and device for detecting effective load

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8893280B2 (en) * 2009-12-15 2014-11-18 Intel Corporation Sensitive data tracking using dynamic taint analysis

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
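Min Gyung Kang et al. DTA++: Dynamic Taint Analysis with Targeted Control-Flow Propagation. Conference: Proceedings of the Network and Distributed System Security Symposium, NDSS 2011. 2011, full text. *
TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones; William Enck et al.; ACM Transactions on Computer Systems; pp. 1-28 *
Principle and Practice of Taint Analysis (污点分析技术的原理和实践应用); Wang Lei et al.; Journal of Software; Vol. 28 (No. 4); pp. 860-882 *
Research on a software vulnerability analysis architecture organized by knowledge, exploration, and state planes (知识、探索与状态平面组织的软件漏洞分析架构研究); Yuan Zimu; Xiao Yang; Wu Wei; Huo Wei; Zou Wei; Journal of Cyber Security (No. 06); full text *
Yuan Zimu; Xiao Yang; Wu Wei; Huo Wei; Zou Wei. Research on a software vulnerability analysis architecture organized by knowledge, exploration, and state planes (知识、探索与状态平面组织的软件漏洞分析架构研究). Journal of Cyber Security. 2019, (No. 06), full text. *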

Also Published As

Publication number Publication date
CN116451228A (en) 2023-07-18

Similar Documents

Publication Publication Date Title
CN116451228B (en) Dynamic taint tracking method, device and related online taint propagation analysis system
Cao et al. MVD: memory-related vulnerability detection based on flow-sensitive graph neural networks
JP5978401B2 (en) Method and system for monitoring the execution of user requests in a distributed system
Li et al. Where shall we log? studying and suggesting logging locations in code blocks
CN111756575B (en) Performance analysis method and device of storage server and electronic equipment
EP3566166B1 (en) Management of security vulnerabilities
CN113761519B (en) Method and device for detecting Web application program and storage medium
CN103761175A (en) System and method for monitoring program execution paths under Linux system
KR101796369B1 (en) Apparatus, method and system of reverse engineering collaboration for software analysis
CN111625833B (en) Efficient method and device for determining use-after-free vulnerabilities in software programs
Li et al. Software vulnerability detection using backward trace analysis and symbolic execution
CN107193732A (en) Verification function locating method based on path comparison
CN111191248A (en) Vulnerability detection system and method for Android vehicle-mounted terminal system
CN116842531B (en) Code vaccine-based vulnerability real-time verification method, device, equipment and medium
CN116257848A (en) In-memory webshell ("memory horse") detection method
CN112035354A (en) Method, device and equipment for positioning risk code and storage medium
Ko et al. Fuzzing with automatically controlled interleavings to detect concurrency bugs
CN114036526A (en) Vulnerability testing method and device, computer equipment and storage medium
Huo et al. Interpreting coverage information using direct and indirect coverage
CN104750602B (en) Dynamic taint data analysis method and device
CN116467712B (en) Dynamic taint tracking method, device and related taint propagation analysis system
CN112632547A (en) Data processing method and related device
CN112612697A (en) Software defect testing and positioning method and system based on byte code technology
Yu et al. Oracle-based regression test selection
Bi et al. Benchmarking Software Vulnerability Detection Techniques: A Survey

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant