CN106919831B - Method and device for tracking stains - Google Patents


Info

Publication number
CN106919831B
Authority
CN
China
Prior art keywords
instruction
register
queue
module
taint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510994142.9A
Other languages
Chinese (zh)
Other versions
CN106919831A (en)
Inventor
徐胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510994142.9A
Publication of CN106919831A
Application granted
Publication of CN106919831B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/56 Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566 Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2221/00 Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F2221/03 Indexing scheme relating to G06F21/50, monitoring users, programs or devices to maintain the integrity of platforms
    • G06F2221/033 Test or assess software

Abstract

The invention discloses a method and a device for taint tracking. The method comprises the following steps: initializing a tracking queue and a taint queue, and storing instruction data of a first instruction in an instruction queue to the tracking queue and the taint queue; performing a union operation on a first register set T and a first register propagation set P1 of the first instruction to obtain a set M; acquiring instruction data of a second instruction in the instruction queue; performing an intersection operation on a second register set N of the second instruction and the set M to obtain a set K; if the set K is not empty, extracting the polluted registers from the set K and putting them into a polluted register set L; and traversing all the instructions in the instruction queue using the above steps, thereby tracking all the taint registers in the instruction queue. The invention solves the technical problems of prior-art register backtracking methods, in which only a single register can be tracked and the tracking process is incomplete.

Description

Method and device for taint tracking
Technical Field
The invention relates to the field of register tracking, in particular to a method and a device for tracking taint.
Background
Smali instructions are a human-readable form of Dalvik instructions obtained by decompilation. An Android program runs in the Dalvik virtual machine, where values and object references are passed through registers, so tracking how register values propagate is an effective way to track data flow.
The prior art matches specific instructions that precede the current instruction of the current function to check whether a given register appears in them, and uses a constant assignment or an Android framework function call instruction as the termination condition. As a result, it does not support tracking through complex instructions or function call chains.
During register backtracking, the value of one register is often computed jointly from several registers. The prior art, however, does not support backtracking several registers simultaneously, so the backtracking is not thorough. Moreover, its termination conditions are not fully covered, which can cause tracking to terminate prematurely.
No effective solution has yet been proposed for the technical problems of prior-art register backtracking methods, namely that only a single register can be tracked and that the tracking process is incomplete.
Disclosure of Invention
The embodiments of the invention provide a taint tracking method and a taint tracking device, which at least solve the technical problems of prior-art register backtracking methods, namely that only a single register can be tracked and that the tracking process is incomplete.
According to an aspect of an embodiment of the present invention, there is provided a method of taint tracking, including:
step 1, initializing a tracking queue and a taint queue, and storing instruction data of a first instruction in the instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompiled instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1;
step 2, performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M;
step 3, obtaining instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises a second instruction content and a second register set N, and the second instruction is a decompiled instruction preceding the first instruction;
step 4, performing an intersection operation on the second register set N of the second instruction and the set M to obtain a set K;
step 5, if the set K is not empty, extracting the polluted registers from the set K and putting them into a polluted register set L; and
step 6, traversing all the instructions in the instruction queue using steps 2 to 5, thereby tracking all the taint registers in the instruction queue, and storing the tracked instructions containing the taint register set to the taint queue.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for taint tracking, including: a storage module, configured to initialize the tracking queue and the taint queue and to store the instruction data of the first instruction in the instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompiled instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1; a first operation module, configured to perform a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M; a first obtaining module, configured to obtain instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises a second instruction content and a second register set N, and the second instruction is a decompiled instruction preceding the first instruction; a second operation module, configured to perform an intersection operation on the second register set N of the second instruction and the set M to obtain a set K; a first storage module, configured to extract the polluted registers from the set K and put them into a polluted register set L if the set K is not empty; and a tracking module, configured to traverse all the instructions in the instruction queue using the first operation module, the first obtaining module, the second operation module and the first storage module, thereby tracking all the taint registers in the instruction queue, and to store the tracked instructions containing the taint register set to the taint queue.
There is also provided, in accordance with another aspect of an embodiment of the present invention, a method of taint tracking, including:
step A1, obtaining instruction data of a first instruction in an instruction queue, wherein the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1;
step B1, performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M;
step C1, obtaining instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises a second instruction content and a second register set N, and the second instruction is a decompiled instruction preceding the first instruction;
step D1, performing an intersection operation on the second register set N of the second instruction and the set M to obtain a set K;
step E1, if the set K is not empty, extracting the polluted registers from the set K; and
step F1, traversing all the instructions in the instruction queue using steps B1 to E1, thereby tracking all the taint registers in the instruction queue.
In the embodiment of the present invention, each instruction in the instruction queue is traversed in a loop: instruction data of a first instruction in the instruction queue is stored in the tracking queue and the taint queue; a union operation is performed on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M; instruction data of a second instruction in the instruction queue is obtained; an intersection operation is performed on the second register set N of the second instruction and the set M to obtain a set K; and the polluted registers are extracted from the set K and placed in the polluted register set L.
It is easy to note that all taint registers in the instruction queue can be tracked based on the intersection of the register sets of two adjacent instructions, that each instruction containing a taint register is stored in the taint queue, and that a register set can itself hold multiple registers. The multiple registers contained in all instructions in the instruction queue can therefore be backtracked simultaneously. With the scheme provided by the embodiment of the present application, the whole backtracking process is simple and clear, and tracking of all instructions and of multiple registers is supported.
The scheme thereby solves the technical problems of prior-art register backtracking methods, namely that only a single register can be tracked and that the tracking process is incomplete.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a block diagram of a hardware configuration of a computer terminal of a method of taint tracking according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of taint tracking according to an embodiment of the present application;
FIG. 3 is a flow diagram of an alternative intra-function register reverse taint tracking according to an embodiment of the present application;
FIG. 4 is a flow diagram of an alternative register reverse taint tracking for upper level functions in accordance with an embodiment of the present application;
FIG. 5 is a schematic view of an apparatus for taint tracking according to an embodiment of the present application;
FIG. 6 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 7 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 8 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 9 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 10 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 11 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 12 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 13 is a schematic view of an alternative taint tracking apparatus according to an embodiment of the present application;
FIG. 14 is a flow chart of another method of taint tracking according to an embodiment of the present application; and
FIG. 15 is a block diagram of a computer terminal according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the terms appearing in the description of the embodiments of the present application are explained as follows:
Dalvik virtual machine: one of the core components of the Android mobile device platform, developed collaboratively by Google and other vendors. It supports running Java applications that have been converted to the dex (Dalvik Executable) format, a compact format designed specifically for Dalvik and suited to systems with limited memory and processor speed.
Smali: an assembler for the dex file format used by the Dalvik virtual machine.
Example 1
According to an embodiment of the present invention, a method embodiment of taint tracking is provided. It should be noted that the steps illustrated in the flowcharts of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from that presented herein.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, FIG. 1 is a hardware block diagram of a computer terminal for the method of taint tracking according to the embodiment of the present application. As shown in FIG. 1, the computer terminal 10 may include one or more (only one shown) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microcontroller unit (MCU) or a field-programmable gate array (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only an illustration and is not intended to limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 can be used for storing software programs and modules of application software, such as the program instructions/modules corresponding to the taint tracking method in the embodiment of the present invention. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, thereby implementing the taint tracking method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
In the above operating environment, the present application provides a method of taint tracking as shown in FIG. 2. FIG. 2 is a flow chart of a method of taint tracking according to an embodiment of the present application. As shown in FIG. 2, the method comprises the following steps:
Step 1, initializing a tracking queue and a taint queue, and storing instruction data of a first instruction in the instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompiled instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: the first instruction content, the first register set T, and the first register propagation set P1.
Specifically, the tracking queue may be denoted TraceQueue, the taint queue may be denoted TaintQueue, the first instruction content may be denoted I, the decompiled instruction may be a smali instruction, the first register set T may be an instruction register set, and the first register propagation set P1 may be an instruction register propagation set P.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the backward instruction tracking module (denoted Trace) may take the instruction to be tracked (i.e., the first instruction) as the current instruction and fetch its relevant information, including the smali instruction content (denoted I), the instruction register set (denoted T) and the instruction register propagation set (denoted P, where P equals T at initialization), and place the triple (I, T, P) into the tracking queue TraceQueue and the taint queue TaintQueue.
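As an illustration of step 1, the triple (I, T, P) and the two queues could be represented as in the following minimal Python sketch. The names `InstructionData` and `init_queues`, the use of `deque`, and the example instruction text are assumptions made for illustration only; the patent does not prescribe a data layout.

```python
from collections import deque
from typing import FrozenSet, NamedTuple


class InstructionData(NamedTuple):
    """The triple (I, T, P) described in the text (hypothetical layout)."""
    content: str                 # I: the smali instruction text
    registers: FrozenSet[str]    # T: instruction register set
    propagation: FrozenSet[str]  # P: instruction register propagation set


def init_queues(content, registers):
    """Step 1: build the triple with P = T and store it in both queues."""
    t = frozenset(registers)
    first = InstructionData(content, t, t)  # P equals T at initialization
    trace_queue = deque([first])            # plays the role of TraceQueue
    taint_queue = deque([first])            # plays the role of TaintQueue
    return trace_queue, taint_queue


# Hypothetical instruction to be tracked.
trace_queue, taint_queue = init_queues(
    "invoke-virtual {v0, v1}, Lcom/example/Foo;->bar(I)V", {"v0", "v1"}
)
```

Both queues start with the same triple, which matches the statement that the first instruction is stored in the tracking queue and the taint queue alike.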
Step 2, performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the instruction register set operation module may perform a union operation on the instruction register set T and the instruction register propagation set P to obtain a new set M.
Step 3, obtaining instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises a second instruction content and a second register set N, and the second instruction is a decompiled instruction preceding the first instruction.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the Trace module may fetch the instruction preceding the current instruction from the instruction queue, record it as INext (i.e., the second instruction), and fetch the instruction content of INext and its instruction register set N (i.e., the second register set N).
Step 4, performing an intersection operation on the second register set N of the second instruction and the set M to obtain a set K.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the instruction register set operation module may perform an intersection operation on the new set M and the instruction register set N to obtain a new set K.
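The set operations of steps 2 and 4 can be illustrated with Python's built-in sets. The register names below are made-up example values, not taken from the patent.

```python
# Step 2: M is the union of the register set T and the propagation set P.
T = {"v0", "v1"}
P = {"v0", "v1"}   # P equals T at initialization
M = T | P

# Step 4: K is the intersection of the second instruction's register set N and M.
N = {"v1", "v2"}   # registers used by the preceding instruction (example values)
K = N & M

print(sorted(M))   # ['v0', 'v1']
print(sorted(K))   # ['v1']
```

Here K is not empty, so the preceding instruction touches a register that is being tracked and step 5 would apply.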
Step 5, if the set K is not empty, extracting the polluted registers from the set K and putting them into a polluted register set L.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, if the new set K is not empty, that is, the new set K contains registers, the polluted registers are extracted from the new set K and placed into the polluted register set L.
Step 6, traversing all the instructions in the instruction queue using steps 2 to 5, thereby tracking all the taint registers in the instruction queue, and storing the tracked instructions containing the taint register set to the taint queue.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, steps 2 to 5 may be executed repeatedly to traverse all instructions in the instruction queue. This completes the intra-function taint tracking process and yields a tracking result (i.e., the above instructions containing the taint register set), which is stored in the taint queue.
In the scheme disclosed in the first embodiment of the present application, each instruction in the instruction queue is traversed in a loop: instruction data of a first instruction in the instruction queue is stored in the tracking queue and the taint queue; a union operation is performed on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M; instruction data of a second instruction in the instruction queue is obtained; an intersection operation is performed on the second register set N of the second instruction and the set M to obtain a set K; and the polluted registers extracted from the set K are placed in the polluted register set L.
It is easy to note that all taint registers in the instruction queue can be tracked based on the intersection of the register sets of two adjacent instructions, that each instruction containing a taint register is stored in the taint queue, and that a register set can itself hold multiple registers. The multiple registers contained in all instructions in the instruction queue can therefore be backtracked simultaneously. With the scheme provided by the embodiment of the present application, the whole backtracking process is simple and clear, and tracking of all instructions and of multiple registers is supported.
Therefore, the scheme of the first embodiment provided by the application solves the technical problems of prior-art register backtracking methods, namely that only a single register can be tracked and that the tracking process is incomplete.
The above embodiments of the present application can be implemented by the following modules: a backward instruction tracking module (denoted Trace) and an instruction register set operation module. The backward instruction tracking module is responsible for traversing the instructions backward and for managing the tracking queue TraceQueue and the taint queue TaintQueue; the instruction register set operation module is responsible for operations on register sets, including union and intersection.
According to the above embodiment of the present application, before the union operation is performed in step 2 on the first register set T and the first register propagation set P1 of the first instruction to obtain the set M, the method further includes the following steps:
Step S221, determining whether a second instruction exists in the instruction queue.
Step S222, if so, obtaining from the tracking queue the instruction data of the most recently tracked first instruction.
Step S223, if not, performing register reverse taint tracking.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the Trace module may fetch the instruction preceding the current instruction, denoted INext (i.e., the second instruction described above). If the second instruction exists in the instruction queue, the previously saved triple (I, T, P) (i.e., the instruction data of the first instruction described above) is fetched from the tracking queue; if no second instruction exists in the instruction queue, the function call chain tracking process (i.e., the register reverse taint tracking described above) is entered.
According to the above embodiment of the present application, after the union operation is performed in step 2 on the first register set T and the first register propagation set P1 of the first instruction to obtain the set M, the method further includes the following steps:
Step S224, determining whether the set M obtained by the union operation is empty.
Step S225, if not, computing the registers used by the second instruction to obtain the second register set.
Step S226, if so, performing register reverse taint tracking.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, after the new set M is obtained, it is determined whether the new set M is empty. If the set M is empty, the function call chain tracking process (i.e., the register reverse taint tracking described above) is performed; if the set M is not empty, the read set R and the write set W are merged to obtain the register set N of the second instruction INext (i.e., the second register set mentioned above).
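The text does not say how the read set R and the write set W are derived from a smali instruction. The following Python sketch shows one simplified way to do it, under the assumption that for a small, hand-picked group of opcodes the first register operand is the destination and all other registers are sources. The prefix table `DEST_FIRST_PREFIXES` and the function `split_registers` are hypothetical and deliberately incomplete (for example, `/2addr` forms and `{vA .. vB}` range operands are not handled).

```python
import re

# Hypothetical, incomplete list of opcode prefixes whose first register
# operand is treated as the destination (written) register.
DEST_FIRST_PREFIXES = ("move", "const", "add-", "sub-", "mul-", "iget", "sget", "aget")

REGISTER = re.compile(r"[vp]\d+")


def split_registers(instruction):
    """Return (R, W): the registers read and written by one smali instruction."""
    opcode, _, operands = instruction.strip().partition(" ")
    # Split operands on whitespace, commas and braces; keep only tokens that
    # are exactly a register name, so type descriptors are ignored.
    tokens = re.split(r"[\s,{}]+", operands)
    regs = [t for t in tokens if REGISTER.fullmatch(t)]
    if regs and opcode.startswith(DEST_FIRST_PREFIXES):
        write, read = {regs[0]}, set(regs[1:])
    else:
        write, read = set(), set(regs)
    return read, write


R, W = split_registers("add-int v0, v1, v2")
N = R | W  # the second register set N is the merge of R and W
```

For `add-int v0, v1, v2` this gives W = {v0}, R = {v1, v2} and N = {v0, v1, v2}.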
According to the above embodiment of the present application, after the intersection operation is performed in step 4 on the second register set N of the second instruction and the set M to obtain the set K, the method further includes the following steps:
Step S241, determining whether the set K obtained by the intersection operation is empty.
Step S242, if not, initializing a polluted register set L and proceeding to the step of extracting the polluted registers from the set K and putting them into the polluted register set L, that is, putting the registers in the set K that satisfy the pollution conditions into the polluted register set L.
Step S243, if so, putting the instruction data of the first instruction back into the tracking queue and returning to continue traversing the instructions in the instruction queue until all taint registers in the instruction queue have been tracked.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, after the new set K is obtained, it is determined whether the new set K is empty. If the set K is empty, the data (I, T, P) of the first instruction is put back into the tracking queue and the traversal of the instructions in the instruction queue continues; if the set K is not empty, the set L (i.e., the polluted register set L described above) is initialized.
It should be noted here that this approach determines whether register pollution exists between two adjacent instructions. The registers used by the two adjacent smali instructions are abstracted into two register sets, and an intersection operation is performed on the two sets. If the resulting set is empty, there is no register pollution between the two instructions; if the resulting set is not empty, register pollution exists between the two instructions.
According to the above embodiment of the present application, in the case where the second register set N includes a read register set R and a write register set W, extracting the contaminated registers from the set K and putting them into the contaminated register set L in step 5 includes:
Step S251, the registers in the set K obtained by the intersection operation are traversed.
In step S253, if there is a register r that has not been traversed, the attribute of the register r is determined.
In step S255, if the register r belongs to the read register set R, the register r is put into the contaminated register set L.
In step S257, if the register r belongs to the write register set W, the registers in the second register set N other than the register r are put into the contaminated register set L.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: if the register r exists (that is, a register r that has not been traversed exists in step S253), it is determined whether the register r belongs to the read set R or the write set W. If the register r belongs to the read set R, the register r is put into the pollution register set L; if r belongs to the write set W, the remaining registers in the second register set N, other than the register r, are put into the pollution register set L.
Through this scheme, during backward tracing, the registers appearing in the current instruction are divided into a read set and a write set. If a register that the previous instruction needs to trace appears in the read set, that register continues to be traced; if it appears in the write set, the other registers of the current instruction (its read set) are traced instead. Tracing continues until a constant assignment instruction or a function head appears.
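Steps S251 to S257 can be sketched in Python as follows. This is a minimal illustration rather than the applicant's implementation: the function name `extract_contaminated` is hypothetical, and checking the read set before the write set for a register that belongs to both is an assumption, since the text does not address that case.

```python
def extract_contaminated(k: set, read_set: set, write_set: set) -> set:
    """Build the contaminated register set L from the intersection set K."""
    n = read_set | write_set   # second register set N = R | W
    contaminated = set()       # contaminated register set L
    for r in sorted(k):        # traverse every register r in K
        if r in read_set:
            # r is only read here, so r itself keeps being traced
            contaminated.add(r)
        elif r in write_set:
            # r is overwritten here, so the other registers of N are traced
            contaminated |= n - {r}
    return contaminated


# "move-object v0, v1" writes v0 and reads v1.
from_write = extract_contaminated({"v0"}, read_set={"v1"}, write_set={"v0"})
from_read = extract_contaminated({"v1"}, read_set={"v1"}, write_set={"v0"})
```

In both calls the result is `{"v1"}`: tracing `v0` backward through the move redirects to its source `v1`, and tracing `v1` simply keeps `v1`.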
According to the above embodiment of the present application, the method further includes: step S258, if all the registers in the set K have been traversed, the difference set between the set M and the contaminated register set L is used as a second register propagation set P2 of the second instruction, and the updated instruction data of the second instruction is stored in the trace queue and the taint queue, where the updated instruction data of the second instruction includes: the second instruction content, the contaminated register set L, and the second register propagation set P2.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: when no further register r exists, that is, after all the registers in the set K have been traversed, the instruction register propagation set P (i.e., the second register propagation set P2 described above) is re-cloned as the difference set between the set M and the set L, and the updated instruction data (INext, L, P) of the second instruction is placed in the trace queue and the TaintQueue.
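The update in step S258 can be sketched as below. The function name `update_after_traversal` and the use of Python lists for the two queues are assumptions made for illustration only.

```python
def update_after_traversal(inext, m, contaminated, trace_queue, taint_queue):
    """Clone P2 as the difference M - L and store (INext, L, P2) in both queues."""
    p2 = set(m) - set(contaminated)        # second register propagation set P2
    entry = (inext, set(contaminated), p2) # updated instruction data of INext
    trace_queue.append(entry)
    taint_queue.append(entry)
    return entry


trace_queue, taint_queue = [], []
entry = update_after_traversal(
    "move-object v0, v1", m={"v1", "v3"}, contaminated={"v1"},
    trace_queue=trace_queue, taint_queue=taint_queue,
)
```

With M = {v1, v3} and L = {v1}, the propagation set stored for INext is {v3}.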
It should be noted here that the instruction register set operation module is also responsible for the clone operation of the register set.
According to the above embodiment of the present application, the register reverse taint tracking in step S223 and step S226 includes the following steps:
Step A': instruction data of a first contaminated instruction at the tail of the queue is extracted from the taint queue, wherein the instruction data of the first contaminated instruction includes: the tainted instruction content, the tainted register set, and the tainted register propagation set.
Specifically, the tainted instruction content of the first contaminated instruction may be denoted by I, the tainted register set by T, and the tainted register propagation set by P.
Step B': a first function corresponding to the first contaminated instruction is acquired, and a parameter register set M' of the first function is acquired.
Specifically, the first function may be denoted by Callee.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: the last tuple (I, T, P) in the TaintQueue (i.e., the instruction data of the first contaminated instruction above) is fetched, the function in which the instruction I is located is obtained and recorded as Callee (i.e., the first function above), and the parameter register set of Callee is recorded as M (i.e., the parameter register set M' above).
Step C': a union operation is performed on the tainted register set and the tainted register propagation set, and an intersection operation is performed on the union result and the parameter register set M' to obtain a set C.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: the set C is obtained by performing a union operation on the set T and the set P and then intersecting the result with the set M.
Step D': after a calling function and a calling instruction that call the first function are obtained, the sequence numbers, within the parameter register set M', of the registers contained in the set C are calculated.
Specifically, the calling function of the first function may be denoted by Caller, and the calling instruction may be denoted by ICaller.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: after the calling function Caller and the calling instruction ICaller are obtained, the positions of the registers of the set C within the set M are calculated and recorded as the sequence S (i.e., the above sequence numbers).
Step E': a calling register set F contained in the calling instruction is acquired.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: a corresponding register set K (i.e., the above calling register set F) is generated according to the type of the calling instruction ICaller.
Step F': the corresponding calling registers are extracted from the calling register set F by using the sequence numbers to generate a register set T.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: a new register set T (i.e., the above register set T) is generated by fetching the corresponding registers from the set K according to the sequence S.
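Steps C' to F' amount to mapping tainted parameter registers of the callee onto the argument registers of the call site by position. The Python sketch below illustrates this under two assumptions that the text does not spell out: the parameter registers and the call registers are kept as ordered lists, and position i in one list corresponds to position i in the other. The function name `map_to_caller` is hypothetical.

```python
def map_to_caller(t, p, callee_params, call_regs):
    """Return (C, S, new T) for one step up the call chain."""
    c = (t | p) & set(callee_params)  # step C': (T | P) & M'
    # step D': sequence numbers of the registers of C within M'
    s = [i for i, reg in enumerate(callee_params) if reg in c]
    # step F': pick the call registers of F at those sequence numbers
    new_t = {call_regs[i] for i in s if i < len(call_regs)}
    return c, s, new_t


c, s, new_t = map_to_caller(
    t={"p1"}, p={"v0"},
    callee_params=["p0", "p1", "p2"],  # M' of Callee, in declaration order
    call_regs=["v3", "v4", "v5"],      # F taken from ICaller, in argument order
)
```

In this example only `p1` is both tracked and a parameter, its sequence number is 1, and the register to trace in the caller is therefore `v4`.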
Step G': the calling instruction data corresponding to the calling instruction is put into the initialized trace queue and the taint queue.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: the trace queue is initialized, and the calling instruction data (ICaller, T) is placed in the initialized trace queue and the TaintQueue.
Step H': the process returns to step A', and all contaminated instructions in the taint queue are traversed and processed by steps A' to G', so as to realize reverse taint tracking.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: reverse taint tracking is implemented by traversing all the contaminated instructions in the TaintQueue, and after reverse taint tracking stops, the tracking result is output to the TaintQueue.
By the embodiment, after the reverse taint tracing flow in the function is finished, the back tracing is continued from the instruction calling the function, so that the whole back tracing process is very simple and clear, and the tracing of all instructions and multiple registers and the tracing of a function calling chain are supported.
The above embodiments of the present application can be implemented by the following module: an instruction register read-write reflection module, which is responsible for splitting the instruction register set into a read set and a write set (the R and W sets) according to read-write attributes and for providing a reflection function. That is, when a register to be tracked falls into the read set of the next instruction, the backward instruction tracking module continues to track that register; if the register to be tracked falls into the write set of the next instruction, the backward instruction tracking module switches to tracking the registers of the read set.
According to the above embodiment of the present application, after step C', in which a union operation is performed on the tainted register set and the tainted register propagation set and an intersection operation is performed on the union result and the parameter register set M' to obtain a set C, the method further includes the following steps:
Step S281, it is determined whether the set C obtained by the intersection operation is empty.
In step S283, if the set C is not empty, a calling function and a calling instruction that call the first function are acquired.
In step S285, if the set C is empty, the register reverse taint tracking is stopped and the tracking result is output.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: after the set C is obtained through the intersection operation, it is determined whether the set C is empty, that is, whether any register exists in the set C. If the set C is not empty, the calling function Caller and the calling instruction ICaller corresponding to the first function Callee are obtained; if the set C is empty, reverse taint tracking is stopped, and the tracking result is output to the taint queue TaintQueue.
According to the above embodiment of the present application, in step E', obtaining a call register set F included in a call instruction includes:
in step S2751, it is determined whether the call instruction is a static call instruction.
Step S2753, if the call instruction is a non-static call instruction, delete the first register of all instruction registers included in the call instruction to obtain a call register set F.
In step S2755, if the call instruction is a static call instruction, all instruction registers included in the call instruction form a call register set F.
In an alternative scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: after the calling instruction ICaller is obtained, if ICaller is a non-static calling instruction, such as invoke-super or invoke-virtual, the first register in the instruction registers of ICaller is removed and the register set K is generated from the remaining registers; if ICaller is a static calling instruction, the register set K is generated directly from the instruction registers of ICaller.
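The static versus non-static distinction can be sketched as follows. The first register of a non-static invoke holds the receiver object (`this`), which is why it is dropped. The function name `call_register_set` is hypothetical, and the sketch assumes the registers of the invoke instruction have already been parsed into an ordered list (range forms such as `invoke-static/range` are assumed to be expanded beforehand).

```python
def call_register_set(opcode: str, registers: list) -> list:
    """Return the calling register set F (the set K in the smali example)."""
    if opcode.startswith("invoke-static"):
        # static call: every register is an argument
        return list(registers)
    # non-static call (invoke-virtual, invoke-super, ...): drop the receiver
    return list(registers[1:])


virtual_args = call_register_set("invoke-virtual", ["v0", "v1", "v2"])
static_args = call_register_set("invoke-static", ["v0", "v1", "v2"])
```

For the virtual call the result is `["v1", "v2"]`; for the static call all three registers are kept.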
It should be noted here that, in the above embodiment of the present application, the instruction data included in each decompiled instruction in the instruction queue may be stored by using a three-dimensional array model. The three-dimensional array model may include a first-dimension array, a second-dimension array and a third-dimension array; that is, the triple (instruction content I, register set T, register propagation set P) contained in the instruction data in the present application can be maintained by using the three-dimensional array model. The instruction content may be identification information of the current decompiled instruction, the registers in the register set may be represented by multidimensional vector objects, and the register propagation set P is used to store the call relationships between the registers. The instruction data included in any decompiled instruction has the same position coordinate at its recording positions in the first-dimension array, the second-dimension array, and the third-dimension array.
In an optional scheme, the three-dimensional array model may be represented by a three-dimensional coordinate system model, in which the first-dimension array, the second-dimension array, and the third-dimension array are described by coordinate axes: the first-dimension array may be the X axis, the second-dimension array the Y axis, and the third-dimension array the Z axis. Creating the three-dimensional array model may consist of initializing the three-dimensional coordinate system model so that the X, Y, and Z axes are empty. The X, Y, and Z axes are all sequential structures; each element of the Z-axis queue is itself a sequential structure for storing the vectors of multiple callers, and the elements of the X-axis and Y-axis queues store a string object and a vector object, respectively.
In an alternative scheme, the instruction content included in each decompilated instruction may be mapped to the first-dimensional array in a queue-like sequential structure, that is, the instruction content included in each decompilated instruction is mapped to the X-axis of the three-dimensional coordinate system and is placed at the m position of the X-axis queue, and each register in the register set is mapped to the second-dimensional array in a queue-like sequential structure, that is, mapped to the Y-axis of the three-dimensional coordinate system and is placed at the m position of the Y-axis queue.
In an alternative scheme, each decompiled instruction may be subjected to slicing and branch stripping to obtain the execution branches it contains, where each execution branch is formed by decompiled instructions, and the execution branches contained in each decompiled instruction are mapped to the corresponding registers of the register set.
In an optional scheme, the execution branch corresponding to each register in a register set Ds is traversed in a loop, and the decompiled instructions in the branch are traversed. If a function call instruction exists, its instruction content (i.e., the class name and method name information) is taken out and recorded as Xn, and the X axis is searched for identical instruction content, i.e., for whether a coordinate n of Xn exists. If it exists, the second register set at coordinate n on the Y axis corresponding to that instruction content is obtained, and the first register set is placed in the queue at coordinate n on the Z axis corresponding to the second register set, thereby establishing a relationship in which the vector Ds points to the vector Dt, i.e., a function call chain relationship.
In an alternative scheme, if no identical instruction content is found on the X axis for Xn, a new register is created and mapped to the Y-axis coordinate, and the instruction content Xn of the function call instruction is mapped to coordinate n on the X axis.
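The three-axis storage described above can be approximated with three parallel lists that share one index. The sketch below is only one possible reading of the model: the class name `ThreeAxisModel`, its method names, and the decision to record the caller link immediately after creating a missing entry are all assumptions, not details given in the application.

```python
class ThreeAxisModel:
    """Three parallel sequential structures sharing the same coordinate n."""

    def __init__(self):
        self.x = []  # X axis: instruction content (string objects)
        self.y = []  # Y axis: register vectors
        self.z = []  # Z axis: for each n, a queue of caller register vectors

    def add(self, content, registers):
        """Map one instruction onto the same position of all three axes."""
        self.x.append(content)
        self.y.append(list(registers))
        self.z.append([])
        return len(self.x) - 1

    def index_of(self, content):
        """Return the coordinate n of identical content on the X axis, or None."""
        return self.x.index(content) if content in self.x else None

    def link_call(self, caller_registers, callee_content):
        """Record that the caller's register vector points at the callee entry."""
        n = self.index_of(callee_content)
        if n is None:
            # no identical content yet: create the X/Y entry first (assumption)
            n = self.add(callee_content, [])
        self.z[n].append(list(caller_registers))
        return n


model = ThreeAxisModel()
foo = model.add("Lcom/example/A;->foo", ["v0", "v1"])  # hypothetical method
linked = model.link_call(["v2"], "Lcom/example/A;->foo")
created = model.link_call(["v3"], "Lcom/example/B;->bar")
```

After these calls, `model.z[0]` holds the caller vector `["v2"]`, and a new entry at coordinate 1 has been created for the previously unseen method.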
A preferred embodiment of the present application will now be described in detail with reference to figures 3 and 4.
As shown in fig. 3 and 4, an alternative taint tracking method is provided, which may include the following steps S31 to S333:
In step S31, a trace queue and a taint queue are initialized.
In step S32, information related to the instruction to be traced is input.
Step S33, the related information is stored into the trace queue and the taint queue.
Optionally, the backward instruction tracking module (denoted as Trace) takes the instruction to be tracked as the current instruction and fetches its related information, including the smali instruction content (denoted as I), the instruction register set (denoted as T), and the instruction register propagation set (denoted as P, where P is equal to T at initialization), and puts the triple (I, T, P) into the trace queue TraceQueue and the taint queue TaintQueue.
In step S34, an instruction preceding the current instruction is fetched.
Optionally, the Trace module takes an instruction before the current instruction and records as INext.
In step S35, it is determined whether the previous instruction INext is empty.
Specifically, if INext is empty, the backward tracing is stopped to enter the function call chain tracing process, and the process proceeds to step S321; if INext is not empty, the process proceeds to step S36.
Step S36, the last traced instruction and its taint information are retrieved from the trace queue.
Optionally, the previously saved triplet information (I, T, P) is taken out of the TraceQueue queue.
And step S37, carrying out union set operation on the set T and the set P to obtain a new set M.
In step S38, it is determined whether the set M is empty.
Specifically, if the set M is not empty, the process proceeds to step S39; otherwise, if the set M is empty, the function call chain tracing procedure is entered, i.e., the step S321 is entered.
In step S39, a register set N of the INext instruction is calculated.
Optionally, a union operation is performed on the read set R and the write set W to obtain the register set N of the INext instruction.
And step S310, performing intersection operation on the set M and the set N to obtain a new set K.
In step S311, it is determined whether the set K is empty.
Specifically, if the set K is empty, go to step S312; otherwise, if the set K is not empty, the process proceeds to step S313.
Step S312, continuously putting (I, T, P) into the tracking queue.
Optionally, the triple (I, T, P) is put back into the trace queue, and the process proceeds to step S34 to continue traversing the smali instructions to be traced.
Step S313, the contaminated register set L is initialized.
Step S314, the next register r in the set K is taken.
In step S315, it is determined whether the next register r exists.
Specifically, if the next register r exists, the process proceeds to step S316; otherwise, if the next register r does not exist, the process proceeds to step S319.
Step S316, it is determined whether the next register r belongs to the read set R or the write set W.
Specifically, if the next register r belongs to the read set R, the process proceeds to step S317, in which the next register r is placed in the contaminated register set L, and then returns to step S314; if the next register r belongs to the write set W, the process proceeds to step S318, in which the registers in the register set N other than the next register r are placed in the contaminated register set L, and then returns to step S314.
In step S319, the set P is re-cloned as the difference between the set M and the set L.
Step S320, put (INext, L, P) into the trace queue and taint queue.
Specifically, after step S320 is executed, step S34 is entered, and traversal of the smali instruction to be traced is continued.
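The intra-function flow of steps S31 to S320 can be put together in a short Python sketch. It is a simplified illustration under stated assumptions: each instruction is modeled as a hypothetical tuple (content, read set, write set) that is assumed to have been produced by some smali parser, the queues are Python lists, and a register that is both read and written is handled by checking the read set first.

```python
def trace_in_function(instructions, start_index, start_regs):
    """Trace registers backward from instructions[start_index] to the function head."""
    content = instructions[start_index][0]
    t = set(start_regs)
    entry = (content, t, set(t))           # P equals T at initialization
    trace_queue, taint_queue = [entry], [entry]          # S31-S33
    index = start_index - 1
    while index >= 0:                      # S34/S35: a previous instruction INext exists
        inext, read_set, write_set = instructions[index]
        i, t, p = trace_queue.pop()        # S36
        m = t | p                          # S37
        if not m:                          # S38: nothing left to trace
            break
        n = read_set | write_set           # S39
        k = m & n                          # S310
        if not k:                          # S311/S312: INext touches no tracked register
            trace_queue.append((i, t, p))
        else:
            contaminated = set()           # S313
            for r in sorted(k):            # S314-S318
                if r in read_set:
                    contaminated.add(r)
                elif r in write_set:
                    contaminated |= n - {r}
            new_entry = (inext, contaminated, m - contaminated)  # S319
            trace_queue.append(new_entry)  # S320
            taint_queue.append(new_entry)
        index -= 1
    return taint_queue


program = [
    ('const-string v1, "secret"', set(), {"v1"}),
    ("move-object v0, v1", {"v1"}, {"v0"}),
    # hypothetical sink; the class and method names are made up for the example
    ("invoke-static {v0}, Lcom/example/Net;->send(Ljava/lang/String;)V", {"v0"}, set()),
]
result = trace_in_function(program, start_index=2, start_regs={"v0"})
traced = [item[0] for item in result]
```

Starting from `v0` at the invoke, the trace visits the move (which redirects to `v1`) and then the constant assignment that defines `v1`, so `traced` lists the three instructions in reverse program order.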
Step S321, enter the reverse register taint tracing flow of the upper function.
Optionally, the intra-function taint tracking process ends, the tracking result is stored in the TaintQueue queue, and the reverse taint tracking flow for the case where the function is called is entered.
As shown in fig. 4, after entering the reverse register taint tracing flow of the upper function, the method comprises the following steps:
step S322, take out the last tuple (I, T, P) in the TaintQueue queue.
Step S323, the function in which the instruction I is located is obtained and marked as Callee, and the parameter register set M of Callee is obtained.
In step S324, the set T and the set P are merged and then intersected with the set M to obtain a set C.
Step S325, determine whether the set C is empty.
Specifically, if the set C is not empty, the process proceeds to step S326; otherwise, if the set C is empty, the process proceeds to step S333.
Step S326, the function (marked as Caller) and the call instruction (marked as ICaller) that call the function Callee are obtained.
Step S327, the positions of the registers of the set C within the set M are calculated and recorded as the sequence S.
Step S328, it is determined whether ICaller is a non-static call instruction.
Specifically, if the call instruction ICaller is a non-static call instruction, such as invoke-super or invoke-virtual, the process proceeds to step S329; if the call instruction ICaller is a static call instruction, the process proceeds to step S330.
In step S329, the first register in the instruction registers of ICaller is removed and a register set K is generated, and the process proceeds to step S331 after the execution is completed.
In step S330, the register set K is generated directly from the instruction registers of ICaller, and the process proceeds to step S331 after the execution is completed.
And step S331, selecting the registers in the set K according to the sequence S to generate a new register set T.
Step S332, put (ICaller, T, T) in the trace queue and taint queue, and after execution is complete, proceed to step S322.
Optionally, the trace queue TraceQueue is initialized, (ICaller, T) is put into the trace queue and the taint queue, and after step S332 is executed, the process returns to step S322 to continue traversing the instructions to be traced in the taint queue.
And step S333, stopping reverse stain tracking and outputting a tracking result TaintQueue.
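The loop of steps S322 to S333 can be sketched as follows. Everything about the inputs is an assumption made for illustration: the three lookup tables (`function_of`, `params_of`, `call_site_of`) stand in for whatever program analysis supplies the enclosing function, the ordered parameter registers, and the call site. The last tuple is read without being removed so that the results remain in the queue, and the loop also stops when no call site is known or after `max_depth` iterations (a guard against recursive call chains); none of these three choices is stated in the text.

```python
def trace_call_chain(taint_queue, function_of, params_of, call_site_of, max_depth=32):
    """Follow tainted parameters from callee to caller until set C is empty."""
    for _ in range(max_depth):
        i, t, p = taint_queue[-1]                  # S322: last tuple (I, T, P)
        callee = function_of[i]                    # S323: function containing I
        m = params_of[callee]                      # ordered parameter registers M
        c = (t | p) & set(m)                       # S324
        site = call_site_of.get(callee)            # S326: ICaller and its registers
        if not c or site is None:                  # S325 -> S333
            break
        icaller, opcode, regs = site
        s = [n for n, reg in enumerate(m) if reg in c]           # S327
        if opcode.startswith("invoke-static"):                   # S328
            k = list(regs)                                       # S330
        else:
            k = list(regs[1:])                                   # S329
        new_t = {k[n] for n in s if n < len(k)}                  # S331
        taint_queue.append((icaller, new_t, set(new_t)))         # S332
    return taint_queue


# Hypothetical two-function program: main() calls the static method send(p0).
send_body = "invoke-virtual {p0}, Ljava/lang/String;->getBytes()[B"
call_send = "invoke-static {v4}, Lcom/example/Net;->send(Ljava/lang/String;)V"
chain = trace_call_chain(
    [(send_body, {"p0"}, {"p0"})],
    function_of={send_body: "send", call_send: "main"},
    params_of={"send": ["p0"], "main": []},
    call_site_of={"send": (call_send, "invoke-static", ["v4"])},
)
```

The tainted parameter `p0` of `send` maps to argument register `v4` at the call site in `main`; since `main` has no parameters, the set C is empty on the next pass and tracking stops.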
The above preferred embodiment of the present application provides a method for tracking register pollution propagation in a smali instruction stream based on union and intersection operations on register sets and a register read-write reflection algorithm. The method traces a source register backward based on the intersection of the register sets of adjacent instructions, and a set can itself hold multiple registers. During backward tracing, the registers appearing in the current instruction are divided into a read set and a write set. If a register that the previous instruction needs to trace appears in the read set, that register continues to be traced; if it appears in the write set, the registers in the read set are the ones that continue to be traced. Tracing continues until a constant assignment instruction or a function head appears, and when a function head is reached, back tracing continues from the instruction that calls the function. The whole back-tracing process thus becomes simple and clear, and tracing of all instructions, of multiple registers, and of function call chains is supported.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
According to an embodiment of the present invention, there is also provided an apparatus for taint tracking for implementing the above-mentioned taint tracking method. As shown in fig. 5, the apparatus includes: a saving module 51, a first operation module 52, a first obtaining module 53, a second operation module 54, a first storage module 55 and a tracking module 56.
The saving module 51 is configured to initialize the tracking queue and the taint queue, and save instruction data of a first instruction in the instruction queue to the tracking queue and the taint queue, where the first instruction is a decompilated instruction to be tracked, and the instruction data of the first instruction includes the following parameters: the first instruction content, the first register set T, and the first register propagation set P1.
Specifically, the trace queue may be denoted by TraceQueue, the taint queue may be denoted by TaintQueue, the first instruction content may be denoted by I, the decompiled instruction may be a smali instruction, the first register set may be an instruction register set, and the first register propagation set P1 may be an instruction register propagation set P.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: the backward instruction tracing module (denoted as Trace) may take the instruction to be traced (i.e., the first instruction) as the current instruction and fetch its related information, including the smali instruction content (denoted as I), the instruction register set (denoted as T) and the instruction register propagation set (denoted as P, where P is equal to T at initialization), and place the triple (I, T, P) into the trace queue TraceQueue and the taint queue TaintQueue.
The first operation module 52 is configured to perform union operation on the first register set T of the first instruction and the first register propagation set P1 to obtain a set M.
In an optional scheme, taking an example that the decompilation instruction is a smali instruction, the above embodiments are described in detail, and a new set M may be obtained by performing union operation on the instruction register set T and the instruction register propagation set P through the instruction register set operation module.
The first obtaining module 53 is configured to obtain instruction data of a second instruction in the instruction queue, where the instruction data of the second instruction includes: the contents of a second instruction and a second register set N, wherein the second instruction is a decompilated instruction before the first instruction.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: the Trace module may fetch the instruction before the current instruction and record it as INext (i.e., the second instruction), and fetch the instruction content of INext and its instruction register set N (i.e., the second register set N) from the instruction queue.
The second operation module 54 is configured to perform intersection operation on the second register set N and the set M of the second instruction to obtain a set K.
In an optional scheme, taking an example that the decompilation instruction is a smali instruction, the above embodiment is described in detail, and the new set K may be obtained by performing intersection operation on the new set M and the instruction register set N through the instruction register set operation module.
The first storage module 55 is configured to extract the contaminated registers from the set K and put the contaminated registers into the contaminated register set L if the set K is not empty.
In an alternative scheme, taking the decompilation instruction as an example, which is a smali instruction, the above embodiment is described in detail, and if the obtained new set K is not empty, that is, the new set K includes registers, the next contaminated register is extracted from the new set K and placed into the contaminated register set L.
The tracking module 56 is configured to traverse all instructions in the instruction queue by using the first operation module 52, the first obtaining module 53, the second operation module 54, and the first storage module 55, track all contaminated registers in the instruction queue, and store the tracked instructions that include a contaminated register set in the taint queue.
In an alternative scheme, taking the decompilated instruction as an example of a smali instruction, the above embodiment is described in detail, and the functions of the above modules may be repeatedly executed, and all instructions in the instruction queue are traversed, so that the intra-function taint tracking process is ended, a tracking result (i.e., the above instruction including the taint register set) is obtained, and the tracking result is stored in the taint queue.
In the second embodiment of the present application, by circularly traversing each instruction in the instruction queue, the instruction data of the first instruction in the instruction queue may be stored in the trace queue and the taint queue, a union operation is performed on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M, the instruction data of the second instruction in the instruction queue is obtained, an intersection operation is performed on the second register set N of the second instruction and the set M to obtain a set K, and the contaminated registers extracted from the set K are placed in the contaminated register set L.
It is easy to note that all contaminated registers in the instruction queue can be tracked based on the intersection of the register sets of two adjacent instructions, that the instructions containing contaminated registers are stored in the taint queue, and that a register set can itself hold multiple registers. Multiple registers contained in all instructions in the instruction queue can therefore be traced back at the same time. With the scheme provided by this embodiment of the present application, registers are traced back based on the intersection of the register sets of two adjacent instructions, the whole back-tracing process is simple and clear, and tracing of all instructions and of multiple registers is supported.
Therefore, the second embodiment of the present application solves the technical problems that the number of the tracking registers is single and the tracking process is incomplete in the process of the register back tracking in the register back tracking method in the prior art.
The above embodiments of the present application can be implemented by the following modules: the system comprises a backward instruction tracking module (marked as Trace) and an instruction register set operation module, wherein the backward instruction tracking module (marked as Trace) is responsible for backward traversing instructions and managing a tracking queue TraceQueue and a taint queue TaintQueue; the instruction register set operation module is responsible for the operation of the register set, including union set and intersection set operation.
According to the above embodiment of the present application, as shown in fig. 6, the apparatus further includes: a first judging module 61, a second obtaining module 63 and a first executing module 65.
The first judging module 61 is configured to judge whether a second instruction exists in the instruction queue; the second obtaining module 63 is configured to obtain the instruction data of the last traced first instruction from the trace queue if the second instruction exists; the first execution module 65 is configured to perform register reverse taint tracking if the second instruction does not exist.
It should be noted here that the first determining module 61, the second obtaining module 63 and the first executing module 65 correspond to steps S221 to S223 in the first embodiment, and the modules are the same as the corresponding steps in the implementation example and the application scenario, but are not limited to the disclosure in the first embodiment. It should be noted that the modules described above as part of the apparatus may be run in the computer terminal 10 provided in the first embodiment.
According to the above embodiment of the present application, as shown in fig. 7, the apparatus further includes: a second decision module 71, a first processing module 73 and a second execution module 75.
The second judging module 71 is configured to judge whether the set M obtained by union set operation is empty; the first processing module 73 is configured to calculate a register used by the second instruction if the register is not empty, so as to obtain a second register set; the second execution module 75 is configured to perform register reverse taint tracking if empty.
It should be noted here that the second determining module 71, the first processing module 73 and the second executing module 75 correspond to steps S224 to S226 in the first embodiment, and the modules are the same as the corresponding steps in the implementation example and the application scenario, but are not limited to the disclosure in the first embodiment. It should be noted that the modules described above as part of the apparatus may be run in the computer terminal 10 provided in the first embodiment.
According to the above embodiment of the present application, as shown in fig. 8, the apparatus further includes: a third judging module 81, an initializing module 83 and a second storing module 85.
The third judging module 81 is configured to judge whether the set K obtained by the intersection operation is empty; the initialization module 83 is configured to initialize the dirty register set L and trigger the function of the first storage module if the set K is not empty; the second storage module 85 is configured to, if the set K is empty, put the instruction data of the first instruction back into the trace queue and return to continue traversing all instructions in the instruction queue until all dirty registers in the instruction queue have been traced.
It should be noted here that the third judging module 81, the initialization module 83 and the second storage module 85 correspond to steps S241 to S243 in the first embodiment; the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the first embodiment. It should be noted that the modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment.
According to the above embodiment of the present application, in the case where the second register set N includes a read register set R and a write register set W, as shown in fig. 9, the first storage module 55 includes: a traversal module 91, a fourth judging module 93, a third storage module 95, and a fourth storage module 97.
The traversal module 91 is configured to traverse the registers in the set K obtained by the intersection operation; the fourth judging module 93 is configured to judge the attribute of a register r if an untraversed register r exists; the third storage module 95 is configured to put the register r into the dirty register set L if the register r belongs to the read register set R; the fourth storage module 97 is configured to put the registers of the second register set N other than the register r into the dirty register set L if the register r belongs to the write register set W.
It should be noted here that the traversal module 91, the fourth judging module 93, the third storage module 95 and the fourth storage module 97 correspond to steps S251 to S257 in the first embodiment; the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the first embodiment. It should be noted that the modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment.
According to the above embodiment of the present application, as shown in fig. 10, the apparatus further includes: a second processing module 101.
The second processing module 101 is configured to, if all registers in the set K have been traversed, take the difference set between the set M and the dirty register set L as the second register propagation set P2 of the second instruction, and store the updated instruction data of the second instruction to the trace queue and the taint queue, where the updated instruction data of the second instruction includes: the second instruction content, the dirty register set L, and the second register propagation set P2.
It should be noted here that the second processing module 101 corresponds to step S258 in the first embodiment; the examples and application scenarios implemented by this module are the same as those of the corresponding step, but are not limited to what is disclosed in the first embodiment. It should be noted that the module described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment.
According to the above embodiments of the present application, as shown in fig. 11, the first execution module or the second execution module may include: a first extraction module 111, a third acquisition module 112, a third operation module 113, a calculation module 114, a fourth acquisition module 115, a second extraction module 116, a fifth storage module 117 and a return module 118.
The first extracting module 111 is configured to extract the instruction data of the first contaminated instruction at the tail of the queue from the taint queue, where the instruction data of the first contaminated instruction includes: the contaminated instruction content, the contaminated register set, and the contaminated register propagation set; the third obtaining module 112 is configured to obtain the first function corresponding to the first contaminated instruction, and obtain the parameter register set M' of the first function; the third operation module 113 is configured to perform a union operation on the contaminated register set and the contaminated register propagation set, and perform an intersection operation on the result of the union operation and the parameter register set M' to obtain a set C; the calculating module 114 is configured to calculate, after the calling function and the call instruction that call the first function are obtained, the sequence numbers that the registers included in the set C have in the parameter register set M'; the fourth obtaining module 115 is configured to obtain the call register set F included in the call instruction; the second extracting module 116 is configured to extract the corresponding call registers from the call register set F by using the sequence numbers, and generate a register set T; the fifth storage module 117 is configured to put the call instruction data corresponding to the call instruction into the initialized trace queue and taint queue; the returning module 118 is configured to return to the function of the first extracting module, so that the functions of the first extracting module, the third obtaining module, the third operation module, the fourth obtaining module, the second extracting module and the fifth storage module are executed for all the contaminated instructions in the taint queue, thereby implementing reverse taint tracking.
Specifically, the contaminated instruction content of the first contaminated instruction may be denoted by I, the contaminated register set may be denoted by T, the contaminated register propagation set may be denoted by P, the first function may be denoted by Callee, the calling function of the first function may be denoted by Caller, and the call instruction may be denoted by ICaller.
It should be noted here that the first extracting module 111, the third obtaining module 112, the third operation module 113, the calculating module 114, the fourth obtaining module 115, the second extracting module 116, the fifth storage module 117 and the returning module 118 correspond to steps S271 to S278 in the first embodiment; the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the first embodiment. It should be noted that the modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment.
According to the above embodiment of the present application, as shown in fig. 12, the apparatus further includes: a fifth judging module 121, a fifth acquiring module 123 and an output module 125.
The fifth judging module 121 is configured to judge whether the set C obtained by the intersection operation is empty; the fifth obtaining module 123 is configured to obtain the calling function and the call instruction that call the first function if the set C is not empty; the output module 125 is configured to stop performing register reverse taint tracking and output the tracking result if the set C is empty.
It should be noted here that the fifth judging module 121, the fifth obtaining module 123 and the output module 125 correspond to steps S281 to S285 in the first embodiment; the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the first embodiment. It should be noted that the modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment.
According to the above embodiment of the present application, as shown in fig. 13, the fourth obtaining module 115 includes: a sixth judging module 131, a third processing module 133 and a fourth processing module 135.
The sixth judging module 131 is configured to judge whether the call instruction is a static call instruction; the third processing module 133 is configured to, if the call instruction is a non-static call instruction, delete the first register from all the instruction registers included in the call instruction to obtain the call register set F; the fourth processing module 135 is configured to, if the call instruction is a static call instruction, form the call register set F from all the instruction registers included in the call instruction.
It should be noted here that the sixth judging module 131, the third processing module 133 and the fourth processing module 135 correspond to steps S2751 to S2755 in the first embodiment; the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in the first embodiment. It should be noted that the modules described above, as part of the apparatus, may run in the computer terminal 10 provided in the first embodiment.
Example 3
There is also provided, in accordance with an embodiment of the present invention, a method embodiment for taint tracking. It should be noted that the steps illustrated in the flowchart of the accompanying drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different from that presented herein.
Fig. 14 is a flow chart of another taint tracking method according to an embodiment of the present application. As shown in fig. 14, the method comprises the following steps:
Step A1, obtaining instruction data of a first instruction in the instruction queue, where the instruction data of the first instruction includes the following parameters: the first instruction content, the first register set T, and the first register propagation set P1.
In particular, the first instruction content may be denoted by I, the decompiled instruction may be a smali instruction, the first register set may be an instruction register set T, and the first register propagation set P1 may be an instruction register propagation set P.
Optionally, the trace queue TraceQueue and the taint queue TaintQueue may be initialized before the instruction data of the first instruction in the instruction queue is obtained, and the instruction data of the first instruction may be saved to both queues after it is obtained.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the backward instruction tracing module (denoted as Trace) may take the instruction to be traced (i.e., the first instruction) as the current instruction and fetch its relevant information, including the smali instruction content (denoted as I), the instruction register set (denoted as T) and the instruction register propagation set (denoted as P, where P is equal to T at initialization), and place the triple (I, T, P) into the trace queue TraceQueue and the taint queue TaintQueue.
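The triple (I, T, P) and the two queues described in step A1 can be pictured with a minimal Python sketch. The names `InstructionData` and `init_queues`, the example smali string, and the use of `collections.deque` for TraceQueue and TaintQueue are illustrative assumptions and are not prescribed by this embodiment.

```python
from collections import deque
from typing import NamedTuple


class InstructionData(NamedTuple):
    """Hypothetical container for the triple (I, T, P)."""
    content: str             # I: instruction content, e.g. a smali instruction
    registers: frozenset     # T: instruction register set
    propagation: frozenset   # P: instruction register propagation set


def init_queues(content, registers):
    """Initialize TraceQueue and TaintQueue with the instruction to be traced."""
    t = frozenset(registers)
    triple = InstructionData(content, t, t)   # P equals T at initialization
    trace_queue = deque([triple])             # stands in for TraceQueue
    taint_queue = deque([triple])             # stands in for TaintQueue
    return trace_queue, taint_queue


# Hypothetical instruction to be traced; only register v1 is of interest.
trace_queue, taint_queue = init_queues(
    "invoke-static {v1}, LSink;->send(Ljava/lang/String;)V", {"v1"}
)
```

Both queues start with the same triple, which matches the statement that (I, T, P) is placed into the trace queue and the taint queue.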
Step B1, performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, a new set M may be obtained by performing a union operation on the instruction register set T and the instruction register propagation set P through the instruction register set operation module.
Step C1, obtaining instruction data of a second instruction in the instruction queue, where the instruction data of the second instruction includes: the second instruction content and a second register set N, wherein the second instruction is the decompiled instruction preceding the first instruction.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the Trace module may fetch the instruction preceding the current instruction from the instruction queue, record it as INext (i.e., the second instruction), and obtain the instruction content of INext and its instruction register set N (i.e., the second register set N).
Step D1, performing an intersection operation on the second register set N of the second instruction and the set M to obtain a set K.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the new set K may be obtained by performing an intersection operation on the new set M and the instruction register set N through the instruction register set operation module.
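Steps B1 and D1 reduce to two set operations, which the following Python sketch illustrates. The function name `candidate_set` and the example register names are assumptions made for illustration only.

```python
def candidate_set(t, p, n):
    """Return (M, K) where M = T | P (step B1) and K = N & M (step D1)."""
    m = set(t) | set(p)   # union of the register set and the propagation set
    k = set(n) & m        # registers of the second instruction that are being traced
    return m, k


# Tracing v1 backward past a hypothetical "move-object v1, v2", which uses {v1, v2}.
m, k = candidate_set(t={"v1"}, p={"v1"}, n={"v1", "v2"})

# Past a hypothetical "const/4 v3, 0", which uses only {v3}, the set K is empty.
_, k_unrelated = candidate_set(t={"v1"}, p={"v1"}, n={"v3"})
```

A non-empty K means the second instruction touches at least one traced register; an empty K means the instruction can be skipped.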
Step E1, if the set K is not empty, extracting the contaminated registers from the set K.
Optionally, after a dirty register is extracted, it may be put into the dirty register set L.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, if the obtained new set K is not empty, that is, the new set K contains registers, the contaminated registers are extracted from the new set K and placed into the dirty register set L.
Step F1, traversing all the instructions in the instruction queue by repeating steps B1 to E1, so that all the dirty registers in the instruction queue are traced.
Optionally, after the dirty registers are traced, the traced instructions containing a dirty register set may be saved to the taint queue.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, steps B1 to E1 may be executed repeatedly to traverse all instructions in the instruction queue, which ends the intra-function taint tracking process; the tracking result (i.e., the instructions containing a dirty register set) is obtained and stored in the taint queue TaintQueue.
In the third embodiment of the present application, each instruction in the instruction queue is traversed in a loop: the instruction data of the first instruction in the instruction queue is stored in the trace queue and the taint queue, a union operation is performed on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M, the instruction data of a second instruction in the instruction queue is obtained, an intersection operation is performed on the second register set N of the second instruction and the set M to obtain a set K, and the contaminated registers extracted from the set K are placed in the dirty register set L.
It is easy to note that all dirty registers in the instruction queue can be traced based on the intersection of the register sets of two adjacent (previous and next) instructions, and that each instruction containing a dirty register is stored in the taint queue. Because a register set can itself hold multiple registers, the multiple registers contained in all instructions in the instruction queue can be back-traced at the same time. With the scheme provided by this embodiment of the present application, the whole back-tracing process is therefore simple and clear, and the tracing of all instructions and of multiple registers is supported.
Therefore, the third embodiment of the present application solves the technical problems of prior-art register back-tracing methods, namely that only a single register can be traced and that the tracing process is incomplete.
The above embodiments of the present application can be implemented by the following modules: a backward instruction tracing module (denoted as Trace) and an instruction register set operation module. The backward instruction tracing module is responsible for traversing instructions backward and for managing the trace queue TraceQueue and the taint queue TaintQueue; the instruction register set operation module is responsible for operations on register sets, including union and intersection operations.
According to the above embodiment of the present application, before the union operation is performed on the first register set T of the first instruction and the first register propagation set P1 to obtain the set M in step B1, the method further includes the following steps:
Step S1421, determining whether a second instruction exists in the instruction queue.
Step S1422, if the second instruction exists, obtaining the instruction data of the most recently traced first instruction from the trace queue.
Step S1423, if the second instruction does not exist, performing register reverse taint tracking.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the Trace module may fetch the instruction preceding the current instruction, denoted as INext (i.e., the second instruction described above). If the second instruction exists in the instruction queue, the previously saved triple (I, T, P) (i.e., the instruction data of the first instruction described above) is fetched from the trace queue TraceQueue; if no second instruction exists in the instruction queue, the function call chain tracing process (i.e., the register reverse taint tracking described above) is entered.
According to the above embodiment of the present application, after performing union operation on the first register set T of the first instruction and the first register propagation set P1 to obtain the set M in step B1, the method further includes the following steps:
Step S1424, determining whether the set M obtained by the union operation is empty.
Step S1425, if the set M is not empty, calculating the registers used by the second instruction to obtain the second register set.
Step S1426, if the set M is empty, performing register reverse taint tracking.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, after the new set M is obtained, it is determined whether the new set M is empty. If the set M is empty, the function call chain tracing process (i.e., the register reverse taint tracking described above) is entered; if the set M is not empty, the read set R and the write set W are merged to obtain the register set N of the second instruction INext (i.e., the second register set mentioned above).
According to the embodiment of the present application, after performing an intersection operation on the second register set N of the second instruction and the set M to obtain the set K in step D1, the method further includes the following steps:
Step S1441, determining whether the set K obtained by the intersection operation is empty.
Step S1442, if the set K is not empty, initializing the dirty register set L and entering the step of extracting the contaminated registers from the set K and putting them into the dirty register set L, that is, the step of putting the registers in the set K that satisfy the pollution condition into the dirty register set L.
Step S1443, if the set K is empty, putting the instruction data of the first instruction back into the trace queue, and returning to continue traversing all the instructions in the instruction queue until all the dirty registers in the instruction queue have been traced.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, after the new set K is obtained, it is determined whether the new set K is empty. If the set K is empty, the data (I, T, P) of the first instruction is put back into the trace queue TraceQueue, and the traversal of all instructions in the instruction queue continues; if the set K is not empty, the dirty register set L is initialized.
It should be noted here that this method determines whether register pollution exists between two adjacent instructions: the registers used by the two adjacent smali instructions are abstracted into two register sets, and an intersection operation is performed on the two sets. If the resulting set is empty, there is no register pollution between the two instructions; if it is not empty, register pollution exists between them.
According to the above embodiment of the present application, in the case where the second register set N includes a read register set R and a write register set W, step E1 of extracting the contaminated registers from the set K comprises:
Step S1451, traversing the registers in the set K obtained by the intersection operation.
Step S1453, if an untraversed register r exists, determining the attribute of the register r.
Step S1455, if the register r belongs to the read register set R, putting the register r into the dirty register set L.
Step S1457, if the register r belongs to the write register set W, putting the registers in the second register set N other than the register r into the dirty register set L.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, if a register r exists (i.e., the untraversed register r of step S1453), it is determined whether the register r belongs to the read set R or the write set W. If the register r belongs to the read set R, the register r is put into the dirty register set L; if the register r belongs to the write set W, the remaining registers in the second register set N, other than the register r, are put into the dirty register set L.
Through the above scheme, in the process of back-tracing, the registers appearing in the current instruction are divided into a read set and a write set. If a register that the previous instruction needs to trace appears in the read set, that register continues to be traced; if it appears in the write set, the other registers of the current instruction are traced instead, until a constant assignment instruction or the function head is reached.
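The read/write rule of steps S1451 to S1457 can be sketched in Python as follows. The function name `extract_dirty_registers` is hypothetical, and treating a register that is both read and written by applying both rules is an assumption; the text above does not address that case.

```python
def extract_dirty_registers(k, read_set, write_set):
    """Build the dirty register set L from the set K (steps S1451 to S1457)."""
    n = set(read_set) | set(write_set)   # second register set N = R | W
    dirty = set()                        # dirty register set L, initialized empty
    for r in k:                          # traverse every register in K
        if r in read_set:                # r is read here: keep tracing r itself
            dirty.add(r)
        if r in write_set:               # r is written here: trace the other registers of N
            dirty |= n - {r}
    return dirty


# Hypothetical "move-object v1, v2": reads v2, writes v1, and v1 is being traced.
moved = extract_dirty_registers({"v1"}, read_set={"v2"}, write_set={"v1"})

# Hypothetical "if-eqz v1": only reads v1.
tested = extract_dirty_registers({"v1"}, read_set={"v1"}, write_set=set())

# Hypothetical "const/4 v1, 0": writes v1 from a constant, so nothing is left to trace.
constant = extract_dirty_registers({"v1"}, read_set=set(), write_set={"v1"})
```

The last case yields an empty set, which is consistent with tracing ending at a constant assignment instruction.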
According to the above embodiment of the present application, the method further includes: step S1458, if all the registers in the set K have been traversed, taking the difference set between the set M and the dirty register set L as the second register propagation set P2 of the second instruction, and storing the updated instruction data of the second instruction to the trace queue and the taint queue, where the updated instruction data of the second instruction includes: the second instruction content, the dirty register set L, and the second register propagation set P2.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, when no untraversed register r remains, that is, after all the registers in the set K have been traversed, the instruction register propagation set P (i.e., the second register propagation set P2 described above) is reassigned, by a clone operation, to the difference set between the set M and the set L, and the updated instruction data (INext, L, P) of the second instruction is placed in the trace queue TraceQueue and the taint queue TaintQueue.
It should be noted here that the instruction register set operation module is also responsible for the clone operation of the register set.
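Putting steps A1 to F1 together with steps S1441 to S1458, the intra-function backward pass might look like the Python sketch below. It is not the patented implementation: the function name `trace_within_function`, the representation of each instruction as a (content, read set, write set) tuple, the plain lists used as queues, and the choice to pop the last triple from the trace queue before processing are all assumptions.

```python
def trace_within_function(instructions, start, tracked):
    """Backward taint pass inside one function.

    instructions: list of (content, read_set, write_set) in program order.
    start: index of the instruction to be traced; tracked: its register set T.
    Returns the taint queue as a list of (I, T, P) triples.
    """
    t = set(tracked)
    first = (instructions[start][0], t, set(t))    # P equals T at initialization
    trace_queue, taint_queue = [first], [first]
    for idx in range(start - 1, -1, -1):           # walk toward the function head
        i, t, p = trace_queue.pop()                # last traced instruction data
        m = t | p                                  # step B1: M = T | P
        if not m:
            break                                  # nothing left to trace here
        i_next, r_set, w_set = instructions[idx]   # step C1: second instruction
        n = set(r_set) | set(w_set)                # N = R | W
        k = n & m                                  # step D1: K = N & M
        if not k:
            trace_queue.append((i, t, p))          # no pollution: keep the old triple
            continue
        dirty = set()                              # step E1: dirty register set L
        for r in k:
            if r in r_set:
                dirty.add(r)
            if r in w_set:
                dirty |= n - {r}
        updated = (i_next, dirty, m - dirty)       # P2 is the difference set M \ L
        trace_queue.append(updated)
        taint_queue.append(updated)
    return taint_queue


# Hypothetical four-instruction function body; v1 at the last instruction is traced.
prog = [
    ('const-string v2, "key"', set(), {"v2"}),
    ("move-object v1, v2", {"v2"}, {"v1"}),
    ("const/4 v3, 0", set(), {"v3"}),
    ("invoke-static {v1}, LSink;->send(Ljava/lang/String;)V", {"v1"}, set()),
]
taint = trace_within_function(prog, start=3, tracked={"v1"})
```

In this example the unrelated `const/4 v3, 0` is skipped, while the `move-object` and `const-string` instructions are recorded in the taint queue.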
According to the above embodiment of the present application, the register reverse taint tracking in step S1423 and step S1426 includes the following steps:
Step A2, extracting the instruction data of the first contaminated instruction at the tail of the queue from the taint queue, where the instruction data of the first contaminated instruction includes: the contaminated instruction content, the contaminated register set, and the contaminated register propagation set.
Specifically, the contaminated instruction content of the first contaminated instruction may be denoted by I, the contaminated register set may be denoted by T, and the contaminated register propagation set may be denoted by P.
Step B2, obtaining the first function corresponding to the first contaminated instruction, and obtaining the parameter register set M' of the first function.
Specifically, the first function may be denoted by Callee.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the last triple (I, T, P) in the TaintQueue (i.e., the instruction data of the first contaminated instruction above) is fetched, the function in which the instruction I is located is obtained and recorded as Callee (i.e., the first function above), and the parameter register set of Callee is recorded as M (i.e., the parameter register set M' above).
Step C2, performing a union operation on the contaminated register set and the contaminated register propagation set, and performing an intersection operation on the result of the union operation and the parameter register set M' to obtain a set C.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the set C is obtained by performing a union operation on the set T and the set P and then performing an intersection operation on the result and the set M.
Step D2, after the calling function and the call instruction that call the first function are obtained, calculating the sequence numbers that the registers included in the set C have in the parameter register set M'.
Specifically, the calling function of the first function may be denoted by Caller, and the call instruction may be denoted by ICaller.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, after the calling function Caller and the call instruction ICaller are obtained, the positions of the registers of the set C within the set M are calculated as the sequence S (i.e., the above sequence numbers).
Step E2, obtaining the call register set F included in the call instruction.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, a corresponding register set K (i.e., the above call register set F) is generated according to the type of the call instruction ICaller.
Step F2, extracting the corresponding call registers from the call register set F by using the sequence numbers to generate a register set T.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, a new register set T (i.e., the above register set T) is generated by fetching the corresponding registers from the set K according to the sequence S.
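Steps C2, D2 and F2 can be sketched together in Python. The function name `map_taint_to_caller` is hypothetical, and the sketch assumes that the parameter register set M' and the call register set F are ordered lists whose positions correspond one to one; returning `None` for an empty set C is likewise an illustrative choice.

```python
def map_taint_to_caller(t, p, callee_params, call_registers):
    """Map contaminated callee parameters to the caller's argument registers.

    t, p: contaminated register set T and propagation set P of the callee instruction.
    callee_params: ordered parameter registers M' of Callee.
    call_registers: ordered call register set F of ICaller.
    Returns the new register list T for the caller, or None if the set C is empty.
    """
    c = (set(t) | set(p)) & set(callee_params)        # step C2: C = (T | P) & M'
    if not c:
        return None                                   # no parameter is contaminated
    s = [idx for idx, reg in enumerate(callee_params) if reg in c]   # step D2: sequence S
    return [call_registers[idx] for idx in s]         # step F2: pick registers from F


# Hypothetical callee with parameters p0, p1, called as "invoke-static {v3, v5}".
new_t = map_taint_to_caller({"p1"}, {"p1"}, ["p0", "p1"], ["v3", "v5"])

# A contaminated local register that is not a parameter ends the call chain.
stopped = map_taint_to_caller({"v0"}, set(), ["p0", "p1"], ["v3", "v5"])
```

Here the contaminated parameter `p1` sits at position 1 of M', so the register at position 1 of F, `v5`, becomes the register to trace in the caller.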
Step G2, putting the call instruction data corresponding to the call instruction into the initialized trace queue and taint queue.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the trace queue TraceQueue and the taint queue TaintQueue are initialized, and the call instruction data (ICaller, T) is placed in the initialized TraceQueue and TaintQueue.
Step H2, returning to step A2, and processing all the contaminated instructions in the taint queue by repeating steps A2 to G2, so as to implement reverse taint tracking.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, reverse taint tracking is implemented by traversing all the contaminated instructions in the TaintQueue, and after reverse taint tracking stops, the tracking result is output to the taint queue TaintQueue.
By the embodiment, after the reverse taint tracing flow in the function is finished, the back tracing is continued from the instruction calling the function, so that the whole back tracing process is very simple and clear, and the tracing of all instructions and multiple registers and the tracing of a function calling chain are supported.
The above embodiments of the present application can be implemented by the following module: an instruction register read-write reflection module, which is responsible for splitting the instruction register set into a read set R and a write set W according to the read/write attributes of the registers and for providing a reflection function. That is, when a register to be traced falls into the read set of the next instruction, the backward instruction tracing module continues to trace that register; when it falls into the write set of the next instruction, the backward instruction tracing module switches to tracing the registers of the read set instead.
According to the above embodiment of the present application, in step C2, after performing union operation on the dyed register set and the dyed register propagation set, and performing intersection operation on the union operation result and the parameter register set M' to obtain the set C, the method further includes the following steps:
Step S1481, determine whether the set C obtained by the intersection operation is empty.
Step S1483, if it is not empty, acquire the calling function and the calling instruction that call the first function.
Step S1485, if it is empty, stop performing register reverse taint tracking and output the tracking result.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: after the set C is obtained through the intersection operation, it is determined whether the set C is empty, that is, whether any register exists in the set C. If the set C is not empty, that is, a register exists, the calling function Caller and the calling instruction ICaller corresponding to the first function Callee are obtained; if the set C is empty, that is, no register exists, reverse taint tracking is stopped and the tracking result is output to the taint queue TaintQueue.
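The check on the set C can be sketched in a few lines of Python. The function name `decide_on_set_c` and the returned labels `"trace_caller"` and `"stop"` are hypothetical names chosen for this sketch; the original text only specifies the two branches.

```python
def decide_on_set_c(dyed, propagated, param_registers):
    """Compute C = (dyed ∪ propagated) ∩ M' and pick the next action.

    Returns ("trace_caller", C) when C is not empty, meaning the taint reaches
    a parameter of the first function and the caller must be examined, and
    ("stop", C) when C is empty, meaning reverse taint tracking ends.
    """
    c = (set(dyed) | set(propagated)) & set(param_registers)
    return ("trace_caller", c) if c else ("stop", c)
```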
According to the above embodiment of the present application, in step E2, obtaining a call register set F included in a call instruction includes:
in step S14751, it is determined whether the call instruction is a static call instruction.
In step S14753, if the call instruction is a non-static call instruction, the first register in all the instruction registers included in the call instruction is deleted, so as to obtain a call register set F.
In step S14755, if the call instruction is a static call instruction, all instruction registers included in the call instruction form a call register set F.
In an optional scheme, taking a smali instruction as an example of the decompiled instruction, the above embodiment is described in detail: after the calling instruction ICaller is obtained, if ICaller is a non-static calling instruction, such as invoke-super or invoke-virtual, the first register among the instruction registers of ICaller is removed and the remaining registers form a register set K; if ICaller is a static calling instruction, all instruction registers of ICaller directly form the register set K.
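The static and non-static cases can be sketched as follows. This is only an illustration under stated assumptions: the function name `call_register_set` is hypothetical, the registers of the calling instruction are assumed to be already parsed into an ordered list, and an opcode is treated as static when it starts with `invoke-static`.

```python
def call_register_set(opcode, registers):
    """Return the ordered argument registers of a calling instruction.

    opcode    -- e.g. "invoke-virtual", "invoke-super", "invoke-static"
    registers -- ordered registers listed by the calling instruction
    """
    if opcode.startswith("invoke-static"):
        # Static call: every listed register is an argument.
        return list(registers)
    # Non-static call: the first register holds the receiver object,
    # so it is dropped before matching against the callee's parameters.
    return list(registers[1:])
```

For `invoke-virtual {v0, v1, v2}` the sketch returns `["v1", "v2"]`, while for `invoke-static {v1, v2}` it returns both registers unchanged.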
Example 4
The embodiment of the invention can provide a computer terminal which can be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute program code for the following steps of the stain tracking method: step 1, initializing a tracking queue and a taint queue, and storing instruction data of a first instruction in the instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompiled instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1; step 2, performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M; step 3, obtaining instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises: a second instruction content and a second register set N, and the second instruction is the decompiled instruction preceding the first instruction; step 4, performing an intersection operation on the second register set N of the second instruction and the set M to obtain a set K; step 5, if the set K is not empty, extracting the polluted registers from the set K and putting them into a polluted register set L; step 6, traversing all the instructions in the instruction queue by adopting steps 2 to 5, tracking to obtain all the taint registers in the instruction queue, and storing the tracked instructions containing the taint register set to the taint queue.
Alternatively, fig. 15 is a block diagram of a computer terminal according to an embodiment of the present application. As shown in fig. 15, the computer terminal a may include: one or more processors 151 (only one shown), a memory 153, and a transmission device 155.
The memory 153 may be used to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for tracking stains in the embodiments of the present invention, and the processor 151 executes various functional applications and data processing by running the software programs and modules stored in the memory, that is, the method for tracking stains as described above is implemented. The memory 153 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 153 may further include memory located remotely from the processor, which may be connected to terminal a via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Processor 151 may invoke the information and applications stored in the memory via the transmission device to perform the following steps: step 1, initializing a tracking queue and a taint queue, and storing instruction data of a first instruction in the instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompiled instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1; step 2, performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M; step 3, obtaining instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises: a second instruction content and a second register set N, and the second instruction is the decompiled instruction preceding the first instruction; step 4, performing an intersection operation on the second register set N of the second instruction and the set M to obtain a set K; step 5, if the set K is not empty, extracting the polluted registers from the set K and putting them into a polluted register set L; step 6, traversing all the instructions in the instruction queue by adopting steps 2 to 5, tracking to obtain all the taint registers in the instruction queue, and storing the tracked instructions containing the taint register set to the taint queue.
Optionally, the processor 151 may further execute program codes of the following steps: judging whether a second instruction exists in the instruction queue; if yes, acquiring instruction data of the first instruction traced last time from the trace queue; if not, register reverse taint tracking is performed.
Optionally, the processor 151 may further execute program codes of the following steps: judging whether a set M obtained by union set operation is empty; if not, calculating a register used by the second instruction to obtain a second register set; if empty, register reverse taint tracking is performed.
Optionally, the processor 151 may further execute program code for the following steps: judging whether the set K obtained by the intersection operation is empty; if it is not empty, initializing the polluted register set L and proceeding to the step of extracting polluted registers from the set K and putting them into the polluted register set L, that is, putting the registers in the set K that meet the pollution condition into the polluted register set L; if it is empty, putting the instruction data of the first instruction back into the tracking queue, and returning to continue traversing all instructions in the instruction queue until all taint registers in the instruction queue have been tracked.
Optionally, the processor 151 may further execute program code for the following steps: traversing the registers in the set K obtained by the intersection operation; if a register r that has not been traversed exists, judging the attribute of the register r; if the register r belongs to the read register set R, putting the register r into the polluted register set L; if the register r belongs to the write register set W, putting the registers in the second register set N other than the register r into the polluted register set L.
Optionally, the processor 151 may further execute program code for the following steps: if the registers in the set K have been completely traversed, taking the difference set of the set M and the polluted register set L as a second register propagation set P2 of the second instruction, and saving the updated instruction data of the second instruction to the tracking queue and the taint queue, wherein the updated instruction data of the second instruction comprises: the second instruction content, the polluted register set L, and the second register propagation set P2.
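The in-function steps above (union, intersection, building L, and deriving P2) can be put together in a short Python sketch. It follows a literal reading of the text and is not a definitive implementation: the names `Instr`, `polluted_registers`, `trace_backward`, and `EXAMPLE` are hypothetical, the tracking queue is folded into the local variables `t` and `p`, a register that is both read and written is assumed to trigger both branches, and the smali lines in `EXAMPLE` (including the class `Lcom/example/Net;`) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    content: str        # instruction text
    reads: frozenset    # read register set R
    writes: frozenset   # write register set W


def polluted_registers(k, reads, writes):
    """Step 5: build the polluted register set L from the set K."""
    n = reads | writes              # second register set N = R ∪ W
    polluted = set()
    for r in k:
        if r in reads:              # r is read: r itself is polluted
            polluted.add(r)
        if r in writes:             # r is written: the other registers in N are polluted
            polluted |= n - {r}
    return polluted


def trace_backward(instrs, start, tracked):
    """Steps 1 to 6: walk backward from instrs[start] and return the taint queue."""
    t, p = set(tracked), set()      # first register set T and propagation set P1
    taint_queue = [(instrs[start].content, set(t), set(p))]
    for idx in range(start - 1, -1, -1):
        m = t | p                   # step 2: M = T ∪ P
        if not m:
            break                   # nothing left to track in this function
        ins = instrs[idx]           # step 3: the preceding instruction
        k = (ins.reads | ins.writes) & m   # step 4: K = N ∩ M
        if not k:
            continue                # unrelated instruction: keep T and P unchanged
        polluted = polluted_registers(k, ins.reads, ins.writes)
        p = m - polluted            # P2 = M minus L
        t = polluted
        taint_queue.append((ins.content, set(t), set(p)))
    return taint_queue


EXAMPLE = [
    Instr('const-string v1, "secret"', frozenset(), frozenset({"v1"})),
    Instr("move-object v2, v1", frozenset({"v1"}), frozenset({"v2"})),
    Instr("const/4 v3, 0x0", frozenset(), frozenset({"v3"})),
    Instr("invoke-static {v2}, Lcom/example/Net;->send(Ljava/lang/String;)V",
          frozenset({"v2"}), frozenset()),
]
```

Calling `trace_backward(EXAMPLE, 3, {"v2"})` skips the unrelated `const/4` instruction and records the `move-object` and `const-string` instructions in the taint queue. Under this literal reading a written register stays in the propagation set P2, so the tracked set never shrinks; a tighter variant could drop it, but that would go beyond what the text states.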
Optionally, the processor 151 may further execute program code for the following steps: step A', extracting instruction data of a first polluted instruction at the tail of the queue from the taint queue, wherein the instruction data of the first polluted instruction comprises: a dyed instruction content, a dyed register set, and a dyed register propagation set; step B', obtaining a first function corresponding to the first polluted instruction, and obtaining a parameter register set M' of the first function; step C', performing a union operation on the dyed register set and the dyed register propagation set, and performing an intersection operation on the result of the union operation and the parameter register set M' to obtain a set C; step D', after obtaining a calling function and a calling instruction that call the first function, calculating the sequence numbers, within the parameter register set M', of the registers contained in the set C; step E', obtaining a calling register set F contained in the calling instruction; step F', extracting the corresponding calling registers from the calling register set F by using the sequence numbers to generate a register set T; step G', putting the calling instruction data corresponding to the calling instruction into the initialized tracking queue and taint queue; and step H', returning to step A', and traversing all polluted instructions in the taint queue by adopting steps A' to G', so as to implement reverse taint tracking.
Optionally, the processor 151 may further execute program code for the following steps: judging whether the set C obtained by the intersection operation is empty; if it is not empty, acquiring a calling function and a calling instruction that call the first function; if it is empty, stopping register reverse taint tracking and outputting the tracking result.
Optionally, the processor 151 may further execute program codes of the following steps: judging whether the calling instruction is a static calling instruction or not; if the calling instruction is a non-static calling instruction, deleting a first register in all instruction registers contained in the calling instruction to obtain a calling register set F; and if the calling instruction is a static calling instruction, all instruction registers contained in the calling instruction form a calling register set F.
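Steps D' and F' above, which carry tainted parameters of the callee back to registers of the caller by sequence number, can be sketched as follows. The function name `map_to_caller_registers` is hypothetical. The sketch assumes that the parameter register set M' is available as an ordered list that excludes the implicit receiver, that the calling register set F has already been computed, and that every parameter occupies one register (wide values are ignored).

```python
def map_to_caller_registers(c, param_registers, call_registers):
    """Map tainted callee parameters to the caller's argument registers.

    c               -- set C: tainted registers that are parameters of the callee
    param_registers -- ordered parameter registers of the callee (M')
    call_registers  -- ordered argument registers of the calling instruction (F)
    """
    # Step D': sequence numbers (positions) in M' of the registers contained in C.
    seq = [i for i, reg in enumerate(param_registers) if reg in c]
    # Step F': pick the calling registers at the same positions in F,
    # which yields the new register set T to track in the caller.
    return {call_registers[i] for i in seq if i < len(call_registers)}
```

For instance, if the callee's parameters are `["p1", "p2"]`, the set C is `{"p2"}`, and the caller passes `["v4", "v5"]`, the register to track in the caller is v5.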
By adopting the embodiment of the application, the instruction data of the first instruction in the instruction queue can be stored in the tracking queue and the taint queue by circularly traversing each instruction in the instruction queue, the first register set T and the first register propagation set P1 of the first instruction are subjected to union operation to obtain the set M, the instruction data of the second instruction in the instruction queue is obtained, the second register set N and the set M of the second instruction are subjected to intersection operation to obtain the set K, and the polluted register is extracted from the set K and put into the polluted register set L.
It is easy to see that all taint registers in the instruction queue can be tracked based on the intersection of the register sets of two adjacent instructions, and that the instructions containing taint registers are stored in the taint queue. Because a register set can itself hold multiple registers, the multiple registers contained in all instructions in the instruction queue can be backtracked at the same time. With the scheme provided by the embodiment of the present application, registers are backtracked based on the intersection of the register sets of two adjacent instructions, the whole backtracking process is simple and clear, and tracking of all instructions and of multiple registers is supported.
This solves the technical problems of register backtracking methods in the prior art, in which only a single register can be tracked and the tracking process is incomplete.
It can be understood by those skilled in the art that the structure shown in fig. 15 is only illustrative, and the computer terminal may also be a terminal device such as a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, and the like. Fig. 15 does not limit the structure of the electronic device. For example, the computer terminal A may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in fig. 15, or have a different configuration than shown in fig. 15.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 5
The embodiment of the invention also provides a storage medium. Optionally, in this embodiment, the storage medium may be configured to store program codes executed by the method for performing stain tracking provided in the first embodiment.
Optionally, in this embodiment, the storage medium may be located in any one of computer terminals in a computer terminal group in a computer network, or in any one of mobile terminals in a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: step 1, initializing a tracking queue and a taint queue, and storing instruction data of a first instruction in the instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompiled instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1; step 2, performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M; step 3, obtaining instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises: a second instruction content and a second register set N, and the second instruction is the decompiled instruction preceding the first instruction; step 4, performing an intersection operation on the second register set N of the second instruction and the set M to obtain a set K; step 5, if the set K is not empty, extracting the polluted registers from the set K and putting them into a polluted register set L; step 6, traversing all the instructions in the instruction queue by adopting steps 2 to 5, tracking to obtain all the taint registers in the instruction queue, and storing the tracked instructions containing the taint register set to the taint queue.
Optionally, the storage medium is further arranged to store program code for performing the steps of: judging whether a second instruction exists in the instruction queue; if yes, acquiring instruction data of the first instruction traced last time from the trace queue; if not, register reverse taint tracking is performed.
Optionally, the storage medium is further arranged to store program code for performing the steps of: judging whether a set M obtained by union set operation is empty; if not, calculating a register used by the second instruction to obtain a second register set; if empty, register reverse taint tracking is performed.
Optionally, the storage medium is further arranged to store program code for performing the steps of: judging whether the set K obtained by the intersection operation is empty; if it is not empty, initializing the polluted register set L and proceeding to the step of extracting polluted registers from the set K and putting them into the polluted register set L, that is, putting the registers in the set K that meet the pollution condition into the polluted register set L; if it is empty, putting the instruction data of the first instruction back into the tracking queue, and returning to continue traversing all instructions in the instruction queue until all taint registers in the instruction queue have been tracked.
Optionally, the storage medium is further arranged to store program code for performing the steps of: traversing the registers in the set K obtained by the intersection operation; if a register r that has not been traversed exists, judging the attribute of the register r; if the register r belongs to the read register set R, putting the register r into the polluted register set L; if the register r belongs to the write register set W, putting the registers in the second register set N other than the register r into the polluted register set L.
Optionally, the storage medium is further arranged to store program code for performing the steps of: if the registers in the set K have been completely traversed, taking the difference set of the set M and the polluted register set L as a second register propagation set P2 of the second instruction, and saving the updated instruction data of the second instruction to the tracking queue and the taint queue, wherein the updated instruction data of the second instruction comprises: the second instruction content, the polluted register set L, and the second register propagation set P2.
Optionally, the storage medium is further arranged to store program code for performing the steps of: step A', extracting instruction data of a first polluted instruction at the tail of the queue from the taint queue, wherein the instruction data of the first polluted instruction comprises: a dyed instruction content, a dyed register set, and a dyed register propagation set; step B', obtaining a first function corresponding to the first polluted instruction, and obtaining a parameter register set M' of the first function; step C', performing a union operation on the dyed register set and the dyed register propagation set, and performing an intersection operation on the result of the union operation and the parameter register set M' to obtain a set C; step D', after obtaining a calling function and a calling instruction that call the first function, calculating the sequence numbers, within the parameter register set M', of the registers contained in the set C; step E', obtaining a calling register set F contained in the calling instruction; step F', extracting the corresponding calling registers from the calling register set F by using the sequence numbers to generate a register set T; step G', putting the calling instruction data corresponding to the calling instruction into the initialized tracking queue and taint queue; and step H', returning to step A', and traversing all polluted instructions in the taint queue by adopting steps A' to G', so as to implement reverse taint tracking.
Optionally, the storage medium is further arranged to store program code for performing the steps of: judging whether the set C obtained by the intersection operation is empty; if it is not empty, acquiring a calling function and a calling instruction that call the first function; if it is empty, stopping register reverse taint tracking and outputting the tracking result.
Optionally, the storage medium is further arranged to store program code for performing the steps of: judging whether the calling instruction is a static calling instruction or not; if the calling instruction is a non-static calling instruction, deleting a first register in all instruction registers contained in the calling instruction to obtain a calling register set F; and if the calling instruction is a static calling instruction, all instruction registers contained in the calling instruction form a calling register set F.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one type of division of logical functions, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other various media capable of storing program codes.
The foregoing is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (17)

1. A method of taint tracking, comprising:
step 1, initializing a tracking queue and a taint queue, and storing instruction data of a first instruction in the instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompilated instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1;
step 2, performing union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M;
step 3, obtaining instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises: the content of a second instruction and a second register set N, wherein the second instruction is a decompilated instruction which is previous to the first instruction;
step 4, performing intersection operation on the second register set N of the second instruction and the set M to obtain a set K;
step 5, if the set K is not empty, extracting polluted registers from the set K and putting the polluted registers into a polluted register set L;
step 6, traversing and processing all instructions in the instruction queue by adopting the steps 2 to 5, tracking and obtaining all taint registers in the instruction queue, and storing the tracked instructions containing taint register sets into the taint queue, wherein the taint registers are polluted registers;
before performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M, the method further includes:
judging whether the second instruction exists in the instruction queue or not;
if yes, acquiring instruction data of the first instruction traced last time from the trace queue;
if not, a register reverse taint trace is performed, wherein the register reverse taint trace is used to represent a function call chain trace process.
2. The method of claim 1, wherein after performing a union operation on the first set of registers T and the first set of register propagations P1 of the first instruction to obtain set M, the method further comprises:
judging whether the set M obtained by union set operation is empty;
if not, calculating a register used by the second instruction to obtain a second register set N;
if empty, a register reverse taint trace is performed, wherein the register reverse taint trace is used to represent a function call chain trace process.
3. The method of claim 1 or 2, wherein after intersecting the second set of registers N of the second instruction with the set M, resulting in a set K, the method further comprises:
judging whether the set K obtained by intersection operation is empty;
if not, initializing the polluted register set L, and entering the step 5;
if it is empty, continuing to put the instruction data of the first instruction into the tracking queue, and returning to continue traversing all instructions in the instruction queue until all taint registers in the instruction queue are tracked.
4. The method according to claim 1 or 2, wherein, in the case that the second register set N comprises a read register set R and a write register set W, extracting polluted registers from the set K and putting the polluted registers into the polluted register set L comprises:
traversing existing registers in the set K obtained by intersection operation;
if the register r which is not traversed exists, judging the attribute of the register r;
if the register r belongs to the read register set R, putting the register r into the polluted register set L;
if the register r belongs to the write register set W, putting the registers in the second register set N other than the register r into the polluted register set L.
5. The method of claim 4, wherein if the registers in the set K are completely traversed, a difference set of the set M and the polluted register set L is taken as a second register propagation set P2 of the second instruction, and updated instruction data of the second instruction is saved to the tracking queue and the taint queue, wherein the updated instruction data of the second instruction comprises: the second instruction content, the polluted register set L, and the second register propagation set P2.
6. The method of claim 1 or 2, wherein register reverse taint tracking, comprises:
step A', extracting instruction data of a first polluted instruction at the tail of the queue from the taint queue, wherein the instruction data of the first polluted instruction comprises: a dyed instruction content, a dyed register set and a dyed register propagation set;
step B', obtaining a first function corresponding to the first polluted instruction, and obtaining a parameter register set M' of the first function;
step C', performing a union operation on the dyed register set and the dyed register propagation set, and performing an intersection operation on the result of the union operation and the parameter register set M' to obtain a set C;
step D', after a calling function and a calling instruction that call the first function are obtained, calculating the sequence numbers, within the parameter register set M', of the registers contained in the set C;
step E', obtaining a calling register set F contained in the calling instruction;
step F', extracting the corresponding calling registers from the calling register set F by using the sequence numbers to generate a register set T;
step G', putting the calling instruction data corresponding to the calling instruction into the initialized tracking queue and taint queue;
and step H', returning to step A', and traversing all the polluted instructions in the taint queue by adopting steps A' to G', so as to implement reverse taint tracking.
7. The method of claim 6, wherein after performing a union operation on the dyed register set and the dyed register propagation set, and performing an intersection operation on the result of the union operation and the parameter register set M' to obtain a set C, the method further comprises:
judging whether the set C obtained by intersection operation is empty;
if not, acquiring a calling function and a calling instruction for calling the first function;
and if it is empty, stopping the register reverse taint tracking and outputting a tracking result.
8. The method of claim 6, wherein obtaining the set of call registers F included in the call instruction comprises:
judging whether the calling instruction is a static calling instruction or not;
if the calling instruction is a non-static calling instruction, deleting a first register in all instruction registers contained in the calling instruction to obtain a calling register set F;
and if the calling instruction is the static calling instruction, all instruction registers contained in the calling instruction form the calling register set F.
9. A method for obtaining a taint register, comprising:
step A1, obtaining instruction data of a first instruction in an instruction queue, where the instruction data of the first instruction includes the following parameters: a first instruction content, a first register set T, and a first register propagation set P1;
step B1, performing union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M;
step C1, obtaining instruction data of a second instruction in the instruction queue, where the instruction data of the second instruction includes: the content of a second instruction and a second register set N, wherein the second instruction is a decompilated instruction which is previous to the first instruction;
step D1, performing intersection operation on the second register set N of the second instruction and the set M to obtain a set K;
step E1, if the set K is not empty, extracting the contaminated registers from the set K;
step F1, traversing and processing all the instructions in the instruction queue by using the steps B1 to E1, and tracking to obtain all dirty registers in the instruction queue, wherein the dirty registers are dirty registers;
before performing a union operation on the first register set T and the first register propagation set P1 of the first instruction to obtain a set M, the method further includes:
judging whether the second instruction exists in the instruction queue or not;
if yes, acquiring instruction data of the first instruction traced last time from the trace queue;
if not, a register reverse taint trace is performed, wherein the register reverse taint trace is used to represent a function call chain trace process.
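The set operations of steps B1 and D1 for a single pair of instructions can be sketched as below. This is an illustrative Python fragment under stated assumptions: registers are modeled as strings, the function name `union_then_intersect` is hypothetical, and the sketch covers only one iteration, not the full traversal of step F1.

```python
def union_then_intersect(first_t, first_p1, second_n):
    """Apply steps B1 and D1 to one (first instruction, second instruction) pair.

    first_t:  first register set T of the instruction being tracked.
    first_p1: first register propagation set P1 of that instruction.
    second_n: second register set N of the preceding instruction.
    Returns the pair (M, K).
    """
    m = set(first_t) | set(first_p1)   # step B1: M = T union P1
    k = set(second_n) & m              # step D1: K = N intersect M
    return m, k


# Hypothetical registers: the tracked instruction uses v1 and still carries v3.
m, k = union_then_intersect({"v1"}, {"v3"}, {"v1", "v2"})
has_contaminated = bool(k)  # step E1 only runs when K is not empty
```

When K comes back empty, the preceding instruction touches none of the registers being tracked, so the traversal simply moves on to the next earlier instruction.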
10. An apparatus for taint tracking, comprising:
a storage module, configured to initialize a tracking queue and a taint queue, and to store instruction data of a first instruction in an instruction queue to the tracking queue and the taint queue, wherein the first instruction is a decompiled instruction to be tracked, and the instruction data of the first instruction comprises the following parameters: a first instruction content, a first register set T, and a first register propagation set P1;
a first operation module, configured to perform a union operation on the first register set T of the first instruction and the first register propagation set P1 to obtain a set M;
a first obtaining module, configured to obtain instruction data of a second instruction in the instruction queue, wherein the instruction data of the second instruction comprises: a second instruction content and a second register set N, and the second instruction is the decompiled instruction immediately preceding the first instruction;
a second operation module, configured to perform an intersection operation on the second register set N of the second instruction and the set M to obtain a set K;
a first storage module, configured to, if the set K is not empty, extract the contaminated registers from the set K and put them into a contaminated register set L;
a tracking module, configured to traverse and process all instructions in the instruction queue using the first operation module, the first obtaining module, the second operation module and the first storage module, to track and obtain all taint registers in the instruction queue, and to store the tracked instructions containing the taint register set to the taint queue, wherein the taint registers are contaminated registers;
the apparatus further comprises:
a first judgment module, configured to judge whether the second instruction exists in the instruction queue;
a second obtaining module, configured to, if the second instruction exists, obtain the instruction data of the most recently tracked first instruction from the tracking queue;
a first execution module, configured to, if the second instruction does not exist, execute register reverse taint tracking, wherein the register reverse taint tracking represents a function call chain tracking process.
11. The apparatus of claim 10, further comprising:
a second judgment module, configured to judge whether the set M obtained by the union operation is empty;
a first processing module, configured to, if the set M is not empty, calculate the registers used by the second instruction to obtain the second register set N;
and a second execution module, configured to, if the set M is empty, execute register reverse taint tracking, wherein the register reverse taint tracking represents a function call chain tracking process.
12. The apparatus of claim 10 or 11, further comprising:
a third judgment module, configured to judge whether the set K obtained by the intersection operation is empty;
an initialization module, configured to, if the set K is not empty, initialize the contaminated register set L and execute the function of the first storage module;
and a second storage module, configured to, if the set K is empty, continue to put the instruction data of the first instruction into the tracking queue, and return to continue traversing all instructions in the instruction queue until all taint registers in the instruction queue have been tracked.
13. The apparatus according to claim 10 or 11, wherein, in the case where the second register set N comprises a read register set R and a write register set W, the first storage module comprises:
a traversal module, configured to traverse the registers in the set K obtained by the intersection operation;
a fourth judgment module, configured to, if there is a register r that has not been traversed, judge the attribute of the register r;
a third storage module, configured to put the register r into the contaminated register set L if the register r belongs to the read register set R;
a fourth storage module, configured to, if the register r belongs to the write register set W, put the registers in the second register set N other than the register r into the contaminated register set L.
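The read/write rule of claim 13 can be illustrated with a short Python sketch. The function name `collect_contaminated` is hypothetical, and two details are assumptions rather than statements from the claim: N is taken to be the union of R and W, and a register that appears in both R and W is handled by both rules.

```python
def collect_contaminated(k, read_set, write_set):
    """Build the contaminated register set L from the intersection set K.

    k:         registers shared by the second instruction and the set M.
    read_set:  read register set R of the second instruction.
    write_set: write register set W of the second instruction.
    """
    n = set(read_set) | set(write_set)  # assumption: N = R union W
    contaminated = set()
    for r in k:
        if r in read_set:
            # A tracked register that is only read stays contaminated.
            contaminated.add(r)
        if r in write_set:
            # A tracked register that is written got its value from the
            # other registers of the instruction, so those are contaminated.
            contaminated |= n - {r}
    return contaminated


# Hypothetical instruction "add v0, v1, v2": writes v0, reads v1 and v2.
l_written = collect_contaminated({"v0"}, {"v1", "v2"}, {"v0"})
l_read = collect_contaminated({"v1"}, {"v1", "v2"}, {"v0"})
```

In the first call the tracked register v0 is the destination, so the taint moves backwards onto the source registers v1 and v2; in the second call v1 is only read, so it remains the contaminated register.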
14. The apparatus of claim 13, wherein the apparatus further comprises:
a second processing module, configured to, if all registers in the set K have been traversed, take the difference set between the set M and the contaminated register set L as a second register propagation set P2 of the second instruction, and store updated instruction data of the second instruction to the tracking queue and the taint queue, wherein the updated instruction data of the second instruction comprises: the second instruction content, the contaminated register set L, and the second register propagation set P2.
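The update described in claim 14 amounts to a set difference followed by two queue appends. The Python sketch below is illustrative only: the `InstructionData` record, the function name `store_updated_instruction`, and the use of plain lists as queues are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstructionData:
    """Hypothetical record for one tracked instruction."""
    content: str
    registers: frozenset      # contaminated register set L
    propagation: frozenset    # register propagation set P2


def store_updated_instruction(content, m, contaminated, tracking_queue, taint_queue):
    """Compute P2 = M minus L and append the updated record to both queues."""
    p2 = frozenset(m) - frozenset(contaminated)
    data = InstructionData(content, frozenset(contaminated), p2)
    tracking_queue.append(data)
    taint_queue.append(data)
    return data


tracking_queue, taint_queue = [], []
data = store_updated_instruction(
    "move v1, v4",              # hypothetical decompiled instruction text
    {"v1", "v3"},               # set M
    {"v4"},                     # contaminated register set L
    tracking_queue,
    taint_queue,
)
```

The registers left in P2 are those that were being tracked but that this instruction did not resolve, so they are carried forward to the next earlier instruction.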
15. The apparatus of claim 11, wherein the first execution module or the second execution module comprises:
a first extraction module, configured to extract instruction data of a first contaminated instruction at the tail of the queue from the taint queue, wherein the instruction data of the first contaminated instruction comprises: a contaminated instruction content, a taint register set, and a taint register propagation set;
a third obtaining module, configured to obtain a first function corresponding to the first contaminated instruction, and to obtain a parameter register set M' of the first function;
a third operation module, configured to perform a union operation on the taint register set and the taint register propagation set, and to perform an intersection operation on the result of the union operation and the parameter register set M' to obtain a set C;
a calculation module, configured to, after the calling function and the call instruction that call the first function are obtained, calculate the serial number, in the parameter register set M', of each register contained in the set C;
a fourth obtaining module, configured to obtain a call register set F contained in the call instruction;
a second extraction module, configured to extract the corresponding call registers from the call register set F using the serial numbers, to generate a register set T;
a fifth storage module, configured to put call instruction data corresponding to the call instruction into the initialized tracking queue and taint queue;
and a return module, configured to return to the function of the first extraction module, and to use the first extraction module, the third obtaining module, the third operation module, the calculation module, the fourth obtaining module, the second extraction module and the fifth storage module to traverse and process all contaminated instructions in the taint queue, thereby implementing reverse taint tracking.
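The serial-number mapping from callee parameter registers to caller registers in claim 15 can be sketched as follows. This Python fragment is a sketch under stated assumptions: the function name `map_to_caller_registers` is hypothetical, M' and F are modeled as ordered lists, and they are assumed to align position by position.

```python
def map_to_caller_registers(taint_set, propagation_set, param_registers, call_registers):
    """Map tainted callee parameters onto the caller's registers.

    taint_set, propagation_set: sets taken from the contaminated instruction.
    param_registers: ordered parameter register list M' of the callee.
    call_registers:  ordered call register list F of the call instruction.
    Returns the new register list T for the caller, in parameter order.
    """
    # Set C: tainted or propagated registers that are also parameters.
    c = (set(taint_set) | set(propagation_set)) & set(param_registers)
    # Serial numbers of the registers of C within M'.
    serials = [i for i, reg in enumerate(param_registers) if reg in c]
    # Pick the registers at the same positions in F.
    return [call_registers[i] for i in serials]


# Hypothetical callee parameters p0..p2, passed by the caller as v3, v7, v9.
new_t = map_to_caller_registers({"p1"}, {"v5"}, ["p0", "p1", "p2"], ["v3", "v7", "v9"])
```

An empty result corresponds to an empty set C, which is the condition under which the reverse tracking stops and the tracking result is output.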
16. The apparatus of claim 15, further comprising:
a fifth judgment module, configured to judge whether the set C obtained by the intersection operation is empty;
a fifth obtaining module, configured to, if the set C is not empty, obtain the calling function and the call instruction that call the first function;
and an output module, configured to, if the set C is empty, stop the register reverse taint tracking and output the tracking result.
17. The apparatus of claim 15, wherein the fourth obtaining module comprises:
a sixth judgment module, configured to judge whether the call instruction is a static call instruction;
a third processing module, configured to, if the call instruction is a non-static call instruction, delete the first register from all the instruction registers contained in the call instruction to obtain the call register set F;
and a fourth processing module, configured to, if the call instruction is a static call instruction, form the call register set F from all the instruction registers contained in the call instruction.
CN201510994142.9A 2015-12-25 2015-12-25 Method and device for tracking stains Active CN106919831B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510994142.9A CN106919831B (en) 2015-12-25 2015-12-25 Method and device for tracking stains

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510994142.9A CN106919831B (en) 2015-12-25 2015-12-25 Method and device for tracking stains

Publications (2)

Publication Number Publication Date
CN106919831A CN106919831A (en) 2017-07-04
CN106919831B true CN106919831B (en) 2020-10-09

Family

ID=59455575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510994142.9A Active CN106919831B (en) 2015-12-25 2015-12-25 Method and device for tracking stains

Country Status (1)

Country Link
CN (1) CN106919831B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109583200B (en) * 2017-09-28 2021-04-27 中国科学院软件研究所 Program abnormity analysis method based on dynamic taint propagation

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101996292B (en) * 2010-12-10 2012-05-23 北京理工大学 Method for analyzing safety property of software based on sequence clustering
CN104102879B (en) * 2013-04-15 2016-08-17 腾讯科技(深圳)有限公司 The extracting method of a kind of message format and device
CN103995782B (en) * 2014-06-17 2016-06-22 电子科技大学 A kind of stain based on stain invariant set analyzes method
CN104766012B (en) * 2015-04-09 2017-09-22 广东电网有限责任公司信息中心 The data safety dynamic testing method and system followed the trail of based on dynamic stain

Also Published As

Publication number Publication date
CN106919831A (en) 2017-07-04

Similar Documents

Publication Publication Date Title
CN108733713B (en) Data query method and device in data warehouse
CN108197114B (en) Method and device for detecting table data, storage medium and electronic device
CN104536729B (en) It is a kind of to realize the method and apparatus that screenshot is carried out in browser page
CN103777866B (en) Data editing method and device for magnifying glass
CN106709974B (en) Game scene drawing method and device
CN110421974B (en) Marking method and device and laser marking machine
CN106919831B (en) Method and device for tracking stains
CN115455166A (en) Method, device, medium and equipment for detecting abnormality of intelligent dialogue system
CN107977504A (en) A kind of asymmetric in-core fuel management computational methods, device and terminal device
WO2016165461A1 (en) Automated testing method and apparatus for network management system software of telecommunications network
CN109189343A (en) A kind of metadata rule method, apparatus, equipment and computer readable storage medium
CN109739859B (en) Relation map drawing method, system and related device
CN107679222A (en) Image processing method, mobile terminal and computer-readable recording medium
CN115935917A (en) Data processing method, device and equipment for visual chart and storage medium
CN112070852A (en) Image generation method and system, and data processing method
CN109118413A (en) Urban activity demographics method and device thereof, computer-readable medium
CN105718214B (en) The access method and device of fingerprint mask image
CN108255486A (en) For view conversion method, device and the electronic equipment of form design
CN106919429B (en) Method and device for processing decompiled data
CN113656127A (en) Page routing method, device, storage medium and processor
CN111951355A (en) Animation processing method and device, computer equipment and storage medium
CN111046249A (en) Data storage, positioning and application method and related device
CN107678812A (en) The processing method and processing device of browser interface
CN115712622B (en) Electric power transaction data processing method, system, computer device and storage medium
CN106919430B (en) Method and device for processing register in decompiling instruction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant