WO2020248512A1 - Method for constructing runtime model of terminal application behavior - Google Patents

Method for constructing runtime model of terminal application behavior Download PDF

Info

Publication number
WO2020248512A1
WO2020248512A1 PCT/CN2019/119272 CN2019119272W WO2020248512A1 WO 2020248512 A1 WO2020248512 A1 WO 2020248512A1 CN 2019119272 W CN2019119272 W CN 2019119272W WO 2020248512 A1 WO2020248512 A1 WO 2020248512A1
Authority
WO
WIPO (PCT)
Prior art keywords
activity
runtime
activities
terminal application
model
Prior art date
Application number
PCT/CN2019/119272
Other languages
French (fr)
Chinese (zh)
Inventor
蔡华谦
黄罡
张颖
刘譞哲
Original Assignee
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学 filed Critical 北京大学
Publication of WO2020248512A1 publication Critical patent/WO2020248512A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/448Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4482Procedural

Definitions

  • the present invention relates to computer technology, in particular to a method for constructing a runtime model of terminal application behavior.
  • Computational reflection is a common operating state monitoring and control application main mechanism provided by system software such as programming languages, operating systems, and middleware. Based on computational reflection, various development frameworks and testing frameworks can be implemented to improve the efficiency of developers in code development, testing and even running deployment.
  • reflexivity refers to the ability of a program to manipulate a set of data at runtime. This set of data describes the running state of the program. Manipulation has two implications: 1) Introspection, the program can observe and reason about itself 2) Control (Intercession), the program can change its operation or semantics. Both of these aspects need to be able to encode the state of program execution as data, and providing this kind of encoding is called reflection.
  • reflection is actually mapping the running state of the program into a set of operable data.
  • the former part constitutes the base-level entity
  • the latter part forms the meta-level entity
  • the causal relationship between the base-level entity and the meta-level entity is maintained.
  • the calculated reflection is mainly divided into structural reflection and behavior reflection.
  • the basic entity of structural reflection is the current program and its abstract data type (can be regarded as the state of the application)
  • the basic entity of behavior reflection is the execution behavior of the current program and the data required for its execution (can be regarded as the behavior of the application) .
  • Structural reflection refers to the ability of a programming language to reflect the current program and its abstract data types. It naturally exists because it is similar to the ability of a programming language framework (runtime or framework), and is an inherent ability of most programming language frameworks.
  • Behavior reflection refers to the ability of a programming language to provide its own execution semantics and the data reflection required for its execution, that is, the programming language framework itself needs to be reflected, and behavior reflection faces two challenges in monitoring and control: First, it needs to be complete Describe the existing application behavior, that is, monitor the execution of the application. The execution of an application can be regarded as a set of runtime activities. The finer the granularity of the activity, the richer the information to be monitored, the more resources the monitoring function occupies, and the more serious resource competition between it and business logic. At this time, the complexity and scale of application behavior monitoring have become the primary challenge for terminal application behavior reflection. Second, the existing programming language and the behavior reflection of system software such as operating system and middleware do not support instruction-level behavior control. The fundamental reason lies in the complex data and control dependencies contained in the instruction sequence. Therefore, the instruction-level application behavior Control has become the main difficulty of terminal application behavior reflection.
  • the main purpose of the present invention is to provide a method for constructing a runtime model of terminal application behavior, which overcomes the above-mentioned first challenge and realizes complete monitoring of terminal application runtime behavior.
  • the present invention proposes a method for constructing a runtime model of terminal application behavior.
  • the runtime model includes a runtime stack model and a runtime heap model.
  • the method includes constructing a terminal application behavior. The steps of the runtime stack model and the steps of constructing the runtime heap model of the terminal application behavior;
  • the step of constructing the runtime stack model of the terminal application behavior includes:
  • control flow graph For the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter;
  • the step of constructing the runtime heap model of the terminal application behavior includes:
  • a heap operation activity is generated, and a runtime heap model of the terminal application behavior is obtained.
  • the present invention uses a behavior interpreter to generate a complete, accurate, and detailed application behavior self-report, that is, a runtime model of terminal application behavior, which overcomes the dynamic, changeable, and difficult-to-controllability of the prior art
  • the application runtime environment's insufficient monitoring of terminal application application behavior enables flexible and complete monitoring of terminal application application behavior, and provides technical guarantee for subsequent implementation of command-level control of terminal application application behavior.
  • Figure 1 is an existing 3G radio resource control state machine
  • Figure 2(a) is a schematic diagram of the network request control flow before the merger in an example of network request merger
  • Figure 2(b) is a schematic diagram of the network request control flow after merging in an example of network request merging
  • Figure 3 is an example of communication dependencies between threads-a schematic diagram of the producer-consumer model
  • Figure 5 is an example of Android multi-threaded programming
  • Figure 6 is an example of dependencies between multithreaded programming
  • Figure 7(a) is the heap area object before execution
  • Figure 7(b) is the heap area object after execution
  • FIG. 8 is a schematic diagram of the architecture of the Reflectall model generation subsystem of the present invention.
  • Fig. 9 is a schematic structural diagram of an interface operation subsystem of Reflectall, an example of the present invention.
  • Figure 10(a) is the experimental result on the open source application set
  • Figure 10(b) is the experimental result on the closed source application set
  • Figure 11 is a comparison chart of application startup time results when Reflectall and Emma generate code coverage reports.
  • FIG. 12 schematically shows a block diagram of a computing processing device for executing the method according to the present invention.
  • Fig. 13 schematically shows a storage unit for holding or carrying program codes for implementing the method according to the present invention.
  • the present invention selects two typical cases of application function evolution scenarios to analyze, so as to clarify the fundamental reason why the existing behavior reflection is not applicable.
  • IDLE Idle state, in this state, the power consumption of the 3G module is the lowest, and it cannot send or receive any data. In this state, if you want to send or receive data, it will transition to the CELL_DCH state.
  • CELL_DCH In this state, the bandwidth of the 3G module reaches the maximum, and data transmission can be carried out at the maximum rate at this time, and its power consumption is also the maximum. If it continues for a period of time and there is still no data transmission, it will transition to the CELL_FACH state. Depending on the settings of different operators, the duration of continuous operation in the CELL_DCH state is usually 5 seconds to 10 seconds.
  • CELL_FACH In this state, the power consumption of the 3G module is 50% less than that of CELL_DCH. At the same time, in this state, its network transmission rate is also lower. If in this state, the data sent or received is greater than a certain threshold, it will re-transition to the CELL_DCH state. If no data is sent or received in the CELL_FACH state for a period of time, it will transition to the IDLE state. Generally speaking, this period of time is usually 10 seconds to 15 seconds.
  • Figure 2 shows an example of a network request merge.
  • Figure 2(a) is the network request and the power consumption of the wireless communication module before merging, the horizontal axis is time, the upper half is the power consumption of the wireless communication module; the dotted line in the lower half is the thread that initiated the two network requests; The solid line in the lower half is the control flow.
  • a background news push thread wakes up the thread responsible for sending network requests1; after the thread is awakened, it initiates network requests2.
  • the power consumption of the wireless communication module also changes from the low power consumption in the IDLE state. It is the high power consumption in the CELL_DCH state; when the entire request is completed, the thread responsible for sending the network request returns the result to the news push thread 3.
  • the wireless communication module does not receive or send data, it will still remain at high power
  • the power consumption of the wireless communication module starting from this is called the "end time power consumption", which corresponds to the slash part used in Figure 2(a); the news push thread receives the returned result 4, Handle it and prompt 5 on the notification bar.
  • another version update thread also executed similar logic 6 and sent a network request. As shown in Figure 2(a), because the two network requests are separated by tens of seconds, the wireless communication module is woken up twice, so there are two corresponding "tail times", which results in additional network Energy consumption.
  • a network request scheduling mechanism that is, the network request that was originally sent directly can be delayed; 2) a network request scheduling algorithm, that is, to find out the request that can be delayed, and at the same time Use the scheduling mechanism for delayed transmission.
  • Structural reflection can be used to automatically reconstruct the network request execution logic of mobile applications, and the scheduling mechanism can be built into the application.
  • this requires developers of different applications to use the same automatic refactoring framework, and all applications need to be recompiled, deployed and run. This is obviously unrealistic for a large number of closed-source applications that belong to different application developers.
  • WeChat has become more than just a simple communication application, it has also become an essential tool for work communication; it has spawned "WeChat business” that uses WeChat Moments and Official Accounts for marketing; it has become the largest free Media’s publishing platform.
  • the core of WeChat is a communication tool, and its function is still to satisfy ordinary users. Even so, it is difficult to meet the specific needs of ordinary users. For example, as WeChat has been used for longer and longer, its cached chat log files have also become larger and larger, and it is difficult for ordinary users to manage their chat logs. Furthermore, WeChat is difficult to meet the specific needs of special groups such as WeChat businessmen and self-media people.
  • the starting point of its execution lies in operations such as clicking, dragging, and inputting user interface elements.
  • access to external resources by means of network requests, database queries, and file reading and writing, to obtain corresponding data or realize corresponding functions.
  • most of the logic is similar to the execution logic of the interoperable programmable interface, but the starting point of its execution is different.
  • the granularity of existing behavioral reflection monitoring and control is method level.
  • existing functions can correspond to running A group of program activities at a time, the method-granular behavior reflects that the content of monitoring is limited, and the execution of instructions in the method cannot be monitored, and then it cannot be controlled. This has led to the fact that existing solutions are often based on existing code and documentation. For a development team, the flow of developers, the lack of documentation, and even non-standard source code comments will change the iterative development of mobile applications. Too difficult to proceed.
  • the embodiment of the present invention proposes a method for constructing a runtime model of terminal application behavior.
  • the runtime model includes a runtime stack model and a runtime heap model.
  • the application runs in the operating system, it can be regarded as one or more processes.
  • the operating system loads the executable file required by the mobile application into the memory and starts execution.
  • the memory occupied by a process can be divided into three areas:
  • Code segment a memory area where the execution code is stored, with read-only attribute
  • Heap area It can be divided into a memory area (data segment) used to store global variables and a memory area used for dynamic allocation during process operation.
  • a memory area data segment
  • a memory area used for dynamic allocation during process operation.
  • Java a thread creating a new object is equivalent to A piece of memory is applied for in the heap area;
  • Stack area used to temporarily store local variables, etc.
  • Java when a thread calls a method, it will apply for a new frame (frame), and data such as parameters required by the method are stored in the frame.
  • the runtime model of the application needs to be able to reflect the application over a period of time: 1) Code execution: During development, the code of the mobile application can be abstracted as a control flow graph, so corresponding to the runtime, the execution of the code can be abstracted as One or more paths of the control flow graph; 2) Changes in memory data (such as heap area): During development, developers will design various data structures to represent the data model of the application, and at runtime , The execution of the code causes the creation, modification, and deletion of instances of these data structures, which corresponds to a set of memory allocation and modification operations. From the perspective of memory area, the most important areas affected by program execution are the stack area and heap area of the memory. The path in the control flow diagram in 1) can be regarded as a description of the stack change, and the main reflection in 2) is the change of the heap area data.
  • the application runtime model constructed by the present invention includes a runtime stack model describing stack changes and a runtime heap model describing heap changes.
  • the runtime stack model also includes the acquisition of code, which completely divides the memory occupied by a process into three areas.
  • B ⁇ b 1 , b 2 ,..., b n ⁇ is a basic block
  • the control flow graph is instantiated into one or more control flows, and basic blocks are executed according to the path in the control flow graph.
  • the present invention calls the basic block executed at a certain moment as an activity, and the runtime stack model for a period of time is composed of a control flow graph, one or more control flows, and a set of activity sequences.
  • the activity sequence is the instruction execution sequence.
  • I ⁇ i 1 , i 2 ,..., i n ⁇ , which means t
  • the heap state of the program from 1 to t n .
  • the runtime stack model can be regarded as a collection of multiple paths in the control flow graph. Therefore, the edges in the runtime stack model must be in the control
  • the occurring context has a temporal sequence relationship for two activities in the same control flow; if there are edges for two activities in different control flows, it indicates that the two activities still have a dependency relationship.
  • Activity definition program a j depends on the synchronization program activities a i, a j if the start or end of a i by the implementation of the decisions, in general, a i and often some thread synchronization operation. It is said that a j communication depends on a i , if a certain data dependency of a j is generated by a i activity.
  • the granularity of the basic block is the granularity of the basic block of the source code.
  • Each control flow of the runtime stack model corresponds to the execution sequence of a Java thread. There are six states of thread state transition:
  • Running The thread is in a running state, and the thread in this state may wait for some system resources, such as CPU;
  • Blocking The thread is waiting for a Monitor Lock. For example, when the thread enters the method or code block modified by the synchronized keyword, the thread will enter the blocking state;
  • Waiting/Timed Waiting The thread is waiting, for example, when the thread calls the wait method of an object to enter the waiting state. When the notify method of the object is used, the thread will re-enter the running state;
  • a thread in a running state can wake up another thread in a non-running state to enter the running state.
  • the present invention calls this relationship between threads a synchronization dependency relationship.
  • these inter-thread wakeups occur, corresponding to the runtime stack model, there is a cross-thread (cross-control flow) edge between the activities of the threads in the running state and the activities that occur when the non-running threads enter the running state.
  • these thread dependencies can be summarized into four categories, as shown in Table 1.
  • Table 1 Classification of synchronization dependencies at the Java language level
  • Table 1 the activity of each running thread corresponds to the activity of the non-running thread. Therefore, the inter-thread dependency in Table 1 is called the synchronization dependency.
  • Java provides a variety of multi-threaded programming libraries to support. For example, java.util.concurrent provides read-write locks, reentrant locks, blocking locks, and thread pools.
  • Figure 3 shows an example of communication dependencies between threads-the producer-consumer model.
  • the Task class represents computing tasks; the static field tasks represents a queue of pending tasks; the postTask method represents generating and submitting tasks; and the handleTask method represents processing tasks.
  • Thread 1 represents the producer thread, which will submit tasks to the pending task queue; 2) Thread 2 represents the consumer thread, and will check the pending task queue at regular intervals. And handle the corresponding tasks.
  • the generation of the activity relationship in the runtime stack model must depend on the corresponding data at runtime.
  • the data flow analysis algorithm calculates the data flow equation based on the structure of the control flow graph and iterates to a stable point. Therefore, in addition to the above-mentioned runtime stack model, the runtime model of the application also needs a runtime heap model that describes changes in the data state of the memory heap area.
  • the classic data flow diagram is often used in the requirements analysis phase.
  • the software uses the data flow diagram to decompose the software system to be developed layer by layer from abstract to concrete.
  • the data flow graph is a directed graph, which contains two different types of edges and a variety of different nodes to describe data starting from an initial node, performing layer-by-layer calculations, and finally getting the final result.
  • a certain node of the data flow graph essentially corresponds to a set of changes in memory data. Therefore, the heap model of the application behavior runtime model of the present invention focuses on changes in memory data, rather than changing operations.
  • the runtime heap model of the present invention only models the heap area of the memory during application runtime from the perspective of memory data changes.
  • D ⁇ d 1 , d 2 ,..., d n ⁇
  • A ⁇ i 1 , i 2 ,..., i n ⁇
  • T ⁇ T 1 , t 2 ,..., t n ⁇ , which is the time stamp.
  • step flow chart of a method of constructing a runtime model of terminal application behavior of the present invention the method includes the steps of constructing a runtime stack model of the terminal application behavior and constructing the terminal application behavior The steps of the runtime heap model.
  • the step of constructing the runtime stack model of the terminal application behavior may specifically include:
  • Step S401 When the terminal application is running, obtain the actually executed code in the memory of the terminal application, and abstract the actually executed code to generate a control flow graph;
  • Step S402 Regarding the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter;
  • Step S403 Use the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generate stack activities when the terminal application is running;
  • Step S404 When the terminal application is running, the dependency between the control flows of the stack activity is generated to obtain the runtime stack model of the terminal application behavior;
  • the construction of the runtime stack model includes the following three basic elements: 1) Control flow graph: contains all possible activities and all possible activity relationships, is the source code of the program, or the abstraction of the intermediate code Representation; 2) A group of activities that occur at runtime, that is, a path in the control flow graph, can be regarded as a group of nodes in the stack model; 3) The relationship between activities that occur at runtime, that is, the edges of the stack model.
  • the construction of the stack model needs to focus on solving three challenges: First, due to technologies such as compilation optimization and runtime just-in-time compilation, the source code of the application and the bytecode generated by the compilation may be different from the code segment in the runtime memory. How to ensure The control flow graph and runtime activities can be mapped correctly; second, how to generate activities with different granularities to describe the changes of complex application running state; third, because current applications use a lot of multi-threaded compilation to ensure the response speed of the interface, To improve user experience, how to generate dependencies between runtime control flows.
  • the first step of the embodiment of the present invention is to obtain the actually executed code in the memory at runtime and abstract the currently executed code, which can ensure the accurate mapping of the control flow graph and the runtime activities.
  • a behavior interpreter is proposed. The behavior interpreter takes the control flow graph generated in the previous step as input to interpret and execute it.
  • the third step is to explain the runtime activities of the generated application during the execution process; and the last step of model generation is to generate the dependency between the control flow at runtime.
  • Control flow graph generation During the application development process, since the installation package release will contain the intermediate code compiled by the application, for protection purposes, the application will use various obfuscation tools to generate the intermediate code, such as the dex bytecode under Android , To be confused. This makes it difficult to map the directly provided source code with the activities performed during application runtime. These obfuscated codes will be loaded and executed through the application runtime environment. For example, the Dex bytecode in an Android application will be executed in Android Runtime (ART). In the present invention, the bytecode of the application is obtained by modifying the running time of the application. This approach can bring two benefits.
  • control flow graph generation Specifically, control flow graph generation:
  • An instruction is the beginning of a basic block if and only if: 1) it is the first instruction of a method or 2) a certain instruction may jump to the current instruction.
  • An instruction is the end of a basic block, if and only if: 1) it is the return of a method, such as return, throw instructions; or 2) it is a jump instruction, such as if, goto, or the instruction may throw an exception .
  • the control flow graph generation algorithm of the present invention is divided into the following three steps:
  • the execution of the control flow graph is redistributed on demand, and the activities of the application runtime generated by the behavior interpreter.
  • each thread corresponds to a control flow, and each control flow can be regarded as a set of ordered activities.
  • This group of activities can be regarded as a path of the control flow graph generated in the previous step. Therefore, the present invention proposes a behavior interpreter suitable for monitoring program execution. According to the configuration, assign the control flow graph that needs to be monitored to the behavior interpreter for execution. If the execution of each instruction corresponds to an activity, this group of instruction sequences will become extremely large and difficult to process: 1. Numerical calculation statements are difficult to correspond to semantics; 2. A large number of activities generated by program loops will obliterate the real Processing logic. Therefore, the present invention divides activities into numerical calculation, branch control, method invocation, etc., and implements an activity filter that provides multiple granular activity screening in the behavior interpreter, so as to generate a suitable stack model.
  • a class filter and an activity type filter are included; wherein the class filter is based on a coarse-grained filter based on regular matching of package and class names, and is used to remove program activities that developers do not care about;
  • the activity type filter is based on fine-grained filtering of activity types, and is used to remove activity types that are not of concern to developers.
  • the activity type of the stack activity includes method start and method end, field read, array read, and synchronization instructions; based on the above activity type, the implementation method of step S403 includes:
  • class filter uses the class filter to perform coarse-grained screening of the activities of the terminal application during runtime, and generate stack activities caused by the class;
  • the activity type filter is used to perform fine-grained screening of the stack activity.
  • the embodiment of the present invention can generate a required stack model by flexibly specifying a specific package, class, and instruction type, which improves the ease of use.
  • the present invention regards the start and end of the execution method call instruction as activities and records. From the point of view of Java's method invocation, its invocation appears to be a tree-like structure: for a certain method invocation, multiple method invocations may occur during the execution. Therefore, in order to ensure that the generated sequence can be restored to this tree-like structure, the subscript s in the present invention represents the activity at the beginning of the method call, and the subscript e represents the activity at the end of the method call.
  • the two program execution situations will correspond to two different sequences:
  • the method call that reconstructs the generated activity sequence into a tree structure can adopt the call tree construction algorithm.
  • the algorithm process is actually the process of simulating the execution process of the Java virtual machine.
  • each thread's activity corresponds to an actions object.
  • two data structures are maintained: 1) the sub-control flow queue that has been executed; 2) the function stack of the current control flow execution. Traverse each activity in actions in order, and make the following judgments:
  • the current activity is the method start type, instantiate a new sub-control flow, push the newly instantiated sub-control flow onto the function stack, and add it to the activity queue of the current control flow. Finally, the current control flow is set to the sub-control flow just instantiated.
  • the current activity is the method end type, pop the stack operation. If the function stack is empty after the pop-up stack ends, it means that the sub-control flow of the current thread has been executed and can be added to the executed sub-control flow queue; if the function stack is not empty, the current control flow is set to the function The sub-control flow at the top of the stack.
  • the runtime activity representation implementation can have a storage form: it can be an object in the memory, or a persistent binary file or an ASIC II file.
  • the runtime heap model can be expressed in the form of Backus paradigm.
  • the present invention achieves this scalability through an active serialization and deserialization mechanism.
  • the sequence of activities in the runtime model is stored in a buffer with a configurable size.
  • the activities in the buffer are serialized and persisted to local storage .
  • Thread dependency is a relationship related to time: at a certain moment, the main thread can send calculation tasks to the background thread, and the activities performed by the background thread at this time depend on the activities of the main thread; and at the next moment, the background thread completes the calculation task After that, the main thread is notified to update the interface; at this time, the activities performed by the main thread depend on the activities of the background thread. Therefore, the present invention classifies these inter-thread dependencies, and processes different types of dependencies to generate these dependencies at runtime.
  • the dependency relationship includes synchronization dependency and communication dependency.
  • the thread state transfer related methods provided in the Java language specification are used between threads, such as Thread.join, Object.wait, Object.notify, etc., to realize the coordination between multiple threads.
  • the present invention calls the dependencies between these threads as synchronization dependencies. .
  • objects are used between threads to achieve coordination among multiple threads, and the present invention calls these dependencies among threads as communication dependencies.
  • application developers will reuse various multi-threaded programming classes provided by the framework layer to improve development efficiency.
  • the framework layer provides a good semantic application programming interface to the application layer classes, shielding the implementation details, but in order to ensure the performance and robustness of the framework, the implementation is often more complicated. Programs implemented using these programming frameworks may be dependent on synchronization or communication between threads at runtime.
  • the BackgroundTask.execute method will call the ThreadPoolExecutor.execute method during the execution process (activity 1), and then call the start method of the background thread object (activity 2), which will further cause the background thread to run Method call (activity 3).
  • the background thread's run method starts to execute, it will eventually call the BackgroundTask.doInBackground method (activity 4) after layers of calls.
  • the execution of this method in addition to calling the calculate method for calculation tasks, it will also call AsyncTask.publishProgress Method (activity 5), so that the foreground thread calls the onProgressUpdated method (activity 6) to update the interface. Subsequently, after the background thread ends the calculation task, it will notify the foreground process that the current task has ended in a similar manner.
  • activity 2 and activity 3 are synchronous dependence
  • activity 5 and activity 6 are communication dependence.
  • the methods related to synchronization dependencies in Java are considered to be activities that need to be collected.
  • the runtime stack model can collect various activities related to synchronization dependencies as shown in Table 1.
  • the subsequent activity may be the end of a method or the beginning of a method.
  • the synchronization dependencies can be divided into two types and processed separately:
  • the timestamp is used to find matching activities in other threads from back to front. If found, it corresponds to a synchronization dependency.
  • Thread.join depends on the end of Thread.run
  • Object.wait depends on the end of Object.notify.
  • Thread.join or Object.wait you can use the timestamp to find others from back to front. Matchable activities in the thread. If found, it corresponds to a synchronization dependency.
  • the timestamp is used to find matching activities in other threads from back to front. If found, it corresponds to a synchronization dependency Relationship; for example, the end of Thread.join depends on the end of Thread.run; the end of Object.wait depends on the end of Object.notify. For methods such as Thread.join or Object.wait to end the activity, you can use the timestamp to move forward Find matching activities in other threads. If found, it corresponds to a synchronization dependency.
  • the start of an activity depends on the end of another activity
  • the start of Thread.run depends on the end of Thread.start
  • the start of an activity that is, a method call
  • the current Thread.run needs to be checked first: if the activity is in the current thread. The first activity executed, it depends on another thread Thread.start to end the activity, otherwise it is just a normal method call and does not depend on the activity of another thread.
  • the threads are not based on the method provided by Java that can realize the thread state transition, and realize the coordination among multiple threads.
  • the present invention refers to the dependency between these threads as communication dependency.
  • the concrete implementation is based on the next method and enqueueMessage method of MessageQueue.
  • the enqueueMessage method caused by activity 5 will only add the current task to the queue, and will not explicitly wake up the foreground thread.
  • the knowledge base can also support application customization.
  • the step of constructing the runtime heap model of the terminal application behavior may specifically include:
  • Step S405 Generate an initial state of the heap area when the terminal application is running
  • Step S406 Generate a heap operation activity, and obtain a runtime heap model of the terminal application behavior.
  • the runtime heap model includes the following basic elements: 1) the initial state of a heap area; 2) a group of activities that affect the data of the heap area during operation.
  • the present invention first provides a description method of the initial state of the heap area, and generates an initial state of heap data that conforms to the representation during operation.
  • the present invention provides a description method of heap operation activities, and constructs activities in the runtime heap model at runtime.
  • the BNF representation of the initial state of the heap area and the heap operation activity is given.
  • the initial state of the heap area is the state of the heap area data at the start time.
  • the heap is the area used to analyze all class instances and arrays at runtime, and this area is managed by an automatic storage management system (ie garbage collector) .
  • Objects in the heap are never explicitly collected, but are automatically collected by the garbage collector.
  • the initial state of the heap area can be regarded as a snapshot of the heap area data at a certain moment. Therefore, if the data state of the memory heap area is generated, if other threads continue to execute and perform heap area operations (such as creating objects and performing garbage collection) Etc.), it will destroy the atomicity of the initial state. Therefore, the present invention first provides a BNF representation describing the initial state of the heap area, and adopts a method of "freezing" the heap area data when generating the initial state of the heap area data to ensure the atomicity of the initial state generation process.
  • the description of activities is expanded here to support the description of garbage collection activities, similar to the runtime stack model, providing multiple granular activity selection filtering options to generate a suitable heap model.
  • the generated heap model describes the changes of the objects of interest in detail. Therefore, the heap object state query algorithm based on timestamp can query the state of the heap object at any time.
  • the data in the Java heap area only includes instantiated objects and arrays.
  • the creation of objects may occur in application layer code or framework layer code. Therefore, we divide the objects in the application into application layer and framework layer.
  • the objects of the heap area are shown in Figure 7(a). Each circle in the figure represents an object, and the line between the circle and the circle represents the reference relationship.
  • the objects related to the application business logic in Figure 7(a) include the display interface FloatActivity, the button that can trigger the background calculation task, the background task to be processed, the BackgroundTask, the TextView used to display the calculation result of the task, and the click event listener OnClickListener.
  • a MessageQueue object in the framework layer Realizing background processing tasks can notify the foreground to update.
  • the solid arrow in Figure 7 indicates the reference relationship of the object, that is, if there is a field in object A that points to another object B, there is a directed connection from A to B; the dotted arrow indicates that there is a Through the reference relationship of the object, and at the end, there is no reference relationship.
  • the ThreadPoolExecutor.execute method will be called.
  • the object since the object executes the execute method for the first time, the object will create a new Thread object1, which is The thread of background execution; after calling the start method of the background thread object, it will further cause the call of the run method of the background thread and officially start calling the doInBackground method.
  • this method in addition to calling the calculate method to perform the calculation task, it also calls the AsyncTask.publishProgress method.
  • the incoming parameters will be encapsulated into the newly created Integer[] object 2; During execution, a new Message object 3 will be created and placed in a global MessageQueue queue.
  • the front thread receives the Message and executes onProgressPublished, it will create a StringBuilder object 4 to construct the parameters required by setText, and instantiate a new String object 5 through the StringBuilder.toString method.
  • a new StringBuilder object 6 will be created, and the return value String object 7 will be calculated.
  • the String object will be encapsulated again in a newly created Message object 8, and notify the foreground thread to execute the onPostExecute method.
  • Thread object does not directly rely on the BackgroundTask, but will be encapsulated by FutureTask, Callable and other objects, and indirectly rely on the BackgroundTask object.
  • objects 1 to 8 may be collected in a garbage collection.
  • the present invention treats the instantiation, field assignment, and recycling of each object in Figure 7(b) as an activity. Similar to the representation of the runtime stack model, the following invention preferably provides a representation of the runtime stack model in the form of Backus paradigm.
  • DataAction is similar to the ControlAction described in the key technology of runtime stack model construction. ControlAction is used to describe the execution of instructions, and DataAction is used to describe changes in memory data. Number represents a number type, which can be a numeric value or a memory address; String represents a string type. From the above representation, it can be found that the complexity of the model mainly depends on: 1) the number of objects in the initial state; 2) the number of data activities in the heap area.
  • App Heap the memory area used by the current application to instantiate objects and arrays
  • Image Heap which is loaded The memory area of the current application mirroring
  • Zygote Heap the memory area where the system classes loaded when the system starts.
  • the initial state described in the present invention mainly focuses on the application stack that changes the most during runtime.
  • Dalvik virtual machine and ART virtual machine on the Android platform they realize that the current application heap state can be saved as a file at any time (heap dump operation).
  • This file is a private memory image format, which can be converted to hprof format that complies with J2EE platform regulations through Android Developer Tools.
  • the current application heap dump can only reflect the heap state at a certain moment, and it is difficult to reflect the heap state at any time within a period of time.
  • all threads need to be suspended to perform a heap dump operation, which is time-consuming.
  • the generated files range from tens of megabytes to hundreds of megabytes, and it is difficult to implement a heap dump operation every once in a while.
  • the state of the heap at any time within a period of time.
  • performing a heap dump operation does not dump those objects that are recovered by the collector, including those temporary objects generated during execution.
  • the temporary objects generated are also very important for the execution of the described process.
  • the objects 2 to 8 in Figure 7 are all temporary objects, and these objects cannot be persisted directly using heap dumps.
  • the step of constructing the runtime heap model of the terminal application behavior includes: the terminal application runtime activities include instantiation activities, modification activities, and recycling activities.
  • the instantiation activity that is, the activity of creating a new object and a new array
  • the instantiation activity can correspond to the execution of instructions such as newInstance and newArray in the bytecode at runtime.
  • ModifiyAction that is, the activity of modifying the value of the static field of the class, the field of the object, and the element of the array, can correspond to instructions such as sput, iput, aput in the bytecode.
  • GCAction is an activity that affects objects in the heap when garbage collection is performed.
  • the garbage collection mechanism is an automatic memory management mechanism. When the data in a piece of memory will no longer be used, it will be reclaimed and released to facilitate the next allocation.
  • Specific garbage collection algorithms are implemented with reference counting method and reachability analysis algorithm.
  • the recycling activity does not correspond to the instructions of the dex bytecode at runtime, because its specific implementation is at the virtual machine level. Recycling activities can be further subdivided into cleanup activities and compression activities.
  • cleanup activity is to clean up objects that are no longer needed; while the so-called compression activity is to organize active objects into contiguous memory space to avoid allocation failure due to fragmentation when allocating large memory.
  • the framework layer code will also create a large number of objects, in some cases, even several times the objects created by the application layer. It is necessary to provide a mechanism for the complexity management of the heap model to ensure the accuracy and ease of use of the generated runtime stack model. Similar to the aforementioned two-level screening mechanism, there is also a two-level screening mechanism for the activity generation of the heap model. Coarse-grained filtering based on regular matching of package and class names and fine-grained filtering based on activity type.
  • the present invention preferably provides 6 types of heap operation activities, and the activity types of the heap operation activities include object instantiation, array instantiation, object field writing, array element writing, clearing activity and compression activity;
  • the steps of generating a heap operation activity include:
  • class filter uses the class filter to perform coarse-grained screening of the terminal application runtime activities to generate heap operation activities caused by the class;
  • the activity type filter is used to perform fine-grained screening of the heap operation activity.
  • Reflectall For Android mobile applications widely used in the mobile Internet, a prototype system implementation of the behavior reflection framework: Reflectall is given.
  • the full name of Reflectall is Reflection at low level interpreter, which has two meanings. One is reflection based on the underlying behavior interpreter; the other is that it can monitor and control instruction-level application behavior.
  • Reflectall is based on the Android operating system open source project.
  • the Reflectall platform can be divided into a behavior runtime model construction subsystem, a model analysis and code generation subsystem, and an operation subsystem to realize the monitoring and control in the behavior reflection framework.
  • Reflectall's behavior runtime model construction subsystem implements the construction of mobile application behavior runtime models. Its core implementation is at the system layer and consists of four modules: optimization-deoptimizer, behavior interpreter, model construction and interface layer. These four modules realize the monitoring and control of mobile application behavior.
  • optimization-de-optimizer the Android runtime environment can load native instructions that can be directly executed by the CPU. Therefore, it is necessary to switch the original instruction to bytecode, that is, de-optimization, and perform interpretation and execution through the behavior interpreter, so as to realize the monitoring of mobile application runtime activities. Due to the complexity of mobile applications, it is difficult to monitor all activities in the execution of mobile applications, so a two-level screening mechanism is introduced.
  • the optimization-de-optimizer implements the class filtering mechanism in the two-level filtering mechanism. Through the optimization-de-optimizer, the classes to be monitored can be de-optimized into bytecodes as needed, and interpreted and executed; while for unmonitored The class is still executed in the native executor.
  • the optimization-de-optimizer will be triggered in the following three situations: 1) When receiving the command to start monitoring, it will filter the methods of the currently loaded classes according to the configured parameters and perform de-optimization; 2) Receive When the monitoring command is finished, the currently de-optimized class will be re-optimized to re-enter the native executor for execution; 3) When the class linker loads a new class, the class will be similar to the case 1) The screening and de-optimization process. In order to ensure the correctness of the program execution, the de-optimization process needs to be the same as part of the garbage collection algorithm, temporarily suspend the execution of all threads, and resume the thread execution after the de-optimization execution ends. Through this partial de-optimization and maintaining the coexistence of interpreted execution and native execution, the performance overhead of monitoring can be greatly reduced.
  • Behavior interpreter is an interpreter that interprets and executes the dex format bytecode. It can monitor the activities that occur in the current program execution during the interpretation and execution. Most of the activities in the mobile application behavior runtime model are generated by the behavior interpreter. In addition to the activities generated by the behavior interpreter, the garbage collector can also generate some activities-garbage collection activities. The behavior interpreter also implements the activity filtering mechanism in the two-level filtering mechanism, which can generate different types of activities according to the configured activity collection granularity.
  • Model builder The activities generated by the behavior interpreter and garbage collector will be built in the model builder. When there are more activities during runtime, it will lead to a larger memory footprint. Therefore, the model builder implements online and offline model construction. When there are few activities, when the model builder runs in the online model building mode, when the number of activities reaches the configured threshold, the model builder will persist the currently generated sequence of activities and store them in the form of files .
  • Interface layer encapsulates the functions provided by the optimization-de-optimizer, behavior interpreter, and model builder. At the same time, it also provides the interfaces required for deserialization activities, such as finding objects based on addresses and converting given objects to addresses.
  • two mobile application behavior runtime models can be generated: 1) a refined model that includes runtime data dependency; 2) a simplified model that does not include runtime data dependency.
  • Reflectall includes a set of behavior reflection interfaces that can monitor the activities of applications with different granularities and generate runtime models of different granularities of application behavior; a set of remote debugging connection interfaces that can control application activity monitoring Start and end.
  • the interface of the framework layer is encapsulated to realize an Android application, which can provide a remote debugging interface in the form of a Web service.
  • the analysis and code generation subsystem of Reflectall is a browser-server architecture.
  • the analysis and code generation subsystem implements:
  • Version management Use git to manage different versions of mobile applications and interoperability interfaces. At the same time, it supports server-side compilation, and uses the client's interface management application to push the compiled dex bytecode to the client.
  • Stack model visualization Provides a tree view and supports keyword-based data dependency analysis.
  • the interface operation subsystem of Reflectall adds a behavior reflection class loader to the framework layer of the Android open source project, as shown in FIG. 9, which is a structural diagram of the interface operation subsystem of Reflectall, an example of the present invention.
  • the application process When the application process starts, it will check whether there is a loadable behavior reflection interface bytecode file. If there is a behavior reflection interface bytecode file suitable for the current application, it is loaded into the application process through the behavior reflection class loader, and at the same time, the Binder communication mechanism is used to register the interoperability interface provided by the current application with the interface management application.
  • the interface management application provides services such as interface forwarding and status detection.
  • the caller process can realize interoperability with the specified application through the interface management application.
  • the present invention verifies the performance of the Reflectall model generation with an open source application set containing 69 open source Android applications and a closed source application set containing 39 closed source applications.
  • the construction cost of the mobile application behavior runtime model is positively related to the number of model activities-the more complex the application and the more activities, the greater the cost of generating the behavior runtime model.
  • open source applications are far less complex in implementation than closed source applications.
  • the median of the number of classes in the open source application set is 58 and the median of the number of methods is 246; the number of application classes in 75% of the open source application sets is no more than 167; the number of methods is no more than 859 .
  • the median number of application classes is 14,266, and the median number of methods is 87717, which is 245 times and 102 times the corresponding values in the open source application set.
  • the hardware configuration used in the experiment is as follows: 1) Use the Android smartphone Redmi 2A, its CPU is 1.5GHz, the memory is 1GB, and the Android operating system version is 5.1.1. 2) The experiment uses an ordinary PC as a remote control The terminal controls the mobile phone for experiments.
  • the PC's CPU is Intel Core i5 3427U (1.8GHz), the memory is 4GB, and it runs OSX 10.11.
  • the method of monitoring application execution flow also includes two methods of runtime meta-message binding and compile-time bytecode reconstruction.
  • Table 2 shows the granularity of activity monitoring supported by the three methods. Reflectall is more granular in activity monitoring than the method based on runtime binding: it supports activity monitoring up to the instruction level; at the same time, it has a wider range of adaptation than the method based on bytecode reconstruction: based on bytecode reconstruction.
  • the method needs to modify the compilation process of the original application, and it is difficult to directly use it on obfuscated and hardened applications.
  • the present invention compares the performance of Reflectall and the method based on runtime binding in monitoring program execution flow.
  • the performance of Reflectall and the method based on bytecode reconstruction in monitoring the activity of instruction granularity were compared.
  • Xposed framework is a framework service that can monitor and modify the running behavior of the program without modifying the APK (rovo89, 2012). Similar to Reflectall, the Xposed framework is also modified on the system layer of the Android operating system.
  • the Xposed framework implements the behavior reflection of the meta message model, that is, Xposed binds the corresponding meta object to the specified method according to the configuration when the application is running. In the subsequent execution, these methods bound to the meta-object will call the before and after methods in the meta-object before and after execution.
  • This article uses the Xposed framework to implement an Xposed module that monitors program execution similar to Reflectall. In this section, the application startup time is used as an indicator.
  • Reflectall and Xposed-based monitoring modules are deployed on two Redmi 2A phones with the same hardware configuration.
  • the Android operating systems are both 5.1.1.
  • Figure 10(a) shows the experimental results on the open source application set.
  • 69 applications can be started normally in the above 6 scenarios.
  • the solid lines in Figure 10(a) are the three scenarios where Reflectall is deployed; the dotted lines are the three scenarios where the Xposed framework is deployed.
  • the average startup time of the mobile phone deployed with Reflectall (scenario 1) is 392 milliseconds, while the platform startup time of the mobile phone deployed with the Xposed framework (scenario 3) is 449 milliseconds. This is because even if the implementation of the Xposed framework does not bind meta objects, there will be a certain overhead when the application is loaded.
  • Reflectall's optimization-de-optimizer realizes that all code is executed in the native executor when it is not monitored.
  • the average startup time of Reflectall is 486 milliseconds, which is 23% more than the unmonitored situation (392 milliseconds).
  • the Xposed framework-based method has an average startup time of up to 2078 milliseconds in Scenario 2.
  • the additional overhead is 368%.
  • the additional overhead of Reflectall is only 27%, while the method based on the Xposed framework is as high as 477%.
  • Figure 10(b) shows the experimental results on the closed source application set.
  • the average startup time of closed source applications is 936 milliseconds (Reflectall) and 1010 milliseconds (Xposed).
  • Reflectall has 3 applications that are unresponsive, and the average startup time of the remaining 36 applications is 1601 milliseconds, which is 71% more than the unmonitored scenario (936 milliseconds).
  • Scenario 4 22 applications are unresponsive, and the average startup time of the remaining 17 applications is 4593 milliseconds.
  • the additional overhead is 355%.
  • the additional overhead of Reflectall is 98%, while the method based on the Xposed framework is as high as 470%.
  • the performance of Reflectall’s generated behavior runtime model is quite different from that of the open source application set and the closed source application set. This is because the implementation of the closed source application set is more complex, so the generated model scale is also larger, which may cause multiple The process of garbage collection and activity persistence, which brings more performance overhead.
  • the Xposed-based method has similar costs in these two application sets. The reason is that when Xposed is used for monitoring, 22 applications are unresponsive, accounting for 57% of the entire closed-source application set.
  • One of the main reasons why Reflectall’s performance overhead is lower than the Xposed framework-based method is that the Xposed framework-based method uses Java, and the behavior interpreter programming language implemented by Reflectall is C++. When the execution process of the application is more complicated, the memory allocation and recovery based on the Xposed framework will be more frequent than Reflectall. The above experiments show that this implementation of Reflectall of the present invention can handle more complex applications.
  • bytecode reconstruction frameworks Many commonly used Java libraries use bytecode reconstruction frameworks.
  • a very important use scenario for bytecode reconstruction is program analysis.
  • the popular bug location tool FindBugs uses ASM at the bottom to analyze bytecode and locate vulnerabilities.
  • Another common usage scenario is to use bytecode reconstruction to generate code coverage reports for programs, such as Emma (Roubtsov, 2005) and JCover (JCover, 2017).
  • the condensed model generated by Reflectall can be converted into a code coverage report. This experiment will compare the difference between Reflectall and Emma in generating code coverage reports.
  • the method based on bytecode reconstruction is not suitable for closed source applications that only have application installation packages. Therefore, this part is only for open source application sets. This implementation still uses the application startup time as an indicator.
  • Reflectall and an unmodified Android system are deployed on two Redmi 2A phones with the same hardware configuration.
  • the Android operating systems are both 5.1.1.
  • the method of constructing a runtime model of terminal application behavior can regard the execution of the application as a programming language framework (such as an interpreter, virtual machine), and read the memory according to the code segment of the application.
  • a programming language framework such as an interpreter, virtual machine
  • Write operation What kind of method is executed can correspond to the operation of the programming language framework on the stack; what kind of object data is modified can correspond to the operation of the programming language framework on the heap. It realizes flexible and complete monitoring of terminal application application behavior, and provides technical guarantee for subsequent realization of instruction-level control of terminal application application behavior.
  • the computational reflection engine designed according to this method can be used as a stand-alone operating environment, and can also be integrated into various mainstream development platforms or commercial software to provide developers with the basic ability to monitor application runtimes.
  • the device embodiments described above are merely illustrative.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in One place, or it can be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement it without creative work.
  • the various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by their combination.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the computing processing device according to the embodiments of the present invention.
  • DSP digital signal processor
  • the present invention can also be implemented as a device or device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein.
  • Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals. Such signals can be downloaded from Internet websites, or provided on carrier signals, or provided in any other form.
  • FIG. 12 shows a computing processing device that can implement the method according to the present invention.
  • the computing processing device traditionally includes a processor 1010 and a computer program product in the form of a memory 1020 or a computer readable medium.
  • the memory 1020 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the memory 1020 has a storage space 1030 for executing the program code 1031 of any method step in the above method.
  • the storage space 1030 for program codes may include various program codes 1031 for implementing various steps in the above method. These program codes can be read out from or written into one or more computer program products.
  • These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks.
  • Such a computer program product is usually a portable or fixed storage unit as described with reference to FIG. 13.
  • the storage unit may have storage segments, storage spaces, etc., arranged similarly to the memory 1020 in the computing processing device of FIG. 12.
  • the program code can be compressed in an appropriate form, for example.
  • the storage unit includes computer-readable codes 1031', that is, codes that can be read by, for example, a processor such as 1010. These codes, when run by a computing processing device, cause the computing processing device to execute the method described above. The various steps.
  • any reference signs placed between parentheses should not be constructed as a limitation to the claims.
  • the word “comprising” does not exclude the presence of elements or steps not listed in the claims.
  • the word “a” or “an” preceding an element does not exclude the presence of multiple such elements.
  • the invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the unit claims enumerating several devices, several of these devices may be embodied by the same hardware item.
  • the use of the words first, second, and third, etc. do not indicate any order. These words can be interpreted as names.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

Disclosed in the present invention is a method for constructing a runtime model of a terminal application behavior. A complete, accurate and detailed application behavior README file, i.e., a runtime model of an application behavior of a terminal application, is generated by means of a behavior interpreter. The defects in monitoring the application behavior of the terminal application in a dynamic, changeable and difficult-to-control application runtime environment in the prior art are overcome; flexible and complete monitoring for the application behavior of the terminal application is achieved; a technical guarantee is provided for subsequent instruction-level control to the application behavior of the terminal application.

Description

一种构造终端应用行为的运行时模型的方法A method for constructing runtime model of terminal application behavior
本申请要求在2019年06月10日提交中国专利局、申请号为201910498727.X、发明名称为“一种构造终端应用行为的运行时模型的方法”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。This application claims the priority of a Chinese patent application filed with the Chinese Patent Office, the application number is 201910498727.X, and the invention title is "a method for constructing a runtime model of terminal application behavior" on June 10, 2019, and its entire contents Incorporated in this application by reference.
技术领域Technical field
本发明涉及计算机技术,尤其涉及一种构造终端应用行为的运行时模型的方法。The present invention relates to computer technology, in particular to a method for constructing a runtime model of terminal application behavior.
背景技术Background technique
计算反射(computational reflection,简称反射)是编程语言、操作系统、中间件等系统软件,提供的一种常见的运行态监测与控制应用的主要机制。基于计算反射可以实现各种开发框架、测试框架,以提高开发人员在代码开发、测试甚至运行部署中的效率。在计算机领域,反射性是指程序具有在运行时刻操纵一组数据的能力,这组数据描述了该程序的运行状态,操纵有两方面含意:1)监测(Introspection),程序可以观测并推理自身的状态;2)控制(Intercession),程序可以改变自身的运行或语义。而这两方面都需要能将程序执行的状态编码为数据,而提供这种编码便称为反射也就是说,反射其实是将程序的运行状态映射为一组可操作的数据。前一部分组成基层实体,后一部分组成元层实体,而基层实体与元层实体之间保持了因果关联。根据基层实体的不同,计算反射主要分为结构反射和行为反射。结构反射的基层实体为当前程序及其抽象数据类型(可视为应用的状态),而行为反射的基层实体则是当前程序的执行行为及其执行所需的数据(可视为应用的行为)。Computational reflection (computational reflection, referred to as reflection) is a common operating state monitoring and control application main mechanism provided by system software such as programming languages, operating systems, and middleware. Based on computational reflection, various development frameworks and testing frameworks can be implemented to improve the efficiency of developers in code development, testing and even running deployment. In the computer field, reflexivity refers to the ability of a program to manipulate a set of data at runtime. This set of data describes the running state of the program. Manipulation has two implications: 1) Introspection, the program can observe and reason about itself 2) Control (Intercession), the program can change its operation or semantics. Both of these aspects need to be able to encode the state of program execution as data, and providing this kind of encoding is called reflection. That is to say, reflection is actually mapping the running state of the program into a set of operable data. The former part constitutes the base-level entity, the latter part forms the meta-level entity, and the causal relationship between the base-level entity and the meta-level entity is maintained. According to the different base entities, the calculated reflection is mainly divided into structural reflection and behavior reflection. The basic entity of structural reflection is the current program and its abstract data type (can be regarded as the state of the application), and the basic entity of behavior reflection is the execution behavior of the current program and the data required for its execution (can be regarded as the behavior of the application) .
结构反射是指编程语言提供对当前程序及其抽象数据类型反射的能力,由于与编程语言框架(runtime或framework)的能力类似而自然存在,是大多数编程语言框架固有的能力。Structural reflection refers to the ability of a programming language to reflect the current program and its abstract data types. It naturally exists because it is similar to the ability of a programming language framework (runtime or framework), and is an inherent ability of most programming language frameworks.
行为反射是指编程语言提供对自身的执行语义及其执行所需的数据反射的能力,也就是编程语言框架自身需要被反射,行为反射在监测和控制上面临两个挑战:其一,需要完整描述既有的应用行为,也就是对应用的执行进行监测。应用的执行可视为一组运行时活动的集合,活动的粒度越细,监测的信息也越丰富,监测功能占用的资源就越大,其与业务逻辑之间的资源竞争也就越严重。此时,应用行为监测的复杂性和规模性就成为终端应用行为反射的首要挑战。其二,现有编程语言以及操作系统和中间件等系统软件的行为反射,均不支持指令级的行为控制,根本原因在于指令序列蕴含的复杂的数据和控制依赖,因此,应用行为的指令级控制就成为终端应用行为反射的主要难点。Behavior reflection refers to the ability of a programming language to provide its own execution semantics and the data reflection required for its execution, that is, the programming language framework itself needs to be reflected, and behavior reflection faces two challenges in monitoring and control: First, it needs to be complete Describe the existing application behavior, that is, monitor the execution of the application. The execution of an application can be regarded as a set of runtime activities. The finer the granularity of the activity, the richer the information to be monitored, the more resources the monitoring function occupies, and the more serious resource competition between it and business logic. At this time, the complexity and scale of application behavior monitoring have become the primary challenge for terminal application behavior reflection. Second, the existing programming language and the behavior reflection of system software such as operating system and middleware do not support instruction-level behavior control. The fundamental reason lies in the complex data and control dependencies contained in the instruction sequence. Therefore, the instruction-level application behavior Control has become the main difficulty of terminal application behavior reflection.
发明内容Summary of the invention
本发明主要目的在于,提供一种构造终端应用行为的运行时模型的方法,克服上述第一个挑战,实现对终端应用运行时行为的完整监测。The main purpose of the present invention is to provide a method for constructing a runtime model of terminal application behavior, which overcomes the above-mentioned first challenge and realizes complete monitoring of terminal application runtime behavior.
本发明是通过如下技术方案实现的:The present invention is realized through the following technical solutions:
为解决上述技术问题,本发明提出了一种构造终端应用行为的运行时模型的方法,所述运行时模型包括运行时栈模型和运行时堆模型,所述方法包括构造所述终端应用行为的运行时栈模型的步骤和构造所述终端应用行为的运行时堆模型的步骤;To solve the above technical problems, the present invention proposes a method for constructing a runtime model of terminal application behavior. The runtime model includes a runtime stack model and a runtime heap model. The method includes constructing a terminal application behavior. The steps of the runtime stack model and the steps of constructing the runtime heap model of the terminal application behavior;
所述构造所述终端应用行为的运行时栈模型的步骤包括:The step of constructing the runtime stack model of the terminal application behavior includes:
在所述终端应用运行时,获取所述终端应用的内存中真正执行的代码,并对所述真正执行的代码进行抽象,生成控制流图;When the terminal application is running, obtain the actually executed code in the memory of the terminal application, and abstract the actually executed code to generate a control flow graph;
针对所述控制流图,将需要监测的控制流图输入至预设的行为解释器;For the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter;
利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活 动;Use the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generate stack activities when the terminal application is running;
在所述终端应用运行时,生成所述栈活动的控制流间的依赖关系,得到所述终端应用行为的运行时栈模型;When the terminal application is running, generate the dependency between the control flows of the stack activity, and obtain the runtime stack model of the terminal application behavior;
所述构造所述终端应用行为的运行时堆模型的步骤包括:The step of constructing the runtime heap model of the terminal application behavior includes:
在所述终端应用运行时,生成堆区的初始状态;Generating the initial state of the heap area when the terminal application is running;
生成堆操作活动,得到所述终端应用行为的运行时堆模型。A heap operation activity is generated, and a runtime heap model of the terminal application behavior is obtained.
与现有技术相比,本发明通过行为解释器,生成一个完整、准确、详实的应用行为自述,即终端应用应用行为的运行时模型,克服了现有技术在动态、多变、难控的应用运行时环境对终端应用应用行为的监测上的不足,实现了对终端应用应用行为的灵活、完整的监测,为后续实现对终端应用应用行为的指令级控制提供了技术保障。Compared with the prior art, the present invention uses a behavior interpreter to generate a complete, accurate, and detailed application behavior self-report, that is, a runtime model of terminal application behavior, which overcomes the dynamic, changeable, and difficult-to-controllability of the prior art The application runtime environment's insufficient monitoring of terminal application application behavior enables flexible and complete monitoring of terminal application application behavior, and provides technical guarantee for subsequent implementation of command-level control of terminal application application behavior.
上述说明仅是本发明技术方案的概述,为了能够更清楚了解本发明的技术手段,而可依照说明书的内容予以实施,并且为了让本发明的上述和其它目的、特征和优点能够更明显易懂,以下特举本发明的具体实施方式。The above description is only an overview of the technical solution of the present invention. In order to understand the technical means of the present invention more clearly, it can be implemented in accordance with the content of the description, and in order to make the above and other objectives, features and advantages of the present invention more obvious and understandable. In the following, specific embodiments of the present invention are specifically cited.
附图说明Description of the drawings
为了更清楚地说明本发明实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the following will briefly introduce the drawings used in the description of the embodiments or the prior art. Obviously, the drawings in the following description These are some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative work.
图1是现有3G无线资源控制状态机;Figure 1 is an existing 3G radio resource control state machine;
图2(a)是一个网络请求合并的实例中合并前的网络请求控制流示意图;Figure 2(a) is a schematic diagram of the network request control flow before the merger in an example of network request merger;
图2(b)是一个网络请求合并的实例中合并后的网络请求控制流示意图;Figure 2(b) is a schematic diagram of the network request control flow after merging in an example of network request merging;
图3是一个线程之间通信依赖的例子——生产者-消费者模式示意图;Figure 3 is an example of communication dependencies between threads-a schematic diagram of the producer-consumer model;
图4是本发明一种构造终端应用行为的运行时模型的方法的步骤流程图;4 is a flowchart of the steps of a method for constructing a runtime model of terminal application behavior of the present invention;
图5是安卓多线程编程示例;Figure 5 is an example of Android multi-threaded programming;
图6是一个多线程编程间依赖示例;Figure 6 is an example of dependencies between multithreaded programming;
图7(a)是执行前堆区对象;Figure 7(a) is the heap area object before execution;
图7(b)是执行后堆区对象;Figure 7(b) is the heap area object after execution;
图8是本发明示例Reflectall模型生成子系统架构示意图;8 is a schematic diagram of the architecture of the Reflectall model generation subsystem of the present invention;
图9是本发明示例Reflectall的接口运行子系统的结构示意图;Fig. 9 is a schematic structural diagram of an interface operation subsystem of Reflectall, an example of the present invention;
图10(a)是开源应用集上的实验结果;Figure 10(a) is the experimental result on the open source application set;
图10(b)是闭源应用集上的实验结果;Figure 10(b) is the experimental result on the closed source application set;
图11是Reflectall与Emma生成代码覆盖率报告时的应用启动时间结果对比图。Figure 11 is a comparison chart of application startup time results when Reflectall and Emma generate code coverage reports.
图12示意性地示出了用于执行根据本发明的方法的计算处理设备的框图;以及FIG. 12 schematically shows a block diagram of a computing processing device for executing the method according to the present invention; and
图13示意性地示出了用于保持或者携带实现根据本发明的方法的程序代码的存储单元。Fig. 13 schematically shows a storage unit for holding or carrying program codes for implementing the method according to the present invention.
具体实施例Specific embodiment
为使本发明实施例的目的、技术方案和优点更加清楚,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。In order to make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely in conjunction with the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments It is a part of the embodiments of the present invention, not all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
为使本发明的目的、技术方案和优点更加清楚明白,下面结合实施例和附图,对本发明作进一步详细说明。In order to make the objectives, technical solutions, and advantages of the present invention clearer, the following further describes the present invention in detail with reference to embodiments and drawings.
为了更好理解本申请的技术问题,本发明采用选取两个典型案例的应用功能演化场景进行分析,以此来明确现有的行为反射不适用的根本原因。In order to better understand the technical problems of the present application, the present invention selects two typical cases of application function evolution scenarios to analyze, so as to clarify the fundamental reason why the existing behavior reflection is not applicable.
案例一:Case number one:
以移动设备中的3G网络模块为例,一共包含三个状态,如图1所示。Taking the 3G network module in the mobile device as an example, there are three states in total, as shown in Figure 1.
(1)IDLE:即空闲状态,在此状态下,3G模块的耗电最低,同时也不能发送、接收任何数据。在这个状态下,如果要发送或接收数据,则会转移至CELL_DCH状态。(1) IDLE: Idle state, in this state, the power consumption of the 3G module is the lowest, and it cannot send or receive any data. In this state, if you want to send or receive data, it will transition to the CELL_DCH state.
(2)CELL_DCH:在这个状态下,3G模块的带宽达到最大,此时能以最大的速率进行数据传输,同时它的功耗也是最大的。如果持续一段时间,仍然没有数据传输的话,它会转移至CELL_FACH状态。根据不同的运营商的设置,持续运行于CELL_DCH状态的时间通常是5秒至10秒。(2) CELL_DCH: In this state, the bandwidth of the 3G module reaches the maximum, and data transmission can be carried out at the maximum rate at this time, and its power consumption is also the maximum. If it continues for a period of time and there is still no data transmission, it will transition to the CELL_FACH state. Depending on the settings of different operators, the duration of continuous operation in the CELL_DCH state is usually 5 seconds to 10 seconds.
(3)CELL_FACH:在这个状态下,3G模块的耗电比CELL_DCH要节省50%,同时,在这个状态下,其网络传输速率也较低。如果在这个状态,发送或接收的数据大于某个阀值,则会重新转移至CELL_DCH状态。而如果在CELL_FACH状态持续一段时间没有发送或接收数据,则会转移至IDLE状态。一般来说这段时间通常是10秒至15秒。(3) CELL_FACH: In this state, the power consumption of the 3G module is 50% less than that of CELL_DCH. At the same time, in this state, its network transmission rate is also lower. If in this state, the data sent or received is greater than a certain threshold, it will re-transition to the CELL_DCH state. If no data is sent or received in the CELL_FACH state for a period of time, it will transition to the IDLE state. Generally speaking, this period of time is usually 10 seconds to 15 seconds.
图2展示了一个网络请求合并的实例。图2(a)为合并前的网络请求与无线通讯模块耗电,其横轴为时间,上半部分为无线通讯模块的功耗;下半部分的虚线为发起这两个网络请求的线程;下半部分的实线为其控制流。首先,一个后台的新闻推送线程唤醒了负责发送网络请求的线程①;该线程被唤醒后,发起了网络请求②,此时,无线通讯模块的功耗也由IDLE状态下的低功耗,变为CELL_DCH状态下的高功耗;当整个请求完成后,负责发送网络请求的线程将结果返回给新闻推送线程③,此时虽然无线通讯模块没有接收或发送数据,但它仍会保持在高功耗状态,由此开始的无线通讯模块耗电被称之为“尾时间耗电”,对应为图2(a)中用的斜线部分;新闻推送线程在收到返回结果后④,对结果进行处理,并在通知栏上进行提示⑤。又过了一段时间后,另一个版本更新线程也执行了类似的逻辑⑥,发送了网络请求。如图2(a)所示,由于这两个网络请求间隔了几十秒,导致无线通讯模块被唤醒了两次,因此也有了对应的两次“尾时间”,由此导致了额外的网络能耗。Figure 2 shows an example of a network request merge. Figure 2(a) is the network request and the power consumption of the wireless communication module before merging, the horizontal axis is time, the upper half is the power consumption of the wireless communication module; the dotted line in the lower half is the thread that initiated the two network requests; The solid line in the lower half is the control flow. First, a background news push thread wakes up the thread responsible for sending network requests①; after the thread is awakened, it initiates network requests②. At this time, the power consumption of the wireless communication module also changes from the low power consumption in the IDLE state. It is the high power consumption in the CELL_DCH state; when the entire request is completed, the thread responsible for sending the network request returns the result to the news push thread ③. At this time, although the wireless communication module does not receive or send data, it will still remain at high power The power consumption of the wireless communication module starting from this is called the "end time power consumption", which corresponds to the slash part used in Figure 2(a); the news push thread receives the returned result ④, Handle it and prompt ⑤ on the notification bar. After another period of time, another version update thread also executed similar logic ⑥ and sent a network request. As shown in Figure 2(a), because the two network requests are separated by tens of seconds, the wireless communication module is woken up twice, so there are two corresponding "tail times", which results in additional network Energy consumption.
对于安卓应用,有很大一部分后台请求可以被延迟几十秒、甚至两、三分钟,而不会影响用户的体验。例如上述的新闻推送、版本更新推送等。对于这些网络请求,如果在时间维度上进行合并,即两个请求同时发送,而不是间隔了几十秒发送,就可以减少“尾时间”的网络耗电。图2(b)为图2(a)中两个请求进行了合并后的控制流及其无线通讯模块耗电情况。首先,负责发送网络请求的线程被新闻推送线程唤醒后,并不直接发送网络请求,而是进入一个等待状态⑦。一段时间后,另一个网络请求线程被后台更新推送线程唤醒,同时,它也进入一个等待状态⑧。在等待状态结束后⑨,这两个线程同时发送网络请求,对应的无线通讯模块也只被唤醒一次。如图2(b)所示,合并后的网络请求的耗电要远小于合并前的网络请求耗电。For Android applications, a large part of background requests can be delayed for tens of seconds, or even two or three minutes, without affecting the user experience. For example, the aforementioned news push, version update push, etc. For these network requests, if they are merged in the time dimension, that is, two requests are sent at the same time, instead of sending tens of seconds apart, the network power consumption of the "tail time" can be reduced. Figure 2(b) shows the control flow and the power consumption of the wireless communication module after the two requests in Figure 2(a) are combined. First, after the thread responsible for sending the network request is awakened by the news push thread, it does not directly send the network request, but enters a waiting state⑦. After a period of time, another network request thread is awakened by the background update push thread, and at the same time, it also enters a waiting state⑧. After the waiting state is over ⑨, these two threads send network requests at the same time, and the corresponding wireless communication module is only awakened once. As shown in Figure 2(b), the power consumption of the merged network request is much less than the power consumption of the network request before the merge.
为了实现网络请求合并,需要1)一种网络请求调度机制,即,使原本直接发送的网络请求可被延迟发送;2)一种网络请求调度算法,即找出可被延迟调度的请求,同时利用调度机制进行延迟发送。利用结构反射可以实现自动化重构移动应用的网络请求执行逻辑,并把调度机制内置进应用。然而这就要求不同应用的开发者均使用同一个自动重构框架,并且需要对所有应用进行重新的编译、部署和运行。这对于大量的、分属于不同应用开发者的闭源应用显然不现实。In order to achieve network request merging, 1) a network request scheduling mechanism, that is, the network request that was originally sent directly can be delayed; 2) a network request scheduling algorithm, that is, to find out the request that can be delayed, and at the same time Use the scheduling mechanism for delayed transmission. Structural reflection can be used to automatically reconstruct the network request execution logic of mobile applications, and the scheduling mechanism can be built into the application. However, this requires developers of different applications to use the same automatic refactoring framework, and all applications need to be recompiled, deployed and run. This is obviously unrealistic for a large number of closed-source applications that belong to different application developers.
案例二:Case 2:
随着微信的普及,微信已经不仅仅是一个简单的通讯应用,它还成为了工作交流的必备工具;催生了利用微信朋友圈、公众号进行营销的“微商”;成为了最大的自媒体的发布平台。微信其核心作为一个通讯工具,其功能还是以满足普通用户为主。即便如此,它也很难满足普通用户的特定需求。例如,随着微信使用的时间越来越长,其缓存的聊天记录文件也越来越大,对于普通用户而言,很难对自己的聊天日志进行管理。更进一步地,微信难以满足微商、自媒体人等特殊群体的特定的需求。要实现微信应用内数据与功能的开放共享,就需要将面向用户的用户界面接口转换为面向互操作的可编程接口。一般而言,对于面向用户的用户界面接口,其执行的起点在于用户界面元素的点击、拖动和输入等操作。经过部分的逻辑处理,以网络请求、数据库查询、文件读写的方式,访问外部资源,获取相应的数据或实现相应的功能。在该处理流程中,有大部分的逻辑是与面向互操作的可编程接口的 执行逻辑是相似的,只是其执行的起点不同。然而,现有的行为反射监测和控制的粒度为方法级别。基于现有的行为反射,在既有应用的执行流程中插入某些执行逻辑的方式,难以实现将面向用户的用户界面接口转换为面向互操作的可编程接口:既有的功能可对应于运行时的一组程序活动,以方法为粒度的行为反射其监测的内容有限,无法监测方法内的指令的执行,继而也无法控制。这就导致了现有的解决方案往往是基于既有的代码和文档,对于一个开发团队而言,开发人员的流动、文档的缺失、甚至是不规范的源码注释都会使得移动应用的迭代开发变得难以进行。With the popularity of WeChat, WeChat has become more than just a simple communication application, it has also become an essential tool for work communication; it has spawned "WeChat business" that uses WeChat Moments and Official Accounts for marketing; it has become the largest free Media’s publishing platform. The core of WeChat is a communication tool, and its function is still to satisfy ordinary users. Even so, it is difficult to meet the specific needs of ordinary users. For example, as WeChat has been used for longer and longer, its cached chat log files have also become larger and larger, and it is difficult for ordinary users to manage their chat logs. Furthermore, WeChat is difficult to meet the specific needs of special groups such as WeChat businessmen and self-media people. To realize the open sharing of data and functions in the WeChat application, it is necessary to transform the user-oriented user interface interface into an interoperable programmable interface. Generally speaking, for a user-oriented user interface interface, the starting point of its execution lies in operations such as clicking, dragging, and inputting user interface elements. After part of the logic processing, access to external resources by means of network requests, database queries, and file reading and writing, to obtain corresponding data or realize corresponding functions. In this processing flow, most of the logic is similar to the execution logic of the interoperable programmable interface, but the starting point of its execution is different. However, the granularity of existing behavioral reflection monitoring and control is method level. Based on the existing behavior reflection, inserting some execution logic into the execution process of existing applications, it is difficult to convert user-oriented user interface interfaces into interoperability-oriented programmable interfaces: existing functions can correspond to running A group of program activities at a time, the method-granular behavior reflects that the content of monitoring is limited, and the execution of instructions in the method cannot be monitored, and then it cannot be controlled. This has led to the fact that existing solutions are often based on existing code and documentation. For a development team, the flow of developers, the lack of documentation, and even non-standard source code comments will change the iterative development of mobile applications. Too difficult to proceed.
由上述两个案例分析可以看出,难以实现移动应用互操作接口根本原因在于现有的工作由于缺乏对应用行为的一个完整、详实的描述,同时也没有对这种指令粒度的自描述的控制的方法。因而,能否给出一个完整描述应用行为、并且可操作的运行时模型也就成了解决本发明问题的难点和关键。It can be seen from the above two case studies that the fundamental reason for the difficulty in implementing the mobile application interoperability interface is that the existing work lacks a complete and detailed description of the application behavior, and there is no self-descriptive control of this instruction granularity. Methods. Therefore, whether a runtime model that completely describes the application behavior and is operable can be given has become the difficulty and key to solving the problem of the present invention.
针对上述技术问题,本发明实施例提出了一种构造终端应用行为的运行时模型的方法,所述运行时模型包括运行时栈模型和运行时堆模型。In view of the foregoing technical problems, the embodiment of the present invention proposes a method for constructing a runtime model of terminal application behavior. The runtime model includes a runtime stack model and a runtime heap model.
应用在操作系统中运行起来之后,可视为一个或多个进程,操作系统将移动应用所需的可执行文件加载到内存中,并开始执行。一般而言,一个进程所占的内存可分为三个区域:After the application runs in the operating system, it can be regarded as one or more processes. The operating system loads the executable file required by the mobile application into the memory and starts execution. Generally speaking, the memory occupied by a process can be divided into three areas:
代码段:存放执行代码的一块内存区域,具有只读属性;Code segment: a memory area where the execution code is stored, with read-only attribute;
堆区:可分为用于存放全局变量的一块内存区域(数据段),和用于进程运行中动态分配的内存区域,例如,面向对象编程语言Java中,线程创建一个新的对象相当于在堆区中申请了一片内存;Heap area: It can be divided into a memory area (data segment) used to store global variables and a memory area used for dynamic allocation during process operation. For example, in the object-oriented programming language Java, a thread creating a new object is equivalent to A piece of memory is applied for in the heap area;
栈区:用于临时存局部变量等。例如,面向对象编程语言Java中,线程调用一个方法时,会新申请一个帧(frame),帧中保存了方法所需的参数等数据。Stack area: used to temporarily store local variables, etc. For example, in the object-oriented programming language Java, when a thread calls a method, it will apply for a new frame (frame), and data such as parameters required by the method are stored in the frame.
经过发明人仔细研究发现,终端应用在运行时,代码段的执行会引起堆区和栈区内存数据的变化。应用的运行时模型需要能反映应用在一段时间内的:1)代码的执行情况:在开发时,移动应用的代码可抽象为控制流图,那么对应于运行时,代码的执行情况可抽象为控制流图的一条或多条路径;2)内存数据(如堆区)的变化:在开发时,开发人员会设计好各种数据结构以表示应用的数据模型(Data Model),而在运行时,代码的执行引起对这些数据结构的实例的创建、修改、删除,也就是对应于一组内存的分配和修改操作。从内存区域角度上看,程序执行所影响的最主要的区域是内存的栈区和堆区。1)中的控制流图中的路径可视为对栈变化的一个描述,而2)中主要体现的是堆区数据的变化。After careful study by the inventor, it is found that the execution of the code segment will cause changes in the memory data in the heap area and the stack area when the terminal application is running. The runtime model of the application needs to be able to reflect the application over a period of time: 1) Code execution: During development, the code of the mobile application can be abstracted as a control flow graph, so corresponding to the runtime, the execution of the code can be abstracted as One or more paths of the control flow graph; 2) Changes in memory data (such as heap area): During development, developers will design various data structures to represent the data model of the application, and at runtime , The execution of the code causes the creation, modification, and deletion of instances of these data structures, which corresponds to a set of memory allocation and modification operations. From the perspective of memory area, the most important areas affected by program execution are the stack area and heap area of the memory. The path in the control flow diagram in 1) can be regarded as a description of the stack change, and the main reflection in 2) is the change of the heap area data.
因此,本发明所构造的应用运行时模型包括一个描述栈变化的运行时栈模型和一个描述堆变化的运行时堆模型。其中运行时栈模型还包括对代码的获取,以此将一个进程所占的内存完整的分为三个区域。通过本发明实施例的运行时栈模型,可以了解在任意时刻移动应用的代码执行情况;而通过运行时堆模型,可以了解在任意时刻代码执行所依赖的对象数据状态。Therefore, the application runtime model constructed by the present invention includes a runtime stack model describing stack changes and a runtime heap model describing heap changes. Among them, the runtime stack model also includes the acquisition of code, which completely divides the memory occupied by a process into three areas. Through the runtime stack model of the embodiment of the present invention, it is possible to understand the code execution status of the mobile application at any time; and through the runtime heap model, it is possible to understand the object data state on which the code execution depends at any time.
运行时栈模型Runtime stack model
控制流图为一个有向图G=<B,P>;The control flow graph is a directed graph G=<B, P>;
其中,B={b 1,b 2,…,b n}为基本块; Among them, B={b 1 , b 2 ,..., b n } is a basic block;
Figure PCTCN2019119272-appb-000001
为控制流路径;
Figure PCTCN2019119272-appb-000001
Is the control flow path;
对于任意p i=(b i1,b i2),p i∈P,当且仅当b i2可能b i1之后执行。而在运行时,控制流图会实例化为一个或多个控制流,并按照控制流图中的路径执行基本块。本发明称在某一时刻执行的基本块为活动,则一段时间内的运行时栈模型是由控制流图,一个或多个控制流,和一组活动序列组成。当基本块的粒度为指令粒度时,活动序列就是指令执行序列。下面给出本发明所述的运行时栈模型的形式化定义。 For any p i = (b i1 , b i2 ), p i ∈ P, it is executed if and only if b i2 is possible after b i1 . At runtime, the control flow graph is instantiated into one or more control flows, and basic blocks are executed according to the path in the control flow graph. The present invention calls the basic block executed at a certain moment as an activity, and the runtime stack model for a period of time is composed of a control flow graph, one or more control flows, and a set of activity sequences. When the granularity of the basic block is instruction granularity, the activity sequence is the instruction execution sequence. The formal definition of the runtime stack model of the present invention is given below.
定义运行时栈模型为一个或多个控制流在一段时间内发生的活动的集合M=<G,T,A,I,E>,Define the runtime stack model as a collection of activities that occur in one or more control flows within a period of time M = <G, T, A, I, E>,
其中,G=<B,P>为控制流图,T={t 1,t 2,…,t n}为一组时刻,I={i 1,i 2,…,i n},表示t 1至t n时刻的程序的堆区状态。 Among them, G=<B, P> is the control flow graph, T={t 1 , t 2 ,..., t n } is a set of moments, I={i 1 , i 2 ,..., i n }, which means t The heap state of the program from 1 to t n .
令F={f 1,f 2,…,f n},为一组控制流,则A=F×1×T×B为一段时间内发生的活动的集合,
Figure PCTCN2019119272-appb-000002
表示两个活动发生的前后关系的集合。
Let F = {f 1 , f 2 ,..., f n }, be a set of control flow, then A = F × 1 × T × B is the set of activities that occur within a period of time,
Figure PCTCN2019119272-appb-000002
Represents the collection of the context of the occurrence of two activities.
运行时栈模型可视为控制流图的多条路径的集合,因此,在运行时栈模型中的边必须在控制
Figure PCTCN2019119272-appb-000003
发生的前后关系,对于在同一控制流中的两个活动,具有时间上的前后顺序关系;对于在不同控制流中的两个活动如果存在边,则表明这两个活动间还具有依赖关系。
The runtime stack model can be regarded as a collection of multiple paths in the control flow graph. Therefore, the edges in the runtime stack model must be in the control
Figure PCTCN2019119272-appb-000003
The occurring context has a temporal sequence relationship for two activities in the same control flow; if there are edges for two activities in different control flows, it indicates that the two activities still have a dependency relationship.
在同一个控制流中,若有两个活动具有前后发生关系,则对于任意的其他活动,都不可能在
Figure PCTCN2019119272-appb-000004
In the same control flow, if two activities have a contextual relationship, it is impossible for any other activities
Figure PCTCN2019119272-appb-000004
在不同控制流中,若两个活动具有前后发生关系,则对于后一个活动所在的控制流,在发生前一活动的时刻之后,可以先发生其他活动。
Figure PCTCN2019119272-appb-000005
其中a i={f i1,t i2,b i3),a j=(f j1,t j2,b j3),如果f i1≠f j1,则t i2<t j2
In different control flows, if two activities have a contextual relationship, for the control flow where the latter activity is located, other activities can occur first after the moment when the previous activity occurs.
Figure PCTCN2019119272-appb-000005
Where a i ={f i1 , t i2 , b i3 ), a j = (f j1 , t j2 , b j3 ), if f i1 ≠ f j1 , then t i2 <t j2 .
定义程序活动a j同步依赖于程序活动a i,如果a j的开始或结束由a i的执行决定,一般地,而a i往往是一些线程同步操作。称a j通信依赖于a i,如果a j的某项数据依赖是由a i活动产生。以面向对象编程语言Java为例,基本块的粒度为源代码的基本块的粒度。运行栈模型的每一个控制流对应于一个Java线程的执行序列。线程的状态转移共有六种状态: Activity definition program a j depends on the synchronization program activities a i, a j if the start or end of a i by the implementation of the decisions, in general, a i and often some thread synchronization operation. It is said that a j communication depends on a i , if a certain data dependency of a j is generated by a i activity. Taking the object-oriented programming language Java as an example, the granularity of the basic block is the granularity of the basic block of the source code. Each control flow of the runtime stack model corresponds to the execution sequence of a Java thread. There are six states of thread state transition:
创建:线程对象刚创建,还未开始时处于该状态;Creation: The thread object has just been created and is in this state when it has not yet started;
运行:线程处于正在运行的状态,在该状态的线程可能等待一些系统资源,如CPU;Running: The thread is in a running state, and the thread in this state may wait for some system resources, such as CPU;
阻塞:线程正在等待某个监控锁(Monitor Lock),例如线程在进入synchronized关键字修改的方法或是代码块时,线程会进入阻塞状态;Blocking: The thread is waiting for a Monitor Lock. For example, when the thread enters the method or code block modified by the synchronized keyword, the thread will enter the blocking state;
等待/定时等待:线程正在等待,例如,当线程调用某个对象的wait方法进入等待状态。当该对象的notify方法被该用时,线程会重新进入运行状态;Waiting/Timed Waiting: The thread is waiting, for example, when the thread calls the wait method of an object to enter the waiting state. When the notify method of the object is used, the thread will re-enter the running state;
死亡:当一个线程的run方法执行结束后,会进入死亡状态。Death: When the run method of a thread ends, it will enter the dead state.
由以上的状态转换可以发现,在某些情况下,某一个处于运行状态的线程可以唤醒另一个处于非运行状态的线程进入运行状态。本发明称线程之间的这种关系为同步依赖关系。在这些线程间唤醒发生时,对应于运行时栈模型中,处于运行状态的线程发生的活动会与非运行状态线程进入运行状态时发生的活动存在一条跨线程(跨控制流)的边。从Java的语言层面,可将这些线程依赖归纳为四类,如表1所示。From the above state transition, it can be found that in some cases, a thread in a running state can wake up another thread in a non-running state to enter the running state. The present invention calls this relationship between threads a synchronization dependency relationship. When these inter-thread wakeups occur, corresponding to the runtime stack model, there is a cross-thread (cross-control flow) edge between the activities of the threads in the running state and the activities that occur when the non-running threads enter the running state. From the Java language level, these thread dependencies can be summarized into four categories, as shown in Table 1.
表1:Java语言层面的同步依赖关系分类Table 1: Classification of synchronization dependencies at the Java language level
Figure PCTCN2019119272-appb-000006
Figure PCTCN2019119272-appb-000006
在表1中,每个运行线程的活动都会与非运行线程的活动对应。因此,称表1中的线程间依赖关系为同步依赖关系。基于以上线程的状态转换,Java提供了多种多线程编程库,以支持,例如java.util.concurrent中提供了读写锁、可重入锁、阻塞锁和线程池等。In Table 1, the activity of each running thread corresponds to the activity of the non-running thread. Therefore, the inter-thread dependency in Table 1 is called the synchronization dependency. Based on the state transition of the above threads, Java provides a variety of multi-threaded programming libraries to support. For example, java.util.concurrent provides read-write locks, reentrant locks, blocking locks, and thread pools.
图3展示了一个线程之间通信依赖的例子——生产者-消费者模式。在该例子中,Task类表示计算任务;静态字段tasks表示待处理任务队列;postTask方法表示生成并提交任务;handleTask方法表示处理任务。如图3所示,存在两个线程:1)线程1表示生产者线程,会提交任务至待处理任务队列;2)线程2表示消费者线程,每隔一段时间便会查看待处理任务队列,并处理相应的任务。在这个例子中,生产者线程与消费者线程之间并无同步依赖关系——消费者线程每隔一段时间便自动 由定时等待状态转为运行状态,但是却存在通信的依赖关系——如果生产者线程不提交任务,消费者线程的task.run方法便不会被调用。Figure 3 shows an example of communication dependencies between threads-the producer-consumer model. In this example, the Task class represents computing tasks; the static field tasks represents a queue of pending tasks; the postTask method represents generating and submitting tasks; and the handleTask method represents processing tasks. As shown in Figure 3, there are two threads: 1) Thread 1 represents the producer thread, which will submit tasks to the pending task queue; 2) Thread 2 represents the consumer thread, and will check the pending task queue at regular intervals. And handle the corresponding tasks. In this example, there is no synchronization dependency between the producer thread and the consumer thread-the consumer thread will automatically switch from the timed waiting state to the running state every certain period of time, but there is a communication dependency-if the production If the user thread does not submit the task, the task.run method of the consumer thread will not be called.
由上述例子可以发现,运行时栈模型中的活动关系的生成必须依赖运行时的相应数据。在经典的数据流分析中,数据流分析算法会根据控制流图的结构计算数据流方程,并迭代至稳定点。因此,应用的运行时模型除了上述的运行时栈模型外,还需一个描述内存堆区数据状态变化的运行时堆模型。From the above example, it can be found that the generation of the activity relationship in the runtime stack model must depend on the corresponding data at runtime. In classic data flow analysis, the data flow analysis algorithm calculates the data flow equation based on the structure of the control flow graph and iterates to a stable point. Therefore, in addition to the above-mentioned runtime stack model, the runtime model of the application also needs a runtime heap model that describes changes in the data state of the memory heap area.
运行时堆模型Runtime heap model
经典的数据流图常用于需求分析阶段。软件利用数据流图由抽象到具体、逐层分解待开发的软件系统。数据流图为一个有向图,包含了两种不同类型的边和多种不同的节点用以描述数据从一个初始节点出发,进行层层计算,最后得到最终结果。在运行时,数据流图的某个节点本质上对应了一组内存数据的变化。因此,本发明的应用的行为运行时模型的堆模型重点关注内存数据的变化,而不是变化的操作。本发明所述的运行时的堆模型仅从内存数据变化角度对应用运行时的内存的堆区进行建模。The classic data flow diagram is often used in the requirements analysis phase. The software uses the data flow diagram to decompose the software system to be developed layer by layer from abstract to concrete. The data flow graph is a directed graph, which contains two different types of edges and a variety of different nodes to describe data starting from an initial node, performing layer-by-layer calculations, and finally getting the final result. At runtime, a certain node of the data flow graph essentially corresponds to a set of changes in memory data. Therefore, the heap model of the application behavior runtime model of the present invention focuses on changes in memory data, rather than changing operations. The runtime heap model of the present invention only models the heap area of the memory during application runtime from the perspective of memory data changes.
运行时堆模型是由一组内存数据的初始值和在一段时间内发生堆的内存修改活动的集合M=<D,A,T,R>;The runtime heap model is a collection of the initial value of a set of memory data and the memory modification activities of the heap during a period of time M = <D, A, T, R>;
其中,D={d 1,d 2,…,d n},为一组内存地址的初始值,A={i 1,i 2,…,i n},为引起内存数据变化的活动,T={t 1,t 2,…,t n},为时间戳。 Among them, D = {d 1 , d 2 ,..., d n }, is the initial value of a set of memory addresses, A = {i 1 , i 2 ,..., i n }, is the activity that causes memory data changes, T = {T 1 , t 2 ,..., t n }, which is the time stamp.
对于不同的面向对象编程语言,它们会提供不同的应用编程接口以实现内存的动态分配和回收。例如C/C++语言,通过在标准库函数中提供malloc和free函数实现内存的分配和回收;而Java语言,通过new关键字可创建新的对象,实现内存的分配,通过自动的垃圾回收机制,实现对内存的回收。For different object-oriented programming languages, they will provide different application programming interfaces to achieve dynamic memory allocation and recovery. For example, in the C/C++ language, memory allocation and recovery are realized by providing malloc and free functions in the standard library functions; while in the Java language, new objects can be created through the new keyword to achieve memory allocation. Through an automatic garbage collection mechanism, Realize the recovery of memory.
针对本发明的技术问题,接下来对如何构造本发明实施例的上述运行时模型进行详细阐述。In view of the technical problems of the present invention, how to construct the above-mentioned runtime model of the embodiment of the present invention will be described in detail below.
参照图4,示出了本发明一种构造终端应用行为的运行时模型的方法的步骤流程图,所述方法包括构造所述终端应用行为的运行时栈模型的步骤和构造所述终端应用行为的运行时堆模型的步骤。4, there is shown a step flow chart of a method of constructing a runtime model of terminal application behavior of the present invention, the method includes the steps of constructing a runtime stack model of the terminal application behavior and constructing the terminal application behavior The steps of the runtime heap model.
所述构造所述终端应用行为的运行时栈模型的步骤具体可以包括:The step of constructing the runtime stack model of the terminal application behavior may specifically include:
步骤S401:在所述终端应用运行时,获取所述终端应用的内存中真正执行的代码,并对所述真正执行的代码进行抽象,生成控制流图;Step S401: When the terminal application is running, obtain the actually executed code in the memory of the terminal application, and abstract the actually executed code to generate a control flow graph;
步骤S402:针对所述控制流图,将需要监测的控制流图输入至预设的行为解释器;Step S402: Regarding the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter;
步骤S403:利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动;Step S403: Use the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generate stack activities when the terminal application is running;
步骤S404:在所述终端应用运行时,生成所述栈活动的控制流间的依赖关系,得到所述终端应用行为的运行时栈模型;Step S404: When the terminal application is running, the dependency between the control flows of the stack activity is generated to obtain the runtime stack model of the terminal application behavior;
在本发明实施例中,构建运行时栈模型包含以下三个基本要素:1)控制流图:包含了所有可能的活动和所有可能的活动关系,是程序的源代码,或是中间代码的抽象表示;2)运行时发生的一组活动,即控制流图中的一条路径,可视为栈模型中的一组节点;3)运行时发生的活动之间的关系,即栈模型的边。In the embodiment of the present invention, the construction of the runtime stack model includes the following three basic elements: 1) Control flow graph: contains all possible activities and all possible activity relationships, is the source code of the program, or the abstraction of the intermediate code Representation; 2) A group of activities that occur at runtime, that is, a path in the control flow graph, can be regarded as a group of nodes in the stack model; 3) The relationship between activities that occur at runtime, that is, the edges of the stack model.
栈模型的构造需要着力解决三方面的挑战:第一,由于编译优化、运行时即时编译等技术,应用的源代码和编译生成的字节码可能与运行时内存中的代码段不同,如何保障控制流图与运行时的活动能正确映射;第二,如何生成不同粒度的活动以描述复杂的应用运行状态的改变;第三,由于现在的应用大量使用多线程编译以保证界面的响应速度,提升用户体验,如何生成运行时控制流之间的依赖关系。针对以上三个挑战,本发明实施例的第一步是通过在运行时获取内存中真正执行的代码,对当前执行的代码进行抽象,可以保证控制流图与运行时活动的准确映射。第二步提出一种行为解释器,该行为解释器以上步生成的控制流图作为输入,对其进行解释执行。第三步,解释执行过程中生成应用运行时的活动;而模型生成的最后一步,是在运行时生成控制流之间的依赖关系。The construction of the stack model needs to focus on solving three challenges: First, due to technologies such as compilation optimization and runtime just-in-time compilation, the source code of the application and the bytecode generated by the compilation may be different from the code segment in the runtime memory. How to ensure The control flow graph and runtime activities can be mapped correctly; second, how to generate activities with different granularities to describe the changes of complex application running state; third, because current applications use a lot of multi-threaded compilation to ensure the response speed of the interface, To improve user experience, how to generate dependencies between runtime control flows. In view of the above three challenges, the first step of the embodiment of the present invention is to obtain the actually executed code in the memory at runtime and abstract the currently executed code, which can ensure the accurate mapping of the control flow graph and the runtime activities. In the second step, a behavior interpreter is proposed. The behavior interpreter takes the control flow graph generated in the previous step as input to interpret and execute it. The third step is to explain the runtime activities of the generated application during the execution process; and the last step of model generation is to generate the dependency between the control flow at runtime.
下面,对运行时栈模型的生成步骤作进一步概述。Below, a further overview is given to the steps of generating the runtime stack model.
一、控制流图生成。在应用的开发过程中,由于安装包发布会包含应用编译过的中间代码,因此,出于保护的目的,应用通过会使用各种混淆工具对生成的中间代码,例如安卓下的dex字节码,进 行混淆。这会导致直接提供的源代码难以与应用运行时执行的活动进行映射。这些混淆过的代码会通过应用运行时环境进行加载执行。例如,在安卓应用的Dex字节码会在Android Runtime(ART)中执行。本发明采用修改应用运行时的方式获取应用的字节码。这种方式可以带来两个好处,一是无需提供与匹配的中间代码或源代码,提高了方法的实用性;二是在应用运行时生成的中间代码可保证与执行的活动的一致性,从而保障了控制流图与运行时的控制流的匹配。1. Control flow graph generation. During the application development process, since the installation package release will contain the intermediate code compiled by the application, for protection purposes, the application will use various obfuscation tools to generate the intermediate code, such as the dex bytecode under Android , To be confused. This makes it difficult to map the directly provided source code with the activities performed during application runtime. These obfuscated codes will be loaded and executed through the application runtime environment. For example, the Dex bytecode in an Android application will be executed in Android Runtime (ART). In the present invention, the bytecode of the application is obtained by modifying the running time of the application. This approach can bring two benefits. One is that there is no need to provide matching intermediate code or source code, which improves the practicability of the method; the other is that the intermediate code generated when the application is running can ensure consistency with the executed activities. This ensures the matching of the control flow graph with the control flow at runtime.
具体的,控制流图的生成:Specifically, control flow graph generation:
根据指令类别得出基本块的边界,一条指令是基本块的开始当且仅当:1)它是一个方法的第一条指令或2)有某一条指令可能跳转至当前指令。而一条指令是基本块的结束,当且仅当:1)它是方法的返回,如return,throw指令;或者2)它是一条跳转指令,如if,goto,或者该指令可能抛出异常。在界定好基本块的起始和结束后,本发明的控制流图生成算法分为以下三个步骤:According to the instruction type, the boundary of the basic block is obtained. An instruction is the beginning of a basic block if and only if: 1) it is the first instruction of a method or 2) a certain instruction may jump to the current instruction. An instruction is the end of a basic block, if and only if: 1) it is the return of a method, such as return, throw instructions; or 2) it is a jump instruction, such as if, goto, or the instruction may throw an exception . After defining the start and end of the basic block, the control flow graph generation algorithm of the present invention is divided into the following three steps:
计算所有跳转指令(包括显式跳转和异常跳转)的目标地址,将该地址的指令标记为可作为基本块开始的指令。Calculate the target address of all jump instructions (including explicit jump and exception jump), and mark the instruction at this address as an instruction that can be used as the start of a basic block.
初始化基本块队列为空,并由低至高遍历每一条指令,若该指令是基本块的开始,或是当前基本块为空,则新建一个基本块,将其作为当前基本块,并放到基本块队列末尾;若该指令是基本块的结束,则将其放到当前基本块内,并新建一个基本块作为当前基本块,并放到基本块队列末尾。Initialize the basic block queue to be empty, and traverse each instruction from low to high. If the instruction is the beginning of a basic block or the current basic block is empty, create a new basic block, use it as the current basic block, and place it in the basic block The end of the block queue; if the instruction is the end of the basic block, put it in the current basic block, create a new basic block as the current basic block, and put it at the end of the basic block queue.
遍历整个基本块队列,建立基本块的前趋和后继关系:一个基本块的最后一条指令如果是可跳转的指令,则在该基本块和跳转的目标基本块添加一条有向边;如果一个基本块不是return或goto指令,则与队列中的下一个基本块添加一条有向边。Traverse the entire basic block queue to establish the predecessor and successor relationship of the basic block: if the last instruction of a basic block is a jumpable instruction, add a directed edge to the basic block and the target basic block of the jump; if If a basic block is not a return or goto instruction, it adds a directed edge to the next basic block in the queue.
二、按需重分配控制流图的执行,以及由行为解释器生成应用运行时的活动。在应用运行时,每一个线程会对应一条控制流,每一条控制流可视为一组有序的活动。这组活动可视为上步生成的控制流图的一条路径。因此,本发明提出一个适用于监测程序执行的行为解释器。根据配置,将需要监测的控制流图分配至行为解释器中执行。如果将每一条指令的执行都对应于一个活动,会导致这组指令序列变得规模极大而难以处理:1、数值计算语句难以与语义对应;2、程序循环产生的大量活动会湮没真正的处理逻辑。因此,本发明将活动分为数值计算、分支控制、方法调用等,并在行为解释器中实现了提供多种粒度的活动筛选的活动筛选器,以便生成合适的栈模型。Second, the execution of the control flow graph is redistributed on demand, and the activities of the application runtime generated by the behavior interpreter. When the application is running, each thread corresponds to a control flow, and each control flow can be regarded as a set of ordered activities. This group of activities can be regarded as a path of the control flow graph generated in the previous step. Therefore, the present invention proposes a behavior interpreter suitable for monitoring program execution. According to the configuration, assign the control flow graph that needs to be monitored to the behavior interpreter for execution. If the execution of each instruction corresponds to an activity, this group of instruction sequences will become extremely large and difficult to process: 1. Numerical calculation statements are difficult to correspond to semantics; 2. A large number of activities generated by program loops will obliterate the real Processing logic. Therefore, the present invention divides activities into numerical calculation, branch control, method invocation, etc., and implements an activity filter that provides multiple granular activity screening in the behavior interpreter, so as to generate a suitable stack model.
在本发明实施例的构造方法中,包括类筛选器和活动类型筛选器;其中,所述类筛选器基于包和类名正则匹配的粗粒度筛选,用于去除开发人员不关心的程序活动;所述活动类型筛选器基于活动类型的细粒度筛选,用于去除与开发者不关心的活动类型。In the construction method of the embodiment of the present invention, a class filter and an activity type filter are included; wherein the class filter is based on a coarse-grained filter based on regular matching of package and class names, and is used to remove program activities that developers do not care about; The activity type filter is based on fine-grained filtering of activity types, and is used to remove activity types that are not of concern to developers.
其中,所述栈活动的活动类型包括方法开始与方法结束,字段读,数组读和同步指令;基于上述活动类型,步骤S403的实现方法包括:Wherein, the activity type of the stack activity includes method start and method end, field read, array read, and synchronization instructions; based on the above activity type, the implementation method of step S403 includes:
利用对所述终端应用的应用行为具有监测功能的行为解释器对所述需要监测的控制流图进行解释执行,获得所述终端应用运行时的活动;Interpret and execute the control flow graph that needs to be monitored by using a behavior interpreter that has a monitoring function for the application behavior of the terminal application, and obtain the activities of the terminal application when the terminal application is running;
根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的栈活动;According to the class of interest, use the class filter to perform coarse-grained screening of the activities of the terminal application during runtime, and generate stack activities caused by the class;
针对所述栈活动的活动类型,利用所述活动类型筛选器对所述栈活动进行细粒度筛选。With regard to the activity type of the stack activity, the activity type filter is used to perform fine-grained screening of the stack activity.
本发明实施例通过灵活地指定特定的包、类及指令类型,可以生成所需的栈模型,提高了易用性。The embodiment of the present invention can generate a required stack model by flexibly specifying a specific package, class, and instruction type, which improves the ease of use.
为提高构造模型的准确性,本发明把执行方法调用指令的开始和结束都视作活动并进行记录。从Java的方法调用上看,其调用显现的是一种树状的结构:对于某一个方法调用,在执行过程中可能又发生多个方法的调用。因此,为了保证生成的序列能还原成这种树状的结构,本发明以下标s表示方法调用开始的活动,下标e表示方法调用结束的活动。对于上述示例的两种程序执行情况会对应为两个不同的序列:In order to improve the accuracy of the construction model, the present invention regards the start and end of the execution method call instruction as activities and records. From the point of view of Java's method invocation, its invocation appears to be a tree-like structure: for a certain method invocation, multiple method invocations may occur during the execution. Therefore, in order to ensure that the generated sequence can be restored to this tree-like structure, the subscript s in the present invention represents the activity at the beginning of the method call, and the subscript e represents the activity at the end of the method call. For the above example, the two program execution situations will correspond to two different sequences:
1)如果calculate都是在doInBackground中被调用,则序列为d s→c s→c e→c s→c e→d e1) If both calculate doInBackground is called, then the sequence is d s → c s → c e → c s → c e → d e;
2)如果有一个calculate是自身的递归调用,则序列为d s→c s→c s→c e→c e→d e2) If there is a recursive call to itself calculate, the sequence d s → c s → c s → c e → c e → d e.
将生成的活动序列重新构造成树状结构的方法调用可采用调用树构建算法。算法的过程其实是模 拟Java虚拟机执行流程的过程。算法开始时,每一个线程的活动对应一个actions对象。对于每一个线程,维护两个数据结构:1)为已执行过的子控制流队列;2)为当前控制流的执行的函数栈。按序遍历actions中的各个活动,并进行以下判断:The method call that reconstructs the generated activity sequence into a tree structure can adopt the call tree construction algorithm. The algorithm process is actually the process of simulating the execution process of the Java virtual machine. At the beginning of the algorithm, each thread's activity corresponds to an actions object. For each thread, two data structures are maintained: 1) the sub-control flow queue that has been executed; 2) the function stack of the current control flow execution. Traverse each activity in actions in order, and make the following judgments:
如果没有当前控制流,则实例化一个,并压入函数栈。If there is no current control flow, one is instantiated and pushed onto the function stack.
如果当前活动为方法开始类型,则实例化一个新的子控制流,将新实例化的子控制流压入函数栈,并添加进当前控制流的活动队列。最后将当前控制流设置为刚实例化的子控制流。If the current activity is the method start type, instantiate a new sub-control flow, push the newly instantiated sub-control flow onto the function stack, and add it to the activity queue of the current control flow. Finally, the current control flow is set to the sub-control flow just instantiated.
如果当前活动为方法结束类型,则进行弹栈操作。如果弹栈结束后,函数栈为空,则说明当前线程的子控制流已经执行完毕,可以添加进已执行的子控制流队列中;如果函数栈不为空,则当前的控制流设置为函数栈顶的那个子控制流。If the current activity is the method end type, pop the stack operation. If the function stack is empty after the pop-up stack ends, it means that the sub-control flow of the current thread has been executed and can be added to the executed sub-control flow queue; if the function stack is not empty, the current control flow is set to the function The sub-control flow at the top of the stack.
否则将当前活动压入子控制流的活动队列中。Otherwise, the current activity is pressed into the activity queue of the sub-control flow.
类似于方法调用指令,其他类型的指令都可以有指令开始执行和执行结束两种活动。因为这些指令具有原子性,即在同一线程中,指令执行的开始与结束之间具有不会发生其他活动,所以,对于这些类型的指令仅需有指令开始执行的活动即可。Similar to method call instructions, other types of instructions can have two activities: instruction start execution and execution end. Because these instructions are atomic, that is, in the same thread, no other activities will occur between the beginning and the end of the instruction execution. Therefore, for these types of instructions, only the activity of starting the execution of the instruction is required.
在具体实现中,运行时的活动表示实现可以有存储形式:可以是内存中的对象,也可以是持久化的二进制文件或ASIC II文件。本发明中,运行时堆模型可以巴科斯范式的形式表示。In the specific implementation, the runtime activity representation implementation can have a storage form: it can be an object in the memory, or a persistent binary file or an ASIC II file. In the present invention, the runtime heap model can be expressed in the form of Backus paradigm.
本发明通过一个活动的序列化与反序列化的机制来实现这种伸缩性。在运行时模型生成时,将运行时模型中的活动序列存放于可配置大小的缓冲区中,当活动数量超过预设时,则将缓冲区的活动进行序列化,并持久化到本地存储中。The present invention achieves this scalability through an active serialization and deserialization mechanism. When the runtime model is generated, the sequence of activities in the runtime model is stored in a buffer with a configurable size. When the number of activities exceeds the preset, the activities in the buffer are serialized and persisted to local storage .
三、控制流之间依赖关系的生成。多线程编程已成为安卓应用开发中重要的一部分。利用多线程编程可实现用户界面的高效响应,和多项计算任务的并行加速。多线程编程中的线程同步和相互唤醒(称之为线程依赖关系),可抽象为对于栈模型中控制流之间的边。线程的依赖关系是与时间相关的一个关系:在某时刻,主线程可以向后台线程发送计算任务,此时为后台线程执行的活动依赖主线程的活动;而在下一时刻,后台线程完成计算任务后,通知主线程进行界面更新;此时,为主线程执行的活动依赖后台线程的活动。因此,本发明对这些线程间依赖关系进行分类,并对不同类型的依赖进行处理,以在运行时生成这些依赖关系。Third, the generation of dependencies between control flows. Multithreaded programming has become an important part of Android application development. The use of multi-threaded programming can realize the efficient response of the user interface and the parallel acceleration of multiple computing tasks. The thread synchronization and mutual wake-up (called thread dependency) in multithreaded programming can be abstracted as the edge between the control flow in the stack model. Thread dependency is a relationship related to time: at a certain moment, the main thread can send calculation tasks to the background thread, and the activities performed by the background thread at this time depend on the activities of the main thread; and at the next moment, the background thread completes the calculation task After that, the main thread is notified to update the interface; at this time, the activities performed by the main thread depend on the activities of the background thread. Therefore, the present invention classifies these inter-thread dependencies, and processes different types of dependencies to generate these dependencies at runtime.
在本发明实施例中,所述依赖关系包括同步依赖和通信依赖。线程间利用Java语言规范中提供的线程状态转移相关的方法,如Thread.join,Object.wait,Object.notify等实现多个线程之间的协同,本发明称这些线程间的依赖关系为同步依赖。而线程间利用对象,实现多线程间的协同的,本发明称这些线程间的依赖关系为通信依赖。在实际开发中,应用开发者会复用框架层提供的各种多线程编程类,以提升开发效率。框架层虽然提供良好语义的应用编程接口给应用层的类,屏蔽了实现细节,但是为了保障框架的性能和鲁棒性,往往实现较为复杂。利用这些编程框架实现的程序,在运行时线程之间既可能是同步依赖,也可能是通信依赖。In the embodiment of the present invention, the dependency relationship includes synchronization dependency and communication dependency. The thread state transfer related methods provided in the Java language specification are used between threads, such as Thread.join, Object.wait, Object.notify, etc., to realize the coordination between multiple threads. The present invention calls the dependencies between these threads as synchronization dependencies. . Whereas objects are used between threads to achieve coordination among multiple threads, and the present invention calls these dependencies among threads as communication dependencies. In actual development, application developers will reuse various multi-threaded programming classes provided by the framework layer to improve development efficiency. Although the framework layer provides a good semantic application programming interface to the application layer classes, shielding the implementation details, but in order to ensure the performance and robustness of the framework, the implementation is often more complicated. Programs implemented using these programming frameworks may be dependent on synchronization or communication between threads at runtime.
以图5中的BackgroundTask.execute方法调用的开始至onPostExecute方法调用的结束为例,全局共有两个活跃的线程,它们之间相互发生了同步依赖和通信依赖。该过程的方法调用如图6所示:图中的上下两条轴分别表示前台线程和后台线程随时间变化的方法栈的情况;图中的框表示方法,其中灰色的框表示框架层的方法,白色的框表示应用层的方法,即应用开发人员实现的方法;图中的箭头表示线程间的依赖关系,其中实线箭头表示同步依赖,而虚线箭头表示通信依赖。在图6所展示的方法调用过程中,BackgroundTask.execute方法在执行过程中(活动①)会调用ThreadPoolExecutor.execute方法,进而调用后台线程对象的start方法(活动②),进一步会引起后台线程的run方法的调用(活动③)。在后台线程的run方法开始执行后,经过层层调用,最终会调用BackgroundTask.doInBackground方法(活动④),在该方法的执行过程,除了会调用calculate方法进行计算任务外,还会调用AsyncTask.publishProgress方法(活动⑤),以使前台线程调用onProgressUpdated方法(活动⑥)对界面进行更新。随后,后台线程结束计算任务后,会再次用类似的方式通知前台进程当前任务已结束。其中,活动②与活动③之间为同步依赖,而活动⑤与活动⑥之间为通信依赖。Take the BackgroundTask.execute method call in Figure 5 from the beginning to the end of the onPostExecute method call as an example. There are two active threads globally, and there are synchronization dependencies and communication dependencies between them. The method call of this process is shown in Figure 6: the upper and lower axes in the figure represent the method stacks of the foreground thread and the background thread over time; the boxes in the figure represent the methods, and the gray boxes represent the methods of the framework layer. , The white box represents the method of the application layer, that is, the method implemented by the application developer; the arrow in the figure represents the dependency between threads, the solid arrow represents the synchronization dependency, and the dashed arrow represents the communication dependency. In the method call process shown in Figure 6, the BackgroundTask.execute method will call the ThreadPoolExecutor.execute method during the execution process (activity ①), and then call the start method of the background thread object (activity ②), which will further cause the background thread to run Method call (activity ③). After the background thread's run method starts to execute, it will eventually call the BackgroundTask.doInBackground method (activity ④) after layers of calls. During the execution of this method, in addition to calling the calculate method for calculation tasks, it will also call AsyncTask.publishProgress Method (activity ⑤), so that the foreground thread calls the onProgressUpdated method (activity ⑥) to update the interface. Subsequently, after the background thread ends the calculation task, it will notify the foreground process that the current task has ended in a similar manner. Among them, activity ② and activity ③ are synchronous dependence, and activity ⑤ and activity ⑥ are communication dependence.
同步依赖的生成:Generation of synchronization dependencies:
为了实现同步依赖的生成,Java中与同步依赖相关的方法都会被认为是需要收集的活动。这样运行时栈模型就能收集到如表1的各种与同步依赖相关的活动。In order to realize the generation of synchronization dependencies, the methods related to synchronization dependencies in Java are considered to be activities that need to be collected. In this way, the runtime stack model can collect various activities related to synchronization dependencies as shown in Table 1.
对于存在同步依赖的两个活动,后发生的活动可能是一个方法的结束或是一个方法的开始,据此,可将同步依赖分为两种,并进行分别处理:For two activities that have synchronization dependencies, the subsequent activity may be the end of a method or the beginning of a method. Based on this, the synchronization dependencies can be divided into two types and processed separately:
对于一个方法的结束依赖另一个方法的结束的情况,利用时间戳由后往前寻找其他线程中可匹配的活动,如果找到,则对应于一个同步依赖关系。例如Thread.join的结束依赖Thread.run的结束;Object.wait的结束依赖Object.notify的结束,对于如Thread.join或Object.wait的方法结束活动,就可利用时间戳由后往前寻找其他线程中可匹配的活动。如果找到了,则对应于一个同步依赖关系。For the case where the end of one method depends on the end of another method, the timestamp is used to find matching activities in other threads from back to front. If found, it corresponds to a synchronization dependency. For example, the end of Thread.join depends on the end of Thread.run; the end of Object.wait depends on the end of Object.notify. For methods such as Thread.join or Object.wait to end the activity, you can use the timestamp to find others from back to front. Matchable activities in the thread. If found, it corresponds to a synchronization dependency.
在生成控制流间的同步依赖关系时,对于一个方法的结束依赖另一个方法的结束的情况,利用时间戳由后往前寻找其他线程中可匹配的活动,如果找到,则对应于一个同步依赖关系;例如Thread.join的结束依赖Thread.run的结束;Object.wait的结束依赖Object.notify的结束,对于如Thread.join或Object.wait的方法结束活动,就可利用时间戳由后往前寻找其他线程中可匹配的活动。如果找到了,则对应于一个同步依赖关系。When generating synchronization dependencies between control flows, for the case where the end of one method depends on the end of another method, the timestamp is used to find matching activities in other threads from back to front. If found, it corresponds to a synchronization dependency Relationship; for example, the end of Thread.join depends on the end of Thread.run; the end of Object.wait depends on the end of Object.notify. For methods such as Thread.join or Object.wait to end the activity, you can use the timestamp to move forward Find matching activities in other threads. If found, it corresponds to a synchronization dependency.
对于一个活动的开始依赖另一个活动的结束的情况,先对当前线程进行检查,如果该活动为当前线程中的第一个执行的活动,则该活动是依赖另一个线程结束活动的,否则该活动仅是正常的方法调用,并不依赖另一个线程的活动。例如Thread.run的开始依赖Thread.start的结束,由于在Java中,一个活动的开始(即某个方法的调用)可以在任意地方进行任意次数,包括Thread.run方法。因此,要判断一个Thread.run方法的调用是否依赖Thread.start方法的调用不能直接根据时间戳由后往面查找匹配,需要先对当前的Thread.run进行检查:如果该活动为当前线程中的第一个执行的活动,则它是依赖另一个线程Thread.start结束活动的,否则它仅是正常的方法调用,并不依赖另一个线程的活动。For the case where the start of an activity depends on the end of another activity, check the current thread first. If the activity is the first activity executed in the current thread, the activity is dependent on another thread to end the activity, otherwise the Activities are just normal method calls and do not depend on the activities of another thread. For example, the start of Thread.run depends on the end of Thread.start, because in Java, the start of an activity (that is, a method call) can be performed any number of times at any place, including the Thread.run method. Therefore, to determine whether a call to the Thread.run method depends on the call of the Thread.start method, the current Thread.run needs to be checked first: if the activity is in the current thread. The first activity executed, it depends on another thread Thread.start to end the activity, otherwise it is just a normal method call and does not depend on the activity of another thread.
通信依赖的生成:Generation of communication dependencies:
线程间不基于Java提供的能实现线程状态转移的方法,实现多线程间的协同的,本发明称这些线程间的依赖关系为通信依赖。The threads are not based on the method provided by Java that can realize the thread state transition, and realize the coordination among multiple threads. The present invention refers to the dependency between these threads as communication dependency.
以图6中活动⑤与⑥为例,其具体实现是基于MessageQueue的next方法与enqueueMessage方法。在这过程中,如果前台线程的待处理队列中还有元素排队等待处理的话,则活动⑤引起的enqueueMessage方法,仅会将当前任务添加到该队列中,并不会显式地唤醒前台线程。但从逻辑上,可认为对于某一MessageQueue对象而言,其next方法所返回的Message对象,会被Handler以参数的形式传入dispatchMessage方法中,因此可认为MessageQueue的next方法的结束依赖MessageQueue.enqueueMessage,并且该依赖是以参数Message为匹配对象的而不是MessageQueue。Taking activities ⑤ and ⑥ in Figure 6 as an example, the concrete implementation is based on the next method and enqueueMessage method of MessageQueue. In this process, if there are elements in the pending queue of the foreground thread waiting to be processed, the enqueueMessage method caused by activity ⑤ will only add the current task to the queue, and will not explicitly wake up the foreground thread. But logically, it can be considered that for a MessageQueue object, the Message object returned by its next method will be passed into the dispatchMessage method by the Handler as a parameter, so it can be considered that the end of the MessageQueue next method depends on MessageQueue.enqueueMessage And the dependency is based on the parameter Message as the matching object instead of MessageQueue.
在生成控制流间的通信依赖关系时,对所有与活动间通信依赖相关的类进行总结,并将这些类的相关方法与线程依赖相关的方法一起作为生成通信依赖的知识库。该知识库亦可支持应用的进行自定义。When generating communication dependencies between control flows, all classes related to communication dependencies between activities are summarized, and the related methods of these classes and thread dependencies are used as the knowledge base for generating communication dependencies. The knowledge base can also support application customization.
参照图4,所述构造所述终端应用行为的运行时堆模型的步骤具体可以包括:4, the step of constructing the runtime heap model of the terminal application behavior may specifically include:
步骤S405:在所述终端应用运行时,生成堆区的初始状态;Step S405: Generate an initial state of the heap area when the terminal application is running;
步骤S406:生成堆操作活动,得到所述终端应用行为的运行时堆模型。Step S406: Generate a heap operation activity, and obtain a runtime heap model of the terminal application behavior.
在本发明实施例中,运行时堆模型包含以下基本要素:1)一个堆区的初始状态;2)运行时发生的一组影响堆区数据的活动。本发明首先给出堆区初始状态的描述方法,并在运行时生成符合该表示的堆数据初始状态。其次,本发明给出了堆操作活动的描述方法,并在运行时构造运行时堆模型中的活动。最后,给出了堆区初始状态与堆操作活动的BNF表示。In the embodiment of the present invention, the runtime heap model includes the following basic elements: 1) the initial state of a heap area; 2) a group of activities that affect the data of the heap area during operation. The present invention first provides a description method of the initial state of the heap area, and generates an initial state of heap data that conforms to the representation during operation. Secondly, the present invention provides a description method of heap operation activities, and constructs activities in the runtime heap model at runtime. Finally, the BNF representation of the initial state of the heap area and the heap operation activity is given.
下面,对运行时堆模型的生成步骤作进一步概述。Below, a further overview of the generation steps of the runtime heap model is given.
一、是对堆区初始状态的生成。堆区的初始状态为开始时刻的堆区数据的状态。在Java虚拟机规范中,仅对堆区给出了最简单的描述:堆是运行时用于分析所有类实例和数组的区域,该区域由一个自动存储管理系统(即垃圾回收器)进行管理。堆中的对象从来都不会显式地回收,而是会由垃圾 回收器自动回收。堆区的初始状态可视为某一时刻堆区数据的快照,因此,如果在生成内存堆区数据状态时,如果有别的线程继续执行,并进行堆区操作(例如创建对象、执行垃圾回收等),就会破坏初始状态的原子性。因此,本发明首先给出了一种描述堆区初始状态的BNF表示,并在生成堆区数据初始状态时采用“冻结”堆区数据的方式,保证了初始状态生成过程的原子性。One is the generation of the initial state of the heap area. The initial state of the heap area is the state of the heap area data at the start time. In the Java virtual machine specification, only the simplest description of the heap area is given: the heap is the area used to analyze all class instances and arrays at runtime, and this area is managed by an automatic storage management system (ie garbage collector) . Objects in the heap are never explicitly collected, but are automatically collected by the garbage collector. The initial state of the heap area can be regarded as a snapshot of the heap area data at a certain moment. Therefore, if the data state of the memory heap area is generated, if other threads continue to execute and perform heap area operations (such as creating objects and performing garbage collection) Etc.), it will destroy the atomicity of the initial state. Therefore, the present invention first provides a BNF representation describing the initial state of the heap area, and adopts a method of "freezing" the heap area data when generating the initial state of the heap area data to ensure the atomicity of the initial state generation process.
二、堆模型中的活动的生成。在应用运行时,Java的垃圾回收器可以产生回收内存的活动。除了这些活动外,其他活动可视为运行时栈模型中的活动的一个子集。一方面,如果将每一个会影响堆区的数据的操作都对应于一个活动,会导致这组活动数量变得极大而难以处理。例如某应用中存在大文件的I/O操作,如果全部操作都以活动的形式记录下来,则活动的数据量将不小于大文件的数据量;另一方面,类似于控制流模型,可能仅关注部分类、方法的执行,生成过大的堆模型反而难以分析。这里扩展了活动的描述使其支持描述垃圾回收活动,类似于运行时栈模型,提供多种粒度的活动选择筛选选项,以便生成合适的堆模型。生成的堆模型详实地描述了所关心的对象的变化情况,因此,利用基于时间戳的堆对象状态查询算法可以查询堆对象在任意时刻的状态。2. The generation of activities in the heap model. When the application is running, Java's garbage collector can generate activities to reclaim memory. In addition to these activities, other activities can be regarded as a subset of the activities in the runtime stack model. On the one hand, if each operation that affects the data in the heap area corresponds to an activity, the number of activities in this group will become extremely large and difficult to process. For example, there are I/O operations of large files in an application. If all operations are recorded in the form of activities, the amount of active data will not be less than that of large files; on the other hand, similar to the control flow model, it may only Focusing on the execution of some classes and methods, generating a heap model that is too large is difficult to analyze. The description of activities is expanded here to support the description of garbage collection activities, similar to the runtime stack model, providing multiple granular activity selection filtering options to generate a suitable heap model. The generated heap model describes the changes of the objects of interest in detail. Therefore, the heap object state query algorithm based on timestamp can query the state of the heap object at any time.
接下来,采用一个具体的例子对运行时堆模型建模过程进行介绍:Next, use a specific example to introduce the runtime heap model modeling process:
Java堆区的数据仅包括实例化的对象和数组。对于应用而言,对象的创建可能发生在应用层代码或框架层代码,因此,我们将应用中的对象分为应用层和框架层,以图5实现的代码为例,在触发点击事件前,堆区的对象如图7(a)所示。图中每一个圆表示一个对象,而圆与圆之间的连线表示引用关系。图7(a)中与应用业务逻辑相关对象有展示界面FloatActivity、可触发后台计算任务的按钮Button、待处理的后台任务BackgroundTask、用于展示任务计算结果的TextView和点击事件监听对象OnClickListener。除了与业务逻辑相关的对象,还有许多框架层的对象。例如,框架层一个MessageQueue对象。实现后台处理任务可以通知前台进行更新。图7中的实线箭头表示对象的引用关系,即对象A如果有某个字段指向另一个对象B,则由A至B有一条有向连接;虚线箭头表示在整个事件的处理过程中,存在过对象的引用关系,而在结束时,已没有引用关系了。The data in the Java heap area only includes instantiated objects and arrays. For applications, the creation of objects may occur in application layer code or framework layer code. Therefore, we divide the objects in the application into application layer and framework layer. Taking the code implemented in Figure 5 as an example, before triggering the click event, The objects of the heap area are shown in Figure 7(a). Each circle in the figure represents an object, and the line between the circle and the circle represents the reference relationship. The objects related to the application business logic in Figure 7(a) include the display interface FloatActivity, the button that can trigger the background calculation task, the background task to be processed, the BackgroundTask, the TextView used to display the calculation result of the task, and the click event listener OnClickListener. In addition to objects related to business logic, there are many framework-level objects. For example, a MessageQueue object in the framework layer. Realizing background processing tasks can notify the foreground to update. The solid arrow in Figure 7 indicates the reference relationship of the object, that is, if there is a field in object A that points to another object B, there is a directed connection from A to B; the dotted arrow indicates that there is a Through the reference relationship of the object, and at the end, there is no reference relationship.
在事件触发过程中,会创建如下对象:在BackgroundTask.execute方法在执行过程中,会调用ThreadPoolExecutor.execute方法,此时,由于是该对象首次执行execute方法,该对象会新建一个Thread对象①,为后台执行的线程;在调用后台线程对象的start方法之后,进一步会引起后台线程的run方法的调用并正式开始调用doInBackground方法。在该方法中,除了会调用calculate方法进行计算任务外,还会调用AsyncTask.publishProgress方法,在执行该方法前,传入的参数会被封装到新创建的Integer[]对象中②;而在方法执行过程中,则是会新创建一个Message对象③,并放至全局的一个MessageQueue队列中。当前台线程接收到该Message,并执行onProgressPublished时,会创建一个StringBuilder对象④,以构造setText所需的参数,并通过StringBuilder.toString方法实例化出一个新的String对象⑤。而当doInBackground方法执行结束前,又会创建一个新的StringBuilder对象⑥,并且计算出返回值String对象⑦。该String对象会被再次封装到一个新创建的Message对象⑧中,并通知前台线程执行onPostExecute方法。上述过程已对部分步骤进行简化,在实际运行中会存在更多的对象创建。例如,Thread对象并不是直接依赖BackgroundTask而是会经过FutureTask,Callable等对象的层层封装,间接依赖BackgroundTask对象。在该过程结束后,对象①至⑧都可能会在某次垃圾回收中被回收。During the event triggering process, the following objects will be created: During the execution of the BackgroundTask.execute method, the ThreadPoolExecutor.execute method will be called. At this time, since the object executes the execute method for the first time, the object will create a new Thread object①, which is The thread of background execution; after calling the start method of the background thread object, it will further cause the call of the run method of the background thread and officially start calling the doInBackground method. In this method, in addition to calling the calculate method to perform the calculation task, it also calls the AsyncTask.publishProgress method. Before executing the method, the incoming parameters will be encapsulated into the newly created Integer[] object ②; During execution, a new Message object ③ will be created and placed in a global MessageQueue queue. When the front thread receives the Message and executes onProgressPublished, it will create a StringBuilder object ④ to construct the parameters required by setText, and instantiate a new String object ⑤ through the StringBuilder.toString method. Before the execution of the doInBackground method ends, a new StringBuilder object ⑥ will be created, and the return value String object ⑦ will be calculated. The String object will be encapsulated again in a newly created Message object ⑧, and notify the foreground thread to execute the onPostExecute method. The above process has simplified some of the steps, and more objects will be created in actual operation. For example, the Thread object does not directly rely on the BackgroundTask, but will be encapsulated by FutureTask, Callable and other objects, and indirectly rely on the BackgroundTask object. After the end of the process, objects ① to ⑧ may be collected in a garbage collection.
在堆模型中,本发明将图7(b)中的每个对象的实例化、字段赋值以及回收都视为一个活动。类似运行时栈模型的表示,下面本发明优选以巴科斯范式的形式给出一种运行时堆模型的表示。In the heap model, the present invention treats the instantiation, field assignment, and recycling of each object in Figure 7(b) as an activity. Similar to the representation of the runtime stack model, the following invention preferably provides a representation of the runtime stack model in the form of Backus paradigm.
其中,DataAction与运行时栈模型构造关键技术中所描述的ControlAction类似,ControlAction用于描述执行的指令情况,而DataAction用于描述内存数据的变化情况。Number表示数字类型,可以是数值,也可以是内存地址;String表示字符串类型。从上述表示中,可以发现,模型的复杂度主要取决于:1)初始状态中对象的数量;2)堆区数据活动的数量。Among them, DataAction is similar to the ControlAction described in the key technology of runtime stack model construction. ControlAction is used to describe the execution of instructions, and DataAction is used to describe changes in memory data. Number represents a number type, which can be a numeric value or a memory address; String represents a string type. From the above representation, it can be found that the complexity of the model mainly depends on: 1) the number of objects in the initial state; 2) the number of data activities in the heap area.
在安卓的具体实现中,其堆区可分为三个子区域:1)应用堆(App Heap),当前应用实例化对象和数组时使用的内存区域;2)镜像堆(Image Heap),装载了当前应用镜像的内存区域;3)孵化 堆(Zygote Heap),系统启动时加载的系统类存放的内存区域。本发明所述的初始状态主要关注运行时变化最大的应用堆。对于安卓平台上的Dalvik虚拟机和ART虚拟机而言,它们实现了可以在任一时刻,将当前应用堆的状态以文件的状态保存下来(堆转储操作)。该文件是一种私有的内存镜像格式,通过安卓开发者工具可将其转换为符合J2EE平台规定的hprof格式。In the specific implementation of Android, its heap area can be divided into three sub-areas: 1) App Heap, the memory area used by the current application to instantiate objects and arrays; 2) Image Heap, which is loaded The memory area of the current application mirroring; 3) Zygote Heap, the memory area where the system classes loaded when the system starts. The initial state described in the present invention mainly focuses on the application stack that changes the most during runtime. For the Dalvik virtual machine and ART virtual machine on the Android platform, they realize that the current application heap state can be saved as a file at any time (heap dump operation). This file is a private memory image format, which can be converted to hprof format that complies with J2EE platform regulations through Android Developer Tools.
然而直接基于现在的应用堆转储仅能反映某一时刻的堆状态,难以实现反映一段时间内的任意时刻的堆状态。首先,执行一次堆转储操作需要挂起所有线程,耗时高,生成的文件从几十兆字节至几百兆字节不等,难以通过每隔一段时间执行一次堆转储操作实现反映一段时间内的任意时刻的堆状态。其次,执行堆转储操作并不会转储那些被回收器回收的对象,包括那些在执行过程中产生的临时对象,然而对于一个执行过程而言,产生的临时对象对描述过程的执行也十分重要,例如,图7中的对象②至⑧都是临时对象,这些对象不能直接利用堆转储进行持久化。However, the current application heap dump can only reflect the heap state at a certain moment, and it is difficult to reflect the heap state at any time within a period of time. First of all, all threads need to be suspended to perform a heap dump operation, which is time-consuming. The generated files range from tens of megabytes to hundreds of megabytes, and it is difficult to implement a heap dump operation every once in a while. The state of the heap at any time within a period of time. Secondly, performing a heap dump operation does not dump those objects that are recovered by the collector, including those temporary objects generated during execution. However, for an execution process, the temporary objects generated are also very important for the execution of the described process. Important, for example, the objects ② to ⑧ in Figure 7 are all temporary objects, and these objects cannot be persisted directly using heap dumps.
在本发明实施例中,所述构造所述终端应用行为的运行时堆模型的步骤包括:所述终端应用运行时的活动包括实例化活动、修改活动和回收活动。In the embodiment of the present invention, the step of constructing the runtime heap model of the terminal application behavior includes: the terminal application runtime activities include instantiation activities, modification activities, and recycling activities.
其中,实例化活动(NewAction),即创建新的对象、新的数组的活动,可对应于字节码中的newInstance、newArray等指令在运行时的执行。Among them, the instantiation activity (NewAction), that is, the activity of creating a new object and a new array, can correspond to the execution of instructions such as newInstance and newArray in the bytecode at runtime.
修改活动(ModifiyAction),即修改类的静态字段、对象的字段、数组的元素的值的活动,可对应于字节码中的sput,iput,aput等指令。ModifiyAction, that is, the activity of modifying the value of the static field of the class, the field of the object, and the element of the array, can correspond to instructions such as sput, iput, aput in the bytecode.
回收活动(GCAction),即执行垃圾回收时,对堆中的对象产生影响的活动。针对回收活动,垃圾回收机制是一种自动的内存管理机制。当一片内存中的数据不再会被用到时,会予以回收释放,以便于进行下次的分配。具体的垃圾回收算法实现有引用计数法、可达性分析算法等。GCAction is an activity that affects objects in the heap when garbage collection is performed. For recycling activities, the garbage collection mechanism is an automatic memory management mechanism. When the data in a piece of memory will no longer be used, it will be reclaimed and released to facilitate the next allocation. Specific garbage collection algorithms are implemented with reference counting method and reachability analysis algorithm.
回收活动在运行时是对应不到dex字节码的指令的,因为它的具体实现是在虚拟机层面。回收活动可进一步细分为清除活动和压缩活动。所谓清除活动就是清除不再需要的对象;而所谓的压缩活动则是对活跃的对象整理到连续的内存空间上,以避免分配大内存时由于碎片化导致分配失败的情况。The recycling activity does not correspond to the instructions of the dex bytecode at runtime, because its specific implementation is at the virtual machine level. Recycling activities can be further subdivided into cleanup activities and compression activities. The so-called cleanup activity is to clean up objects that are no longer needed; while the so-called compression activity is to organize active objects into contiguous memory space to avoid allocation failure due to fragmentation when allocating large memory.
除此之外,除了实现的应用层的代码会创建对象以外,框架层的代码也会创建大量对象,在某些情况下,甚至会是几倍于应用层创建的对象。需要提供一种堆模型复杂性管理的机制,以保证生成的运行时栈模型的准确性和易用性。与前述两级筛选机制类似,对于堆模型的活动生成,也有一个两级的筛选机制。基于包和类名正则匹配的粗粒度筛选和基于活动类型的细粒度筛选。In addition, in addition to the implementation of the application layer code will create objects, the framework layer code will also create a large number of objects, in some cases, even several times the objects created by the application layer. It is necessary to provide a mechanism for the complexity management of the heap model to ensure the accuracy and ease of use of the generated runtime stack model. Similar to the aforementioned two-level screening mechanism, there is also a two-level screening mechanism for the activity generation of the heap model. Coarse-grained filtering based on regular matching of package and class names and fine-grained filtering based on activity type.
本发明优选提供了6种堆操作活动,所述堆操作活动的活动类型包括对象实例化,数组实例化,对象字段写,数组元素写,清除活动和压缩活动;The present invention preferably provides 6 types of heap operation activities, and the activity types of the heap operation activities include object instantiation, array instantiation, object field writing, array element writing, clearing activity and compression activity;
所述生成堆操作活动的步骤包括:The steps of generating a heap operation activity include:
根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的堆操作活动;According to the class of interest, use the class filter to perform coarse-grained screening of the terminal application runtime activities to generate heap operation activities caused by the class;
针对所述堆操作活动的活动类型,利用所述活动类型筛选器对所述堆操作活动进行细粒度筛选。For the activity type of the heap operation activity, the activity type filter is used to perform fine-grained screening of the heap operation activity.
针对行为反射在控制上面临的挑战,即本发明背景技术所阐述的第二个挑战,支持指令级的行为控制,由于不是本发明的重点,在此不多赘述。Regarding the control challenge faced by behavior reflection, that is, the second challenge described in the background art of the present invention, supporting instruction-level behavior control is not the focus of the present invention, so I will not repeat it here.
下面,采用一具体实例验证本发明实施例对终端应用行为监测的有效性。In the following, a specific example is used to verify the effectiveness of the embodiment of the present invention on terminal application behavior monitoring.
针对移动互联网中广泛使用的安卓移动应用,给出行为反射框架的原型系统实现:Reflectall。Reflectall的全称为Reflection at low level interpreter,具有两重含意,一是基于底层的行为解释器实现的反射;二是它可以监测和控制指令级别的应用行为。Reflectall基于安卓操作系统开源项目。为了实现移动应用行为的监测与控制,Reflectall平台可分为行为运行时模型构造子系统、模型分析与代码生成子系统和运行子系统,实现了行为反射框架中的监测和控制。For Android mobile applications widely used in the mobile Internet, a prototype system implementation of the behavior reflection framework: Reflectall is given. The full name of Reflectall is Reflection at low level interpreter, which has two meanings. One is reflection based on the underlying behavior interpreter; the other is that it can monitor and control instruction-level application behavior. Reflectall is based on the Android operating system open source project. In order to realize the monitoring and control of mobile application behavior, the Reflectall platform can be divided into a behavior runtime model construction subsystem, a model analysis and code generation subsystem, and an operation subsystem to realize the monitoring and control in the behavior reflection framework.
参照图8,为Reflectall模型生成子系统架构示意图。Reflectall的行为运行时模型构造子系统实现了移动应用行为运行时模型的构造,其核心实现在系统层,由优化-反优化器、行为解释器、模型构建和接口层四个模块组成。这个四个模块实现了移动应用行为的监测与控制。Referring to Figure 8, a schematic diagram of the architecture of the Reflectall model generation subsystem. Reflectall's behavior runtime model construction subsystem implements the construction of mobile application behavior runtime models. Its core implementation is at the system layer and consists of four modules: optimization-deoptimizer, behavior interpreter, model construction and interface layer. These four modules realize the monitoring and control of mobile application behavior.
其中,优化-反优化器:安卓运行时环境可加载CPU可直接执行的原生指令。因此,需要将以原 生指令切换为字节码,即反优化,并通过行为解释器进行解释执行,从而实现对移动应用运行时活动的监测。移动应用由于其复杂性,难以监测移动应用执行中的所有活动,因此引入了一个两级筛选机制。优化-反优化器实现了两级筛选机制中的类筛选机制,通过优化-反优化器,可以按需地将待监测的类反优化为字节码,并进行解释执行;而对于未被监测的类,则仍然在原生执行器中执行。优化-反优化器会在如下三种情况下触发:1)收到开始监测的命令时,会根据所配置的参数,对当前已经加载的类的方法进行筛选,并进行反优化;2)收到结束监测命令时,会对当前已反优化后的类进行再优化,令其重新进入原生执行器中执行;3)类链接器加载新的类时,会对类进行类似于情况1)中的筛选和反优化过程。为了保证程序执行的正确性,反优化的过程需要和部分垃圾回收算法一样,暂时挂起所有线程的执行,待反优化执行结束后,再恢复线程的执行。通过这种局部反优化,并保持解释执行与原生执行共存的状态,可以大大减少监测的性能开销。Among them, optimization-de-optimizer: the Android runtime environment can load native instructions that can be directly executed by the CPU. Therefore, it is necessary to switch the original instruction to bytecode, that is, de-optimization, and perform interpretation and execution through the behavior interpreter, so as to realize the monitoring of mobile application runtime activities. Due to the complexity of mobile applications, it is difficult to monitor all activities in the execution of mobile applications, so a two-level screening mechanism is introduced. The optimization-de-optimizer implements the class filtering mechanism in the two-level filtering mechanism. Through the optimization-de-optimizer, the classes to be monitored can be de-optimized into bytecodes as needed, and interpreted and executed; while for unmonitored The class is still executed in the native executor. The optimization-de-optimizer will be triggered in the following three situations: 1) When receiving the command to start monitoring, it will filter the methods of the currently loaded classes according to the configured parameters and perform de-optimization; 2) Receive When the monitoring command is finished, the currently de-optimized class will be re-optimized to re-enter the native executor for execution; 3) When the class linker loads a new class, the class will be similar to the case 1) The screening and de-optimization process. In order to ensure the correctness of the program execution, the de-optimization process needs to be the same as part of the garbage collection algorithm, temporarily suspend the execution of all threads, and resume the thread execution after the de-optimization execution ends. Through this partial de-optimization and maintaining the coexistence of interpreted execution and native execution, the performance overhead of monitoring can be greatly reduced.
行为解释器:行为解释器是解释执行dex格式字节码的解释器,能在解释执行的过程中,监测当前的程序执行中发生的活动。移动应用行为运行时模型中的活动大多是由行为解释器生成。除了行为解释器所生成的活动,垃圾回收器也可以生成部分活动——垃圾回收活动。行为解释器还实现了两级筛选机制中的活动筛选机制,可根据配置的活动收集粒度产生不同种类的活动。Behavior interpreter: The behavior interpreter is an interpreter that interprets and executes the dex format bytecode. It can monitor the activities that occur in the current program execution during the interpretation and execution. Most of the activities in the mobile application behavior runtime model are generated by the behavior interpreter. In addition to the activities generated by the behavior interpreter, the garbage collector can also generate some activities-garbage collection activities. The behavior interpreter also implements the activity filtering mechanism in the two-level filtering mechanism, which can generate different types of activities according to the configured activity collection granularity.
模型构建器:行为解释器与垃圾回收器所产生的活动会在模型构建器中构建。当运行时产生的活动较多时,会导致内存占用较大。因此,模型构建器实现了在线和离线的模型构建。当活动较少时,模型构建器运行于在线的模型构建模式时,当活动数量达到配置的阀值,模型构建器会将当前已产生的活动序列持久化,并以文件的形式持久化至存储。Model builder: The activities generated by the behavior interpreter and garbage collector will be built in the model builder. When there are more activities during runtime, it will lead to a larger memory footprint. Therefore, the model builder implements online and offline model construction. When there are few activities, when the model builder runs in the online model building mode, when the number of activities reaches the configured threshold, the model builder will persist the currently generated sequence of activities and store them in the form of files .
接口层:对优化-反优化器、行为解释器和模型构建器所提供的功能进行封装。同时也提供反序列化活动所需的接口,例如根据地址查找对象和给定对象转换为地址等。Interface layer: encapsulates the functions provided by the optimization-de-optimizer, behavior interpreter, and model builder. At the same time, it also provides the interfaces required for deserialization activities, such as finding objects based on addresses and converting given objects to addresses.
本发明实例所实现的原型中,可以生成两种移动应用行为运行时模型:1)包含了运行时数据依赖的精细化模型;2)不包含运行时数据依赖的精简模型。基于系统层的实现,在框架层,Reflectall包括一组行为反射接口,可监测不同粒度的应用的活动,生成不同粒度的应用行为运行时模型;一组远程调试连接接口,可控制应用活动监测的开始与结束。在应用层,对框架层的接口进行封装,实现了一个安卓应用,可以以Web服务的形式对外提供远程调试接口。In the prototype realized by the example of the present invention, two mobile application behavior runtime models can be generated: 1) a refined model that includes runtime data dependency; 2) a simplified model that does not include runtime data dependency. Based on the implementation of the system layer, in the framework layer, Reflectall includes a set of behavior reflection interfaces that can monitor the activities of applications with different granularities and generate runtime models of different granularities of application behavior; a set of remote debugging connection interfaces that can control application activity monitoring Start and end. At the application layer, the interface of the framework layer is encapsulated to realize an Android application, which can provide a remote debugging interface in the form of a Web service.
Reflectall的分析与代码生成子系统为浏览器-服务端架构。分析与代码生成子系统实现了:The analysis and code generation subsystem of Reflectall is a browser-server architecture. The analysis and code generation subsystem implements:
版本管理:利用git管理不同版本的移动应用与互操作接口。同时支持服务器端编译,并利用客户端的接口管理应用将编译好的dex字节码推送至客户端。Version management: Use git to manage different versions of mobile applications and interoperability interfaces. At the same time, it supports server-side compilation, and uses the client's interface management application to push the compiled dex bytecode to the client.
栈模型可视化:提供了一个树状的视图,并支持基于关键字的数据依赖沾染分析。Stack model visualization: Provides a tree view and supports keyword-based data dependency analysis.
Reflectall的接口运行子系统在安卓开源项目的框架层上添加了一个行为反射类加载器,如图9所示,为本发明示例Reflectall的接口运行子系统的结构示意图。当应用进程启动时,会检查是否有可加载的行为反射接口字节码文件。如果存在适合当前应用的行为反射接口字节码文件,则通过行为反射类加载器将其加载进应用进程,同时,利用Binder通信机制向接口管理应用注册当前应用提供的互操作接口。接口管理应用提供了接口转发、状态检测等服务。调用者进程通过接口管理应用可实现与指定应用互操作。The interface operation subsystem of Reflectall adds a behavior reflection class loader to the framework layer of the Android open source project, as shown in FIG. 9, which is a structural diagram of the interface operation subsystem of Reflectall, an example of the present invention. When the application process starts, it will check whether there is a loadable behavior reflection interface bytecode file. If there is a behavior reflection interface bytecode file suitable for the current application, it is loaded into the application process through the behavior reflection class loader, and at the same time, the Binder communication mechanism is used to register the interoperability interface provided by the current application with the interface management application. The interface management application provides services such as interface forwarding and status detection. The caller process can realize interoperability with the specified application through the interface management application.
具体验证时,本发明以一个包含69个开源安卓应用的开源应用集和一个包含39个闭源应用的闭源应用集对Reflectall模型生成的性能进行验证。During specific verification, the present invention verifies the performance of the Reflectall model generation with an open source application set containing 69 open source Android applications and a closed source application set containing 39 closed source applications.
移动应用行为运行时模型的构造开销与模型的活动数量正相关——应用越复杂、活动越多,生成行为运行时模型的开销也就越大。相比闭源应用,开源应用在实现的复杂程度远低于闭源应用。开源应用集的类的数量的中位数为58个,方法的数量的中位数为246个;75%的开源应用集中的应用的类的数量不大于167个;方法的数量不大于859个。而闭源应用集中的应用,应用的类的数量的中位数为14266,方法的数量的中位数是87717个,是开源应用集中对应的数值的245倍和102倍。实验所使用的硬件配置如下:1)使用Android智能手机红米2A,它的CPU是1.5GHz,内存为1GB,运行安卓操作系统版本为5.1.1。2)实验使用一台普通PC作为远程控制端控制手机进行实验,该PC的 CPU为Intel酷睿i5 3427U(1.8GHz),内存为4GB,运行OSX 10.11操作系统。The construction cost of the mobile application behavior runtime model is positively related to the number of model activities-the more complex the application and the more activities, the greater the cost of generating the behavior runtime model. Compared with closed source applications, open source applications are far less complex in implementation than closed source applications. The median of the number of classes in the open source application set is 58 and the median of the number of methods is 246; the number of application classes in 75% of the open source application sets is no more than 167; the number of methods is no more than 859 . For applications in the closed-source application set, the median number of application classes is 14,266, and the median number of methods is 87717, which is 245 times and 102 times the corresponding values in the open source application set. The hardware configuration used in the experiment is as follows: 1) Use the Android smartphone Redmi 2A, its CPU is 1.5GHz, the memory is 1GB, and the Android operating system version is 5.1.1. 2) The experiment uses an ordinary PC as a remote control The terminal controls the mobile phone for experiments. The PC's CPU is Intel Core i5 3427U (1.8GHz), the memory is 4GB, and it runs OSX 10.11.
目前,监测应用执行流程的方法除了本发明所述的实现行为解释器的方式外,还包括运行时元消息绑定和编译时字节码重构两种方式。表2给出了三种方式所支持的活动监测的粒度。Reflectall在活动监测的粒度上比基于运行时绑定的方法要更细:支持到指令级别的活动监测;同时也比基于字节码重构的方法适应范围更广:基于字节码重构的方式需要修改原应用的编译流程,难以直接在经过混淆、加固的应用上使用。At present, in addition to the method of implementing the behavior interpreter described in the present invention, the method of monitoring application execution flow also includes two methods of runtime meta-message binding and compile-time bytecode reconstruction. Table 2 shows the granularity of activity monitoring supported by the three methods. Reflectall is more granular in activity monitoring than the method based on runtime binding: it supports activity monitoring up to the instruction level; at the same time, it has a wider range of adaptation than the method based on bytecode reconstruction: based on bytecode reconstruction The method needs to modify the compilation process of the original application, and it is difficult to directly use it on obfuscated and hardened applications.
表2:监测程序执行流程的方法比较Table 2: Comparison of methods for monitoring program execution flow
 To 执行流程监测的粒度Granularity of execution process monitoring 是否需要字节码Do you need bytecode
ReflectallReflectall 支持方法级别、指令级别的监测Support method level and instruction level monitoring 不需要Not needed
基于运行时绑定的方法Method based on runtime binding 支持方法级别的监测Support method level monitoring 不需要Not needed
基于字节码重构的方法Method based on bytecode reconstruction 支持方法级别、指令级别的监测Support method level and instruction level monitoring 需要need
本发明在实验一对比Reflectall与基于运行时绑定的方法在监测程序执行流程方面的性能。在实验二对比Reflectall与基于字节码重构的方法在监测指令粒度的活动的性能。In experiment 1, the present invention compares the performance of Reflectall and the method based on runtime binding in monitoring program execution flow. In Experiment 2, the performance of Reflectall and the method based on bytecode reconstruction in monitoring the activity of instruction granularity were compared.
实验一:对比运行时元消息绑定的方法Experiment 1: Compare the method of runtime meta-message binding
Xposed框架是一款可以在不修改APK的情况下监测和修改程序运行行为的框架服务(rovo89,2012)。类似于Reflectall,Xposed框架也是在安卓操作系统的系统层上进行修改。Xposed框架实现了元消息模型的行为反射,即Xposed在应用运行时根据配置对指定的方法绑定相应的元对象。在后续的执行中,这些绑定了元对象的方法在执行前、执行后会调用元对象中的before、after方法。本文利用Xposed框架,实现了与Reflectall类似监测程序执行的Xposed模块。本节以应用启动时间作为指标,在两台硬件配置相同的红米2A手机上,分别部署了Reflectall和基于Xposed的监测模块,安卓操作系统均为5.1.1。通过以下6个不同的实验场景,比较Reflectall与基于Xposed框架的方法在开源应用集和闭源应用集上监测应用的所有应用类执行的性能,如表3所示。在每个场景下,每个应用启动10次,实验的结果如图10所示。Xposed framework is a framework service that can monitor and modify the running behavior of the program without modifying the APK (rovo89, 2012). Similar to Reflectall, the Xposed framework is also modified on the system layer of the Android operating system. The Xposed framework implements the behavior reflection of the meta message model, that is, Xposed binds the corresponding meta object to the specified method according to the configuration when the application is running. In the subsequent execution, these methods bound to the meta-object will call the before and after methods in the meta-object before and after execution. This article uses the Xposed framework to implement an Xposed module that monitors program execution similar to Reflectall. In this section, the application startup time is used as an indicator. Reflectall and Xposed-based monitoring modules are deployed on two Redmi 2A phones with the same hardware configuration. The Android operating systems are both 5.1.1. Through the following 6 different experimental scenarios, compare the performance of Reflectall and the method based on the Xposed framework to monitor the execution of all application classes of the application on the open source application set and the closed source application set, as shown in Table 3. In each scenario, each application was launched 10 times, and the results of the experiment are shown in Figure 10.
表3:监测程序执行流程的方法比较Table 3: Comparison of methods for monitoring program execution flow
Figure PCTCN2019119272-appb-000007
Figure PCTCN2019119272-appb-000007
图10(a)为开源应用集上的实验结果。在开源应用集中,69个应用在上述6个场景下均能正常启动。图10(a)中的实线部分为部署了Reflectall的三个场景;虚线部分为部署了Xposed框架的三个场景。在不监测程序执行流程的情况下,部署Reflectall的手机(场景1)平均启动时间为392毫秒,而部署了Xposed框架的手机(场景3)的平台启动时间为449毫秒。这是由于Xposed框架的实现即便不绑定元对象,也在应用加载时会有一定的开销。而Reflectall的优化-反优化器实现了在不监测时,所有代码均在原生执行器中执行。在场景2下,Reflectall的平均启动时间为486毫秒,相比不监测情况(392毫秒),额外开销为23%;而基于Xposed框架的方法,在场景2下的平均启动时间高达2078毫秒,相比不监测情况(449毫秒),其额外开销为368%。而在生成更复杂的行为运行时模型时(场景3和场景6),Reflectall的额外开销仅为27%,而基于Xposed框架的方法高达477%。Figure 10(a) shows the experimental results on the open source application set. In the open source application set, 69 applications can be started normally in the above 6 scenarios. The solid lines in Figure 10(a) are the three scenarios where Reflectall is deployed; the dotted lines are the three scenarios where the Xposed framework is deployed. Without monitoring the program execution process, the average startup time of the mobile phone deployed with Reflectall (scenario 1) is 392 milliseconds, while the platform startup time of the mobile phone deployed with the Xposed framework (scenario 3) is 449 milliseconds. This is because even if the implementation of the Xposed framework does not bind meta objects, there will be a certain overhead when the application is loaded. Reflectall's optimization-de-optimizer realizes that all code is executed in the native executor when it is not monitored. In Scenario 2, the average startup time of Reflectall is 486 milliseconds, which is 23% more than the unmonitored situation (392 milliseconds). The Xposed framework-based method has an average startup time of up to 2078 milliseconds in Scenario 2. Compared with the unmonitored situation (449 milliseconds), the additional overhead is 368%. When generating more complex behavioral runtime models (Scenario 3 and Scenario 6), the additional overhead of Reflectall is only 27%, while the method based on the Xposed framework is as high as 477%.
图10(b)为闭源应用集上的实验结果。在不监测的场景下,相比于实现简单的开源应用,闭源应用的平均启动时间为936毫秒(Reflectall)和1010毫秒(Xposed)。而在场景2下,由于应用的复杂性,Reflectall有3个应用无响应,剩余的36个应用的平均启动时间为1601毫秒,相比不监测的场景(936毫秒)额外开销为71%。而场景4下有22个应用无响应,剩余的17个应用的平均启动时 间为4593毫秒,相比不监测的场景(1010毫秒),其额外开销为354%。而在生成更复杂的行为运行时模型时(场景3和场景6),Reflectall的额外开销为98%,而基于Xposed框架的方法高达470%。Figure 10(b) shows the experimental results on the closed source application set. In the non-monitoring scenario, compared to implementing simple open source applications, the average startup time of closed source applications is 936 milliseconds (Reflectall) and 1010 milliseconds (Xposed). In scenario 2, due to the complexity of the application, Reflectall has 3 applications that are unresponsive, and the average startup time of the remaining 36 applications is 1601 milliseconds, which is 71% more than the unmonitored scenario (936 milliseconds). In Scenario 4, 22 applications are unresponsive, and the average startup time of the remaining 17 applications is 4593 milliseconds. Compared with the unmonitored scenario (1010 milliseconds), the additional overhead is 355%. When generating more complex behavioral runtime models (Scenario 3 and Scenario 6), the additional overhead of Reflectall is 98%, while the method based on the Xposed framework is as high as 470%.
Reflectall生成行为运行时模型的性能在开源应用集上和闭源应用集的差异较大,这是由于闭源应用集中的应用实现更为复杂,因此生成的模型规模也更大,可能引起多次的垃圾回收和活动持久化过程,由此带来了更多的性能开销。而基于Xposed的方法在这两个应用集中开销接近,原因在于利用Xposed进行监测时,有22个应用无响应,占整个闭源应用集的57%。Reflectall的性能开销比基于Xposed框架的方法低的一大原因是基于Xposed框架的方法的所使用的编程语言是Java,而Reflectall所实现的行为解释器编程语言为C++。当应用的执行流程较复杂时,基于Xposed框架的方法对内存的分配和回收会比Reflectall更加频繁。通过上述实验表明,本发明Reflectall的这种实现方式可以处理更复杂的应用。The performance of Reflectall’s generated behavior runtime model is quite different from that of the open source application set and the closed source application set. This is because the implementation of the closed source application set is more complex, so the generated model scale is also larger, which may cause multiple The process of garbage collection and activity persistence, which brings more performance overhead. The Xposed-based method has similar costs in these two application sets. The reason is that when Xposed is used for monitoring, 22 applications are unresponsive, accounting for 57% of the entire closed-source application set. One of the main reasons why Reflectall’s performance overhead is lower than the Xposed framework-based method is that the Xposed framework-based method uses Java, and the behavior interpreter programming language implemented by Reflectall is C++. When the execution process of the application is more complicated, the memory allocation and recovery based on the Xposed framework will be more frequent than Reflectall. The above experiments show that this implementation of Reflectall of the present invention can handle more complex applications.
实验二:对比基于字节码重构的方法Experiment 2: Compare methods based on bytecode reconstruction
很多常用的Java库都用到了字节码重构框架。字节码重构的一个很重要的使用场景就是程序分析。例如,流行的bug定位工具FindBugs在底层就使用了ASM来分析字节码并定位漏洞。另外一个常见的使用场景就是利用字节码重构生成程序的代码覆盖率报告,例如Emma(Roubtsov,2005)、JCover(JCover,2017)。Reflectall所生成的精简模型可转换为代码覆盖率报告。本实验将对比Reflectall与Emma在生成代码覆盖报告上的区别。由于基于字节码重构的方法不适合应用于仅有应用安装包的闭源应用。因此,这部分的仅针对开源应用集进行。本实现仍以应用启动时间作为指标,在两台硬件配置相同的红米2A手机上,分别部署了Reflectall和未修改的安卓系统,安卓操作系统均为5.1.1。本实验在部署了Reflectall的手机上安装原版应用;在未经修改的安卓手机上安装经过Emma插桩过的应用,在以下3个不同的实验场景,比较Reflectall与Emma生成开源应用集上生成代码覆盖报告的性能。在每个场景下,每个应用启动10次,实验的结果如图11所示。Many commonly used Java libraries use bytecode reconstruction frameworks. A very important use scenario for bytecode reconstruction is program analysis. For example, the popular bug location tool FindBugs uses ASM at the bottom to analyze bytecode and locate vulnerabilities. Another common usage scenario is to use bytecode reconstruction to generate code coverage reports for programs, such as Emma (Roubtsov, 2005) and JCover (JCover, 2017). The condensed model generated by Reflectall can be converted into a code coverage report. This experiment will compare the difference between Reflectall and Emma in generating code coverage reports. Because the method based on bytecode reconstruction is not suitable for closed source applications that only have application installation packages. Therefore, this part is only for open source application sets. This implementation still uses the application startup time as an indicator. Reflectall and an unmodified Android system are deployed on two Redmi 2A phones with the same hardware configuration. The Android operating systems are both 5.1.1. In this experiment, install the original application on the mobile phone where Reflectall is deployed; install the application plugged in by Emma on the unmodified Android phone. In the following 3 different experimental scenarios, compare Reflectall and Emma to generate code on the open source application set Cover the performance of the report. In each scenario, each application was launched 10 times, and the results of the experiment are shown in Figure 11.
实验结果表明,Reflectall与Emma的应用平均启动时间接近,平均启动时间为442毫秒和455毫秒,额外开销分别为13%和16%。但是从生成的代码覆盖信息上,Reflectall要比Emma更丰富。表4给出了Reflectall在代码覆盖报告的监测粒度和部署运行的区别。Emma对块覆盖的报告可能存在不准确,并且不支持分支执行次数,而Reflectall基于行为解释器可以保证覆盖报告的准确性,同时也实现了对分支指令(如If-gt、Packed-Switch)的各个分支的执行次数的统计。另一区别是Emma需要进行一定配置,以对字节码进行重构,并且重构之后需要重打包;而Reflectall不需要这些配置,也不会改变移动应用的编译流程。因此,Reflectall比Emma这类基于字节码重构的工具更具易用性和实用性。Experimental results show that the average startup time of Reflectall and Emma applications is close, the average startup time is 442 milliseconds and 455 milliseconds, and the additional overhead is 13% and 16%, respectively. But from the code coverage information generated, Reflectall is more abundant than Emma. Table 4 shows the difference between Reflectall's monitoring granularity and deployment operation in the code coverage report. Emma's reporting of block coverage may be inaccurate, and does not support the number of branch executions, while Reflectall's behavioral interpreter can guarantee the accuracy of coverage reports, and it also implements branch instructions (such as If-gt, Packed-Switch) Statistics of the number of executions of each branch. Another difference is that Emma needs to be configured to reconstruct the bytecode, and repackaging is required after reconstruction; Reflectall does not require these configurations and will not change the compilation process of mobile applications. Therefore, Reflectall is more usable and practical than Emma-based tools such as bytecode reconstruction.
表4:Reflectall与Emma的监测粒度对比Table 4: Comparison of monitoring granularity between Reflectall and Emma
对比类别Comparison category EmmaEmma ReflectallReflectall
类覆盖Class coverage 支持stand by 支持stand by
方法覆盖Method coverage 支持stand by 支持stand by
块覆盖Block coverage 部分支持Partial support 支持stand by
分支覆盖Branch coverage 部分支持Partial support 支持stand by
行覆盖Line coverage 支持stand by 支持stand by
分支执行次数Number of branch executions 不支持not support 支持stand by
指令覆盖Instruction coverage 不支持not support 支持stand by
是否需要字节码Do you need bytecode 需要need 不需要Not needed
是否需要重打包Need to repack 需要need 不需要Not needed
综上,本发明提供的一种构造终端应用行为的运行时模型的方法,能将应用的执行可视为编程语言框架(例如解释器、虚拟机),根据应用的代码段,对内存进行读写操作。什么样的方法被执行,可对应为编程语言框架在栈上的操作;什么样的对象数据被修改,则可对应为编程语言框架对堆上的操作。实现了对终端应用应用行为的灵活、完整的监测,为后续实现对终端应用应用行为的指令级控 制提供了技术保障。依据该方法所设计的计算反射引擎,可作为一个单独的运行环境使用,也可被集成到各种主流的开发平台或商业软件中,给开发者提供对应用运行时监测的基本能力。In summary, the method of constructing a runtime model of terminal application behavior provided by the present invention can regard the execution of the application as a programming language framework (such as an interpreter, virtual machine), and read the memory according to the code segment of the application. Write operation. What kind of method is executed can correspond to the operation of the programming language framework on the stack; what kind of object data is modified can correspond to the operation of the programming language framework on the heap. It realizes flexible and complete monitoring of terminal application application behavior, and provides technical guarantee for subsequent realization of instruction-level control of terminal application application behavior. The computational reflection engine designed according to this method can be used as a stand-alone operating environment, and can also be integrated into various mainstream development platforms or commercial software to provide developers with the basic ability to monitor application runtimes.
上述实施例仅为优选实施例,并不用以限制本发明的保护范围,在本发明的精神和原则之内所作的任何修改、等同替换和改进等,均应包含在本发明的保护范围之内。The above-mentioned embodiments are only preferred embodiments, and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement and improvement made within the spirit and principle of the present invention shall be included in the scope of protection of the present invention. .
以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。本领域普通技术人员在不付出创造性的劳动的情况下,即可以理解并实施。The device embodiments described above are merely illustrative. The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in One place, or it can be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments. Those of ordinary skill in the art can understand and implement it without creative work.
本发明的各个部件实施例可以以硬件实现,或者以在一个或者多个处理器上运行的软件模块实现,或者以它们的组合实现。本领域的技术人员应当理解,可以在实践中使用微处理器或者数字信号处理器(DSP)来实现根据本发明实施例的计算处理设备中的一些或者全部部件的一些或者全部功能。本发明还可以实现为用于执行这里所描述的方法的一部分或者全部的设备或者装置程序(例如,计算机程序和计算机程序产品)。这样的实现本发明的程序可以存储在计算机可读介质上,或者可以具有一个或者多个信号的形式。这样的信号可以从因特网网站上下载得到,或者在载体信号上提供,或者以任何其他形式提供。The various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by their combination. Those skilled in the art should understand that a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the computing processing device according to the embodiments of the present invention. The present invention can also be implemented as a device or device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein. Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals. Such signals can be downloaded from Internet websites, or provided on carrier signals, or provided in any other form.
例如,图12示出了可以实现根据本发明的方法的计算处理设备。该计算处理设备传统上包括处理器1010和以存储器1020形式的计算机程序产品或者计算机可读介质。存储器1020可以是诸如闪存、EEPROM(电可擦除可编程只读存储器)、EPROM、硬盘或者ROM之类的电子存储器。存储器1020具有用于执行上述方法中的任何方法步骤的程序代码1031的存储空间1030。例如,用于程序代码的存储空间1030可以包括分别用于实现上面的方法中的各种步骤的各个程序代码1031。这些程序代码可以从一个或者多个计算机程序产品中读出或者写入到这一个或者多个计算机程序产品中。这些计算机程序产品包括诸如硬盘,紧致盘(CD)、存储卡或者软盘之类的程序代码载体。这样的计算机程序产品通常为如参考图13所述的便携式或者固定存储单元。该存储单元可以具有与图12的计算处理设备中的存储器1020类似布置的存储段、存储空间等。程序代码可以例如以适当形式进行压缩。通常,存储单元包括计算机可读代码1031’,即可以由例如诸如1010之类的处理器读取的代码,这些代码当由计算处理设备运行时,导致该计算处理设备执行上面所描述的方法中的各个步骤。For example, FIG. 12 shows a computing processing device that can implement the method according to the present invention. The computing processing device traditionally includes a processor 1010 and a computer program product in the form of a memory 1020 or a computer readable medium. The memory 1020 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM. The memory 1020 has a storage space 1030 for executing the program code 1031 of any method step in the above method. For example, the storage space 1030 for program codes may include various program codes 1031 for implementing various steps in the above method. These program codes can be read out from or written into one or more computer program products. These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks. Such a computer program product is usually a portable or fixed storage unit as described with reference to FIG. 13. The storage unit may have storage segments, storage spaces, etc., arranged similarly to the memory 1020 in the computing processing device of FIG. 12. The program code can be compressed in an appropriate form, for example. Generally, the storage unit includes computer-readable codes 1031', that is, codes that can be read by, for example, a processor such as 1010. These codes, when run by a computing processing device, cause the computing processing device to execute the method described above. The various steps.
本文中所称的“一个实施例”、“实施例”或者“一个或者多个实施例”意味着,结合实施例描述的特定特征、结构或者特性包括在本发明的至少一个实施例中。此外,请注意,这里“在一个实施例中”的词语例子不一定全指同一个实施例。The term "one embodiment", "an embodiment" or "one or more embodiments" referred to herein means that a specific feature, structure or characteristic described in conjunction with the embodiment is included in at least one embodiment of the present invention. In addition, please note that the word examples "in one embodiment" herein do not necessarily all refer to the same embodiment.
在此处所提供的说明书中,说明了大量具体细节。然而,能够理解,本发明的实施例可以在没有这些具体细节的情况下被实践。在一些实例中,并未详细示出公知的方法、结构和技术,以便不模糊对本说明书的理解。In the instructions provided here, a lot of specific details are explained. However, it can be understood that the embodiments of the present invention can be practiced without these specific details. In some instances, well-known methods, structures and technologies are not shown in detail, so as not to obscure the understanding of this specification.
在权利要求中,不应将位于括号之间的任何参考符号构造成对权利要求的限制。单词“包含”不排除存在未列在权利要求中的元件或步骤。位于元件之前的单词“一”或“一个”不排除存在多个这样的元件。本发明可以借助于包括有若干不同元件的硬件以及借助于适当编程的计算机来实现。在列举了若干装置的单元权利要求中,这些装置中的若干个可以是通过同一个硬件项来具体体现。单词第一、第二、以及第三等的使用不表示任何顺序。可将这些单词解释为名称。In the claims, any reference signs placed between parentheses should not be constructed as a limitation to the claims. The word "comprising" does not exclude the presence of elements or steps not listed in the claims. The word "a" or "an" preceding an element does not exclude the presence of multiple such elements. The invention can be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the unit claims enumerating several devices, several of these devices may be embodied by the same hardware item. The use of the words first, second, and third, etc. do not indicate any order. These words can be interpreted as names.
最后应说明的是:以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例所记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离本发明各实施例技术方案的精神和范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them; although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that: The technical solutions recorded in the foregoing embodiments are modified, or some of the technical features are equivalently replaced; these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (12)

  1. 一种构造终端应用行为的运行时模型的方法,其特征在于,所述运行时模型包括运行时栈模型和运行时堆模型,所述方法包括构造所述终端应用行为的运行时栈模型的步骤和构造所述终端应用行为的运行时堆模型的步骤;A method for constructing a runtime model of a terminal application behavior, characterized in that the runtime model includes a runtime stack model and a runtime heap model, and the method includes the step of constructing a runtime stack model of the terminal application behavior And the step of constructing a runtime heap model of the terminal application behavior;
    所述构造所述终端应用行为的运行时栈模型的步骤包括:The step of constructing the runtime stack model of the terminal application behavior includes:
    在所述终端应用运行时,获取所述终端应用的内存中真正执行的代码,并对所述真正执行的代码进行抽象,生成控制流图;When the terminal application is running, obtain the actually executed code in the memory of the terminal application, and abstract the actually executed code to generate a control flow graph;
    针对所述控制流图,将需要监测的控制流图输入至预设的行为解释器;For the control flow graph, input the control flow graph to be monitored into a preset behavior interpreter;
    利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动;Use the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generate stack activities when the terminal application is running;
    在所述终端应用运行时,生成所述栈活动的控制流间的依赖关系,得到所述终端应用行为的运行时栈模型;When the terminal application is running, generate the dependency between the control flows of the stack activity, and obtain the runtime stack model of the terminal application behavior;
    所述构造所述终端应用行为的运行时堆模型的步骤包括:The step of constructing the runtime heap model of the terminal application behavior includes:
    在所述终端应用运行时,生成堆区的初始状态;Generating the initial state of the heap area when the terminal application is running;
    生成堆操作活动,得到所述终端应用行为的运行时堆模型。A heap operation activity is generated, and a runtime heap model of the terminal application behavior is obtained.
  2. 如权利要求1所述的方法,其特征在于,所述方法包括类筛选器和活动类型筛选器;其中,所述类筛选器基于包和类名正则匹配的粗粒度筛选,用于去除开发人员不关心的程序活动;所述活动类型筛选器基于活动类型的细粒度筛选,用于去除与开发者不关心的活动类型。The method of claim 1, wherein the method comprises a class filter and an activity type filter; wherein, the class filter is based on a coarse-grained filter based on regular matching of package and class names, and is used to remove developers Program activities that are not concerned; the activity type filter is based on fine-grained filtering of activity types, and is used to remove activity types that are not concerned with the developer.
  3. 如权利要求2所述的方法,其特征在于,所述栈活动的活动类型包括方法开始与方法结束,字段读,数组读和同步指令;The method according to claim 2, wherein the activity type of the stack activity includes method start and method end, field read, array read and synchronization instructions;
    利用所述行为解释器对所述需要监测的控制流图进行解释执行,生成所述终端应用运行时的栈活动的步骤包括:The steps of using the behavior interpreter to interpret and execute the control flow graph that needs to be monitored, and generating the stack activity when the terminal application is running include:
    利用对所述终端应用的应用行为具有监测功能的行为解释器对所述需要监测的控制流图进行解释执行,获得所述终端应用运行时的活动;Interpret and execute the control flow graph that needs to be monitored by using a behavior interpreter that has a monitoring function for the application behavior of the terminal application, and obtain the activities of the terminal application when the terminal application is running;
    根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的栈活动;According to the class of interest, use the class filter to perform coarse-grained screening of the activities of the terminal application during runtime, and generate stack activities caused by the class;
    针对所述栈活动的活动类型,利用所述活动类型筛选器对所述栈活动进行细粒度筛 选。For the activity type of the stack activity, the activity type filter is used to perform fine-grained screening of the stack activity.
  4. 如权利要求1所述的方法,其特征在于,所述构造所述终端应用行为的运行时堆模型的步骤包括:The method according to claim 1, wherein the step of constructing a runtime heap model of the terminal application behavior comprises:
    所述终端应用运行时的活动包括实例化活动、修改活动和回收活动。The activities when the terminal application is running include instantiation activities, modification activities and recycling activities.
  5. 如权利要求2所述的方法,其特征在于,所述堆操作活动的活动类型包括对象实例化,数组实例化,对象字段写,数组元素写,清除活动和压缩活动;The method according to claim 2, wherein the activity types of the heap operation activity include object instantiation, array instantiation, object field writing, array element writing, clearing activity and compression activity;
    所述生成堆操作活动的步骤包括:The steps of generating a heap operation activity include:
    根据所关注的类,利用所述类筛选器对所述终端应用运行时的活动进行粗粒度筛选,生成由所述类引起的堆操作活动;According to the class of interest, use the class filter to perform coarse-grained screening of the terminal application runtime activities to generate heap operation activities caused by the class;
    针对所述堆操作活动的活动类型,利用所述活动类型筛选器对所述堆操作活动进行细粒度筛选。For the activity type of the heap operation activity, the activity type filter is used to perform fine-grained screening of the heap operation activity.
  6. 如权利要求1所述的方法,其特征在于,所述依赖关系包括同步依赖和通信依赖。The method according to claim 1, wherein the dependency relationship includes synchronization dependency and communication dependency.
  7. 如权利要求6所述的方法,其特征在于,在生成控制流间的同步依赖关系时,对于一个方法的结束依赖另一个方法的结束的情况,利用时间戳由后往前寻找其他线程中可匹配的活动,如果找到,则对应于一个同步依赖关系;对于一个活动的开始依赖另一个活动的结束的情况,先对当前线程进行检查,如果该活动为当前线程中的第一个执行的活动,则该活动是依赖另一个线程结束活动的,否则该活动仅是正常的方法调用,并不依赖另一个线程的活动。The method of claim 6, wherein when the synchronization dependency between the control flows is generated, for the case where the end of one method depends on the end of another method, the timestamp is used to find other threads from back to front. The matching activity, if found, corresponds to a synchronization dependency; for the case where the start of one activity depends on the end of another activity, the current thread is checked first, if the activity is the first activity executed in the current thread , The activity is dependent on another thread to end the activity, otherwise the activity is just a normal method call and does not depend on the activity of another thread.
  8. 如权利要求6所述的方法,其特征在于,在生成所述栈活动的控制流间的依赖关系时,对所有与活动间通信依赖相关的类进行总结,并将所述类的相关方法与线程依赖相关的方法一起作为生成通信依赖的知识库。The method according to claim 6, wherein when generating the dependency between the control flow of the stack activity, all the classes related to the communication dependency between the activities are summarized, and the related methods of the class are compared with Thread dependency related methods together serve as a knowledge base for generating communication dependencies.
  9. 如权利要求1所述的方法,其特征在于,在所述运行时模型生成时,将所述运行时模型中的活动序列存放于可配置大小的缓冲区中,当活动数量超过预设时,则将所述缓冲区的活动进行序列化,并持久化到本地存储中。The method according to claim 1, wherein when the runtime model is generated, the sequence of activities in the runtime model is stored in a buffer with a configurable size, and when the number of activities exceeds a preset value, Then the activities of the buffer are serialized and persisted to local storage.
  10. 如权利要求1所述的方法,其特征在于,所述运行时堆模型以巴科斯范式的形式表示。The method of claim 1, wherein the runtime heap model is expressed in the form of Backus paradigm.
  11. 一种计算机程序,包括计算机可读代码,当所述计算机可读代码在计算处理设备上运行时,导致所述计算处理设备执行根据权利要求1-10中的任一个所述的语音处理方法。A computer program comprising computer readable code, which when run on a computing processing device, causes the computing processing device to execute the voice processing method according to any one of claims 1-10.
  12. 一种计算机可读介质,其中存储了如权利要求11所述的计算机程序。A computer readable medium in which the computer program according to claim 11 is stored.
PCT/CN2019/119272 2019-06-10 2019-11-18 Method for constructing runtime model of terminal application behavior WO2020248512A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910498727.X 2019-06-10
CN201910498727.XA CN110347448B (en) 2019-06-10 2019-06-10 Method for constructing runtime model of terminal application behavior

Publications (1)

Publication Number Publication Date
WO2020248512A1 true WO2020248512A1 (en) 2020-12-17

Family

ID=68181744

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/119272 WO2020248512A1 (en) 2019-06-10 2019-11-18 Method for constructing runtime model of terminal application behavior

Country Status (2)

Country Link
CN (1) CN110347448B (en)
WO (1) WO2020248512A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110347448B (en) * 2019-06-10 2021-02-12 北京大学 Method for constructing runtime model of terminal application behavior
CN113971124A (en) * 2020-07-24 2022-01-25 腾讯科技(深圳)有限公司 Debugging method and device of sub-application, computer equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104636256A (en) * 2015-02-17 2015-05-20 中国农业银行股份有限公司 Memory access abnormity detecting method and memory access abnormity detecting device
US10157055B2 (en) * 2016-09-29 2018-12-18 Microsoft Technology Licensing, Llc Code refactoring mechanism for asynchronous code optimization using topological sorting
CN109189469A (en) * 2018-06-22 2019-01-11 北京大学 Android application micro services method and system based on reflection
CN109189374A (en) * 2018-06-22 2019-01-11 北京大学 Object formation code generating method and system based on object reference chain
CN109240666A (en) * 2018-06-22 2019-01-18 北京大学 Function call code generating method and system based on call stack and independent path
CN110347448A (en) * 2019-06-10 2019-10-18 北京大学 A method of the model when operation of construction terminal applies behavior
CN110362363A (en) * 2019-06-10 2019-10-22 北京大学 A method of model realization controls terminal applies when based on operation
CN110362301A (en) * 2019-06-10 2019-10-22 北京大学 A kind of processing method of terminal applies behavior reflex

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010147010A1 (en) * 2009-06-17 2010-12-23 日本電気株式会社 Module classification analysis system, module classification analysis method, and module classification analysis program
CN103473400B (en) * 2013-08-27 2016-12-28 北京航空航天大学 Software FMEA method based on level dependence modeling

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104636256A (en) * 2015-02-17 2015-05-20 中国农业银行股份有限公司 Memory access abnormity detecting method and memory access abnormity detecting device
US10157055B2 (en) * 2016-09-29 2018-12-18 Microsoft Technology Licensing, Llc Code refactoring mechanism for asynchronous code optimization using topological sorting
CN109189469A (en) * 2018-06-22 2019-01-11 北京大学 Android application micro services method and system based on reflection
CN109189374A (en) * 2018-06-22 2019-01-11 北京大学 Object formation code generating method and system based on object reference chain
CN109240666A (en) * 2018-06-22 2019-01-18 北京大学 Function call code generating method and system based on call stack and independent path
CN110347448A (en) * 2019-06-10 2019-10-18 北京大学 A method of the model when operation of construction terminal applies behavior
CN110362363A (en) * 2019-06-10 2019-10-22 北京大学 A method of model realization controls terminal applies when based on operation
CN110362301A (en) * 2019-06-10 2019-10-22 北京大学 A kind of processing method of terminal applies behavior reflex

Also Published As

Publication number Publication date
CN110347448B (en) 2021-02-12
CN110347448A (en) 2019-10-18

Similar Documents

Publication Publication Date Title
WO2020248510A1 (en) Terminal application behaviour reflection processing method
CN110362363B (en) Method for realizing terminal application control based on runtime model
Fang et al. Interruptible tasks: Treating memory pressure as interrupts for highly scalable data-parallel programs
US10635570B2 (en) Memory leak profiling events
CN101322106B (en) Method and apparatus for hardware-based dynamic escape detection in managed run-time environments
TWI539280B (en) Method for analyzing application not specifically designed to provide memory allocation informaion and extracting memory allocation information, and computer system and computer-readable storage medium thereof
US20120197944A1 (en) System and method for determining an object&#39;s lifetime in an object oriented environment
CN101416214A (en) Abstract execution model for a continuation-based meta-runtime
Pellegrini et al. Autonomic state management for optimistic simulation platforms
CN110325969A (en) Multistage garbage collector
Jin et al. Mapreduce programming model for. net-based cloud computing
US11188364B1 (en) Compilation strategy for a sharable application snapshot
US11321090B2 (en) Serializing and/or deserializing programs with serializable state
CN110291508A (en) Garbage collector
WO2020248512A1 (en) Method for constructing runtime model of terminal application behavior
Atkinson et al. A review of the rationale and architectures of PJama-a durable, flexible, evolvable and scalable orthogonally persistent programming platform.
CN112214388A (en) Memory monitoring method, device, equipment and computer readable storage medium
CN111797025B (en) Application-oriented data processing method and device
US8490115B2 (en) Ambient state for asynchronous methods
CN104778087B (en) A kind of information processing method and information processing unit
Simão et al. A checkpointing‐enabled and resource‐aware Java Virtual Machine for efficient and robust e‐Science applications in grid environments
Dubrulle et al. A low-overhead dedicated execution support for stream applications on shared-memory CMP
Wang Thin serverless functions with graalvm native image
US11720348B2 (en) Computing node allocation based on build process specifications in continuous integration environments
Daoud et al. Multilevel analysis of the java virtual machine based on kernel and userspace traces

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19932447

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19932447

Country of ref document: EP

Kind code of ref document: A1