CN106294169A - Data race detection and replay method based on a symbolic execution virtual machine - Google Patents
- Publication number
- CN106294169A CN201610679571.1A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Prevention of errors by analysis, debugging or testing of software
- G06F11/362—Debugging of software
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for detecting and replaying data races in parallel programs in an embedded environment. To meet the needs of software debugging and testing, it uses a symbolic execution virtual machine to dynamically monitor a running program, collect its execution information, and analyze data races. The collected execution information also allows the program to be replayed deterministically, so that its execution trace can be reproduced. The functions comprise data race detection and data race replay, both based on the symbolic execution virtual machine. The invention can uncover data races hidden in a program and so help prevent multithreaded programs from failing at run time.
Description
Technical field
The invention belongs to the technical field of software debugging and, more specifically, relates to a data race detection and replay method based on a symbolic execution virtual machine.
Background
As software grows in scale, software testing and program error detection become increasingly important. Existing software testing and error detection methods fall into two groups: static and dynamic. Static methods do not run the program. They analyze information derived from the source code, such as the call graph, data flow graph, control flow graph, and execution analysis, and compare it with an established error model to find code fragments that may contain errors. Dynamic methods run the program and analyze its execution trace to detect the errors it contains.
The mainstream dynamic detection tools at present include Valgrind, developed by Nicholas Nethercote, and Pin, developed by Intel. Both are dynamic binary instrumentation frameworks: the executable program is decompiled into intermediate code, the instrumentation is applied to that intermediate code, and an internal virtual machine then executes the code.
However, existing dynamic detection methods have several problems. First, Valgrind and Pin are general-purpose instrumentation frameworks that translate binary code into intermediate code; they cannot by themselves identify which data is shared between threads, so data race checking monitors a large amount of redundant memory, which lowers detection efficiency. Second, Valgrind implements the "shadow value" technique internally, so even a lightweight detection tool built on the framework incurs considerable overhead. Finally, detection under the Pin framework is implemented through the specific interfaces it provides; the functions of these interfaces are relatively fixed, which makes it hard to meet user-defined requirements.
Summary of the invention
In view of the above defects of, or needs for improvement in, the prior art, the invention provides a data race detection and replay method based on a symbolic execution virtual machine. Its purpose is to solve the technical problems of existing methods: low detection efficiency caused by monitoring too much redundant memory, high overhead of the detection process, and poor extensibility.
To achieve the above object, according to one aspect of the invention, a data race detection and replay method based on a symbolic execution virtual machine is provided, comprising the following steps:
(1) Receive a program execution trace run request from the host side and determine whether it is a replay request or a detection request. If it is a replay request, go to step (2); otherwise go to step (5);
(2) Determine whether the record file corresponding to the replay request exists on disk. If it does, go to step (3); otherwise the process ends;
(3) Obtain the program execution information file from the record file. Fetch the global-variable read/write instructions and/or multithreading function call instructions in the symbolic execution virtual machine and call the detection function on them, so as to load the constraints between thread instructions stored in the record file, and determine from those constraints whether the global-variable read/write instruction and/or multithreading function call instruction can be executed. If it can, go to step (4); otherwise the process ends;
(4) Execute the global-variable read/write instruction and/or multithreading function call instruction; the process ends;
(5) Execute the multithreaded program corresponding to the detection request in the symbolic execution virtual machine, monitor the accesses that its threads make to the program's global variables, record the order in which the threads access each global variable, and write this ordering information to a binary file to generate the program execution information file;
(6) Obtain the instruction information in the symbolic execution virtual machine, apply the data race detection algorithm to the global-variable reads and writes in that information, and use the multithreading function call interception facility to handle the multithreading functions it calls, in order to determine whether any global-variable read or write is involved in a data race. If so, report the data race to the client. If program execution has not finished, fetch the next instruction information stored in the container of the symbolic execution virtual machine and return to step (5).
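The top-level control flow of steps (1) to (6) can be sketched as a small dispatcher. This is only an illustration under assumptions: the request layout (a dict with `mode` and `record_path` keys) and the callback names `run_replay` and `run_detection` are hypothetical and do not come from the patent.

```python
import os


def handle_request(request, run_replay, run_detection):
    """Dispatch one host-side request, following steps (1) to (6).

    `request` is a hypothetical dict with two keys:
      "mode"        -> "replay" or anything else (treated as detection)
      "record_path" -> path of the record file
    `run_replay` and `run_detection` are hypothetical callbacks standing in
    for steps (3)-(4) and steps (5)-(6) respectively.
    """
    if request["mode"] == "replay":                      # step (1)
        if not os.path.exists(request["record_path"]):   # step (2)
            return "no-record"                           # process ends
        return run_replay(request["record_path"])        # steps (3)-(4)
    return run_detection(request["record_path"])         # steps (5)-(6)
```

A replay request for a missing record file ends immediately, which mirrors the "otherwise the process ends" branch of step (2).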
Preferably, before step (1), the method further comprises extending the symbolic execution virtual machine with multithreading support, producing a symbolic execution virtual machine that can intercept multithreading function calls and emulate multiple threads internally.
Preferably, the multithreading extension is carried out as follows: first, an implementation framework for multithreading emulation is built from function pointers and structures; then the synchronization functions, including thread creation, thread waiting, locking, and unlocking, are rewritten.
Preferably, step (3) comprises the following sub-steps:
(3-1) Read the program execution information file and load the records it contains into memory;
(3-2) Following the execution trace of the symbolic execution virtual machine, repeatedly select the next instruction of the multithreaded program corresponding to the request, and determine whether it is a read instruction, a write instruction, or a function call instruction. If so, go to sub-step (3-3); otherwise continue executing the instruction along the virtual machine's execution trace;
(3-3) Determine whether the memory address accessed by the instruction is a global variable. If it is, go to sub-step (3-4). Otherwise, if the instruction calls a lock or unlock function among the multithreading functions, also go to sub-step (3-4); in all other cases continue executing the instruction along the virtual machine's execution trace;
(3-4) Using the recorded information loaded into memory, determine whether the current instruction has an outstanding dependency. If it has none, go to sub-step (3-5); otherwise fetch the next instruction in the symbolic execution virtual machine and return to sub-step (3-2);
(3-5) Execute the current instruction, release the inter-instruction dependencies on the instruction just executed, thereby updating the recorded information in memory, and fetch the next instruction in the symbolic execution virtual machine. While the program has not finished, return to sub-step (3-2); repeat until the program ends.
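Sub-steps (3-1) to (3-5) amount to a scheduler that runs an instruction only once every recorded predecessor has run. The sketch below illustrates that idea on a toy model; the data layout (`threads` as lists of instruction ids, `deps` as a map from an instruction to the set of instructions that must precede it) is an assumption made for the example, not the patent's record format.

```python
def replay(threads, deps):
    """Replay instructions so that every recorded dependency is respected.

    threads: {thread id: [instruction id, ...]} in per-thread program order.
    deps:    {instruction id: set of instruction ids that must run first}.
    Returns the global order in which instructions were executed.
    """
    # (3-1) load the recorded constraints into memory (copied so we can mutate)
    pending = {instr: set(before) for instr, before in deps.items()}
    cursor = {tid: 0 for tid in threads}
    total = sum(len(instrs) for instrs in threads.values())
    order = []
    while len(order) < total:
        progressed = False
        for tid, instrs in threads.items():       # (3-2) select the next instruction
            if cursor[tid] == len(instrs):
                continue                          # this thread has finished
            instr = instrs[cursor[tid]]
            if pending.get(instr):                # (3-4) unmet dependency: try another thread
                continue
            order.append(instr)                   # (3-5) execute the instruction
            cursor[tid] += 1
            for before in pending.values():       # release dependencies on it
                before.discard(instr)
            progressed = True
        if not progressed:
            raise RuntimeError("recorded constraints cannot be satisfied")
    return order
```

For example, with `threads = {1: ["a1", "a2"], 2: ["b1"]}` and `deps = {"a1": {"b1"}}`, thread 1 is held back until thread 2 has executed `b1`, so the recorded interleaving is reproduced regardless of which thread is examined first.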
Preferably, step (5) comprises the following sub-steps:
(5-1) Inspect the instruction being executed by the multithreaded program corresponding to the request and determine whether it is a read or write instruction. If so, go to sub-step (5-2). Otherwise, if it is a lock or unlock instruction, execute it and then go to sub-step (5-2); in all other cases continue executing the instruction along the virtual machine's execution trace;
(5-2) Determine whether the memory address accessed by the instruction is the address of a global variable. If it is, obtain that address, establish the mapping between the global variable's address and its number, and obtain the information about the thread's access to the global variable, including the thread number and the timestamp vector; then go to sub-step (5-3) for a write instruction or sub-step (5-4) for a read instruction. If it is not, continue executing the instruction along the virtual machine's execution trace;
(5-3) Using the number mapped to the global variable's address, read the set of all read access records for that variable and traverse it, processing each read access record as follows: compare the timestamp value that the record holds for its own thread number with the timestamp value that the current access holds for the same thread number. If the former is less than or equal to the latter, do nothing; otherwise the two operations have an ordering relationship, which is recorded. Finally, merge the timestamp vectors of all the read access records into the timestamp vector of the current access;
(5-4) Using the number mapped to the global variable's address, read the set of all write access records for that variable and process each write access record as follows: compare the timestamp value that the record holds for its own thread number with the timestamp value that the current access holds for the same thread number. If the former is less than or equal to the latter, do nothing; otherwise the two operations have an ordering relationship, which is recorded. Finally, merge the timestamp vectors of all the write access records into the timestamp vector of the current access;
(5-5) Write the set of all read and write access records to the record file, completing the recording of the execution trace.
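The recording logic of sub-steps (5-2) to (5-4) can be illustrated with timestamp vectors stored as dictionaries. The class below is a minimal sketch under assumptions: the names `Recorder`, `access`, and `edges` are invented for the example, and the final write to a binary record file in sub-step (5-5) is left out.

```python
class Recorder:
    """Toy recorder for the ordering of conflicting global-variable accesses."""

    def __init__(self):
        self.var_ids = {}   # global address -> small integer number, as in (5-2)
        self.reads = {}     # variable number -> [(thread id, clock snapshot)]
        self.writes = {}    # variable number -> [(thread id, clock snapshot)]
        self.edges = []     # recorded "earlier access -> later access" orderings

    def access(self, tid, clock, addr, is_write):
        """Record one access. `clock` is the accessing thread's timestamp
        vector (dict: thread id -> int) and is updated in place by the merge."""
        vid = self.var_ids.setdefault(addr, len(self.var_ids))
        # A write is checked against earlier reads (5-3),
        # a read against earlier writes (5-4).
        earlier = self.reads if is_write else self.writes
        for other_tid, other_clock in earlier.get(vid, []):
            # "Less than or equal" means the order is already implied: do nothing.
            if other_clock[other_tid] > clock.get(other_tid, 0):
                self.edges.append(((other_tid, other_clock[other_tid]),
                                   (tid, clock[tid])))
            # Merge the earlier record's vector into the current one.
            for t, value in other_clock.items():
                clock[t] = max(clock.get(t, 0), value)
        own = self.writes if is_write else self.reads
        own.setdefault(vid, []).append((tid, dict(clock)))
```

Because the merge raises the current thread's vector to cover every earlier conflicting access, a second access by the same thread to the same variable finds the order already implied and records no new edge.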
Preferably, step (6) comprises the following sub-steps:
(6-1) Execute the multithreaded program corresponding to the request to obtain the instruction being executed;
(6-2) Determine the instruction type through the conditional branch structure in the symbolic execution virtual machine. If it is a read instruction (load) or a write instruction (store), go to sub-step (6-3); if it is a function call instruction, go to sub-step (6-7); for any other instruction, continue executing it along the virtual machine's execution trace and return to sub-step (6-1);
(6-3) Determine whether the memory address accessed by the instruction is a global variable. If so, go to sub-step (6-4); otherwise continue executing the instruction along the virtual machine's execution trace and return to sub-step (6-1);
(6-4) Determine whether the instruction is a write access. For a write access, call the write handler callStore and go to sub-step (6-5); for a read access, call the read handler callLoad and go to sub-step (6-6);
(6-5) Using the global variable's memory address addr, check whether this is the first write access to the variable. If it is, store the number of the accessing thread and that thread's timestamp in the write record and return to sub-step (6-1). Otherwise, retrieve the write record for address addr, including the thread number and timestamp vector. Compare the current thread's timestamp value, as recorded in the current thread's timestamp vector, with the timestamp value in the write record. If the former is smaller than the latter, a data race has occurred, and a write-write data race on the global variable is reported to the client; otherwise do nothing. Next, check the flag that indicates whether concurrent read accesses occurred between the two write accesses. If concurrent reads exist, compare the current thread's timestamp value recorded for this write access with the current thread's timestamp value held in the timestamp vector of every concurrent read access record stored between the two writes. If the less-than relationship holds for any of them, a data race has occurred, and a read-write data race on the global variable is reported to the client. In either case, clear all the stored concurrent read access records, update the write access record, and return to sub-step (6-1);
(6-6) Using the global variable's memory address, check whether this is the first read access to the variable. If it is, store the thread number and the thread's timestamp value in the read record and return to sub-step (6-1). If it is not, retrieve the write record for the variable's memory address, including the thread number and timestamp vector, and compare the current thread's timestamp value, as recorded in the current thread's timestamp vector, with the recorded timestamp value. If the less-than relationship holds, a data race has occurred and is reported to the client; otherwise do nothing. Next, check the flag that indicates whether concurrent read accesses exist between the two write accesses. If they do, add this read access directly to the concurrent read access records for address addr. Otherwise, retrieve the read access record, including the thread number and timestamp vector, and compare the current thread's timestamp value, as recorded in the current thread's timestamp vector, with the recorded timestamp value; if the less-than relationship holds, set the concurrent read access flag and add this read access to the concurrent read access records for address addr. Then update the read access record for address addr and return to sub-step (6-1);
(6-7) Check by string matching whether the function called by the function call instruction is a multithreading function. If so, go to sub-step (6-8); otherwise continue executing the instruction along the virtual machine's execution trace;
(6-8) At the function call case of the conditional branch structure in the symbolic execution virtual machine, identify the name of the called function by string matching and obtain its arguments. From the name and arguments, determine whether the called function is a thread creation, thread waiting, locking, or unlocking function: for thread creation go to sub-step (6-9); for thread waiting, go to sub-step (6-10); for locking, go to sub-step (6-11); for unlocking, go to sub-step (6-12);
(6-9) Call the thread creation function to create the child thread. The child thread initializes its own timestamp vector and merges it with the parent thread's timestamp vector, and the parent thread's timestamp value is incremented by 1;
(6-10) Call the thread waiting function, merging the timestamp vector of the waiting thread with that of the thread being waited for;
(6-11) Call the locking function and, once the lock has been acquired, merge the thread's timestamp vector with the lock object's timestamp vector;
(6-12) Call the unlocking function and, once the lock has been released, assign the thread's timestamp vector to the lock object's timestamp vector.
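The timestamp-vector updates in sub-steps (6-9) to (6-12) follow the usual vector clock pattern, where "merging" two vectors means taking the element-wise maximum. The sketch below shows those four updates; the class and method names are hypothetical, and the main thread is assumed to have id 0 with an initial timestamp of 1.

```python
def merge(a, b):
    """Element-wise maximum of two timestamp vectors (dicts: thread id -> int)."""
    return {t: max(a.get(t, 0), b.get(t, 0)) for t in a.keys() | b.keys()}


class Clocks:
    """Timestamp vectors for threads and lock objects."""

    def __init__(self):
        self.thread = {0: {0: 1}}   # assumption: main thread is id 0
        self.lock = {}              # lock id -> timestamp vector

    def create(self, parent, child):    # (6-9)
        # The child starts at 1 for itself, then merges in the parent's vector.
        self.thread[child] = merge({child: 1}, self.thread[parent])
        # The parent's own timestamp value is incremented by 1.
        self.thread[parent][parent] += 1

    def join(self, waiter, target):     # (6-10)
        self.thread[waiter] = merge(self.thread[waiter], self.thread[target])

    def acquire(self, tid, lock_id):    # (6-11)
        self.thread[tid] = merge(self.thread[tid], self.lock.get(lock_id, {}))

    def release(self, tid, lock_id):    # (6-12)
        # The lock object takes a copy of the releasing thread's vector.
        self.lock[lock_id] = dict(self.thread[tid])
```

The sketch follows the text literally on release: the lock receives the thread's vector and the thread's own timestamp is left unchanged. Many published vector clock detectors also increment the releasing thread's timestamp at that point; the patent text does not state this, so it is not done here.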
In general, compared with the prior art, the above technical solutions conceived by the invention achieve the following beneficial effects:
(1) High detection efficiency: the data race detection in step (6) monitors only global-variable read/write accesses and synchronization function calls, and keeps only the most recent write access record. When checking for read-write and write-write data races on a global variable, it is enough to compare the timestamp value of the current read or write with that of the last write access record. This reduces the time complexity of a check from linear to constant, which improves detection efficiency.
(2) Low detection overhead: sub-step (5-2) of step (5) establishes the mapping between a global variable's address and its number only once the accessed memory address has been identified as a global variable address. During detection, only the access records of global variables are stored and only accesses to global variables are checked for data races, which reduces detection overhead.
(3) Good extensibility: step (3) uses a symbolic execution virtual machine whose multithreading extension rewrites the synchronization functions for thread creation, thread waiting, locking, and unlocking. Further synchronization functions can be added and checked in the same way, so the method is highly extensible.
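The constant-time check claimed in effect (1) can be illustrated as a comparison against a single stored write record per address. The sketch below is one reading of that idea, not the patent's exact procedure: the text is ambiguous about which vector entry is compared, so the choice made here (the current thread's entry for the last writer against the writer's timestamp at the time of the write) is an assumption, as are all names. The concurrent read records described in sub-steps (6-5) and (6-6) are omitted.

```python
class LastWriteDetector:
    """Keeps only the most recent write per address and checks against it."""

    def __init__(self):
        # addr -> (writer thread id, writer's own timestamp value at the write)
        self.last_write = {}

    def _races_with_last_write(self, addr, clock):
        record = self.last_write.get(addr)
        if record is None:
            return False                 # first access: nothing to compare
        writer, write_time = record
        # A race is flagged when the current thread has not yet observed
        # the last write. One comparison, independent of history length.
        return clock.get(writer, 0) < write_time

    def on_write(self, addr, tid, clock):
        """Return True if this write races with the last recorded write."""
        race = self._races_with_last_write(addr, clock)
        self.last_write[addr] = (tid, clock[tid])   # update the write record
        return race

    def on_read(self, addr, tid, clock):
        """Return True if this read races with the last recorded write."""
        return self._races_with_last_write(addr, clock)
```

Each check costs one dictionary lookup and one integer comparison, however many accesses came before, which is the constant-time behavior the effect describes.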
Description of drawings
Fig. 1 is the overall framework diagram of the data race detection and replay method based on a symbolic execution virtual machine according to the invention.
Fig. 2 is a flowchart of the data race detection and replay method based on a symbolic execution virtual machine according to the invention.
Fig. 3 is a flowchart of data race detection in the method of the invention.
Fig. 4 is a flowchart of recording the execution trace in the method of the invention.
Fig. 5 is a flowchart of data race replay in the method of the invention.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.
如图1所示,本发明基于符号执行虚拟机的数据竞争检测与重放方法 所基于的整体框架是由宿主机端和目标机端构成。宿主机端运行图形化界面,通过网口发送数据竞争检测的相关命令到目标机端,并处理从目标机发送过来的数据,进行相关的处理筛选,最终显示给用户;目标机端接收到有宿主机段发送过来的命令,执行数据竞争的检测或者重放,并将执行信息与检测结果通过网口发送给宿主机端。从目标机端接收到数据竞争的检测结果以及程序的执行信息之后,通过对信息的解析和分类,将数据竞争的检测结果显示在面板上,将程序的执行轨迹绘制在另外一块面板上,供用户查看。As shown in Figure 1, the overall framework based on the data race detection and replay method based on the symbolic execution virtual machine of the present invention is composed of a host machine end and a target machine end. The host machine runs a graphical interface, sends relevant commands for data competition detection to the target machine through the network port, and processes the data sent from the target machine, performs relevant processing and screening, and finally displays it to the user; The command sent by the host machine performs data race detection or replay, and sends the execution information and detection results to the host machine through the network port. After receiving the detection result of data competition and the execution information of the program from the target machine, the detection result of data competition is displayed on the panel by analyzing and classifying the information, and the execution track of the program is drawn on another panel for the user view.
如图2所示,本发明基于符号执行虚拟机的数据竞争检测与重放方法包括以下步骤:As shown in Figure 2, the data race detection and replay method based on the symbolic execution virtual machine of the present invention includes the following steps:
(1)接收来自宿主机端的程序执行轨迹运行请求,并判断该请求是重放请求,还是检测请求,如果是重放请求,则进入步骤(2),否则进入步骤(5);具体而言,如果请求中的标识位是replay,则该请求是重放请求,否则是检测请求;(1) Receive the program execution track operation request from the host computer, and judge whether the request is a replay request or a detection request, if it is a replay request, then enter step (2), otherwise enter step (5); specifically , if the flag in the request is replay, the request is a replay request, otherwise it is a detection request;
(2)判断该重放请求对应的记录文件是否存在于磁盘中,如果存在则转入步骤(3),否则过程结束;(2) judge whether the recording file corresponding to this replay request exists in the disk, if exist then go to step (3), otherwise process ends;
(3)获取记录文件中的程序执行信息文件,通过获取符号执行虚拟机(symbolicvirtual machine)中的全局变量读写指令和/或多线程函数调用指令调用检测函数,以加载记录文件中线程指令之间的约束条件,并根据约束条件判断全局变量读写指令和/或多线程函数调用指令是否能够被执行,如果能则执行步骤(4),否则过程结束;在本发明中,使用的符号执行虚拟机是KLEE;(3) Obtain the program execution information file in the record file, and execute the global variable read and write instructions and/or multi-thread function call instructions in the symbolic virtual machine (symbolic virtual machine) to call the detection function to load the thread instructions in the record file. Between constraints, and according to the constraints to determine whether the global variable read and write instructions and/or multi-threaded function call instructions can be executed, if it can then perform step (4), otherwise the process ends; in the present invention, the symbolic execution used The virtual machine is KLEE;
需要注意的是,在本发明的过程执行之前,需要对符号执行虚拟机进行多线程扩展,以生成具有多线程函数调用拦截和内部仿真多线程功能的符号执行虚拟机。It should be noted that, before the process of the present invention is executed, the symbolic execution virtual machine needs to be multi-threaded, so as to generate a symbolic execution virtual machine with functions of multi-threaded function call interception and internal emulation of multi-threading.
具体而言,对符号执行虚拟机执行多线程扩展的过程具体为:首先, 通过函数指针和结构体搭建多线程仿真的实现框架,然后针对同步函数(线程创建、线程等待、线程加锁和解锁处理)进行重写,其中:Specifically, the process of executing multi-thread extension on the symbolic execution virtual machine is as follows: firstly, build the implementation framework of multi-thread simulation through function pointers and structures, and then target synchronization functions (thread creation, thread waiting, thread locking and unlocking) processing) to rewrite, where:
针对线程创建函数,在符号执行虚拟机中每条指令作为一条执行路径,执行路径有独立的执行环境,并且都存放在容器中,可将每条执行路径视为一个线程,线程创建的过程首先获取指令信息、参数个数、函数地址、以及具体的参数信息,进而通过调用符号执行虚拟机内部函数创建新的线程,初始化线程号等信息,进而对每个线程的时间戳向量进行初始化,主要操作为本线程对应的时间戳值设置为1,其他线程的时间戳值设置为0,然后将子线程与父线程的时间戳向量进行合并,得到新的时间戳向量赋值给子线程。最后将创建的线程存放到符号执行虚拟机的容器中。For the thread creation function, each instruction in the symbolic execution virtual machine is used as an execution path. The execution path has an independent execution environment and is stored in the container. Each execution path can be regarded as a thread. The process of thread creation is first Obtain instruction information, number of parameters, function address, and specific parameter information, and then create a new thread by calling the symbol to execute the internal function of the virtual machine, initialize the thread number and other information, and then initialize the timestamp vector of each thread, mainly The operation is to set the timestamp value corresponding to this thread to 1, and set the timestamp value of other threads to 0, and then merge the timestamp vector of the child thread and the parent thread to obtain a new timestamp vector and assign it to the child thread. Finally, the created thread is stored in the container of the symbolic execution virtual machine.
For the thread wait function: the information of the awaited thread is fetched from the container that stores threads, and it is determined whether the awaited thread has finished. If it has not finished, the awaited thread is added to the calling thread's wait set, and the calling thread is woken when the awaited thread finishes. If it has finished, the thread set is scanned, the threads waiting for it are woken, and the calling thread performs the related exit cleanup.
For the thread lock function: first determine whether the lock is in the lock set. A lock that is not in the lock set is being accessed for the first time, so a lock object is created, the lock is set to the busy state, and the lock's holder is set to the current thread. For a lock in the idle state, the lock is set to the busy state and its holder is set to the current thread. For a lock in the busy state, the current thread is set to the blocked state and added to the lock's wait queue, and it acquires the lock after the lock is released.
For the thread unlock function: for a lock object in the busy state, an error is returned if the current thread does not hold the lock. If the lock's wait queue is not empty, the thread at the head of the queue is taken out, the lock's holder is set to that thread, and that thread's waiting state is reset to indicate that it is no longer blocked. If the wait queue is empty, the lock is set to the idle state and its holder is cleared.
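The lock and unlock emulation described above can be sketched as a small state machine. This is a minimal illustration, not the patented implementation: the names `EmulatedLock`, `LockTable`, `lock`, and `unlock` are hypothetical, and threads are represented only by integer identifiers.

```python
from collections import deque


class EmulatedLock:
    """Hypothetical lock object: holder is None while the lock is idle."""

    def __init__(self):
        self.holder = None
        self.waiters = deque()  # FIFO wait queue of blocked thread ids


class LockTable:
    """Hypothetical lock set plus the set of currently blocked threads."""

    def __init__(self):
        self.locks = {}
        self.blocked = set()

    def lock(self, lock_id, tid):
        # A lock missing from the lock set is created on first access (idle).
        lk = self.locks.setdefault(lock_id, EmulatedLock())
        if lk.holder is None:
            lk.holder = tid          # idle -> busy, current thread holds it
            return True
        self.blocked.add(tid)        # busy: block the caller and queue it
        lk.waiters.append(tid)
        return False

    def unlock(self, lock_id, tid):
        lk = self.locks.get(lock_id)
        if lk is None or lk.holder != tid:
            raise RuntimeError("unlock by a thread that does not hold the lock")
        if lk.waiters:
            nxt = lk.waiters.popleft()   # hand the lock to the head of the queue
            lk.holder = nxt
            self.blocked.discard(nxt)    # that thread is no longer blocked
        else:
            lk.holder = None             # empty queue: lock becomes idle


table = LockTable()
first = table.lock("m", 1)    # thread 1 acquires the lock
second = table.lock("m", 2)   # thread 2 blocks and is queued
table.unlock("m", 1)          # ownership passes to thread 2
```

Handing the lock directly to the head of the queue keeps the emulation deterministic, which matters when the same schedule has to be reproduced later.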
(4) Execute the global variable read/write instructions and/or multithreaded function instructions, and the process ends;
(5) Execute the multithreaded program corresponding to the detection request in the symbolic execution virtual machine, monitor the accesses that multiple threads make to the program's global variables, record the order in which the threads access the global variables, and write this ordering information to a binary file to generate the program execution information file;
(6) Obtain the instruction information in the symbolic execution virtual machine, apply the data race detection algorithm to the global variable reads and writes in the instruction information, and use the multithreaded function call interception capability to call the multithreaded functions in the instruction information, in order to determine whether the global variable reads and writes involve a data race. If they do, report the data race error information to the client. If program execution has not finished, obtain the next instruction information stored in the container of the symbolic execution virtual machine and go to step (5);
As shown in Figure 3, step (6) of the method of the present invention specifically includes the following sub-steps:
(6-1) Execute the multithreaded program corresponding to the request to obtain the instruction being executed;
(6-2) Determine the type of the instruction through the conditional branch structure in the symbolic execution virtual machine. If the instruction is a read instruction (load) or a write instruction (store), go to sub-step (6-3); if it is a function call instruction, go to sub-step (6-7); for any other instruction, continue executing it according to the execution trace of the symbolic execution virtual machine, then return to sub-step (6-1);
(6-3) Determine whether the memory address accessed by the instruction belongs to a global variable. If so, go to sub-step (6-4); otherwise continue executing the instruction according to the execution trace of the symbolic execution virtual machine, then return to sub-step (6-1);
(6-4) Determine whether the instruction is a write access. If it is a write access, call the write handler callStore and go to sub-step (6-5); if it is a read access, call the read handler callLoad and go to sub-step (6-6);
(6-5) Using the memory address of the global variable (denoted addr for convenience), check whether this is the first write access to the global variable. If so, store the number of the accessing thread and the thread's timestamp in the write record, then return to sub-step (6-1). Otherwise, fetch the write record for addr, including the thread number and the timestamp vector. Compare the current thread's timestamp value recorded in the current thread's timestamp vector with the timestamp value in the write record. If the former is less than the latter, a data race has occurred, and a write-write data race error on the global variable is reported to the client; otherwise nothing is done. Next, check the concurrent-read flag to determine whether concurrent read accesses occurred between the two write accesses. If they did, compare the current thread's timestamp value recorded for this write access with the current thread's timestamp value recorded in the timestamp vector of every concurrent read access record stored between the two writes. If the less-than relation holds for any of them, a data race has occurred, and a read-write data race error on the global variable is reported to the client. Whether or not a race is reported, clear all information in the concurrent read access records, update the write access record, and return to sub-step (6-1).
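The write handler of sub-step (6-5) can be sketched as follows. This is a sketch under stated assumptions rather than the patented code: `call_store`, `shadow`, and `report` are hypothetical names, and the comparison is read here in the usual happens-before sense, meaning a race is reported when the current thread's vector entry for the earlier accessor is smaller than the value that accessor recorded.

```python
def call_store(shadow, addr, tid, clock, report):
    """Hypothetical write handler.

    shadow: dict addr -> {"write": (tid, value) or None,
                          "read": (tid, value) or None,
                          "concurrent": [(tid, value), ...]}
    clock:  timestamp vector (list of int) of the writing thread
    report: list that collects detected races
    """
    entry = shadow.setdefault(
        addr, {"write": None, "read": None, "concurrent": []})

    # First write to this address: only store the write record.
    if entry["write"] is None:
        entry["write"] = (tid, clock[tid])
        return

    # Write-write check against the previous write record.
    w_tid, w_val = entry["write"]
    if w_tid != tid and clock[w_tid] < w_val:
        report.append(("write-write", addr, w_tid, tid))

    # Read-write check against every concurrent read since the last write.
    for r_tid, r_val in entry["concurrent"]:
        if r_tid != tid and clock[r_tid] < r_val:
            report.append(("read-write", addr, r_tid, tid))

    # In every case: clear the concurrent reads and update the write record.
    entry["concurrent"] = []
    entry["write"] = (tid, clock[tid])


shadow, report = {}, []
call_store(shadow, 0x10, 0, [1, 0], report)  # first write by thread 0
call_store(shadow, 0x10, 1, [0, 1], report)  # thread 1 never saw it: race
call_store(shadow, 0x20, 0, [2, 0], report)  # first write by thread 0
call_store(shadow, 0x20, 1, [2, 1], report)  # thread 1 is ordered after it
```

Only the first pair of writes is reported, because in the second pair thread 1's vector already contains thread 0's value.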
(6-6) Using the memory address of the global variable, check whether this is the first read access to the global variable. If it is, store the thread number and the thread's timestamp value in the read record, then return to sub-step (6-1). If it is not, fetch the write record for the memory address, including the thread number and the timestamp vector, and compare the current thread's timestamp value recorded in the current thread's timestamp vector with the recorded timestamp value. If the less-than relation holds, a data race has occurred and is reported to the client; otherwise nothing is done. Next, check the concurrent-read flag to determine whether concurrent read accesses occurred between the two write accesses. If they did, add this read access directly to the concurrent read access records for addr. Otherwise, fetch the read access record, including the thread number and the timestamp vector, and compare the current thread's timestamp value recorded in the current thread's timestamp vector with the recorded timestamp value; if the less-than relation holds, update the concurrent-read flag and add this read access to the concurrent read access records for addr. Then update the read access record for addr and return to sub-step (6-1);
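A possible shape for the read handler of sub-step (6-6) is sketched below. The names `call_load`, `shadow`, and `report` are hypothetical, the comparison is interpreted in the happens-before sense (the current thread's vector entry for the earlier accessor is smaller than the value recorded by that accessor), and keeping the earlier read alongside the new one in the concurrent list is a choice made for this sketch only.

```python
def call_load(shadow, addr, tid, clock, report):
    """Hypothetical read handler; shadow entries hold the last write record,
    the last read record, and the list of concurrent reads."""
    entry = shadow.setdefault(
        addr, {"write": None, "read": None, "concurrent": []})

    # First read of this address: only store the read record.
    if entry["read"] is None:
        entry["read"] = (tid, clock[tid])
        return

    # Check this read against the last write record.
    if entry["write"] is not None:
        w_tid, w_val = entry["write"]
        if w_tid != tid and clock[w_tid] < w_val:
            report.append(("write-read", addr, w_tid, tid))

    if entry["concurrent"]:
        # Concurrent reads already exist: just add this one.
        entry["concurrent"].append((tid, clock[tid]))
    else:
        r_tid, r_val = entry["read"]
        if r_tid != tid and clock[r_tid] < r_val:
            # Unordered with the previous read: start the concurrent list.
            # Keeping the earlier read too is an assumption of this sketch.
            entry["concurrent"] = [entry["read"], (tid, clock[tid])]

    entry["read"] = (tid, clock[tid])


# Thread 0 wrote address 0x10 when its own timestamp value was 1.
shadow = {0x10: {"write": (0, 1), "read": None, "concurrent": []}}
report = []
call_load(shadow, 0x10, 0, [1, 0, 0], report)  # first read, by the writer
call_load(shadow, 0x10, 1, [0, 1, 0], report)  # thread 1 never saw the write
call_load(shadow, 0x10, 2, [1, 1, 1], report)  # thread 2 is ordered after it
```

Reads never race with each other, so the concurrent list exists only so that a later write can be checked against every read that is not ordered before it.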
(6-7) Use string matching to check whether the function called by the function call instruction is a multithreaded function. If so, go to sub-step (6-8); otherwise continue executing the instruction according to the execution trace of the symbolic execution virtual machine;
(6-8) At the function call instruction of the conditional branch structure in the symbolic execution virtual machine, identify the name of the called function by string matching and obtain its parameters. From the name and parameters, determine whether the called function is a thread creation function, a thread wait function, a lock function, or an unlock function. For a thread creation function, go to sub-step (6-9); for a thread wait function, go to sub-step (6-10); for a lock function, go to sub-step (6-11); for an unlock function, go to sub-step (6-12);
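The name-based classification of sub-steps (6-7) and (6-8) amounts to a lookup table. The sketch below assumes the program under test uses POSIX threads; the source does not name the intercepted functions, so the four function names and the helper `classify_call` are assumptions made for illustration.

```python
# Assumed mapping from callee name to the sub-step that handles it.
THREAD_FUNCTIONS = {
    "pthread_create": "6-9",         # thread creation
    "pthread_join": "6-10",          # thread wait
    "pthread_mutex_lock": "6-11",    # lock
    "pthread_mutex_unlock": "6-12",  # unlock
}


def classify_call(callee_name):
    """Return the handling sub-step, or None for a non-thread function."""
    return THREAD_FUNCTIONS.get(callee_name)


lock_step = classify_call("pthread_mutex_lock")
other_step = classify_call("printf")
```

A call for which `classify_call` returns `None` is simply executed as an ordinary function call.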
(6-9) Call the thread creation function to create a child thread. The child thread initializes its own timestamp vector and merges it with the parent thread's timestamp vector, and the parent thread's timestamp value is incremented by 1;
(6-10) Call the thread wait function to merge the timestamp vector of the waiting thread with the timestamp vector of the awaited thread;
(6-11) Call the lock function and, once the lock has been acquired, merge the thread's timestamp vector with the lock object's timestamp vector;
(6-12) Call the unlock function and, once the lock has been released, assign the thread's timestamp vector to the lock object's timestamp vector.
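The four timestamp vector updates in sub-steps (6-9) to (6-12) can be illustrated with a short sketch. The class `Clocks` and its method names are hypothetical, and the merge is taken to be the element-wise maximum of two vectors.

```python
def merge(a, b):
    """Element-wise maximum of two timestamp vectors."""
    return [max(x, y) for x, y in zip(a, b)]


class Clocks:
    """Hypothetical bookkeeping of per-thread and per-lock timestamp vectors."""

    def __init__(self, n_threads):
        self.n = n_threads
        self.thread = {0: [1] + [0] * (n_threads - 1)}  # main thread
        self.lock = {}

    def on_create(self, parent, child):
        own = [0] * self.n
        own[child] = 1                      # own slot 1, all others 0
        self.thread[child] = merge(own, self.thread[parent])
        self.thread[parent][parent] += 1    # parent advances by 1

    def on_join(self, waiter, joined):
        self.thread[waiter] = merge(self.thread[waiter], self.thread[joined])

    def on_lock(self, tid, lock_id):
        if lock_id in self.lock:            # nothing to merge on first use
            self.thread[tid] = merge(self.thread[tid], self.lock[lock_id])

    def on_unlock(self, tid, lock_id):
        self.lock[lock_id] = list(self.thread[tid])  # copy, not alias


c = Clocks(2)
c.on_create(0, 1)      # thread 1 = [1, 1], thread 0 = [2, 0]
c.on_lock(0, "m")
c.on_unlock(0, "m")    # lock "m" = [2, 0]
c.on_lock(1, "m")      # thread 1 = [2, 1]
c.on_join(0, 1)        # thread 0 = [2, 1]
```

Storing the releasing thread's vector in the lock and merging it on the next acquire is what orders the critical sections of different threads.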
As shown in Figure 4, step (5) of the method of the present invention includes the following sub-steps:
(5-1) Inspect the instruction executed by the multithreaded program corresponding to the request and determine whether it is a read instruction or a write instruction. If so, go to sub-step (5-2). Otherwise, if the instruction is a lock or unlock instruction, go to sub-step (5-2) after executing it; for any other instruction, continue executing it according to the execution trace of the symbolic execution virtual machine;
(5-2) Determine whether the memory address accessed by the instruction is a global variable address. If so, obtain the address of the global variable, establish a mapping between the global variable address and a number, and obtain the information about the thread's access to the global variable, including the thread number and the timestamp vector. If the instruction is a write instruction, go to sub-step (5-3); if it is a read instruction, go to sub-step (5-4). Otherwise continue executing the instruction according to the execution trace of the symbolic execution virtual machine;
(5-3) Using the number corresponding to the global variable address, read the set of all read access records for that global variable and traverse them, processing each read access record as follows: compare the timestamp value corresponding to the read access record's thread number with the timestamp value for the same thread number in the current access record. If the less-than-or-equal relation holds, nothing is done; otherwise the two operations are ordered relative to each other, and this ordering is recorded. Finally, merge the timestamp vectors of all read access records with the timestamp vector of the current access. The merge proceeds as follows: traverse the two arrays in increasing index order and, for each index i, assign the larger of the two values at index i to the timestamp value at index i of the current access's timestamp vector.
(5-4) Using the number corresponding to the global variable address, read the set of all write access records for that global variable and process each write access record as follows: compare the timestamp value corresponding to the write access record's thread number with the timestamp value for the same thread number in the current access record. If the less-than-or-equal relation holds, nothing is done; otherwise the two operations are ordered relative to each other, and this ordering is recorded. Finally, merge the timestamp vectors of all write access records with the timestamp vector of the current access. The merge proceeds as follows: traverse the two arrays in increasing index order and, for each index i, assign the larger of the two values at index i to the timestamp value at index i of the current access's timestamp vector;
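The comparison and merge used in sub-steps (5-3) and (5-4) can be sketched as below. The function names `merge_into` and `record_access` and the shape of the order log are hypothetical; the comparison follows the text literally, recording an ordering whenever the less-than-or-equal relation does not hold.

```python
def merge_into(current, other):
    """Index-by-index merge: keep the larger value at every index i."""
    for i in range(len(current)):
        current[i] = max(current[i], other[i])


def record_access(records, current_tid, current_clock, order_log):
    """records: earlier accesses of the opposite kind as (tid, clock) pairs.

    Appends (earlier_tid, current_tid) to order_log for every record whose
    own timestamp value exceeds the current access's value for that thread,
    then merges all record vectors into current_clock.
    """
    for r_tid, r_clock in records:
        if not (r_clock[r_tid] <= current_clock[r_tid]):
            order_log.append((r_tid, current_tid))
    for _, r_clock in records:
        merge_into(current_clock, r_clock)


reads = [(0, [3, 0, 0]), (2, [0, 0, 1])]   # earlier reads by threads 0 and 2
write_clock = [1, 2, 1]                    # current write by thread 1
order_log = []
record_access(reads, 1, write_clock, order_log)
```

In this example only the read by thread 0 is logged, since its value 3 exceeds the 1 that thread 1 holds for thread 0, and the write's vector becomes the element-wise maximum of all three vectors.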
(5-5) Write the set of all read and write access records to the record file, completing the recording of the execution trace.
As shown in Figure 5, step (3) of the method of the present invention includes the following sub-steps:
(3-1) Read the program execution information file and load the records it contains into memory;
(3-2) Following the execution trace of the symbolic execution virtual machine, iteratively select instructions of the multithreaded program corresponding to the detection request, and determine whether the instruction is a read instruction, a write instruction, or a function call instruction. If so, go to sub-step (3-3); otherwise continue executing the instruction according to the execution trace of the symbolic execution virtual machine;
(3-3) Determine whether the memory address accessed by the instruction belongs to a global variable. If so, go to sub-step (3-4). Otherwise, if the instruction is a lock or unlock function among the multithreaded function call instructions, go to sub-step (3-4); for any other instruction, continue executing it according to the execution trace of the symbolic execution virtual machine.
(3-4) Based on the record information loaded into memory, determine whether the current instruction has a dependency. If it does not, go to sub-step (3-5); otherwise obtain the next instruction in the symbolic execution virtual machine and return to sub-step (3-2);
(3-5) Execute the current instruction and release the inter-instruction dependencies on the executed instruction, thereby updating the record information loaded in memory. Obtain the next instruction in the symbolic execution virtual machine and, if the program has not finished, return to sub-step (3-2); repeat until the program ends.
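The replay loop of sub-steps (3-2) to (3-5) behaves like a scheduler that defers any instruction whose recorded predecessors have not yet run. The sketch below is an illustration only: `replay`, the instruction labels, and the `deps` mapping are hypothetical stand-ins for the records loaded from the program execution information file.

```python
def replay(instructions, deps):
    """instructions: instruction ids in the order the scheduler offers them.
    deps: id -> set of ids that must execute first (recorded ordering).
    Returns the order in which instructions actually execute."""
    pending = list(instructions)
    remaining = {ins: set(deps.get(ins, ())) for ins in instructions}
    executed = []
    while pending:
        for ins in pending:
            if remaining[ins]:
                continue                 # unmet dependency: try the next one
            executed.append(ins)         # (3-5) execute the instruction
            pending.remove(ins)
            for waiting in remaining.values():
                waiting.discard(ins)     # release dependencies on it
            break
        else:
            # No instruction could run: the recorded order is unsatisfiable.
            raise RuntimeError("unsatisfied or cyclic dependency")
    return executed


# The scheduler offers B first, but the record says A must precede B.
order = replay(["B", "A", "C"], {"B": {"A"}})
```

Because B is skipped until A has executed, the replayed order is A, B, C regardless of the order in which the instructions were offered.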
Those skilled in the art will readily understand that the above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610679571.1A CN106294169B (en) | 2016-08-17 | 2016-08-17 | A data race detection and replay method based on a symbolic execution virtual machine |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106294169A | 2017-01-04 |
CN106294169B | 2018-08-03 |
Family
ID=57679508
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610679571.1A (CN106294169B, Expired - Fee Related) | A data race detection and replay method based on a symbolic execution virtual machine | 2016-08-17 | 2016-08-17 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106294169B (en) |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180803; Termination date: 20190817 |