WO2022142595A1 - Program detection method and device - Google Patents

Program detection method and device

Info

Publication number
WO2022142595A1
WO2022142595A1 PCT/CN2021/123936 CN2021123936W WO2022142595A1 WO 2022142595 A1 WO2022142595 A1 WO 2022142595A1 CN 2021123936 W CN2021123936 W CN 2021123936W WO 2022142595 A1 WO2022142595 A1 WO 2022142595A1
Authority
WO
WIPO (PCT)
Prior art keywords
program
thread
memory
read
write
Prior art date
Application number
PCT/CN2021/123936
Other languages
English (en)
French (fr)
Inventor
张汝涛
周卿
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to EP21913340.2A (published as EP4258121A1)
Publication of WO2022142595A1
Priority to US18/342,388 (published as US20230367516A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3612 Software analysis for verifying properties of programs by runtime analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655 Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • G06F3/0659 Command handling arrangements, e.g. command buffers, queues, command scheduling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3664 Environments for testing or debugging software
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3672 Test management
    • G06F11/3688 Test management for test execution, e.g. scheduling of test suites
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3668 Software testing
    • G06F11/3696 Methods or tools to render software testable
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/81 Threshold

Definitions

  • the present application relates to the field of communications, and in particular, to a program detection method and device.
  • the strong memory model and the weakly ordered memory model are two storage models.
  • the strong memory model describes that each instruction has implicit acquire and release semantics. Acquire semantics prevent a read-acquire from being reordered with any subsequent read or write operation, and release semantics prevent a write-release from being reordered with any preceding read or write operation. That is, a sequence of write operations performed by one core to memory is observed by the other cores of the central processing unit (CPU) in the same order.
  • the weak memory model describes that within an independent thread, any read and write operations to memory can exchange order with other read and write operations without changing the behavior of the thread.
  • the embodiments of the present application provide a program detection method and device, which can help a user to quickly complete the inspection of a program running on a weak memory model platform, and can improve the program detection efficiency.
  • in a first aspect, a program detection method is provided. The method includes: receiving a program provided by a user, and obtaining, according to a query parameter and the program, a result of the program running in a weak memory environment. The query parameter indicates the maximum interval within which two operations of the program may be reordered.
  • the program detection device receives the program provided by the user and, according to the program and the query parameter indicating the maximum interval within which two operations of the program may be reordered, obtains the result of the program running in a weak memory environment. This result helps the user quickly check a program running in a weak memory environment, improves program detection efficiency, and places low demands on the user's professional ability.
  • obtaining the result of the program running in the weak memory environment according to the query parameter and the program may include: obtaining the result according to the query parameter, the interval between a first operation and a second operation in a first thread of the program, and the interval between a third operation and a fourth operation in a second thread of the program.
  • the first operation and the third operation can be a pair of read and write operations on the same variable.
  • the second operation and the fourth operation can be a pair of read and write operations on the same variable.
  • the first operation and the second operation may be operations performed on different variables.
  • the third operation and the fourth operation may be operations performed on different variables. In this way, the user does not need to test the program, which can improve the detection efficiency of the program.
  • the program detection method provided in the first aspect may further include: according to a memory read/write mode query rule, detecting the first operation and the second operation in the first thread of the program and the third operation and the fourth operation in the second thread of the program, to obtain the result of the program running in a weak memory environment.
  • the memory read/write mode query rule may be determined according to query parameters, and the memory read/write mode query rule may be used to determine whether the first thread and the second thread overlap in time.
  • if the program matches the memory read/write mode query rule, the result is an error, for example, a weak memory order problem is likely to occur; if the program does not match the memory read/write mode query rule, the result is correct, for example, the running result on the weak memory model platform is correct. The user does not need to test the program repeatedly, and the result of the program running in a weak memory environment can be obtained quickly.
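The four-operation check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the `Op` record, the `matches_rule` function, the way an interval is counted (memory operations strictly between two operations of one thread), and the requirement that both intervals fall within the maximum are all hypothetical choices made here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    thread: int  # id of the thread that issues the operation
    index: int   # position among that thread's memory operations
    kind: str    # "load" or "store"
    var: str     # name of the variable accessed


def interval(a: Op, b: Op) -> int:
    """Memory operations strictly between a and b (assumed definition)."""
    return abs(a.index - b.index) - 1


def is_rw_pair(a: Op, b: Op) -> bool:
    """One load and one store on the same variable, in different threads."""
    return (a.var == b.var and a.thread != b.thread
            and {a.kind, b.kind} == {"load", "store"})


def matches_rule(op1: Op, op2: Op, op3: Op, op4: Op, max_interval: int) -> bool:
    """True if the four operations form a pattern that may misbehave."""
    return (op1.thread == op2.thread and op3.thread == op4.thread
            and op1.var != op2.var and op3.var != op4.var
            and is_rw_pair(op1, op3) and is_rw_pair(op2, op4)
            and interval(op1, op2) <= max_interval
            and interval(op3, op4) <= max_interval)


# Thread 1 publishes data and then sets a flag; thread 2 reads flag, then data.
store_data = Op(1, 0, "store", "data")
store_flag = Op(1, 1, "store", "flag")
load_flag = Op(2, 0, "load", "flag")
load_data = Op(2, 1, "load", "data")
result = "error" if matches_rule(store_data, store_flag,
                                 load_data, load_flag, 8) else "correct"
```

In this sketch a match is reported as "error" and a non-match as "correct", mirroring the two outcomes named in the text.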
  • the memory read-write mode query rules may include read-read, write-write out-of-order query rules, and/or read-write, write-read out-of-order query rules.
  • the query parameter may be determined by the user, and the query parameter may include the on-chip write operation delay.
  • the query parameters may also include cross-NUMA write latency and/or cross-NUMA read latency. In this way, the memory read/write mode query can be performed according to the query parameters input by the user.
  • the program detection method provided by the first aspect may further include: obtaining a value corresponding to a query parameter.
  • the value corresponding to the query parameter may include a value corresponding to the on-chip write operation delay.
  • the value corresponding to the query parameter may further include a value corresponding to the cross-NUMA write operation delay and/or a value corresponding to the cross-NUMA read operation delay.
  • the weak memory environment may be a running environment corresponding to a running device with a non-uniform memory access (NUMA) architecture.
  • the program detection method provided by the first aspect may further include: providing a user with a result.
  • the results may include suggestions for modification.
  • the user can manually modify the program according to the modification suggestion, which can help the user to quickly locate the position in the program that causes the program error, quickly complete the program modification, and improve the program modification efficiency.
  • the result can include: correct.
  • the results may include errors and/or suggested revisions.
  • the modification suggestion may include location information of the code to be modified, such as code line number and/or program file name, and the modification suggestion may be used for the program detection device to modify the program or for the user to manually modify the program.
  • the program detection method provided by the first aspect may further include: in response to a user's determination instruction, modifying the program according to the modification suggestion. In this way, the user can be directly helped to complete the modification of the program, and the modification efficiency of the program can be further improved.
  • in a second aspect, a program detection device is provided. The device includes a receiving unit and an obtaining unit. The receiving unit is used to receive the program provided by the user; the obtaining unit is used to obtain, according to the query parameter and the program, the result of the program running in a weak memory environment. The query parameter indicates the maximum interval within which two operations of the program may be reordered.
  • the obtaining unit is further configured to obtain the result of the program running in a weak memory environment according to the query parameter, the interval between the first operation and the second operation in the first thread of the program, and the interval between the third operation and the fourth operation in the second thread of the program.
  • the first operation and the third operation are a pair of read and write operations on the same variable.
  • the second operation and the fourth operation are a pair of read and write operations on the same variable.
  • the first operation and the second operation are operations performed on different variables, and the third operation and the fourth operation are operations performed on different variables.
  • the obtaining unit is also used to detect, according to the memory read/write mode query rule, the first operation and the second operation in the first thread of the program and the third operation and the fourth operation in the second thread of the program, to obtain the result of the program running in the weak memory environment.
  • the memory read/write mode query rule may be determined according to query parameters, and the memory read/write mode query rule may be used to determine whether the first thread and the second thread overlap in time.
  • the memory read-write mode query rules may include read-read, write-write out-of-order query rules, and/or read-write, write-read out-of-order query rules.
  • the query parameter may be determined by the user, and the query parameter may include the on-chip write operation delay.
  • the query parameters may also include cross-NUMA write latency and/or cross-NUMA read latency. In this way, the memory read/write mode query can be performed according to the query parameters input by the user.
  • the obtaining unit is also used to obtain the value corresponding to the query parameter.
  • the value corresponding to the query parameter may include a value corresponding to the on-chip write operation delay.
  • the value corresponding to the query parameter may further include a value corresponding to the cross-NUMA write operation delay and/or a value corresponding to the cross-NUMA read operation delay.
  • the weak memory environment may be a running environment corresponding to a running device of a non-uniform memory access architecture NUMA.
  • the program detection apparatus described in the second aspect may further include: an output unit.
  • the output unit is used to provide the result to the user.
  • the results may include suggestions for modification.
  • the result can include: correct.
  • the results may include errors and/or suggested revisions.
  • the modification suggestion may include location information of the code to be modified, such as code line number and/or program file name, and the modification suggestion may be used for the program detection device to modify the program or for the user to manually modify the program.
  • the acquiring unit is further configured to modify the program according to the modification suggestion in response to the user's determination instruction.
  • the program detection device may be placed in a cloud server.
  • the receiving unit and the output unit may be set separately, or may be integrated in one module, that is, the transceiver module.
  • the present application does not specifically limit the specific implementation manners of the receiving unit and the output unit.
  • the program detection apparatus described in the second aspect may further include a storage module, where the storage module stores programs or instructions.
  • the program detection apparatus can execute the program detection method described in any possible implementation manner of the first aspect.
  • the program detection device described in the second aspect may be a computer device, a server, or a cloud server, or a chip (system) or other component that can be set in the computer device, server, or cloud server. This is not limited in the present application.
  • in a third aspect, a program detection device is provided. The device includes a processor coupled to a memory. The memory is used for storing a computer program, and the processor is used for executing the computer program stored in the memory, so that the program detection apparatus executes the program detection method described in any possible implementation manner of the first aspect.
  • the program detection apparatus described in the third aspect may further include a transceiver.
  • the transceiver may be a transceiver circuit or an input/output port.
  • the transceiver can be used for the program detection apparatus to communicate with other devices.
  • the program detection apparatus described in the third aspect may be a computer device, a server or a cloud server, or a chip or a chip system provided inside the computer device, server or cloud server.
  • in a fourth aspect, a chip system is provided. The chip system has a weak memory environment and includes a processor and an input/output port. The processor is coupled to a memory containing instructions and is used to control the chip system to implement the processing function involved in any implementation manner of the first aspect; the input/output port is used to implement the transceiving function involved in any implementation manner of the first aspect.
  • the chip system further includes a memory for storing program instructions and data for implementing the functions involved in the first aspect.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • in a fifth aspect, a computer-readable storage medium is provided. Computer instructions are stored in the computer-readable storage medium; when the computer instructions are executed on a computer, the computer is caused to execute the program detection method described in any possible implementation manner of the first aspect.
  • in a sixth aspect, a computer program product is provided, including computer programs or instructions that, when run on a computer, cause the computer to perform the program detection method described in any possible implementation manner of the first aspect.
  • FIG. 1 is a first schematic flowchart of a program detection method provided by an embodiment of the present application.
  • FIG. 2 is a first application schematic diagram of the program detection method provided by an embodiment of the present application.
  • FIG. 3 is a second schematic flowchart of a program detection method provided by an embodiment of the present application.
  • FIG. 4 is a first schematic structural diagram of a CPU provided by an embodiment of the present application.
  • FIG. 5 is a first schematic diagram of an interface of a program detection device provided by an embodiment of the present application.
  • FIG. 6 is an example diagram of a program provided by an embodiment of the present application.
  • FIG. 7 is a second schematic diagram of an interface of a program detection device provided by an embodiment of the present application.
  • FIG. 8 is a first analysis example diagram of a program provided by an embodiment of the present application.
  • FIG. 9 is a second analysis example diagram of a program provided by an embodiment of the present application.
  • FIG. 10 is a second application schematic diagram of the program detection method provided by an embodiment of the present application.
  • FIG. 11 is a first schematic structural diagram of a program detection apparatus provided by an embodiment of the present application.
  • FIG. 12 is a second schematic structural diagram of a program detection apparatus provided by an embodiment of the present application.
  • "operation instruction" and "operation" may sometimes be used interchangeably; when the difference is not emphasized, they express the same meaning. Likewise, "statement", "program statement", and "code" may sometimes be used interchangeably; when the difference is not emphasized, they express the same meaning.
  • the total store order (TSO) consistency model describes that the memory write operations of multiple cores in the central processing unit (CPU) have one and only one global order.
  • the TSO consistency model belongs to the strong memory model.
  • An embodiment of the present application provides a program detection method, which can be used to detect the correctness of a program running on a weak memory model platform, such as the correctness of a program running on an advanced reduced instruction set machine (Advanced RISC Machine, ARM) platform.
  • This program detection method can be used alone or integrated with third-party software.
  • the program detection apparatus provided by the embodiment of the present application may be a computer device, a server, or a cloud server, etc., or a chip or other component with a program detection function applied to the computer device, server, or cloud server.
  • FIG. 1 is a schematic flowchart 1 of a program detection method provided by an embodiment of the present application.
  • the program detection method provided by the embodiments of the present application can be used to check software developed in a compiled language, such as C language or C++ language.
  • the program detection method includes the following steps:
  • FIG. 2 is a schematic diagram 1 of the application of the program detection method provided by the embodiment of the present application.
  • the program detection device receives the program provided by the user.
  • the program may be C/C++ software source code.
  • the program may include one or more threads.
  • the program detection method provided by the embodiment of the present application may include: using a compiler to convert the program provided by the user into intermediate code (intermediate representation, IR).
  • the program is compiled by the Clang/LLVM compiler to generate intermediate code, for example, clang -emit-llvm -c -g XXX.c -o XXX.bc.
  • the intermediate file XXX.bc of the program is analyzed by the program detection device Wekmemcheck.
  • the program provided by the user is compiled by the compiler to generate the intermediate code IR, and the program detection device analyzes the intermediate code. This realizes static analysis of the source code without running the program provided by the user, does not intrude on the user's software, keeps the user's program safe, and is convenient to operate.
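The compile step can be sketched as a helper that builds the Clang command line for one source file. This is an illustration only: the helper name `ir_command` is hypothetical, and the sketch only constructs the argument list (`-emit-llvm`, `-c`, `-g`, and `-o` are standard Clang options) without invoking the compiler or the analysis tool.

```python
from pathlib import Path


def ir_command(source: str) -> list[str]:
    """Build the Clang invocation that emits LLVM bitcode with debug info."""
    bitcode = str(Path(source).with_suffix(".bc"))
    # -emit-llvm with -c writes LLVM bitcode instead of a native object file;
    # -g keeps debug information so findings can be mapped to source lines.
    return ["clang", "-emit-llvm", "-c", "-g", source, "-o", bitcode]


command = ir_command("XXX.c")
```

The resulting list could be passed to `subprocess.run(command, check=True)` on a machine where Clang is installed.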
  • the weak memory environment is a running environment corresponding to a running device with a non-uniform memory access (NUMA) architecture.
  • the weak memory environment may be an operating environment of a weak memory model device, a weak memory model platform, or the like.
  • the query parameter is used to indicate the maximum interval for reordering of two operations of the program.
  • the query parameter may include the type of the query parameter and/or the value corresponding to the query parameter.
  • the type of query parameter may include on-chip write latency.
  • the on-chip write latency may be used to indicate the maximum interval for reordering of two on-chip store instructions.
  • a store instruction may be used to save data in a register to memory, and the on-chip write latency may be used to detect whether reordering occurs between two or more store instructions within a NUMA node.
  • the types of query parameters may also include cross-NUMA write latency and/or cross-NUMA read latency.
  • the cross-NUMA write latency may be used to indicate the maximum interval within which two store instructions across NUMA nodes may be reordered, and the cross-NUMA read latency may be used to indicate the maximum interval within which two load instructions across NUMA nodes may be reordered, where an instruction across NUMA nodes refers to an instruction processed by one NUMA node that accesses the memory of another NUMA node.
  • a load instruction can be used to read data from memory into a register. The cross-NUMA write latency can be used to detect whether reordering occurs between two or more store instructions across two or more NUMA nodes, and the cross-NUMA read latency can be used to detect whether reordering occurs between two or more load instructions across two or more NUMA nodes.
  • the value corresponding to the query parameter may include a value corresponding to the on-chip write operation delay.
  • for example, the maximum interval within which two store instructions may be reordered is 8 memory operation instructions. That is, if the interval between two store instructions is less than or equal to 8 memory operation instructions, the two store instructions are considered liable to be reordered; if the interval between two store instructions is greater than 8 memory operation instructions, the two store instructions are considered not to be reordered.
  • the memory operation instructions may include: store (store) instructions and/or load (load) instructions.
  • the value corresponding to the query parameter may further include a value corresponding to the cross-NUMA write operation delay and/or a value corresponding to the cross-NUMA read operation delay.
  • for example, for the cross-NUMA write operation delay, the maximum interval within which two store instructions may be reordered is 8 memory operation instructions. That is, if the interval between two store instructions is less than or equal to 8 memory operation instructions, the two store instructions are considered liable to be reordered; if the interval is greater than 8 memory operation instructions, the two store instructions are considered not to be reordered.
  • for example, for the cross-NUMA read operation delay, the maximum interval within which two load instructions may be reordered is 8 memory operation instructions. That is, if the interval between two load instructions is less than or equal to 8 memory operation instructions, the two load instructions are considered liable to be reordered; if the interval exceeds 8 memory operation instructions, the two load instructions are considered not to be reordered.
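The threshold rule can be written as a single comparison. The sketch below is illustrative: the function name `may_reorder` is hypothetical, and it assumes the interval is counted as the number of memory operation instructions lying strictly between the two instructions, which the text does not define precisely.

```python
EXAMPLE_MAX_INTERVAL = 8  # example value used in the text


def may_reorder(first_index: int, second_index: int,
                max_interval: int = EXAMPLE_MAX_INTERVAL) -> bool:
    """Return True if two same-kind instructions are close enough to reorder.

    The indices are positions in the thread's sequence of memory operation
    instructions (loads and stores only).
    """
    gap = abs(second_index - first_index) - 1  # instructions in between
    return gap <= max_interval


close = may_reorder(0, 9)   # 8 instructions in between
far = may_reorder(0, 10)    # 9 instructions in between
```

The same function serves the on-chip write, cross-NUMA write, and cross-NUMA read parameters; only the value passed as `max_interval` differs.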
  • the type of the query parameter may include an on-chip read operation delay
  • the value corresponding to the query parameter may include a value corresponding to the on-chip read operation delay.
  • the on-chip read latency can be used to indicate the maximum interval within which two load instructions may be reordered, and can be used to detect whether reordering occurs between two or more load instructions within a NUMA node.
  • the specific implementation of the on-chip read operation delay is similar to the above-mentioned on-chip write operation delay, and will not be repeated here.
  • reordering may refer to a change in the sequential execution order of memory operation instruction 1 and memory operation instruction 2 .
  • for example, the order recorded in the code of the program file is that memory operation instruction 1 is executed before memory operation instruction 2. If the program is run in a weak memory environment, then owing to the characteristics of that environment, memory operation instruction 2 may be executed first, followed by memory operation instruction 1.
  • a memory read/write mode query can be performed according to the query parameters.
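The effect of such a reordering can be illustrated with a small simulation. This is a toy model written for this explanation, not part of the application: one thread stores `data` and then `flag`, another loads `flag` and then `data`, and the sketch enumerates every interleaving with and without the two stores swapped. All names are hypothetical.

```python
def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one schedule and return the values seen as (flag, data)."""
    memory = {"data": 0, "flag": 0}
    registers = {}
    for kind, variable, operand in schedule:
        if kind == "store":
            memory[variable] = operand             # operand is the value
        else:
            registers[operand] = memory[variable]  # operand is the register
    return registers["r_flag"], registers["r_data"]


WRITER = [("store", "data", 1), ("store", "flag", 1)]
READER = [("load", "flag", "r_flag"), ("load", "data", "r_data")]


def outcomes(writer, reader):
    return {run(schedule) for schedule in interleavings(writer, reader)}


in_order = outcomes(WRITER, READER)         # stores execute as written
reordered = outcomes(WRITER[::-1], READER)  # the two stores are swapped
stale_read = (1, 0)                         # flag seen as set, data still old
```

With the stores kept in program order, the reader never sees the flag set while the data is still old; once the stores are swapped, that outcome becomes reachable, which is the kind of weak memory order problem the detection method looks for.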
  • This query parameter is used to describe the on-chip and off-chip memory read and write delays of different cores of the CPU hardware, and to construct the memory read and write mode query rules.
  • query parameters can be determined according to the structure of the CPU.
  • FIG. 4 is a schematic structural diagram 1 of a CPU according to an embodiment of the present application.
  • the CPU includes two memories, memory 0 (memory0) and memory 1 (memory1), and each memory corresponds to its own cores and caches.
  • memory0 corresponds to cores core0 and core1 and caches cache0 and cache1. A store buffer (Store Buffer) is provided between core0 and cache0, and a store buffer (Store Buffer) is provided between core1 and cache1.
  • memory1 corresponds to core2, core3, cache2, and cache3.
  • the CPU may further include a cross-NUMA load buffer (NUMA Load Buffer) and/or a cross-NUMA store buffer (NUMA Store Buffer).
  • core0 and core1 can read memory1 through the cross-NUMA load buffer, and core0 and core1 can write memory1 through the cross-NUMA store buffer.
  • core2 and core3 can read memory0 through the cross-NUMA load buffer, and core2 and core3 can write memory0 through the cross-NUMA store buffer.
  • the architecture including memory0, core0, core1, cache0 and cache1, and the corresponding Store Buffer can be called NUMA node 0.
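  • the read operation of core0 or core1 inside NUMA node 0 on the memory (memory0) or cache (cache0 and cache1, and the corresponding Store Buffer) inside NUMA node 0 is called an on-chip read operation.
  • for example, the read operation of core0 on memory0 is called an on-chip read operation.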
  • the write operation of core0 or core1 inside NUMA node 0 to the memory (memory0) or cache (cache0 and cache1, corresponding Store Buffer) inside NUMA node 0 is called an on-chip write operation.
  • the write operation of core0 to memory0 is called an on-chip write operation.
  • the architecture including memory1, core2, core3, cache2, and cache3, and the corresponding Store Buffer can be called NUMA node 1. The read operation of core2 or core3 inside NUMA node 1 on the memory (memory1) or cache (cache2, cache3, and the corresponding Store Buffer) inside NUMA node 1 is called an on-chip read operation.
  • the read operation of core2 to memory1 is called an on-chip read operation.
  • the write operation of core2 or core3 inside NUMA node 1 to the memory (memory1) or cache (cache2, cache3, and the corresponding Store Buffer) inside NUMA node 1 is called an on-chip write operation.
  • the write operation of core2 to memory1 is called an on-chip write operation.
  • a memory operation between NUMA node 0 and NUMA node 1 is referred to as a memory operation across NUMA, and the memory operation includes a read operation and/or a write operation.
  • a read operation from core0 in NUMA node 0 to memory1 in NUMA node 1 is called a cross-NUMA read operation
  • a write operation from core0 in NUMA node 0 to memory1 in NUMA node 1 is called a cross-NUMA write operation.
  • the read operation of core2 in NUMA node 1 to memory0 in NUMA node 0 is called a cross-NUMA read operation
  • the write operation of core2 in NUMA node 1 to memory0 in NUMA node 0 is called a cross-NUMA write operation. The remaining cases are not listed one by one here.
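The naming of memory operations follows directly from which NUMA node the core and the memory belong to. The sketch below encodes the two-node layout described for FIG. 4; the dictionary names and the `classify` function are hypothetical and only illustrate the terminology.

```python
# Node membership taken from the description of FIG. 4.
NODE_OF_CORE = {"core0": 0, "core1": 0, "core2": 1, "core3": 1}
NODE_OF_MEMORY = {"memory0": 0, "memory1": 1}


def classify(core: str, memory: str, kind: str) -> str:
    """Name a memory operation; kind is "read" or "write"."""
    same_node = NODE_OF_CORE[core] == NODE_OF_MEMORY[memory]
    scope = "on-chip" if same_node else "cross-NUMA"
    return f"{scope} {kind}"


local_read = classify("core0", "memory0", "read")
remote_write = classify("core2", "memory0", "write")
```

An unknown core or memory name raises `KeyError`, which keeps the sketch from silently misclassifying an operation.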
  • if the CPU includes a store buffer (Store Buffer), the on-chip write operation delay and/or a value corresponding to the on-chip write operation delay can be determined. If the CPU includes a cross-NUMA store buffer (NUMA Store Buffer), the cross-NUMA write operation delay and/or a value corresponding to the cross-NUMA write operation delay may be determined. If the CPU includes a cross-NUMA load buffer (NUMA Load Buffer), the cross-NUMA read operation delay and/or a value corresponding to the cross-NUMA read operation delay may be determined.
  • the CPU may also include a load buffer (Load Buffer). For example, a load buffer may be provided between core core0 and cache cache0. Similarly, a load buffer may be provided between core1 and cache1, between core2 and cache2, and between core3 and cache3 (not shown in FIG. 4).
  • the query parameters may be determined by a program detection device.
  • The program detection apparatus may determine the type of query parameter, for example, using the on-chip write operation delay, or using the cross-NUMA write operation delay and/or the cross-NUMA read operation delay.
  • The program detection device may determine the value corresponding to the query parameter. For example, the program detection device obtains the value corresponding to the query parameter by measurement on a test set. For example, the program detection device can determine the value corresponding to the on-chip write operation delay, and can also determine the value corresponding to the cross-NUMA write operation delay and/or the value corresponding to the cross-NUMA read operation delay.
  • query parameters may be user-determined.
  • the program detection device can analyze the program according to the query parameters determined by the user, and obtain the result that the program runs in a weak memory environment.
  • the user can determine the type of query parameter.
  • FIG. 5 is a schematic diagram 1 of an interface of a program detection apparatus provided by an embodiment of the present application.
  • the user may input an instruction corresponding to the query parameter in the area of the display interface for inputting the query parameter.
  • The instruction StoreBuffer, corresponding to the on-chip write operation delay, can be entered.
  • The instruction NUMAStoreSize, corresponding to the cross-NUMA write operation delay, and/or the instruction NUMALoadSize, corresponding to the cross-NUMA read operation delay, can also be entered.
  • the display interface of the program detection apparatus may include a selection box for on-chip write operation delay, and may further include a selection box for cross-NUMA write operation delay and/or a selection box for cross-NUMA read operation delay.
  • the user may determine the type of the adopted query parameter by checking the selection box corresponding to the type of the query parameter. With reference to (b) in Figure 5, the user can check the box corresponding to the on-chip write operation delay to determine the use of the on-chip write operation delay. Similarly, the user can use the check box corresponding to the cross-NUMA write operation delay to determine the use of cross-NUMA write operation delay. If the cross-NUMA read operation delay is not adopted, the selection box corresponding to the cross-NUMA read operation delay may not be operated.
  • the user can determine the value corresponding to the query parameter.
  • The user can input the corresponding value in the input value area for each type of query parameter. For example, the user can input 8 in the input value area corresponding to the on-chip write operation delay. Similarly, 8 can be entered in the input value area corresponding to the cross-NUMA write operation delay, and 8 can be entered in the input value area corresponding to the cross-NUMA read operation delay.
  • the type of the query parameter may be determined by the program detection device, and the value corresponding to the query parameter may be determined by the user.
  • the type of the query parameter is determined by the user, and the value of the query parameter is determined by the program detection device.
  • After the program detection device determines the type of the query parameter and/or the value corresponding to the query parameter, the user can modify the type of the query parameter and/or the value corresponding to the query parameter.
  • the program detection method provided by the embodiment of the present application may further include: acquiring a value corresponding to a query parameter.
  • the value corresponding to the query parameter may be preset, and/or the value corresponding to the query parameter may be provided by the user through the input value area corresponding to the type of the query parameter in the display interface.
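  • As a rough sketch of how the query parameters might be represented, the following parses a line of user input. The instruction names StoreBuffer, NUMAStoreSize, and NUMALoadSize come from the description above; the `Name=value` input syntax, the `QueryParams` class, and the `parse_query_params` function are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueryParams:
    store_buffer: Optional[int] = None     # on-chip write operation delay
    numa_store_size: Optional[int] = None  # cross-NUMA write operation delay
    numa_load_size: Optional[int] = None   # cross-NUMA read operation delay


# Instruction name (from the description) -> field name (hypothetical).
_FIELDS = {
    "StoreBuffer": "store_buffer",
    "NUMAStoreSize": "numa_store_size",
    "NUMALoadSize": "numa_load_size",
}


def parse_query_params(text: str) -> QueryParams:
    """Parse input such as 'StoreBuffer=8 NUMAStoreSize=8' (assumed syntax)."""
    values = {}
    for token in text.split():
        name, sep, raw = token.partition("=")
        if not sep or name not in _FIELDS:
            raise ValueError(f"unrecognized query parameter: {token!r}")
        value = int(raw)
        if value < 0:
            raise ValueError(f"{name} must be non-negative, got {value}")
        values[_FIELDS[name]] = value
    # Parameter types that were not entered stay None, meaning "not adopted".
    return QueryParams(**values)


params = parse_query_params("StoreBuffer=8 NUMAStoreSize=8")
print(params)
```

  • Leaving an omitted parameter as `None` mirrors the case where the user does not check the corresponding selection box.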
  • The above S102 may include: obtaining the result of the program running in a weak memory environment according to the query parameter, the interval between the first operation and the second operation in the first thread of the program, and the interval between the third operation and the fourth operation in the second thread of the program.
  • the first operation and the third operation can be a pair of read and write operations on the same variable
  • the second operation and the fourth operation can be a pair of read and write operations on the same variable
  • the first operation and the second operation may be operations performed on different variables
  • the third and fourth operations may be operations performed on different variables.
  • the variable can be a shared variable.
  • A pair of operations may be: the first operation is a read operation on the first variable, and the third operation is a write operation on the first variable; or, the first operation is a write operation on the first variable, and the third operation is a read operation on the first variable.
  • Similarly, the second operation is a read operation on the second variable, and the fourth operation is a write operation on the second variable; or, the second operation is a write operation on the second variable, and the fourth operation is a read operation on the second variable.
  • the first variable may include a global variable, a shared variable, and the like
  • the second variable may include a global variable, a shared variable, and the like.
  • FIG. 6 is an example diagram of a program provided by an embodiment of the present application.
  • The program includes the first thread thread1 in line 14 and the second thread thread2 in line 21. The interval between the write operation on the variable population in thread1 (line 17) and the write operation on the variable syn_flag in thread1 (line 18) can be calculated, and the interval between the read operation on the variable syn_flag in thread2 (line 23) and the read operation on the variable population in thread2 (line 24) can be calculated. The result of the program running in a weak memory environment is obtained according to the query parameters and these two intervals.
  • The program detection method provided by the embodiment of the present application may further include: using a memory read/write mode query rule to detect the first operation in the first thread of the program, the second operation in the first thread, the third operation in the second thread of the program, and the fourth operation in the second thread, to obtain the result of the program running in a weak memory environment.
  • the memory read/write mode query rule may be determined according to query parameters, and the memory read/write mode query rule may be used to determine whether the first thread and the second thread overlap in time.
  • the query rules of the memory read/write mode can be used to detect thread1 and thread2 in the program shown in FIG. 6 to obtain the result that the program runs in a weak memory environment.
  • the memory read-write mode query rules may include read-read, write-write out-of-order query rules, and/or read-write, write-read out-of-order query rules.
  • The read-read, write-write out-of-order query rule may include one or more of the following conditions: the first operation instruction and the second operation instruction are two adjacent operation instructions in the first thread, the first operation instruction is a write operation instruction on the first variable, and the second operation instruction is a write operation instruction on the second variable; the third operation instruction and the fourth operation instruction are two adjacent operation instructions in the second thread, the third operation instruction is a read operation instruction on the first variable, and the fourth operation instruction is a read operation instruction on the second variable; the first operation instruction and the third operation instruction are in a competitive relationship, and the second operation instruction and the fourth operation instruction are in a competitive relationship; the interval between the first operation instruction and the second operation instruction is less than the sum of the value corresponding to the on-chip write operation delay and the value corresponding to the cross-NUMA write operation delay; and the interval between the third operation instruction and the fourth operation instruction is less than the cross-NUMA write operation delay.
  • The read-read, write-write out-of-order query rule can be used to check the result of running, in a weak memory environment, a program that includes at least two threads, where one thread includes two read operations and the other thread includes two write operations.
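  • The conditions of the read-read, write-write rule can be sketched as a single predicate. This is an illustration under stated assumptions, not the implementation of the embodiments: the `Op` record, the use of an instruction position `pos` as the unit of the interval, and the approximation of the competitive relationship as "same variable in different threads" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    thread: str
    kind: str  # "read" or "write"
    var: str
    pos: int   # position within its thread; the unit is an assumption


def interval(a: Op, b: Op) -> int:
    """Distance between two operations of the same thread."""
    return abs(a.pos - b.pos)


def matches_ww_rr_rule(op1: Op, op2: Op, op3: Op, op4: Op,
                       on_chip_write: int, numa_write: int) -> bool:
    """Check the write-write / read-read pattern described in the text."""
    shape_ok = (
        op1.thread == op2.thread and op3.thread == op4.thread
        and op1.thread != op3.thread
        and op1.kind == "write" and op2.kind == "write"
        and op3.kind == "read" and op4.kind == "read"
    )
    # op1/op3 touch the first variable, op2/op4 touch the second variable.
    competing = op1.var == op3.var and op2.var == op4.var and op1.var != op2.var
    close_enough = (
        interval(op1, op2) < on_chip_write + numa_write
        and interval(op3, op4) < numa_write
    )
    return shape_ok and competing and close_enough


w_data = Op("t1", "write", "data", 0)
w_flag = Op("t1", "write", "flag", 1)
r_data = Op("t2", "read", "data", 1)
r_flag = Op("t2", "read", "flag", 0)
print(matches_ww_rr_rule(w_data, w_flag, r_data, r_flag, 8, 8))  # True
```

  • With both delay values set to 8, two writes one position apart and two reads one position apart fall inside the windows, so the pattern is reported; moving the second write far enough away takes it outside the window.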
  • The read-write, write-read out-of-order query rule may include one or more of the following conditions: the fifth operation instruction and the sixth operation instruction are two adjacent operation instructions in the first thread, the fifth operation instruction is a write operation instruction on the third variable, and the sixth operation instruction is a read operation instruction on the fourth variable; the seventh operation instruction and the eighth operation instruction are two adjacent operation instructions in the second thread, the seventh operation instruction is a read operation instruction on the third variable, and the eighth operation instruction is a write operation instruction on the fourth variable; the fifth operation instruction and the seventh operation instruction are in a competitive relationship, and the sixth operation instruction and the eighth operation instruction are in a competitive relationship; the interval between the fifth operation instruction and the sixth operation instruction is less than the sum of the value corresponding to the on-chip write operation delay, the value corresponding to the cross-NUMA write operation delay, and the value corresponding to the cross-NUMA read operation delay; and the interval between the seventh operation instruction and the eighth operation instruction is less than the sum of the value corresponding to the on-chip write operation delay, the value corresponding to the cross-NUMA write operation delay, and the value corresponding to the cross-NUMA read operation delay.
  • The read-write, write-read out-of-order query rule can be used to check the result of running, in a weak memory environment, a program that includes at least two threads, where one thread includes a read operation and a write operation and the other thread includes a write operation and a read operation.
  • A competitive relationship means that the order in which the two operation instructions are executed affects the execution result.
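  • The read-write, write-read rule and the competitive relationship can be sketched in the same spirit. The following is only an approximation under stated assumptions: `competes` substitutes the usual conflict test (different threads, same variable, at least one write) for the definition above, `threshold` treats a delay type that is not adopted as 0, and the names `Op`, `competes`, `threshold`, and `matches_wr_rw_rule` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    thread: str
    kind: str  # "read" or "write"
    var: str
    pos: int   # position within its thread; the unit is an assumption


def competes(a: Op, b: Op) -> bool:
    """Approximate the competitive relationship: the two operations come from
    different threads, touch the same variable, and at least one is a write."""
    return a.thread != b.thread and a.var == b.var and "write" in (a.kind, b.kind)


def threshold(on_chip_write: int, numa_write: Optional[int] = None,
              numa_read: Optional[int] = None) -> int:
    """Sum the delay values; a delay type that is not adopted counts as 0
    (an assumption, the text does not say how absent values are handled)."""
    return on_chip_write + (numa_write or 0) + (numa_read or 0)


def matches_wr_rw_rule(op5: Op, op6: Op, op7: Op, op8: Op, limit: int) -> bool:
    """Check the write-read / read-write pattern described in the text."""
    shape_ok = (
        op5.thread == op6.thread and op7.thread == op8.thread
        and (op5.kind, op6.kind) == ("write", "read")
        and (op7.kind, op8.kind) == ("read", "write")
        and op5.var != op6.var
    )
    close_enough = (abs(op6.pos - op5.pos) < limit
                    and abs(op8.pos - op7.pos) < limit)
    return shape_ok and competes(op5, op7) and competes(op6, op8) and close_enough


limit = threshold(8, 8, 8)
ops = (Op("t1", "write", "x", 0), Op("t1", "read", "y", 1),
       Op("t2", "read", "x", 1), Op("t2", "write", "y", 0))
print(limit, matches_wr_rw_rule(*ops, limit))
```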
  • the third variable may include a global variable, a shared variable, and the like
  • the fourth variable may include a global variable, a shared variable, and the like.
  • A program statement that satisfies the above memory read/write mode query rule can be determined to be a dangerous statement, that is, a statement that leads to wrong results because of out-of-order instructions in a weak memory environment.
  • If the program satisfies the memory read/write mode query rule, the result is an error, for example, a weak memory ordering problem is likely to occur. If the program does not satisfy the memory read/write mode query rule, the result is correct, for example, the program runs correctly on a weak memory model platform. The user does not need to test the program repeatedly, and the result of the program running in a weak memory environment can be obtained quickly.
  • the program detection method provided by the embodiment of the present application may further include: providing a user with a result.
  • the result can include: correct.
  • the results may include errors and/or suggested revisions.
  • correct can indicate that the program can run correctly in a weak memory environment
  • error can indicate that when the program runs in a weak memory environment, abnormal phenomena such as program crash, program exit, or program calculation result error may occur.
  • The modification suggestion may include location information of the code to be modified, such as the code line number and/or the program file name. The modification suggestion can be used by the program detection device to modify the program, or by the user to modify the program manually, so that the program runs correctly in a weak memory environment. This helps the user quickly locate the position in the program that causes the error, complete the modification quickly, and improve modification efficiency.
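  • One possible shape for the result and its modification suggestions is sketched below. The record layout, the class names `Suggestion` and `DetectionResult`, the output format, and the wording of the hint are all hypothetical; only the idea that a suggestion carries a file name, a line number, and an operation type is taken from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Suggestion:
    file_name: str
    line: int
    op_type: str  # "read" or "write"
    hint: str     # free-text advice; the wording is hypothetical


@dataclass
class DetectionResult:
    suggestions: List[Suggestion] = field(default_factory=list)

    @property
    def status(self) -> str:
        # Any dangerous code makes the overall result an error.
        return "error" if self.suggestions else "correct"

    def render(self) -> str:
        rows = [f"result: {self.status}"]
        rows += [f"{s.file_name}:{s.line}: {s.op_type}: {s.hint}"
                 for s in self.suggestions]
        return "\n".join(rows)


result = DetectionResult([
    Suggestion("example.c", 17, "write", "consider a memory barrier after this write"),
])
print(result.render())
```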
  • the detection result may be displayed to the user through the display interface. If the result is an error, the detection result may be displayed to the user through the display interface, and/or a modification suggestion may be displayed.
  • FIG. 7 is a second schematic diagram of an interface of a program detection apparatus provided by an embodiment of the present application. As shown in FIG. 7, if the result is an error, the display interface can show that the result is an error, together with the corresponding program file name, the code line number of the erroneous statement, and so on.
  • the program detection method provided by the embodiment of the present application may further include: in response to a user's determination instruction, modifying the program according to the modification suggestion.
  • When the user selects the confirm-repair area in the display interface, the program detection apparatus can modify the program according to the modification suggestion in response to the user's determination instruction.
  • the program can be manually modified according to the modification suggestion. In this way, the user can be directly helped to complete the modification of the program, and the modification efficiency of the program can be further improved.
  • The program detection device receives the program provided by the user and obtains the result of the program running in the weak memory environment according to the query parameters and the program. This helps the user quickly complete the inspection of the program for a weak memory environment, improves the detection efficiency of the program, and places lower demands on the user's professional ability.
  • the program inspection method provided by the embodiments of the present application may further include the following steps 1 to 8.
  • the following description will be given by taking the program including the first thread and the second thread as an example.
  • Step 1: analyze the alias relationships among the variables included in the program.
  • Step 2: analyze the variable dependence of the program.
  • The function relationships and variable dependency relationships in the program provided by the user are analyzed, and a thread relationship diagram and a function call relationship diagram are constructed.
  • Analyze the multithreading related application programming interface (API) used in the program and construct a thread data structure based on the thread calling context and the function calling context, so as to distinguish the contexts of different thread operations and analyze the variable dependencies.
  • the multi-thread related API may include thread related functions, such as pthread_create, pthread_mutex_lock, and the like.
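  • To make the idea of locating thread-related API calls concrete, the following sketch scans C source text for `pthread_create` calls and reports the start routine of each thread. A regular expression over source text is a deliberate simplification standing in for real program analysis; it does not handle calls split across lines or start routines wrapped in casts, and the names `thread_entry_points` and `EXAMPLE` are hypothetical.

```python
import re
from typing import List, Tuple

# Matches pthread_create(<thread>, <attr>, <start_routine>, ...) on one line.
_CREATE = re.compile(
    r"pthread_create\s*\(\s*[^,]+?\s*,\s*[^,]+?\s*,\s*&?\s*(\w+)\s*,"
)


def thread_entry_points(source: str) -> List[Tuple[int, str]]:
    """Return (line number, start routine name) for each pthread_create call."""
    found = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _CREATE.finditer(line):
            found.append((lineno, match.group(1)))
    return found


EXAMPLE = """\
int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_create(&t2, NULL, &thread2, NULL);
    return 0;
}
"""

print(thread_entry_points(EXAMPLE))  # [(3, 'thread1'), (4, 'thread2')]
```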
  • Performing variable dependency analysis on the program shown in FIG. 6 yields, as shown in FIG. 8, the main function main (including its functions and instructions), the first thread thread1 (including its functions and instructions), and the second thread thread2 (including its functions and instructions).
  • the function call context of the first thread and the function call context of the second thread can be sorted out, and a thread analysis method based on the thread call context and the function call context can be realized.
  • Step 3: analyze the first variable in the program.
  • One or more first variables in the program are analyzed, and the thread call context and the function call context in which each of the one or more first variables is accessed are distinguished. The first variable may be a variable included in both the first thread and the second thread, and may include shared variables, global variables, and the like.
  • a shared variable access point identification method based on the thread calling context and the function calling context can be implemented.
  • the search for shared variables can be completed based on an alias analysis algorithm to improve efficiency, and a context encoding technique can be used to analyze shared variables in a program to improve calculation accuracy.
  • Step 4: analyze the lock variables in the program.
  • Analyze whether any statement in the program uses a lock variable.
  • Context encoding techniques can be used to analyze the lock variables in the program to improve computational accuracy.
  • Step 5: analyze the possibility of parallel occurrence of the program basic blocks that include the first variable.
  • A static vector time algorithm may be used to analyze whether each statement in each program basic block (BB) that includes the first variable may happen in parallel (MHP). A vector timestamp is constructed for each program basic block that includes a read operation or a write operation on the first variable, simulating the relative logical time at which the basic block executes in the program or the thread. This realizes a whole-program parallelism analysis method at basic-block granularity, based on the thread call context and the function call context.
  • MHP analysis is performed on the program shown in Figure 6, and the vector timestamp construction result of the first thread thread1 and the vector timestamp construction result of the second thread thread2 as shown in Figure 8 can be obtained.
  • The statement execution time of thread1 overlaps with the statement execution time of thread2, so the statements of thread1 and the statements of thread2 may be executed concurrently.
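  • The role of vector timestamps in the MHP analysis can be illustrated with the textbook vector-clock comparison, which is used here as a stand-in because the details of the static vector time algorithm are not given. The function names and the three-component layout (main, thread1, thread2) are assumptions for the example.

```python
from typing import Sequence


def happens_before(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if timestamp a is componentwise <= b and differs from b."""
    if len(a) != len(b):
        raise ValueError("timestamps must have the same length")
    return all(x <= y for x, y in zip(a, b)) and tuple(a) != tuple(b)


def may_happen_in_parallel(a: Sequence[int], b: Sequence[int]) -> bool:
    """Two distinct blocks may run in parallel if neither is ordered first."""
    if tuple(a) == tuple(b):
        return False  # identical timestamps are treated as the same block
    return not happens_before(a, b) and not happens_before(b, a)


# Components: (main, thread1, thread2). Values are illustrative only.
before_fork = (1, 0, 0)     # block in main before the threads are created
block_thread1 = (1, 1, 0)   # block in thread1
block_thread2 = (1, 0, 1)   # block in thread2

print(may_happen_in_parallel(block_thread1, block_thread2))  # True
print(may_happen_in_parallel(before_fork, block_thread1))    # False
```

  • In this example the two thread blocks are unordered, which corresponds to the overlapping execution times described above, while the block in main that precedes thread creation is ordered before both.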
  • FIG. 9 is a second example diagram of the analysis of the program provided by the embodiment of the present application.
  • FIG. 9 is the analysis result of the program shown in FIG. 6 (for example, the file name is weekConsistency.c).
  • ST is the abbreviation of Static Thread (Note1: ST means Static Thread).
  • A(b)->C means that A calls C by instruction b (Note4:A(b)->C means that A calls C by instruction b).
  • One pair includes the write operation on the global variable population in thread1 (line 17 of the source code weekConsistency.c) and the read operation on the global variable population in thread2 (line 24 of the source code weekConsistency.c); another pair includes the write operation on the global variable syrn_flag in thread1 (line 18 of the source code weekConsistency.c) and the read operation on the global variable syrn_flag in thread2 (line 23 of the source code weekConsistency.c).
  • The time interval between the line 17 statement and the line 18 statement of thread1 overlaps with the time interval between the line 23 statement and the line 24 statement of thread2, so the statements of thread1 and the statements of thread2 may execute concurrently.
  • the above steps 3 to 5 are methods for analyzing the inside of a single thread, and the above steps 3 to 5 can be performed respectively for the first thread and the second thread included in the program.
  • the present application does not limit the specific implementation manner.
  • the above steps 3 to 5 may be performed on the first thread first, and then the above steps 3 to 5 may be performed on the second thread.
  • Alternatively, the above step 3 may be performed on the first thread and then on the second thread, after which the above step 4 may be performed on the first thread and then on the second thread.
  • steps 3, 4 and 5 may be in a parallel relationship, and the execution order of steps 3, 4 and 5 does not affect the corresponding execution results of each step.
  • Step 6: query the memory read/write mode of the program.
  • For step 6, refer to the above description of using the memory read/write mode query rule to detect the first operation in the first thread of the program, the second operation in the first thread, the third operation in the second thread of the program, and the fourth operation in the second thread to obtain the result of the program running in the weak memory environment; details are not repeated here.
  • The statements in the program can be queried to obtain a first statement. The first statement includes read and write operations on the same variable, the read and write operations on the variable are not write-protected, the read and write operations on the variable may be performed concurrently by the first thread and the second thread, and the read and write operations on the variable may constitute a competitive relationship.
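  • The search for such statements can be sketched as a filter over pairs of accesses. This is a simplified illustration: protection is modeled as the set of locks held at each access, every pair of accesses from different threads is assumed to be able to run concurrently (the MHP check is omitted), and the names `Access` and `unprotected_pairs` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class Access:
    thread: str
    kind: str  # "read" or "write"
    var: str
    line: int
    locks: FrozenSet[str] = frozenset()  # locks held during the access


def unprotected_pairs(accesses: List[Access]) -> List[Tuple[Access, Access]]:
    """Pairs on the same variable, from different threads, with at least one
    write and no lock held in common."""
    pairs = []
    for a, b in combinations(accesses, 2):
        if a.var != b.var or a.thread == b.thread:
            continue
        if "write" not in (a.kind, b.kind):
            continue
        if a.locks & b.locks:
            continue  # a common lock serializes the two accesses
        pairs.append((a, b))
    return pairs


accesses = [
    Access("thread1", "write", "x", 17),
    Access("thread2", "read", "x", 24),
    Access("thread1", "write", "y", 30, frozenset({"m"})),
    Access("thread2", "read", "y", 31, frozenset({"m"})),
]
for first, second in unprotected_pairs(accesses):
    print(first.var, first.line, second.line)
```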
  • A memory read/write mode query is performed on the program shown in FIG. 6. Since the program shown in FIG. 6 does not include dangerous statements, the result shown in FIG. 8 may be correct.
  • Step 7: obtain the result of the program running in the weak memory environment.
  • the first statement is marked as a dangerous statement
  • the dangerous statement may also be referred to as a dangerous code
  • a modification suggestion can be determined according to the dangerous code and the operation type included in the dangerous code.
  • the modification suggestion may include the code line number and/or program file name of the dangerous code, and the operation type may include a write operation or a read operation.
  • Step 8: provide the result to the user.
  • the result can include: correct.
  • the results may include errors and/or suggested revisions.
  • For S706, reference may be made to the related descriptions above, which are not repeated here.
  • the font color of the dangerous code can be converted into a color different from the font color of other codes in the program, and displayed to the user through the display interface.
  • FIG. 10 is a second application schematic diagram of the program detection method provided by the embodiment of the present application.
  • The program detection device provided by the embodiment of the present application uses the program detection method provided by the embodiment of the present application to detect program A, and the obtained result is: error.
  • The user manually, or the program detection device automatically, modifies program A, for example, by inserting memory barrier instructions to fix the existing problems. After compilation by the compiler, the resulting program A' can run correctly on the weak memory model platform, which improves the efficiency of program detection and modification.
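  • An automatic repair of this kind could be sketched as a text transformation that inserts a barrier statement after the reported line numbers. The choice of `__sync_synchronize()` (a GCC/Clang builtin that issues a full memory barrier) and the decision to place it after the reported line are assumptions; the description does not state which barrier instruction is inserted or where. The function name `insert_barriers` is hypothetical.

```python
from typing import Iterable, List

# Assumed barrier statement; the description does not name a specific one.
BARRIER = "__sync_synchronize();  /* full memory barrier */"


def insert_barriers(lines: List[str], after_lines: Iterable[int]) -> List[str]:
    """Return a copy of `lines` with a barrier inserted after each given
    1-based line number, matching the indentation of that line."""
    targets = set(after_lines)
    bad = sorted(n for n in targets if not 1 <= n <= len(lines))
    if bad:
        raise ValueError(f"line numbers out of range: {bad}")
    out = []
    for number, text in enumerate(lines, start=1):
        out.append(text)
        if number in targets:
            indent = text[: len(text) - len(text.lstrip())]
            out.append(indent + BARRIER)
    return out


program = ["    data = 42;", "    flag = 1;"]
print("\n".join(insert_barriers(program, [1])))
```

  • Placing the barrier between the two writes keeps the write to `data` from being reordered after the write to `flag`; a matching barrier between the two reads in the other thread would normally be needed as well.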
  • FIG. 11 is a schematic structural diagram 1 of a program detection apparatus provided by an embodiment of the present application.
  • the program detection device can be adapted to perform the function of the program detection device in the program detection method shown in FIG. 1 .
  • FIG. 11 only shows the main components of the program detection device.
  • the program detection apparatus 1100 includes: a receiving unit 1101 and an obtaining unit 1102 .
  • the receiving unit 1101 is used for receiving the program provided by the user.
  • the obtaining unit 1102 is configured to obtain the result of the program running in the weak memory environment according to the query parameter and the program.
  • the query parameter is used to indicate the maximum interval for reordering of two operations of the program.
  • The obtaining unit 1102 is further configured to obtain the result of the program running in a weak memory environment according to the query parameter, the interval between the first operation and the second operation in the first thread of the program, and the interval between the third operation and the fourth operation in the second thread of the program.
  • the first operation and the third operation are a pair of read and write operations on the same variable
  • the second operation and the fourth operation are a pair of read and write operations on the same variable
  • the first operation and the second operation are operations performed on different variables
  • the third operation and the fourth operation are operations performed on different variables.
  • The obtaining unit 1102 is further configured to use a memory read/write mode query rule to detect the first operation in the first thread of the program, the second operation in the first thread, the third operation in the second thread of the program, and the fourth operation in the second thread, to obtain the result of the program running in a weak memory environment.
  • the memory read/write mode query rule may be determined according to query parameters, and the memory read/write mode query rule may be used to determine whether the first thread and the second thread overlap in time.
  • the memory read-write mode query rules may include read-read, write-write out-of-order query rules, and/or read-write, write-read out-of-order query rules.
  • the query parameter may be determined by the user, and the query parameter may include the on-chip write operation delay.
  • the query parameters may also include cross-NUMA write latency and/or cross-NUMA read latency.
  • the memory read/write mode query can be performed according to the query parameters input by the user.
  • the obtaining unit 1102 is further configured to obtain the value corresponding to the query parameter.
  • the value corresponding to the query parameter may include a value corresponding to the on-chip write operation delay.
  • the value corresponding to the query parameter may further include a value corresponding to the cross-NUMA write operation delay and/or a value corresponding to the cross-NUMA read operation delay.
  • the weak memory environment may be a running environment corresponding to a running device of a non-uniform memory access architecture NUMA.
  • the program detection apparatus 1100 may further include: an output unit 1103 .
  • the output unit 1103 is used to provide the user with the result.
  • the results may include suggestions for modification.
  • the result can include: correct.
  • the results may include errors and/or suggested revisions.
  • the modification suggestion may include location information of the code to be modified, such as code line number and/or program file name, and the modification suggestion may be used for the program detection device to modify the program or for the user to manually modify the program.
  • the obtaining unit 1102 is further configured to modify the program according to the modification suggestion in response to the user's determination instruction.
  • the program detection apparatus 1100 may be placed in a cloud server.
  • the receiving unit 1101 and the output unit 1103 may be provided separately, or may be integrated into one module, that is, a transceiver module (not shown in FIG. 11 ).
  • the present application does not specifically limit the specific implementation manners of the receiving unit 1101 and the output unit 1103 .
  • the program detection apparatus 1100 may further include a storage module (not shown in FIG. 11 ), where the storage module stores programs or instructions.
  • the program detection apparatus 1100 can perform the function of the program detection apparatus in the program detection method shown in FIG. 1 .
  • The program detection apparatus 1100 may be a computer device, a server, or a cloud server, or a chip (system) or other component or assembly that can be set in the computer device, server, or cloud server, which is not limited in this application.
  • FIG. 12 is a second schematic structural diagram of a program detection apparatus provided by an embodiment of the present application.
  • The program detection device may be a computer device, a server, or a cloud server, or a chip (system) or other component or assembly that can be provided in the computer device, server, or cloud server, which is not limited in this application.
  • the program detection apparatus 1200 may include a processor 1201 .
  • the program detection apparatus 1200 may further include a memory 1202 and a transceiver 1203 .
  • the processor 1201 is coupled with the memory 1202 and the transceiver 1203, such as can be connected through a communication bus.
  • the processor 1201 is the control center of the program detection apparatus 1200, and may be a processor or a general term for multiple processing elements.
  • For example, the processor 1201 may be one or more central processing units (CPUs), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, such as one or more digital signal processors (DSPs) or one or more field programmable gate arrays (FPGAs).
  • the processor 1201 can execute various functions of the program detection apparatus 1200 by running or executing the software program stored in the memory 1202 and calling the data stored in the memory 1202 .
  • the processor 1201 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 12 .
  • the program detection apparatus 1200 may also include multiple processors, for example, the processor 1201 and the processor 1204 shown in FIG. 12 .
  • Each of these processors can be a single-core processor (single-CPU) or a multi-core processor (multi-CPU).
  • A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • Memory 1202 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions. It can also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or other magnetic storage device, or any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, without limitation.
  • the memory 1202 can be integrated with the processor 1201, or can exist independently, and is coupled to the processor 1201 through an input/output port (not shown in FIG. 12 ) of the program detection device 1200, which is not specifically limited in this embodiment of the present application .
  • The memory 1202 is used to store the software program for executing the solution of the present application, and the processor 1201 controls its execution.
  • the transceiver 1203 is used for communication with other devices. Additionally, the transceiver 1203 may include a receiver and a transmitter (not shown separately in FIG. 12). Among them, the receiver is used to realize the receiving function, and the transmitter is used to realize the sending function.
  • The transceiver 1203 may be integrated with the processor 1201, or may exist independently and be coupled to the processor 1201 through an input/output port (not shown in FIG. 12) of the program detection apparatus 1200; this is not specifically limited in the embodiments of the present application.
  • The structure of the program detection apparatus 1200 shown in FIG. 12 does not limit the program detection apparatus; an actual program detection apparatus may include more or fewer components than shown, combine some components, or arrange the components differently.
  • Embodiments of the present application provide a chip system.
  • the chip system includes a processor and an input/output port, where the processor is used to implement the processing functions involved in the above method embodiments, and the input/output ports are used to implement the transceiver functions involved in the above method embodiments.
  • the chip system further includes a memory for storing program instructions and data for implementing the functions involved in the above method embodiments.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • An embodiment of the present application provides a computer-readable storage medium that stores computer instructions; when the computer instructions are run on a computer, the computer is caused to perform the program detection method described in the foregoing method embodiments.
  • Embodiments of the present application provide a computer program product containing instructions, including computer programs or instructions, which, when the computer program or instructions are run on a computer, cause the computer to execute the program detection method described in the above method embodiments.
  • The processor in the embodiments of the present application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory in the embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
  • The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory.
  • Volatile memory may be random access memory (RAM), which acts as an external cache.
  • By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).
  • the above embodiments may be implemented in whole or in part by software, hardware (eg, circuits), firmware, or any other combination.
  • the above-described embodiments may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
  • the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
  • The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless (e.g., infrared, radio, or microwave) manner.
  • the computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, a data center, or the like that contains one or more sets of available media.
  • the usable media may be magnetic media (eg, floppy disks, hard disks, magnetic tapes), optical media (eg, DVDs), or semiconductor media.
  • the semiconductor medium may be a solid state drive.
  • "At least one" means one or more, and "a plurality of" means two or more.
  • "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or a plurality of items.
  • For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.
  • The sequence numbers of the above processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present application in any way.
  • the disclosed system, apparatus and method may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • The mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between apparatuses or units through some interfaces, and may be electrical, mechanical, or in another form.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as independent products, may be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product.
  • The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application.
  • The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

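A program detection method and apparatus that help a user quickly check how a program behaves in a weak memory environment, improve program detection efficiency, and are applicable to program detection. The method includes: receiving a program provided by a user (S101), and obtaining, according to a query parameter and the program, a result of running the program in the weak memory environment (S102). The query parameter indicates the maximum interval within which two operations of the program can be reordered.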

Description

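Program Detection Method and Apparatus
Technical Field
The present application relates to the field of communications, and in particular to a program detection method and apparatus.
Background
The strong memory model and the weakly ordered memory model are two memory models. In the strong memory model, every instruction implicitly carries acquire and release semantics: acquire semantics prevent a read-acquire from being reordered with any read or write operation that follows it, and release semantics prevent a write-release from being reordered with any read or write operation that precedes it. In other words, the sequence of memory writes performed by one core is observed in the same order by the other cores of the central processing unit (CPU). In the weak memory model, within a single thread, any memory read or write may be reordered with other reads and writes as long as the behavior of that thread does not change.
Because the software ecosystem for the weak memory model lags behind, a program developed on a strong-memory-model platform may crash, restart, or compute wrong results after being ported to a weak-memory-model platform. Experienced developers therefore have to test and debug the program repeatedly to determine whether it will misbehave on the weak-memory-model platform. However, such crashes, restarts, and wrong results are rarely reproducible, reproduction is constrained by the test cases and the test environment and is therefore costly, and the work demands a high level of expertise from the testers.
Summary
Embodiments of the present application provide a program detection method and apparatus that help a user quickly check how a program behaves on a weak-memory-model platform and improve program detection efficiency.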
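To achieve the above objective, the present application adopts the following technical solutions:
In a first aspect, a program detection method is provided. The method includes: receiving a program provided by a user, and obtaining, according to a query parameter and the program, a result of running the program in a weak memory environment. The query parameter indicates the maximum interval within which two operations of the program can be reordered.
With the method of the first aspect, the program detection apparatus receives the program provided by the user and, from the program and the query parameter that indicates the maximum reordering interval of two operations, obtains the result of running the program in the weak memory environment. This helps the user quickly complete the check, improves detection efficiency, and places low demands on the user's expertise.
In a possible design, obtaining the result according to the query parameter and the program may include: obtaining the result according to the query parameter, the interval between a first operation and a second operation in a first thread of the program, and the interval between a third operation and a fourth operation in a second thread of the program. Optionally, the first operation and the third operation may be a read/write pair on the same variable, the second operation and the fourth operation may be a read/write pair on the same variable, the first and second operations may access different variables, and the third and fourth operations may access different variables. In this way the user does not need to test the program, which improves detection efficiency.
In a possible design, the method of the first aspect may further include: using a memory read/write pattern query rule to check the first operation and the second operation in the first thread and the third operation and the fourth operation in the second thread, to obtain the result of running the program in the weak memory environment.
Optionally, the memory read/write pattern query rule may be determined according to the query parameter, and may be used to determine whether the first thread and the second thread overlap in time. If the program satisfies the rule, the result is "error" (for example, the program is prone to weak-memory-ordering problems); if it does not, the result is "correct" (for example, the program runs correctly on a weak-memory-model platform). The user does not need to test the program repeatedly, and the result is obtained quickly.
Optionally, the memory read/write pattern query rule may include a read-read/write-write out-of-order query rule and/or a read-write/write-read out-of-order query rule. The former checks a program with at least two threads in which one thread performs read-read operations and another performs write-write operations; the latter checks a program with at least two threads in which one thread performs read-write operations and another performs write-read operations.
In a possible design, the query parameter may be determined by the user and may include an on-chip write operation delay. Optionally, the query parameter may further include a cross-NUMA write operation delay and/or a cross-NUMA read operation delay. The memory read/write pattern query can thus be performed with the query parameters entered by the user.
In a possible design, the method of the first aspect may further include: obtaining the value corresponding to the query parameter. The value may include the value of the on-chip write operation delay and, optionally, the value of the cross-NUMA write operation delay and/or the value of the cross-NUMA read operation delay.
Optionally, the weak memory environment may be the running environment of a device with a non-uniform memory access (NUMA) architecture.
In a possible design, the method of the first aspect may further include: providing the result to the user, where the result may include a modification suggestion. The user can then modify the program manually according to the suggestion, which helps the user quickly locate the code that causes the error, complete the modification quickly, and modify the program more efficiently.
Optionally, the result may include "correct". Alternatively, the result may include "error" and/or a modification suggestion.
Optionally, the modification suggestion may include location information of the code to be modified, such as a code line number and/or a program file name, and may be used by the program detection apparatus to modify the program or by the user to modify it manually.
In a possible design, the method of the first aspect may further include: modifying the program according to the modification suggestion in response to a confirmation instruction from the user. This directly helps the user complete the modification and further improves efficiency.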
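In a second aspect, a program detection apparatus is provided. The apparatus includes a receiving unit and an obtaining unit. The receiving unit is configured to receive a program provided by a user; the obtaining unit is configured to obtain, according to a query parameter and the program, a result of running the program in a weak memory environment. The query parameter indicates the maximum interval within which two operations of the program can be reordered.
In a possible design, the obtaining unit is further configured to obtain the result according to the query parameter, the interval between a first operation and a second operation in a first thread of the program, and the interval between a third operation and a fourth operation in a second thread of the program. Optionally, the first and third operations are a read/write pair on the same variable, the second and fourth operations are a read/write pair on the same variable, the first and second operations access different variables, and the third and fourth operations access different variables.
In a possible design, the obtaining unit is further configured to use a memory read/write pattern query rule to check the first operation and the second operation in the first thread and the third operation and the fourth operation in the second thread, to obtain the result of running the program in the weak memory environment.
Optionally, the memory read/write pattern query rule may be determined according to the query parameter, and may be used to determine whether the first thread and the second thread overlap in time.
In a possible design, the memory read/write pattern query rule may include a read-read/write-write out-of-order query rule and/or a read-write/write-read out-of-order query rule.
In a possible design, the query parameter may be determined by the user and may include an on-chip write operation delay. Optionally, the query parameter may further include a cross-NUMA write operation delay and/or a cross-NUMA read operation delay. The memory read/write pattern query can thus be performed with the query parameters entered by the user.
In a possible design, the obtaining unit is further configured to obtain the value corresponding to the query parameter. The value may include the value of the on-chip write operation delay and, optionally, the value of the cross-NUMA write operation delay and/or the value of the cross-NUMA read operation delay.
Optionally, the weak memory environment may be the running environment of a device with a non-uniform memory access (NUMA) architecture.
In a possible design, the apparatus of the second aspect may further include an output unit configured to provide the result to the user, where the result may include a modification suggestion.
Optionally, the result may include "correct". Alternatively, the result may include "error" and/or a modification suggestion.
Optionally, the modification suggestion may include location information of the code to be modified, such as a code line number and/or a program file name, and may be used by the program detection apparatus to modify the program or by the user to modify it manually.
In a possible design, the obtaining unit is further configured to modify the program according to the modification suggestion in response to a confirmation instruction from the user.
In a possible design, the program detection apparatus may be deployed in a cloud server.
It should be noted that the receiving unit and the output unit may be provided separately or integrated into one module, namely a transceiver module. The present application does not specifically limit how the receiving unit and the output unit are implemented.
Optionally, the apparatus of the second aspect may further include a storage module that stores a program or instructions. When the obtaining unit executes the program or instructions, the apparatus can perform the program detection method of any possible implementation of the first aspect.
It should be noted that the apparatus of the second aspect may be a computer device, a server, or a cloud server, or may be a chip (system) or another part or component that can be installed in a computer device, a server, or a cloud server; the present application does not limit this.
For the technical effects of the apparatus of the second aspect, refer to the technical effects of the method of the first aspect; details are not repeated here.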
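In a third aspect, a program detection apparatus is provided. The apparatus includes a processor coupled to a memory; the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory, so that the apparatus performs the program detection method of any possible implementation of the first aspect.
In a possible design, the apparatus of the third aspect may further include a transceiver. The transceiver may be a transceiver circuit or an input/output port, and may be used by the apparatus to communicate with other devices.
In the present application, the apparatus of the third aspect may be a computer device, a server, or a cloud server, or a chip or chip system installed inside a computer device, a server, or a cloud server.
For the technical effects of the apparatus of the third aspect, refer to the technical effects of the method of any implementation of the first aspect; details are not repeated here.
In a fourth aspect, a chip system with a weak memory environment is provided. The chip system includes a processor and an input/output port. The processor is coupled to a memory containing instructions and controls the chip system to implement the processing functions involved in any implementation of the first aspect; the input/output port implements the transceiver functions involved in any implementation of the first aspect.
In a possible design, the chip system further includes a memory configured to store program instructions and data that implement the functions involved in the first aspect.
The chip system may consist of chips, or may include chips and other discrete devices.
In a fifth aspect, a computer-readable storage medium is provided. The medium stores computer instructions; when the computer instructions are run on a computer, the computer is caused to perform the program detection method of any possible implementation of the first aspect.
In a sixth aspect, a computer program product containing instructions is provided, including a computer program or instructions; when the computer program or instructions are run on a computer, the computer is caused to perform the program detection method of any possible implementation of the first aspect.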
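Brief Description of the Drawings
FIG. 1 is a first schematic flowchart of the program detection method provided by an embodiment of the present application;
FIG. 2 is a first schematic application diagram of the program detection method provided by an embodiment of the present application;
FIG. 3 is a second schematic flowchart of the program detection method provided by an embodiment of the present application;
FIG. 4 is a first schematic structural diagram of a CPU provided by an embodiment of the present application;
FIG. 5 is a first schematic interface diagram of the program detection apparatus provided by an embodiment of the present application;
FIG. 6 is an example diagram of a program provided by an embodiment of the present application;
FIG. 7 is a second schematic interface diagram of the program detection apparatus provided by an embodiment of the present application;
FIG. 8 is a first example diagram of the analysis of a program provided by an embodiment of the present application;
FIG. 9 is a second example diagram of the analysis of a program provided by an embodiment of the present application;
FIG. 10 is a second schematic application diagram of the program detection method provided by an embodiment of the present application;
FIG. 11 is a first schematic structural diagram of the program detection apparatus provided by an embodiment of the present application;
FIG. 12 is a second schematic structural diagram of the program detection apparatus provided by an embodiment of the present application.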
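Detailed Description
The technical solutions of the present application are described below with reference to the accompanying drawings.
The present application presents aspects, embodiments, and features in terms of systems that may include multiple devices, components, modules, and so on. It should be understood that each system may include additional devices, components, and modules, and/or may not include all of the devices, components, and modules discussed with reference to the drawings. Combinations of these solutions may also be used.
In addition, in the embodiments of the present application, words such as "exemplarily" and "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as an "example" in the present application should not be construed as preferred or advantageous over other embodiments or designs. Rather, the word "example" is intended to present a concept in a concrete manner.
In the embodiments of the present application, "operation instruction" and "operation" are sometimes used interchangeably; when the distinction is not emphasized, they have the same meaning. Likewise, "statement", "program statement", and "code" are sometimes used interchangeably and have the same meaning when the distinction is not emphasized.
In the description of the present application, unless otherwise stated, "a plurality of" means two or more. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may represent three cases: A alone, both A and B, and B alone.
First, for ease of understanding, the terms and concepts that may be involved in the embodiments of the present application are introduced below.
(1) Total store ordering (TSO) consistency model
The TSO consistency model describes that the multiple cores of a central processing unit (CPU) share one and only one global order of memory write operations. The TSO consistency model is a strong memory model.
An embodiment of the present application provides a program detection method that can be used to check whether a program runs correctly on a weak-memory-model platform, for example an Advanced RISC Machine (ARM) platform. The method can be used on its own or integrated with third-party software. The program detection apparatus provided by the embodiments of the present application may be a computer device, a server, a cloud server, or the like, or may be a chip or another component with a program detection function that is used in a computer device, a server, or a cloud server.
The program detection method provided by the embodiments of the present application is described in detail below with reference to FIG. 1 to FIG. 10.
FIG. 1 is a first schematic flowchart of the program detection method provided by an embodiment of the present application. The method can be used to check software developed in compiled languages such as C or C++.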
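As shown in FIG. 1, the program detection method includes the following steps:
S101: Receive a program provided by a user.
FIG. 2 is a first schematic application diagram of the program detection method provided by an embodiment of the present application.
As shown in FIG. 2, the program detection apparatus receives the program provided by the user. Taking C/C++ software as an example, the program may be C/C++ source code. The program may include one or more threads.
Optionally, the program detection method provided by the embodiments of the present application may include: using a compiler to convert the program provided by the user into an intermediate representation (IR).
For example, with the Clang/LLVM compiler and a user program named XXX.c, compiling the program produces the intermediate code, e.g. clang -emit-llvm -c -g XXX.c -o XXX.bc. Optionally, the program detection apparatus Weakmemcheck then analyzes the intermediate file XXX.bc of the program.
With reference to (a) or (b) of FIG. 3, the compiler compiles the program provided by the user into the intermediate code IR, and the program detection apparatus analyzes the intermediate code. The source code is thus analyzed statically without running the user's program, so the approach is non-intrusive to the user's software, keeps the user's program secure, and is convenient to operate.
S102: Obtain, according to a query parameter and the program, a result of running the program in a weak memory environment.
Optionally, the weak memory environment is the running environment of a device with a non-uniform memory access (NUMA) architecture.
For example, the weak memory environment may be the running environment of a weak-memory-model device, a weak-memory-model platform, or the like.
For example, the query parameter indicates the maximum interval within which two operations of the program can be reordered. Optionally, the query parameter may include a query parameter type and/or a value corresponding to the query parameter.
In some embodiments, the query parameter type may include an on-chip write operation delay. For example, the on-chip write operation delay may indicate the maximum interval within which two on-chip store instructions can be reordered.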
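For example, a store instruction saves data from a register to memory, and the on-chip write operation delay can be used to detect whether two or more store instructions within a NUMA node may be reordered. For NUMA nodes, refer to the detailed description of FIG. 4 below.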
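Optionally, the query parameter type may further include a cross-NUMA write operation delay and/or a cross-NUMA read operation delay. For example, the cross-NUMA write operation delay may indicate the maximum interval within which two cross-NUMA-node store instructions can be reordered, and the cross-NUMA read operation delay may indicate the maximum interval within which two cross-NUMA-node load instructions can be reordered. A cross-NUMA-node instruction is an instruction by which a processor of one NUMA node accesses the memory of another NUMA node.
For example, a load instruction reads data from memory into a register. The cross-NUMA write operation delay can be used to detect whether two or more store instructions between two or more NUMA nodes may be reordered, and the cross-NUMA read operation delay can be used to detect whether two or more load instructions between two or more NUMA nodes may be reordered. For the cross-NUMA write and read operation delays, refer to the detailed description of FIG. 4 below.
In some embodiments, the value corresponding to the query parameter may include the value of the on-chip write operation delay.
For example, if the value of the on-chip write operation delay is set to 8, the maximum interval within which two store instructions can be reordered is 8 memory operation instructions. That is, if the interval between two store instructions is less than or equal to 8 memory operation instructions, the two store instructions are considered liable to be reordered; if the interval is greater than 8 memory operation instructions, they are considered not to be reordered. Memory operation instructions may include store instructions and/or load instructions.
Optionally, the value corresponding to the query parameter may further include the value of the cross-NUMA write operation delay and/or the value of the cross-NUMA read operation delay.
For example, if the value of the cross-NUMA write operation delay is set to 8, the maximum interval within which two store instructions can be reordered is 8 memory operation instructions. That is, if the interval between two store instructions is less than or equal to 8 memory operation instructions, the two store instructions are considered liable to be reordered; if the interval is greater than 8 memory operation instructions, they are considered not to be reordered.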
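Similarly, if the value of the cross-NUMA read operation delay is set to 8, the maximum interval within which two load instructions can be reordered is 8 memory operation instructions. That is, if the interval between two load instructions is less than or equal to 8 memory operation instructions, the two load instructions are considered liable to be reordered; if the interval exceeds 8 memory operation instructions, they are considered not to be reordered.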
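The threshold test described above can be sketched in a few lines. This is a minimal illustration, not the apparatus's implementation; the function name `may_reorder` and its parameters are hypothetical, and the interval is assumed to be counted in memory operation instructions, as in the example with the value 8.

```python
def may_reorder(gap: int, max_gap: int) -> bool:
    """Return True if two same-kind memory instructions are treated as reorderable.

    gap     -- number of memory operation instructions between the two instructions
    max_gap -- value of the relevant query parameter (e.g. on-chip write delay)
    """
    # Within the window (inclusive): reordering is assumed possible.
    return gap <= max_gap


# With the example value 8: an interval of 8 is still inside the window, 9 is not.
print(may_reorder(8, 8), may_reorder(9, 8))
```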
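Optionally, the query parameter type may include an on-chip read operation delay, and the value corresponding to the query parameter may include the value of the on-chip read operation delay. For example, the on-chip read operation delay may indicate the maximum interval within which two load instructions can be reordered, and can be used to detect whether two or more load instructions within a NUMA node may be reordered. The on-chip read operation delay works in the same way as the on-chip write operation delay described above; details are not repeated here.
It should be noted that, taking a program that includes memory operation instruction 1 and memory operation instruction 2 as an example, reordering may mean that the execution order of the two instructions changes. For example, the program code specifies that memory operation instruction 1 executes before memory operation instruction 2, but when the program runs in a weak memory environment, the characteristics of that environment cause memory operation instruction 2 to execute first, followed by memory operation instruction 1.
In some embodiments, the memory read/write pattern query may be performed according to the query parameter. The query parameter describes the on-chip and off-chip memory read/write delays of the different cores of the CPU hardware and is used to construct the memory read/write pattern query rule.
Optionally, the query parameter may be determined according to the structure of the CPU.
FIG. 4 is a first schematic structural diagram of a CPU provided by an embodiment of the present application.
As shown in FIG. 4, the CPU includes two memories, memory 0 (memory0) and memory 1 (memory1), each with its own cores and caches. memory0 corresponds to core0, core1, cache0, and cache1; there is a store buffer (Store Buffer) between core0 and cache0, and another between core1 and cache1. memory1 corresponds to core2, core3, cache2, and cache3; there is a store buffer between core2 and cache2, and another between core3 and cache3. Optionally, a cross-NUMA load buffer (NUMA Load Buffer) and/or a cross-NUMA store buffer (NUMA Store Buffer) sits between memory0 and memory1. For example, core0 and core1 can read memory1 through the cross-NUMA load buffer and write memory1 through the cross-NUMA store buffer. Similarly, core2 and core3 can read memory0 through the cross-NUMA load buffer and write memory0 through the cross-NUMA store buffer.
With reference to FIG. 4, the architecture comprising memory0, core0, core1, cache0, cache1, and the corresponding store buffers may be called NUMA node 0. A read by core0 or core1 of NUMA node 0 from the memory (memory0) or caches (cache0, cache1, and the corresponding store buffers) of NUMA node 0 is called an on-chip read operation; for example, a read of memory0 by core0 is an on-chip read operation. A write by core0 or core1 of NUMA node 0 to the memory (memory0) or caches (cache0, cache1, and the corresponding store buffers) of NUMA node 0 is called an on-chip write operation; for example, a write to memory0 by core0 is an on-chip write operation.
Similarly, the architecture comprising memory1, core2, core3, cache2, cache3, and the corresponding store buffers may be called NUMA node 1. A read by core2 or core3 of NUMA node 1 from the memory (memory1) or caches (cache2, cache3, and the corresponding store buffers) of NUMA node 1 is called an on-chip read operation; for example, a read of memory1 by core2 is an on-chip read operation. A write by core2 or core3 of NUMA node 1 to the memory (memory1) or caches (cache2, cache3, and the corresponding store buffers) of NUMA node 1 is called an on-chip write operation; for example, a write to memory1 by core2 is an on-chip write operation.
For example, a memory operation between NUMA node 0 and NUMA node 1 is called a cross-NUMA memory operation, where memory operations include read operations and/or write operations. A read of memory1 in NUMA node 1 by core0 in NUMA node 0 is a cross-NUMA read operation, and a write to memory1 in NUMA node 1 by core0 in NUMA node 0 is a cross-NUMA write operation. Similarly, a read of memory0 in NUMA node 0 by core2 in NUMA node 1 is a cross-NUMA read operation, and a write to memory0 in NUMA node 0 by core2 in NUMA node 1 is a cross-NUMA write operation; the remaining cases are not listed one by one.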
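The on-chip versus cross-NUMA terminology can be made concrete with a small lookup. The sketch below only encodes the two-node layout of FIG. 4 as described in the text; the dictionaries and the function `classify_access` are illustrative names, not part of the application.

```python
# Topology from FIG. 4: which NUMA node each core and each memory belongs to.
CORE_NODE = {"core0": 0, "core1": 0, "core2": 1, "core3": 1}
MEMORY_NODE = {"memory0": 0, "memory1": 1}


def classify_access(core: str, memory: str, kind: str) -> str:
    """Label a memory operation as on-chip or cross-NUMA.

    kind is "read" or "write".
    """
    same_node = CORE_NODE[core] == MEMORY_NODE[memory]
    scope = "on-chip" if same_node else "cross-NUMA"
    return f"{scope} {kind}"


print(classify_access("core0", "memory0", "read"))   # same node
print(classify_access("core2", "memory0", "write"))  # different nodes
```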
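For example, with reference to FIG. 4, if the CPU includes a store buffer (Store Buffer), the on-chip write operation delay and/or its value may be used. If the CPU includes a cross-NUMA load buffer (NUMA Load Buffer), the cross-NUMA read operation delay and/or its value may be used. If the CPU includes a cross-NUMA store buffer (NUMA Store Buffer), the cross-NUMA write operation delay and/or its value may be used. Optionally, if the CPU includes a load buffer (Load Buffer), the on-chip read operation delay and/or its value may be used. A load buffer may sit between core0 and cache0 and, similarly, between core1 and cache1, between core2 and cache2, and between core3 and cache3 (not shown in FIG. 4).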
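In some embodiments, the query parameter may be determined by the program detection apparatus.
For example, the program detection apparatus may determine the query parameter type, such as using the on-chip write operation delay, or using the cross-NUMA write operation delay and/or the cross-NUMA read operation delay.
For example, the program detection apparatus may determine the value corresponding to the query parameter. The apparatus may measure the value with a test set; for instance, it may determine the value of the on-chip write operation delay, and may also determine the value of the cross-NUMA write operation delay and/or the value of the cross-NUMA read operation delay.
In some embodiments, the query parameter may be determined by the user.
With reference to FIG. 2, the program detection apparatus may analyze the program according to the query parameter determined by the user and obtain the result of running the program in the weak memory environment.
Optionally, the user may determine the query parameter type.
For example, the user may enter the query parameter in the display interface of the program detection apparatus. FIG. 5 is a first schematic interface diagram of the program detection apparatus provided by an embodiment of the present application. With reference to (a) of FIG. 5, the user may enter the instruction corresponding to the query parameter in the query parameter input area of the display interface. For example, the user may enter StoreBuffer, the instruction corresponding to the on-chip write operation delay. Optionally, the user may also enter NUMAStoreSize, the instruction corresponding to the cross-NUMA write operation delay, and/or NUMALoadSize, the instruction corresponding to the cross-NUMA read operation delay.
Alternatively, the display interface of the program detection apparatus may include a check box for the on-chip write operation delay, and may also include a check box for the cross-NUMA write operation delay and/or a check box for the cross-NUMA read operation delay. The user selects the query parameter types to use by ticking the corresponding check boxes. With reference to (b) of FIG. 5, the user may tick the check box for the on-chip write operation delay to use it. Similarly, the user may tick the check box for the cross-NUMA write operation delay to use it. If the cross-NUMA read operation delay is not used, its check box can be left untouched.
Optionally, the user may determine the value corresponding to the query parameter.
With reference to (a) or (b) of FIG. 5, the user may enter a value in the value input area of each query parameter type. For example, the user may enter 8 in the value input area of the on-chip write operation delay. Similarly, the user may enter 8 in the value input area of the cross-NUMA write operation delay and 8 in the value input area of the cross-NUMA read operation delay.
It should be noted that the above are only examples; the embodiments of the present application do not limit how the query parameter is determined. For example, the query parameter type may be determined by the program detection apparatus and the value by the user. Alternatively, the type may be determined by the user and the value by the program detection apparatus. Alternatively, after the program detection apparatus determines the type and/or the value, the user may modify the type and/or the value.
In a possible design, the program detection method provided by the embodiments of the present application may further include: obtaining the value corresponding to the query parameter.
Optionally, the value corresponding to the query parameter may be preset, and/or may be provided by the user through the value input area of the corresponding query parameter type in the display interface.
In a possible design, S102 may include: obtaining the result of running the program in the weak memory environment according to the query parameter, the interval between a first operation and a second operation in a first thread of the program, and the interval between a third operation and a fourth operation in a second thread of the program.
Optionally, the first operation and the third operation may be a read/write pair on the same variable, the second operation and the fourth operation may be a read/write pair on the same variable, the first and second operations may access different variables, and the third and fourth operations may access different variables. For example, the variable may be a shared variable. A pair of operations may be: the first operation reads a first variable and the third operation writes the first variable; or the first operation writes the first variable and the third operation reads the first variable. Similarly, the second operation reads a second variable and the fourth operation writes the second variable; or the second operation writes the second variable and the fourth operation reads the second variable. The first variable may be a global variable, a shared variable, or the like, and so may the second variable.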
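FIG. 6 is an example diagram of a program provided by an embodiment of the present application.
As shown in FIG. 6, the program includes a first thread thread1 at line 14 and a second thread thread2 at line 21. The interval between the write to the variable population in thread1 (line 17) and the write to the variable sync_flag in thread1 (line 18) can be computed, as can the interval between the read of the variable sync_flag in thread2 (line 23) and the read of the variable population in thread2 (line 24). The result of running the program in the weak memory environment is obtained from the query parameter and these two intervals.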
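In a possible design, the program detection method provided by the embodiments of the present application may further include: using a memory read/write pattern query rule to check the first operation and the second operation in the first thread of the program and the third operation and the fourth operation in the second thread of the program, to obtain the result of running the program in the weak memory environment.
For the first, second, third, and fourth operations, refer to the description above; details are not repeated here.
Optionally, the memory read/write pattern query rule may be determined according to the query parameter, and may be used to determine whether the first thread and the second thread overlap in time.
With reference to FIG. 6, the memory read/write pattern query rule may be applied to thread1 and thread2 of the program shown in FIG. 6 to obtain the result of running the program in the weak memory environment.
For example, the memory read/write pattern query rule may include a read-read/write-write out-of-order query rule and/or a read-write/write-read out-of-order query rule.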
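For example, the read-read/write-write out-of-order query rule may include one or more of the following conditions: the first operation instruction and the second operation instruction are two adjacent operation instructions in the first thread, the first being a write to a first variable and the second a write to a second variable, and the third operation instruction and the fourth operation instruction are two adjacent operation instructions in the second thread, the third being a read of the first variable and the fourth a read of the second variable; the first and third operation instructions are in a race relationship, and the second and fourth operation instructions are in a race relationship; the interval between the first and second operation instructions is less than the sum of the value of the on-chip write operation delay and the value of the cross-NUMA write operation delay; and the interval between the third and fourth operation instructions is less than the value of the cross-NUMA read operation delay. The read-read/write-write out-of-order query rule can thus be used to check the result, in a weak memory environment, of a program that has at least two threads, one performing read-read operations and the other performing write-write operations.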
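The two interval conditions of the read-read/write-write rule can be sketched as a single predicate. This is a hedged illustration under stated assumptions: the structural conditions (adjacent instructions, matching variables, race relationship) are assumed to have been established by earlier analysis, and the function and parameter names are hypothetical, loosely echoing the StoreBuffer, NUMAStoreSize, and NUMALoadSize inputs.

```python
def rr_ww_rule_hit(write_gap: int, read_gap: int,
                   store_buffer: int, numa_store: int, numa_load: int) -> bool:
    """Interval part of the read-read/write-write out-of-order query rule.

    write_gap -- interval between the two writes in the first thread
    read_gap  -- interval between the two reads in the second thread
    """
    # Writes may be reordered within the on-chip plus cross-NUMA write window.
    writes_may_reorder = write_gap < store_buffer + numa_store
    # Reads may be reordered within the cross-NUMA read window.
    reads_may_reorder = read_gap < numa_load
    return writes_may_reorder and reads_may_reorder


# Adjacent statements (interval 1) with all three parameters set to 8.
print(rr_ww_rule_hit(1, 1, store_buffer=8, numa_store=8, numa_load=8))
```

A hit means the statements involved are candidates for being flagged; a program that keeps either pair of accesses far enough apart does not match the rule.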
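For example, the read-write/write-read out-of-order query rule may include one or more of the following conditions: the fifth operation instruction and the sixth operation instruction are two adjacent operation instructions in the first thread, the fifth being a write to a third variable and the sixth a read of a fourth variable, and the seventh operation instruction and the eighth operation instruction are two adjacent operation instructions in the second thread, the seventh being a read of the third variable and the eighth a write to the fourth variable; the fifth and seventh operation instructions are in a race relationship, and the sixth and eighth operation instructions are in a race relationship; the interval between the fifth and sixth operation instructions is less than the sum of the values of the on-chip write operation delay, the cross-NUMA write operation delay, and the cross-NUMA read operation delay, and the interval between the seventh and eighth operation instructions is less than the sum of the values of the on-chip write operation delay, the cross-NUMA write operation delay, and the cross-NUMA read operation delay. The read-write/write-read out-of-order query rule can thus be used to check the result, in a weak memory environment, of a program that has at least two threads, one performing read-write operations and the other performing write-read operations.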
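The interval part of the read-write/write-read rule can be sketched in the same way. As before, this is only an illustration with hypothetical names. It assumes that the second interval is the one between the seventh and eighth operation instructions (the two instructions of the second thread), and that the structural and race conditions have already been checked.

```python
def rw_wr_rule_hit(first_thread_gap: int, second_thread_gap: int,
                   store_buffer: int, numa_store: int, numa_load: int) -> bool:
    """Interval part of the read-write/write-read out-of-order query rule.

    first_thread_gap  -- interval between the write and the read in the first thread
    second_thread_gap -- interval between the read and the write in the second thread
                         (assumed to be the seventh and eighth instructions)
    """
    # Both threads are compared against the same combined window.
    window = store_buffer + numa_store + numa_load
    return first_thread_gap < window and second_thread_gap < window


# With all parameters set to 8 the combined window is 24.
print(rw_wr_rule_hit(23, 1, store_buffer=8, numa_store=8, numa_load=8))
```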
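Specifically, a race relationship means that the order in which two operation instructions execute affects the result. The third variable may be a global variable, a shared variable, or the like, and so may the fourth variable.
Program statements that satisfy the above memory read/write pattern query rule may be identified as dangerous statements, that is, statements whose results may be wrong in a weak memory environment because of instruction reordering.
Thus, if the program satisfies the memory read/write pattern query rule, the result is "error" (for example, the program is prone to weak-memory-ordering problems); if it does not, the result is "correct" (for example, the program runs correctly on a weak-memory-model platform). The user does not need to test the program repeatedly, and the result of running the program in the weak memory environment is obtained quickly.
In a possible design, the program detection method provided by the embodiments of the present application may further include: providing the result to the user. Optionally, the result may include "correct". Alternatively, the result may include "error" and/or a modification suggestion.
For example, "correct" may indicate that the program can run correctly in the weak memory environment, and "error" may indicate that running the program in the weak memory environment leads to abnormal behavior such as a crash, an exit, or wrong computation results. The modification suggestion may include location information of the code to be modified, such as a code line number and/or a program file name, and may be used by the program detection apparatus to modify the program or by the user to modify it manually, so that the program runs correctly in the weak memory environment. This helps the user quickly locate the code that causes the error, complete the modification quickly, and modify the program more efficiently.
Optionally, if the result is "correct", the detection result may be shown to the user through the display interface. If the result is "error", the detection result and/or the modification suggestion may be shown to the user through the display interface.
FIG. 7 is a second schematic interface diagram of the program detection apparatus provided by an embodiment of the present application. As shown in FIG. 7, if the result is "error", the display interface may show that the result is "error", the corresponding program file name, the code line number of the erroneous statement, and so on.
In a possible design, the program detection method provided by the embodiments of the present application may further include: modifying the program according to the modification suggestion in response to a confirmation instruction from the user.
With reference to FIG. 7, if the user selects the confirm-repair area in the display interface, the program detection apparatus may modify the program according to the modification suggestion in response to the user's confirmation instruction. Alternatively, after selecting the do-not-repair area in the display interface, the user may modify the program manually according to the modification suggestion. This directly helps the user complete the modification and further improves efficiency.
With the program detection method described with reference to FIG. 1, the program detection apparatus receives the program provided by the user and obtains, from the query parameter and the program, the result of running the program in the weak memory environment. This helps the user quickly complete the check, improves detection efficiency, and places low demands on the user's expertise.
In some embodiments, with reference to (a) or (b) of FIG. 3, the program detection method provided by the embodiments of the present application may further include the following steps one to eight. The description below uses a program that includes a first thread and a second thread as an example.
Step one: analyze the alias relationships of all of the variables in the program.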
With reference to FIG. 6 and FIG. 8, analyzing the alias relationships of all variables in the program shown in FIG. 6 yields the alias relationships shown in FIG. 8: {Beijing.Population Alias my_city->population} and {Beijing.sync_flag Alias my_city->sync_flag}.
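Step two: perform variable dependency analysis on the program.
For example, the function relationships and variable dependencies in the program provided by the user are analyzed, and the related thread relationship graph and function call graph are built. The multithreading-related application programming interfaces (APIs) used in the program are analyzed, and a thread data structure based on the thread call context and the function call context is constructed to distinguish the contexts of operations in different threads and to analyze variable dependencies. Multithreading-related APIs may include thread-related functions such as pthread_create and pthread_mutex_lock.
With reference to FIG. 6 and FIG. 8, variable dependency analysis of the program shown in FIG. 6 yields the main function main (with its functions and instructions), the first thread thread1 (with its functions and instructions), and the second thread thread2 (with its functions and instructions) shown in FIG. 8.
Decomposing the program in this way sorts out the function call context of the first thread and that of the second thread, and enables a thread analysis method based on the thread call context and the function call context.
Step three: analyze the first variable in the program.
For example, one or more first variables in the program are analyzed, and the thread call context and function call context of each access to each first variable are distinguished. A first variable may be a variable used by both the first thread and the second thread, and may be a shared variable, a global variable, or the like. This enables a method for identifying shared-variable access points based on the thread call context and the function call context. For example, the search for shared variables may be based on an alias analysis algorithm to improve efficiency, and context encoding may be used when analyzing the shared variables to improve precision.
With reference to FIG. 6 and FIG. 8, shared-variable analysis of the program shown in FIG. 6 yields the variable search results shown in FIG. 8: my_city->population:{line 17,24,31} and my_city->sync_flag:{line 18,23,30}.
Step four: analyze the lock variables in the program.
For example, each statement in the program is analyzed to determine whether it uses a lock variable. The one or more mutex variables used by both the first thread and the second thread are analyzed, and the thread call context and function call context of each mutex variable are distinguished. Context encoding may be used when analyzing the lock variables to improve precision.
With reference to FIG. 6 and FIG. 8, lock-variable analysis is performed on the program shown in FIG. 6. Because that program contains no lock variables, FIG. 8 contains no lock-variable analysis results.
Step five: analyze the possibility that the basic blocks containing the first variable execute in parallel.
For example, a static vector time algorithm may be used to analyze, for every statement in every basic block (BB) that contains the first variable, whether it may happen in parallel (MHP) with other statements. A vector timestamp is constructed for each basic block that contains a read or write of the first variable, simulating the relative logical time at which the basic block executes within the program or the thread. This enables a parallelism analysis at basic-block granularity based on the whole-program thread call context and function call context.
With reference to FIG. 6 and FIG. 8, MHP analysis of the program shown in FIG. 6 yields the vector timestamp construction results of the first thread thread1 and the second thread thread2 shown in FIG. 8. The statement execution times of thread1 and thread2 overlap, so statements of thread1 and statements of thread2 may execute concurrently.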
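The text does not spell out the static vector time algorithm, so the following is only a sketch of the standard vector-clock comparison that such an analysis could rest on: two basic blocks may happen in parallel when neither timestamp is ordered before the other. The function names and the example timestamps are assumptions made for illustration.

```python
def happens_before(a: tuple, b: tuple) -> bool:
    """True if vector timestamp a is ordered strictly before b."""
    # a precedes b when every component is <= and the timestamps differ.
    return all(x <= y for x, y in zip(a, b)) and a != b


def may_happen_in_parallel(a: tuple, b: tuple) -> bool:
    """True if neither timestamp is ordered before the other."""
    return not happens_before(a, b) and not happens_before(b, a)


# Hypothetical timestamps, one component per thread (thread1, thread2).
block_in_thread1 = (2, 0)
block_in_thread2 = (1, 1)
print(may_happen_in_parallel(block_in_thread1, block_in_thread2))
```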
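FIG. 9 is a second example diagram of the analysis of a program provided by an embodiment of the present application. FIG. 9 shows the analysis result for the program shown in FIG. 6 (for example, a file named weekConsistency.c).
As shown in FIG. 9: Note 1: ST means Static Thread. Note 2: The call string of each variable is recorded precisely. Note 3: The style of a call string is A(b)->C(d)->..., where A is the caller and b is the call instruction. Note 4: A(b)->C means that A calls C by instruction b.
Applying step five to the program shown in FIG. 6 (for example, a file named weekConsistency.c) yields the following information. One pair consists of the write to the global variable population in thread1 (line 17 of the source file weekConsistency.c) and the read of the global variable population in thread2 (line 24 of weekConsistency.c). Another pair consists of the write to the global variable sync_flag in thread1 (line 18 of weekConsistency.c) and the read of the global variable sync_flag in thread2 (line 23 of weekConsistency.c). The time interval between lines 17 and 18 of thread1 overlaps the time interval between lines 23 and 24 of thread2, so statements of thread1 and statements of thread2 may execute concurrently.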
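It should be noted that steps three to five analyze the inside of a single thread and can be performed separately for the first thread and the second thread of the program. The present application does not limit the specific implementation. For example, steps three to five may be performed for the first thread first and then for the second thread. Alternatively, step three may be performed for the first thread and then for the second thread, and similarly step four may be performed for the first thread and then for the second thread.
It should be noted that the embodiments of the present application do not limit the order of steps three to five. With reference to (b) of FIG. 3, steps three, four, and five may run in parallel, and their execution order does not affect the result of each step.
Step six: perform the memory read/write pattern query on the program.
For the specific implementation of step six, refer to the description above of using the memory read/write pattern query rule to check the first operation and the second operation in the first thread and the third operation and the fourth operation in the second thread to obtain the result of running the program in the weak memory environment; details are not repeated here.
For example, with reference to (b) of FIG. 3, the statements of the program may be queried using the results of steps three to five to obtain a first statement. The first statement includes a read or write of a variable where other accesses to the same variable exist, the read or write is not protected by a lock, the read or write may be executed concurrently by the first thread and the second thread, and the reads and writes may form a race relationship.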
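The way step six combines the results of steps three to five can be sketched as a filter over access records. This is a simplified illustration, not the apparatus's algorithm: the `Access` record, the `candidate_pairs` function, and the treatment of lock protection (a pair counts as protected only if both accesses hold a lock) are assumptions. The access list mirrors the line numbers quoted for FIG. 6.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Access:
    thread: str   # thread performing the access
    line: int     # source line number
    var: str      # shared variable accessed
    kind: str     # "read" or "write"
    locked: bool  # whether the access is protected by a lock


def candidate_pairs(accesses: List[Access],
                    mhp: Callable[[Access, Access], bool]) -> List[Tuple[int, int]]:
    """Return line-number pairs that satisfy all four conditions of the query."""
    pairs = []
    for i, a in enumerate(accesses):
        for b in accesses[i + 1:]:
            same_var = a.var == b.var and a.thread != b.thread
            has_write = "write" in (a.kind, b.kind)   # needed for a race
            unprotected = not (a.locked and b.locked)
            if same_var and has_write and unprotected and mhp(a, b):
                pairs.append((a.line, b.line))
    return pairs


accesses = [
    Access("thread1", 17, "population", "write", False),
    Access("thread1", 18, "sync_flag", "write", False),
    Access("thread2", 23, "sync_flag", "read", False),
    Access("thread2", 24, "population", "read", False),
]
# Assume the MHP analysis found that all of these statements may run concurrently.
print(candidate_pairs(accesses, lambda a, b: True))
```

The pairs returned here are only candidates; whether they are reported depends on the interval conditions of the memory read/write pattern query rules.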
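With reference to FIG. 6 and FIG. 8, the memory read/write pattern query is performed on the program shown in FIG. 6. Because the program shown in FIG. 6 contains no dangerous statements, the result shown in FIG. 8 may include "correct".
Step seven: obtain the result of running the program in the weak memory environment.
For example, the first statement is marked as a dangerous statement (also called dangerous code), and the modification suggestion may be determined from the dangerous code and the type of operation it contains. The modification suggestion may include the code line number and/or program file name of the dangerous code, and the operation type may be a write operation or a read operation.
Step eight: provide the result to the user.
Optionally, the result may include "correct". Alternatively, the result may include "error" and/or a modification suggestion. For the specific implementation of step eight, refer to the related description above; details are not repeated here.
Optionally, the dangerous code may be shown in a font color different from that of the other code in the program and displayed to the user through the display interface.
FIG. 10 is a second schematic application diagram of the program detection method provided by an embodiment of the present application. As shown in FIG. 10, suppose program A provided by the user runs on a strong-memory-model platform. The program detection apparatus checks program A with the program detection method provided by the embodiments of the present application and obtains the result "error". The user manually, or the program detection apparatus automatically, modifies program A, for example by inserting memory barrier instructions to fix the problem. After compilation, the resulting program A' can run correctly on a weak-memory-model platform, which improves the efficiency of program detection and modification.
With the program detection method shown in (a) or (b) of FIG. 3, the reads and writes of global variables in the user's program and their relative timing can be analyzed statically to determine whether the program will have problems in a weak memory environment and where the problematic code is located, and the user can be assisted with automatic or manual repair, which improves the efficiency of program detection and repair.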
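The program detection method provided by the embodiments of the present application has been described in detail above with reference to FIG. 1 to FIG. 10. The program detection apparatus provided by the embodiments of the present application is described in detail below with reference to FIG. 11 and FIG. 12.
FIG. 11 is a first schematic structural diagram of the program detection apparatus provided by an embodiment of the present application. The apparatus can perform the functions of the program detection apparatus in the program detection method shown in FIG. 1. For ease of description, FIG. 11 shows only the main components of the apparatus.
As shown in FIG. 11, the program detection apparatus 1100 includes a receiving unit 1101 and an obtaining unit 1102. The receiving unit 1101 is configured to receive a program provided by a user. The obtaining unit 1102 is configured to obtain, according to a query parameter and the program, a result of running the program in a weak memory environment. The query parameter indicates the maximum interval within which two operations of the program can be reordered.
In a possible design, the obtaining unit 1102 is further configured to obtain the result according to the query parameter, the interval between a first operation and a second operation in a first thread of the program, and the interval between a third operation and a fourth operation in a second thread of the program. Optionally, the first and third operations are a read/write pair on the same variable, the second and fourth operations are a read/write pair on the same variable, the first and second operations access different variables, and the third and fourth operations access different variables.
In a possible design, the obtaining unit 1102 is further configured to use a memory read/write pattern query rule to check the first operation and the second operation in the first thread and the third operation and the fourth operation in the second thread, to obtain the result of running the program in the weak memory environment.
Optionally, the memory read/write pattern query rule may be determined according to the query parameter, and may be used to determine whether the first thread and the second thread overlap in time.
In a possible design, the memory read/write pattern query rule may include a read-read/write-write out-of-order query rule and/or a read-write/write-read out-of-order query rule.
In a possible design, the query parameter may be determined by the user and may include an on-chip write operation delay.
Optionally, the query parameter may further include a cross-NUMA write operation delay and/or a cross-NUMA read operation delay. The memory read/write pattern query can thus be performed with the query parameters entered by the user.
In a possible design, the obtaining unit 1102 is further configured to obtain the value corresponding to the query parameter. The value may include the value of the on-chip write operation delay.
Optionally, the value corresponding to the query parameter may further include the value of the cross-NUMA write operation delay and/or the value of the cross-NUMA read operation delay.
Optionally, the weak memory environment may be the running environment of a device with a non-uniform memory access (NUMA) architecture.
In a possible design, the program detection apparatus 1100 may further include an output unit 1103 configured to provide the result to the user, where the result may include a modification suggestion.
Optionally, the result may include "correct". Alternatively, the result may include "error" and/or a modification suggestion.
Optionally, the modification suggestion may include location information of the code to be modified, such as a code line number and/or a program file name, and may be used by the program detection apparatus to modify the program or by the user to modify it manually.
In a possible design, the obtaining unit 1102 is further configured to modify the program according to the modification suggestion in response to a confirmation instruction from the user.
In a possible design, the program detection apparatus 1100 may be deployed in a cloud server.
It should be noted that the receiving unit 1101 and the output unit 1103 may be provided separately or integrated into one module, namely a transceiver module (not shown in FIG. 11). The present application does not specifically limit how the receiving unit 1101 and the output unit 1103 are implemented.
Optionally, the program detection apparatus 1100 may further include a storage module (not shown in FIG. 11) that stores a program or instructions. When the obtaining unit 1102 executes the program or instructions, the program detection apparatus 1100 can perform the functions of the program detection apparatus in the program detection method shown in FIG. 1.
It should be noted that the program detection apparatus 1100 may be a computer device, a server, or a cloud server, or may be a chip (system) or another part or component that can be installed in a computer device, a server, or a cloud server; the present application does not limit this.
For the technical effects of the program detection apparatus 1100 shown in FIG. 11, refer to the technical effects of the program detection method shown in FIG. 1; details are not repeated here.
FIG. 12 is a second schematic structural diagram of the program detection apparatus provided by an embodiment of the present application. The apparatus may be a computer device, a server, or a cloud server, or may be a chip (system) or another part or component that can be installed in a computer device, a server, or a cloud server; the present application does not limit this.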
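As shown in FIG. 12, the program detection apparatus 1200 may include a processor 1201. Optionally, the program detection apparatus 1200 may further include a memory 1202 and a transceiver 1203. The processor 1201 is coupled to the memory 1202 and the transceiver 1203, for example through a communication bus.
The components of the program detection apparatus 1200 are described in detail below with reference to FIG. 12:
The processor 1201 is the control center of the program detection apparatus 1200, and may be one processor or a collective term for multiple processing elements. For example, the processor 1201 may be one or more central processing units (CPUs), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application, such as one or more digital signal processors (DSPs) or one or more field programmable gate arrays (FPGAs).
The processor 1201 can perform the various functions of the program detection apparatus 1200 by running or executing the software program stored in the memory 1202 and calling the data stored in the memory 1202.
In a specific implementation, as an embodiment, the processor 1201 may include one or more CPUs, such as CPU0 and CPU1 shown in FIG. 12.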
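In a specific implementation, as an embodiment, the program detection apparatus 1200 may also include multiple processors, such as the processor 1201 and the processor 1204 shown in FIG. 12. Each of these processors may be a single-core processor (single-CPU) or a multi-core processor (multi-CPU). A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
The memory 1202 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions; a random access memory (RAM) or another type of dynamic storage device that can store information and instructions; or an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or another magnetic storage device, or any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without being limited thereto. The memory 1202 may be integrated with the processor 1201, or may exist independently and be coupled to the processor 1201 through an input/output port (not shown in FIG. 12) of the program detection apparatus 1200; this is not specifically limited in the embodiments of the present application.
The memory 1202 is configured to store the software program that executes the solution of the present application, and execution is controlled by the processor 1201. For the specific implementation, refer to the method embodiments above; details are not repeated here.
The transceiver 1203 is used for communication with other devices. In addition, the transceiver 1203 may include a receiver and a transmitter (not shown separately in FIG. 12). The receiver implements the receiving function, and the transmitter implements the sending function. The transceiver 1203 may be integrated with the processor 1201, or may exist independently and be coupled to the processor 1201 through an input/output port (not shown in FIG. 12) of the program detection apparatus 1200; this is not specifically limited in the embodiments of the present application.
It should be noted that the structure of the program detection apparatus 1200 shown in FIG. 12 does not limit the program detection apparatus; an actual program detection apparatus may include more or fewer components than shown, combine some components, or arrange the components differently.
An embodiment of the present application provides a chip system. The chip system includes a processor and an input/output port; the processor implements the processing functions involved in the above method embodiments, and the input/output port implements the transceiver functions involved in the above method embodiments.
In a possible design, the chip system further includes a memory configured to store program instructions and data that implement the functions involved in the above method embodiments.
The chip system may consist of chips, or may include chips and other discrete devices.
An embodiment of the present application provides a computer-readable storage medium that stores computer instructions; when the computer instructions are run on a computer, the computer is caused to perform the program detection method described in the above method embodiments.
An embodiment of the present application provides a computer program product containing instructions, including a computer program or instructions; when the computer program or instructions are run on a computer, the computer is caused to perform the program detection method described in the above method embodiments.
It should be understood that the processor in the embodiments of the present application may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
It should also be understood that the memory in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM).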
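The above embodiments may be implemented in whole or in part by software, hardware (e.g., circuits), firmware, or any combination thereof. When implemented in software, the above embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless (e.g., infrared, radio, or microwave) manner. The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that contains one or more sets of usable media. The usable medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium. The semiconductor medium may be a solid-state drive.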
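It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may represent three cases: A alone, both A and B, and B alone, where A and B may be singular or plural. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects, but may also indicate an "and/or" relationship, depending on the context.
In the present application, "at least one" means one or more, and "a plurality of" means two or more. "At least one of the following items" or a similar expression refers to any combination of these items, including any combination of a single item or a plurality of items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above processes do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present application in any way.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may refer to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between apparatuses or units through some interfaces, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement that a person skilled in the art can readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.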

Claims (19)

  1. 一种程序检测方法,其特征在于,包括:
    接收用户提供的程序;
    根据查询参数和所述程序,获得所述程序运行在弱内存环境上的结果;其中,所述查询参数用于指示所述程序的两条操作发生重排序的最大间隔。
  2. 根据权利要求1所述的程序检测方法,其特征在于,所述根据查询参数和所述程序,获得所述程序运行在弱内存环境上的结果,包括:
    根据所述查询参数、所述程序的第一线程中的第一操作与所述第一线程中的第二操作的间隔、和所述程序的第二线程中的第三操作与所述第二线程中的第四操作的间隔,获得所述程序运行在所述弱内存环境上的结果;其中,所述第一操作和所述第三操作为对同一变量进行的一对读写操作,所述第二操作和所述第四操作为对同一变量进行的一对读写操作,所述第一操作和所述第二操作为对不同变量进行的操作,所述第三操作和所述第四操作为对不同变量进行的操作。
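  3. The program detection method according to claim 2, characterized by further comprising:
    detecting, by using a memory read/write pattern query rule, the first operation in the first thread of the program, the second operation in the first thread, the third operation in the second thread of the program, and the fourth operation in the second thread, to obtain the result of running the program in the weak memory environment.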
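  4. The program detection method according to claim 3, characterized in that the memory read/write pattern query rule is determined based on the query parameter, and the memory read/write pattern query rule is used to determine whether the first thread and the second thread overlap in time.
  5. The program detection method according to any one of claims 1 to 4, characterized in that the query parameter is determined by the user, and the query parameter comprises an on-chip write operation latency.
  6. The program detection method according to any one of claims 1 to 5, characterized in that the weak memory environment is a running environment corresponding to a running device with a non-uniform memory access (NUMA) architecture.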
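  7. The program detection method according to any one of claims 1 to 6, characterized by further comprising:
    providing the result to the user, wherein the result comprises a modification suggestion.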
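  8. The program detection method according to claim 7, characterized by further comprising:
    modifying, in response to a confirmation indication from the user, the program according to the modification suggestion.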
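  9. A program detection apparatus, characterized by comprising a receiving unit and an obtaining unit, wherein
    the receiving unit is configured to receive a program provided by a user; and
    the obtaining unit is configured to obtain, based on a query parameter and the program, a result of running the program in a weak memory environment, wherein the query parameter indicates a maximum interval at which two operations of the program are reordered.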
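  10. The program detection apparatus according to claim 9, characterized in that
    the obtaining unit is further configured to obtain the result of running the program in the weak memory environment based on the query parameter, an interval between a first operation in a first thread of the program and a second operation in the first thread, and an interval between a third operation in a second thread of the program and a fourth operation in the second thread, wherein the first operation and the third operation are a pair of read and write operations performed on a same variable, the second operation and the fourth operation are a pair of read and write operations performed on a same variable, the first operation and the second operation are operations performed on different variables, and the third operation and the fourth operation are operations performed on different variables.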
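  11. The program detection apparatus according to claim 9, characterized in that
    the obtaining unit is further configured to detect, by using a memory read/write pattern query rule, a first operation in a first thread of the program, a second operation in the first thread, a third operation in a second thread of the program, and a fourth operation in the second thread, to obtain the result of running the program in the weak memory environment.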
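  12. The program detection apparatus according to claim 11, characterized in that the memory read/write pattern query rule is determined based on the query parameter, and the memory read/write pattern query rule is used to determine whether the first thread and the second thread overlap in time.
  13. The program detection apparatus according to any one of claims 9 to 12, characterized in that the query parameter is determined by the user, and the query parameter comprises an on-chip write operation latency.
  14. The program detection apparatus according to any one of claims 9 to 13, characterized in that the weak memory environment is a running environment corresponding to a running device with a non-uniform memory access (NUMA) architecture.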
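  15. The program detection apparatus according to any one of claims 9 to 14, characterized by further comprising an output unit, wherein
    the output unit is configured to provide the result to the user, wherein the result comprises a modification suggestion.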
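  16. The program detection apparatus according to claim 15, characterized in that
    the obtaining unit is further configured to modify, in response to a confirmation indication from the user, the program according to the modification suggestion.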
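  17. The program detection apparatus according to any one of claims 9 to 16, characterized in that
    the program detection apparatus is deployed in a cloud server.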
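  18. A program detection apparatus, characterized in that the apparatus comprises a processor, and the processor is coupled to a memory;
    the memory is configured to store a computer program; and
    the processor is configured to execute the computer program stored in the memory, so that the apparatus performs the program detection method according to any one of claims 1 to 8.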
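  19. A computer program product, characterized in that the computer program product comprises computer program code, and when the computer program code is run on a computer, the computer is caused to perform the program detection method according to any one of claims 1 to 8.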
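
The four-operation pattern recited in claims 1 and 2 can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the application: the `Op` record, the function names, the choice to measure the interval as a difference of operation positions, the reading of "a pair of read and write operations" as one read plus one write, and the rule that a pattern is reported when the interval in at least one thread does not exceed the query parameter.

```python
from dataclasses import dataclass
from itertools import combinations, permutations


@dataclass(frozen=True)
class Op:
    index: int  # position within its thread (assumed unit of "interval")
    kind: str   # "read" or "write"
    var: str    # shared variable accessed


def is_read_write_pair(x, y):
    """True if x and y access the same variable and exactly one of them writes."""
    return x.var == y.var and {x.kind, y.kind} == {"read", "write"}


def find_reorder_risks(thread1, thread2, max_gap):
    """List (op1, op2, op3, op4) tuples that match the four-operation pattern.

    Assumption (not stated in the claims): a tuple is reported when the gap
    between the two operations of at least one thread is within max_gap.
    """
    risks = []
    for op1, op2 in combinations(thread1, 2):
        if op1.var == op2.var:
            continue  # first and second operations must use different variables
        for op3, op4 in permutations(thread2, 2):
            if op3.var == op4.var:
                continue  # third and fourth operations must use different variables
            if not (is_read_write_pair(op1, op3) and is_read_write_pair(op2, op4)):
                continue
            gap1 = abs(op1.index - op2.index)
            gap2 = abs(op3.index - op4.index)
            if gap1 <= max_gap or gap2 <= max_gap:
                risks.append((op1, op2, op3, op4))
    return risks


# Message-passing example: thread 1 writes data and then a flag,
# thread 2 reads the flag and then the data.
thread1 = [Op(0, "write", "data"), Op(1, "write", "flag")]
thread2 = [Op(0, "read", "flag"), Op(1, "read", "data")]

print(len(find_reorder_risks(thread1, thread2, max_gap=4)))  # 1
print(len(find_reorder_risks(thread1, thread2, max_gap=0)))  # 0
```

With a larger `max_gap` the sketch treats more widely separated operations as candidates for reordering, so it reports more patterns; a value of 0 reports none here because the two operations in each thread are one position apart.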
PCT/CN2021/123936 2020-12-29 2021-10-14 Program detection method and apparatus WO2022142595A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21913340.2A EP4258121A1 (en) 2020-12-29 2021-10-14 Program detection method and device
US18/342,388 US20230367516A1 (en) 2020-12-29 2023-06-27 Program Detection Method and Apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011608587.6A CN114691474A (zh) 2020-12-29 2020-12-29 Program detection method and apparatus
CN202011608587.6 2020-12-29

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/342,388 Continuation US20230367516A1 (en) 2020-12-29 2023-06-27 Program Detection Method and Apparatus

Publications (1)

Publication Number Publication Date
WO2022142595A1 (zh) 2022-07-07

Family

ID=82133018

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/123936 WO2022142595A1 (zh) 2020-12-29 2021-10-14 Program detection method and apparatus

Country Status (4)

Country Link
US (1) US20230367516A1 (zh)
EP (1) EP4258121A1 (zh)
CN (1) CN114691474A (zh)
WO (1) WO2022142595A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115718622B (zh) * 2022-11-25 2023-10-13 苏州睿芯通量科技有限公司 Data processing method and apparatus under the ARM architecture, and electronic device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
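CN101939724A (zh) * 2008-02-06 2011-01-05 Nxp股份有限公司 Data processing device and method for executing obfuscated programs
CN104050043A (zh) * 2014-06-17 2014-09-17 华为技术有限公司 Virtual machine scheduling method and apparatus based on shared cache awareness
CN104137081A (zh) * 2012-02-13 2014-11-05 国际商业机器公司 Memory reorder queue biasing preceding high latency operations
CN106201889A (zh) * 2016-07-15 2016-12-07 国云科技股份有限公司 System for checking program code writing standards and implementation method thereof
CN107861830A (zh) * 2017-12-01 2018-03-30 深圳乐信软件技术有限公司 Application crash detection method and apparatus, storage medium, and mobile terminal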

Also Published As

Publication number Publication date
EP4258121A1 (en) 2023-10-11
CN114691474A (zh) 2022-07-01
US20230367516A1 (en) 2023-11-16

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 21913340

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021913340

Country of ref document: EP

Effective date: 20230707

NENP Non-entry into the national phase

Ref country code: DE