WO2024109068A1 - Program monitoring method, device, electronic device and storage medium - Google Patents

Program monitoring method, device, electronic device and storage medium

Info

Publication number
WO2024109068A1
WO2024109068A1 PCT/CN2023/104590 CN2023104590W WO2024109068A1 WO 2024109068 A1 WO2024109068 A1 WO 2024109068A1 CN 2023104590 W CN2023104590 W CN 2023104590W WO 2024109068 A1 WO2024109068 A1 WO 2024109068A1
Authority
WO
WIPO (PCT)
Prior art keywords
atomic
cache
variable
program
preset
Application number
PCT/CN2023/104590
Other languages
English (en)
French (fr)
Inventor
王碧
Original Assignee
惠州市德赛西威智能交通技术研究院有限公司
Application filed by 惠州市德赛西威智能交通技术研究院有限公司
Publication of WO2024109068A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/54: Interprogram communication

Definitions

  • the present application relates to the field of computer application technology, for example, to a program monitoring method, device, electronic device and storage medium.
  • Program monitoring technology, an important technology for developing functionally safe automotive software, mainly includes three functions: active point detection (monitoring whether the program is running normally), deadline monitoring (monitoring whether the program completes within the specified time), and logical timing monitoring (monitoring whether each program segment runs in the predetermined order).
  • Related program monitoring methods generally require the monitored process to periodically send heartbeat data (such as the time, process number, program number, and status) to a service process. The service process then applies preset logical judgments to determine whether active point timeout errors, deadline timeout errors, logical timing errors, or other problems have occurred, and performs the corresponding error handling when they do.
  • The heartbeat data sent by the monitored process is itself stateless; the status data is kept in the service process. As a result, if the service process crashes and exits, the status data of the monitored process is lost, hot switching and lossless recovery cannot be achieved, and reliability is low.
  • The present application provides a program monitoring method, device, electronic device and storage medium that replace locked communication with lock-free communication through atomic variable reads and writes, which can reduce system resource overhead and communication latency.
  • By saving atomic variables that contain both data and status in a shared space, the method allows the service process to crash and restart and then restore the previous data and status without loss, giving higher reliability.
  • a program monitoring method comprising:
  • a program monitoring generation device comprising:
  • a variable generation module configured to process the message information of the target monitoring program into atomic variables according to a preset configuration file;
  • a variable temporary storage module configured to save the atomic variables based on a preset shared space;
  • a message processing module configured to perform message processing according to the atomic variables in the preset shared space.
  • an electronic device comprising:
  • the memory stores a computer program that can be executed by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can execute the program monitoring method described in any embodiment of the present application.
  • a computer-readable storage medium stores computer instructions, and the computer instructions are used to enable a processor to implement the program monitoring method described in any embodiment of the present application when executed.
  • FIG1 is a flow chart of a program monitoring method provided in Embodiment 1 of the present application.
  • FIG2 is a flow chart of a program monitoring method provided in Embodiment 2 of the present application.
  • FIG3 is a flow chart of a program monitoring method provided in Embodiment 3 of the present application.
  • FIG4 is a structural example diagram of a program monitoring system provided in Embodiment 3 of the present application.
  • FIG5 is a diagram showing an example of a data structure of an atomic variable provided in Embodiment 3 of the present application.
  • FIG6 is a flowchart of a data status processing method provided in Embodiment 3 of the present application.
  • FIG7 is an example diagram of CPU cache false sharing provided in Embodiment 3 of the present application.
  • FIG8 is a flowchart of cache optimization provided in Embodiment 3 of the present application.
  • FIG9 is a schematic diagram of the structure of a program monitoring device provided in Embodiment 4 of the present application.
  • FIG10 is a schematic diagram of the structure of an electronic device for implementing the program monitoring method provided in an embodiment of the present application.
  • FIG1 is a flow chart of a program monitoring method provided in the first embodiment of the present application.
  • the present embodiment is applicable to the case of monitoring a target monitoring program.
  • the method can be executed by a program monitoring device.
  • the program monitoring device can be implemented in the form of hardware and/or software.
  • the program monitoring device can be configured in an electronic device.
  • the electronic device can include a vehicle-mounted device, a mobile device, etc.
  • a program monitoring method provided in the first embodiment includes the following steps:
  • The preset configuration file can be understood as a file configured for the target monitoring program. It can include parameters such as the task number, the program number, the program's predecessor dependencies, the time consumed by the program itself, the time consumed by the entire task, and the heartbeat time of the task.
  • the target monitoring program can be understood as a program segment to be monitored, and the target monitoring program can include one or more program segments of the task to be monitored.
  • the message information can be understood as a message containing the monitoring data and monitoring status of the target monitoring program, and the message information can include the running time, task number, program number, timeout error status, and timing error status of the target monitoring program.
  • An atomic variable can be understood as an integer variable used for multi-threaded lock-free communication.
  • the operation on atomic variables is an atomic operation.
  • Atomic operations refer to operations that will not be interrupted by the thread scheduling mechanism. Once this operation is started, it will not be interrupted by any other tasks or events before execution is completed. Through atomic variable operations, locked communication can be converted into lock-free communication, reducing communication delays.
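  • As a minimal sketch of what lock-free reading and writing of an atomic variable can look like in C++, the following uses std::atomic from the standard library. The names g_heartbeat, publish, latest, and is_lock_free are hypothetical and chosen only for illustration; the source does not give code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical heartbeat slot shared by a writer and a reader (illustrative name).
std::atomic<std::uint64_t> g_heartbeat{0};

// Writer side: publish a new message word without taking a lock.
void publish(std::uint64_t message) {
    g_heartbeat.store(message, std::memory_order_release);
}

// Reader side: fetch the most recently published message word without a lock.
std::uint64_t latest() {
    return g_heartbeat.load(std::memory_order_acquire);
}

// Reports whether this platform implements the 64-bit atomic without an
// internal lock; the result is platform dependent.
bool is_lock_free() {
    return g_heartbeat.is_lock_free();
}
```

  • A caller would invoke publish(word) on the monitored side and latest() on the monitoring side; neither call blocks on a mutex.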
  • the message information of the target monitoring program can be determined according to the configuration items in the preset configuration file.
  • The preset configuration file may include, but is not limited to, at least one of the following parameters: task number, program number, program dependencies, time consumption of the program itself, time consumption of the entire task, and heartbeat time of the task. The message information is then processed into atomic variables.
  • The processing may include, but is not limited to, the following methods: encapsulating the message information into corresponding atomic variables by calling application program interface (API) function libraries, or encapsulating the message information into corresponding atomic variables by calling custom or third-party atomic variable operation toolkits.
  • the preset configuration file containing parameters such as the task number, program number and program dependencies can be started and loaded, and the target monitoring program can be run. After the target monitoring program is completed or at each preset time interval, the message information containing the monitoring data and monitoring status of the program can be encapsulated into corresponding atomic variables through some API interface function libraries or atomic variable operation toolkits.
  • the preset shared space can be understood as a data storage space for storing atomic variables, and the preset shared space can include at least a cache and a shared memory.
  • The atomic variable can be saved to a corresponding position in the preset shared space.
  • a pointer or a memory address can be assigned to the atomic variable, and the atomic variable can be saved according to the data type of the atomic variable.
  • The atomic variable can be stored in the shared memory in sign-magnitude or two's-complement form.
  • message processing can be understood as a method for processing message information of a target monitoring program.
  • Message processing may include message reading, timeout error processing, timing error processing, and data status updating.
  • Some message processing can be performed on the atomic variables stored in the preset shared space.
  • The message processing may include, but is not limited to, the following: reading the atomic variables in the preset shared space and parsing the status information of the target monitoring program from them; determining whether the program has a timeout error or a timing error and, if an error has occurred, reporting fault information such as the corresponding fault code and fault cause to the background management personnel or the functional safety management module for program error status processing; and, after the program error status processing, clearing the error status of the current target monitoring program and updating the atomic variables in the preset shared space.
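  • The error judgment described above could be sketched as follows. This is only an illustration under stated assumptions: the ProgramConfig structure, the Verdict enumeration, the judge function, and the convention that predecessor 0 means "no dependency" are all hypothetical, and the time values are assumed to be expressed in units of 10 microseconds.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-program settings, mirroring the kinds of parameters the
// configuration file is said to hold (program number, dependency, time limit).
struct ProgramConfig {
    std::uint8_t  program;       // program number
    std::uint8_t  predecessor;   // program that must have run just before (0 = none)
    std::uint32_t deadline10us;  // allowed running time, in units of 10 microseconds
};

enum class Verdict { Ok, TimeoutError, TimingError };

// Checks one reported program run against its configuration.
// lastProgram is the number of the program segment that ran previously.
Verdict judge(const ProgramConfig& cfg,
              std::uint8_t lastProgram,
              std::uint32_t elapsed10us) {
    // Timing (ordering) check: the configured predecessor must have run first.
    if (cfg.predecessor != 0 && cfg.predecessor != lastProgram) {
        return Verdict::TimingError;
    }
    // Deadline check: the run must not exceed the configured time budget.
    if (elapsed10us > cfg.deadline10us) {
        return Verdict::TimeoutError;
    }
    return Verdict::Ok;
}
```

  • In this sketch the ordering check takes precedence over the deadline check; a real implementation might instead record both errors in separate flag bits.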
  • the technical solution of the embodiment of the present application processes the message information of the target monitoring program into atomic variables according to a preset configuration file, saves the atomic variables based on a preset shared space, and performs message processing according to the atomic variables in the preset shared space.
  • the embodiment of the present application realizes lock-free communication by reading and writing atomic variables and saves atomic variables containing data and status in a shared space, which can reduce system resource overhead, reduce communication latency, and support the non-destructive recovery of previous data and status after the service process crashes and restarts, with higher reliability.
  • the atomic variable includes at least one of the following: a time identification field, a state identification field, a sequence number field, a task identification field, and a check code field.
  • the time identification field can be understood as a field used to characterize the running time of the target monitoring program.
  • The time identification field can count from 0, in units of 10 microseconds.
  • the state identification field can be understood as a field used to characterize the state information of the target monitoring program.
  • The state identification field can indicate whether there is an out-of-order error, whether there is a timeout error, and whether the program is in an activated state.
  • the sequence number field can be understood as a field used to characterize the program number.
  • the task identification field can be understood as a field used to characterize the task number.
  • The check code field can be understood as a field used for validity checking.
  • the check code type can include a parity check code, a Hamming check code, and a cyclic redundancy check code.
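  • Of the check code types listed, the parity check is the simplest to illustrate. The sketch below assumes, purely for illustration, that bit 63 of a 64-bit message word carries an even-parity bit over the remaining 63 bits; the source does not specify where the check code sits or how wide it is, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns 1 if the number of set bits in value is odd, otherwise 0.
std::uint8_t parity_bit(std::uint64_t value) {
    value ^= value >> 32;
    value ^= value >> 16;
    value ^= value >> 8;
    value ^= value >> 4;
    value ^= value >> 2;
    value ^= value >> 1;
    return static_cast<std::uint8_t>(value & 1u);
}

// Assumed layout: bits 0..62 are payload, bit 63 is the even-parity bit.
constexpr std::uint64_t kPayloadMask = (std::uint64_t{1} << 63) - 1;

// Sets bit 63 so that the whole word has an even number of set bits.
std::uint64_t add_parity(std::uint64_t payload) {
    assert((payload & ~kPayloadMask) == 0 && "payload must fit in 63 bits");
    return payload | (static_cast<std::uint64_t>(parity_bit(payload)) << 63);
}

// A word is valid when its total number of set bits is even.
bool parity_ok(std::uint64_t word) {
    return parity_bit(word) == 0;
}
```

  • A single parity bit detects any odd number of flipped bits but cannot locate or correct them; the Hamming and cyclic redundancy codes also named in the text offer stronger guarantees at the cost of more check bits.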
  • FIG2 is a flowchart of a program monitoring method provided in Embodiment 2 of the present application, which expands on the above embodiment and can be combined with the optional technical solutions in the above embodiment.
  • The program monitoring method provided in Embodiment 2 includes the following steps:
  • S210 Searching for a data identifier and a status identifier corresponding to the message information in a preset configuration file.
  • the data identifier can be understood as the data information contained in the message information, and the data identifier can include information such as program number, task number and checksum.
  • the status identifier can be understood as the status information contained in the message information, and the status identifier can include timing error status, timeout error status, start status and activation status.
  • The data identifier corresponding to the program's message information may include, but is not limited to: program number, task number, and checksum. The status identifier may include, but is not limited to: timing error status, timeout error status, start status, and activation status.
  • S220 Call the preset monitoring library to encapsulate the data identifier and the state identifier into atomic variables.
  • The preset monitoring library can be understood as an API function library for atomic variable encapsulation, and the preset monitoring library can include the atomic operation API functions of the atomic class (std::atomic) provided by C++.
  • the message information containing the data identifier and the status identifier can be encapsulated into an atomic variable by calling a preset monitoring library.
  • The preset monitoring library may include, but is not limited to: the atomic operation API functions of the atomic class provided by C++, the atomic operation API functions of the java.util.concurrent.atomic package provided by Java, and a custom atomic variable encapsulation library. According to actual needs, different bit positions and bit widths can be allocated to the data identifier and the status identifier, and atomic types such as atomic_int64_t can be used to encapsulate the data identifier and the status identifier into atomic variables of the corresponding data types.
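  • The bit allocation described above can be sketched as packing several fields into one 64-bit word that is then written with a single atomic store. The field widths below (32 bits of time, then four 8-bit fields) and the names Message, pack, unpack, write_message, and read_message are assumptions made for illustration; the source leaves the exact layout open.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical field widths; the source does not fix them.
struct Message {
    std::uint32_t time10us;  // running time, in units of 10 microseconds
    std::uint8_t  state;     // status flags (timeout, out-of-order, active, ...)
    std::uint8_t  program;   // program (sequence) number
    std::uint8_t  task;      // task number
    std::uint8_t  check;     // check code
};

// Packs the fields into one 64-bit word: time in bits 0..31, then one byte each.
std::uint64_t pack(const Message& m) {
    return  static_cast<std::uint64_t>(m.time10us)
         | (static_cast<std::uint64_t>(m.state)   << 32)
         | (static_cast<std::uint64_t>(m.program) << 40)
         | (static_cast<std::uint64_t>(m.task)    << 48)
         | (static_cast<std::uint64_t>(m.check)   << 56);
}

// Reverses pack(): extracts each field from its bit range.
Message unpack(std::uint64_t w) {
    return Message{
        static_cast<std::uint32_t>(w & 0xFFFFFFFFu),
        static_cast<std::uint8_t>((w >> 32) & 0xFFu),
        static_cast<std::uint8_t>((w >> 40) & 0xFFu),
        static_cast<std::uint8_t>((w >> 48) & 0xFFu),
        static_cast<std::uint8_t>((w >> 56) & 0xFFu)};
}

// Writes the whole message with one atomic store, so a reader never
// observes a half-updated mix of old and new fields.
void write_message(std::atomic<std::uint64_t>& slot, const Message& m) {
    slot.store(pack(m), std::memory_order_release);
}

// Reads the whole message with one atomic load.
Message read_message(const std::atomic<std::uint64_t>& slot) {
    return unpack(slot.load(std::memory_order_acquire));
}
```

  • Keeping the entire message inside one machine word is what lets data and status travel together without a lock; a message larger than the platform's lock-free atomic width would need a different scheme.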
  • the preset shared space at least includes a cache and a shared memory.
  • cache may refer to the central processing unit (CPU) cache, which is a small but fast memory located between the CPU and the main memory. Since the speed of the CPU is much higher than the main memory, the CPU has to wait for a certain period of time to directly access data from the memory.
  • The cache stores a part of the data that the CPU has just used or uses repeatedly. When the CPU needs this data again, it can be fetched directly from the cache, which reduces the CPU's waiting time and improves the efficiency of data reading and writing.
  • Shared memory can be understood as a large-capacity memory that can be accessed by different processes in inter-process communication, that is, different processes can achieve data sharing and interaction by accessing the same memory area (i.e., shared memory).
  • Each process can map its own virtual address to a specific area in the physical memory.
  • these processes can achieve inter-process communication through shared memory. If a process changes the content of the shared memory area, other processes will be aware of the change in the area.
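  • To illustrate how state kept in a shared mapping can outlive the code that wrote it, the sketch below maps a file with POSIX open, ftruncate, and mmap (MAP_SHARED) and treats the mapped bytes as one 64-bit atomic slot. A file-backed mapping is used here as a stand-in for the shared memory in the text; the helper names map_slot and unmap_slot are hypothetical. Treating mapped bytes as std::atomic works on mainstream compilers when the atomic is lock-free, but it is an assumption of this sketch rather than a guarantee of the C++ standard.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

using Slot = std::atomic<std::uint64_t>;

// Maps the file at path as a shared region holding one atomic slot.
// Creates the file (zero-filled) if it does not exist; existing contents
// are preserved. Returns nullptr on failure.
Slot* map_slot(const char* path) {
    int fd = ::open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0) {
        return nullptr;
    }
    if (::ftruncate(fd, sizeof(Slot)) != 0) {
        ::close(fd);
        return nullptr;
    }
    void* p = ::mmap(nullptr, sizeof(Slot), PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) {
        return nullptr;
    }
    // Assumption: the mapped bytes are usable as a lock-free atomic in place.
    return static_cast<Slot*>(p);
}

// Releases a mapping obtained from map_slot().
void unmap_slot(Slot* slot) {
    assert(slot != nullptr);
    ::munmap(slot, sizeof(Slot));
}
```

  • A monitored process and a monitoring service process could each call map_slot with the same path; if the service process restarts, mapping the path again yields the last value written, which is the property the text relies on for lossless recovery.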
  • direct mapping can be used to determine whether the cache hits by checking whether the valid bit of the corresponding stored data in the cache line is 0 or 1, thereby determining whether there is an atomic variable in the cache.
  • When the atomic variable exists in the cache, the atomic variable is stored in the cache; when the atomic variable does not exist in the cache, the atomic variable is stored in the shared memory, wherein the shared memory includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence.
  • The heartbeat monitoring sequence can be understood as a sequence of atomic variables that stores the heartbeat data of the target monitoring program; it may include one or more atomic variables representing that heartbeat data.
  • The time sequence monitoring sequence can be understood as a sequence of atomic variables that stores the time sequence data of the target monitoring program; it may include one or more atomic variables representing that time sequence data.
  • When atomic variables exist in the cache, they can be stored in the cache according to their corresponding pointers or memory addresses, using direct mapping, fully associative mapping, or set-associative mapping. When there are no atomic variables in the cache, the atomic variables can be stored in the corresponding storage location in the shared memory according to the pointer or memory address corresponding to each atomic variable, wherein the shared memory includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence. Further, the heartbeat monitoring sequence and the time sequence monitoring sequence can be located in two independent message queues; the message queue types can include ActiveMQ, RabbitMQ, ZeroMQ, and Kafka message queues, among others, and the embodiments of this application impose no restriction on this.
  • the atomic variables in the cache can be read preferentially according to the pointer or memory address corresponding to the atomic variables; when there are no atomic variables in the cache within the preset shared space, the atomic variables in the shared memory can be read.
  • S270 Extract the state identifier and data identifier of the atomic variable, and execute message processing corresponding to the state identifier and the data identifier.
  • Data identifiers such as task number, program number and checksum, as well as status identifiers such as timing error status and timeout error status can be extracted from the read atomic variables.
  • The extraction method may include, but is not limited to: extracting the status identifier and data identifier from the atomic variable according to the way the atomic variable was encapsulated, parsing the field information it contains, and then judging the current status of the target monitoring program according to the status identifier and data identifier.
  • If a timeout error or timing error has occurred, fault information such as the fault code and the cause of the fault can be reported to the background management personnel or the functional safety management module for program error status processing.
  • After the program error status processing, the error status of the current target monitoring program is cleared. For example, the flag bit representing the timeout error or out-of-order error in the atomic variable can be reset.
  • The technical solution of this embodiment searches the preset configuration file for the data identifier and state identifier corresponding to the message information, and calls the preset monitoring library to encapsulate them as atomic variables. When storing, it judges whether atomic variables exist in the cache in the preset shared space: if so, the atomic variable is stored in the cache; if not, it is stored in the shared memory, which includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence. When reading, it makes the same judgment: if atomic variables exist in the cache, they are read until the cache is empty; if not, the atomic variables in the shared memory in the preset shared space are read. Finally, it extracts the state identifier and data identifier of the atomic variable and executes the message processing corresponding to them.
  • the embodiment of the present application changes the locked communication to the lockless communication by reading and writing atomic variables, which can effectively reduce the system resource overhead and reduce the communication delay.
  • the atomic variables of the message carry data and data status, that is, the data and data status are stored in the shared space, which can support the hot switching and lossless recovery of the service after the service process crashes and restarts, and has high reliability.
  • the storage length of the atomic variable is the same as the cache line of the cache.
  • the atomic variable can be a variety of basic variable types, such as uint64_t, long, char32_t, and uint_least8_t, etc.
  • Each basic variable type corresponds to a different number of bytes (i.e., storage length), and a cache line is generally a power-of-two number of consecutive bytes. To avoid false sharing, the storage length of the atomic variable can be expanded to the cache line size of the target platform.
  • For example, if the cache line size is 64 bytes and the atomic variable is a uint64_t occupying 8 bytes, seven long variables (8 bytes each) can be placed before and after the 8-byte atomic variable; that is, padding the atomic variable with meaningless variables ensures that it occupies an entire cache line exclusively.
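  • One way to express this padding in C++ is shown below. It is a sketch that assumes a 64-byte cache line and an 8-byte std::atomic<std::uint64_t>, as on common 64-bit platforms; it uses alignas plus an explicit padding array rather than individual long variables, and the names kCacheLine, PaddedSlot, g_slots, and slot_stride are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; the real value depends on the target platform.
constexpr std::size_t kCacheLine = 64;

// One message slot padded and aligned so that it owns a full cache line.
struct alignas(kCacheLine) PaddedSlot {
    std::atomic<std::uint64_t> value{0};
    // Meaningless filler bytes that push the next slot onto a new cache line.
    unsigned char pad[kCacheLine - sizeof(std::atomic<std::uint64_t>)];
};

static_assert(sizeof(PaddedSlot) == kCacheLine,
              "each slot should occupy exactly one cache line");
static_assert(alignof(PaddedSlot) == kCacheLine,
              "each slot should start on a cache line boundary");

// Two neighbouring slots, e.g. one written by each of two threads.
PaddedSlot g_slots[2];

// Distance in bytes between neighbouring slots in the array.
std::size_t slot_stride() {
    return static_cast<std::size_t>(
        reinterpret_cast<const unsigned char*>(&g_slots[1]) -
        reinterpret_cast<const unsigned char*>(&g_slots[0]));
}
```

  • Because each slot starts on its own 64-byte boundary, a write to g_slots[0].value does not invalidate the line holding g_slots[1].value. The cost is memory: each 8-byte message now consumes 64 bytes.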
  • the present invention may further include:
  • the status information of the corresponding atomic variable is updated in the preset shared memory space.
  • The state information of the corresponding atomic variable in the preset shared memory space can be updated, and the flag bit representing the timeout error or out-of-order error in the atomic variable can be reset.
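  • Resetting error flags inside an atomic variable can be done with a single atomic read-modify-write. The sketch below uses std::atomic's fetch_and; the flag positions (bits 32 to 34) and the names kTimeoutError, kOrderError, kActive, and clear_errors are assumptions for illustration only.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag positions inside the 64-bit message word.
constexpr std::uint64_t kTimeoutError = std::uint64_t{1} << 32;
constexpr std::uint64_t kOrderError   = std::uint64_t{1} << 33;
constexpr std::uint64_t kActive       = std::uint64_t{1} << 34;

// Atomically clears both error flags, leaving every other bit untouched,
// and returns the error flags that were set before the call.
std::uint64_t clear_errors(std::atomic<std::uint64_t>& slot) {
    constexpr std::uint64_t errors = kTimeoutError | kOrderError;
    const std::uint64_t previous =
        slot.fetch_and(~errors, std::memory_order_acq_rel);
    return previous & errors;
}
```

  • Using fetch_and rather than a separate load and store means a concurrent writer's update to other bits cannot be lost between the read and the write.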
  • FIG3 is a flow chart of a program monitoring method provided in Embodiment 3 of the present application. Based on the above embodiments, this embodiment provides an implementation of a program monitoring method that monitors the target monitoring program through atomic variable reads and writes.
  • FIG4 is a structural example diagram of a program monitoring system applicable to Embodiment 3 of the present application.
  • a program monitoring method provided in Embodiment 3 of the present application includes the following steps:
  • the configuration file will configure all parameters such as task number, program number, program dependencies, program time, task heartbeat time, etc.
  • the monitored process, monitoring service process and monitoring library will load the required configuration files.
  • S320 The monitored process reports a status message, writes it into an atomic variable and saves it into a shared memory.
  • each message is compressed into an 8-byte atomic variable as shown in FIG5 , and the message field information contained in the atomic variable is shown in the following table:
  • the monitoring service process periodically reads the memory queue, processes the data and status therein, and writes back the status data to the shared memory.
  • Each queue is composed of atomic variable messages, and the atomic variables include data and status.
  • Figure 6 is a flowchart of data status processing provided by the third embodiment.
  • The logic of the data status update is as follows. When the monitored process writes a message, it also updates the message status, such as the timing error status, timeout error status, start status, and activation status. When the monitoring service process reads the message, it performs the logic judgment and the message status judgment, and after processing the message it also updates the message status of the data. This ensures that unprocessed data and status always remain in memory, so that after the monitoring service process restarts abnormally, the previous data and status can be restored without loss, realizing hot switching and lossless recovery.
  • the third embodiment of the present invention also provides a method for optimizing the CPU Cache.
  • the CPU Cache accesses data in units of cache lines.
  • The cache lines of most CPUs are 64 bytes. If a cache line contains multiple data variables, then under highly concurrent reads and writes, each new write to one variable invalidates the cached copies of the other variables in that cache line, causing the false sharing problem.
  • Figure 7 is an example diagram of CPU cache false sharing.
  • the CPU cache optimization method proposed in the present embodiment is as follows: expand each message atomic variable to 64 bytes, which just occupies one cache line, eliminates the CPU cache false sharing problem, and thus improves communication performance.
  • Figure 8 is an example diagram of the cache optimization process provided in the third embodiment of the present invention.
  • the process of data status processing includes:
  • atomic variables can only be basic variable types.
  • Embodiment 3 uses the uint64_t type (8 bytes), which can be replaced on other platforms by a 4-byte type or a type larger than 8 bytes. Lock-free communication is achieved by reading and writing atomic variables.
  • the state data may also include more state definitions or different bits and bit sizes.
  • the message body includes data and state to achieve hot switching and lossless recovery.
  • the optimization of CPU cache can expand the variable size according to the size of CPU Cache Line.
  • a CPU Cache Line contains only one variable, eliminating the false sharing problem.
  • the following table shows the time results of using atomic variables to implement lock-free communication and using locked communication.
  • the technical solution of the embodiment of the present application starts the monitored process and the monitoring service process, and loads the configuration file.
  • the monitored process reports the status message, writes it to the atomic variable and saves it to the shared memory.
  • the monitoring service process periodically reads the memory queue, processes the data and status therein, and writes back the status data to the shared memory.
  • the embodiment of the present application uses atomic operations to read and write messages by the monitored process and the service process, thereby realizing the change from locked communication to lockless communication, which can effectively reduce system resource overhead and communication latency; at the same time, the atomic variables of the message carry data and data status and are all stored in the shared memory, which can support hot switching and lossless recovery after the service process crashes and restarts, thereby improving the reliability of communication; in addition, the atomic variables are expanded to occupy a cache line exclusively through the optimization of the CPU cache, eliminating the problem of false sharing, improving the hit rate of the CPU cache, and further improving the communication performance.
  • FIG9 is a schematic diagram of the structure of a program monitoring device provided in Embodiment 4 of the present application. As shown in FIG9 , the device includes:
  • the variable generation module 41 is configured to process the message information of the target monitoring program into atomic variables according to a preset configuration file.
  • the variable temporary storage module 42 is configured to store atomic variables based on a preset shared space.
  • the message processing module 43 is configured to perform message processing according to the atomic variables in the preset shared space.
  • the technical solution of the embodiment of the present application is to process the message information of the target monitoring program into atomic variables according to a preset configuration file through a variable generation module, the variable temporary storage module saves the atomic variables based on a preset shared space, and the message processing module performs message processing according to the atomic variables in the preset shared space.
  • the embodiment of the present application realizes lock-free communication through atomic variable reading and writing and saves the atomic variables containing data and status in a shared space, which can reduce system resource overhead, reduce communication latency, and support the non-destructive recovery of previous data and status after the service process crashes and restarts, with higher reliability.
  • variable generation module 41 includes:
  • the identification search unit is configured to search for a data identifier and a status identifier corresponding to the message information in a preset configuration file.
  • the atomic variable encapsulation unit is configured to call a preset monitoring library to encapsulate the data identifier and the state identifier into atomic variables.
  • the preset shared space at least includes a cache and a shared memory. Accordingly, the variable temporary storage module 42 includes:
  • the first judgment unit is configured to judge whether there is an atomic variable in the cache in the preset shared space.
  • the atomic variable storage unit is configured to store the atomic variable in the cache when the atomic variable exists in the cache; and to store the atomic variable in the shared memory when the atomic variable does not exist in the cache, wherein the shared memory includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence.
  • the message processing module 43 includes:
  • the second judgment unit is configured to judge whether there is an atomic variable in the cache in the preset shared space.
  • the atomic variable reading unit is configured to read the atomic variables in the cache until the cache is empty when the atomic variables exist in the cache; and to read the atomic variables in the shared memory in the preset shared space when the atomic variables do not exist in the cache.
  • the message processing unit is configured to extract the state identifier and the data identifier of the atomic variable and perform message processing corresponding to the state identifier and the data identifier.
  • the state update module is configured to update the state information of the corresponding atomic variable in the preset shared memory space according to the processing result of the exception processing.
  • the atomic variable includes at least one of the following: a time identification field, a state identification field, a sequence number field, a task identification field, and a check code field.
  • the storage length of the atomic variable is the same as the cache line of the cache.
  • the program monitoring device provided in the embodiments of the present application can execute the program monitoring method provided in any embodiment of the present application, and has the corresponding functional modules and effects of the execution method.
  • FIG10 shows a block diagram of an electronic device 50 that can be used to implement an embodiment of the present application.
  • the electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • the electronic device can also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (such as helmets, glasses, watches, etc.) and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementation of the present application described and/or required herein.
  • the electronic device 50 includes at least one processor 51, and a memory connected to the at least one processor 51, such as a read-only memory (ROM) 52, a random access memory (RAM) 53, etc., wherein the memory stores a computer program that can be executed by at least one processor, and the processor 51 can perform various appropriate actions and processes according to the computer program stored in the ROM 52 or the computer program loaded from the storage unit 58 to the RAM 53.
  • the RAM 53 can also store various programs and data required for the operation of the electronic device 50.
  • the processor 51, the ROM 52, and the RAM 53 are connected to each other through a bus 54.
  • An input/output (I/O) interface 55 is also connected to the bus 54.
  • a number of components in the electronic device 50 are connected to the I/O interface 55, including: an input unit 56, such as a keyboard, a mouse, etc.; an output unit 57, such as various types of displays, speakers, etc.; a storage unit 58, such as a disk, an optical disk, etc.; and a communication unit 59, such as a network card, a modem, a wireless communication transceiver, etc.
  • the communication unit 59 allows the electronic device 50 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • the processor 51 may be a variety of general and/or special processing components with processing and computing capabilities. Some examples of the processor 51 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, a variety of processors running machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, etc.
  • the processor 51 performs the methods and processes described above, such as the program monitoring method.
  • the program monitoring method may be implemented as a computer program, which is tangibly contained in a computer-readable storage medium, such as the storage unit 58.
  • part or all of the computer program may be loaded and/or installed on the electronic device 50 via the ROM 52 and/or the communication unit 59.
  • the processor 51 may be configured to execute the program monitoring method in any other appropriate manner (eg, by means of firmware).
  • Various embodiments of the systems and techniques described above herein may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard parts (ASSPs), system on chip systems (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpreted on a programmable system including at least one programmable processor that may be a special purpose or general purpose programmable processor that may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • the computer programs for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing device, so that when the computer programs are executed by the processor, the functions/operations specified in the flow charts and/or block diagrams are implemented.
  • the computer programs may be executed entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.
  • a computer readable storage medium may be a tangible medium that may contain or store a computer program for use by or in conjunction with an instruction execution system, device, or apparatus.
  • a computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
  • alternatively, a computer readable storage medium may be a machine readable signal medium.
  • machine readable storage media would include an electrical connection based on one or more lines, a portable computer disk, a hard disk, a RAM, a ROM, an Erasable Programmable Read Only Memory (EPROM) or a flash memory, an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • the electronic device has: a display device (e.g., a cathode ray tube (CRT) or a liquid crystal display (LCD) monitor) for displaying information to the user; and a keyboard and a pointing device (e.g., a mouse or a trackball), through which the user can provide input to the electronic device.
  • Other types of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and the input from the user can be received in any form (including acoustic input, voice input, or tactile input).
  • the systems and techniques described herein may be implemented in a computing system that includes backend components (e.g., as a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes frontend components (e.g., a user computer with a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such backend components, middleware components, or frontend components.
  • the components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Network (LAN), Wide Area Network (WAN), blockchain network, and the Internet.
  • a computing system may include a client and a server.
  • the client and the server are generally remote from each other and usually interact through a communication network.
  • the relationship between the client and the server is generated by computer programs running on the respective computers and having a client-server relationship with each other.
  • the server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical hosts and virtual private servers (VPS).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

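A program monitoring method and apparatus, an electronic device (50), and a storage medium. The method includes: processing message information of a target monitoring program into an atomic variable according to a preset configuration file (S110); saving the atomic variable in a preset shared space (S120); and performing message processing according to the atomic variable in the preset shared space (S130).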

Description

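Program Monitoring Method and Apparatus, Electronic Device, and Storage Medium
This application claims priority to Chinese Patent Application No. 202211492584.X, filed with the China Patent Office on November 25, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer application technology, for example to a program monitoring method and apparatus, an electronic device, and a storage medium.
Background
Program monitoring is an important technique in the functional-safety development of automotive software. It mainly comprises three functions: alive supervision (checking whether a program is running normally), deadline supervision (checking whether a program finishes within a specified time), and logical sequence supervision (checking whether program segments run in the prescribed order). In related program monitoring methods, the monitored process periodically sends heartbeat data (such as time, process number, program number, and state) to a service process. The service process then applies preset logic to determine whether an alive-timeout error, a deadline-timeout error, or a logical sequence error has occurred, and performs the corresponding error handling when one has.
However, existing program monitoring methods have the following problems:
1. Existing program monitoring methods mainly use Unix domain sockets (UDS) and shared memory (SHMEM) for inter-process communication. Both are lock-based forms of communication, which increases system resource overhead and communication latency.
2. The heartbeat data sent by the monitored process carries no state of its own; all state data is kept in the service process. As a result, if the service process crashes and exits, the state data of the monitored process is lost, hot switchover and lossless recovery are not possible, and reliability is low.
Summary
The present application provides a program monitoring method and apparatus, an electronic device, and a storage medium. Reading and writing atomic variables replaces lock-based communication with lock-free communication, which can reduce system resource overhead and communication latency. In addition, because the atomic variables containing both data and state are kept in a shared space, the previous data and state can be recovered without loss after the service process crashes and restarts, which gives higher reliability.
According to one aspect of the present application, a program monitoring method is provided, the method including:
processing message information of a target monitoring program into an atomic variable according to a preset configuration file;
saving the atomic variable in a preset shared space; and
performing message processing according to the atomic variable in the preset shared space.
According to another aspect of the present application, a program monitoring apparatus is provided, including:
a variable generation module, configured to process message information of a target monitoring program into an atomic variable according to a preset configuration file;
a variable temporary storage module, configured to save the atomic variable in a preset shared space; and
a message processing module, configured to perform message processing according to the atomic variable in the preset shared space.
According to another aspect of the present application, an electronic device is provided, the electronic device including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the program monitoring method of any embodiment of the present application.
According to another aspect of the present application, a computer-readable storage medium is provided. The computer-readable storage medium stores computer instructions which, when executed by a processor, implement the program monitoring method of any embodiment of the present application.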
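Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below.
FIG. 1 is a flowchart of a program monitoring method provided in Embodiment 1 of the present application;
FIG. 2 is a flowchart of a program monitoring method provided in Embodiment 2 of the present application;
FIG. 3 is a flowchart of a program monitoring method provided in Embodiment 3 of the present application;
FIG. 4 is an example structural diagram of a program monitoring system provided in Embodiment 3 of the present application;
FIG. 5 is an example diagram of the data structure of an atomic variable provided in Embodiment 3 of the present application;
FIG. 6 is an example flow diagram of data state processing provided in Embodiment 3 of the present application;
FIG. 7 is an example diagram of CPU cache false sharing provided in Embodiment 3 of the present application;
FIG. 8 is an example flow diagram of cache optimization provided in Embodiment 3 of the present application;
FIG. 9 is a schematic structural diagram of a program monitoring apparatus provided in Embodiment 4 of the present application;
FIG. 10 is a schematic structural diagram of an electronic device for the program monitoring method provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below with reference to the drawings in the embodiments of the present application.
The terms "first", "second", and the like in the specification, the claims, and the above drawings of the present application are used to distinguish similar objects and do not necessarily describe a particular order or sequence. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.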
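Embodiment 1
FIG. 1 is a flowchart of a program monitoring method provided in Embodiment 1 of the present application. This embodiment is applicable to monitoring a target monitoring program. The method may be executed by a program monitoring apparatus, which may be implemented in hardware and/or software and may be configured in an electronic device; for example, the electronic device may include an in-vehicle device, a mobile device, and the like. As shown in FIG. 1, the program monitoring method provided in Embodiment 1 includes the following steps:
S110: Process the message information of the target monitoring program into an atomic variable according to a preset configuration file.
In the embodiments of the present application, the preset configuration file can be understood as a file configured for the target monitoring program. It may include parameters such as the task number, the program number, the predecessor and successor dependencies of each program, the time consumed by the program itself, the time consumed by the whole task, and the heartbeat time of the task. The target monitoring program can be understood as the program segment to be monitored, and may include one or more program segments of the task to be monitored. The message information can be understood as a message containing the monitoring data and monitoring state of the target monitoring program, and may include the running time, task number, program number, timeout error state, and sequence error state of the target monitoring program. An atomic variable can be understood as an integer data variable used in lock-free multithreaded communication. An operation on an atomic variable is an atomic operation, that is, an operation that is not interrupted by the thread scheduling mechanism: once started, it runs to completion without being interrupted by any other task or event. Using atomic variable operations turns lock-based communication into lock-free communication and reduces communication latency.
The message information of the target monitoring program may be determined according to the configuration items in the preset configuration file, which may include, but is not limited to, one of the following parameters: task number, program number, predecessor and successor dependencies of the program, time consumed by the program itself, time consumed by the whole task, and heartbeat time of the task. The message information is then processed into an atomic variable. The processing may include, but is not limited to, encapsulating the message information into the corresponding atomic variable by calling an application program interface (API) function library, or by calling a custom or third-party atomic variable toolkit. In one embodiment, the preset configuration file containing parameters such as the task number, program number, and program dependencies is loaded at startup, and the target monitoring program is run. When the target monitoring program finishes running, or at every preset interval, the message information containing the monitoring data and monitoring state of the program is encapsulated into the corresponding atomic variable through an API function library or an atomic variable toolkit.
S120: Save the atomic variable in a preset shared space.
In the embodiments of the present application, the preset shared space can be understood as a data storage space for storing atomic variables. The preset shared space may include at least a cache and a shared memory.
After the message information of the target monitoring program has been processed into an atomic variable, the atomic variable may be saved at the corresponding location in the preset shared space, which may include at least a cache and a shared memory. A pointer or memory address may be allocated to the atomic variable at compile time, and the atomic variable may be saved according to its data type; for example, it may be stored in the shared memory in sign-magnitude or two's complement form. In one embodiment, the preset shared space includes a cache and a shared memory. When saving the atomic variable, if the atomic variable is still present in the cache, it may preferentially be saved in the cache; if it is not present in the cache, it may be saved to the shared memory.
S130: Perform message processing according to the atomic variable in the preset shared space.
In the embodiments of the present application, message processing can be understood as the handling applied to the message information of the target monitoring program. It may include message reading, timeout error handling, sequence error handling, data state updating, and the like.
Message processing may be performed on the atomic variables saved in the preset shared space. The processing may include, but is not limited to, the following: reading the atomic variable in the preset shared space and parsing the state information of the target monitoring program from it; determining whether the program has a timeout error or a sequence error and, if so, reporting fault information such as the fault code and fault cause to back-end administrators or a functional safety management module for error handling; and, after the error has been handled, clearing the error state of the current target monitoring program and updating the atomic variable in the preset shared space.
In the technical solution of this embodiment, the message information of the target monitoring program is processed into an atomic variable according to a preset configuration file, the atomic variable is saved in a preset shared space, and message processing is performed according to the atomic variable in the preset shared space. By achieving lock-free communication through atomic variable reads and writes and by keeping atomic variables that contain both data and state in a shared space, this embodiment can reduce system resource overhead and communication latency, and can recover the previous data and state without loss after the service process crashes and restarts, which gives higher reliability.
On the basis of the above embodiment, the atomic variable includes at least one of the following: a time identification field, a state identification field, a sequence number field, a task identification field, and a check code field.
In the embodiments of the present application, the time identification field can be understood as a field representing the running time of the target monitoring program; it may count from 0 in units of 10 microseconds. The state identification field can be understood as a field representing the state information of the target monitoring program; it may indicate whether an out-of-order error exists, whether a timeout error exists, whether the program is in the active state, and so on. The sequence number field can be understood as a field representing the program number. The task identification field can be understood as a field representing the task number. The check code field can be understood as a field used for validity checking; the check code type may include a parity check code, a Hamming code, a cyclic redundancy check code, and the like.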
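The field list above can be illustrated with a small packing routine. This is a minimal sketch under stated assumptions, not the layout used in the application: the bit positions, the 32-bit width of the time field, the `MonitorMessage` struct, and the XOR-based check code are all hypothetical choices made for illustration. The application names parity, Hamming, and CRC codes as options; the XOR of the payload bytes used here is a simple longitudinal parity.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit layout (an assumption, not taken from the application):
//   bits 63..32  time, in units of 10 microseconds
//   bits 31..24  state flags
//   bits 23..16  sequence (program) number
//   bits 15..8   task number
//   bits  7..0   check code
struct MonitorMessage {
    std::uint32_t time_10us;
    std::uint8_t  state;
    std::uint8_t  seq;
    std::uint8_t  task;
};

// XOR of the seven payload bytes (bits 63..8); the low byte is ignored.
inline std::uint8_t check_code(std::uint64_t word) {
    std::uint8_t code = 0;
    for (int shift = 8; shift < 64; shift += 8) {
        code ^= static_cast<std::uint8_t>(word >> shift);
    }
    return code;
}

// Packs the fields into one 64-bit word and appends the check code.
inline std::uint64_t pack(const MonitorMessage& m) {
    std::uint64_t word = (static_cast<std::uint64_t>(m.time_10us) << 32) |
                         (static_cast<std::uint64_t>(m.state) << 24) |
                         (static_cast<std::uint64_t>(m.seq) << 16) |
                         (static_cast<std::uint64_t>(m.task) << 8);
    return word | check_code(word);
}

// Unpacks a word; returns false if the stored check code does not match.
inline bool unpack(std::uint64_t word, MonitorMessage& out) {
    if (check_code(word) != static_cast<std::uint8_t>(word)) {
        return false;
    }
    out.time_10us = static_cast<std::uint32_t>(word >> 32);
    out.state     = static_cast<std::uint8_t>(word >> 24);
    out.seq       = static_cast<std::uint8_t>(word >> 16);
    out.task      = static_cast<std::uint8_t>(word >> 8);
    return true;
}
```

Because the whole message fits in one 64-bit word, it can be written or read with a single atomic store or load, which is what makes the lock-free exchange described in the text possible.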
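Embodiment 2
FIG. 2 is a flowchart of a program monitoring method provided in Embodiment 2 of the present application. It extends the above implementation and may be combined with the optional technical solutions in the above implementation. As shown in FIG. 2, the program monitoring method provided in Embodiment 2 includes the following steps:
S210: Search the preset configuration file for the data identifier and the state identifier corresponding to the message information.
In the embodiments of the present application, the data identifier can be understood as the data information contained in the message information, and may include the program number, the task number, the check code, and the like. The state identifier can be understood as the state information contained in the message information, and may include the sequence error state, the timeout error state, the start state, the active state, and the like.
The data identifier and the state identifier corresponding to the message information of the target monitoring program may be looked up according to configuration information such as the task number and the program number in the preset configuration file. The data identifier may include, but is not limited to, the program number, the task number, and the check code. The state identifier may include, but is not limited to, the sequence error state, the timeout error state, the start state, and the active state.
S220: Call a preset monitoring library to encapsulate the data identifier and the state identifier into an atomic variable.
In the embodiments of the present application, the preset monitoring library can be understood as an API function library that performs atomic variable encapsulation. It may include the atomic operation API functions provided by the C++ atomic classes.
The message information containing the data identifier and the state identifier may be encapsulated into an atomic variable by calling the preset monitoring library. The preset monitoring library may include, but is not limited to, the atomic operation API functions of the C++ atomic classes, the atomic operation API of the Java java.util.concurrent.atomic package, or a custom atomic variable encapsulation library. Different bit positions and bit widths may be assigned to the data identifier and the state identifier as needed, and a type such as atomic_int64_t may be used to encapsulate the data identifier and the state identifier into an atomic variable of the corresponding data type.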
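As a rough illustration of the encapsulation step, the sketch below wraps an already packed 64-bit message in `std::atomic<std::uint64_t>` from the C++ standard library. The `MessageSlot` type and the `publish` and `read` helper names are hypothetical; the release/acquire memory orderings are one reasonable choice and are not specified in the application.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// One message slot holding a packed 64-bit message (hypothetical type name).
struct MessageSlot {
    std::atomic<std::uint64_t> word{0};
};

// Writer side (monitored process): publish a packed message in one atomic store.
inline void publish(MessageSlot& slot, std::uint64_t packed) {
    slot.word.store(packed, std::memory_order_release);
}

// Reader side (monitoring service): fetch the latest message in one atomic load.
inline std::uint64_t read(const MessageSlot& slot) {
    return slot.word.load(std::memory_order_acquire);
}

// Reports whether the platform implements the slot without an internal lock.
inline bool slot_is_lock_free(const MessageSlot& slot) {
    return slot.word.is_lock_free();
}
```

On a target where `slot_is_lock_free` returns false, the standard library falls back to an internal lock, so the lock-free benefit described in the text would not apply; checking this at startup is a cheap safeguard.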
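On the basis of the above embodiment, the preset shared space includes at least a cache and a shared memory.
In the embodiments of the present application, the cache may refer to the central processing unit (CPU) cache, a small but very fast memory located between the CPU and main memory. Because the CPU is much faster than main memory, the CPU has to wait for a number of cycles when it accesses data directly in memory. The cache holds part of the data that the CPU has just used or uses repeatedly; when the CPU needs that data again it can fetch it directly from the cache, which reduces CPU wait time and improves the efficiency of data reads and writes. The shared memory can be understood as a large memory region that different processes can access in inter-process communication; that is, different processes share and exchange data by accessing the same memory region. Each process can map its own virtual addresses to a specific region of physical memory. When different processes associate the same physical memory region with their respective virtual address spaces, those processes can communicate through the shared memory. If one process changes the contents of the shared memory region, the other processes observe the change.
S230: Determine whether the atomic variable is present in the cache in the preset shared space.
Based on the pointer or memory address corresponding to the atomic variable, and using a direct-mapped, fully associative, or set-associative cache scheme, whether the cache hits can be determined from whether the valid bit of the corresponding cache line is 0 or 1, which in turn determines whether the atomic variable is present in the cache.
S240: When the atomic variable is present in the cache, store the atomic variable in the cache; when the atomic variable is not present in the cache, store the atomic variable in the shared memory, where the shared memory includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence.
In the embodiments of the present application, the heartbeat monitoring sequence can be understood as storing the atomic variables that contain the heartbeat data of the target monitoring program, and may contain one or more such atomic variables. The time sequence monitoring sequence can be understood as storing the atomic variables that contain the time and sequence data of the target monitoring program, and may contain one or more such atomic variables.
When the atomic variable is present in the cache in the preset shared space, the atomic variable may be stored in the cache according to its pointer or memory address, using a direct-mapped, fully associative, or set-associative cache scheme. When the atomic variable is not present in the cache, the atomic variable may be stored at the corresponding location in the shared memory according to its pointer or memory address, where the shared memory includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence. Further, the heartbeat monitoring sequence and the time sequence monitoring sequence may each be located in a separate message queue. The message queue type may include ActiveMQ, RabbitMQ, ZeroMQ, Kafka, and the like, which is not limited in the embodiments of the present application.
S250: Determine whether the atomic variable is present in the cache in the preset shared space.
S260: When the atomic variable is present in the cache, read the atomic variables in the cache until the cache is empty; when the atomic variable is not present in the cache, read the atomic variable from the shared memory in the preset shared space.
When the atomic variable is present in the cache in the preset shared space, the atomic variable in the cache may be read first, according to its pointer or memory address. When the atomic variable is not present in the cache, the atomic variable in the shared memory may be read.
S270: Extract the state identifier and the data identifier of the atomic variable, and perform the message processing corresponding to the state identifier and the data identifier.
Data identifiers such as the task number, program number, and check code, and state identifiers such as the sequence error state and timeout error state, may be extracted from the atomic variable that has been read. The extraction may include, but is not limited to, extracting the state identifier and data identifier according to how the atomic variable was encapsulated, or parsing the field information contained in the atomic variable. The state of the current target monitoring program is then determined from the state identifier and the data identifier. If a timeout error or a sequence error has occurred, fault information such as the fault code and fault cause may be reported to back-end administrators or a functional safety management module for error handling. After the error has been handled, the error state of the current target monitoring program is cleared; for example, the flag bits in the atomic variable that indicate a timeout error or an out-of-order error may be reset.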
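The read, handle, then clear sequence in S270 can be sketched with atomic bit operations. The flag bit positions and the function names below are hypothetical. `fetch_and` is used so that clearing the error bits does not overwrite other fields that the monitored process may be updating concurrently.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical state flag positions inside the 64-bit message word.
constexpr std::uint64_t kStart     = 1ull << 24;
constexpr std::uint64_t kActive    = 1ull << 25;
constexpr std::uint64_t kTimeout   = 1ull << 26;
constexpr std::uint64_t kSeqErr    = 1ull << 27;
constexpr std::uint64_t kErrorMask = kTimeout | kSeqErr;

// Step 1: read the word and return whichever error flags are currently set.
inline std::uint64_t pending_errors(const std::atomic<std::uint64_t>& word) {
    return word.load(std::memory_order_acquire) & kErrorMask;
}

// Step 2 (after the error has been reported and handled): clear only the
// handled flags, leaving every other bit of the word untouched.
inline void clear_errors(std::atomic<std::uint64_t>& word, std::uint64_t handled) {
    word.fetch_and(~(handled & kErrorMask), std::memory_order_acq_rel);
}
```

Clearing only after handling means that, if the service process crashes between the two steps, the error flag is still set in the shared word when the service restarts, which matches the lossless recovery goal stated in the text.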
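In the technical solution of this embodiment, the data identifier and state identifier corresponding to the message information are looked up in the preset configuration file, and the preset monitoring library is called to encapsulate them into an atomic variable. It is then determined whether the atomic variable is present in the cache in the preset shared space: if it is, the atomic variable is stored in the cache; if it is not, the atomic variable is stored in the shared memory, which includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence. On the reading side, it is again determined whether the atomic variable is present in the cache: if it is, the atomic variables in the cache are read until the cache is empty; if it is not, the atomic variable is read from the shared memory in the preset shared space. The state identifier and data identifier of the atomic variable are then extracted, and the corresponding message processing is performed. By replacing lock-based communication with lock-free communication through atomic variable reads and writes, this embodiment can effectively reduce system resource overhead and communication latency. In addition, the atomic variable of each message carries both data and data state, both of which are kept in the shared space, so hot switchover and lossless recovery of the service are supported after the service process crashes and restarts, which gives high reliability.
On the basis of the above embodiment, the storage length of the atomic variable is the same as the cache line of the cache.
In the embodiments of the present application, the atomic variable may be one of several basic variable types, for example uint64_t, long, char32_t, or uint_least8_t, each of which corresponds to a different number of bytes (that is, storage length). A cache line is generally a run of contiguous bytes whose length is an integer power of 2. To avoid false sharing, the storage length of the atomic variable may be extended to the cache line size of the cache on the given platform. In one embodiment, if the cache line size is 64 bytes and the atomic variable is represented by the 8-byte uint64_t type, then, to avoid the low cache hit rate caused by false sharing, seven variables of type long (8 bytes each) may be placed before and after the 8-byte atomic variable. In other words, the atomic variable is padded with meaningless variables so that it occupies a whole cache line by itself.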
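The padding idea can be sketched in C++ as follows. Note that this sketch uses the `alignas` specifier instead of the manual padding with `long` variables described in the text; both aim at giving each atomic variable a cache line of its own. The 64-byte line size, the `PaddedSlot` and `MonitorQueue` names, and the queue length of 8 are assumptions for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache line size; the real value depends on the target CPU.
constexpr std::size_t kCacheLine = 64;

// alignas forces both the alignment and (by rounding up) the size of the
// struct to one cache line, so the 8-byte atomic is followed by 56 bytes
// of implicit padding.
struct alignas(kCacheLine) PaddedSlot {
    std::atomic<std::uint64_t> word{0};
};

static_assert(sizeof(PaddedSlot) == kCacheLine, "slot must fill one cache line");
static_assert(alignof(PaddedSlot) == kCacheLine, "slot must start on a line boundary");

// Neighbouring slots in an array therefore never share a cache line.
struct MonitorQueue {
    PaddedSlot slots[8];
};
```

With this layout a write to `slots[0]` by one core does not invalidate the line that holds `slots[1]` on another core, which is the false sharing effect the text sets out to remove. The cost is memory: each 8-byte message occupies 64 bytes.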
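On the basis of the above embodiment, the method may further include:
updating the state information of the corresponding atomic variable in the preset shared memory space according to the result of the exception handling.
After the timeout error or sequence error of the current target monitoring program has been handled, the state information of the corresponding atomic variable in the preset shared memory space may be updated; the flag bits in the atomic variable that indicate a timeout error or an out-of-order error may be reset.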
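Embodiment 3
FIG. 3 is a flowchart of a program monitoring method provided in Embodiment 3 of the present application. On the basis of the above embodiments, this embodiment provides one implementation of a program monitoring method that monitors the target monitoring program through atomic variable reads and writes. FIG. 4 is an example structural diagram of the program monitoring system to which Embodiment 3 applies.
As shown in FIG. 3, the program monitoring method provided in Embodiment 3 of the present application includes the following steps:
S310: Start the monitored process and the monitoring service process, and load the configuration file.
In the embodiments of the present application, the configuration file configures all task numbers, program numbers, predecessor and successor dependencies of the programs, the time consumed by each program, the time consumed by the whole task, the heartbeat time of the task, and other parameters. After the monitored process and the monitoring service process are started, the monitored process, the monitoring service process, and the monitoring library load the configuration file they need.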
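To show how the configured durations and heartbeat time might be used by the monitoring service, here is a minimal sketch of deadline and heartbeat checks. The `ProgramConfig` struct, its field names, and the 32-bit timestamps in units of 10 microseconds are assumptions; the application does not give the configuration file format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-program limits loaded from the configuration file,
// expressed in the same 10-microsecond unit as the message time field.
struct ProgramConfig {
    std::uint32_t max_duration_10us;      // time budget for one program segment
    std::uint32_t heartbeat_period_10us;  // largest allowed gap between heartbeats
};

// Elapsed ticks between two timestamps. Unsigned subtraction is modulo 2^32,
// so the result stays correct when the counter wraps around once.
inline std::uint32_t elapsed_10us(std::uint32_t from, std::uint32_t to) {
    return to - from;
}

// Deadline supervision: did the segment run longer than its budget?
inline bool deadline_exceeded(std::uint32_t start, std::uint32_t stop,
                              const ProgramConfig& cfg) {
    return elapsed_10us(start, stop) > cfg.max_duration_10us;
}

// Alive supervision: has the last heartbeat become too old?
inline bool heartbeat_missed(std::uint32_t last_beat, std::uint32_t now,
                             const ProgramConfig& cfg) {
    return elapsed_10us(last_beat, now) > cfg.heartbeat_period_10us;
}
```

For example, with a budget of 100 ticks (1 ms), a segment that starts at tick 1000 and stops at tick 1101 would be flagged, while one that stops at tick 1100 would not.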
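S320: The monitored process reports a status message, which is written into an atomic variable and saved to the shared memory.
In the embodiments of the present application, each message is compressed into one 8-byte atomic variable as shown in FIG. 5. The message fields contained in the atomic variable are listed in the following table:

S330: The monitoring service process periodically reads the memory queues, processes the data and state in them, and writes the state data back to the shared memory.
In the embodiments of the present application, the shared memory holds two independent message queues, namely the heartbeat monitoring list and the time sequence monitoring list. Each queue is composed of atomic variable messages, and each atomic variable contains both data and state.
FIG. 6 is an example flow diagram of the data state processing provided in Embodiment 3. The data state is updated as follows. When the monitored process writes a message, it updates the message state at the same time, such as the sequence error state, timeout error state, start state, and active state. When the monitoring service process reads a message, it performs the logic checks and message state checks and, after processing the message, updates the message state of the data at the same time. This ensures that data or state that has not yet been processed stays in memory, so that after the monitoring service process restarts abnormally, the previous data and state can be recovered without loss, achieving hot switchover and lossless recovery.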
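The recovery property described above depends on the message word living outside the service process. The sketch below illustrates this on a POSIX system with a file-backed `MAP_SHARED` mapping, used here in place of a named shared memory object to keep the example free of extra link options; the `map_slot` and `unmap_slot` names and the file path in the usage note are hypothetical. Treating the mapped bytes directly as `std::atomic<std::uint64_t>` is a common practice that this sketch assumes is acceptable on the target toolchain.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

using Slot = std::atomic<std::uint64_t>;

// Maps one 64-bit slot backed by the given file and returns nullptr on failure.
// The slot is deliberately NOT re-initialised, so a restarted process sees the
// value left behind by the previous one (a new file starts out as zero).
inline Slot* map_slot(const char* path) {
    int fd = ::open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return nullptr;
    }
    if (::ftruncate(fd, sizeof(Slot)) != 0) {
        ::close(fd);
        return nullptr;
    }
    void* p = ::mmap(nullptr, sizeof(Slot), PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    ::close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) {
        return nullptr;
    }
    return static_cast<Slot*>(p);
}

// Releases the mapping; the stored value remains in the backing object.
inline void unmap_slot(Slot* slot) {
    ::munmap(slot, sizeof(Slot));
}
```

A service process that calls `map_slot("/tmp/monitor_slot.bin")` after a crash would find the last word written by the monitored process, including any error flags it had not yet cleared, and can continue from there.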
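Embodiment 3 also provides an optimization for the CPU cache. The CPU cache reads and writes data in units of cache lines, and on most CPUs a cache line is 64 bytes. If one cache line contains several data variables, then under concurrent reads and writes every new write to one variable invalidates the other variables in the same cache line, which causes false sharing. FIG. 7 is an example diagram of CPU cache false sharing. To avoid false sharing, this embodiment proposes the following CPU cache optimization: each message atomic variable is extended to 64 bytes so that it occupies exactly one cache line by itself, which eliminates CPU cache false sharing and improves communication performance. FIG. 8 is an example flow diagram of the cache optimization provided in Embodiment 3.
As shown in FIG. 6, the data state processing flow includes:
Monitoring service process: 1.0 start and load configuration file();
Monitored process: 1.1 start and load configuration file();
1.2 start message(task=1,seq=0,sta=start);
1.3 read(task=1,seq=0,sta=start);
1.4 clear task(all seq states under the task);
1.5 status message(task=1,seq=1,sta=start);
1.6 read(task=1,seq=1,sta=start);
1.7 start timer().
alt branch:
2.0 status message(task=1,seq=1,sta=stop|active);
2.1 read(task=1,seq=1,sta=stop);
2.2 release timer().
Sent after timeout:
3.0 status message(task=1,seq=1,sta=stop|timeout);
3.1 read timeout(task=1,seq=1,sta=stop);
3.2 handle timeout error();
3.3 clear error state(clear task=1,seq=1 timeout).
Sent in the wrong sequence:
3.4 status message(task=1,seq=3,sta=start|seqerr);
3.5 read seqerr(task=1,seq=3,sta=start|seqerr);
3.6 handle seqerr error().
Sequence not sent:
3.7 clear error state(clear task=1,seq=1 seqerr);
4.0 timer expires();
4.1 handle timeout error();
4.2 clear start state(clear task=1,seq=1 start).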
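The sequence error case in the flow above (a start message for seq=3 arriving where another program was expected) can be sketched as a small checker on the service side. This is a simplification under a stated assumption: programs within a task are expected to start in strictly increasing order 0, 1, 2, and so on, whereas the application's configuration file describes general predecessor and successor dependencies. The `SequenceChecker` class and `Verdict` enum are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

enum class Verdict { Ok, SequenceError };

// Tracks which program number is expected to start next within one task.
class SequenceChecker {
public:
    // Handles a start message for the given program number.
    // seq == 0 begins a new run of the task and resets the expectation,
    // mirroring step 1.4 (clear all seq states under the task).
    Verdict on_start(std::uint8_t seq) {
        if (seq == 0) {
            next_ = 1;
            return Verdict::Ok;
        }
        if (seq != next_) {
            return Verdict::SequenceError;  // expectation is left unchanged
        }
        ++next_;
        return Verdict::Ok;
    }

private:
    std::uint8_t next_ = 0;  // program number expected next
};
```

After a sequence error the checker keeps its expectation, so a later, correctly ordered start message is still accepted; whether a real monitor should instead abort the task run is a design decision the application leaves open.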
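On the basis of the above embodiments, the atomic variable can only be a basic variable type. Embodiment 3 uses the uint64_t type (8 bytes); a 4-byte type, or a type larger than 8 bytes on other platforms, may be substituted. Lock-free communication is achieved through atomic variable reads and writes.
On the basis of the above embodiments, the state data may also include more state definitions, or different bit positions and bit widths. The message body includes both data and state, in order to achieve hot switchover and lossless recovery.
On the basis of the above embodiments, the CPU cache optimization may extend the variable size according to the size of the CPU cache line. Each CPU cache line contains only one variable, which eliminates false sharing.
The following table gives example timing results for lock-free communication using atomic variables and for lock-based communication.
As the table shows, lock-free communication using atomic variables takes much less time than lock-based communication. This indicates that, when the monitored process and the service process read and write messages through atomic operations, changing from lock-based to lock-free communication can reduce communication latency.
In the technical solution of this embodiment, the monitored process and the monitoring service process are started and the configuration file is loaded; the monitored process reports status messages, which are written into atomic variables and saved to the shared memory; and the monitoring service process periodically reads the memory queues, processes the data and state in them, and writes the state data back to the shared memory. Because the monitored process and the service process read and write messages through atomic operations, lock-based communication is replaced with lock-free communication, which can effectively reduce system resource overhead and communication latency. In addition, the atomic variable of each message carries both data and data state, both of which are kept in the shared memory, so hot switchover and lossless recovery are supported after the service process crashes and restarts, which improves the reliability of communication. Furthermore, the CPU cache optimization extends each atomic variable so that it occupies a cache line by itself, which eliminates false sharing, improves the CPU cache hit rate, and further improves communication performance.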
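Embodiment 4
FIG. 9 is a schematic structural diagram of a program monitoring apparatus provided in Embodiment 4 of the present application. As shown in FIG. 9, the apparatus includes:
a variable generation module 41, configured to process message information of a target monitoring program into an atomic variable according to a preset configuration file;
a variable temporary storage module 42, configured to save the atomic variable in a preset shared space; and
a message processing module 43, configured to perform message processing according to the atomic variable in the preset shared space.
In the technical solution of this embodiment, the variable generation module processes the message information of the target monitoring program into an atomic variable according to a preset configuration file, the variable temporary storage module saves the atomic variable in a preset shared space, and the message processing module performs message processing according to the atomic variable in the preset shared space. By achieving lock-free communication through atomic variable reads and writes and by keeping atomic variables that contain both data and state in a shared space, this embodiment can reduce system resource overhead and communication latency, and can recover the previous data and state without loss after the service process crashes and restarts, which gives higher reliability.
On the basis of the above embodiment, the variable generation module 41 includes:
an identifier search unit, configured to search the preset configuration file for the data identifier and the state identifier corresponding to the message information; and
an atomic variable encapsulation unit, configured to call a preset monitoring library to encapsulate the data identifier and the state identifier into an atomic variable.
On the basis of the above embodiment, the preset shared space includes at least a cache and a shared memory. Accordingly, the variable temporary storage module 42 includes:
a first judgment unit, configured to determine whether the atomic variable is present in the cache in the preset shared space; and
an atomic variable storage unit, configured to store the atomic variable in the cache when the atomic variable is present in the cache, and to store the atomic variable in the shared memory when the atomic variable is not present in the cache, where the shared memory includes at least a heartbeat monitoring sequence and a time sequence monitoring sequence.
On the basis of the above embodiment, the message processing module 43 includes:
a second judgment unit, configured to determine whether the atomic variable is present in the cache in the preset shared space;
an atomic variable reading unit, configured to read the atomic variables in the cache until the cache is empty when the atomic variable is present in the cache, and to read the atomic variable from the shared memory in the preset shared space when the atomic variable is not present in the cache; and
a message processing unit, configured to extract the state identifier and the data identifier of the atomic variable and to perform the message processing corresponding to the state identifier and the data identifier.
On the basis of the above embodiment, the apparatus further includes:
a state update module, configured to update the state information of the corresponding atomic variable in the preset shared memory space according to the result of the exception handling.
On the basis of the above embodiment, the atomic variable includes at least one of the following: a time identification field, a state identification field, a sequence number field, a task identification field, and a check code field.
On the basis of the above embodiment, the storage length of the atomic variable is the same as the cache line of the cache.
The program monitoring apparatus provided in the embodiments of the present application can execute the program monitoring method provided in any embodiment of the present application, and has the functional modules and effects corresponding to the executed method.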
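Embodiment 5
FIG. 10 is a schematic structural diagram of an electronic device 50 that can be used to implement the embodiments of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices (such as helmets, glasses, and watches), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present application described and/or claimed herein.
As shown in FIG. 10, the electronic device 50 includes at least one processor 51 and a memory communicatively connected to the at least one processor 51, such as a read-only memory (ROM) 52 and a random access memory (RAM) 53. The memory stores a computer program executable by the at least one processor, and the processor 51 can perform various appropriate actions and processing according to the computer program stored in the ROM 52 or the computer program loaded from the storage unit 58 into the RAM 53. The RAM 53 can also store various programs and data required for the operation of the electronic device 50. The processor 51, the ROM 52, and the RAM 53 are connected to one another through a bus 54. An input/output (I/O) interface 55 is also connected to the bus 54.
Several components of the electronic device 50 are connected to the I/O interface 55, including: an input unit 56, such as a keyboard or a mouse; an output unit 57, such as various types of displays and speakers; a storage unit 58, such as a magnetic disk or an optical disk; and a communication unit 59, such as a network card, a modem, or a wireless communication transceiver. The communication unit 59 allows the electronic device 50 to exchange information and data with other devices through a computer network such as the Internet and/or various telecommunication networks.
The processor 51 may be any of a variety of general-purpose and/or special-purpose processing components with processing and computing capabilities. Examples of the processor 51 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various processors that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The processor 51 performs the methods and processing described above, such as the program monitoring method.
In some embodiments, the program monitoring method may be implemented as a computer program tangibly contained in a computer-readable storage medium, such as the storage unit 58. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 50 via the ROM 52 and/or the communication unit 59. When the computer program is loaded into the RAM 53 and executed by the processor 51, one or more steps of the program monitoring method described above may be performed. Alternatively, in other embodiments, the processor 51 may be configured to perform the program monitoring method in any other appropriate manner (for example, by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard parts (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor that can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
The computer programs for implementing the methods of the present application may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that, when executed by the processor, they cause the functions and operations specified in the flowcharts and/or block diagrams to be implemented. The computer programs may be executed entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present application, a computer-readable storage medium may be a tangible medium that can contain or store a computer program for use by, or in connection with, an instruction execution system, apparatus, or device. A computer-readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer-readable storage medium may be a machine-readable signal medium. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a RAM, a ROM, an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described here may be implemented on an electronic device that has a display device (for example, a cathode ray tube (CRT) or liquid crystal display (LCD) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the electronic device. Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user may be received in any form (including acoustic, voice, or tactile input).
The systems and techniques described here may be implemented in a computing system that includes a back-end component (for example, a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described here), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), a blockchain network, and the Internet.
A computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also known as a cloud computing server or cloud host, which is a host product in the cloud computing service system that addresses the difficult management and weak business scalability of traditional physical hosts and virtual private servers (VPS).
It should be understood that the various forms of flow shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired result of the technical solution of the present application is achieved; no limitation is imposed here.
The above specific implementations do not limit the scope of protection of the present application.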

Claims (10)

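  1. A program monitoring method, the method comprising:
    processing message information of a target monitoring program into an atomic variable according to a preset configuration file;
    saving the atomic variable in a preset shared space; and
    performing message processing according to the atomic variable in the preset shared space.
  2. The method according to claim 1, wherein processing the message information of the target monitoring program into the atomic variable according to the preset configuration file comprises:
    searching the preset configuration file for a data identifier and a state identifier corresponding to the message information; and
    calling a preset monitoring library to encapsulate the data identifier and the state identifier into the atomic variable.
  3. The method according to claim 1 or 2, wherein the atomic variable comprises at least one of the following: a time identification field, a state identification field, a sequence number field, a task identification field, and a check code field.
  4. The method according to claim 1, wherein the preset shared space comprises at least a cache and a shared memory, and, accordingly, saving the atomic variable in the preset shared space comprises:
    determining whether the atomic variable is present in the cache in the preset shared space;
    when the atomic variable is present in the cache, storing the atomic variable in the cache; and
    when the atomic variable is not present in the cache, storing the atomic variable in the shared memory, wherein the shared memory comprises at least a heartbeat monitoring sequence and a time sequence monitoring sequence.
  5. The method according to claim 1, wherein performing message processing according to the atomic variable in the preset shared space comprises:
    determining whether the atomic variable is present in a cache in the preset shared space;
    when the atomic variable is present in the cache, reading the atomic variables in the cache until the cache is empty;
    when the atomic variable is not present in the cache, reading the atomic variable from a shared memory in the preset shared space; and
    extracting a state identifier and a data identifier of the atomic variable, and performing the message processing corresponding to the state identifier and the data identifier.
  6. The method according to claim 4, wherein the storage length of the atomic variable is the same as the cache line of the cache.
  7. The method according to claim 5, further comprising:
    updating state information of the corresponding atomic variable in the preset shared memory space according to a result of exception handling.
  8. A program monitoring apparatus, the apparatus comprising:
    a variable generation module, configured to process message information of a target monitoring program into an atomic variable according to a preset configuration file;
    a variable temporary storage module, configured to save the atomic variable in a preset shared space; and
    a message processing module, configured to perform message processing according to the atomic variable in the preset shared space.
  9. An electronic device, the electronic device comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores a computer program executable by the at least one processor, and the computer program is executed by the at least one processor so that the at least one processor can perform the program monitoring method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement the program monitoring method according to any one of claims 1 to 7.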
PCT/CN2023/104590 2022-11-25 2023-06-30 程序监控方法、装置、电子设备和存储介质 WO2024109068A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211492584.X 2022-11-25
CN202211492584.XA CN115757039A (zh) 2022-11-25 2022-11-25 一种程序监控方法、装置、电子设备和存储介质

Publications (1)

Publication Number Publication Date
WO2024109068A1 true WO2024109068A1 (zh) 2024-05-30

Family

ID=85338175

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/104590 WO2024109068A1 (zh) 2022-11-25 2023-06-30 程序监控方法、装置、电子设备和存储介质

Country Status (2)

Country Link
CN (1) CN115757039A (zh)
WO (1) WO2024109068A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115757039A (zh) * 2022-11-25 2023-03-07 惠州市德赛西威智能交通技术研究院有限公司 一种程序监控方法、装置、电子设备和存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140237004A1 (en) * 2013-02-19 2014-08-21 Ivan Schreter Lock-Free, Scalable Read Access To Shared Data Structures Using Garbage Collection
CN107667364A (zh) * 2015-06-04 2018-02-06 微软技术许可有限责任公司 使用硬件事务存储器控制索引的原子更新
CN110287044A (zh) * 2019-07-02 2019-09-27 广州虎牙科技有限公司 无锁共享内存处理方法、装置、电子设备及可读存储介质
CN111190732A (zh) * 2019-12-27 2020-05-22 成都欧珀通信科技有限公司 定时任务处理系统及方法、存储介质和电子设备
CN113407414A (zh) * 2021-06-24 2021-09-17 厦门科灿信息技术有限公司 程序运行监测方法、装置、终端及存储介质
CN114217986A (zh) * 2021-12-03 2022-03-22 腾讯科技(深圳)有限公司 数据处理方法、装置、设备、存储介质及产品
CN115757039A (zh) * 2022-11-25 2023-03-07 惠州市德赛西威智能交通技术研究院有限公司 一种程序监控方法、装置、电子设备和存储介质

Also Published As

Publication number Publication date
CN115757039A (zh) 2023-03-07
