WO2023279815A1 - Performance monitoring system and related method - Google Patents

Performance monitoring system and related method

Publication number
WO2023279815A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
collection
collected
analysis
nodes
Prior art date
Application number
PCT/CN2022/089717
Other languages
English (en)
Chinese (zh)
Inventor
彭大成
鲁强
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Priority claimed from CN202111043661.9A (CN115599641A)
Application filed by 华为技术有限公司
Publication of WO2023279815A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment

Definitions

  • the present application relates to the field of computer technology, and in particular to a performance monitoring system, a performance monitoring method, a collection node, a computer-readable storage medium, and a computer program product.
  • The industry provides some software-based performance monitoring solutions. These solutions require corresponding software to be installed on the collected nodes and on the analysis nodes.
  • The collection software collects performance information such as the number of occurrences of hardware events (for example, a specified instruction being called or a cache miss) and the function call stack when a given hardware event occurs. The analysis node then analyzes this performance information through the analysis software and presents the analysis results to the user as charts or in other forms.
  • This application provides a performance monitoring system.
  • The collection nodes in the performance monitoring system collect performance information through the hardware debugging interface of the collected node, without installing collection software. This breaks through the limitation of software compatibility, makes the system suitable for performance monitoring of all kinds of collected nodes that have a hardware debugging interface, and gives it high availability.
  • the present application also provides a corresponding performance monitoring method, a collection node, a computer-readable storage medium, and a computer program product.
  • In a first aspect, the present application provides a performance monitoring system.
  • the performance monitoring system may be a hardware system with a performance monitoring function.
  • the system is used to monitor the performance of the application on the collected nodes.
  • the collected node may be a device running an application, such as a server, or a personal computer such as a desktop computer, a notebook computer, or a smart phone.
  • the performance monitoring system includes collection nodes and analysis nodes.
  • the collection node is connected with the collected node through the hardware debugging interface of the collected node.
  • the collection node is used to read performance information through the hardware debugging interface of the collected node, and the analysis node is used to receive the performance information sent by the collection node, analyze the performance information, and obtain an analysis result.
  • the collection nodes collect performance information through the hardware debugging interface, without the need to install collection software, which breaks through the limitation of software compatibility, and is suitable for performance monitoring of various collected nodes with hardware debugging interfaces, with high availability.
  • The collection node reads the performance information from the hardware debugging interface in a read-only manner, so it does not use the resources of the collected node and does not affect the operation of the application on the collected node.
  • This makes the collected performance information more accurate and improves the effect of performance monitoring.
  • The environment of the collected node is thus completely isolated from the impact of the performance monitoring system.
  • The collected node includes registers, such as registers related to performance or to the running status of the application, and the collection node can read the data in these registers through the hardware debugging interface of the collected node to obtain performance information.
  • Because the collection node reads the data in the relevant registers through the hardware debugging interface of the collected node, on the one hand it breaks through the limitation of software compatibility, and on the other hand it reads directly through the hardware debugging interface instead of through collection software, which enables high-speed data export and improves collection efficiency.
  • The data volume of the performance information in the collected node may be relatively large, so the collected node may provide an independent storage component for buffering the performance information.
  • The independent storage component is a storage component in the collected node that is independent of the main part of the central processing unit; it may be, for example, a cache or a buffer.
  • The collection node can read, through the hardware debugging interface of the collected node, the data transmitted from the registers of the collected node to the independent storage component of the collected node, to obtain performance information.
  • independent storage components can bridge the gap between high-speed devices and low-speed devices, reduce the limitations of low-speed devices, and increase the collection speed of performance information, thereby improving the efficiency of performance monitoring.
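  • As a rough illustration of why buffering helps, the following minimal sketch simulates a bounded FIFO sitting between a bursty producer (register updates) and a slower reader (the collection node). The names `IndependentBuffer` and `simulate`, the capacities, and the drop-when-full policy are all assumptions made for illustration; the application does not specify how the independent storage component behaves when it is full.

```python
from collections import deque


class IndependentBuffer:
    """Hypothetical stand-in for the independent storage component (a bounded FIFO)."""

    def __init__(self, capacity):
        self._fifo = deque()
        self.capacity = capacity
        self.dropped = 0  # samples lost because the buffer was full

    def push(self, sample):
        """Fast side: a register value is written into the buffer."""
        if len(self._fifo) >= self.capacity:
            self.dropped += 1  # assumed policy: drop the new sample when full
            return
        self._fifo.append(sample)

    def drain(self, max_items):
        """Slow side: the collection node reads one batch over the debug interface."""
        batch = []
        while self._fifo and len(batch) < max_items:
            batch.append(self._fifo.popleft())
        return batch


def simulate(capacity, bursts, drain_per_tick):
    """Return (samples collected, samples dropped) for a bursty producer."""
    buf = IndependentBuffer(capacity)
    collected = 0
    next_value = 0
    for burst in bursts:
        for _ in range(burst):
            buf.push(next_value)
            next_value += 1
        collected += len(buf.drain(drain_per_tick))
    # Drain whatever is left after the producer stops.
    while True:
        batch = buf.drain(drain_per_tick)
        if not batch:
            break
        collected += len(batch)
    return collected, buf.dropped


print(simulate(capacity=8, bursts=[6, 0, 0, 6, 0, 0], drain_per_tick=2))
print(simulate(capacity=2, bursts=[6, 0, 0, 6, 0, 0], drain_per_tick=2))
```

  • In this toy model, a capacity of 8 absorbs both bursts of six samples without loss, whereas a capacity of 2 loses eight of the twelve samples, even though the reader's speed is identical in both runs.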
  • the analysis node is further configured to receive the collection prompt information input by the user, and the collection node is specifically configured to read the performance information through the hardware debugging interface of the collected node according to the collection prompt information. In this way, performance information can be collected on demand according to user requirements, and personalized performance monitoring can be realized.
  • The collection prompt information includes a code collection range of the application and at least one collection item, and each collection item corresponds to a hardware event.
  • The code collection range indicates the code fragments for which performance information needs to be collected for performance monitoring. Note that a code fragment may be the compiled form of a source code fragment.
  • A collection item indicates a metric for monitoring the performance of the above code fragment, and the metric may be represented by a performance-related hardware event.
  • the hardware events may include, for example, cache miss events and branch misprediction events.
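  • One possible in-memory shape for the collection prompt information is sketched below. The class name `CollectionPrompt`, its field names, the event identifiers in `KNOWN_EVENTS`, and the half-open address range are assumptions made for illustration; the application only states that the prompt carries a code collection range and at least one collection item.

```python
from dataclasses import dataclass

# Hypothetical event identifiers; real names depend on the chip being monitored.
KNOWN_EVENTS = {"cache_miss", "branch_misprediction"}


@dataclass(frozen=True)
class CollectionPrompt:
    start_address: int       # first address of the compiled code fragment
    end_address: int         # one past the last address (half-open range, assumed)
    collection_items: tuple  # hardware events to collect

    def validate(self):
        """Raise ValueError if the prompt is unusable."""
        if self.start_address >= self.end_address:
            raise ValueError("code collection range must be non-empty")
        if not self.collection_items:
            raise ValueError("at least one collection item is required")
        unknown = set(self.collection_items) - KNOWN_EVENTS
        if unknown:
            raise ValueError(f"unknown hardware events: {sorted(unknown)}")

    def covers(self, program_counter):
        """True if an address falls inside the code collection range."""
        return self.start_address <= program_counter < self.end_address


prompt = CollectionPrompt(0x400000, 0x400100, ("cache_miss",))
prompt.validate()
print(prompt.covers(0x400080), prompt.covers(0x400100))
```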
  • The user can first determine, in the application whose performance is to be monitored, the source code fragment that needs to be tracked, and then view the compiled code to obtain the address of the compiled code fragment corresponding to that source code fragment, so that the collection node can identify, based on the address, the code fragment for which performance information needs to be collected.
  • the user can configure the register address through the configuration interface provided by the analysis node, so that after the analysis node sends the register address to the collection node, the collection node can collect the data in the register corresponding to the register address according to the register address, so as to obtain performance information.
  • the collected nodes are provided with registers for performance monitoring, such as a set of registers in the performance monitoring unit, or another set of registers used for performance tuning to obtain hardware snapshots, etc.
  • The collection node is specifically configured to configure the configuration register in the collected node according to the collection prompt information, so as to gate the line between the target register in the collected node and the independent storage component of the collected node.
  • The target register includes a register matching the collection prompt information, and may be at least one of the above-mentioned registers for performance monitoring.
  • The collection node can read, through the hardware debugging interface of the collected node, the data transmitted from the target register to the independent storage component, to obtain performance information.
  • Because the collection node can configure the configuration register in the collected node according to the collection prompt information, so as to gate the line between the target register and the independent storage component, performance information can be read on demand from the hardware debugging interface of the collected node.
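  • The gating step can be pictured as writing a bit mask into the configuration register, with one select bit per performance register. The sketch below is only a model: the bit assignments in `REGISTER_SELECT_BITS`, the event names (including `instructions_retired`, which does not appear in the application), and both function names are hypothetical, and a real chip defines its own register layout.

```python
# Hypothetical layout: one select bit per performance register.
REGISTER_SELECT_BITS = {
    "cache_miss": 0,
    "branch_misprediction": 1,
    "instructions_retired": 2,
}


def build_config_value(collection_items):
    """Translate collection items into the value written to the configuration register."""
    value = 0
    for item in collection_items:
        value |= 1 << REGISTER_SELECT_BITS[item]
    return value


def gated_registers(config_value):
    """List the target registers whose line to the storage component is gated on."""
    return sorted(
        name
        for name, bit in REGISTER_SELECT_BITS.items()
        if config_value & (1 << bit)
    )


value = build_config_value(["cache_miss", "branch_misprediction"])
print(bin(value), gated_registers(value))
```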
  • The system includes a plurality of analysis nodes. The collection node is further configured to receive an analysis node address list configured by a user, and to send a join request to the plurality of analysis nodes according to the analysis node address list.
  • At least one analysis node among the plurality of analysis nodes is configured to send a join success notification to the collection node.
  • The collection node and the analysis nodes are networked separately, so the performance information being transmitted does not load the existing network. This reduces pressure on the existing network and allows more, and more detailed, performance information to be transmitted.
  • Before sending the join request to the plurality of analysis nodes, the collection node is further configured to add the plurality of analysis nodes according to the analysis node address list, so that the performance information collected by the collection node is shared by the plurality of analysis nodes.
  • The collection node has a hardware debugging interface, which is connected to the hardware debugging interface of the collected node through a cable. The collection node and the collected node can therefore transmit performance information over the path formed by the hardware debugging interface of the collected node, the cable, and the hardware debugging interface of the collection node. Performance information is thus collected through hardware, which breaks through the limitation of software compatibility and improves collection efficiency.
  • The collected nodes include multiple nodes, and the network topology of the collection node and the multiple collected nodes connected to it is a daisy chain topology. Through the daisy chain topology, one collection node can collect the performance information of multiple collected nodes. Specifically, the collection node can stay connected to multiple collected nodes without plugging and unplugging. After the process of collecting performance information is triggered, the collection node can read the performance information of the multiple collected nodes in a time-shared manner.
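  • Time-shared reading over a daisy chain can be sketched as a round-robin loop that visits each collected node in chain order. The function name `time_shared_reads` and the `read_node` callback are hypothetical, and the strict round-robin schedule is an assumption; the application does not state how the time slices are allocated.

```python
from itertools import cycle, islice


def time_shared_reads(chain, read_node, rounds):
    """Visit each collected node in chain order, `rounds` times, one read per visit.

    Returns the reads in the order they happened as (node, value) pairs.
    """
    schedule = islice(cycle(chain), rounds * len(chain))
    return [(node, read_node(node)) for node in schedule]


# Stand-in for a read over the hardware debugging interface.
fake_counters = {"node-a": 10, "node-b": 20, "node-c": 30}
log = time_shared_reads(list(fake_counters), fake_counters.get, rounds=2)
for node, value in log:
    print(node, value)
```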
  • The collection node is powered by alternating current or by a battery. A collection node powered by alternating current is used to collect performance information of an application in a fixed collected node, and a collection node powered by a battery is used to collect performance information of an application in a mobile collected node.
  • A fixed collected node can be a computing device such as a server in a large data center, and a mobile collected node can be a robot, an electric vehicle, a drone, a virtual reality wearable device, and so on.
  • The performance monitoring system of the present application can therefore be applied to different performance tuning scenarios according to the performance tuning environment, and has high usability.
  • Some collected nodes may be sensitive to weight or electric energy. The collection node can therefore also be equipped with a lighter battery and use an energy-saving communication module, such as a Zigbee communication module, to transmit performance information with low energy consumption.
  • In a second aspect, the present application provides a performance monitoring method.
  • The method is applied to a performance monitoring system. The system is used to monitor the performance of an application on a collected node and includes a collection node and an analysis node; the collection node is connected to the collected node through the hardware debugging interface of the collected node. The method includes:
  • The collection node reads the performance information through the hardware debugging interface of the collected node;
  • The analysis node receives the performance information sent by the collection node, analyzes the performance information, and obtains an analysis result.
  • the collection node reads the performance information through the hardware debugging interface of the collected node, including:
  • the collection node reads the data in the register of the collected node through the hardware debugging interface of the collected node to obtain performance information.
  • the collection node reads the performance information through the hardware debugging interface of the collected node, including:
  • the collection node reads the data transmitted from the register of the collected node to the independent storage component of the collected node through the hardware debugging interface of the collected node, and obtains performance information.
  • The method also includes:
  • The analysis node receives the collection prompt information input by the user;
  • The collection node reads the performance information through the hardware debugging interface of the collected node, including:
  • The collection node reads the performance information through the hardware debugging interface of the collected node according to the collection prompt information.
  • The collection prompt information includes a code collection range of the application and at least one collection item, and each collection item corresponds to a hardware event.
  • The collection node reads the performance information through the hardware debugging interface of the collected node, including:
  • The collection node configures the configuration register in the collected node according to the collection prompt information, so as to gate the line between the target register in the collected node and the independent storage component in the collected node;
  • When the application is running, the collection node reads, through the hardware debugging interface of the collected node, the data transmitted from the target register to the independent storage component, to obtain performance information.
  • the system includes multiple analysis nodes, and the method further includes:
  • The collection node receives a user-configured analysis node address list;
  • The collection node sends a join request to the plurality of analysis nodes according to the analysis node address list;
  • At least one analysis node among the plurality of analysis nodes sends a join success notification to the collection node.
  • Before the collection node sends the join request to the multiple analysis nodes, the method further includes: the collection node adds the multiple analysis nodes according to the analysis node address list, so that the performance information collected by the collection node is shared by the multiple analysis nodes.
  • the collection node has a hardware debugging interface, and the hardware debugging interface of the collection node is connected to the hardware debugging interface of the collected node through a cable.
  • the collected nodes include multiple nodes, and a network topology of the collection node and the multiple collected nodes connected to the collection node is a daisy chain topology.
  • The collection node is powered by alternating current or by a battery.
  • The collection node reads the performance information through the hardware debugging interface of the collected node, including:
  • The collection node powered by alternating current collects the performance information of the application in a fixed collected node, or the collection node powered by the battery collects the performance information of the application in a mobile collected node.
  • the present application provides a collection node.
  • The collection node is connected to the collected node through the hardware debugging interface of the collected node, and the collection node is used to perform the steps performed by the collection node in the performance monitoring method described in the second aspect or any implementation of the second aspect of the present application.
  • The present application provides a computer-readable storage medium in which instructions are stored, and the instructions instruct a device to execute the method described in the second aspect or any implementation of the second aspect.
  • The present application provides a computer program product containing instructions which, when run on a device, cause the device to execute the steps performed by the analysis node in the performance monitoring method described in the second aspect or any implementation of the second aspect.
  • FIG. 1 is a schematic structural diagram of a performance monitoring system provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a configuration interface provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a hardware structure of a collection node provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a hardware structure of a collection node provided in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a hardware structure of a collection node provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a hardware structure of a collection node provided in an embodiment of the present application.
  • FIG. 7 is a flowchart of a performance monitoring method provided in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of configuring a code collection range provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of configuring a collection item provided by an embodiment of the present application.
  • The terms "first" and "second" in the embodiments of the present application are used for description purposes only, and cannot be understood as indicating or implying relative importance or implicitly indicating the quantity of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of these features.
  • An application, also called an application program, is a computer program written for a particular purpose of a user.
  • Applications are usually deployed on hardware devices, such as personal computers such as desktops, laptops, and smart phones, or servers.
  • the hardware device runs the application by executing the program code of the application, and then realizes the functions of the application.
  • Application performance specifically refers to the running performance of the application.
  • the running performance can be characterized by the resource utilization rate when the application is running.
  • the resources may be of different types such as computing resources, storage resources, and network resources.
  • Computing resources may include processor resources such as a central processing unit (CPU) and a graphics processing unit (GPU); storage resources may include internal memory, external storage (also called auxiliary storage), cache, and other resources; and network resources may include bandwidth and other resources.
  • running performance can be characterized by CPU utilization, memory utilization, and cache hit ratio.
  • Performance monitoring refers to the continuous detection of the running performance of the application.
  • Large chip design companies provide a wealth of performance monitoring components, such as performance monitoring unit (PMU) registers, CoreSight™ on-chip monitoring components, the eXtended Debug Port (XDP), and so on.
  • The performance monitoring solutions provided by the industry build on the above performance monitoring components: performance monitoring software is developed, and performance monitoring is then implemented through that software.
  • The collection software is installed on the collected nodes, and the analysis software is installed on the analysis nodes.
  • According to the user's requirements, the collection software can collect hardware events such as cache misses and branch mispredictions, or performance information such as the function call stack when certain hardware events occur.
  • The analysis node analyzes the above performance information through the analysis software, and presents the analysis results to the user as charts or in other forms.
  • However, software usually has compatibility limitations. For example, when the collected node uses a particular operating system, the collection software may not work normally. The industry therefore urgently needs a performance monitoring solution with high availability.
  • an embodiment of the present application provides a performance monitoring system.
  • the performance monitoring system may be a hardware system with a performance monitoring function.
  • the system is used to monitor the performance of the application on the collected nodes.
  • the collected node may be a device running an application, such as a server or a PC.
  • the performance monitoring system includes collection nodes and analysis nodes.
  • the collection node is connected with the collected node through the hardware debugging interface of the collected node.
  • the collection node is used to read performance information through the hardware debugging interface of the collected node, and the analysis node is used to receive the performance information sent by the collection node, analyze the performance information, and obtain an analysis result.
  • the collection nodes collect performance information through the hardware debugging interface, without the need to install collection software, which breaks through the limitation of software compatibility, and is suitable for performance monitoring of various collected nodes with hardware debugging interfaces, with high availability.
  • The collection node reads the performance information from the hardware debugging interface in a read-only manner, so it does not use the resources of the collected node and does not affect the operation of the application on the collected node.
  • This makes the collected performance information more accurate and improves the effect of performance monitoring.
  • The environment of the collected node is thus completely isolated from the impact of the performance monitoring system.
  • The performance monitoring system 10 includes at least one collection node 100 and at least one analysis node 200.
  • The collection node 100 has a hardware entity, such as a computing box.
  • The computing box includes a hardware debugging interface adapted to the hardware debugging interface of the collected node. The hardware debugging interface of the collection node 100 and that of the collected node may be natively compatible, or may be adapted through an adapter.
  • For example, the collection node 100 can use an adapter to convert calls to hardware debugging interface A into calls to hardware debugging interface B, so as to adapt to the collected node.
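  • In software terms, this conversion resembles the adapter pattern. The sketch below is an analogy only, since the adapter in the application may well be a physical device; `InterfaceB`, `AToBAdapter`, the method names, and the 4-byte register stride are all hypothetical.

```python
class InterfaceB:
    """Hypothetical debug interface exposed by the collected node: one word per call."""

    def __init__(self, registers):
        self._registers = dict(registers)

    def read_word(self, address):
        return self._registers[address]


class AToBAdapter:
    """Offers the call shape of hypothetical interface A on top of interface B."""

    WORD_SIZE = 4  # assumed register stride in bytes

    def __init__(self, target):
        self._target = target

    def read_register(self, address, count=1):
        """Interface A style call: read `count` consecutive registers."""
        return [
            self._target.read_word(address + self.WORD_SIZE * i)
            for i in range(count)
        ]


collected_node = InterfaceB({0x10: 7, 0x14: 9})
adapter = AToBAdapter(collected_node)
print(adapter.read_register(0x10, count=2))
```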
  • The analysis node 200 may be a hardware device on which analysis software (that is, performance analysis software) is deployed, such as a server, a PC, or a smart phone.
  • the server for deploying the analysis software may be a cloud server or a server in a local data center.
  • FIG. 1 illustrates an example in which a performance monitoring system 10 includes m collection nodes and n analysis nodes.
  • m and n are positive integers.
  • The collection node 100 and the analysis node 200 are connected through a network, and the user can configure different networking modes on the collection node 100.
  • One analysis node 200 can access multiple collection nodes 100 at the same time. If permissions for multiple analysis nodes 200 are enabled on the collection node 100, one collection node 100 can also be shared by multiple analysis nodes 200; that is, the performance information collected by one collection node 100 can be shared by multiple analysis nodes 200.
  • The collection node 100 may be installed on the collected node.
  • The collected nodes can be large, medium, or small devices.
  • Large devices can include computing clusters in data centers (for example, server clusters); medium-sized devices can be robots, streaming media devices, electric vehicles, and so on; and small devices can be drones and virtual reality (VR) devices.
  • The collected node has a hardware debugging interface, which may be, for example, a tracing interface.
  • the collection node 100 may be connected to the collected node through the hardware debugging interface of the collected node.
  • the collection node 100 is configured to read performance information through a hardware debugging interface of the collected node.
  • The performance information may include one or more of: the number of hardware events counted by PMU registers, application snapshots, and tracing information.
  • The CPU can count various hardware events through the PMU. For example, the CPU can access a first PMU register through the PMU to obtain the number of cache miss events, and access a second PMU register through the PMU to obtain the number of branch misprediction events.
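  • If such a counter register is read periodically, the number of events in an interval is the difference between two readings. The following is a minimal sketch, assuming a free-running counter that wraps around at a fixed width; the function name `counter_delta` and the 32-bit default are assumptions, since the application does not state the counter width.

```python
def counter_delta(previous, current, width_bits=32):
    """Events counted between two readings of a wrapping hardware counter."""
    return (current - previous) % (1 << width_bits)


print(counter_delta(100, 150))           # ordinary case
print(counter_delta(0xFFFFFFF0, 0x10))   # the counter wrapped between the readings
```

  • Taking the difference modulo the counter range keeps the result correct when the counter overflows once between two readings; more than one overflow per interval cannot be detected this way.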
  • An application snapshot reflects the state of the application at a certain time. In the performance tuning scenario, an application snapshot is a data group formed by combining information generated by the chip's internal registers (other than the PMU registers). From this data group, the runtime state of the chip while executing the application's program code can be inferred. The state can be, for example, any of running, hibernating, waiting, resident, or monitoring.
  • The tracing information includes any one or more of the running status of the application and the function call stack when a hardware event occurs.
  • The collection node 100 can read the data in the registers of the collected node through the hardware debugging interface of the collected node, so as to obtain the performance information. Further, considering that the data volume of the performance information may be relatively large, an independent storage component may also be provided in the collected node for buffering the data in the registers. The independent storage component is a storage component independent of the CPU main body in the collected node, and may be a cache, a buffer, or the like.
  • The collection node 100 can read, through the hardware debugging interface of the collected node, the data transmitted from the registers of the collected node to the independent storage component, to obtain performance information.
  • The analysis node 200 is configured to receive the performance information sent by the collection node 100, analyze the performance information, and obtain an analysis result. Specifically, the analysis node 200 can analyze the received performance information statistically. For example, the analysis node 200 can determine the sum of the number of cache hit events and the number of cache miss events, and then determine the ratio of the number of cache hit events to that sum, thereby obtaining the cache hit rate.
  • The analysis result may include the cache hit rate described above.
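  • The cache hit rate calculation described above amounts to hits / (hits + misses). A minimal sketch follows; the function name `cache_hit_rate` and the choice to return `None` when no events were counted are assumptions.

```python
def cache_hit_rate(hit_count, miss_count):
    """Ratio of cache hits to all cache accesses, or None if nothing was counted."""
    total = hit_count + miss_count
    if total == 0:
        return None  # assumed convention: the rate is undefined without any events
    return hit_count / total


print(cache_hit_rate(3, 1))
print(cache_hit_rate(0, 0))
```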
  • the analysis node 200 can also present the analysis results in a graph form. Specifically, the analysis node 200 may present the analysis results to the user through any one or more of a line graph, a histogram, a flame graph, or a table. The analysis node 200 may also generate a performance monitoring report according to at least one of a line graph, a histogram, a flame graph or a table of the analysis results, and output the performance monitoring report.
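  • As one example of preparing data for such a presentation, sampled function call stacks can be aggregated into "folded" lines (frames joined by semicolons, followed by a sample count), a text form that common flame graph tools accept as input. The function name `fold_stacks` and the sample data are hypothetical, and the application does not say how the analysis node builds its flame graphs.

```python
from collections import Counter


def fold_stacks(samples):
    """Aggregate call stacks (outermost frame first) into sorted 'a;b;c count' lines."""
    counts = Counter(";".join(stack) for stack in samples)
    return [f"{stack} {count}" for stack, count in sorted(counts.items())]


samples = [
    ["main", "parse"],
    ["main", "parse"],
    ["main", "render"],
]
for line in fold_stacks(samples):
    print(line)
```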
  • the collection node 100 may perform preparatory work first, for example, the collection node 100 may perform network configuration, and then collect performance information. Specifically, the collection node 100 may receive a user-configured analysis node address list.
  • The analysis node address list includes the address of at least one analysis node 200. The address may include at least one of a uniform resource locator (URL) address, an Internet Protocol (IP) address, or a Message Queuing Telemetry Transport (MQTT) address.
  • The collection node 100 performs network configuration by adding the address of at least one analysis node 200. Further, the collection node 100 may add the addresses of multiple analysis nodes 200, so that the performance information collected by the collection node 100 can be shared by multiple analysis nodes 200.
  • The analysis node address list may include an identifier of at least one analysis node 200 and the address of that analysis node 200.
  • The identifier of the analysis node 200 may be, for example, a universally unique identifier (UUID) of the analysis node 200.
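  • An entry in such an address list could be validated roughly as follows, using only the Python standard library. The function names, the dictionary layout, and the convention that MQTT addresses use an `mqtt://` or `mqtts://` scheme are assumptions; the application does not define the address syntax.

```python
import ipaddress
import uuid
from urllib.parse import urlparse


def classify_address(address):
    """Return 'ip', 'url', or 'mqtt' for an analysis node address."""
    try:
        ipaddress.ip_address(address)
        return "ip"
    except ValueError:
        pass  # not a bare IP address; try URL-style forms next
    scheme = urlparse(address).scheme.lower()
    if scheme in ("mqtt", "mqtts"):  # assumed convention for MQTT addresses
        return "mqtt"
    if scheme in ("http", "https"):
        return "url"
    raise ValueError(f"unsupported analysis node address: {address!r}")


def make_entry(address, node_id=None):
    """Build one address-list entry: identifier (UUID), address, and address kind."""
    identifier = node_id if node_id is not None else uuid.uuid4()
    return {
        "id": str(identifier),
        "address": address,
        "kind": classify_address(address),
    }


address_list = [
    make_entry("10.0.0.5"),
    make_entry("https://analysis.example/api"),
    make_entry("mqtt://broker.example:1883"),
]
print([entry["kind"] for entry in address_list])
```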
  • The collection node 100 may send a join request to an analysis node 200 according to the address of the added analysis node 200.
  • When the collection node 100 sends a join request to multiple analysis nodes 200, at least one of the multiple analysis nodes 200 may save the identifier and address of the collection node 100, and then return a join success notification to the collection node 100.
  • The identifier of the collection node 100 may be the UUID of the collection node, and the address of the collection node may be at least one of a URL address, an IP address, or an MQTT address.
  • the analysis node 200 may instruct the collection node 100 to collect performance information.
  • the analysis node 200 may receive collection prompt information input by a user.
  • the collection prompt information is used to prompt the collection node 100 to collect performance information. For example, the collection prompt information may include an application code collection range and at least one collection item. Each collection item can correspond to a hardware event. The collection node 100 can then read the performance information through the hardware debugging interface of the collected node according to the collection prompt information.
  • the analysis node 200 may provide a configuration interface, and the configuration interface is a user interface (UI) that supports user interaction.
  • the UI may be a graphical user interface (graphical user interface, GUI) or a command user interface (command user interface, CUI).
  • the configuration interface 20 includes a first input box 202 for configuring a code collection range and a second input box 206 for configuring a collection item.
  • the user can input the address of the code segment to be collected in the first input box 202, thereby configuring the code collection range.
  • the user can input the hardware events to be collected in the second input box 206, so as to configure the collection items.
  • the configuration interface 20 further includes a browsing control 204, and the user can trigger the browsing control 204 to browse the code file, and then select a code segment in the code file to configure the code collection range.
  • the configuration interface 20 may also include a drop-down control 208. When the drop-down control 208 is triggered, the configuration interface 20 displays a drop-down box 210, and the user can select hardware events to be collected from the drop-down box 210 to configure the collection items.
  • the configuration interface 20 also includes a confirm control 212 and a cancel control 214.
  • the analysis node 200 can generate a configuration file according to the code collection scope and collection items configured by the user. Further, the analysis node 200 may deliver the configuration file to the collection node 100 .
  • the collection node 100 can read the configuration file, configure the configuration register in the collected node according to the collection prompt information carried in the configuration file, and gate the connection between the target register in the collected node and the independent storage component in the collected node.
  • the target register includes a register matching the collection prompt information
  • the independent storage component refers to a storage component independent of the CPU main body in the collected node.
  • the key to implementing performance monitoring by the performance monitoring system 10 lies in the collection node 100 , and the hardware structure of the collection node 100 will be described in detail below.
  • the collection node 100 includes a hardware debugging interface 102 , a control unit 104 , a network transceiver unit 106 and a network output interface 108 .
  • the hardware debugging interface 102 and the network transceiver unit 106 are respectively connected to the control unit 104
  • the network output interface 108 is connected to the network transceiver unit 106 .
  • the collection node 100 further includes a configuration interface 103 .
  • the configuration interface 103 is connected to the control unit 104 .
  • the hardware debugging interface 102 is adapted to the hardware debugging interface of the collected node, so as to transmit information between the collecting node 100 and the collected node, for example, transmit performance information.
  • the hardware debugging interface 102 of the collection node 100 can be connected with the hardware debugging interface of the collected node through a cable.
  • the collection node 100 may be connected to multiple collected nodes in the form of a daisy chain topology, so as to implement performance monitoring on multiple collected nodes.
  • the configuration interface 103 is used to receive an analysis node address list for network configuration.
  • the configuration interface 103 is also used to receive the configuration file sent by the analysis node 200, for example, to receive the collection prompt information carried in the configuration file.
  • the information transmission of the configuration interface 103 may be implemented in various manners.
  • the configuration interface 103 can implement information transmission through electronic interfaces such as a universal synchronous/asynchronous receiver/transmitter (USART), a universal asynchronous receiver/transmitter (UART), a serial peripheral interface (SPI), an inter-integrated circuit (I2C) bus, or buttons.
  • the configuration interface 103 can implement information transmission through infrared communication technology proposed by the Infrared Data Association (IrDA), or implement information transmission through ultrasonic waves, lasers, and the like.
  • the control unit 104 is used to read the performance information from the collected nodes through the hardware debugging interface 102 , forward the read performance information to the network transceiver unit 106 , and output it through the network output interface 108 .
  • the control unit 104 is also used to establish a connection with the analysis node 200 according to the configuration of the configuration interface 103 when starting up for the first time after the network configuration is completed, so as to transmit information with the analysis node 200 .
  • the control unit 104 may receive a configuration file, and configure the collection node 100 and the collected node, so that the collection node 100 collects performance information according to the collection prompt information.
  • the network transceiver unit 106 is used to receive the performance information from the control unit 104, and encapsulate the performance information through a protocol, so that the network output interface 108 forwards the encapsulated performance information to the analysis node 200.
  • the network transceiver unit 106 can support at least one protocol, such as Transmission Control Protocol/Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Zigbee, Bluetooth, Wi-Fi, MQTT, WirelessHART, Modbus, or Industry Standard Architecture (ISA).
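  • to illustrate what encapsulating performance information could look like before it is handed to a transport such as UDP or MQTT, the sketch below packs one sample into a fixed binary record. The wire layout (16-byte node UUID, 2-byte event identifier, 8-byte count) and the function names are assumptions made for this example; the application does not define a payload format.

```python
import struct
import uuid

# Hypothetical record layout in network byte order:
# 16-byte node UUID, unsigned 16-bit event id, unsigned 64-bit event count.
RECORD = struct.Struct("!16sHQ")


def encapsulate(node_id, event_id, count):
    """Serialize one performance sample into a fixed-size payload."""
    return RECORD.pack(node_id.bytes, event_id, count)


def decapsulate(payload):
    """Recover (node_id, event_id, count) from a payload built by encapsulate."""
    raw_id, event_id, count = RECORD.unpack(payload)
    return uuid.UUID(bytes=raw_id), event_id, count
```

A fixed-size record keeps the framing trivial for datagram transports; a length-prefixed or self-describing format would be needed if collection items carried variable-length data such as call stacks.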
  • the network output interface 108 is used to forward the performance information (for example, the encapsulated performance information) to the analysis node 200 in a wireless or wired manner.
  • the wired method includes optical fiber, network cable, etc.
  • the wireless method includes fifth generation (5G), fourth generation (4G), third generation (3G), and second generation (2G) mobile communication, etc.
  • the collection node 100 may have multiple implementation forms.
  • the collection node 100 may be powered by alternating current or batteries.
  • when the collected node is fixed, the performance information of the application in the collected node can be collected by a collection node 100 powered by AC power; when the collected node is mobile, the performance information of the application in the collected node can be collected by a collection node 100 powered by a battery.
  • in a large data center, the collected nodes are computing devices such as servers, which are usually fixed. The collection node 100 can therefore also be fixed, and one collection node 100 can support collecting performance information of applications in multiple collected nodes.
  • one collecting node 100 may support collecting performance information of applications in 64 collected nodes.
  • the collection node 100 can stay connected to 64 collected nodes without plugging and unplugging. After the process of collecting performance information is triggered, the collection node 100 can read the performance information of the 64 collected nodes in a time-shared manner.
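  • the time-shared reading of 64 collected nodes can be sketched as a round-robin loop that gives each node one read per time slot. The reader callables below are stand-ins for reads over the hardware debugging interface, and all names are hypothetical.

```python
from itertools import cycle, islice


def time_shared_reads(read_fns, total_slots):
    """Visit collected nodes round-robin, performing one read per time slot."""
    results = {index: [] for index in range(len(read_fns))}
    order = cycle(range(len(read_fns)))
    for index in islice(order, total_slots):
        results[index].append(read_fns[index]())
    return results


def make_fake_node(node_index):
    """Return a reader that simulates one collected node (illustration only)."""
    state = {"reads": 0}

    def read():
        state["reads"] += 1
        return (node_index, state["reads"])

    return read


readers = [make_fake_node(i) for i in range(64)]
samples = time_shared_reads(readers, total_slots=128)
```

With 128 slots and 64 nodes, every node is read exactly twice, so no collected node is starved by another.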
  • the collection node 100 can be powered by an AC power supply. Power consumption is therefore not a concern for the network transceiver unit 106 and the network output interface 108, which can be implemented by a 5G module.
  • the main part of the 5G module is used to realize the function of the network transceiver unit 106
  • the 5G antenna of the 5G module is used to realize the function of the network output interface 108 .
  • the hardware debugging interface 102 may be a Joint Test Action Group (JTAG) interface
  • the control unit 104 may be implemented by a CoreSight component, for example, the control unit 104 may be a CoreSight specification reading module.
  • in medium-sized mobile device scenarios, such as robots and electric vehicles, the collection node 100 usually has its own battery so that it does not drain the power supply of the collected node (that is, the robot or electric vehicle).
  • the collection node 100 is powered by a battery
  • the network transceiver unit 106 and the network output interface 108 of the collection node 100 are realized by a 5G module
  • the hardware debugging interface 102 of the collection node 100 can be a JTAG interface
  • the control unit 104 of the collection node 100 may be implemented by a CoreSight component, for example, the control unit 104 may be a CoreSight specification reading module.
  • the collection node 100 is powered by a relatively lightweight battery
  • the hardware debugging interface 102 of the collection node 100 can be a JTAG interface
  • the control unit 104 of the collection node 100 can be implemented by a CoreSight component, for example, the control unit 104 may be a CoreSight specification reading module.
  • the network transceiver unit 106 and the network output interface 108 of the collection node 100 are realized by an energy-saving communication module such as a Zigbee communication module.
  • FIG. 1 to FIG. 6 illustrate the structure of the performance monitoring system 10 and the collection node 100 in the embodiment of the present application in detail. Next, the performance monitoring method of the embodiment of the present application is introduced from the perspective of the performance monitoring system 10.
  • the method includes:
  • the collection node 100 receives the analysis node address list configured by the user, and adds the address of the analysis node 200.
  • the analysis node address list includes the address of at least one analysis node 200 .
  • the address of the analysis node 200 may be any one or more of URL address, IP address or MQTT address.
  • the collection node 100 can be powered on first, and then configure the network of the current collection node 100 through the configuration interface 103 of the collection node 100 . Specifically, the collection node 100 receives the analysis node address list through the configuration interface 103 , and then adds the address of at least one analysis node 200 in the analysis node address list, so as to configure the network of the collection node 100 .
  • the performance information collected by the collection node 100 can be shared by the multiple analysis nodes 200 .
  • this embodiment is illustrated with the collection node 100 adding the addresses of multiple analysis nodes 200.
  • the collection node 100 can connect multiple collected nodes in a daisy chain, and once a remote analysis node 200 is added to a collection node 100, that analysis node 200 can access the collected nodes connected to this collection node 100. There is no need to replace the collection node 100 because a new analysis node 200 is added. In this way, the movement of nodes can be reduced and the flexibility of networking can be improved.
  • each record in the analysis node address list may include the identifier and address of the analysis node 200.
  • the identifier of the analysis node 200 may be the UUID of the analysis node 200 , and the UUID of the analysis node 200 is written into the analysis node 200 before leaving the factory, so as to distinguish it from other analysis nodes 200 .
  • S704 The collection node 100 sends a join request to multiple analysis nodes 200 according to the analysis node address list.
  • the collection node 100 can be restarted. Then the collection node 100 may send a join request to the multiple analysis nodes 200 according to the addresses of the multiple analysis nodes 200 added in the analysis node address list.
  • the join request is specifically used to join the network where the analysis node 200 is located.
  • the joining request may carry the identification and address of the collection node 100, so that the analysis node 200 may add the collection node 100 to the network of the analysis node 200 based on the identification and address.
  • the join request may be sent in a polling manner.
  • the collection node 100 may send the join request in other manners, for example, in a concurrent manner.
  • At least one analysis node 200 sends a join success response.
  • At least one analysis node 200 among the plurality of analysis nodes 200 can add the address of the collection node 100, for example, add the identifier and address of the collection node 100, so as to add the collection node 100 to the network of the analysis node 200. The analysis node 200 then returns a join success response to the collection node 100.
  • the joining success response is used to notify the collection node 100 of joining success.
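  • the join exchange described above can be sketched with in-memory objects: the collection node polls each configured analysis node, and each reachable analysis node saves the collection node's identifier and address before acknowledging. The class, the `reachable` flag, and the addresses are illustrative assumptions; a real deployment would carry these messages over one of the supported network protocols.

```python
import uuid


class AnalysisNode:
    """In-memory stand-in for an analysis node 200."""

    def __init__(self, address, reachable=True):
        self.address = address
        self.reachable = reachable
        self.members = {}  # collection node identifier -> address

    def handle_join(self, node_id, node_address):
        """Save the collection node's identifier and address, then acknowledge."""
        if not self.reachable:
            return False
        self.members[node_id] = node_address
        return True


def join_all(node_id, node_address, analysis_nodes):
    """Poll each analysis node in turn; return the addresses that accepted."""
    joined = []
    for node in analysis_nodes:
        if node.handle_join(node_id, node_address):
            joined.append(node.address)
    return joined


collector_id = uuid.uuid4()
nodes = [AnalysisNode("192.0.2.10"), AnalysisNode("192.0.2.11", reachable=False)]
joined = join_all(collector_id, "192.0.2.50", nodes)
```

Polling is the simplest ordering; sending the requests concurrently would only change how `join_all` iterates, not what each analysis node stores.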
  • the analysis node 200 receives the collection prompt information input by the user.
  • the collection prompt information is used to prompt the collection node 100 to collect performance information.
  • the collection prompt information may include a code collection range and at least one collection item.
  • the code collection scope is used to indicate the code fragments that need to collect performance information, and then perform performance monitoring.
  • the range of code collection can be represented by the address of the code fragment.
  • Each collection item can correspond to a hardware event. For example, one collection item may correspond to a cache miss event, and another collection item may correspond to a branch misprediction event.
  • when configuring the code collection range, the user can first determine the source code fragment 802 to be tracked from the application whose performance needs to be monitored, and then view the compiled code to obtain the address of the compiled code segment 804 that corresponds to the source code fragment 802.
  • the address of the compiled code segment 804 can be characterized by a start offset and a length.
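  • a code collection range expressed as a start offset and a length can be modeled as a half-open interval, as in the sketch below. The class and method names are hypothetical and only illustrate how an instruction address might be tested against the configured range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeRange:
    """Code collection range given as a start offset and a length in bytes."""
    start_offset: int
    length: int

    @property
    def end_offset(self):
        """First address past the end of the range (exclusive bound)."""
        return self.start_offset + self.length

    def contains(self, address):
        """Return True if the address lies inside [start_offset, end_offset)."""
        return self.start_offset <= address < self.end_offset
```

Using an exclusive upper bound means two adjacent ranges never both claim the same address.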
  • a tracing module is added inside the CPU chip of the server. That is, the CPU chip of the server includes two parts: the CPU main body and the tracing module. Processes or threads usually run on the main part of the CPU and occupy resources.
  • the tracing module is connected to some registers of the CPU main body through an independent tracing channel to obtain performance information.
  • registers related to performance or program operation are connected to the tracing channel. The user can configure a register address through the configuration interface provided by the analysis node 200, and the analysis node 200 sends the register address to the collection node 100. Through the configuration module in the tracing module, the collection node 100 can then connect the register at that address directly to the independent storage component in the tracing module over the tracing channel. In this way, when the CPU chip is running, the performance information in the above registers will be automatically pushed to the corresponding independent storage components.
  • because the tracing module is independent of the CPU main body, when the CPU chip runs the current application, the tracing module only passively receives performance information and, as an independent circuit, does not affect the operation of the CPU main body. Furthermore, users can also configure parameters such as collection frequency and overflow value, so as to collect performance information according to these parameters.
  • the tracing-related pins in the CPU chip of the server can be connected to an external hardware debugging interface on the main board, and the collection node 100 has a corresponding hardware debugging interface connected to the hardware debugging interface of the server, so that performance information can be collected through the hardware debugging interface.
  • the analysis node 200 generates a configuration file according to the collection prompt information.
  • the analysis node 200 may assemble different collection prompt information configured by the user into a configuration file.
  • the analysis node 200 may acquire a configuration file template, and then fill different collection prompt information into corresponding positions of the configuration file template, thereby generating a configuration file.
  • the configuration file usually has a specific format, for example, a format recognizable by the collection node 100 .
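  • as one possible illustration of filling a configuration file template, the sketch below substitutes the code collection range, the collection items, and a collection frequency into a JSON template. JSON, the field names, and the event names are assumptions chosen for the example; the application does not specify the file format.

```python
import json
from string import Template

# Hypothetical template; the real format is whatever the collection node parses.
CONFIG_TEMPLATE = Template(
    '{"code_range": {"start_offset": $start, "length": $length}, '
    '"collection_items": $items, "frequency_hz": $freq}'
)


def build_config(start_offset, length, items, frequency_hz):
    """Fill the template and check that the result is well-formed JSON."""
    text = CONFIG_TEMPLATE.substitute(
        start=start_offset,
        length=length,
        items=json.dumps(items),
        freq=frequency_hz,
    )
    json.loads(text)  # raises ValueError if the filled template is invalid
    return text


config_text = build_config(0x400, 128, ["cache_miss", "branch_mispredict"], 100)
```

Validating the filled template on the analysis node catches malformed input before the file is delivered to the collection node.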
  • the analysis node 200 may automatically send the configuration file to the collection node 100 after generating the configuration file, or may send the configuration file to the collection node 100 in response to a user-triggered configuration file download operation, which is not limited in this embodiment.
  • the collection node 100 collects performance information from the hardware debugging interface of the collected node according to the configuration file.
  • the collection node 100 receives the configuration file and can configure the configuration register in the collected node according to the collection prompt information in the configuration file, for example, the configuration register in the configuration module (as shown in FIG. 9) of the collected node, so as to gate the line that connects the target register in the collected node to the independent storage component in the collected node.
  • the target register may be a register matching the collection prompt information, such as a register corresponding to a register address in the collection prompt information.
  • when the target register comprises a plurality of registers, the plurality of registers may be connected to different storage spaces of the independent storage component. In this way, when the application is running, the collection node 100 can respectively read different performance information from different storage spaces of the independent storage component through the hardware debugging interface of the collected node.
  • the collection node 100 may read the data in the corresponding storage space at intervals according to the collection frequency in the configuration file, so as to collect performance information.
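  • the gating and interval-based reading described above can be imitated in software with a toy model: each gated register gets its own storage space, simulated hardware pushes values into it, and a collector drains every space once per interval. This is only a behavioral sketch; the register addresses, class names, and the drain-on-read behavior are assumptions, and real access would go through the hardware debugging interface.

```python
import time


class IndependentStorage:
    """Toy model of the independent storage component in the collected node."""

    def __init__(self):
        self.spaces = {}  # storage space index -> list of pushed values
        self.routes = {}  # register address -> storage space index

    def gate(self, register_address):
        """Route a target register to its own storage space."""
        space = len(self.routes)
        self.routes[register_address] = space
        self.spaces[space] = []
        return space

    def push(self, register_address, value):
        """Called by the simulated hardware when a gated register updates."""
        self.spaces[self.routes[register_address]].append(value)

    def drain(self, space):
        """Read and clear one storage space, as the collection node would."""
        values, self.spaces[space] = self.spaces[space], []
        return values


def collect(storage, rounds, frequency_hz, sleep=time.sleep):
    """Drain every storage space once per interval for a number of rounds."""
    samples = {space: [] for space in storage.spaces}
    for _ in range(rounds):
        for space in samples:
            samples[space].extend(storage.drain(space))
        sleep(1.0 / frequency_hz)
    return samples
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.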
  • S716 The collection node 100 sends the performance information to the analysis node 200.
  • S718 The analysis node 200 analyzes the performance information, and obtains an analysis result.
  • the analysis node 200 may analyze the performance information through a statistical method, so as to obtain an analysis result. For example, the analysis node 200 may determine the cache hit ratio based on the number of cache misses and the number of cache hits.
  • the analysis node 200 can also present the analysis results to the user in the form of graphs.
  • the analysis node 200 may generate at least one of a line graph, a histogram, a flame graph or a table according to the analysis result, and then present the line graph, histogram, flame graph or table to the user.
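  • the cache hit ratio mentioned above is the number of hits divided by the total number of cache accesses (hits plus misses). A minimal sketch, with a hypothetical function name and an explicit guard for the case where no events were recorded:

```python
def cache_hit_ratio(hits, misses):
    """Return hits / (hits + misses), or None when no accesses were counted."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total
```

For example, 900 hits and 100 misses give a ratio of 0.9; returning `None` for an empty sample avoids reporting a misleading 0% or 100%.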
  • S702 to S706 above are one specific implementation of networking the collection node 100 and the analysis node 200; the performance monitoring method of the embodiment of the present application may also be executed without performing S702 to S706.
  • S708 to S714 above are one specific implementation in which the collection node 100 reads the performance information through the hardware debugging interface of the collected node; the performance information may also be read through the hardware debugging interface in other ways.
  • the collection node 100 may also include any one or more of a position sensor, an acceleration sensor, an air temperature sensor, or an air pressure sensor, so that the collection node 100 may also return position information, acceleration information, air temperature information, or air pressure information, allowing the analysis node 200 to analyze the operation status of the application under complex external conditions.
  • the performance monitoring method reads the performance information through the hardware debugging interface of the collected node without installing collection software. This removes the software limitation, enables performance monitoring of collected nodes running different operating systems, and provides high availability. Moreover, no additional process or thread is started when the method collects performance information, so the application on the collected node is not affected. The analysis results obtained from this performance information are therefore more accurate, which improves the reliability of performance monitoring.
  • the method supports free configuration of the network, and the performance information transmitted by the collection node 100 does not add load to the existing network. This reduces the pressure on the existing network and allows more, and more detailed, performance information to be transmitted.
  • one collection node 100 can be connected to one or more collected nodes, which reduces the number of collection nodes 100 that need to be configured and reduces the workload of installing and debugging collection nodes 100.
  • the performance monitoring system 10, the collection node 100 and the performance monitoring method performed by the performance monitoring system 10 provided by the embodiment of the present application are introduced above in conjunction with the figures. Next, the computer-readable storage medium and the computer program product provided by the embodiment of the present application are described.
  • the embodiment of the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be any available medium that a computing device can store, or a data storage device such as a data center that includes one or more available media.
  • the available media may be magnetic media (eg, floppy disk, hard disk, magnetic tape), optical media (eg, DVD), or semiconductor media (eg, solid state hard disk), etc.
  • the computer-readable storage medium includes instructions, and the instructions instruct the computing device to execute the steps performed by the analysis node 200 in the above performance monitoring method applied to the performance monitoring system 10 .
  • the embodiment of the present application also provides a computer program product.
  • the computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on the computing device, the processes or functions according to the embodiments of the present application will be generated in whole or in part.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer or data center to another website, computer or data center by wired means (such as coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (such as infrared, radio, microwave, etc.).
  • the computer program product may be a software installation package that can be downloaded and executed on a computing device if any of the aforementioned performance monitoring methods are required.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

A performance monitoring system for monitoring the performance of an application on a collected node. The system comprises a collection node (100) and an analysis node (200). The collection node (100) is connected to the collected node through a hardware debugging interface of the collected node and is configured to read performance information through that interface; the analysis node (200) is configured to receive the performance information sent by the collection node and to analyze the performance information to obtain an analysis result. Because the collection node (100) collects performance information through the hardware debugging interface, no collection software needs to be installed, which overcomes software compatibility limitations; the method is applicable to performance monitoring of various types of collected nodes that have hardware debugging interfaces, and offers higher availability.
PCT/CN2022/089717 2021-07-08 2022-04-28 Performance monitoring system and related method WO2023279815A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202110770996 2021-07-08
CN202110770996.4 2021-07-08
CN202111043661.9A CN115599641A (zh) 2021-07-08 2021-09-07 Performance monitoring system and related method
CN202111043661.9 2021-09-07

Publications (1)

Publication Number Publication Date
WO2023279815A1 true WO2023279815A1 (fr) 2023-01-12

Family

ID=84801188

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/089717 WO2023279815A1 (fr) 2021-07-08 2022-04-28 Performance monitoring system and related method

Country Status (1)

Country Link
WO (1) WO2023279815A1 (fr)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103281366A (zh) * 2013-05-21 2013-09-04 山东地纬计算机软件有限公司 一种支持实时运行状态获取的嵌入式代理监控装置及方法
CN103501253A (zh) * 2013-10-18 2014-01-08 浪潮电子信息产业股份有限公司 一种高性能计算应用特征的监控组织方法
CN104156296A (zh) * 2014-08-01 2014-11-19 浪潮(北京)电子信息产业有限公司 智能监控大规模数据中心集群计算节点的系统和方法
US20150378810A1 (en) * 2013-03-18 2015-12-31 Fujitsu Limited Management apparatus, method and program
CN107133110A (zh) * 2017-04-27 2017-09-05 中国科学院国家授时中心 基于集群并行运算的gnss导航信号海量数据快速处理方法
CN107943668A (zh) * 2017-12-15 2018-04-20 江苏神威云数据科技有限公司 计算机服务器集群日志监控方法及监控平台
CN111131379A (zh) * 2019-11-08 2020-05-08 西安电子科技大学 一种分布式流量采集系统和边缘计算方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22836574

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE