CN112069029A - Performance acquisition monitoring system of domestic platform PMU self-adaptation - Google Patents

Performance acquisition monitoring system of domestic platform PMU self-adaptation

Info

Publication number
CN112069029A
Authority
CN
China
Prior art keywords
pmu
performance
module
monitoring
event
Legal status
Granted
Application number
CN202010920679.1A
Other languages
Chinese (zh)
Other versions
CN112069029B (en)
Inventor
姚蕊
张杨
陈树峰
赵晓燕
赵明辉
Current Assignee
Beijing Institute of Computer Technology and Applications
Original Assignee
Beijing Institute of Computer Technology and Applications
Application filed by Beijing Institute of Computer Technology and Applications
Priority to CN202010920679.1A
Publication of CN112069029A
Application granted
Publication of CN112069029B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3013 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is an embedded system, i.e. a combination of hardware and software dedicated to perform a certain function in mobile devices, printers, automotive or aircraft systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to a PMU (Performance Monitoring Unit) self-adaptive performance acquisition and monitoring system for a domestic platform, belonging to the technical field of computers. In this system, the front end extracts the PMU hardware configuration information, and the back end treats the platform-specific PMU hardware configuration operations as a hardware shielding layer: it receives the front-end configuration information in the form of parameters and then instantiates the configuration. As a result, the core code of the Tianyi kernel PMU does not need to be modified repeatedly as the platform is upgraded, which lets developers concentrate on PMU-based performance optimization of their programs and saves development, adaptation, and debugging time. The invention also provides operation interfaces for user query and setting and for dynamic display of real-time performance analysis results; this visual operation helps developers locate performance hot spots more quickly and intuitively. The system thereby provides technical and application support for monitoring the performance of complex programs on key domestic software and hardware platforms. The invention is simple and effective to implement and meets application requirements.

Description

Performance acquisition monitoring system of domestic platform PMU self-adaptation
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a self-adaptive performance acquisition monitoring system for a domestic platform PMU.
Background
As localization advances in China, many excellent domestic processors and domestic operating systems have emerged. Key domestic software and hardware, represented by the Loongson processor and the Tianyi embedded operating system, are gradually replacing foreign products in critical military fields such as aerospace and national defense. However, compared with more mature foreign commercial processors such as those of Intel, AMD, and IBM, and with mainstream foreign embedded operating systems such as Linux and VxWorks, monitoring and improving both the running condition of complex programs on domestic software and hardware platforms and the overall performance of those platforms remains a serious challenge.
Currently, online performance analysis based on the PMU (Performance Monitoring Unit) is increasingly used for monitoring the performance of parallel programs on multi-core platforms and for optimizing overall platform performance, because it does not require modifying program behavior, is easy to use, has low overhead, and gives accurate monitoring results. Modern processors provide PMUs that capture microarchitecture-level information by collecting events related to operations in the processor, such as the number of cycles spent, Cache misses, or instructions executed. These recorded results provide useful information about how the software uses the hardware. For example, a PMU may be configured so that the corresponding register is incremented by 1 each time a data Cache miss occurs. Reading the counting registers helps reveal what happens inside the chip while a program runs. By counting and analyzing performance-relevant events, a user can learn which low-level hardware behaviors result from different program implementations and then determine which code produces the hardware events that affect program performance. This guides programmers in improving overall system performance through program-level changes, and can also provide an optimization basis for automatic optimization tools.
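As a rough illustration of the counting behavior described above, the following Python sketch models a counting register that increments on its configured event and derives a miss rate from two counts. The class `EventCounter`, the function `miss_rate`, and the event names are hypothetical and do not correspond to any real Loongson register interface.

```python
class EventCounter:
    """Toy model of one PMU counting register (hypothetical, not a real register API)."""

    def __init__(self, event_name):
        self.event_name = event_name  # hardware event this counter is configured for
        self.value = 0

    def observe(self, event_name):
        # The register is incremented by 1 each time the configured event occurs.
        if event_name == self.event_name:
            self.value += 1


def miss_rate(misses, accesses):
    """Fraction of accesses that missed; 0.0 when there were no accesses."""
    return misses / accesses if accesses else 0.0


# Simulated event stream: four data Cache accesses, two of which also report a miss.
stream = ["dcache_access", "dcache_access", "dcache_miss",
          "dcache_access", "dcache_access", "dcache_miss"]
misses = EventCounter("dcache_miss")
accesses = EventCounter("dcache_access")
for ev in stream:
    misses.observe(ev)
    accesses.observe(ev)

rate = miss_rate(misses.value, accesses.value)
```

A derived ratio such as this miss rate is the kind of figure a developer would compare across different implementations of the same routine.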
Existing performance acquisition and monitoring tools based on PMU hardware support include, for foreign platforms, the Perf performance tuning tool and the OProfile performance acquisition and analysis tool under the Linux operating system, and the Intel VTune series of performance acquisition and analysis tools under the Windows operating system. For domestic platforms they mainly include the Tprofiler performance monitoring tool under Linux for the Loongson 2F platform and the DUET tool for the Loongson 3A1000 platform.
Although PMU-based performance monitoring and optimization tools exist both at home and abroad, they depend strongly on the hardware platform and the operating system environment, have essentially no ability to work across hardware platforms, and are considerably limited in portability and extensibility on domestic system platforms. Take the Loongson processor platform, which was developed independently in China on the basis of the MIPS instruction set, as an example. It currently comprises the Loongson 1 series for the low-end embedded field, the Loongson 2 series for desktop computers, and the Loongson 3 series for servers and high-performance computers (HPC). Each series is further divided into multiple products: the Loongson 2 series includes LS2F, LS2H, LS2K1000, LS2K2000, and others, and the Loongson 3 series includes LS3A1000, LS3B1500, LS3A/B2000, LS3A/B3000, LS3A/B4000, LS3C5000, and others. As Loongson products have been refined and developed, the PMU design has changed with them, so the processors differ in PMU configuration, access, and monitorable events. LS2F implements only 1 group of counters and supports performance monitoring of only 32 hardware events, whereas in later products the processor cores and shared cache banks provide 4 sets of 48-bit performance counters that can monitor 573 processor core performance events and 37 shared cache performance events. The existing Tprofiler tool can only be used in the Linux environment of the LS2F platform of the Loongson 2 series, and the DUET tool can only be used in the Linux environment of the LS3A1000 platform of the Loongson 3 series. Each time a new processor product is released, the system-kernel PMU performance acquisition scheme and the client display mode have to be redesigned according to the hardware characteristics; system-level PMU self-adaptation and extensibility are lacking. For the LS3A/B2000, LS3A/B3000, LS3A/B4000, LS3C5000, and other platforms that Loongson released later, no matching PMU performance acquisition and monitoring tools are available in the domestic system environment. Domestic platforms thus lack PMU-adaptive performance acquisition and monitoring capability.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is as follows: to address the shortcomings of existing PMU performance acquisition and monitoring tools for domestic platforms, a PMU self-adaptive performance acquisition and monitoring system is provided, taking the PMU of the domestic Loongson processor platform and the domestic Tianyi embedded operating system as research objects.
(II) Technical scheme
To solve the above technical problem, the invention provides a PMU self-adaptive performance acquisition and monitoring system for a domestic platform. The system consists of a front end located in the development environment layer of the host computer and a back end located in the kernel layer of the target machine, and implements adaptive performance acquisition, monitoring, and analysis for the full series of Loongson platforms. The front end provides an interactive user interface that helps the user perform visual setting, performance analysis, and result presentation, and it extracts the PMU hardware configuration information. The back end treats the platform-specific PMU hardware configuration operations as a hardware shielding layer: it receives the PMU hardware configuration information from the front end in the form of parameters and then instantiates the configuration, so that the core code of the Tianyi kernel PMU does not need to be modified repeatedly as the platform is upgraded.
Preferably, the front end comprises an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module;
the extensible configuration file module is used for adding PMU configuration information for Loongson CPUs of unknown models or models developed in the future, thereby making the performance acquisition and monitoring system extensible; it guides the developer through entering the necessary information by means of drop-down menus or text input boxes, where the necessary information comprises the Loongson model, the coprocessor CP0 total number, the number of performance counters, the bit width of the performance counters, the counter overflow condition, and the monitorable hardware events; the background then assembles the configuration information in a template format the system can recognize, for use by the PMU self-adaptive configuration module at the back end and by the platform-selection drop-down menu in the PMU sampling monitoring setting module;
the PMU sampling monitoring setting module provides a visual interface through which the developer sets the platform, the events to be monitored, the sampling mode, and the sampling period, and it supports online modification of the monitored events and the sampling period;
the configuration file loading module sends the configuration information of the extensible configuration file module and the parameter settings of the PMU sampling monitoring setting module to the Tianyi kernel at the back end on the target machine through a communication interface, in accordance with the RSP remote debugging protocol;
the performance real-time analysis module receives, through the communication interface, the performance data collected by the PMU performance statistics module at the back end, parses the returned data according to the RSP protocol, extracts the required data segment from each returned packet, and through analysis and calculation organizes it into data that the graphical interface can accept;
the analysis result dynamic display module receives the analysis data output by the performance real-time analysis module and renders it in real time as a histogram and a table: the histogram dynamically displays the number of occurrences of all monitored events, and the table dynamically refreshes statistics such as the total number of occurrences of each event, the number of occurrences in each task or function, the number of interrupt overflows, the number of task switches, and the running time.
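To make the extensible configuration file idea more concrete, the sketch below shows one way the listed fields could be collected, validated, and serialized into a template. The `PmuProfile` class, its field names, and the JSON layout are assumptions made for illustration; the source does not specify the template format, and the CP0 field is omitted because its meaning is not detailed.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PmuProfile:
    """Hypothetical template holding the fields entered for one processor model."""
    model: str                # Loongson model name entered by the developer
    counter_count: int        # number of performance counters
    counter_bits: int         # bit width of each performance counter
    overflow_condition: str   # description of when a counter overflows
    events: dict = field(default_factory=dict)  # event name -> event code

    def validate(self):
        if self.counter_count <= 0 or self.counter_bits <= 0:
            raise ValueError("counter_count and counter_bits must be positive")
        if not self.events:
            raise ValueError("at least one monitorable event is required")


def to_template(profile):
    """Serialize a validated profile to a JSON string a back end could parse."""
    profile.validate()
    return json.dumps(asdict(profile), sort_keys=True)


# Placeholder values for illustration only; they are not taken from a datasheet.
example = PmuProfile("EXAMPLE-MODEL", 4, 48, "counter wraps",
                     {"cycles": 0, "dcache_miss": 1})
text = to_template(example)
```

Validating before serialization means an incomplete entry is rejected on the host side, before anything is sent to the target machine.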
Preferably, the PMU self-adaptive configuration module at the back end is based on the Tianyi embedded operating system kernel and encapsulates the platform-related low-level PMU call interfaces in the form of parameters or data structures; after receiving the configuration information and parameter settings loaded by the front end, it instantiates all of them to complete the low-level PMU self-adaptive processing.
Preferably, the PMU adaptive configuration module includes functions of:
PMU automatic configuration:
ty_device_driver PMU_AUTOCONFIG(sys_device_major_number major, sys_device_minor_number minor, void *pargp, PMU_attri *pmuatti), which writes the front-end configuration information and parameter settings into the Tianyi kernel through the structure PMU_attri to complete automatic PMU configuration for different Loongson platforms; the PMU_attri structure records all attributes of the PMU performance counters and is defined as follows:
[Definition of the PMU_attri structure: given as figures BDA0002666634040000051 and BDA0002666634040000061 in the original publication]
Sample engine configuration:
ty_device_driver PMU_TASKSAMPLE(system_tcb *unused, system_tcb *created_task, system_tcb *started_task, system_tcb *switched_task, system_tcb *deleted_task, PMU_task_msg *pmutaskmsg), which binds a PMU performance counter to a monitored task or thread, saves the context and execution order at task switches, records sampling information such as running time, scheduling information, task-level/system-wide counting register values, and the number of interrupt overflows, and stores this information in the structure PMU_task_msg, through which it is passed to the PMU performance statistics module; the structure PMU_task_msg is defined as follows:
[Definition of the PMU_task_msg structure: given as figures BDA0002666634040000062 and BDA0002666634040000071 in the original publication]
PMU interrupt handling:
sys_isr PMU_isr(sys_vector_number vector, uint32_t cause_pci), which parses the PMU interrupt number defined in the configuration information, registers the interrupt, and performs PMU interrupt handling; the specific process is as follows: enable counting via the PCI bit of the Cause register, determine in turn the number of the control counter that generated the interrupt, then increment by 1 the overflow count and the sampling period count recorded for the counting register of the PMU performance counter that generated the interrupt, update the related members of the PMU_task_msg structure, reset the count, and clear the Cause bit.
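The interrupt handling steps above can be sketched as a small simulation: check the counters in order, record an overflow for each one that raised the interrupt, reset its count, and clear the pending cause. All names here (`CounterState`, `count_event`, `pmu_isr`, `total_events`) are hypothetical, and the wrap-at-2^bits overflow model is an assumption made for illustration rather than the overflow condition of any particular processor.

```python
class CounterState:
    """Software view of one performance counter (hypothetical model)."""

    def __init__(self, bits):
        self.bits = bits      # counter width taken from the configuration
        self.value = 0        # current value of the counting register
        self.overflows = 0    # number of overflow interrupts handled so far


def count_event(counters, idx, pending):
    """Count one event on counter idx; flag an interrupt when the counter wraps."""
    ctr = counters[idx]
    ctr.value += 1
    if ctr.value == (1 << ctr.bits):  # assumed overflow condition: wrap past max value
        pending.add(idx)


def pmu_isr(counters, pending):
    """Handle one PMU interrupt: service, in order, every counter flagged as overflowed."""
    serviced = []
    for idx, ctr in enumerate(counters):
        if idx in pending:
            ctr.overflows += 1  # record one more overflow for this counter
            ctr.value = 0       # reset the count
            serviced.append(idx)
    pending.clear()             # clear the interrupt cause
    return serviced


def total_events(ctr):
    """Reconstruct the full event count from the overflow count and the register value."""
    return ctr.overflows * (1 << ctr.bits) + ctr.value


# Drive counter 1 of two 4-bit counters with 37 events.
counters = [CounterState(4), CounterState(4)]
pending = set()
for _ in range(37):
    count_event(counters, 1, pending)
    if pending:
        pmu_isr(counters, pending)
```

Keeping the overflow count in software is what lets a narrow hardware register represent a much larger total.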
Preferably, the PMU performance statistics module at the back end is independent of the PMU hardware; it parses the input parameters, such as the acquisition mode and the monitored events, that the developer set in the front-end configuration file, passes each parameter to the PMU self-adaptive configuration module through a Tianyi kernel system call, and collates the sampling information passed back by the PMU self-adaptive configuration module and returns it to the performance real-time analysis module at the front end for drawing the histogram.
Preferably, the PMU performance statistics module includes the functions of:
① start event acquisition and monitoring;
② pause acquisition and monitoring;
③ resume acquisition and monitoring;
④ exit monitoring;
⑤ read the data of a specified control register;
⑥ read the data of a specified counting register;
⑦ modify online the hardware events monitored by the PMU;
⑧ modify the sampling frequency online.
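The control functions listed above suggest a small state machine. The sketch below is one possible reading, assuming that online modification is allowed only while monitoring is running or paused; the `StatsController` class and its method names are hypothetical, and the two register-read functions are left out because they depend on hardware access.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"


class StatsController:
    """Hypothetical controller for start/pause/resume/exit and online modification."""

    def __init__(self, events, frequency):
        self.state = State.IDLE
        self.events = list(events)
        self.frequency = frequency

    def _move(self, expected, new):
        if self.state is not expected:
            raise RuntimeError(f"cannot go from {self.state.value} to {new.value}")
        self.state = new

    def start(self):
        self._move(State.IDLE, State.RUNNING)

    def pause(self):
        self._move(State.RUNNING, State.PAUSED)

    def resume(self):
        self._move(State.PAUSED, State.RUNNING)

    def quit(self):
        self.state = State.STOPPED  # exit is allowed from any state

    def _require_active(self):
        if self.state not in (State.RUNNING, State.PAUSED):
            raise RuntimeError("monitoring is not active")

    def set_events(self, events):
        self._require_active()      # online change: monitoring must already be started
        self.events = list(events)

    def set_frequency(self, frequency):
        self._require_active()
        if frequency <= 0:
            raise ValueError("frequency must be positive")
        self.frequency = frequency
```

For example, calling `start()`, then `set_events(["dcache_miss"])`, then `quit()` succeeds, while calling `set_frequency` after `quit()` raises `RuntimeError`.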
Preferably, the application program runs on the target machine, runtime information about the program is obtained from the back end, and the front end and the back end communicate using the RSP communication protocol.
Preferably, the sampling mode includes sampling based on events and sampling based on time.
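The difference between the two sampling modes can be illustrated with a short sketch over a list of event timestamps: event-based sampling takes a sample every fixed number of events, while time-based sampling reads the running count at fixed intervals. The function names and the timestamp representation are assumptions made for illustration.

```python
def event_based_samples(timestamps, period):
    """Return the timestamps at which a sample is taken: every `period`-th event."""
    if period <= 0:
        raise ValueError("period must be positive")
    return [t for i, t in enumerate(timestamps, start=1) if i % period == 0]


def time_based_samples(timestamps, interval, end_time):
    """Return (tick, events_so_far) pairs read every `interval` time units.

    `timestamps` must be sorted in ascending order.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    samples = []
    seen = 0
    tick = interval
    while tick <= end_time:
        # Count every event that occurred up to and including this tick.
        while seen < len(timestamps) and timestamps[seen] <= tick:
            seen += 1
        samples.append((tick, seen))
        tick += interval
    return samples


# Event occurrence times in arbitrary time units.
times = [1, 2, 3, 7, 8, 9, 10, 15]
by_event = event_based_samples(times, 3)
by_time = time_based_samples(times, 5, 15)
```

Event-based sampling concentrates samples where the event is frequent, whereas time-based sampling gives evenly spaced readings regardless of activity.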
Preferably, the analysis result dynamic display module can start and stop receiving data at any time, and the plotting interface can start and pause accordingly.
The invention also provides a self-adaptive performance acquisition monitoring method of the domestic platform PMU, which is realized by utilizing the system.
(III) Advantageous effects
In the PMU self-adaptive performance acquisition and monitoring system of the invention, the front end extracts the PMU hardware configuration information, and the back end treats the platform-specific PMU hardware configuration operations as a hardware shielding layer: it receives the front-end configuration information in the form of parameters and then instantiates the configuration. As a result, the core code of the Tianyi kernel PMU does not need to be modified repeatedly as the platform is upgraded, which lets developers concentrate on PMU-based performance optimization of their programs and saves development, adaptation, and debugging time. The invention also provides operation interfaces for user query and setting and for dynamic display of real-time performance analysis results; this visual operation helps developers locate performance hot spots more quickly and intuitively. The system thereby provides technical and application support for monitoring the performance of complex programs on key domestic software and hardware platforms. The invention is simple and effective to implement and meets application requirements.
Drawings
Fig. 1 is a schematic block diagram of a system provided by the present invention.
Detailed Description
In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.
To keep pace with the continued refinement and development of the domestic Loongson processor platform, and to avoid spending large amounts of time revising and adapting the operating system kernel's PMU module whenever the PMU hardware design changes, the operations that touch hardware directly must be separated from the system kernel so that the PMU handling becomes self-adaptive. The PMU self-adaptive performance acquisition and monitoring system is therefore divided into a front end and a back end. The front end fills the platform's PMU attribute configuration file with the information supplied by the user in a set format and transmits it to the back end through the RSP protocol; this form is convenient for the user and provides extensible support for subsequent Loongson platform upgrades. The back-end system kernel does not define the specific attributes of any PMU; each low-level PMU operation is encapsulated in the form of parameters or structures, and after the configuration information is received the parameters are instantiated to complete the low-level PMU self-adaptive processing. Together, the front end and the back end perform PMU self-adaptive performance acquisition and monitoring, ensuring portability and extensibility on domestic system platforms.
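The design choice described here, a back end that holds no model-specific constants and is instantiated from received parameters, can be sketched as follows. The `GenericPmu` class and the parameter dictionary keys are hypothetical, and the two parameter sets are placeholders rather than values for specific processors.

```python
class GenericPmu:
    """Hypothetical back-end driver: every hardware property comes from parameters."""

    def __init__(self, params):
        self.mask = (1 << params["counter_bits"]) - 1   # largest storable value
        self.counters = [0] * params["counter_count"]
        self.selected = [None] * params["counter_count"]
        self.event_codes = dict(params["events"])

    def select(self, index, event_name):
        """Bind a counter to an event by name, using the event table from the parameters."""
        if event_name not in self.event_codes:
            raise KeyError(event_name)
        self.selected[index] = self.event_codes[event_name]
        self.counters[index] = 0

    def write(self, index, value):
        """Store a value, truncated to the configured counter width."""
        self.counters[index] = value & self.mask


# Two placeholder parameter sets; the same driver code serves both.
narrow = GenericPmu({"counter_count": 2, "counter_bits": 32, "events": {"cycles": 0}})
wide = GenericPmu({"counter_count": 4, "counter_bits": 48,
                   "events": {"cycles": 0, "dcache_miss": 1}})

narrow.write(0, 1 << 40)   # does not fit in 32 bits, so it is truncated to 0
wide.write(0, 1 << 40)     # fits in 48 bits
wide.select(3, "dcache_miss")
```

Supporting a new processor model then means supplying a new parameter set, with no change to the driver class itself.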
The PMU self-adaptive performance acquisition and monitoring system of the invention builds its performance monitoring and analysis tool from two parts, a front end located in the development environment layer of the host computer and a back end located in the kernel layer of the target machine, and implements adaptive performance acquisition, monitoring, and analysis for the full series of Loongson platforms. The front end provides an interactive user interface and consists of an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module; it helps the user perform visual setting, performance analysis, and result presentation. The back end mainly consists of a PMU self-adaptive configuration module and a PMU performance statistics module. The front end and the back end exchange data through the RSP remote debugging protocol. Fig. 1 shows the PMU self-adaptive performance acquisition and monitoring scheme, in which the front end and the back end form the overall architecture for adaptive performance acquisition, monitoring, and analysis.
Specifically, the design scheme of the system front end is as follows:
(1) The extensible configuration file module is used for adding PMU configuration information for Loongson CPUs of unknown models or models developed in the future, thereby making the self-adaptive performance acquisition and monitoring system extensible. It guides the developer through entering the necessary information by means of drop-down menus or text input boxes. The necessary information comprises the Loongson model, the coprocessor CP0 total number, the number of performance counters, the bit width of the performance counters, the counter overflow condition, the monitorable hardware events, and the like. The background then assembles the configuration information in a template format the system can recognize, for use by the back-end PMU self-adaptive configuration module and by the platform-selection drop-down menu in the PMU sampling monitoring setting module.
(2) The PMU sampling monitoring setting module provides a visual interface through which the developer sets parameters such as the platform, the events to be monitored, the sampling mode (event-based sampling or time-based sampling), and the sampling period, and it supports online modification of the monitored events and the sampling period.
(3) The configuration file loading module sends the configuration information of the extensible configuration file module and the parameter settings of the PMU sampling monitoring setting module to the Tianyi kernel at the back end on the target machine through a communication interface, in accordance with the RSP remote debugging protocol.
(4) The performance real-time analysis module receives, through the communication interface, the performance data collected by the PMU performance statistics module at the back end, parses the returned data according to the RSP protocol, extracts the required data segment from each returned packet, and through analysis and calculation organizes it into data that the graphical interface can accept.
(5) The analysis result dynamic display module receives the analysis data output by the performance real-time analysis module and renders it in real time as a histogram and a table. The histogram dynamically displays the number of occurrences of all monitored events, and the table dynamically refreshes statistics such as the total number of occurrences of each event, the number of occurrences in each task (or function), the number of interrupt overflows, the number of task switches, and the running time. The analysis result dynamic display module can start and stop receiving data at any time, and the plotting interface starts and pauses accordingly.
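As an illustration of how the analysis and display modules might turn returned samples into histogram and table data, the sketch below aggregates per-task event counts and renders a text bar chart. The record layout `(task, event, count)` and the function names are assumptions; the source does not specify the data format.

```python
from collections import defaultdict


def summarise(records):
    """Aggregate (task, event, count) records into histogram and table data."""
    per_event = defaultdict(int)
    per_task = defaultdict(lambda: defaultdict(int))
    for task, event, count in records:
        per_event[event] += count
        per_task[task][event] += count
    # Histogram: events ordered by total count (descending), then by name.
    histogram = sorted(per_event.items(), key=lambda kv: (-kv[1], kv[0]))
    # Table: one row per task with its per-event counts.
    table = [(task, dict(events)) for task, events in sorted(per_task.items())]
    return histogram, table


def render_bars(histogram, width=20):
    """Render the histogram as text lines, scaling the largest bar to `width`."""
    top = max((count for _, count in histogram), default=0)
    lines = []
    for name, count in histogram:
        bar = "#" * (round(width * count / top) if top else 0)
        lines.append(f"{name:<12} {bar} {count}")
    return lines


records = [("taskA", "dcache_miss", 30),
           ("taskB", "dcache_miss", 10),
           ("taskA", "cycles", 100)]
histogram, table = summarise(records)
lines = render_bars(histogram)
```

A graphical front end would feed `histogram` to a chart widget and `table` to a grid, refreshing both each time a new batch of records arrives.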
Because the application program runs on the target machine, runtime information about the program must be obtained from the back end. The front end and the back end communicate using the RSP communication protocol. The protocol is a datagram-oriented transport-layer protocol: no connection needs to be established between the software unit and the target machine before a datagram is sent, and there is no timeout-retransmission or similar mechanism. Transmission is therefore fast, latency and overhead are low, and the overhead limits of real-time acquisition and monitoring can be met.
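The source does not detail the packet format. Assuming that "RSP remote debugging protocol" refers to the GDB Remote Serial Protocol, each packet is framed as `$<data>#<checksum>`, where the checksum is the sum of the data bytes modulo 256 written as two hexadecimal digits. The sketch below shows that framing only; escaping of special characters is omitted, and the payload `pmu:start` is a made-up example rather than a real command.

```python
def frame(payload: str) -> bytes:
    """Wrap a payload as $<data>#<two-hex-digit checksum> (GDB RSP style framing)."""
    data = payload.encode("ascii")
    checksum = sum(data) % 256
    return b"$" + data + b"#" + f"{checksum:02x}".encode("ascii")


def unframe(packet: bytes) -> str:
    """Verify the framing and checksum of a packet and return its payload."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    data, received = packet[1:-3], packet[-2:]
    if int(received, 16) != sum(data) % 256:
        raise ValueError("checksum mismatch")
    return data.decode("ascii")


packet = frame("pmu:start")   # hypothetical payload, for illustration only
payload = unframe(packet)
```

The receiver recomputes the checksum over the data bytes, so a corrupted packet is rejected before its contents are interpreted.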
The design scheme of the system back end is as follows:
(1) The PMU self-adaptive configuration module is based on the Tianyi embedded operating system kernel and encapsulates the platform-related low-level PMU call interfaces in the form of parameters or data structures; after receiving the configuration information and parameter settings loaded by the front end, it instantiates all of them to complete the low-level PMU self-adaptive processing. The PMU self-adaptive configuration module comprises the following main functions:
PMU automatic configuration:
ty_device_driver PMU_AUTOCONFIG(sys_device_major_number major, sys_device_minor_number minor, void *pargp, PMU_attri *pmuatti), which writes the front-end configuration information and parameter settings into the Tianyi kernel through the structure PMU_attri to complete automatic PMU configuration for different Loongson platforms. The PMU_attri structure records all attributes of the PMU performance counters and is defined as follows:
[Definition of the PMU_attri structure: given as figures BDA0002666634040000121 and BDA0002666634040000131 in the original publication]
Sample engine configuration:
ty_device_driver PMU_TASKSAMPLE(system_tcb *unused, system_tcb *created_task, system_tcb *started_task, system_tcb *switched_task, system_tcb *deleted_task, PMU_task_msg *pmutaskmsg), which binds PMU performance counters to monitored tasks (threads), saves the context and execution order at task switches, records running time, scheduling information, task-level/system-wide counting register values, the number of interrupt overflows, and the like, and stores this information in the structure PMU_task_msg, through which it is passed to the PMU performance statistics module. The structure PMU_task_msg is defined as follows:
[Definition of the PMU_task_msg structure: given as figures BDA0002666634040000132 and BDA0002666634040000141 in the original publication]
PMU interrupt handling:
sys_isr PMU_isr(sys_vector_number vector, uint32_t cause_pci), which parses the PMU interrupt number defined in the configuration information, registers the interrupt, and performs PMU interrupt handling. The specific process is as follows: enable counting via the PCI bit of the Cause register, determine in turn the number of the control counter that generated the interrupt, then increment by 1 the overflow count and the sampling period count recorded for the counting register of the PMU performance counter that generated the interrupt, update the related members of the PMU_task_msg structure, reset the count, and clear the Cause bit.
(2) The PMU performance statistics module is independent of the PMU hardware. It parses the input parameters, such as the acquisition mode and the monitored events, that the developer set in the front-end configuration file, and passes each parameter to the PMU self-adaptive configuration module through a Tianyi kernel system call. It also collates the sampling information passed back by the PMU self-adaptive configuration module and returns it to the front-end performance real-time analysis module for drawing the histogram, helping the developer locate performance hot spots through a visual interface. The PMU performance statistics module comprises the following main functions:
① start event acquisition and monitoring;
② pause acquisition and monitoring;
③ resume acquisition and monitoring;
④ exit monitoring;
⑤ read the data of a specified control register;
⑥ read the data of a specified counting register;
⑦ modify online the hardware events monitored by the PMU;
⑧ modify the sampling frequency online.
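The per-task figures that the back end collects, such as event counts per task and the number of task switches, depend on attributing counter values to whichever task was running. The sketch below shows one simple way to do that at each task switch, in the spirit of the sample engine described for the back end. The `TaskSampler` class is hypothetical, and counter wraparound is ignored to keep the example short.

```python
class TaskSampler:
    """Hypothetical per-task accounting driven by task-switch notifications."""

    def __init__(self):
        self.current = None    # task currently charged for events
        self.last_value = 0    # counter value seen at the previous switch
        self.per_task = {}     # task name -> accumulated event count
        self.switches = 0      # number of switches away from a running task

    def on_switch(self, next_task, counter_value):
        """Call at every task switch with the current counting register value."""
        if self.current is not None:
            # Charge the events counted since the last switch to the outgoing task.
            delta = counter_value - self.last_value
            self.per_task[self.current] = self.per_task.get(self.current, 0) + delta
            self.switches += 1
        self.current = next_task
        self.last_value = counter_value


sampler = TaskSampler()
sampler.on_switch("taskA", 0)     # taskA starts running
sampler.on_switch("taskB", 120)   # taskA counted 120 events
sampler.on_switch("taskA", 150)   # taskB counted 30 events
sampler.on_switch(None, 400)      # taskA counted another 250 events, then idle
```

Charging the difference between successive readings avoids resetting the hardware counter at every switch.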
The system is suitable for the full series of Loongson processor platforms. Based on the Tianyi embedded operating system environment, it makes the whole acquisition and monitoring process self-adaptive, including PMU self-adaptive configuration for any LS CPU, sampling mode setting, selection of the monitored event list, and statistical display; it is extensible and can continue to support subsequent Loongson products. The system effectively shields the large differences in PMU operation between hardware platforms and provides upper-layer program developers with a uniform performance analysis interface and result display. Developers no longer need to spend effort modifying the PMU monitoring module and its callers in the system kernel to adapt to each platform, so they can concentrate on the performance optimization of their programs. The system thereby provides technical and application support for monitoring the performance of complex programs on key domestic software and hardware platforms.
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A PMU-adaptive performance acquisition and monitoring system for a domestic platform, characterized by comprising a front end located in the development environment layer of a host computer and a back end located in the kernel layer of a target machine, the system being used to realize adaptive performance acquisition, monitoring and analysis for the full series of Loongson platforms; the front end is used for providing an interactive user interface that helps the user perform visual setting, performance analysis and result presentation, and for extracting PMU hardware configuration information; the back end is used for treating the platform-related PMU hardware configuration operations as a hardware shielding layer, receiving the PMU hardware configuration information from the front end in the form of parameters and then instantiating the configuration, so that the core PMU code of the Sky Bright kernel does not need to be modified each time the platform is upgraded.
2. The system of claim 1, wherein the front end is composed of an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module;
the extensible configuration file module is used for adding configuration information for the PMU of a Loongson CPU of an unknown model or one developed in the future, thereby making the performance acquisition monitoring system extensible; it guides the developer through an interface, in the form of drop-down menus or text input boxes, to enter the necessary information, which comprises the Loongson model, the coprocessor CP0 total number, the number of performance counters, the bit width of the performance counters, the counter overflow condition and the hardware events that can be monitored; in the background, the configuration information is assembled according to a template format the system can recognize, for recognition by the PMU adaptive configuration module at the back end and for the platform-selection drop-down menu in the PMU sampling monitoring setting module;
the PMU sampling monitoring setting module is used for providing the developer with a visual interface for setting the platform, the events to be monitored, the sampling mode and the sampling period, and supports online modification of the monitored events and the sampling period;
the configuration file loading module is used for sending the configuration information of the extensible configuration file module and the parameter setting information of the PMU sampling monitoring setting module, through a communication interface and according to the RSP remote debugging protocol, to the Sky Bright kernel at the back end on the target machine;
the performance real-time analysis module is used for receiving, through the communication interface, the performance data collected by the PMU performance statistics module at the back end, parsing the returned data according to the RSP protocol, extracting the required data segment from the returned packet, and converting it by analysis and calculation into data the graphical interface can accept;
the analysis result dynamic display module is used for receiving the analysis data output by the performance real-time analysis module and presenting it in real time as a histogram and a table, where the histogram dynamically displays the occurrence counts of all monitored events, and the table dynamically refreshes data such as the total occurrence count of the counted events, the occurrence count per task or function, the number of overflow interrupts, the number of task switches, and the running time.
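The configuration file loading module and the performance real-time analysis module both exchange data over RSP. Assuming that RSP here means the GDB Remote Serial Protocol, each packet has the form `$<payload>#<checksum>`, where the checksum is the modulo-256 sum of the payload bytes written as two hex digits. The sketch below frames and validates such a packet in C. The function names are hypothetical, and escaping of special characters and run-length encoding are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Modulo-256 sum of the payload bytes, as used by GDB RSP packets. */
uint8_t rsp_checksum(const char *payload, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + (uint8_t)payload[i]);
    return sum;
}

/* Frame a payload as "$<payload>#<2 hex digits>".
 * Returns the frame length, or -1 if `out` is too small. */
int rsp_frame(const char *payload, char *out, size_t out_size)
{
    size_t len = strlen(payload);
    if (out_size < len + 5)          /* '$' + payload + '#' + 2 hex + NUL */
        return -1;
    return snprintf(out, out_size, "$%s#%02x", payload,
                    (unsigned)rsp_checksum(payload, len));
}

/* Validate a frame and copy out its payload (the "required data segment").
 * Returns the payload length, or -1 on a malformed frame or bad checksum. */
int rsp_unframe(const char *frame, char *payload, size_t payload_size)
{
    size_t flen = strlen(frame);
    unsigned int expected;

    if (flen < 4 || frame[0] != '$' || frame[flen - 3] != '#')
        return -1;
    size_t len = flen - 4;
    if (payload_size < len + 1)
        return -1;
    if (sscanf(frame + flen - 2, "%2x", &expected) != 1)
        return -1;
    if (rsp_checksum(frame + 1, len) != (uint8_t)expected)
        return -1;
    memcpy(payload, frame + 1, len);
    payload[len] = '\0';
    return (int)len;
}
```

For example, the payload `OK` is framed as `$OK#9a`, since 0x4f + 0x4b = 0x9a. How the patent's front end lays out performance data inside the payload is not specified in the text.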
3. The system of claim 2, wherein the back-end PMU adaptive configuration module is configured, based on the Sky Bright embedded operating system kernel, to encapsulate the platform-dependent low-level PMU call interfaces in parametric form or as data structures, and to instantiate them after receiving the configuration information and parameter setting information loaded by the front end, thereby completing the low-level PMU adaptive processing.
4. The system of claim 3, wherein the functions of the PMU adaptive configuration module include:
PMU automatic configuration:
ty_device_driver PMU_AUTOCONFIG(sys_device_major_number major, sys_device_minor_number minor, void *pargp, pmu_attri *pmuatti), which sets the configuration information and parameter setting information from the front end in the Sky Bright kernel through the structure pmu_attri, completing the automatic PMU configuration for the different Loongson platforms, wherein the pmu_attri structure records all attributes of the PMU performance counters and is defined as follows:
typedef struct pmu_attri
{
    uint32_t event_type;              // records the hardware event types
    uint32_t event_nump[event_type];  // records the number of events for each event type
    char pmu_event[event_type];       // records the hardware event name for each hardware event type
    uint32_t pmu_cp0[2];              // records the coprocessor CP0 total number
    uint32_t *pmu_select;             // identifies the select number for each group of PMU performance counters
    uint64_t pmu_cnt;                 // records the control register field set
    uint64_t pmu_cntl;                // records the count register field set
    uint32_t cause_pci;               // identifies the performance counter overflow interrupt indication
}pmuAttri;
Sampling engine configuration:
ty_device_driver PMU_TASKSAMPLE(system_tcb *unused, system_tcb *created_task, system_tcb *started_task, system_tcb *switched_task, system_tcb *deleted_task, pmu_task_msg *pmutaskmsg), which binds a PMU performance counter to a monitored task or thread, saves the context and execution order on task switches, records sampling information such as the running time, scheduling information, task-level or system-wide count register values and the number of overflow interrupts, and saves it and passes it to the PMU performance statistics module through the structure pmu_task_msg, wherein the structure pmu_task_msg is defined as follows:
typedef struct pmu_task_msg
{
    char taskname[32];            // name of the currently monitored task
    uint32_t taskid;              // id of the current task
    uint32_t event_cnt;           // number of overflow interrupts generated so far, i.e. the number of sampling periods the event has completed
    uint64_t sample_cnt;          // number of occurrences of the current event (not the total)
    uint64_t sample_period;       // sampling period of the current performance monitoring
    uint64_t sample_freq;         // sampling frequency of the current performance monitoring
    uint64_t total_cnt;           // total number of event occurrences so far: total_cnt = event_cnt * sample_period + sample_cnt
    uint64_t total_time_enabled;
    uint64_t total_time_running;
    uint32_t flag;                // whether the current PMU performance counter is enabled
}pmuMsg;
PMU interrupt processing:
sys_isr PMU_isr(sys_vector_number vector, uint32_t cause_pci), which is responsible for parsing the PMU interrupt number defined in the configuration information, registering the interrupt, and executing the PMU interrupt processing; the specific processing is as follows: enable counting via the PCI bit of the Cause register, determine in turn which control counter generated the interrupt, increment by 1 the overflow count and the sampling-period count of the count register in the PMU performance counter that generated the interrupt, update the relevant members of the pmu_task_msg structure, reset the count, and clear Cause.
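The bookkeeping described for the interrupt path, together with the relation total_cnt = event_cnt * sample_period + sample_cnt from the pmu_task_msg comments, can be sketched as follows. This is an illustration only: `pmu_sample` is a reduced, hypothetical stand-in for pmu_task_msg, the helper names are invented, and the Cause value is passed in as a plain integer rather than read from hardware.

```c
#include <stdint.h>

/* Reduced stand-in for pmu_task_msg: only the fields used for counting. */
typedef struct {
    uint32_t event_cnt;      /* overflow interrupts seen so far           */
    uint64_t sample_cnt;     /* events counted since the last overflow    */
    uint64_t sample_period;  /* events per overflow                       */
    uint64_t total_cnt;      /* event_cnt * sample_period + sample_cnt    */
} pmu_sample;

/* Returns nonzero if the given Cause value signals a performance counter
 * interrupt. The mask is a parameter because the text stores the indication
 * in the configurable cause_pci attribute. */
int pmu_overflow_pending(uint32_t cause, uint32_t cause_pci_mask)
{
    return (cause & cause_pci_mask) != 0;
}

/* Recompute the running total from the documented relation. */
static void pmu_update_total(pmu_sample *m)
{
    m->total_cnt = (uint64_t)m->event_cnt * m->sample_period + m->sample_cnt;
}

/* Hypothetical overflow path: one more full period has elapsed and the
 * count register restarts from zero. */
void pmu_on_overflow(pmu_sample *m)
{
    m->event_cnt += 1;
    m->sample_cnt = 0;
    pmu_update_total(m);
}

/* Hypothetical read path: record the partial count between overflows. */
void pmu_on_read(pmu_sample *m, uint64_t events_since_overflow)
{
    m->sample_cnt = events_since_overflow;
    pmu_update_total(m);
}
```

With a sampling period of 1000, two overflows followed by a partial reading of 250 give a total of 2 * 1000 + 250 = 2250 events. The test below uses bit 26 as the PCI mask, which is the position of the PCI bit in the standard MIPS Cause register; whether every Loongson model uses that position is an assumption.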
5. The system of claim 4, wherein the back-end PMU performance statistics module is independent of the PMU hardware and is configured to parse the input parameters, such as the acquisition mode and the monitored events, set by the developer in the front-end configuration file, to pass the parameters to the PMU adaptive configuration module through a system call of the Sky Bright kernel, and also to collate the sampling information passed back by the PMU adaptive configuration module and return it to the front-end performance real-time analysis module for drawing the histogram.
6. The system of claim 5, wherein the functions of the PMU performance statistics module include: ① starting event acquisition and monitoring; ② pausing acquisition and monitoring; ③ resuming acquisition and monitoring; ④ quitting monitoring; ⑤ reading the data of a specified control register; ⑥ reading the data of a specified count register; ⑦ modifying online the hardware events monitored by the PMU; ⑧ modifying the sampling frequency online.
7. The system of claim 1, wherein the application program runs on the target machine, information about the running program is obtained from the back end, and communication between the front end and the back end uses the RSP communication protocol.
8. The system of claim 2, wherein the sampling mode comprises event-based sampling and time-based sampling.
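The two sampling modes differ in what triggers a sample. A common way to realize event-based sampling on a counter that interrupts on overflow is to preload the counter so that the overflow condition is reached after one sampling period, while time-based sampling uses a timer whose interval follows from the sampling frequency. The sketch below shows both calculations. It is not taken from the text: the function names are hypothetical, and the overflow condition is modeled as "a given bit of the counter becomes set", which is an assumption standing in for the configurable overflow condition the front end collects.

```c
#include <stdint.h>

/* Event-based sampling: value to load into the counter so that `period`
 * further events set bit `overflow_bit` (which must be below 64).
 * Returns 0 if the period does not fit below the threshold. */
uint64_t pmu_preload(unsigned overflow_bit, uint64_t period)
{
    uint64_t threshold = (uint64_t)1 << overflow_bit;
    return period >= threshold ? 0 : threshold - period;
}

/* Time-based sampling: timer interval in nanoseconds for a sampling
 * frequency in Hz. Returns 0 for an invalid frequency of 0. */
uint64_t pmu_timer_interval_ns(uint64_t freq_hz)
{
    return freq_hz == 0 ? 0 : 1000000000ull / freq_hz;
}
```

For instance, with an overflow condition on bit 31 and a period of 1000 events, the preload value is 2^31 - 1000, and a sampling frequency of 1000 Hz corresponds to a timer interval of 1,000,000 ns.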
9. The system of claim 2, wherein the analysis result dynamic display module can start and stop receiving data at any time, and the plotting interface starts and pauses accordingly.
10. A PMU-adaptive performance acquisition and monitoring method for a domestic platform, implemented using the system of any one of claims 1 to 9.
CN202010920679.1A 2020-09-04 2020-09-04 Domestic platform PMU self-adaptive performance acquisition monitoring system Active CN112069029B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010920679.1A CN112069029B (en) 2020-09-04 2020-09-04 Domestic platform PMU self-adaptive performance acquisition monitoring system

Publications (2)

Publication Number Publication Date
CN112069029A (en) 2020-12-11
CN112069029B (en) 2023-11-14

Family

ID=73665607

Country Status (1)

Country Link
CN (1) CN112069029B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant