CN112069029A - Performance acquisition monitoring system of domestic platform PMU self-adaptation - Google Patents
Performance acquisition monitoring system of domestic platform PMU self-adaptation
- Publication number
- CN112069029A (CN202010920679.1A)
- Authority
- CN
- China
- Prior art keywords
- pmu
- performance
- module
- monitoring
- event
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3051—Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3013—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is an embedded system, i.e. a combination of hardware and software dedicated to perform a certain function in mobile devices, printers, automotive or aircraft systems
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computer Hardware Design (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to a PMU (Performance Monitoring Unit) self-adaptive performance acquisition and monitoring system for a domestic platform, and belongs to the technical field of computers. In the system, the front end extracts the PMU hardware configuration information, and the back end treats the platform-specific PMU configuration operations as a hardware shielding layer: it receives the front-end configuration information as parameters and then instantiates it into a concrete configuration. As a result, the PMU core code of the Tianyi kernel does not need to be modified each time the platform is upgraded, developers can concentrate on the PMU-based performance optimization of their programs, and the time spent on adaptation and debugging is saved. The invention also provides operation interfaces for user query and setting and for dynamic display of real-time performance analysis results; this visual operation helps developers locate performance hot spots more quickly and intuitively. The invention provides technical and application support for performance monitoring of complex programs on key domestic software and hardware platforms, is simple and effective to implement, and meets application requirements.
Description
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a self-adaptive performance acquisition monitoring system for a domestic platform PMU.
Background
As China's localization effort deepens, many excellent domestic processors and operating systems have emerged. Key domestic software and hardware, represented by the Loongson processor and the Tianyi embedded operating system, is gradually replacing foreign products in critical military fields such as aerospace and national defense. However, compared with mature commercial processors from vendors such as Intel, AMD, and IBM and mainstream foreign embedded operating systems such as Linux and VxWorks, monitoring and improving the behavior of complex programs on a domestic software and hardware platform, as well as the overall performance of the platform, remains a serious challenge.
Online performance analysis based on the PMU (Performance Monitoring Unit) is increasingly used to monitor parallel programs on multi-core platforms and to optimize overall platform performance, because it does not require modifying program behavior, is easy to use, has low overhead, and gives accurate results. Modern processors provide a PMU that captures micro-architecture-level events related to operations in the processor, such as elapsed cycles, cache misses, or executed instructions. These records provide useful information about how software uses the hardware. For example, a PMU may be configured so that a register is incremented by 1 each time a data cache miss occurs. Reading the count registers reveals what happens inside the chip while a program runs. By counting and analyzing the events that affect performance, a user can learn which low-level hardware behaviors different implementations produce and which code generates the hardware events that hurt performance. This guides programmers in improving overall system performance through program-level changes and also provides a basis for automatic optimization tools.
Existing performance acquisition and monitoring tools that rely on PMU hardware support include, for foreign platforms, the Perf performance tuning tool and the OProfile performance acquisition and analysis tool under Linux, and the Intel VTune series of performance acquisition and analysis tools under Windows. For domestic platforms, they include the Tprofiler performance monitoring tool under Linux on the Loongson 2F platform and the DUET tool for the Loongson 3A1000 platform.
Although PMU-based performance monitoring and optimization tools exist both at home and abroad, they depend strongly on the hardware platform and the operating system environment, have essentially no ability to work across hardware platforms, and therefore limit the portability and extensibility of domestic system platforms. Take the domestically developed, MIPS-based Loongson processor family as an example. It currently comprises the Loongson 1 series for low-end embedded applications, the Loongson 2 series for desktop computers, and the Loongson 3 series for servers and high-performance computers (HPC). Each series is further divided into multiple products: the Loongson 2 series includes LS2F, LS2H, LS2K1000, LS2K2000, and others, and the Loongson 3 series includes LS3A1000, LS3B1500, LS3A/B2000, LS3A/B3000, LS3A/B4000, LS3C5000, and others. As Loongson products have evolved, the PMU design has changed with them, so the processors differ in PMU configuration, access, and monitorable events. At one end, LS2F implements only one group of counters and supports monitoring of only 32 hardware events; at the other, later products provide four groups of 48-bit performance counters, including on the shared cache banks, and can monitor 573 processor-core performance events and 37 shared-cache performance events.
The existing Tprofiler tool can be used only in the Linux environment of the Loongson 2 LS2F platform, and the DUET tool only in the Linux environment of the Loongson 3 LS3A1000 platform. Whenever a new processor product is released, the kernel-level PMU acquisition scheme and the client display have to be redesigned for the new hardware characteristics, because system-level PMU self-adaptation and extensibility are lacking. For the later Loongson platforms such as LS3A/B2000, LS3A/B3000, LS3A/B4000, and LS3C5000, no matching PMU performance acquisition and monitoring tool is available in the domestic system environment. In short, domestic platforms lack a performance acquisition and monitoring capability that adapts to the PMU.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is as follows: to address the defects of existing PMU performance acquisition and monitoring tools for domestic platforms, a PMU self-adaptive performance acquisition and monitoring system is provided, taking the PMU of the domestic Loongson processor platform and the domestic Tianyi embedded operating system as the objects of study.
(II) Technical solution
To solve the above technical problem, the invention provides a PMU self-adaptive performance acquisition and monitoring system for a domestic platform. The system consists of a front end located in the development environment layer of the host computer and a back end located in the kernel layer of the target machine, and implements self-adaptive performance acquisition, monitoring, and analysis for the full series of Loongson platforms. The front end provides an interactive user interface that helps the user perform visual setting, performance analysis, and result presentation, and extracts the PMU hardware configuration information. The back end treats the platform-specific PMU configuration operations as a hardware shielding layer, receives the PMU hardware configuration information from the front end as parameters, and then instantiates it into a concrete configuration, so that the PMU core code of the Tianyi kernel does not need to be modified each time the platform is upgraded.
Preferably, the front end comprises an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module;
the extensible configuration file module is used for adding configuration information for Loongson CPU PMUs of unknown models or of models developed in the future, thereby making the performance acquisition and monitoring system extensible; it guides the developer through drop-down menus or text input boxes to enter the necessary information, which comprises the Loongson model, the total number for coprocessor CP0, the number of performance counters, the bit width of the performance counters, the counter overflow condition, and the monitorable hardware events; in the background, the configuration information is assembled according to a template format the system can recognize, so that it can be recognized by the PMU self-adaptive configuration module at the back end and appear in the platform-selection drop-down menu of the PMU sampling monitoring setting module;
the PMU sampling monitoring setting module is used for providing a visual interface through which the developer sets the parameters for platform selection, selection of the events to be monitored, sampling mode selection, and sampling period, and supports online modification of the monitored events and the sampling period;
the configuration file loading module is used for sending the configuration information of the extensible configuration file module and the parameter settings of the PMU sampling monitoring setting module, through a communication interface and according to the RSP remote debugging protocol, to the Tianyi kernel at the back end on the target machine;
the performance real-time analysis module is used for receiving, through the communication interface, the performance data collected by the PMU performance statistics module at the back end, parsing the returned data according to the RSP protocol, extracting the required data segments from the returned packets, and computing from them data in a form the graphical interface can accept;
the analysis result dynamic display module is used for receiving the analysis data output by the performance real-time analysis module and presenting it in real time as a histogram and a table, where the histogram dynamically displays the number of occurrences of all monitored events, and the table dynamically refreshes data such as the total number of occurrences of the counted events, the number of occurrences in each task or function, the number of interrupt overflows, the number of task switches, and the running time.
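As a rough illustration of the extensible configuration file module, the record below collects the fields listed above and serializes them into a flat template. This is a minimal sketch in Python, not the patent's format: the class name `PmuProfile`, the field names, the `key=value` template layout, and the placeholder model names used in any example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PmuProfile:
    """Hypothetical record of the fields a developer enters in the front end."""
    model: str                   # Loongson model name chosen by the developer
    cp0_total: int               # total number entered for coprocessor CP0
    counter_count: int           # number of performance counters
    counter_bits: int            # bit width of each performance counter
    overflow_condition: str      # counter overflow condition (free text here)
    events: Dict[int, str] = field(default_factory=dict)  # event id -> name

    def to_template(self) -> str:
        """Serialize to a key=value template (this layout is an assumption)."""
        lines = [
            f"model={self.model}",
            f"cp0_total={self.cp0_total}",
            f"counter_count={self.counter_count}",
            f"counter_bits={self.counter_bits}",
            f"overflow_condition={self.overflow_condition}",
        ]
        lines += [f"event.{eid}={name}" for eid, name in sorted(self.events.items())]
        return "\n".join(lines)


def platform_menu(profiles: List[PmuProfile]) -> List[str]:
    """Entries for the platform drop-down of the sampling setting module."""
    return sorted(p.model for p in profiles)
```

Keeping the record separate from its serialization means a new processor model only adds data, which is the extensibility property the module is meant to provide.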
Preferably, the PMU self-adaptive configuration module at the back end is based on the kernel of the Tianyi embedded operating system and encapsulates the platform-dependent low-level PMU call interfaces in parameter form or as data structures; after receiving the configuration information and parameter settings loaded by the front end, it instantiates all of them to complete the low-level PMU self-adaptation.
Preferably, the PMU self-adaptive configuration module includes the following functions:
PMU automatic configuration:
ty_device_driver PMU_AUTOCONFIG(sys_device_major_number major, sys_device_minor_number minor, void *pargp, PMU_attri *pmuatti), which writes the front-end configuration information and parameter settings into the Tianyi kernel through the structure PMU_attri to complete automatic PMU configuration for different Loongson platforms, where the PMU_attri structure records all attributes of the PMU performance counters and is defined as follows: [structure definition not included in this text]
Sampling engine configuration:
ty_device_driver PMU_TASKSAMPLE(system_tcb *unused, system_tcb *created_task, system_tcb *started_task, system_tcb *switched_task, system_tcb *deleted_task, PMU_task_msg *pmutaskmsg), which binds the PMU performance counters to the monitored task or thread, saves the context and execution order when tasks are switched, records sampling information such as run time, scheduling information, task-level and system-wide count register values, and the number of interrupt overflows, and stores this information in the structure PMU_task_msg and passes it to the PMU performance statistics module, where the structure PMU_task_msg is defined as follows: [structure definition not included in this text]
PMU interrupt handling:
sys_isr PMU_isr(sys_vector_number vector, Uint32_t cause_pci), which is responsible for parsing the PMU interrupt number defined in the configuration information, registering the interrupt, and executing PMU interrupt handling; the specific process is: enable the PCI (performance counter interrupt) bit of the Cause register, determine in turn which control counter generated the interrupt, record the overflow count of the count register in the corresponding PMU performance counter and increment its sampling-period count by 1, update the related members of the PMU_task_msg structure, reset the count, and clear the Cause flag.
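The overflow bookkeeping performed by the interrupt handler can be modeled in a few lines. The following is a simplified simulation in Python, not kernel code: `CounterState`, `pmu_isr`, and `total_events` are hypothetical names, the counter width is a parameter as it would be when supplied by the front-end configuration, and each step is assumed to add fewer than 2^bits events so that at most one wrap occurs per step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CounterState:
    bits: int            # counter width taken from the configuration
    value: int = 0       # current value of the count register
    overflows: int = 0   # number of interrupt overflows recorded so far

    def add(self, events: int) -> bool:
        """Advance the counter; return True if it wrapped (interrupt raised)."""
        limit = 1 << self.bits
        total = self.value + events
        self.value = total % limit   # the register keeps only the low bits
        return total >= limit


def pmu_isr(counters: List[CounterState], pending: List[int]) -> None:
    """Record one overflow for every counter index flagged as pending."""
    for idx in pending:
        counters[idx].overflows += 1
    pending.clear()                  # stands in for clearing the interrupt flag


def total_events(c: CounterState) -> int:
    """Reconstruct the full count from overflow count and register value."""
    return c.overflows * (1 << c.bits) + c.value
```

With a 4-bit counter and five steps of 7 events each, the counter wraps twice and `total_events` returns 35, which is why the overflow count has to be stored alongside the register value.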
Preferably, the PMU performance statistics module at the back end is independent of the PMU hardware. It parses the input parameters, such as the acquisition mode and the monitored events, that the developer set in the front-end configuration file, passes each parameter to the PMU self-adaptive configuration module through a system call of the Tianyi kernel, and collects and organizes the sampling information passed back by the PMU self-adaptive configuration module, returning it to the performance real-time analysis module at the front end so that the histogram can be drawn.
Preferably, the PMU performance statistics module includes the following functions:
① starting event acquisition and monitoring;
② pausing acquisition and monitoring;
③ resuming acquisition and monitoring;
④ quitting monitoring;
⑤ reading the data of a specified control register;
⑥ reading the data of a specified count register;
⑦ modifying online the hardware events monitored by the PMU;
⑧ modifying the sampling frequency online.
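The lifecycle commands and the two online-modification commands in the list above can be viewed as a small state machine. The sketch below is one possible reading in Python; the state names, the transition table, and the rule that online changes are accepted only while monitoring is running or paused are assumptions, not requirements stated in the patent.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"
    STOPPED = "stopped"


# Allowed state changes for the four lifecycle commands (an assumed policy).
TRANSITIONS = {
    ("start", State.IDLE): State.RUNNING,
    ("pause", State.RUNNING): State.PAUSED,
    ("resume", State.PAUSED): State.RUNNING,
    ("quit", State.RUNNING): State.STOPPED,
    ("quit", State.PAUSED): State.STOPPED,
}


class StatsSession:
    def __init__(self, events, frequency):
        self.state = State.IDLE
        self.events = list(events)
        self.frequency = frequency

    def command(self, name):
        """Apply start, pause, resume, or quit if the current state allows it."""
        key = (name, self.state)
        if key not in TRANSITIONS:
            raise ValueError(f"{name!r} not allowed in state {self.state.value}")
        self.state = TRANSITIONS[key]

    def set_events(self, events):
        """Online modification of the monitored hardware events."""
        self._require_active()
        self.events = list(events)

    def set_frequency(self, frequency):
        """Online modification of the sampling frequency."""
        self._require_active()
        if frequency <= 0:
            raise ValueError("frequency must be positive")
        self.frequency = frequency

    def _require_active(self):
        if self.state not in (State.RUNNING, State.PAUSED):
            raise ValueError("monitoring is not active")
```

An explicit transition table keeps the statistics logic independent of any particular PMU, in line with the module being described as hardware-independent.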
Preferably, the application program runs on the target machine, the run-time information of the program is obtained from the back end, and communication between the front end and the back end uses the RSP communication protocol.
Preferably, the sampling mode includes event-based sampling and time-based sampling.
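The patent does not define the two sampling modes in detail. Under the interpretation common in PMU profilers, event-based sampling takes a sample every N occurrences of the monitored event, while time-based sampling reads the counter at fixed time intervals. The Python sketch below contrasts the two on a list of event timestamps; the function names and the integer time unit are hypothetical.

```python
def event_based_samples(event_times, period):
    """Sample at every `period`-th occurrence of the monitored event."""
    return [t for i, t in enumerate(event_times, start=1) if i % period == 0]


def time_based_samples(event_times, interval, duration):
    """Every `interval` time units, record how many events occurred so far."""
    samples = []
    tick = interval
    while tick <= duration:
        count = sum(1 for t in event_times if t <= tick)
        samples.append((tick, count))
        tick += interval
    return samples
```

For a burst of events at times 1, 2, 3 followed by another at 10, 11, 12, event-based sampling with a period of 3 fires once per burst, whereas time-based sampling with an interval of 5 reports a running total at each tick regardless of activity.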
Preferably, the analysis result dynamic display module can start and stop receiving data at any time, and the plotting interface starts and pauses accordingly.
The invention also provides a PMU self-adaptive performance acquisition and monitoring method for a domestic platform, which is implemented using the above system.
(III) Advantageous effects
In the PMU self-adaptive performance acquisition and monitoring system, the front end extracts the PMU hardware configuration information, and the back end treats the platform-specific PMU configuration operations as a hardware shielding layer: it receives the front-end configuration information as parameters and then instantiates it into a concrete configuration. As a result, the PMU core code of the Tianyi kernel does not need to be modified each time the platform is upgraded, developers can concentrate on the PMU-based performance optimization of their programs, and the time spent on adaptation and debugging is saved. The invention also provides operation interfaces for user query and setting and for dynamic display of real-time performance analysis results; this visual operation helps developers locate performance hot spots more quickly and intuitively. The invention provides technical and application support for performance monitoring of complex programs on key domestic software and hardware platforms, is simple and effective to implement, and meets application requirements.
Drawings
Fig. 1 is a schematic block diagram of a system provided by the present invention.
Detailed Description
In order to make the objects, contents, and advantages of the present invention clearer, the following detailed description of the embodiments of the present invention will be made in conjunction with the accompanying drawings and examples.
To keep pace with the continuing evolution of the domestic Loongson processor platform, and to avoid spending a large amount of time revising and adapting the PMU module of the operating system kernel whenever the PMU hardware design changes, behavior that depends directly on the hardware must be separated from the system kernel. The PMU self-adaptive performance acquisition and monitoring system is therefore divided into a front end and a back end. The front end fills the information supplied by the user into a PMU attribute configuration file for the platform in a set format and transmits it to the back end over the RSP protocol; this form is convenient for the user and provides extensible support for subsequent Loongson platform upgrades. The back-end system kernel does not define any specific PMU attributes: every low-level PMU operation is encapsulated in parameter or structure form, and after the configuration information is received the parameters are instantiated to complete the low-level PMU self-adaptation. Together, the front end and the back end implement PMU self-adaptive performance acquisition and monitoring and ensure the portability and extensibility of the domestic system platform.
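The idea that the back end defines no model-specific PMU attributes and is instantiated entirely from received parameters can be sketched as follows. This is an illustrative Python model rather than kernel code: the class `GenericPmu`, the dictionary keys, and the placeholder model names in any example are hypothetical.

```python
class GenericPmu:
    """Back-end object with no model-specific constants; all data is passed in."""

    REQUIRED = ("model", "counter_count", "counter_bits", "events")

    def __init__(self, params):
        missing = [k for k in self.REQUIRED if k not in params]
        if missing:
            raise KeyError(f"missing configuration fields: {missing}")
        self.model = params["model"]
        self.counter_bits = params["counter_bits"]
        self.max_value = (1 << self.counter_bits) - 1   # largest countable value
        self.supported = set(params["events"])
        # One slot per hardware counter; None means the counter is unused.
        self.bound = [None] * params["counter_count"]

    def bind(self, slot, event):
        """Bind a monitorable event to a counter, validated against the config."""
        if event not in self.supported:
            raise ValueError(f"{event!r} is not monitorable on {self.model}")
        if not 0 <= slot < len(self.bound):
            raise IndexError(f"{self.model} has only {len(self.bound)} counters")
        self.bound[slot] = event
```

Two processors with different counter counts, widths, and event lists are then handled by the same code, differing only in the parameter set loaded from the front end.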
The PMU self-adaptive performance acquisition and monitoring system of the invention builds its performance monitoring and analysis tooling from two parts, a front end located in the development environment layer of the host computer and a back end located in the kernel layer of the target machine, and implements self-adaptive performance acquisition, monitoring, and analysis for the full series of Loongson platforms. The front end provides an interactive user interface and consists of an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module; it helps the user perform visual setting, performance analysis, and result presentation. The back end mainly comprises a PMU self-adaptive configuration module and a PMU performance statistics module. The front end and the back end exchange data through the RSP remote debugging protocol. Fig. 1 shows the PMU self-adaptive performance acquisition and monitoring scheme, in which the front end and the back end together form the overall architecture for self-adaptive performance acquisition, monitoring, and analysis.
Specifically, the design of the system front end is as follows:
(1) The extensible configuration file module is used for adding configuration information for Loongson CPU PMUs of unknown models or of models developed in the future, thereby making the self-adaptive performance acquisition and monitoring system extensible. It guides the developer through drop-down menus or text input boxes to enter the necessary information, which comprises the Loongson model, the total number for coprocessor CP0, the number of performance counters, the bit width of the performance counters, the counter overflow condition, the monitorable hardware events, and so on. In the background, the configuration information is assembled according to a template format the system can recognize, so that it can be recognized by the PMU self-adaptive configuration module at the back end and appear in the platform-selection drop-down menu of the PMU sampling monitoring setting module.
(2) The PMU sampling monitoring setting module is used for providing a visual interface through which the developer sets parameters such as platform selection, selection of the events to be monitored, sampling mode (event-based or time-based sampling), and sampling period, and supports online modification of the monitored events and the sampling period.
(3) The configuration file loading module is used for sending the configuration information of the extensible configuration file module and the parameter settings of the PMU sampling monitoring setting module, through a communication interface and according to the RSP remote debugging protocol, to the Tianyi kernel at the back end on the target machine.
(4) The performance real-time analysis module is used for receiving, through the communication interface, the performance data collected by the PMU performance statistics module at the back end, parsing the returned data according to the RSP protocol, extracting the required data segments from the returned packets, and computing from them data in a form the graphical interface can accept.
(5) The analysis result dynamic display module is used for receiving the analysis data output by the performance real-time analysis module and presenting it in real time as a histogram and a table. The histogram dynamically displays the number of occurrences of all monitored events, and the table dynamically refreshes data such as the total number of occurrences of the counted events, the number of occurrences in each task (or function), the number of interrupt overflows, the number of task switches, and the running time. The analysis result dynamic display module can start and stop receiving data at any time, and the plotting interface starts and pauses accordingly.
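The step from parsed samples to the histogram and table described in items (4) and (5) is essentially an aggregation. The following Python sketch shows one way to do it; the `(task, event, count)` sample layout, the function names, and the text rendering are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def aggregate(samples):
    """Summarize (task, event, count) samples for the histogram and the table."""
    per_event = Counter()              # histogram: total occurrences per event
    per_task = defaultdict(Counter)    # table: occurrences per task and event
    for task, event, count in samples:
        per_event[event] += count
        per_task[task][event] += count
    return per_event, per_task


def histogram_rows(per_event, width=20):
    """Render one plain-text bar per event, scaled to the largest count."""
    peak = max(per_event.values(), default=0)
    rows = []
    for event, count in per_event.most_common():
        bar = "#" * (count * width // peak) if peak else ""
        rows.append(f"{event:<12} {bar} {count}")
    return rows
```

Because the aggregation is recomputed from whatever samples have arrived, the display can be refreshed, paused, and resumed without touching the back end.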
Because the application program runs on the target machine, its run-time information must be obtained from the back end. Communication between the front end and the back end uses the RSP communication protocol, which is a datagram-oriented transport-layer protocol: no connection needs to be established between the software unit and the target machine before a datagram is sent, and there is no timeout-retransmission or similar mechanism. Transmission is therefore fast, overhead and transmission delay are greatly reduced, and the overhead limits of real-time acquisition and monitoring can be met.
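The patent does not spell out the packet format it uses. Assuming that RSP here follows the framing of the GDB Remote Serial Protocol, in which a packet is `$`, the payload, `#`, and a two-digit hexadecimal checksum equal to the sum of the payload bytes modulo 256, encoding and decoding can be sketched in Python as below. The function names are hypothetical, and the sketch ignores the escaping that real payloads containing `$` or `#` would need.

```python
def rsp_checksum(payload: bytes) -> int:
    """Sum of the payload bytes modulo 256."""
    return sum(payload) % 256


def rsp_encode(payload: bytes) -> bytes:
    """Frame a payload as $<payload>#<two hex digits>."""
    return b"$" + payload + b"#" + f"{rsp_checksum(payload):02x}".encode("ascii")


def rsp_decode(packet: bytes) -> bytes:
    """Extract the payload of one framed packet and verify its checksum."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    received = int(packet[-2:], 16)    # raises ValueError on non-hex digits
    if rsp_checksum(payload) != received:
        raise ValueError("checksum mismatch")
    return payload
```

For example, `rsp_encode(b"OK")` yields `b"$OK#9a"`, and `rsp_decode` rejects a packet whose checksum digits do not match its payload, which gives the front end a cheap integrity check on each returned data segment.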
The design of the system back end is as follows:
(1) The PMU self-adaptive configuration module is based on the kernel of the Tianyi embedded operating system and encapsulates the platform-dependent low-level PMU call interfaces in parameter form or as data structures. After receiving the configuration information and parameter settings loaded by the front end, it instantiates all of them to complete the low-level PMU self-adaptation. The PMU self-adaptive configuration module comprises the following main functions:
PMU automatic configuration:
ty_device_driver PMU_AUTOCONFIG(sys_device_major_number major, sys_device_minor_number minor, void *pargp, PMU_attri *pmuatti), which writes the front-end configuration information and parameter settings into the Tianyi kernel through the structure PMU_attri to complete automatic PMU configuration for different Loongson platforms, where the PMU_attri structure records all attributes of the PMU performance counters and is defined as follows: [structure definition not included in this text]
Sampling engine configuration:
ty_device_driver PMU_TASKSAMPLE(system_tcb *unused, system_tcb *created_task, system_tcb *started_task, system_tcb *switched_task, system_tcb *deleted_task, PMU_task_msg *pmutaskmsg), which binds the PMU performance counters to the monitored tasks (threads), saves the context and execution order when tasks are switched, records sampling information such as run time, scheduling information, task-level and system-wide count register values, and the number of interrupt overflows, and stores this information in the structure PMU_task_msg and passes it to the PMU performance statistics module, where the structure PMU_task_msg is defined as follows: [structure definition not included in this text]
PMU interrupt handling:
sys_isr PMU_isr(sys_vector_number vector, Uint32_t cause_pci), which is responsible for parsing the PMU interrupt number defined in the configuration information, registering the interrupt, and executing PMU interrupt handling. The specific process is: enable the PCI (performance counter interrupt) bit of the Cause register, determine in turn which control counter generated the interrupt, record the overflow count of the count register in the corresponding PMU performance counter and increment its sampling-period count by 1, update the related members of the PMU_task_msg structure, reset the count, and clear the Cause flag.
(2) The PMU performance statistics module is independent of the PMU hardware. It parses the input parameters, such as the acquisition mode and the monitored events, that the developer set in the front-end configuration file, and passes all parameters to the PMU self-adaptive configuration module through a system call of the Tianyi kernel. It also collects and organizes the sampling information passed back by the PMU self-adaptive configuration module and returns it to the performance real-time analysis module at the front end, which draws the histogram and helps the developer locate performance hot spots through a visual interface. The PMU performance statistics module comprises the following main functions:
① starting event acquisition and monitoring;
② pausing acquisition and monitoring;
③ resuming acquisition and monitoring;
④ quitting monitoring;
⑤ reading the data of a specified control register;
⑥ reading the data of a specified count register;
⑦ modifying online the hardware events monitored by the PMU;
⑧ modifying the sampling frequency online.
The system is applicable to the full range of Loongson processor platforms. In the Tianyi embedded operating system environment, it makes the whole acquisition and monitoring process self-adaptive, including PMU configuration for any LS CPU, sampling mode setting, selection of the monitored event list, and statistical display. It is extensible and can continue to support subsequent Loongson products. The system effectively shields the large differences in PMU operation between hardware platforms and provides upper-layer program developers with a uniform performance analysis interface and result display. Developers no longer need to spend effort adapting the PMU monitoring module and calling programs in the system kernel to each platform, so they can concentrate on the performance optimization of their programs. The system thereby provides technical and application support for performance monitoring of complex programs on key domestic software and hardware platforms.
The above description is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make several modifications and variations without departing from the technical principle of the present invention, and these modifications and variations should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A domestic platform PMU adaptive performance acquisition monitoring system, characterized by comprising a front end located in the development environment layer of an upper computer and a back end located in the kernel layer of a target machine, and used for adaptive performance acquisition, monitoring and analysis across the full series of Loongson platforms; the front end is used for providing an interactive user interface to help a user perform visual setting, performance analysis and result presentation, and for extracting PMU hardware configuration information; the back end is used for treating the configuration operations related to the platform PMU hardware as a hardware shielding layer, receiving the PMU hardware configuration information from the front end in the form of parameters, and then instantiating the configuration, so that the core PMU code of the sky bright kernel does not need to be modified each time the platform is upgraded.
2. The system of claim 1, wherein the front end is composed of an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module;
the extensible configuration file module is used for adding configuration information of Loongson CPU PMUs of unknown models or future development, making the performance acquisition monitoring system extensible; it guides the developer through an interface of drop-down menus or text input boxes to enter the necessary information, which comprises the Loongson model, the total number of the coprocessor CP0, the number of performance counters, the number of performance counter bits, the counter overflow conditions and the monitorable hardware events; the background integrates the configuration information according to a template format recognizable by the system, for recognition by the back-end PMU self-adaptive configuration module and for the platform-selection drop-down menu in the PMU sampling monitoring setting module;
the PMU sampling monitoring setting module is used for providing a visual interface through which the developer sets the platform, the events to be monitored, the sampling mode and the sampling period, and for supporting online modification of the monitored events and the sampling period;
the configuration file loading module is used for sending the configuration information of the extensible configuration file module and the parameter setting information of the PMU sampling monitoring setting module through a communication interface, according to the RSP remote debugging protocol, to the sky bright kernel at the back end on the target machine;
the performance real-time analysis module is used for receiving, through the communication interface, the performance data collected by the back-end PMU performance statistics module, parsing the returned data according to the RSP protocol to extract the required data segment from each returned packet, and converting it by analysis and calculation into data that the graphical interface can accept;
the analysis result dynamic display module is used for receiving the analysis data output by the performance real-time analysis module and rendering it in real time as a histogram and a table; the histogram dynamically displays the number of occurrences of all monitored events, and the table dynamically refreshes data such as the total number of occurrences of the counted events, the number of occurrences per task or function, the number of interrupt overflows, the number of task switches, and the running time.
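As an illustration of the packet handling described for the configuration file loading module and the performance real-time analysis module, the sketch below frames and unframes a payload in the style of the GDB Remote Serial Protocol, assuming that is the protocol meant by "RSP". The function names are hypothetical, the payload layout used by the patented system is not specified in the text, and escaping and run-length encoding are omitted.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Modulo-256 sum of the payload bytes, as used for RSP packet checksums. */
static uint8_t rsp_checksum(const char *data, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + (uint8_t)data[i]);
    return sum;
}

/* Frame a payload as "$<payload>#<two hex digits>".
 * Returns the packet length, or -1 if the output buffer is too small. */
int rsp_frame(const char *payload, char *out, size_t out_size)
{
    size_t len = strlen(payload);
    if (out_size < len + 5) /* '$' + payload + '#' + 2 hex digits + NUL */
        return -1;
    return snprintf(out, out_size, "$%s#%02x",
                    payload, rsp_checksum(payload, len));
}

/* Verify the checksum of a framed packet and copy out its payload.
 * Returns the payload length, or -1 on a malformed packet, a checksum
 * mismatch, or a payload buffer that is too small. */
int rsp_unframe(const char *packet, char *payload, size_t payload_size)
{
    size_t plen = strlen(packet);
    if (plen < 4 || packet[0] != '$')
        return -1;
    const char *hash = strrchr(packet, '#');
    if (hash == NULL || strlen(hash) != 3)
        return -1;
    if (!isxdigit((unsigned char)hash[1]) || !isxdigit((unsigned char)hash[2]))
        return -1;
    size_t len = (size_t)(hash - packet) - 1;
    if (len + 1 > payload_size)
        return -1;
    unsigned int expected = 0;
    if (sscanf(hash + 1, "%2x", &expected) != 1)
        return -1;
    if (rsp_checksum(packet + 1, len) != (uint8_t)expected)
        return -1;
    memcpy(payload, packet + 1, len);
    payload[len] = '\0';
    return (int)len;
}
```

Under these assumptions, the front end would call `rsp_frame` before sending configuration or parameter data and `rsp_unframe` on each returned packet before extracting the performance data segment.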
3. The system of claim 2, wherein the back-end PMU adaptive configuration module is configured to encapsulate, based on the sky bright embedded operating system kernel, the platform-dependent low-level PMU call interface in the form of parameters or data structures, and to instantiate all of these entities after receiving the configuration information and parameter setting information loaded by the front end, thereby completing the low-level PMU adaptive processing.
4. The system of claim 3, wherein the PMU adaptive configuration module contains functions that include:
PMU automatic configuration:
ty_device_driver PMU_AUTOCONFIG(sys_device_major_number major, sys_device_minor_number minor, void *pargp, pmu_attri *pmuatti), which sets the front-end configuration information and parameter setting information in the sky bright kernel through the structure pmu_attri, completing the automatic configuration of the PMUs of different Loongson platforms, wherein the pmu_attri structure records all attributes of the PMU performance counters and is defined as follows:
typedef struct pmu_attri
{
uint32_t event_type; // record hardware event type
uint32_t event_nump[event_type]; // record the number of events corresponding to each event type
char pmu_event[event_type]; // record the hardware event name corresponding to each hardware event type
uint32_t pmu_cp0[2]; // record coprocessor CP0 total number
uint32_t *pmu_select; // identify the select number corresponding to each set of PMU performance counters
uint64_t pmu_cnt; // record control register field set
uint64_t pmu_cntl; // record count register field set
uint32_t cause_pci; // identify performance counter overflow interrupt indication
}pmuAttri;
Sampling engine configuration:
ty_device_driver PMU_TASKSAMPLE(system_tcb *unused, system_tcb *created_task, system_tcb *started_task, system_tcb *switched_task, system_tcb *deleted_task, pmu_task_msg *pmutaskmsg), which binds a PMU performance counter to a monitored task or thread, saves the context and execution order on task switches, records the run time, the scheduling information, and sampling information such as task-level or system-wide count register values and the number of interrupt overflows, and saves and passes them to the PMU performance statistics module through the structure pmu_task_msg, which is defined as follows:
typedef struct pmu_task_msg
{
char taskname[32]; // name of the task currently monitored
uint32_t taskid; // current task id
uint32_t event_cnt; // number of overflow interrupts generated so far, i.e. the number of completed sampling periods
uint64_t sample_cnt; // number of occurrences of the event in the current period (not the total)
uint64_t sample_period; // sampling period of the current performance monitoring
uint64_t sample_freq; // sampling frequency of the current performance monitoring
uint64_t total_cnt; // total number of event occurrences so far: total_cnt = event_cnt * sample_period + sample_cnt
uint64_t total_time_enabled;
uint64_t total_time_running;
uint32_t flag; // whether the current PMU performance counter is enabled
}pmuMsg;
PMU interrupt processing:
sys_isr PMU_isr(sys_vector_number vector, uint32_t cause_pci), which is responsible for parsing out the PMU interrupt number defined in the configuration information, registering the interrupt, and executing the PMU interrupt processing; the specific processing comprises the following steps: with the PCI bit of the cause register enabled, determining in turn the number of the control counter that generated the interrupt, then incrementing by 1 the overflow count and the sampling period count of the count register in the PMU performance counter corresponding to that counter, updating the related members of the pmu_task_msg structure, resetting the count, and clearing the cause.
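The overflow accounting in the interrupt handler and the relation total_cnt = event_cnt * sample_period + sample_cnt can be illustrated with a small simulation. The C sketch below is not the patented implementation: the types `pmu_hw_sketch` and `pmu_msg_sketch`, the counter count of 4, and the software overflow mask standing in for the hardware overflow indication are all hypothetical.

```c
#include <stdint.h>

#define PMU_NUM_COUNTERS 4 /* hypothetical number of performance counters */

/* Reduced, hypothetical version of pmu_task_msg with only the fields used here. */
typedef struct {
    uint32_t event_cnt;     /* overflow interrupts so far */
    uint64_t sample_cnt;    /* events counted since the last overflow */
    uint64_t sample_period; /* events per overflow */
    uint64_t total_cnt;     /* event_cnt * sample_period + sample_cnt */
} pmu_msg_sketch;

/* Software stand-in for the PMU hardware state. */
typedef struct {
    uint32_t overflow_mask;           /* bit i set: counter i has overflowed */
    uint64_t count[PMU_NUM_COUNTERS]; /* current count register values */
    int      cause_pci;               /* stand-in for the interrupt indication */
} pmu_hw_sketch;

/* Recompute the running total from the overflow count and the partial count. */
static void pmu_update_total(pmu_msg_sketch *m)
{
    m->total_cnt = (uint64_t)m->event_cnt * m->sample_period + m->sample_cnt;
}

/* Service a simulated PMU interrupt. For every counter that overflowed:
 * increment its overflow count, reset its count, and update the total.
 * Finally clear the interrupt indication. Returns the number of counters
 * that were serviced. */
int pmu_isr_sketch(pmu_hw_sketch *hw, pmu_msg_sketch msgs[PMU_NUM_COUNTERS])
{
    int serviced = 0;
    if (!hw->cause_pci)
        return 0;
    for (int i = 0; i < PMU_NUM_COUNTERS; i++) {
        if (!(hw->overflow_mask & (1u << i)))
            continue;
        msgs[i].event_cnt += 1;
        msgs[i].sample_cnt = 0;
        hw->count[i] = 0;
        pmu_update_total(&msgs[i]);
        hw->overflow_mask &= ~(1u << i);
        serviced++;
    }
    hw->cause_pci = 0;
    return serviced;
}

/* Fold a mid-period reading of counter i into the message record. */
void pmu_read_sample(const pmu_hw_sketch *hw, int i, pmu_msg_sketch *m)
{
    m->sample_cnt = hw->count[i];
    pmu_update_total(m);
}
```

For example, with a sampling period of 1000, one serviced overflow followed by a mid-period reading of 250 gives a total of 1 * 1000 + 250 = 1250 events.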
5. The system of claim 4, wherein the back-end PMU performance statistics module, which is independent of the PMU hardware, is configured to parse the input parameters, such as the acquisition mode and the monitored events, set by the developer in the front-end configuration file, to pass the parameters to the PMU adaptive configuration module through a system call of the sky bright kernel, and also to collate the sampling information passed back by the PMU adaptive configuration module and return it to the front-end performance real-time analysis module to draw the histogram.
6. The system of claim 5, wherein the PMU performance statistics module includes functions to: ① start event collection and monitoring; ② pause collection and monitoring; ③ resume collection and monitoring; ④ quit monitoring; ⑤ read the data of a specified control register; ⑥ read the data of a specified count register; ⑦ modify the hardware events monitored by the PMU online; ⑧ modify the sampling frequency online.
7. The system of claim 1, wherein the application program runs on the target machine, information about the running program is obtained from the back end, and the communication between the front end and the back end employs the RSP communication protocol.
8. The system of claim 2, wherein the sampling mode comprises event-based sampling and time-based sampling.
9. The system of claim 2, wherein the analysis result dynamic display module is capable of starting and stopping the receiving of data at any time, and the plotting interface starts and pauses accordingly.
10. A domestic platform PMU adaptive performance acquisition monitoring method implemented using the system of any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010920679.1A CN112069029B (en) | 2020-09-04 | 2020-09-04 | Domestic platform PMU self-adaptive performance acquisition monitoring system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112069029A (en) | 2020-12-11 |
CN112069029B (en) | 2023-11-14 |
Family
ID=73665607
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010920679.1A Active CN112069029B (en) | 2020-09-04 | 2020-09-04 | Domestic platform PMU self-adaptive performance acquisition monitoring system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112069029B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100082950A1 (en) * | 2008-09-29 | 2010-04-01 | Kinzhalin Arzhan I | Dynamically reconfiguring platform settings |
CN101937392A (en) * | 2010-08-27 | 2011-01-05 | 华南理工大学 | Dynamic defect detection method for embedded software |
US20150312116A1 (en) * | 2014-04-28 | 2015-10-29 | Vmware, Inc. | Virtual performance monitoring decoupled from hardware performance-monitoring units |
CN205453290U (en) * | 2015-11-28 | 2016-08-10 | 国网江西省电力公司赣东北供电分公司 | Intelligent substation relay protection operating condition real time kinematic monitoring and recorder |
CN106126384A (en) * | 2016-06-12 | 2016-11-16 | 华为技术有限公司 | A kind of method and device of acquisition performance monitor unit PMU event |
WO2017215557A1 (en) * | 2016-06-12 | 2017-12-21 | 华为技术有限公司 | Method and device for collecting performance monitor unit (pmu) events |
CN107977369A (en) * | 2016-10-21 | 2018-05-01 | 北京计算机技术及应用研究所 | Easy to the embedded data base management system of transplanting |
CN107153604A (en) * | 2017-05-17 | 2017-09-12 | 北京计算机技术及应用研究所 | Parallel program performance method for monitoring and analyzing based on PMU |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115061898A (en) * | 2022-08-17 | 2022-09-16 | 杭州安恒信息技术股份有限公司 | Adaptive speed limiting method, device, equipment and medium based on Hadoop analysis platform |
CN115061898B (en) * | 2022-08-17 | 2022-11-08 | 杭州安恒信息技术股份有限公司 | Adaptive speed limiting method, device, equipment and medium based on Hadoop analysis platform |
CN115237728A (en) * | 2022-09-26 | 2022-10-25 | 东方电子股份有限公司 | Visual monitoring method for real-time operating system running state |
CN115237728B (en) * | 2022-09-26 | 2022-12-06 | 东方电子股份有限公司 | Visual monitoring method for real-time operating system running state |
CN118069485A (en) * | 2024-04-25 | 2024-05-24 | 沐曦集成电路(上海)有限公司 | Automatic generation system of register transmission level code of performance counter |
CN118069485B (en) * | 2024-04-25 | 2024-07-05 | 沐曦集成电路(上海)有限公司 | Automatic generation system of register transmission level code of performance counter |
Also Published As
Publication number | Publication date |
---|---|
CN112069029B (en) | 2023-11-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||