CN112069029B - Domestic platform PMU self-adaptive performance acquisition monitoring system
- Publication number: CN112069029B
- Application number: CN202010920679.1A
- Authority: CN (China)
- Prior art keywords: pmu, performance, module, monitoring, task
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3051—Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3013—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is an embedded system, i.e. a combination of hardware and software dedicated to perform a certain function in mobile devices, printers, automotive or aircraft systems
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Computer Hardware Design (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to a domestic platform PMU self-adaptive performance acquisition monitoring system and belongs to the technical field of computers. In this system, the front end extracts the PMU hardware configuration information, while the back end treats the platform's PMU hardware-related configuration operations as a hardware-shielding layer: it receives the front-end configuration information as parameters and then instantiates the configuration. As a result, the PMU code in the Tianyi kernel no longer has to be modified each time the platform is upgraded, developers can concentrate on the performance optimization of the program itself, and development, adaptation, and debugging time is saved. The invention also provides user operation interfaces for query and setting, together with a dynamic visual display of real-time performance analysis results, which helps developers locate performance hot spots more quickly and intuitively. It provides technical support and application support for monitoring the performance of complex programs on key domestic software and hardware platforms. The invention is simple and effective to implement and meets application requirements.
Description
Technical Field
The invention belongs to the technical field of computers, and particularly relates to a domestic platform PMU self-adaptive performance acquisition monitoring system.
Background
As localization advances, a number of excellent domestic processors and operating systems have emerged. Key domestic software and hardware, represented by the Loongson processor and the Tianyi embedded operating system, are gradually replacing foreign products in critical military fields such as domestic aerospace and national defense. However, compared with the more mature foreign commercial processors (Intel, AMD, IBM) and mainstream foreign embedded operating systems (Linux, VxWorks), monitoring and improving the behavior of complex programs on domestic software and hardware platforms, and the overall performance of those platforms, remains a more serious challenge.
Among the methods for monitoring and optimizing the performance of parallel programs on multi-core platforms, and of the platform as a whole, online performance analysis based on the PMU (Performance Monitoring Unit) is increasingly used because it requires no modification of program behavior, is easy to use, has low overhead, and gives accurate results. Modern processors provide a PMU to capture event information at the microarchitecture level, collecting events related to processor operation such as the number of cycles spent, cache misses, or the number of instructions executed. These records provide useful information about how software uses the hardware. For example, the PMU can be configured so that a counter register is incremented by 1 each time a data-cache miss occurs. Reading the counter registers reveals what happens inside the chip while a program runs. By collecting statistics on the events that affect performance and analyzing them, one can learn which underlying hardware behaviors different program implementations produce, and from that determine which hardware events generated by the code affect program performance. This guides programmers in improving overall system performance at the program level and also provides a basis for automatic optimization tools.
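To make the step from raw counts to usable information concrete, the following is a minimal sketch rather than anything taken from the patent. It assumes three counters (cycles, instructions, data-cache misses) read at two instants; the `CounterSnapshot` layout and the `derive_metrics` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class CounterSnapshot:
    """Raw PMU counter values read at one instant (hypothetical layout)."""
    cycles: int
    instructions: int
    dcache_misses: int


def derive_metrics(start: CounterSnapshot, end: CounterSnapshot) -> dict:
    """Turn two raw snapshots into rates a developer can reason about."""
    cycles = end.cycles - start.cycles
    instructions = end.instructions - start.instructions
    misses = end.dcache_misses - start.dcache_misses
    return {
        # Instructions executed per cycle; low values suggest stalls.
        "ipc": instructions / cycles if cycles else 0.0,
        # Data-cache misses per 1000 instructions.
        "dcache_mpki": 1000.0 * misses / instructions if instructions else 0.0,
    }


# Illustrative numbers only, not measurements.
before = CounterSnapshot(cycles=1_000, instructions=800, dcache_misses=10)
after = CounterSnapshot(cycles=5_000, instructions=2_800, dcache_misses=50)
metrics = derive_metrics(before, after)
print(metrics)
```

Ratios such as these are usually easier to compare across program versions than the raw counts themselves.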
Existing PMU-based performance acquisition monitoring tools include, for foreign platforms, the perf and OProfile performance tuning tools under the Linux operating system and the Intel VTune series of performance acquisition and analysis tools under the Windows operating system. For domestic platforms, they include the Tprofile performance monitoring tool under Linux for the Loongson 2F platform and the DUET tool for the Loongson 3A1000 platform.
Although PMU-based performance monitoring and optimization tools exist both in China and abroad, they depend strongly on the hardware platform and the operating system environment, have essentially no cross-platform capability, and are limited in portability and extensibility on domestic system platforms. Take the domestically developed, MIPS-based Loongson processor platform as an example. It comprises the Loongson 1 series, the desktop-oriented Loongson 2 series, and the Loongson 3 series oriented to servers and high-performance computing (HPC), and each series is further divided into several products: the Loongson 2 series includes LS2F, LS2H, LS2K1000, LS2K2000 and others, and the Loongson 3 series includes LS3A1000, LS3B1500, LS3A/B2000, LS3A/B3000, LS3A/B4000, LS3C5000 and others. The PMU hardware design has changed as the Loongson products have matured, so the products differ in PMU configuration, access, and monitorable events. For example, one product provides only 1 group of performance counters and supports monitoring of only 32 hardware events, whereas the 4-core LS3A4000 has per-core performance counters (28 groups) as well as PMU performance counters shared among the cores (37). The existing Tprofiler tool can be used only in the Linux environment of the Loongson 2 LS2F platform, and the DUET tool only in the Linux environment of the Loongson 3 LS3A1000 platform. Every time a new processor product is released, the kernel PMU performance acquisition scheme and the client-side display have to be redesigned for the new hardware characteristics. System-level PMU self-adaptation and extensibility are lacking, and no matching PMU performance acquisition monitoring tool exists in the domestic system environment for the later Loongson platforms such as LS3A/B2000, LS3A/B3000, LS3A/B4000 and LS3C5000. In short, domestic platforms lack PMU-adaptive performance acquisition monitoring.
Disclosure of Invention
(I) Technical problem to be solved
The technical problem to be solved by the invention is: in view of the shortcomings of existing PMU performance acquisition monitoring tools for domestic platforms, to provide a PMU self-adaptive performance acquisition monitoring system, taking the PMU of the domestic Loongson processor platform and the domestic Tianyi embedded operating system as research objects.
(II) Technical solution
To solve the above technical problem, the invention provides a domestic platform PMU self-adaptive performance acquisition monitoring system. It consists of a front end located at the host-machine development environment layer and a back end located at the target-machine kernel layer, and implements self-adaptive performance acquisition, monitoring, and analysis for the full series of Loongson platforms. The front end provides an interactive user interface that helps the user perform visual setup, performance analysis, and result presentation, and it also extracts the PMU hardware configuration information. The back end treats the platform's PMU hardware-related configuration operations as a hardware-shielding layer, receives the front end's PMU hardware configuration information as parameters, and then instantiates the configuration, so that the PMU code in the Tianyi kernel does not have to be modified each time the platform is upgraded.
Preferably, the front end consists of an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module;
the extensible configuration file module is used for adding PMU configuration information for Loongson CPUs of unknown models or models developed in the future, which makes the performance acquisition monitoring system extensible; it guides the developer through the input of the necessary information with drop-down menus or text input boxes, the necessary information comprising the Loongson model, the coprocessor CP0 total count, the number of performance counters, the counter overflow condition, and the monitorable hardware events; in the background, the configuration information is assembled according to a template format the system can recognize, so that it can be recognized by the PMU adaptive configuration module at the back end and by the platform-selection drop-down menu in the PMU sampling monitoring setting module;
the PMU sampling monitoring setting module is used for providing the developer with a visual interface for selecting the platform, selecting the events to be monitored, selecting the sampling mode, and setting the sampling period, and it supports online modification of the monitored events and the sampling period;
the configuration file loading module is used for sending the configuration information of the extensible configuration file module and the parameter settings of the PMU sampling monitoring setting module, through a communication interface and according to the RSP remote debugging protocol, to the Tianyi kernel at the back end on the target machine;
the performance real-time analysis module is used for receiving, through the communication interface, the performance data acquired by the PMU performance statistics module at the back end, parsing the returned data according to the RSP protocol, extracting the required data segments from the returned packets, and, through analysis and calculation, arranging the data into a form the graphical interface can accept;
the analysis result dynamic display module is used for receiving the analysis data output by the performance real-time analysis module and presenting it in real time as a histogram and a table, wherein the histogram dynamically displays the number of occurrences of every monitored event, and the table dynamically refreshes data such as the total number of occurrences of the counted events, the number of occurrences per task or function, the number of overflow interrupts, the number of task switches, and the running time.
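The text does not reproduce the template format used by the extensible configuration file module. As a rough sketch only, the fields it lists (Loongson model, CP0 total count, number of performance counters, overflow condition, monitorable events) could be captured and serialized as follows. The class name, the field names, the use of JSON, and the choice to express the overflow condition as a bit index are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PmuPlatformConfig:
    """One platform entry; every field name here is a hypothetical choice."""
    loongson_model: str
    cp0_total: int
    counter_count: int
    overflow_bit: int  # assumed: counter bit whose setting counts as overflow
    events: dict = field(default_factory=dict)  # event name -> event code

    def validate(self) -> None:
        if self.counter_count <= 0:
            raise ValueError("a platform needs at least one performance counter")
        if not self.events:
            raise ValueError("at least one monitorable event is required")


def to_template(config: PmuPlatformConfig) -> str:
    """Serialize a validated entry into the text form sent to the back end."""
    config.validate()
    return json.dumps(asdict(config), sort_keys=True)


def from_template(text: str) -> PmuPlatformConfig:
    """Rebuild the entry on the receiving side and re-check it."""
    config = PmuPlatformConfig(**json.loads(text))
    config.validate()
    return config


# Placeholder values only; they do not describe any real Loongson product.
example = PmuPlatformConfig(
    loongson_model="LS-EXAMPLE",
    cp0_total=32,
    counter_count=2,
    overflow_bit=31,
    events={"cycles": 0, "dcache_miss": 1},
)
text = to_template(example)
print(text)
```

Keeping validation on both sides means a malformed entry is rejected before it reaches the kernel-side configuration step.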
Preferably, the PMU adaptive configuration module at the back end is based on the kernel of the Tianyi embedded operating system; it encapsulates the platform-related low-level PMU call interfaces as parameters or data structures and, after receiving the configuration information and parameter settings loaded by the front end, instantiates all of them to complete the low-level PMU adaptive processing.
Preferably, the PMU adaptive configuration module provides the following functions:
(1) PMU automatic configuration:
the function ty_device_driver pmu_autoconfig(sys_device_major_number major, sys_device_minor_number minor, void *pargp, PMU_attribute) writes the front-end configuration information and parameter settings into the Tianyi kernel through the structure PMU_attribute, completing the automatic configuration of the PMUs of different Loongson platforms, wherein the PMU_attribute structure records all attributes of the PMU performance counters;
(2) sampling engine configuration:
the function ty_device_driver pmu_tasksample(sys_tcb_un, sys_tcb_created_task, sys_tcb_started_task, sys_tcb_switcd_task, sys_tcb_running_task, sys_tcb_reduced_task, PMU_task_msg) binds PMU_task_msg to a monitored task or thread, saves the context and execution order when tasks switch, and, through the structure PMU_task_msg, saves sampling information such as the running time, the scheduling information, the count register values of the task or of the whole system, and the number of overflow interrupts, and passes it to the PMU performance statistics module;
(3) PMU interrupt processing:
the function sys_isr pmu_isr(sys_vector_number vector, uint32_t cause_pcb) parses the PMU interrupt number defined in the configuration information, registers the interrupt, and performs PMU interrupt processing; the specific processing flow is: enable the PCI bit of the Cause register; determine, counter by counter, which control counter generated the interrupt; for the PMU performance counter that generated the interrupt, increment by 1 the recorded overflow count and sampling-period count of its count register; update the related members of the PMU_task_msg structure; reset the count; and clear the Cause.
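The interrupt-processing flow in item (3) can be modelled in a few lines. The following is a host-side simulation, not kernel code, and it is only a sketch: the structure definitions are not reproduced in this text, so `PmuAttribute`, `PmuTaskMsg`, `FakePmu`, and all of their fields are hypothetical stand-ins. The PCI flag is placed at bit 26 as in the MIPS32 Cause register, which is assumed here to apply to the Loongson parts discussed, and the first step of the flow is interpreted as testing that bit.

```python
from dataclasses import dataclass

CAUSE_PCI = 1 << 26  # performance-counter-interrupt bit (MIPS32 Cause layout, assumed)


@dataclass
class PmuAttribute:
    """Hypothetical subset of the per-platform counter attributes."""
    counter_count: int
    overflow_bit: int


@dataclass
class PmuTaskMsg:
    """Hypothetical per-task sampling record."""
    overflow_counts: list
    sample_periods: list


class FakePmu:
    """Stands in for the count registers and the Cause register."""

    def __init__(self, attr: PmuAttribute):
        self.attr = attr
        self.counters = [0] * attr.counter_count
        self.cause = 0

    def overflowed(self, index: int) -> bool:
        return bool((self.counters[index] >> self.attr.overflow_bit) & 1)


def pmu_isr(pmu: FakePmu, msg: PmuTaskMsg, reload_value: int = 0) -> list:
    """Handle one PMU interrupt and return the indices of overflowed counters."""
    if not pmu.cause & CAUSE_PCI:
        return []  # not a performance-counter interrupt
    hit = []
    for i in range(pmu.attr.counter_count):  # check each counter in turn
        if pmu.overflowed(i):
            msg.overflow_counts[i] += 1   # one more overflow recorded
            msg.sample_periods[i] += 1    # one more sampling period elapsed
            pmu.counters[i] = reload_value  # reset the count
            hit.append(i)
    pmu.cause &= ~CAUSE_PCI  # clear the pending flag
    return hit


pmu = FakePmu(PmuAttribute(counter_count=2, overflow_bit=31))
msg = PmuTaskMsg(overflow_counts=[0, 0], sample_periods=[0, 0])
pmu.counters[1] = 1 << 31  # pretend counter 1 has just overflowed
pmu.cause = CAUSE_PCI
handled = pmu_isr(pmu, msg)
print(handled, msg)
```

Because the counter count and overflow bit come from `PmuAttribute` rather than from constants, the same handler body serves platforms with different counter layouts, which is the point of passing the attributes in as parameters.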
Preferably, the PMU performance statistics module at the back end is independent of the PMU hardware; it parses the input parameters, such as the acquisition mode and the monitored events, that the developer set in the front-end configuration file, passes each parameter to the PMU adaptive configuration module through system calls of the Tianyi kernel, and at the same time collects and organizes the sampling information passed back by the PMU adaptive configuration module and returns it to the performance real-time analysis module at the front end for drawing the histogram.
Preferably, the PMU performance statistics module provides the following functions:
(1) starting event acquisition monitoring;
(2) pausing acquisition monitoring;
(3) resuming acquisition monitoring;
(4) exiting monitoring;
(5) reading the data of a specified control register;
(6) reading the data of a specified count register;
(7) modifying online the hardware event monitored by the PMU;
(8) modifying online the sampling frequency.
Preferably, the application program runs on the target machine, run-time information about the program is obtained from the back end, and communication between the front end and the back end uses the RSP communication protocol.
Preferably, the sampling mode comprises event-based sampling and time-based sampling.
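The text does not say how a sampling period is programmed into the hardware. One common way to realize event-based sampling, shown here only as a sketch of that general technique and not as the patent's method, is to preload the counter so that it overflows after exactly one period of events. The function names and the overflow-bit parameter are assumptions.

```python
def preload_for_period(period: int, overflow_bit: int) -> int:
    """Initial counter value such that `period` further events set the overflow bit."""
    threshold = 1 << overflow_bit
    if not 0 < period <= threshold:
        raise ValueError("period must be positive and fit below the overflow threshold")
    return threshold - period


def events_until_overflow(counter: int, overflow_bit: int) -> int:
    """How many more events a counter can absorb before its overflow bit is set."""
    return max((1 << overflow_bit) - counter, 0)


# Sample once every 10000 events on a counter assumed to overflow at bit 31.
start_value = preload_for_period(10_000, overflow_bit=31)
print(hex(start_value), events_until_overflow(start_value, overflow_bit=31))
```

Time-based sampling would instead read the counters on a periodic timer tick and report the difference between consecutive readings.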
Preferably, the analysis result dynamic display module can start and stop receiving data at any time, and the plotting interface starts and pauses accordingly.
The invention also provides a domestic platform PMU self-adaptive performance acquisition monitoring method realized by the system.
(III) Beneficial effects
In the PMU self-adaptive performance acquisition monitoring system of the invention, the front end extracts the PMU hardware configuration information, while the back end treats the platform's PMU hardware-related configuration operations as a hardware-shielding layer: it receives the front-end configuration information as parameters and then instantiates the configuration. As a result, the PMU code in the Tianyi kernel no longer has to be modified each time the platform is upgraded, developers can concentrate on the performance optimization of the program itself, and development, adaptation, and debugging time is saved. The invention also provides user operation interfaces for query and setting, together with a dynamic visual display of real-time performance analysis results, which helps developers locate performance hot spots more quickly and intuitively. It provides technical support and application support for monitoring the performance of complex programs on key domestic software and hardware platforms. The invention is simple and effective to implement and meets application requirements.
Drawings
Fig. 1 is a schematic block diagram of a system provided by the present invention.
Detailed Description
To make the purpose, content, and advantages of the invention clearer, embodiments of the invention are described in detail below with reference to the drawing and examples.
As the domestic Loongson processor platform continues to mature, changes in the PMU hardware design would otherwise require a great deal of time to be spent revising and adapting the PMU module in the operating system kernel. To avoid this, the invention achieves PMU self-adaptation by stripping direct hardware operations out of the system kernel. The PMU self-adaptive performance acquisition monitoring system is therefore divided into a front end and a back end. The front end fills the information supplied by the user into a platform PMU attribute configuration file in a set format and transmits it to the back end through the RSP protocol. The back-end system kernel does not define any specific PMU attributes; every low-level PMU operation is encapsulated in the form of parameters or structures, and the parameters are instantiated once the configuration information has been received. The front end and the back end together accomplish PMU self-adaptive performance acquisition and monitoring, which ensures portability and extensibility across domestic system platforms.
In the PMU self-adaptive performance acquisition monitoring system, the front end at the host-machine development environment layer and the back end at the target-machine kernel layer together form a performance monitoring and analysis tool that implements self-adaptive performance acquisition, monitoring, and analysis for the full series of Loongson platforms. The front end provides an interactive user interface and consists of an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module, and an analysis result dynamic display module, which help the user perform visual setup, performance analysis, and result presentation. The back end mainly comprises a PMU adaptive configuration module and a PMU performance statistics module. The front end and the back end exchange data through the RSP remote debugging protocol. Fig. 1 shows the PMU self-adaptive performance acquisition monitoring scheme, in which the front end and the back end form the overall architecture for self-adaptive performance acquisition, monitoring, and analysis.
Specifically, the design scheme of the front end of the system is as follows:
(1) The extensible configuration file module is used for adding PMU configuration information for Loongson CPUs of unknown models or models developed in the future, which makes the adaptive performance acquisition monitoring system extensible. It guides the developer through the input of the necessary information with drop-down menus or text input boxes. The necessary information comprises the Loongson model, the coprocessor CP0 total count, the number of performance counters, the counter overflow condition, the monitorable hardware events, and so on. In the background, the configuration information is assembled according to a template format the system can recognize, so that it can be recognized by the PMU adaptive configuration module at the back end and by the platform-selection drop-down menu in the PMU sampling monitoring setting module.
(2) The PMU sampling monitoring setting module provides the developer with a visual interface for selecting the platform, selecting the events to be monitored, selecting the sampling mode (event-based sampling or time-based sampling), and setting the sampling period and other parameters, and it supports online modification of the monitored events and the sampling period;
(3) The configuration file loading module sends the configuration information of the extensible configuration file module and the parameter settings of the PMU sampling monitoring setting module, through a communication interface and according to the RSP remote debugging protocol, to the Tianyi kernel at the back end on the target machine;
(4) The performance real-time analysis module receives, through the communication interface, the performance data acquired by the PMU performance statistics module at the back end, parses the returned data according to the RSP protocol, extracts the required data segments from the returned packets, and, through analysis and calculation, arranges the data into a form the graphical interface can accept;
(5) The analysis result dynamic display module receives the analysis data output by the performance real-time analysis module and presents it in real time as a histogram and a table. The histogram dynamically displays the number of occurrences of every monitored event, and the table dynamically refreshes data such as the total number of occurrences of the counted events, the number of occurrences per task (or function), the number of overflow interrupts, the number of task switches, and the running time. The analysis result dynamic display module can start and stop receiving data at any time, and the plotting interface starts and stops accordingly.
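The step from received samples to the histogram and table of item (5) amounts to a pair of aggregations. The following is a minimal sketch that assumes each sample arrives as a `(task, event, count)` tuple; that shape and the function names are hypothetical and not taken from the patent.

```python
from collections import defaultdict


def aggregate(samples):
    """Fold (task, event, count) samples into per-event and per-task totals."""
    per_event = defaultdict(int)                          # feeds the histogram
    per_task = defaultdict(lambda: defaultdict(int))      # feeds the table
    for task, event, count in samples:
        per_event[event] += count
        per_task[task][event] += count
    return dict(per_event), {t: dict(e) for t, e in per_task.items()}


def hottest_task(per_task, event):
    """Task with the highest total for one event, i.e. a hot-spot candidate."""
    return max(per_task, key=lambda t: per_task[t].get(event, 0))


# Illustrative samples only.
samples = [
    ("taskA", "dcache_miss", 120),
    ("taskB", "dcache_miss", 30),
    ("taskA", "cycles", 5000),
    ("taskB", "dcache_miss", 20),
]
per_event, per_task = aggregate(samples)
print(per_event)
print(hottest_task(per_task, "dcache_miss"))
```

Re-running `aggregate` on each newly received batch is enough to refresh both views.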
Because the application program runs on the target machine, run-time information about the program has to be obtained from the back end. Communication between the front end and the back end uses the RSP communication protocol. The protocol is a datagram-oriented transport-layer protocol: no connection between the software unit and the target machine has to be established before datagrams are sent, and mechanisms such as timeout retransmission are not required. Transmission is therefore fast, and both the overhead and the delay in sending data are greatly reduced, which meets the overhead limits of real-time acquisition and monitoring.
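The packet format is not shown in this text. As a rough sketch, and assuming the RSP named above frames its packets the way GDB's Remote Serial Protocol does (`$payload#checksum`, where the checksum is the modulo-256 sum of the payload bytes written as two hex digits), encoding and checking a packet could look like this. Escaping of reserved characters inside the payload is deliberately left out, and the function names are illustrative.

```python
def rsp_checksum(payload: bytes) -> int:
    """Modulo-256 sum of the payload bytes."""
    return sum(payload) % 256


def rsp_encode(payload: bytes) -> bytes:
    """Wrap a payload as $payload#xx (no escaping of '$', '#', '}' here)."""
    return b"$" + payload + b"#" + f"{rsp_checksum(payload):02x}".encode("ascii")


def rsp_decode(packet: bytes) -> bytes:
    """Return the payload of a framed packet, or raise if it is malformed."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("not an RSP packet")
    payload = packet[1:-3]
    received = int(packet[-2:], 16)
    if rsp_checksum(payload) != received:
        raise ValueError("checksum mismatch")
    return payload


framed = rsp_encode(b"OK")
print(framed)
print(rsp_decode(framed))
```

The performance real-time analysis module would apply a check of this kind before extracting data segments from a returned packet.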
The design scheme of the system rear end is as follows:
(1) The PMU adaptive configuration module is based on the kernel of the Tianyi embedded operating system. It encapsulates the platform-related low-level PMU call interfaces as parameters or data structures and, after receiving the configuration information and parameter settings loaded by the front end, instantiates all of them to complete the low-level PMU adaptive processing. The PMU adaptive configuration module provides the following main functions:
(1) PMU automatic configuration:
the function ty_device_driver pmu_autoconfig(sys_device_major_number major, sys_device_minor_number minor, void *pargp, PMU_attribute) writes the front-end configuration information and parameter settings into the Tianyi kernel through the structure PMU_attribute, completing the automatic configuration of the PMUs of different Loongson platforms. The PMU_attribute structure records all attributes of the PMU performance counters.
(2) sampling engine configuration:
the pmu_tasksample function binds PMU performance counters to the monitored tasks (threads), saves the context and execution order when tasks switch, and records sampling information such as the running time, the scheduling information, the count register values of the task or of the whole system, and the number of overflow interrupts in the structure PMU_task_msg, which is saved and passed to the PMU performance statistics module.
(3) PMU interrupt processing:
the function sys_isr pmu_isr(sys_vector_number vector, uint32_t cause_pcb) parses the PMU interrupt number defined in the configuration information, registers the interrupt, and performs PMU interrupt processing. The specific processing flow is: enable the PCI bit of the Cause register; determine, counter by counter, which control counter generated the interrupt; for the PMU performance counter that generated the interrupt, increment by 1 the recorded overflow count and sampling-period count of its count register; update the related members of the PMU_task_msg structure; reset the count; and clear the Cause.
(2) The PMU performance statistics module is independent of the PMU hardware. It parses the input parameters, such as the acquisition mode and the monitored events, that the developer set in the front-end configuration file and passes each parameter to the PMU adaptive configuration module through system calls of the Tianyi kernel. At the same time it collects and organizes the sampling information passed back by the PMU adaptive configuration module and returns it to the performance real-time analysis module at the front end for drawing the histogram, which helps the developer locate performance hot spots through a visual interface. The PMU performance statistics module provides the following main functions:
(1) starting event acquisition monitoring;
(2) pausing acquisition monitoring;
(3) resuming acquisition monitoring;
(4) exiting monitoring;
(5) reading the data of a specified control register;
(6) reading the data of a specified count register;
(7) modifying online the hardware event monitored by the PMU;
(8) modifying online the sampling frequency.
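The eight functions above imply a small set of session states. The following sketch models functions (1) to (4), (7), and (8) as a state machine; the register reads of (5) and (6) are omitted because they need hardware access. The class, method, and state names are hypothetical, and which commands are legal in which state is an assumption rather than something the patent specifies.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    PAUSED = "paused"


class PmuStatistics:
    """Hypothetical command handler; register access is replaced by plain fields."""

    def __init__(self):
        self.state = State.IDLE
        self.event = None
        self.period = None

    def _require(self, *allowed):
        if self.state not in allowed:
            raise RuntimeError(f"command not valid while {self.state.value}")

    def start(self, event, period):      # (1) start acquisition monitoring
        self._require(State.IDLE)
        self.event, self.period = event, period
        self.state = State.RUNNING

    def pause(self):                     # (2) pause
        self._require(State.RUNNING)
        self.state = State.PAUSED

    def resume(self):                    # (3) resume
        self._require(State.PAUSED)
        self.state = State.RUNNING

    def exit(self):                      # (4) exit monitoring
        self._require(State.RUNNING, State.PAUSED)
        self.state = State.IDLE

    def set_event(self, event):          # (7) change the monitored event online
        self._require(State.RUNNING, State.PAUSED)
        self.event = event

    def set_period(self, period):        # (8) change the sampling period online
        self._require(State.RUNNING, State.PAUSED)
        if period <= 0:
            raise ValueError("sampling period must be positive")
        self.period = period


stats = PmuStatistics()
stats.start("dcache_miss", 10_000)
stats.set_event("cycles")   # online modification, session keeps running
stats.pause()
stats.resume()
print(stats.state, stats.event, stats.period)
```

Rejecting commands that do not fit the current state keeps a misordered front-end request from silently reprogramming the counters.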
The system is applicable to the full range of Loongson processor platforms. Based on the Tianyi embedded operating system environment, it makes the whole acquisition and monitoring process self-adaptive for any Loongson CPU, including PMU adaptive configuration, sampling mode setting, selection of the monitored event list, and display of statistics. It is extensible and can continue to support future Loongson products. The system effectively shields upper layers from the large differences in PMU operation between hardware platforms and gives program developers a unified performance analysis interface and result display. Developers no longer need to spend effort adapting the PMU monitoring module in the system kernel, or the calling program itself, to each platform, and can instead concentrate on the performance optimization of the program itself. The system thereby provides technical guarantees and application support for monitoring the performance of complex programs on key domestic software and hardware platforms.
The foregoing is merely a preferred embodiment of the present invention, and it should be noted that modifications and variations could be made by those skilled in the art without departing from the technical principles of the present invention, and such modifications and variations should also be regarded as being within the scope of the invention.
Claims (7)
1. A domestic platform PMU self-adaptive performance acquisition monitoring system, characterized by comprising a front end located at the development environment layer of a host computer and a back end located at the kernel layer of a target machine, and used for acquisition, monitoring, and analysis of self-adaptive performance across the full series of Loongson platforms; the front end is used for providing an interactive user interface that helps the user with visual setting, performance analysis, and result presentation, and is also used for extracting PMU hardware configuration information; the back end is used for treating the platform-specific PMU hardware configuration operations as a hardware shielding layer, and for performing concrete configuration after receiving the PMU hardware configuration information from the front end in parameter form, so that the PMU core code in the Tianyi kernel need not be modified repeatedly as platforms are upgraded;
the front end consists of an extensible configuration file module, a configuration file loading module, a PMU sampling monitoring setting module, a performance real-time analysis module and an analysis result dynamic display module;
the extensible configuration file module is used for adding PMU configuration information for Loongson CPUs of unknown models or models developed in the future, thereby making the performance acquisition monitoring system extensible; the module guides the developer through an interface of drop-down menus or text input boxes to enter the necessary information, wherein the necessary information comprises the Loongson model, the coprocessor CP0 total number, the number of performance counters, the counter overflow condition, and the monitorable hardware events, and the background integrates the configuration information according to a template format that the system can recognize, for use by the PMU self-adaptive configuration module at the back end and by the platform-selection drop-down menu in the PMU sampling monitoring setting module;
the PMU sampling monitoring setting module is used for providing a visual interface through which the developer completes platform selection, selection of the events to be monitored, sampling mode selection, and sampling period parameter setting, and for supporting online modification of monitoring events and sampling periods;
the configuration file loading module is used for sending the configuration information of the extensible configuration file module and the parameter setting information of the PMU sampling monitoring setting module, through a communication interface and according to the RSP remote debugging protocol, to the Tianyi kernel at the back end on the target machine;
the performance real-time analysis module is used for receiving, through the communication interface, the performance data acquired by the PMU performance statistics module at the back end, parsing the returned data according to the RSP protocol, extracting the required data segments from the returned data packets, and, through analysis and calculation, organizing them into data that the graphical interface can accept;
the analysis result dynamic display module is used for receiving the analysis data output by the performance real-time analysis module and rendering it in real time as a histogram and a table, wherein the histogram dynamically displays the number of occurrences of all monitored events, and the table dynamically refreshes data such as the total number of occurrences of the counted events, the number of occurrences in each task or function, the number of interrupt overflows, the number of task switches, and the running time;
the PMU self-adaptive configuration module at the back end is based on the Tianyi embedded operating system kernel and is used for encapsulating the platform-dependent low-level PMU calling interfaces in the form of parameters or data structures, and, after receiving the configuration information and parameter setting information loaded by the front end, instantiating them to complete all low-level PMU self-adaptive processing;
the PMU adaptive configuration module comprises the following functions:
(1) PMU automatic configuration:
the function ty_device_driver pmu_autoconfig(sys_device_major_number major, sys_device_minor_number minor, void *pargp, pmu_attribute) writes the front-end configuration information and parameter setting information into the Tianyi kernel through the structure pmu_attribute, completing automatic PMU configuration for different Loongson platforms, wherein the pmu_attribute structure records all attributes of the PMU performance counters and is defined as follows:
typedef struct pmu_attri
{
    uint32_t event_type;              // records the hardware event types
    uint32_t event_ramp[event_type];  // records the number of events corresponding to each event type
    char pmu_event[event_type];       // records the hardware event name corresponding to each hardware event type
    uint32_t pmu_cp0[2];              // records the coprocessor CP0 total number
    uint32_t pmu_select;              // select number corresponding to each set of PMU performance counters
    uint64_t pmu_cnt;                 // records the control register field settings
    uint64_t pmu_cntl;                // records the count register field settings
    uint32_t cause_pci;               // overflow interrupt indication of a performance counter
}pmuAttri;
(2) Sampling engine configuration:
ty_device_driver pmu_TASKSAMPLE(sys_tcb_un, sys_tcb_created_task, sys_tcb_started_task, sys_tcb_switcd_task, sys_tcb_running_task, sys_tcb_reduced_task, pmu_task_msg) binds pmu_task_msg to a monitored task or thread, saves the context and execution order on task switches, and saves sampling information such as the run time, scheduling information, count register values, and number of interrupt overflows of the task or of the whole system, passing it to the PMU performance statistics module through the structure pmu_task_msg, wherein the structure pmu_task_msg is defined as follows:
typedef struct pmu_task_msg
{
    char taskname[32];            // name of the currently monitored task
    uint32_t task;                // current task id
    uint32_t event_cnt;           // number of interrupt overflows so far, i.e. the number of full sampling periods of the event
    uint64_t sample_cnt;          // number of occurrences of the event in the current period (not the total)
    uint64_t sample_period;       // sampling period of the current performance monitoring
    uint64_t sample_freq;         // sampling frequency of the current performance monitoring
    uint64_t total_cnt;           // total event occurrences so far: total_cnt = event_cnt * sample_period + sample_cnt
    uint64_t total_time_enabled;
    uint64_t total_time_running;
    uint32_t flag;                // whether the current PMU performance counter is enabled
}pmuMsg;
(3) PMU interrupt processing:
sys_isr PMU_isr(sys_vector_number vector, uint32_t case_pcb), which is responsible for parsing the PMU interrupt number defined in the configuration information, registering the interrupt, and executing PMU interrupt processing; the specific processing flow is as follows: enabling the PCI field of the Cause register; determining in turn which control counter generated the interrupt; for the PMU performance counter corresponding to that counter, recording the overflow and incrementing by 1 the number of sampling periods elapsed on its count register; updating the related members of the pmu_task_msg structure; resetting the count; and clearing Cause.
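The interrupt flow in claim 1 and the relation total_cnt = event_cnt * sample_period + sample_cnt can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions: `sketch_pmu_hw`, `sketch_task_msg`, `sketch_pmu_isr`, the four-counter limit, and the overflow bitmask are hypothetical stand-ins, not the kernel structures or the register layout of any Loongson CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_COUNTERS 4  /* hypothetical; the real number comes from the platform configuration */

/* Reduced stand-in for the per-counter statistics kept in pmu_task_msg. */
typedef struct {
    uint32_t event_cnt;      /* interrupt overflows so far */
    uint64_t sample_cnt;     /* events counted in the current period */
    uint64_t sample_period;  /* events per overflow */
    uint64_t total_cnt;      /* running total, refreshed by the ISR */
} sketch_task_msg;

/* Simulated hardware state; real code would read coprocessor registers. */
typedef struct {
    uint32_t cause_pci;                       /* stand-in for the Cause PCI indication */
    uint32_t overflow_mask;                   /* bit i set: counter i overflowed */
    uint64_t count_reg[SKETCH_NUM_COUNTERS];  /* stand-ins for count registers */
} sketch_pmu_hw;

/* total_cnt = event_cnt * sample_period + sample_cnt */
uint64_t sketch_total_cnt(const sketch_task_msg *m)
{
    return (uint64_t)m->event_cnt * m->sample_period + m->sample_cnt;
}

/* Services every overflowed counter; returns how many were serviced. */
int sketch_pmu_isr(sketch_pmu_hw *hw, sketch_task_msg msg[SKETCH_NUM_COUNTERS])
{
    int serviced = 0;
    assert(hw != NULL && msg != NULL);
    if (!hw->cause_pci)
        return 0;                              /* not a PMU interrupt */
    for (int i = 0; i < SKETCH_NUM_COUNTERS; i++) {
        if (!(hw->overflow_mask & (1u << i)))
            continue;                          /* this counter did not overflow */
        msg[i].event_cnt += 1;                 /* one more full sampling period */
        msg[i].sample_cnt = 0;                 /* a new period starts */
        msg[i].total_cnt = sketch_total_cnt(&msg[i]);
        hw->count_reg[i] = 0;                  /* reset the count */
        serviced++;
    }
    hw->overflow_mask = 0;
    hw->cause_pci = 0;                         /* clear the interrupt indication */
    return serviced;
}
```

For example, with a sampling period of 1000, one overflow followed by 250 further events gives a total of 1 * 1000 + 250 = 1250 occurrences.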
2. The system of claim 1, wherein the back-end PMU performance statistics module is independent of PMU hardware and is configured to parse the input parameters, namely the acquisition mode and the monitoring events, set by the developer in the front-end configuration file, to transmit each parameter to the PMU self-adaptive configuration module through system calls of the Tianyi kernel, and at the same time to collate the sampling information transmitted by the PMU self-adaptive configuration module and return it to the front-end performance real-time analysis module so as to draw a histogram.
3. The system of claim 2, wherein the PMU performance statistics module includes the following functions: (1) starting event acquisition monitoring; (2) suspending acquisition monitoring; (3) resuming acquisition monitoring; (4) exiting monitoring; (5) acquiring specified control register data; (6) acquiring specified count register data; (7) modifying the hardware events monitored by the PMU online; (8) modifying the sampling frequency online.
4. The system of claim 1, wherein an application program runs on the target machine, information about the running program is obtained from the back end, and communication between the front end and the back end uses the RSP communication protocol.
5. The system of claim 1, wherein the sampling mode comprises event-based sampling and time-based sampling.
6. The system of claim 1, wherein the analysis result dynamic display module is capable of starting and stopping the receipt of data at any time, and the chart-drawing interface starts and stops accordingly.
7. A domestic platform PMU self-adaptive performance acquisition monitoring method implemented by using the system of any one of claims 1 to 6.
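The claims state that the front end and back end communicate over an RSP protocol and that the front end extracts data segments from returned packets. Assuming this refers to GDB's Remote Serial Protocol, whose basic packet has the form `$<payload>#<two hex digits>` with a checksum equal to the sum of the payload bytes modulo 256, framing and validation can be sketched as follows. The function names are hypothetical, escaping and run-length encoding are omitted, and nothing here is taken from the patent's implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sum of the payload bytes modulo 256. */
static unsigned rsp_checksum(const char *payload, size_t len)
{
    unsigned sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += (unsigned char)payload[i];
    return sum & 0xffu;
}

/* Writes "$payload#xx" into out; returns the packet length or -1 if out is too small. */
int rsp_frame(const char *payload, char *out, size_t out_size)
{
    size_t len;
    assert(payload != NULL && out != NULL);
    len = strlen(payload);
    if (out_size < len + 5)  /* '$' + payload + '#' + 2 hex digits + NUL */
        return -1;
    return snprintf(out, out_size, "$%s#%02x", payload, rsp_checksum(payload, len));
}

/* Copies the payload out of a packet; returns its length, or -1 if the
 * packet is malformed, the checksum does not match, or the buffer is too small. */
int rsp_unframe(const char *pkt, char *payload, size_t payload_size)
{
    const char *hash;
    size_t len;
    assert(pkt != NULL && payload != NULL);
    if (pkt[0] != '$')
        return -1;
    hash = strchr(pkt, '#');
    if (hash == NULL || strlen(hash) != 3)
        return -1;  /* need exactly two characters after '#' */
    if (!isxdigit((unsigned char)hash[1]) || !isxdigit((unsigned char)hash[2]))
        return -1;
    len = (size_t)(hash - pkt) - 1;
    if (len + 1 > payload_size)
        return -1;
    if (strtoul(hash + 1, NULL, 16) != rsp_checksum(pkt + 1, len))
        return -1;
    memcpy(payload, pkt + 1, len);
    payload[len] = '\0';
    return (int)len;
}
```

A receiver built this way would reject a corrupted packet before the performance real-time analysis module attempts to interpret its fields.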
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010920679.1A CN112069029B (en) | 2020-09-04 | 2020-09-04 | Domestic platform PMU self-adaptive performance acquisition monitoring system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112069029A CN112069029A (en) | 2020-12-11 |
CN112069029B CN112069029B (en) | 2023-11-14 |
Family
ID=73665607
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010920679.1A Active CN112069029B (en) | 2020-09-04 | 2020-09-04 | Domestic platform PMU self-adaptive performance acquisition monitoring system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112069029B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||