CN111858243A - Multi-hardware event monitoring count value estimation method based on exponential increase - Google Patents

Info

Publication number
CN111858243A
Authority
CN
China
Prior art keywords
event
monitoring
hardware
hardware event
count value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010678027.1A
Other languages
Chinese (zh)
Other versions
CN111858243B (en)
Inventor
王一超
王杰
文敏华
韦建文
林新华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-07-15
Filing date
2020-07-15
Publication date
2020-10-30
Application filed by Shanghai Jiaotong University
Priority to CN202010678027.1A
Publication of CN111858243A
Application granted
Publication of CN111858243B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file

Abstract

A multi-hardware event monitoring count value estimation method based on exponential growth, in which a main thread maintains the life cycle of the workflow, creates and initializes the related data structures, sends control signals to a slave thread, and runs the monitored application. The slave thread responds to the life-cycle signals of the main thread and performs hardware event scheduling, timed alternate monitoring, and post-processing estimation; the multi-hardware event monitoring count values are obtained by reading the hardware event counting registers built into the CPU. The invention fills in the hardware event count values on unmonitored time slices with an exponential growth estimation algorithm, which can improve the accuracy of multi-hardware event monitoring and counting libraries based on multiplexing (MPX) and makes monitoring results obtained under MPX more usable.

Description

Multi-hardware event monitoring count value estimation method based on exponential increase
Technical Field
The invention relates to a technique in the field of semiconductor performance optimization, in particular to a method for estimating the monitoring count values of multiple hardware events based on exponential growth.
Background
Currently, there are two modes of collecting data from hardware event registers. The first is OCOE (one counter, one event), in which each register records a single hardware event for the entire program run; this mode records every occurrence of each monitored hardware event. The second is multiplexing (MPX), which uses time-division multiplexing to divide the available time of each register into time slices and monitors different hardware events on different time slices in turn; a register can therefore only provide data for a hardware event during the time slices in which that event occupied the register.
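To make the missing-data problem of MPX concrete, the following is a minimal Python sketch of the linear scaling that multiplexing implementations commonly apply to a count observed during only part of a run (Linux perf, for example, scales a raw count by the ratio of enabled time to running time). The patent text does not spell out the original MPX formula, so the function name `scale_multiplexed_count` and its parameters are assumptions made for illustration only.

```python
def scale_multiplexed_count(raw_count, time_enabled, time_running):
    """Linearly extrapolate a count that was observed for only part of the run.

    raw_count    -- events counted while the event occupied a register
    time_enabled -- total time the event was scheduled for monitoring
    time_running -- time the event actually occupied a register
    (all three names are hypothetical, chosen for this sketch)
    """
    if time_running <= 0:
        # The event never got a time slice, so nothing can be extrapolated.
        return 0.0
    return raw_count * time_enabled / time_running


# An event that held a register for 1 of every 4 time slices:
estimate = scale_multiplexed_count(raw_count=1000, time_enabled=4.0, time_running=1.0)
print(estimate)
```

Linear scaling of this kind assumes the event rate on the unmonitored slices equals the rate on the monitored ones, which is the assumption the method described below replaces.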
Disclosure of Invention
In the prior art, MPX leaves values missing for the unmonitored time slices and has insufficient precision: it cannot fully cover the behavior patterns of all hardware events, which reduces the accuracy of MPX measurement results. To address these problems, the invention provides a multi-hardware event monitoring count value estimation method based on exponential growth. On the unmonitored time slices between two consecutive sampled values, the event count is assumed to evolve gradually from the initial value to the final value by growing by a constant multiple per time slice. The hardware event count values on the unmonitored time slices are filled in by an exponential growth estimation algorithm, which can improve the accuracy of multi-hardware event monitoring and counting libraries based on MPX and makes monitoring results obtained under MPX more usable.
The invention is realized by the following technical scheme:
The invention relates to a multi-hardware event monitoring count value estimation method based on exponential growth, in which a main thread maintains the life cycle of the workflow, creates and initializes the related data structures and a slave thread, sends control signals to the slave thread, and runs the monitored application; the slave thread responds to the life-cycle signals of the main thread, performs hardware event scheduling, timed alternate monitoring, and post-processing estimation, and obtains the multi-hardware event monitoring count values by reading the hardware event counting registers built into the CPU.
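The division of work between the main thread and the slave thread can be illustrated with a small Python sketch. It is not the PAPI-based implementation of the embodiment: the names `run_monitoring`, `workload`, `read_counter` and `slice_seconds` are hypothetical, and a plain Python callable stands in for reading a CPU hardware event counting register.

```python
import threading


def run_monitoring(workload, read_counter, slice_seconds=0.01):
    """Main thread: owns the life cycle. Slave thread: samples until told to stop."""
    start = threading.Event()   # monitoring start signal sent by the main thread
    stop = threading.Event()    # monitoring stop signal sent by the main thread
    samples = []                # time series collected by the slave thread

    def slave():
        start.wait()                      # respond to the start signal
        while not stop.is_set():
            samples.append(read_counter())
            stop.wait(slice_seconds)      # sleep one time slice, wake early on stop
        samples.append(read_counter())    # final reading after the stop signal
        # Post-processing estimation would run here, before the slave exits.

    worker = threading.Thread(target=slave)
    worker.start()
    start.set()                 # main thread: start monitoring
    result = workload()         # main thread: run the monitored application
    stop.set()                  # main thread: stop monitoring
    worker.join()               # wait for the slave thread to exit
    return result, samples


if __name__ == "__main__":
    readings = []

    def fake_counter():
        # Stand-in for a hardware counter: returns 1, 2, 3, ...
        readings.append(1)
        return len(readings)

    print(run_monitoring(lambda: sum(range(100000)), fake_counter))
```

Using `stop.wait(slice_seconds)` instead of a plain sleep lets the slave thread react to the stop signal without waiting for the current time slice to finish.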
Technical effects
The invention as a whole remedies the insufficient precision of existing MPX, which arises because MPX monitors hardware events only intermittently and therefore lacks data for part of the run. Compared with the prior art, the method can significantly improve the precision of hardware event monitoring software in MPX mode, thereby overcoming the reliability problems that the insufficient precision of MPX mode causes in actual industrial production.
Drawings
FIG. 1 is a schematic flow diagram of the present invention;
FIG. 2 is a schematic diagram of a hardware event scheduling process;
FIG. 3 is a diagram of the original MPX estimation strategy;
FIG. 4 is a schematic diagram of the MPX estimation strategy in the present method.
Detailed Description
As shown in FIG. 1, the present embodiment relates to a method for estimating CPU multi-hardware event monitoring count values based on exponential growth, which includes the following steps:
Step 1: The main thread of the PAPI hardware performance collection framework initializes the current state and creates a slave thread for monitoring hardware events.
The state initialization comprises: initializing the PAPI internal global variables, obtaining current operating system information, creating a file descriptor for recording hardware event count values, enabling MPX mode, and setting the monitoring time slice length.
Step 2: After completing the global initialization, the main thread creates an event set (eventset) for storing the events to be monitored, binds the event set to the slave thread, and starts the life cycle of the workflow.
Step 3: The main thread adds the hardware events to be monitored to the event set created in step 2.
Step 4: The main thread sends a monitoring start signal to the slave thread. The slave thread begins to periodically read the monitoring result from the hardware event counting registers built into the CPU, at the time slice length set in step 1, and writes the monitoring result to the file descriptor corresponding to the current hardware event.
The monitoring result comprises sequence data of: the monitored hardware event count value, the program running time, and the monitored duration of the current hardware event.
Step 5: The monitored program is started in the main thread. The slave thread monitors the hardware events added in step 3 in turn, according to the monitoring time slice set during initialization, collects the monitoring results as in step 4, and writes them to the file descriptor to obtain the hardware event time series data.
As shown in FIG. 2, the timed alternate monitoring comprises the following steps:
Step 5.1: The scheduling system creates as many queues as there are event registers, with each queue corresponding one-to-one to an event register.
Step 5.2: Each hardware event is stored into the queues of all event registers capable of monitoring that event.
Step 5.3: All queues are randomly ordered.
Step 5.4: The events currently at the heads of all queues are checked. When a head-of-queue event is duplicated, that event is moved to the back of its queue and the events behind it each move forward by one position.
Step 5.5: Step 5.4 is repeated until no head-of-queue event is duplicated. All non-head events that duplicate a current head-of-queue event are moved to the tail of their queue, and the events behind each such duplicate move forward by one position.
Step 5.6: Each current head-of-queue event is placed on its corresponding event register for monitoring.
Step 5.7: After the current time slice expires, the event is taken off the event register and placed at the tail of the corresponding queue.
Step 5.8: Steps 5.4 to 5.7 are repeated until the end signal from the main thread is received.
Step 6: After the monitored program ends, the main thread sends a stop signal to the slave thread, the slave thread stops monitoring, and the process proceeds to step 7.
Step 7: The slave thread performs post-processing estimation on the hardware event time series data collected in step 5 and sends the result to the main thread, which outputs the result.
The hardware event time sequence data comprises: the actual hardware event accumulated count value c, the accumulated running time r of the monitored program, and the accumulated monitored time e of the monitored hardware event.
The post-processing estimation comprises the following specific steps:
Step 7.1) The slave thread takes the first-order difference of all hardware event time series data to obtain the differenced values, namely: the actual single-time-slice count value C, the single-time-slice running time R of the monitored program, and the single-time-slice monitored duration E of the monitored hardware event.
Step 7.2) The slave thread reads the time series data of one hardware event. Let $C_i$ be the i-th differential count value of the time series, $C_{i+1}$ the (i+1)-th differential count value, $R_i$ the i-th differential running time, and $E_i$ the i-th differential monitored duration. The following quantities are calculated in turn:
the ratio of the (i+1)-th count value to the i-th count value, that is, $C_{i+1}/C_i$ (formula image BDA0002584740760000031);
the growth multiple of a single time slice (formula image BDA0002584740760000032);
the number n of unmonitored time slices (formula image BDA0002584740760000033).
Step 7.3) Repeated estimation: the j-th unmonitored count value between the i-th count value and the (i+1)-th count value (formula image BDA0002584740760000034) is estimated, until the count values on all n unmonitored time slices have been estimated. All monitored values and estimated values are then accumulated to obtain the total count value of the current hardware event.
Step 7.4) Steps 7.2 to 7.3 are repeated until the total count values of all monitored hardware events are obtained, which completes the post-processing of the hardware event monitoring counts.
Step 8: The slave thread stops and exits; the main thread frees the memory of the data structures used during the run, then stops and exits.
Step 9: The hardware event monitoring and counting operation ends.
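The timed alternate monitoring of steps 5.1 to 5.8 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the scheduler of the embodiment: the names `build_queues`, `pick_heads` and `schedule` are hypothetical, "randomly ordered" is interpreted as shuffling the events inside each queue, the duplicate check is bounded so that a register stays idle for a slice when every event in its queue is already being monitored elsewhere, and the reordering of non-head duplicates in step 5.5 is omitted.

```python
import random
from collections import deque


def build_queues(capabilities, rng):
    """Steps 5.1-5.3. capabilities maps event name -> list of usable register indices."""
    n_regs = 1 + max(r for regs in capabilities.values() for r in regs)
    queues = [[] for _ in range(n_regs)]       # one queue per event register
    for event, regs in capabilities.items():
        for r in regs:                         # event goes into every capable queue
            queues[r].append(event)
    for q in queues:
        rng.shuffle(q)                         # assumed meaning of "randomly ordered"
    return [deque(q) for q in queues]


def pick_heads(queues):
    """Steps 5.4-5.5: rotate queues until no two registers share a head event."""
    chosen = []
    for q in queues:
        head = None
        for _ in range(len(q)):                # bounded, so the loop always ends
            if q[0] not in chosen:
                head = q[0]
                break
            q.rotate(-1)                       # duplicated head goes to the back
        chosen.append(head)                    # None means the register stays idle
    return chosen


def schedule(queues, n_slices):
    """Steps 5.6-5.8 for a fixed number of slices instead of an end signal."""
    plan = []
    for _ in range(n_slices):
        heads = pick_heads(queues)
        plan.append(heads)                     # heads are "placed on the registers"
        for q, head in zip(queues, heads):
            if head is not None:
                q.rotate(-1)                   # after the slice, head goes to the tail
    return plan


if __name__ == "__main__":
    caps = {"A": [0, 1], "B": [0, 1], "C": [1]}   # hypothetical events and registers
    for heads in schedule(build_queues(caps, random.Random(0)), 4):
        print(heads)
```

Each row printed by the demo lists the event assigned to each register for one time slice; no event appears twice in a row of the plan, which is the property steps 5.4 and 5.5 are meant to guarantee.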
On the basis of this embodiment, the correctness of the method was verified with the Rodinia Benchmark Suite; compared with the original MPX, the method improves accuracy to varying degrees across the different benchmark kernels.
This embodiment was developed as a secondary wrapper around PAPI; the method is equally applicable to other libraries and software that use MPX to monitor and count multiple hardware events, such as Linux Perf, HPCToolkit, Intel VTune, and Gooda. The operating platform of this embodiment is an ordinary rack-mounted Intel x86 server running the 64-bit CentOS 7.6 operating system, equipped with two Intel Xeon Gold 6248 processors and 192 GB of main memory. First, a library based on PAPI is created that intercepts the intermediate output file descriptor of PAPI, and the recorded data are estimated and processed by the method. With 100 ms as the monitoring period, all, cond, brmis, all, brmis, cond, dtlm, m, l1lh, l1lm, l2lh, l2lm, ldam, ich, icm, uishall, urshall and inst16 hardware events in five types of Rodinia Benchmark Suite applications were monitored, and accuracy improvements of 5% to 59% were obtained.
Compared with the original MPX post-processing strategy shown in FIG. 3, the present embodiment takes into account the monitored values on both sides of each interpolation point. This avoids the extreme estimates that arise when two consecutive monitored values differ greatly, and the exponential growth multiplier covers a wider range of variation patterns, thereby improving the accuracy of MPX estimation.
Compared with the prior art, the method models the trend of change as exponential growth and thereby obtains estimates closer to the real data distribution, which yields the precision improvement.
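The post-processing estimation of steps 7.1 to 7.3 can be sketched in Python as follows. The formulas for the growth multiple, the number of unmonitored time slices and the j-th estimate appear only as images in this text, so the sketch fills them in with assumptions that match the verbal description: n = R_i / E_i - 1 unmonitored slices, a per-slice growth multiple k = (C_{i+1} / C_i)^(1/(n+1)), and a j-th estimate of C_i * k^j. The function names and the linear fallback for zero counts are likewise assumptions.

```python
def first_difference(series):
    """Step 7.1: turn a cumulative series into per-sample differences."""
    return [b - a for a, b in zip(series, series[1:])]


def estimate_total(c, r, e):
    """Estimate the total count of one event from its cumulative time series.

    c -- cumulative count values
    r -- cumulative running time of the monitored program
    e -- cumulative monitored time of this event
    """
    C, R, E = first_difference(c), first_difference(r), first_difference(e)
    total = float(sum(C))                       # all monitored values
    for i in range(len(C) - 1):
        if E[i] <= 0:
            continue
        # Assumed formula: slices between sample i and i+1 that were not monitored.
        n = max(round(R[i] / E[i]) - 1, 0)
        if n == 0:
            continue
        if C[i] > 0 and C[i + 1] > 0:
            ratio = C[i + 1] / C[i]             # ratio of consecutive count values
            k = ratio ** (1.0 / (n + 1))        # assumed constant growth per slice
            total += sum(C[i] * k ** j for j in range(1, n + 1))
        else:
            # Assumption: fall back to linear interpolation when a count is zero,
            # because a growth multiple cannot be computed from a zero value.
            step = (C[i + 1] - C[i]) / (n + 1)
            total += sum(C[i] + step * j for j in range(1, n + 1))
    return total


if __name__ == "__main__":
    # One event monitored on 1 of every 3 slices; per-sample counts are 100 and 800.
    print(estimate_total(c=[0, 100, 900], r=[0, 3, 6], e=[0, 1, 2]))
```

In the demo the growth multiple is about 2, so the two unmonitored slices are filled with roughly 200 and 400 and the estimated total is about 1500. The slices after the last sample are not estimated in this sketch.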
The foregoing embodiments may be modified in many different ways by those skilled in the art without departing from the spirit and scope of the invention, which is defined by the appended claims and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (7)

1. A multi-hardware event monitoring count value estimation method based on exponential growth is characterized in that a life cycle of a working process is maintained through a main thread, a related data structure and a slave thread are created and initialized, a slave thread control signal is sent and a monitored application is operated, the slave thread responds to the life cycle signal of the main thread, hardware event scheduling, timing alternate monitoring and post-processing estimation are carried out, and a multi-hardware event monitoring count value is obtained by reading a hardware event counting register built in a CPU.
2. The method for estimating the multi-hardware event monitoring count value based on exponential growth according to claim 1, wherein the post-processing estimation specifically comprises:
step 1) the slave thread takes the first-order difference of all hardware event time series data to obtain the differenced values, namely: the actual single-time-slice count value C, the single-time-slice running time R of the monitored program, and the single-time-slice monitored duration E of the monitored hardware event;
step 2) the slave thread reads the time series data of one hardware event, where $C_i$ is the i-th differential count value of the time series, $C_{i+1}$ the (i+1)-th differential count value, $R_i$ the i-th differential running time, and $E_i$ the i-th differential monitored duration, and calculates in turn:
the ratio of the (i+1)-th count value to the i-th count value, that is, $C_{i+1}/C_i$ (formula image FDA0002584740750000011);
the growth multiple of a single time slice (formula image FDA0002584740750000012);
the number n of unmonitored time slices (formula image FDA0002584740750000013);
step 3) repeated estimation: the j-th unmonitored count value between the i-th count value and the (i+1)-th count value (formula image FDA0002584740750000014) is estimated, until the count values on all n unmonitored time slices have been estimated; all monitored values and estimated values are accumulated to obtain the total count value of the current hardware event;
step 4) steps 2 to 3 are repeated until the total count values of all monitored hardware events are obtained, which completes the post-processing of the hardware event monitoring counts.
3. The method for estimating the multi-hardware event monitoring count value based on exponential growth according to claim 1, wherein the timing alternate monitoring specifically comprises:
step 1: the scheduling system creates as many queues as there are event registers, with each queue corresponding one-to-one to an event register;
step 2: each hardware event is stored into the queues of all event registers capable of monitoring that event;
step 3: all queues are randomly ordered;
step 4: the events currently at the heads of all queues are checked, and when a head-of-queue event is duplicated, that event is moved to the back of its queue and the events behind it each move forward by one position;
step 5: step 4 is repeated until no head-of-queue event is duplicated; all non-head events that duplicate a current head-of-queue event are moved to the tail of their queue, and the events behind each such duplicate move forward by one position;
step 6: each current head-of-queue event is placed on its corresponding event register for monitoring;
step 7: after the current time slice expires, the event is taken off the event register and placed at the tail of the corresponding queue;
step 8: steps 4 to 7 are repeated until the end signal from the main thread is received.
4. The method for estimating the multi-hardware event monitoring count value based on exponential growth according to any one of claims 1 to 3, characterized by comprising:
step 1, a main thread of a PAPI hardware performance acquisition framework initializes the current state and creates a slave thread for monitoring hardware events;
step 2, after the global initialization is completed by the main thread, an event set used for storing the event to be monitored is created, the event set is bound to the slave thread, and the life cycle of the workflow is started;
step 3, the main thread adds the hardware event to be monitored to the event set in the step 2;
step 4, the main thread sends a monitoring start signal to the slave thread, the slave thread starts to read the monitoring result from the hardware event counting register arranged in the CPU regularly according to the time slice length set in the step 1, and the monitoring result is written into the file descriptor corresponding to the current hardware event;
step 5, the monitored program is started in the main thread, the slave thread monitors the hardware events to be monitored added in the step 3 in turn according to the monitoring time slice set by initialization, and the monitoring result is collected in the step 4 and written into the file descriptor to obtain the hardware event time sequence data;
step 6, after the monitored program is finished, the main thread sends a stop signal to the slave thread, the slave thread stops monitoring, and the process goes to step 7;
step 7, the slave thread performs post-processing estimation on the hardware event time series data collected in step 5 and sends the result to the main thread, and the main thread outputs the result.
5. The method of claim 4, wherein the hardware event time series data comprises: the actual hardware event accumulated count value c, the accumulated running time r of the monitored program, and the accumulated monitored time e of the monitored hardware event.
6. The method of claim 5, wherein the state initialization comprises: initializing the PAPI internal global variables, obtaining current operating system information, creating a file descriptor for recording hardware event count values, enabling MPX mode, and setting the monitoring time slice length.
7. The method as claimed in claim 5, wherein the monitoring result is: the monitored hardware event count value, the program running time length and the current hardware event monitoring time length sequence data.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010678027.1A CN111858243B (en) 2020-07-15 2020-07-15 Multi-hardware event monitoring count value estimation method based on exponential growth

Publications (2)

Publication Number Publication Date
CN111858243A true CN111858243A (en) 2020-10-30
CN111858243B CN111858243B (en) 2024-03-19

Family

ID=72983372

Country Status (1)

Country Link
CN (1) CN111858243B (en)

Also Published As

Publication number Publication date
CN111858243B (en) 2024-03-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant