CN111858243B - Multi-hardware event monitoring count value estimation method based on exponential growth


Info

Publication number
CN111858243B
Authority
CN
China
Legal status
Active
Application number
CN202010678027.1A
Other languages
Chinese (zh)
Other versions
CN111858243A (en)
Inventor
王一超
王杰
文敏华
韦建文
林新华
Current Assignee
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Application filed by Shanghai Jiaotong University
Priority to CN202010678027.1A
Publication of CN111858243A
Application granted
Publication of CN111858243B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3024 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/3012 Organisation of register space, e.g. banked or distributed register file

Abstract

A multi-hardware event monitoring count value estimation method based on exponential growth. A master thread maintains the life cycle of the workflow: it creates and initializes the related data structures and a slave thread, sends control signals to the slave thread, and runs the monitored application. The slave thread responds to the master thread's life-cycle signals by performing hardware event scheduling, timed alternate monitoring and post-processing estimation, and obtains the multi-hardware event monitoring count values by reading the hardware event count registers built into the CPU. The invention fills in the hardware event count values on unmonitored time slices with an exponential growth estimation algorithm, which can improve the accuracy of multi-hardware event monitoring and counting libraries based on the MPX technique and makes monitoring results obtained under MPX more usable.

Description

Multi-hardware event monitoring count value estimation method based on exponential growth
Technical Field
The invention relates to a technique in the field of semiconductor performance optimization, and in particular to a method for estimating multi-hardware event monitoring count values based on exponential growth.
Background
Hardware event registers support two modes of data collection. The first is OCOE (one counter, one event): each register records a single hardware event for the entire program run, so every occurrence of that event is recorded. The second is multiplexing (MPX): the available time of each register is divided into time slices by time-division multiplexing, and different hardware events are monitored alternately on different time slices. A register can therefore provide data for a hardware event only for the time slices in which that event occupied the register.
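To make the limitation concrete: a counter that was scheduled for only part of the run reports a partial count, so multiplexing tools have to extrapolate. The following is a minimal sketch of the simple linear scaling that Linux perf applies (raw count × time enabled / time running). The function and parameter names are illustrative assumptions, and this is not the estimation method of the invention.

```python
def scale_multiplexed_count(raw_count: int, time_enabled: float, time_running: float) -> float:
    """Linearly extrapolate a multiplexed counter reading to the whole enabled window.

    raw_count    -- events counted while the event actually occupied a register
    time_enabled -- total time the event was requested (same unit as time_running)
    time_running -- time the event was actually scheduled on a register
    """
    if time_running <= 0:
        raise ValueError("event was never scheduled on a register")
    return raw_count * time_enabled / time_running


# An event that occupied a register for a quarter of the run.
estimate = scale_multiplexed_count(raw_count=1200, time_enabled=1.0, time_running=0.25)
print(estimate)
```

Linear scaling implicitly assumes the event rate is the same on monitored and unmonitored time slices, which is the kind of assumption an interpolation-based estimate tries to relax.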
Disclosure of Invention
In the prior art, MPX does not estimate the missing values on unmonitored time slices precisely enough and cannot cover all hardware event behavior patterns, which reduces the precision of MPX measurement results. To address these problems, the invention provides a multi-hardware event monitoring count value estimation method based on exponential growth. It assumes that, on the unmonitored time slices between two consecutive sampled values, an event evolves gradually from the start value to the end value by growing by a constant multiple. Filling in the hardware event count values on unmonitored time slices with this exponential growth estimation algorithm can improve the accuracy of MPX-based multi-hardware event monitoring and counting libraries and makes monitoring results obtained under MPX more usable.
The invention is implemented through the following technical solution:
The invention relates to a multi-hardware event monitoring count value estimation method based on exponential growth. A master thread maintains the life cycle of the workflow: it creates and initializes the related data structures, starts a slave thread, sends control signals to the slave thread, and runs the monitored application. The slave thread responds to the master thread's life-cycle signals by performing hardware event scheduling, timed alternate monitoring and post-processing estimation, and obtains the multi-hardware event monitoring count values by reading the hardware event count registers built into the CPU.
Technical effects
Taken as a whole, the invention addresses the insufficient precision of the existing MPX mechanism, which arises because intermittent monitoring of hardware events leaves data missing. Compared with the prior art, the method can significantly improve the precision of hardware event monitoring software in MPX mode, and thus mitigates the reliability problems that insufficient MPX precision causes in real industrial production.
Drawings
FIG. 1 is a schematic flow chart of the present invention;
FIG. 2 is a schematic diagram of a hardware event scheduling flow;
FIG. 3 is a schematic diagram of an original MPX estimation strategy;
FIG. 4 is a schematic diagram of the MPX estimation strategy of the present method.
Detailed Description
As shown in FIG. 1, this embodiment relates to a method for estimating the multi-hardware event monitoring count values of a central processing unit based on exponential growth, which includes the following steps:
Step 1: The master thread of the PAPI hardware performance acquisition framework initializes the current state and creates a slave thread for monitoring hardware events.
The state initialization includes: initializing PAPI internal global variables, obtaining current operating system information, creating file descriptors for recording hardware event count values, enabling MPX mode, and setting the monitoring time slice length.
Step 2: After global initialization is complete, the master thread creates an event set (eventset) for storing the events to be monitored, binds the event set to the slave thread, and starts the life cycle of the workflow.
Step 3: The master thread adds the hardware events to be monitored to the event set created in Step 2.
Step 4: The master thread sends a monitoring start signal to the slave thread. Once started, the slave thread periodically reads the monitoring result from the hardware event count registers built into the CPU, at the time slice length set in Step 1, and writes it to the file descriptor corresponding to the current hardware event.
The monitoring result consists of sequence data for the monitored hardware event count value, the program run time, and the monitored duration of the current hardware event.
Step 5: The monitored program is started on the master thread. The slave thread monitors the hardware events added in Step 3 in turn, according to the monitoring time slice set at initialization, and collects the monitoring results as in Step 4 and writes them to the file descriptors, yielding the hardware event time-series data.
As shown in FIG. 2, the timed alternate monitoring comprises the following steps:
Step 5.1: The scheduling system creates as many queues as there are event registers, with each queue corresponding one-to-one to an event register.
Step 5.2: Each hardware event is stored in the queues of all event registers that are able to monitor it.
Step 5.3: All queues are randomly ordered.
Step 5.4: The current head events of all queues are checked. When a head event is repeated, the repeated head event in the later-ranked queue is moved to the tail of its queue, and the events behind it each move forward by one position.
Step 5.5: Step 5.4 is repeated until no head event is repeated. All non-head events that duplicate a current head event are then moved to the tail of their queue, and the events behind them each move forward by one position.
Step 5.6: Each current head event is placed on its corresponding event register for monitoring.
Step 5.7: When the current time slice expires, the event on each event register is taken off and placed at the tail of the corresponding queue.
Step 5.8: Steps 5.4 to 5.7 are repeated until an end signal is received from the master thread.
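The scheduling steps 5.1 to 5.7 can be sketched as follows. This is an illustrative reading of the text, not the patented implementation: the function names, the `capabilities` mapping and the event names are hypothetical, "randomly ordering all queues" is interpreted as shuffling the events inside each queue, and a register is left idle for a slice (`None`) when every event in its queue is already being monitored elsewhere, a case the text does not address.

```python
import random
from collections import deque
from typing import Deque, Dict, List, Optional, Sequence


def build_queues(capabilities: Dict[str, Sequence[int]], n_registers: int,
                 rng: random.Random) -> List[Deque[str]]:
    """Steps 5.1-5.3: one queue per register, filled with the events it can count."""
    pools: List[List[str]] = [[] for _ in range(n_registers)]
    for event, registers in capabilities.items():
        for reg in registers:
            pools[reg].append(event)
    for pool in pools:
        rng.shuffle(pool)  # assumed meaning of "randomly ordered"
    return [deque(pool) for pool in pools]


def schedule_slice(queues: List[Deque[str]]) -> List[Optional[str]]:
    """Steps 5.4-5.6: choose one distinct head event per register for this slice."""
    heads: List[Optional[str]] = []
    for queue in queues:  # earlier-ranked queues keep their head
        pick = None
        for _ in range(len(queue)):
            if queue[0] not in heads:
                pick = queue[0]
                break
            queue.rotate(-1)  # duplicate head goes to the tail, others advance
        heads.append(pick)
    active = {h for h in heads if h is not None}
    for queue, head in zip(queues, heads):
        if head is None:
            continue
        rest = list(queue)[1:]
        keep = [ev for ev in rest if ev not in active]
        dup = [ev for ev in rest if ev in active]  # monitored on another register
        queue.clear()
        queue.extend([head] + keep + dup)
    return heads


def end_slice(queues: List[Deque[str]], heads: List[Optional[str]]) -> None:
    """Step 5.7: the monitored event returns to the tail of its queue."""
    for queue, head in zip(queues, heads):
        if head is not None:
            queue.rotate(-1)


# Hypothetical setup: events A and B fit either register, C only register 1, D only register 0.
capabilities = {"A": [0, 1], "B": [0, 1], "C": [1], "D": [0]}
queues = build_queues(capabilities, n_registers=2, rng=random.Random(0))
for _ in range(4):  # Step 5.8 would loop until the master thread's end signal
    heads = schedule_slice(queues)
    print(heads)
    end_slice(queues, heads)
```

Resolving conflicts in queue rank order guarantees termination, whereas literally repeating Step 5.4 could loop forever if two queues held only the same event.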
Step 6: After the monitored program finishes, the master thread sends a stop signal to the slave thread, the slave thread stops monitoring, and the method proceeds to Step 7.
Step 7: The slave thread performs post-processing estimation on the hardware event time-series data collected in Step 5 and sends the result to the master thread, which outputs it.
The hardware event time sequence data comprises: an actual hardware event cumulative count value c, a cumulative run time r of the monitored program, and a cumulative monitored time e of the monitored hardware event.
The specific steps of the post-processing estimation are as follows:
Step 7.1) The slave thread takes the first-order difference of all the hardware event time-series data to obtain the differenced values, namely: the actual single-time-slice count value C, the single-time-slice run duration R of the monitored program, and the single-time-slice monitored duration E of the monitored hardware event.
Step 7.2) The slave thread reads the time-series data of one hardware event and defines the i-th differential count value C_i, the (i+1)-th differential count value C_{i+1}, the i-th differential run duration R_i, and the i-th differential monitored duration E_i. From these it computes, in order, the ratio of the (i+1)-th count value to the i-th count value, C_{i+1}/C_i, the growth multiple per time slice, and the number n of unmonitored time slices.
Step 7.3) The estimation is repeated: the j-th unmonitored count value between the i-th count value and the (i+1)-th count value is estimated, until the count values on all n unmonitored time slices have been estimated. All monitored values and estimated values are then accumulated to obtain the total count value of the current hardware event.
Step 7.4) Steps 7.2 to 7.3 are repeated until the total count values of all monitored hardware events have been obtained, which completes the post-processing of the hardware event monitoring counts.
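The post-processing steps can be sketched in code under stated assumptions. The exact expressions for the per-slice growth multiple, the number of unmonitored time slices and the j-th estimate are not legible in this text, so the sketch assumes equal-length time slices with n = round(R_i / E_i) - 1, a per-slice multiple of (C_{i+1}/C_i)^(1/(n+1)), and a j-th estimate of C_i times that multiple to the power j. The function names and the linear fallback for zero counts are likewise assumptions of this sketch.

```python
from typing import List, Sequence


def first_difference(values: Sequence[float]) -> List[float]:
    """Step 7.1: turn cumulative readings into per-sample increments."""
    return [b - a for a, b in zip([0.0, *values], values)]


def fill_gap(c_start: float, c_end: float, n: int) -> List[float]:
    """Step 7.3: estimate n unmonitored slices between two monitored counts."""
    if n <= 0:
        return []
    if c_start <= 0 or c_end <= 0:
        # The ratio is undefined here; linear interpolation is a sketch-only fallback.
        step = (c_end - c_start) / (n + 1)
        return [c_start + step * j for j in range(1, n + 1)]
    growth = (c_end / c_start) ** (1.0 / (n + 1))  # assumed per-slice growth multiple
    return [c_start * growth ** j for j in range(1, n + 1)]


def estimate_total(c: Sequence[float], r: Sequence[float], e: Sequence[float]) -> float:
    """Steps 7.1-7.3 for one event, given cumulative count c, run time r, monitored time e."""
    C, R, E = first_difference(c), first_difference(r), first_difference(e)
    total = sum(C)  # counts that were actually monitored
    for i in range(len(C) - 1):
        if E[i] <= 0:
            continue  # no monitored time recorded for this sample
        n = max(round(R[i] / E[i]) - 1, 0)  # assumed number of unmonitored slices
        total += sum(fill_gap(C[i], C[i + 1], n))
    return total


# Two samples of one event: 100 events in the first monitored slice, 800 in the second,
# with two unmonitored slices in between (run time 300 ms per 100 ms monitored).
print(estimate_total(c=[100, 900], r=[300, 600], e=[100, 200]))
```

In this example the two unmonitored slices are estimated at roughly 200 and 400 events, so the count rises by a constant multiple from 100 to 800. Unmonitored slices before the first sample or after the last one are not estimated by this sketch.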
Step 8: stopping the exit of the slave thread, and destroying the data structure memory in the running process by the master thread and stopping the exit.
Step 9: the hardware event monitoring and counting operation ends.
On the basis of this embodiment, the correctness of the invention was verified with the Rodinia Benchmark Suite; compared with the original MPX, accuracy improved to varying degrees across the different benchmark kernels.
This embodiment was developed as a secondary wrapper around PAPI; the method is equally applicable to other libraries and software that use the MPX technique for multi-hardware event monitoring and counting, such as Linux Perf, HPCToolkit, Intel VTune and Gooda. The platform used in the example is an ordinary rack-mounted Intel x86 server running 64-bit CentOS 7.6, with two Intel Xeon Gold 6248 processors and 192 GB of main memory. First, a library based on PAPI is created that intercepts PAPI's intermediate output file descriptors, and the recorded data are then processed by the estimation method. With a monitoring period of 100 ms, five Rodinia Benchmark Suite applications (SRAD, BFS, LU, KNN and LMD) were monitored for the hardware events all, brins, cond, bris, all, bris, cond, dtlm, m, l1lh, l1lm, l2lh, l2lm, ldram, ich, icm, uistall, urstall and inst, and a precision improvement of 5% to 59% was obtained.
Compared with the original MPX post-processing strategy shown in FIG. 3, this embodiment takes into account the monitored values in the vicinity of each interpolation point. This avoids the extreme estimates that arise when two consecutive monitored values differ widely, and the exponential growth-multiple method covers a wider range of variation patterns, which improves the accuracy of MPX estimates.
Compared with the prior art, the method estimates data from an exponential-growth trend, obtaining estimates closer to the real data distribution and therefore higher precision.
Those skilled in the art may modify the foregoing embodiments in various ways without departing from the principles and spirit of the invention. The scope of the invention is defined by the claims and not by the foregoing embodiments, and every implementation within that scope is covered by the invention.

Claims (5)

1. A multi-hardware event monitoring count value estimation method based on exponential growth, characterized in that a master thread maintains the life cycle of the workflow, creating and initializing the related data structures and a slave thread, sending control signals to the slave thread and running the monitored application; the slave thread, in response to the life-cycle signals of the master thread, performs hardware event scheduling, timed alternate monitoring and post-processing estimation; and the multi-hardware event monitoring count values are obtained by reading the hardware event count registers built into the CPU;
the post-processing estimation specifically comprises the following steps:
step 1) the slave thread takes the first-order difference of all the hardware event time-series data to obtain the differenced values, namely: the actual single-time-slice count value C, the single-time-slice run duration R of the monitored program, and the single-time-slice monitored duration E of the monitored hardware event;
step 2) the slave thread reads the time-series data of one hardware event and defines the i-th differential count value C_i, the (i+1)-th differential count value C_{i+1}, the i-th differential run duration R_i, and the i-th differential monitored duration E_i, and computes, in order, the ratio of the (i+1)-th count value to the i-th count value, C_{i+1}/C_i, the growth multiple per time slice, and the number n of unmonitored time slices;
step 3) the estimation is repeated: the j-th unmonitored count value between the i-th count value and the (i+1)-th count value is estimated, until the count values on all n unmonitored time slices have been estimated; all monitored values and estimated values are accumulated to obtain the total count value of the current hardware event;
step 4) steps 2 to 3 are repeated until the total count values of all monitored hardware events have been obtained, which completes the post-processing of the hardware event monitoring counts;
the timed alternate monitoring specifically comprises the following steps:
step 1: the scheduling system creates as many queues as there are event registers, with each queue corresponding one-to-one to an event register;
step 2: each hardware event is stored in the queues of all event registers that are able to monitor it;
step 3: all queues are randomly ordered;
step 4: the current head events of all queues are checked; when a head event is repeated, the repeated head event in the later-ranked queue is moved to the tail of its queue, and the events behind it each move forward by one position;
step 5: step 4 is repeated until no head event is repeated; all non-head events that duplicate a current head event are then moved to the tail of their queue, and the events behind them each move forward by one position;
step 6: each current head event is placed on its corresponding event register for monitoring;
step 7: when the current time slice expires, the event on each event register is taken off and placed at the tail of the corresponding queue;
step 8: steps 4 to 7 are repeated until an end signal is received from the master thread.
2. The multi-hardware event monitoring count value estimation method based on exponential growth according to claim 1, characterized by comprising the following specific steps:
step 1, the master thread of the PAPI hardware performance acquisition framework initializes the current state and creates a slave thread for monitoring hardware events;
step 2, after global initialization is complete, the master thread creates an event set for storing the events to be monitored, binds the event set to the slave thread, and starts the life cycle of the workflow;
step 3, the master thread adds the hardware events to be monitored to the event set of step 2;
step 4, the master thread sends a monitoring start signal to the slave thread; once started, the slave thread periodically reads the monitoring result from the hardware event count registers built into the CPU, at the time slice length set in step 1, and writes it to the file descriptor corresponding to the current hardware event;
step 5, the monitored program is started on the master thread; the slave thread periodically and alternately monitors the hardware events added in step 3 according to the monitoring time slice set at initialization, and collects the monitoring results as in step 4 and writes them to the file descriptors to obtain the hardware event time-series data;
step 6, after the monitored program finishes, the master thread sends a stop signal to the slave thread, the slave thread stops monitoring, and the method proceeds to step 7;
step 7, the slave thread performs post-processing estimation on the hardware event time-series data collected in step 5 and sends the result to the master thread, which outputs it.
3. The multi-hardware event monitoring count value estimation method based on exponential growth according to claim 2, wherein the hardware event time-series data comprise: an actual hardware event cumulative count value c, a cumulative run time r of the monitored program, and a cumulative monitored time e of the monitored hardware event.
4. The multi-hardware event monitoring count value estimation method based on exponential growth according to claim 3, wherein the state initialization comprises: initializing PAPI internal global variables, obtaining current operating system information, creating file descriptors for recording hardware event count values, enabling MPX mode, and setting the monitoring time slice length.
5. The multi-hardware event monitoring count value estimation method based on exponential growth according to claim 3, wherein the monitoring result consists of sequence data for the monitored hardware event count value, the program run time, and the monitored duration of the current hardware event.
Application CN202010678027.1A, priority date 2020-07-15, filing date 2020-07-15: Multi-hardware event monitoring count value estimation method based on exponential growth (status: Active; published as CN111858243B (en))

Publications (2)

Publication Number    Publication Date
CN111858243A (en)    2020-10-30
CN111858243B (en)    2024-03-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant