CN109885442B - Performance analysis method, device, equipment and storage medium - Google Patents

Performance analysis method, device, equipment and storage medium

Info

Publication number
CN109885442B
Authority
CN
China
Prior art keywords
counting
performance
performance counter
counter
counters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910112938.5A
Other languages
Chinese (zh)
Other versions
CN109885442A (en)
Inventor
郭永磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Suiyuan Intelligent Technology Co ltd
Shanghai Suiyuan Technology Co ltd
Original Assignee
Shanghai Suiyuan Technology Co Ltd
Shanghai Suiyuan Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Suiyuan Technology Co Ltd, Shanghai Suiyuan Intelligent Technology Co Ltd
Priority to CN201910112938.5A
Publication of CN109885442A
Application granted
Publication of CN109885442B
Legal status: Active
Anticipated expiration

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention discloses a performance analysis method, a performance analysis device, performance analysis equipment and a storage medium. The performance analysis method comprises the following steps: configuring working parameters of a plurality of performance counters in a target chip, wherein at least two performance counters in the plurality of performance counters have a cascade relation, and the counting start time and/or the counting end time of a later-stage performance counter are determined by the counting state of an associated earlier-stage performance counter; counting a plurality of events to be counted of the target chip through a plurality of performance counters to obtain counting data; and acquiring counting data of at least one performance counter, and analyzing the performance of the target chip according to the counting data. The technical scheme of the embodiment of the invention can realize the logic combination among the performance counters, overcomes the defect of single information of the existing performance analysis tool and realizes the comprehensive analysis of the performance of the chip.

Description

Performance analysis method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of chip performance analysis, in particular to a performance analysis method, a performance analysis device, performance analysis equipment and a storage medium.
Background
With the development of the chip industry, and especially since the emergence of a large number of dedicated artificial intelligence acceleration chips, increasingly complex chip designs have made performance analysis ever more urgent for chip developers, while the difficulties encountered during performance analysis have also grown.
An existing chip performance analyzer generally obtains a small amount of performance information for each hardware module based on a performance counter and provides all of that information to the user. The user then reviews the information and judges the performance of the chip from experience. Such an analyzer cannot comprehensively analyze the information itself, and it therefore places high demands on the expertise of the users who perform performance analysis.
Disclosure of Invention
The embodiment of the invention provides a performance analysis method, which is used for realizing the comprehensive analysis of the performance of a chip.
In a first aspect, an embodiment of the present invention provides a performance analysis method, where the method includes:
configuring working parameters of a plurality of performance counters in a target chip, wherein at least two performance counters in the plurality of performance counters have a cascade relation, and the counting start time and/or the counting end time of a later-stage performance counter are determined by the counting state of an associated earlier-stage performance counter;
counting a plurality of events to be counted of the target chip through a plurality of performance counters to obtain counting data;
and acquiring counting data of at least one performance counter, and analyzing the performance of the target chip according to the counting data.
Optionally, the operating parameter includes at least one of an operating mode, a count start timing, a count end timing, an event requiring counting, and a count up mode.
Optionally, the later performance counter and the associated earlier performance counter belong to the same hardware module or different hardware modules in the target chip.
Optionally, the counting start timing and/or the counting end timing of the later stage performance counter is determined by the counting state of the associated earlier stage performance counter, and the method includes:
the counting start timing and/or the counting end timing of the later-stage performance counter is determined according to the count value of the associated earlier-stage performance counter; or,
the counting start timing and/or the counting end timing of the subsequent performance counter is determined according to the number of times that the count value of the associated previous performance counter reaches the counting threshold.
Optionally, the configuring the operating parameters of the multiple performance counters in the target chip includes:
and configuring the number of the performance counters and the working parameters of at least one hardware module in the target chip according to a plurality of events to be counted.
Optionally, the obtaining count data of at least one performance counter includes:
reading count data of at least one performance counter; or,
and receiving the counting data reported by at least one performance counter.
Optionally, before the obtaining the count data of the at least one performance counter, the method further includes:
stopping counting of the plurality of performance counters in the target chip.
In a second aspect, an embodiment of the present invention further provides a performance analysis apparatus, where the apparatus includes:
the parameter configuration module is used for configuring working parameters of a plurality of performance counters in a target chip, wherein at least two performance counters in the performance counters have a cascade relation, and the counting start time and/or the counting end time of a later-stage performance counter are determined by the counting state of the associated earlier-stage performance counter;
the event counting module is used for counting a plurality of events to be counted of the target chip through a plurality of performance counters to obtain counting data;
and the performance analysis module is used for acquiring counting data of at least one performance counter and displaying the counting data so as to analyze the performance of the target chip according to the counting data.
In a third aspect, an embodiment of the present invention further provides an apparatus, where the apparatus includes:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement a performance analysis method as in any of the embodiments of the invention.
In a fourth aspect, embodiments of the present invention also provide a storage medium containing computer-executable instructions for performing the performance analysis method according to any one of the embodiments of the present invention when executed by a computer processor.
According to the technical scheme of the embodiment of the invention, cascade counting is realized through the cascade relation among the performance counters in the target chip, so that logical combinations among the statistical events are realized during event counting. Because the counting start time and/or the counting end time of a later-stage performance counter is determined by the counting state of the associated earlier-stage performance counter, the performance counters automatically start or end counting when the cascade relation requires it. The performance counters are thereby used more reasonably, a large amount of redundant data reported by the performance counters is avoided, and a more effective basis is provided for performance analysis. The plurality of events to be counted of the target chip are then counted by the plurality of performance counters to obtain counting data, the counting data of at least one performance counter is obtained, and the performance of the target chip is analyzed according to the counting data. This solves the technical problem that a chip performance analyzer cannot comprehensively analyze the performance information, overcomes the limitation that existing performance analysis tools provide only a single kind of information, and realizes comprehensive analysis of chip performance.
Drawings
Fig. 1 is a schematic flow chart of a performance analysis method according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart illustrating an alternative example of a performance analysis method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a performance analysis apparatus according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of an apparatus according to a third embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the steps as a sequential process, many of the steps can be performed in parallel, concurrently or simultaneously. In addition, the order of the steps may be rearranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1 is a schematic flow chart of a performance analysis method according to an embodiment of the present invention, which is particularly suitable for a situation where comprehensive analysis of chip performance is required, and the method can be executed by a performance analysis apparatus, which can be implemented by software and/or hardware, and generally can be independently configured in a terminal or a server to implement the method of the present embodiment.
As shown in fig. 1, the method of the embodiment may specifically include:
S110, configuring working parameters of a plurality of performance counters in the target chip.
In the embodiment of the present invention, the operating parameter may include at least one of an operating mode, a counting start timing, a counting end timing, an event requiring counting, and a counting increase mode. Optionally, the operating mode may include an independent counting mode or a parallel counting mode, and may further include a software mode, a hardware mode, or the like. Illustratively, the counting increase mode may be rising-edge accumulation, falling-edge accumulation, high-level accumulation, or the like.
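The three counting increase modes named above can be illustrated with a small software model. This is only a sketch: the names `CountUpMode`, `CounterConfig` and `accumulate` are hypothetical, the signal is assumed to be sampled as 0/1 values, and it is assumed to be low before the first sample. The source does not specify how the hardware samples its inputs.

```python
from dataclasses import dataclass
from enum import Enum


class CountUpMode(Enum):
    RISING_EDGE = "rising_edge"
    FALLING_EDGE = "falling_edge"
    HIGH_LEVEL = "high_level"


@dataclass
class CounterConfig:
    """Hypothetical container for some operating parameters of one counter."""
    event: str                       # event requiring counting
    count_up_mode: CountUpMode       # counting increase mode
    work_mode: str = "independent"   # or "parallel"


def accumulate(samples, mode):
    """Count over a sampled 0/1 signal according to the counting increase mode."""
    count = 0
    prev = 0  # assumption: the signal is low before the first sample
    for cur in samples:
        if mode is CountUpMode.RISING_EDGE and prev == 0 and cur == 1:
            count += 1
        elif mode is CountUpMode.FALLING_EDGE and prev == 1 and cur == 0:
            count += 1
        elif mode is CountUpMode.HIGH_LEVEL and cur == 1:
            count += 1
        prev = cur
    return count


signal = [0, 1, 1, 0, 1, 0, 0, 1]
config = CounterConfig(event="bus_busy", count_up_mode=CountUpMode.HIGH_LEVEL)
busy_samples = accumulate(signal, config.count_up_mode)
```

With the same input signal, rising-edge accumulation counts how often the event begins, while high-level accumulation counts for how many samples it lasts, which is why the mode is configured per counter.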
Wherein at least two performance counters of the plurality of performance counters have a cascade relationship, and the counting start timing and/or the counting end timing of the subsequent performance counter is determined by the counting state of the associated preceding performance counter. The counting state includes a count value of the performance counter and/or a number of times that the count value of the counter reaches a counting threshold value. In the embodiment of the present invention, the counting start timing and/or the counting end timing of the subsequent performance counter may be controlled by the output of the previous performance counter, so as to implement cascade counting.
As an alternative to the embodiment of the present invention, the determination of the counting start timing and/or the counting end timing of the subsequent stage performance counter by the counting state of the associated previous stage performance counter may include: the counting start timing and/or the counting end timing of the rear performance counter are determined according to the counting value of the associated front performance counter, for example, the rear performance counter starts or ends counting if the counting value of the front performance counter is greater than or equal to a certain value; alternatively, the count start timing and/or the count end timing of the later performance counter is determined according to the number of times the count value of the associated earlier performance counter reaches the count threshold, for example, if it is counted that the number of times the count value of the earlier performance counter reaches the count threshold reaches or exceeds a preset number threshold, the later performance counter starts or ends counting.
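The two cascade conditions described above can be sketched in software as follows. The class `PerfCounter` and its methods are hypothetical illustrations, not the patented hardware design. In particular, resetting the count value to zero when it reaches the counting threshold is an assumption made here so that the threshold can be reached repeatedly.

```python
class PerfCounter:
    """Toy model of a performance counter that can drive later-stage counters."""

    def __init__(self, name, enabled=True, threshold=None):
        self.name = name
        self.enabled = enabled
        self.threshold = threshold   # counting threshold; None means unbounded
        self.value = 0
        self.threshold_hits = 0      # times the count value reached the threshold
        self._watchers = []          # [predicate, action, already_fired]

    def start(self):
        self.enabled = True

    def stop(self):
        self.enabled = False

    def on_state(self, predicate, action):
        """Run `action` once, the first time `predicate(self)` becomes true."""
        self._watchers.append([predicate, action, False])

    def count(self):
        """Record one occurrence of the event assigned to this counter."""
        if not self.enabled:
            return
        self.value += 1
        if self.threshold is not None and self.value >= self.threshold:
            self.threshold_hits += 1
            self.value = 0           # assumption: wrap to zero at the threshold
        for watcher in self._watchers:
            predicate, action, fired = watcher
            if not fired and predicate(self):
                watcher[2] = True
                action()


# Case 1: the later-stage counter starts once the earlier-stage value is >= 3.
front = PerfCounter("front")
back = PerfCounter("back", enabled=False)
front.on_state(lambda c: c.value >= 3, back.start)
for _ in range(5):
    front.count()
    back.count()

# Case 2: the later-stage counter stops after the earlier-stage counter
# has reached its counting threshold twice.
wrap = PerfCounter("wrap", threshold=4)
gated = PerfCounter("gated")
wrap.on_state(lambda c: c.threshold_hits >= 2, gated.stop)
for _ in range(10):
    wrap.count()
    gated.count()
```

In this sketch the later-stage counter needs no software intervention between configuration and readout: the earlier-stage counter's state alone opens or closes the counting window.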
Alternatively, the latter performance counter and the associated former performance counter may belong to the same hardware module or different hardware modules in the target chip. That is, two or more performance counters in the same hardware module may have a cascade relationship, and two or more performance counters in different hardware modules may also have a cascade relationship.
Comprehensive performance analysis of the target chip usually needs to combine a plurality of performance evaluation indexes. Therefore, performance counters may be arranged in one, two or more hardware modules of the target chip, and a plurality of performance counters may be arranged in the same hardware module. Optionally, configuring operating parameters of a plurality of performance counters in the target chip includes: configuring the number of the performance counters and the working parameters of at least one hardware module in the target chip according to a plurality of events to be counted. The advantage of this arrangement is that complex combinational logic across modules can be set up in the event counting process, so as to realize comprehensive performance analysis of each module of the target chip.
In the embodiment of the invention, at least one hardware module needing to be provided with the performance counter in the target chip can be determined according to the logic relation among a plurality of events to be counted, the number of the performance counters needing to be provided for each hardware module can be further determined, and the working parameters of the performance counters are configured. Illustratively, a performance counter unit of at least one hardware module may be configured according to a plurality of events to be counted, wherein the performance counter unit comprises at least one performance counter. It will be appreciated that the mode of operation of the performance counters within one performance counter unit may also be controlled by the performance counter output of another performance counter unit.
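As a rough illustration of deriving the hardware modules and the number of counters from the events to be counted, the sketch below groups events by module and allots one counter per distinct event. The module names, event names and the function `plan_counter_units` are all hypothetical, and the one-counter-per-event rule is an assumption based on the statement that each counter counts a particular event.

```python
def plan_counter_units(events):
    """Group events by hardware module, one performance counter per event.

    events: list of (module_name, event_name) pairs.
    Returns {module_name: [event_name, ...]}; the length of each list is the
    number of counters that module's performance counter unit needs.
    """
    units = {}
    for module, event in events:
        unit = units.setdefault(module, [])
        if event not in unit:   # a repeated event needs no extra counter
            unit.append(event)
    return units


events_to_count = [
    ("dma", "transfer_start"),
    ("dma", "transfer_done"),
    ("alu", "instruction_executed"),
    ("alu", "pipeline_stall"),
    ("dma", "transfer_start"),  # duplicate request
]
units = plan_counter_units(events_to_count)
counters_per_module = {module: len(evts) for module, evts in units.items()}
```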
It should be noted that the operating parameters of each performance counter may be configured uniformly before the plurality of performance counters count, or the operating parameters of each performance counter may be configured after at least one performance counter starts counting. That is, the operating parameters of the performance counters may be configured according to actual requirements.
S120, counting a plurality of events to be counted of the target chip through a plurality of performance counters to obtain counting data.
The event to be counted can include a software event to be counted and a hardware event to be counted. The performance counter can be used for counting when the event to be counted occurs, and the count value of the performance counter is increased by 1 every time the event to be counted occurs. Each performance counter may count for a particular event to be counted, resulting in count data. It can be understood that to analyze a certain performance of a target chip, two or more events to be counted are required to be counted.
It should be noted that the event to be counted is the event that a performance counter is configured to count. The event can be an event directly representing the execution function of the target chip, such as the execution of an instruction or a pipeline stall of an instruction, or a counting event of a performance counter. For example, the count value of the performance counter is set with a count upper limit, and the event to be counted may be the number of times the count value of the performance counter reaches the count upper limit, or the like.
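One use of counting how often another counter reaches its upper limit is to extend the effective counting range. The sketch below assumes that the first counter wraps to zero when it reaches the limit; the source does not state what happens at the upper limit, so this behavior and the function names are illustrative only.

```python
def count_with_overflow(num_events, upper_limit):
    """Simulate a bounded counter plus a second counter of upper-limit hits."""
    value = 0
    overflows = 0
    for _ in range(num_events):
        value += 1
        if value == upper_limit:  # assumption: wrap to zero at the upper limit
            value = 0
            overflows += 1
    return value, overflows


def total_events(value, overflows, upper_limit):
    """Reconstruct the true event total from the two counters."""
    return overflows * upper_limit + value


value, overflows = count_with_overflow(num_events=1000, upper_limit=256)
reconstructed = total_events(value, overflows, upper_limit=256)
```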
Before counting a plurality of events to be counted of the target chip through a plurality of performance counters, the performance counters need to be started so that they are ready to count. A software scenario requiring performance analysis is then started; while the software runs, if an event to be counted occurs on the target chip, the corresponding performance counter is triggered to perform the actual counting.
S130, obtaining counting data of at least one performance counter, and analyzing the performance of the target chip according to the counting data.
As described above, one performance counter usually counts the occurrences of one statistical event, so the counting data of one, two or more performance counters may be required when performing performance analysis on the target chip. The counting data of the at least one performance counter may be actively obtained or passively received. Illustratively, obtaining count data of the at least one performance counter may include: reading count data of at least one performance counter; or receiving the counting data reported by at least one performance counter.
Wherein reading the count data of the at least one performance counter may further comprise: reading the counting data of at least one performance counter in real time; or reading the counting data of at least one performance counter at a preset time interval; or reading the count data of at least one performance counter when a data read request input by a user is received, and so on.
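Reading at a preset time interval can be sketched with a simulated event trace, where each list element stands for one time step and is 1 if the event occurred in that step. The function name and the step-based notion of time are assumptions for illustration; a real tool would read hardware registers on a timer.

```python
def read_at_intervals(event_trace, interval):
    """Return (step, count) snapshots taken every `interval` time steps."""
    count = 0
    snapshots = []
    for step, occurred in enumerate(event_trace, start=1):
        count += occurred               # the counter keeps running
        if step % interval == 0:        # the preset read interval has elapsed
            snapshots.append((step, count))
    return snapshots


trace = [1, 0, 1, 1, 0, 1, 1, 1, 0]
snapshots = read_at_intervals(trace, interval=3)
```

Differences between consecutive snapshots give the event rate per interval, which a single final readout cannot provide.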
Optionally, before the obtaining the count data of the at least one performance counter, the method further includes: stopping counting of the plurality of performance counters in the target chip. The technical scheme is particularly suitable for the condition that the performance analysis is carried out on the target chip by combining the counting data of a plurality of performance counters. The stopping of the counting of the plurality of performance counters in the target chip may be stopping of the counting of all the performance counters in the target chip, or stopping of the counting of some of all the performance counters in the target chip.
Optionally, stopping counting of the plurality of performance counters in the target chip comprises: the counting of the plurality of performance counter units of the at least one hardware module is stopped. Of course, at least one performance counter to be stopped in the target chip may be determined according to at least one performance to be analyzed of the target chip, and the counting of the at least one performance counter may be stopped.
When the performance of the target chip is analyzed according to the counting data, the counting data of the at least one performance counter is first acquired and then aggregated, summarized and presented.
In order to facilitate the user to view the data, optionally, after obtaining the count data of the at least one performance counter, the method may further include: and displaying the counting data. On the basis of the above technical solutions, after analyzing the performance of the target chip according to the count data, the method may further include: and displaying the analysis result of the target chip.
According to the technical scheme of the embodiment, cascade counting is realized through the cascade relation among the performance counters in the target chip, so that logical combinations among the statistical events are realized during event counting. Because the counting start time and/or the counting end time of a later-stage performance counter is determined by the counting state of the associated earlier-stage performance counter, the performance counters automatically start or end counting when the cascade relation requires it. The performance counters are thereby used more reasonably, a large amount of redundant data reported by the performance counters is avoided, and a more effective basis is provided for performance analysis. The plurality of events to be counted of the target chip are then counted by the plurality of performance counters to obtain counting data, the counting data of at least one performance counter is obtained, and the performance of the target chip is analyzed according to the counting data. This solves the technical problem that a chip performance analyzer cannot comprehensively analyze the performance information, overcomes the limitation that existing performance analysis tools provide only a single kind of information, and realizes comprehensive analysis of chip performance.
Example two
Fig. 2 is a schematic flowchart of an alternative example of a performance analysis method according to a second embodiment of the present invention. As shown in fig. 2, the performance analysis method of the present embodiment may specifically include:
S210, configuring at least one performance counter in a performance counter unit in the current hardware module in the target chip.
Wherein the target chip comprises a plurality of hardware modules. When configuring the performance counters, one or two or more performance counters in the performance counter unit may be configured in units of the performance counter unit of the hardware module.
Configuring at least one performance counter in a performance counter unit in the current hardware module in the target chip includes: configuring, according to the event to be counted, the working parameters of each performance counter in the performance counter unit of the current hardware module. The working parameters configured for each performance counter include a working mode, a counting increase mode, a mode for triggering the start of counting (counting start timing), a mode for triggering the end of counting (counting end timing), an event to be counted, and the like.
For example, the operation mode of the current performance counter may include a parallel counting mode and an independent counting mode, i.e., it is determined whether the current performance counter and the remaining performance counters except the current performance counter need to be counted in parallel. The manner of triggering the current performance counter to start counting may include being triggered by the count values of the remaining performance counters other than the current performance counter. The event to be counted of the current performance counter may include a software and hardware event triggered by the hardware module, or count the number of times that the count value of the remaining performance counters except the current performance counter reaches a count threshold (upper limit of the count value).
S220, judging whether more events to be counted need to be counted or not, if so, returning to execute S210; if not, go to S230.
A performance counter usually counts one predetermined event to be counted. In the embodiment of the invention, the target events to be counted can be determined according to the performance to be analyzed, and the number of performance counters can then be determined according to the target events to be counted. If it is determined that more events to be counted need to be counted, a further performance counter of the performance counter unit of the current hardware module may be configured according to the method in S210. If not, the configuration of each performance counter of the current performance counter unit is completed.
S230, judging whether more hardware modules need to be configured or not, if so, returning to execute S210; if not, go to S240.
After the configuration of each performance counter of the performance counter unit of the current hardware module is completed, it is further determined whether there is another hardware module whose performance counters need to be configured. If so, the performance counter unit of that hardware module may be configured according to the method in S210 described above. The advantage of this arrangement is that a plurality of events to be counted can be counted in cascade through the cooperation of a plurality of performance counters in different hardware modules.
Wherein at least two performance counters of the plurality of performance counters have a cascade relationship, and the counting start timing and/or the counting end timing of the subsequent performance counter is determined by the counting state of the associated preceding performance counter.
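The configuration loop of S210 to S230 can be sketched as two nested loops, followed by a check that every later-stage counter refers to an earlier-stage counter that was actually configured. The plan layout, the names and the validation step are assumptions added for illustration; the source describes only the loop structure.

```python
def configure_counters(plan):
    """Walk the plan in S210-S230 order and return the configured counters.

    plan: {module: [(event, trigger_source), ...]}, where trigger_source is
    None or a (module, event) pair naming the earlier-stage counter.
    """
    configured = {}
    for module, entries in plan.items():      # S230: next hardware module
        for event, trigger in entries:        # S220: next event to count
            configured[(module, event)] = {"trigger": trigger}  # S210
    # A later-stage counter only makes sense if its earlier stage exists.
    for key, cfg in configured.items():
        trigger = cfg["trigger"]
        if trigger is not None and trigger not in configured:
            raise ValueError(f"{key} is triggered by unconfigured counter {trigger}")
    return configured


plan = {
    "dma": [("transfer_done", None)],
    "alu": [("pipeline_stall", ("dma", "transfer_done"))],  # cross-module cascade
}
configured = configure_counters(plan)
cascaded = [key for key, cfg in configured.items() if cfg["trigger"] is not None]
```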
S240, starting each performance counter of each performance counter unit.
When all the performance counter units that currently need to be configured in the target chip have been configured, the performance counters of all the performance counter units are started so that they count when an event to be counted occurs.
S250, starting a software scenario requiring performance analysis.
The software scenario requiring performance analysis is started, which triggers the events to be counted of each hardware module, and the performance counter units count the events to be counted.
S260, reading the counting data of the performance counter unit.
It is understood that the count data of the performance counter unit can be read according to the actual performance analysis requirement, and the specific condition for reading the count data is not limited herein.
S270, stopping the counting of the performance counter unit.
For example, if every item of counting data needed for the performance to be analyzed has been collected, or a preset stop-counting condition is reached, the counting of the performance counter unit may be stopped. The preset stop-counting condition may be that the time elapsed since the performance counters were started reaches a preset time threshold, or that the counting data of at least one performance counter reaches a preset counting threshold, for example, exceeds a preset highest threshold.
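The two preset stop-counting conditions can be written as a single predicate. The function name, the use of seconds as the time unit and the single shared count limit are assumptions for this sketch.

```python
def should_stop(elapsed_s, counts, max_elapsed_s, max_count):
    """Return True once either preset stop-counting condition holds."""
    timed_out = elapsed_s >= max_elapsed_s                           # time threshold
    saturated = any(value >= max_count for value in counts.values()) # count threshold
    return timed_out or saturated


counts = {"pipeline_stall": 120, "instruction_executed": 9000}
stop_now = should_stop(elapsed_s=2.5, counts=counts, max_elapsed_s=10.0, max_count=10000)
```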
S280, summarizing and presenting the counting data of the performance counters.
Performance analysis is performed according to the counting data of each performance counter; to make the performance analysis result more intuitive, the counting data needs to be summarized, processed and displayed.
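A minimal sketch of the summarizing step is given below: it groups the counting data by hardware module and formats one report line per counter. The data layout, names and column widths are assumptions; the source does not prescribe any presentation format.

```python
def summarize(count_data):
    """Group {(module, event): count} by module and format one line per counter."""
    by_module = {}
    for (module, event), count in sorted(count_data.items()):
        by_module.setdefault(module, {})[event] = count
    lines = []
    for module, events in by_module.items():
        for event, count in events.items():
            lines.append(f"{module:<6}{event:<24}{count:>8}")
    return by_module, lines


count_data = {
    ("alu", "instruction_executed"): 900,
    ("alu", "pipeline_stall"): 100,
    ("dma", "transfer_done"): 40,
}
by_module, report = summarize(count_data)
print("\n".join(report))
```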
According to the technical scheme, the events to be counted of software and hardware influencing the performance of the chip are counted based on the performance counters configured in the target chip, the cross-module complex combinational logic can be set in the event counting process, and the method is particularly suitable for performing performance analysis on the chip with complex hardware design and realizing the comprehensive performance analysis on each hardware module of the target chip.
Example three
Fig. 3 is a schematic structural diagram of a performance analysis apparatus according to a third embodiment of the present invention, where the performance analysis apparatus of this embodiment is particularly suitable for a situation that a comprehensive analysis needs to be performed on chip performance, and the performance analysis apparatus may be implemented by software and/or hardware, and generally can be independently configured in a terminal or a server to implement the performance analysis method according to the third embodiment of the present invention. As shown in fig. 3, the performance analysis apparatus of the present embodiment may include: a parameter configuration module 310, an event statistics module 320, and a performance analysis module 330.
The parameter configuration module 310 is configured to configure working parameters of a plurality of performance counters in a target chip, where at least two performance counters in the plurality of performance counters have a cascade relationship, and a counting start time and/or a counting end time of a subsequent performance counter is determined by a counting state of an associated previous performance counter; the event counting module 320 is configured to count a plurality of events to be counted of the target chip through a plurality of performance counters to obtain count data; the performance analysis module 330 is configured to obtain count data of at least one performance counter, and display the count data, so as to analyze the performance of the target chip according to the count data.
According to the technical scheme of the embodiment, cascade counting is realized through the cascade relation among the performance counters in the target chip, so that logical combinations among the statistical events are realized during event counting. Because the counting start time and/or the counting end time of a later-stage performance counter is determined by the counting state of the associated earlier-stage performance counter, the performance counters automatically start or end counting when the cascade relation requires it. The performance counters are thereby used more reasonably, a large amount of redundant data reported by the performance counters is avoided, and a more effective basis is provided for performance analysis. The plurality of events to be counted of the target chip are then counted by the plurality of performance counters to obtain counting data, the counting data of at least one performance counter is obtained, and the performance of the target chip is analyzed according to the counting data. This solves the technical problem that a chip performance analyzer cannot comprehensively analyze the performance information, overcomes the limitation that existing performance analysis tools provide only a single kind of information, and realizes comprehensive analysis of chip performance.
On the basis of the above technical solutions, the working parameters may include at least one of a working mode, a counting start time, a counting end time, an event to be counted, and a count increment mode.
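One way to picture these working parameters is as a single configuration record per counter. The sketch below is an assumption-laden illustration: the enum members (`FREE_RUNNING`, `CASCADED`, `BY_ONE`, `BY_AMOUNT`), the field names, and the validation rule are hypothetical, since the text lists the parameter kinds but not their possible values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class WorkMode(Enum):
    FREE_RUNNING = "free_running"  # assumed mode: counts whenever enabled
    CASCADED = "cascaded"          # assumed mode: gated by an earlier-stage counter


class IncrementMode(Enum):
    BY_ONE = "by_one"              # assumed: add 1 per occurrence of the event
    BY_AMOUNT = "by_amount"        # assumed: add an amount carried by the event


@dataclass(frozen=True)
class CounterParams:
    """Hypothetical record holding the working parameters of one counter."""
    work_mode: WorkMode
    event: str                                   # event to be counted
    increment_mode: IncrementMode = IncrementMode.BY_ONE
    start_at: Optional[int] = None               # upstream state at which counting starts
    end_at: Optional[int] = None                 # upstream state at which counting ends

    def __post_init__(self) -> None:
        # Assumed rule: a cascaded counter is only meaningful with a start
        # and/or end condition tied to its earlier-stage counter.
        if (self.work_mode is WorkMode.CASCADED
                and self.start_at is None and self.end_at is None):
            raise ValueError("a cascaded counter needs a start and/or end condition")


params = CounterParams(work_mode=WorkMode.CASCADED, event="cache_miss", start_at=100)
print(params)
```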
On the basis of the above technical solutions, the later-stage performance counter and the associated earlier-stage performance counter may belong to the same hardware module or to different hardware modules in the target chip.
On the basis of the above technical solutions, the counting start time and/or the counting end time of the later-stage performance counter being determined by the counting state of the associated earlier-stage performance counter includes:
the counting start time and/or the counting end time of the later-stage performance counter is determined according to the count value of the associated earlier-stage performance counter; or
the counting start time and/or the counting end time of the later-stage performance counter is determined according to the number of times that the count value of the associated earlier-stage performance counter reaches the counting threshold.
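The two trigger conditions listed above can be compared side by side in a small simulation. This is a sketch, not the hardware behavior: it assumes that the earlier-stage counter wraps to zero each time it reaches its counting threshold and that a later-stage counter latches on once triggered. The class names, the `mode` strings, and the per-cycle `tick` model are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UpstreamCounter:
    """Hypothetical earlier-stage counter that wraps at a counting threshold."""
    threshold: int          # counting threshold (assumed to wrap the value to 0)
    value: int = 0          # current count value
    times_reached: int = 0  # how many times the value has reached the threshold

    def tick(self) -> None:
        self.value += 1
        if self.value == self.threshold:
            self.times_reached += 1
            self.value = 0


@dataclass
class DownstreamCounter:
    """Hypothetical later-stage counter started by the upstream counting state."""
    upstream: UpstreamCounter
    mode: str               # "value" or "times_reached"
    target: int             # upstream state that starts this counter
    started: bool = False   # latched once the start condition is met
    value: int = 0

    def tick(self) -> None:
        if not self.started:
            if self.mode == "value":
                observed = self.upstream.value
            else:
                observed = self.upstream.times_reached
            self.started = observed >= self.target
        if self.started:
            self.value += 1


up = UpstreamCounter(threshold=4)
by_value = DownstreamCounter(upstream=up, mode="value", target=3)
by_times = DownstreamCounter(upstream=up, mode="times_reached", target=2)

for _ in range(10):  # ten simulated cycles
    up.tick()
    by_value.tick()
    by_times.tick()

print(by_value.value, by_times.value)
```

The value-based counter starts in the third cycle, while the threshold-count-based counter waits until the upstream counter has reached its threshold twice (the eighth cycle), so the second condition allows much longer delays with the same counter width.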
On the basis of the above technical solutions, the parameter configuration module may be further configured to:
configure, according to the plurality of events to be counted, the number of performance counters of at least one hardware module in the target chip and the working parameters of those performance counters.
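As an illustration of deriving the counter allocation from the events to be counted, the sketch below groups events by hardware module. It assumes one counter per distinct event in each module, which the text does not state; `plan_counters`, the module names, and the event names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_counters(events: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (module, event) pairs so that each module gets one counter
    per distinct event (an assumed allocation policy)."""
    plan: Dict[str, List[str]] = defaultdict(list)
    for module, event in events:
        if event not in plan[module]:
            plan[module].append(event)
    return dict(plan)


events_to_count = [
    ("dma", "transfer_start"),
    ("dma", "transfer_done"),
    ("compute_core", "stall"),
    ("dma", "transfer_start"),  # duplicate request, needs no extra counter
]
plan = plan_counters(events_to_count)
counters_per_module = {module: len(evts) for module, evts in plan.items()}
print(counters_per_module)
```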
On the basis of the above technical solutions, the performance analysis module may be configured to:
read the count data of at least one performance counter; or
receive the count data reported by at least one performance counter.
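The two acquisition paths, reading a counter on demand and receiving data that a counter reports by itself, can be sketched as follows. The reporting interval and the callback mechanism are assumptions made for illustration; the text does not specify when or how a counter reports. `ReportingCounter`, `report_every`, and `on_report` are hypothetical names.

```python
from typing import Callable, List, Optional


class ReportingCounter:
    """Hypothetical counter that supports both polling and self-reporting."""

    def __init__(self, report_every: Optional[int] = None,
                 on_report: Optional[Callable[[int], None]] = None) -> None:
        self.value = 0
        self.report_every = report_every  # assumed: report after every N increments
        self.on_report = on_report        # assumed: callback that receives the report

    def increment(self) -> None:
        self.value += 1
        if (self.report_every and self.on_report
                and self.value % self.report_every == 0):
            self.on_report(self.value)

    def read(self) -> int:
        """Polling path: the analysis side reads the current value."""
        return self.value


# Path 1: the analysis side reads the counter.
polled = ReportingCounter()
for _ in range(5):
    polled.increment()
polled_value = polled.read()

# Path 2: the counter reports and the analysis side receives.
received: List[int] = []
pushed = ReportingCounter(report_every=2, on_report=received.append)
for _ in range(5):
    pushed.increment()

print(polled_value, received)
```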
On the basis of the above technical solutions, the performance analysis apparatus may further include:
a counting stopping module, configured to stop counting of the plurality of performance counters in the target chip before the count data of the at least one performance counter is acquired.
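A plausible reason for stopping the counters before acquisition is that all values then describe the same interval. The sketch below shows this stop-then-read order in a software model; `StoppableCounter` and `stop_and_read` are hypothetical names, and the consistency motivation is an interpretation rather than something the text states.

```python
from typing import Dict, List


class StoppableCounter:
    """Hypothetical counter that ignores events once stopped."""

    def __init__(self, event: str) -> None:
        self.event = event
        self.value = 0
        self.running = True

    def observe(self, event: str) -> None:
        if self.running and event == self.event:
            self.value += 1

    def stop(self) -> None:
        self.running = False


def stop_and_read(counters: List[StoppableCounter]) -> Dict[str, int]:
    """Stop every counter first, then read, so all values cover the same interval."""
    for counter in counters:
        counter.stop()
    return {counter.event: counter.value for counter in counters}


counters = [StoppableCounter("load"), StoppableCounter("store")]
for event in ["load", "store", "load"]:
    for counter in counters:
        counter.observe(event)

snapshot = stop_and_read(counters)

# Events arriving after the stop no longer change the counters.
for counter in counters:
    counter.observe("load")

print(snapshot)
```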
The performance analysis device can execute the performance analysis method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of executing the performance analysis method.
Example four
FIG. 4 is a schematic structural diagram of a device according to a fourth embodiment of the present invention. It shows a block diagram of an exemplary device 412 suitable for use in implementing embodiments of the present invention. The device 412 shown in FIG. 4 is only an example and should not impose any limitation on the functionality or scope of use of embodiments of the present invention.
As shown in FIG. 4, device 412 is in the form of a general purpose computing device. The components of device 412 may include, but are not limited to: one or more processors 416, a storage device 428 for storing one or more programs, and a bus 418 connecting various system components including the storage device 428 and the processors 416. When executed by the one or more processors 416, the one or more programs cause the one or more processors 416 to implement the performance analysis method of any of the embodiments of the present invention.
Bus 418 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, enhanced ISA bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Device 412 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by device 412 and includes both volatile and nonvolatile media, removable and non-removable media.
Storage device 428 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 430 and/or cache memory 432. The device 412 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 434 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 4, commonly referred to as a "hard drive"). Although not shown in FIG. 4, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 418 by one or more data media interfaces. Storage device 428 can include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 440 having a set (at least one) of program modules 442 may be stored, for instance, in storage device 428. Such program modules 442 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data, each of which, or some combination of which, may include an implementation of a network environment. The program modules 442 generally perform the functions and/or methodologies of the described embodiments of the invention.
The device 412 may also communicate with one or more external devices 414 (e.g., keyboard, pointing device, display 424, etc.), with one or more devices that enable a user to interact with the device 412, and/or with any devices (e.g., network card, modem, etc.) that enable the device 412 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 422. Also, the device 412 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) through the network adapter 420. As shown, network adapter 420 communicates with the other modules of device 412 over bus 418. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the device 412, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processor 416 executes programs stored in the storage device 428 to perform various functional applications and data processing, such as implementing the performance analysis methods provided by the embodiments of the present invention.
In addition, an embodiment of the present invention further provides a storage medium readable by a computer, and having a computer program stored thereon, where the computer program is used to execute a performance analysis method when executed by a processor, and the method includes: configuring working parameters of a plurality of performance counters in a target chip, wherein at least two performance counters in the plurality of performance counters have a cascade relation, and the counting start time and/or the counting end time of a later-stage performance counter are determined by the counting state of an associated earlier-stage performance counter; counting a plurality of events to be counted of the target chip through a plurality of performance counters to obtain counting data; and acquiring counting data of at least one performance counter, and analyzing the performance of the target chip according to the counting data.
Optionally, the computer-executable instructions, when executed by the computer processor, may also be used to implement the solution of the performance analysis method provided by any embodiment of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, which may for example be regarded as an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable storage medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, a processor-containing system, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them. For the purposes of this description, a "computer-readable storage medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable storage medium may even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (9)

1. A method of performance analysis, comprising:
configuring working parameters of a plurality of performance counters in a target chip, wherein at least two performance counters in the plurality of performance counters have a cascade relation, and the counting start time and/or the counting end time of a later-stage performance counter are determined by the counting state of an associated earlier-stage performance counter;
counting a plurality of events to be counted of the target chip through a plurality of performance counters to obtain counting data;
acquiring counting data of at least one performance counter, and analyzing the performance of the target chip according to the counting data;
wherein the counting start time and/or the counting end time of the later-stage performance counter being determined by the counting state of the associated earlier-stage performance counter comprises:
the counting start time and/or the counting end time of the later-stage performance counter is determined according to the count value of the associated earlier-stage performance counter; or
the counting start time and/or the counting end time of the later-stage performance counter is determined according to the number of times that the count value of the associated earlier-stage performance counter reaches the counting threshold.
2. The method of claim 1, wherein the working parameters include at least one of a working mode, a counting start time, a counting end time, an event to be counted, and a count increment mode.
3. The method of claim 1, wherein the later-stage performance counter and the associated earlier-stage performance counter belong to the same hardware module or to different hardware modules in the target chip.
4. The method of claim 1, wherein configuring the operating parameters of the plurality of performance counters in the target chip comprises:
configuring, according to the plurality of events to be counted, the number of performance counters of at least one hardware module in the target chip and the working parameters of those performance counters.
5. The method of claim 1, wherein obtaining count data for at least one performance counter comprises:
reading the count data of at least one performance counter; or
receiving the count data reported by at least one performance counter.
6. The method of claim 5, further comprising, prior to said obtaining count data for at least one performance counter:
stopping counting of the plurality of performance counters in the target chip.
7. A performance analysis apparatus, comprising:
a parameter configuration module, configured to configure working parameters of a plurality of performance counters in a target chip, wherein at least two performance counters in the plurality of performance counters have a cascade relation, and the counting start time and/or the counting end time of a later-stage performance counter is determined by the counting state of the associated earlier-stage performance counter;
an event counting module, configured to count a plurality of events to be counted of the target chip through the plurality of performance counters to obtain count data;
a performance analysis module, configured to acquire count data of at least one performance counter and display the count data, so as to analyze the performance of the target chip according to the count data;
wherein the counting start time and/or the counting end time of the later-stage performance counter being determined by the counting state of the associated earlier-stage performance counter comprises:
the counting start time and/or the counting end time of the later-stage performance counter is determined according to the count value of the associated earlier-stage performance counter; or
the counting start time and/or the counting end time of the later-stage performance counter is determined according to the number of times that the count value of the associated earlier-stage performance counter reaches the counting threshold.
8. An apparatus, characterized in that the apparatus comprises:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the performance analysis method of any of claims 1-6.
9. A storage medium containing computer-executable instructions for performing the performance analysis method of any one of claims 1-6 when executed by a computer processor.
CN201910112938.5A 2019-02-13 2019-02-13 Performance analysis method, device, equipment and storage medium Active CN109885442B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910112938.5A CN109885442B (en) 2019-02-13 2019-02-13 Performance analysis method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910112938.5A CN109885442B (en) 2019-02-13 2019-02-13 Performance analysis method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109885442A CN109885442A (en) 2019-06-14
CN109885442B true CN109885442B (en) 2020-03-27

Family

ID=66928105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910112938.5A Active CN109885442B (en) 2019-02-13 2019-02-13 Performance analysis method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109885442B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089183A (en) * 2022-11-28 2023-05-09 格兰菲智能科技有限公司 Graphics processor performance test method, device, computer equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102369517A (en) * 2011-09-01 2012-03-07 华为技术有限公司 Chip state monitoring method, device and chip
CN109150534A (en) * 2017-06-19 2019-01-04 华为技术有限公司 terminal device and data processing method

Also Published As

Publication number Publication date
CN109885442A (en) 2019-06-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: 201306 C, 888, west two road, Nanhui new town, Pudong New Area, Shanghai

Patentee after: SHANGHAI SUIYUAN INTELLIGENT TECHNOLOGY Co.,Ltd.

Country or region after: China

Patentee after: Shanghai Suiyuan Technology Co.,Ltd.

Address before: 201306 C, 888, west two road, Nanhui new town, Pudong New Area, Shanghai

Patentee before: SHANGHAI SUIYUAN INTELLIGENT TECHNOLOGY Co.,Ltd.

Country or region before: China

Patentee before: SHANGHAI ENFLAME TECHNOLOGY Co.,Ltd.
