Disclosure of Invention
The invention aims to overcome the defects of the background art by providing an AXI bus performance analysis method and device based on a hierarchical state machine. The method analyzes transaction characteristics comprehensively and analyzes the bus performance of the read and write channels independently, so that performance bottlenecks can be located accurately, analysis efficiency is improved, and the chip design cycle is shortened.
The invention provides an AXI bus performance analysis method based on a hierarchical state machine, which comprises the following steps: step 1, bus occupation analysis: using the number of transmissions present on the AXI bus as the criterion for bus occupation, and obtaining a bus occupation signal, a read channel occupation signal and a write channel occupation signal; step 2, setting the hierarchical state machine: setting a hierarchical state machine according to the occupation of the read channel and the write channel, the hierarchical state machine comprising one parent layer state machine and two sub-layer state machines; step 3, bus performance analysis: the parent layer state machine serves as the global state machine and analyzes the global performance of the AXI bus, and the sub-layer state machines serve as the real-time state machines of the read and write channels and analyze the real-time performance of the AXI read and write channels.
In the above technical solution, step 1 includes the following steps: step 1.1, judging the bus to be occupied at any moment when the number of transactions present on the bus is not 0, and judging the bus to be idle at any moment when that number is 0; step 1.2, monitoring the four channels of the AXI bus (write address, write response, read address and read data), recording the number of unfinished transmissions of the read/write channel when the address on the read address/write address channel is valid, and recording the number of finished transmissions of the read/write channel when the read data/write response is valid, wherein the difference between the unfinished transmission number and the finished transmission number is the number of transmissions present on the bus; when this number is 0 the bus is not occupied, and when it is not 0 the bus is occupied.
In the above technical solution, the specific steps of step 1.2 are as follows:
step 1.2.1, when the configured monitoring start signal is 1, the state machine enters the monitoring state;
step 1.2.2, monitoring the write address channel signals awready and awvalid, the write response channel signals bready and bvalid, the read address channel signals arready and arvalid, and the read data channel signals rready and rvalid, and analyzing the number of unfinished transmissions on the current read and write channels from these 4 groups of 8 signals;
step 1.2.3, when awready & awvalid is 1, the address on the write address channel is valid; the write address channel transmission length signal awlen is monitored and the write channel unfinished transmission number is recorded as wsignal0 = wsignal0 + (awlen + 1'b1), the initial value of wsignal0 being 0;
step 1.2.4, when bready & bvalid is 1, the response on the write response channel is valid; if wsignal0 is not 0, the write channel finished transmission number is recorded as wsignal1 = wsignal1 + 1'b1; if wsignal0 is 0, the response belongs to a write transmission left unfinished on the bus from before monitoring began, and it is not counted as a finished write transmission;
step 1.2.5, when arready & arvalid is 1, the address on the read address channel is valid; the read address channel transmission length signal arlen is monitored and the read channel unfinished transmission number is recorded as rsignal0 = rsignal0 + (arlen + 1'b1), the initial value of rsignal0 being 0;
step 1.2.6, when rready & rvalid is 1, the data on the read data channel is valid; if rsignal0 is not 0, the read channel finished transmission number is recorded as rsignal1 = rsignal1 + 1'b1; if rsignal0 is 0, the data belongs to a read transmission left unfinished on the bus from before monitoring began, and it is not counted as a finished read transmission;
step 1.2.7, computing wsignal0 - wsignal1: when the value is greater than 0, unfinished transmissions exist on the write channel, and the write channel occupation signal w_active is set to 1'b1; when the value equals 0, all transmissions on the write channel have finished, and w_active is set to 1'b0; when the value is less than 0, unfinished concurrent write transactions existed on the bus before monitoring began, and both transmission counters are cleared;
step 1.2.8, computing rsignal0 - rsignal1: when the value is greater than 0, unfinished transmissions exist on the read channel, and the read channel occupation signal r_active is set to 1'b1; when the value equals 0, all transmissions on the read channel have finished, and r_active is set to 1'b0; when the value is less than 0, unfinished concurrent read transactions existed on the bus before monitoring began, and both transmission counters are cleared;
step 1.2.9, computing (wsignal0 + rsignal0) - (wsignal1 + rsignal1): when the value is greater than 0, unfinished transmissions exist on the bus, and the bus occupation signal axi_active is set to 1'b1; when the value equals 0, all transmissions on the bus have finished, and axi_active is set to 1'b0.
In the above technical solution, the specific steps of step 2 are as follows: step 2.1, the hierarchical state machine is divided into 3 state machines: a parent layer state machine, a read channel sub-layer state machine and a write channel sub-layer state machine, wherein the parent layer state machine comprises 4 states: an initial state, a monitoring state, a monitoring counting state and an analysis state; each sub-layer state machine comprises 3 states: an initial state, a counting (channel monitoring) state and an analysis state.
In the above technical solution, step 2 further includes the following steps: step 2.2, the parent layer state machine generates two sub-layer state machines, one for real-time read and one for real-time write, according to the occupation of the read/write channels; when the bus is occupied, the corresponding sub-layer state machine is entered from the parent layer state machine, and when the bus is idle, control returns from the corresponding sub-layer state machine to the parent layer state machine; step 2.3, when the read channel occupation signal is 1, the sub-layer state machine serving as the read channel real-time state machine is started, and when the read channel occupation signal is 0, the read channel real-time state machine is exited; when the write channel occupation signal is 1, the sub-layer state machine serving as the write channel real-time state machine is started, and when the write channel occupation signal is 0, the write channel real-time state machine is exited.
In the above technical solution, the specific steps of step 3 are as follows: step 3.1, performing AXI bus global performance analysis under the parent layer state machine: recording the number of clock cycles from entering the monitoring state until exiting the monitoring counting state; recording the number of bus-occupied cycles and the number of transmission cycles in the monitoring counting state, the difference between the two being the total number of delay cycles; recording the transmission length signal when an address is valid, a transmission length equal to 1 being a discrete transmission and a transmission length greater than 1 being a continuous transmission; and recording the data strobe signal when data is valid, the transmission being a narrow-band transmission when the strobe is not fully asserted; step 3.2, performing AXI bus real-time performance analysis under the sub-layer state machine: recording the number of transactions and the number of valid addresses from entering the counting state to exiting it, the number of valid addresses being the outstanding transaction count; and recording the number of counting-state cycles and the number of transmission cycles, the difference between the two being the number of delay cycles of the current transaction.
In the above technical solution, the specific steps of step 3.1 are as follows:
step 3.1.1, the parent layer state machine jumps to the monitoring counting state when the bus occupation signal is 1;
step 3.1.2, in the monitoring counting state the parent layer state machine records the number of clock cycles and the number of transmission cycles, which are used to analyze the bus occupancy rate and the average delay; when an address is valid, the transmission length signal awlen/arlen is recorded: a value equal to 0 means the transaction at that address has only one transmission, that is, one discrete transmission, and a value greater than 0 means the transaction is a burst, that is, a continuous transmission with the corresponding number of transmissions; when data is valid, the write channel strobe signal wstrb is recorded, and whether the full data width is valid is judged from this signal and the data bit width; if not, the transmission is a narrow-band transmission;
step 3.1.3, when the bus occupation signal is 1, the transaction has not ended and the monitoring counting state is kept; when the bus occupation signal is 0, the transaction has ended and the parent layer state machine jumps to the monitoring state;
step 3.1.4, no unfinished transaction exists on the bus; the state machine waits for the next transaction to arrive, and no counter is cleared;
step 3.1.5, when the bus occupation signal is 1, a new transaction has arrived and the state machine jumps from the monitoring state to the monitoring counting state; when the bus occupation signal is 0, no new transaction has arrived and the state is kept unchanged;
step 3.1.6, which follows from step 3.1.2: when the monitoring stop signal is 0, the monitoring counting state is kept, and when the monitoring stop signal is 1, the monitoring counting state ends;
step 3.1.7, the parent layer state machine jumps to the analysis state;
step 3.1.8, the bus occupancy rate is obtained from the clock cycle counts; the average delay of each transmission is obtained from the number of clock cycles and the number of transmission cycles; the ratio of discrete to continuous transmissions is obtained from the discrete and continuous transmission counts. Continuous transmission can greatly reduce the average delay, so if there are too many discrete transmissions and the bus performance is insufficient, the bus can be optimized by, for example, merging combinable discrete transmissions into continuous transmissions. The narrow-band transmission ratio is obtained from the number of narrow-band transmissions; when this ratio is large, the bus data bit width may not be the best fit for the current use environment, and the bus can be optimized by adjusting the bit width in light of the required bandwidth.
In the above technical solution, the specific steps of step 3.2 are as follows:
step 3.2.1, the sub-layer state machine jumps to the channel monitoring state when the channel occupation signal is 1;
step 3.2.2, real-time analysis of the current transaction starts, and the number of clock cycles and the number of transmission cycles are recorded for analyzing the real-time delay;
step 3.2.3, while these parameters are recorded, the number of delay cycles is calculated as the difference between the number of clock cycles and the number of transmission cycles; the address channel is monitored at the same time: because an unfinished transaction already exists on the channel, a valid address at this point indicates a concurrent (outstanding) transaction, so outstanding = outstanding + 1'b1 is recorded, the initial value of outstanding being 1;
step 3.2.4, when the channel occupation signal is 1, the channel monitoring state is kept, and when the channel occupation signal is 0, the channel monitoring state ends;
step 3.2.5, the sub-layer state machine jumps to the channel analysis state;
step 3.2.6, the average delay of each transmission in the current transaction is calculated from the number of delay cycles and the number of transmission cycles;
step 3.2.7, the average delay and the outstanding count of the current transaction are compared with the maximum delay and the maximum outstanding count, and the larger values are assigned to the maximum delay and the maximum outstanding count, both of which default to 0. The maximum delay is the worst-path delay over all transmissions and is an important index of whether the bus meets the performance requirement. The maximum outstanding count is an important index of whether the bus performance matches the modules attached to the bus: if the outstanding capability of a module is larger than that of the bus, the bus cannot handle the concurrency when the module issues concurrent transactions, transactions are blocked at the module interface, system performance drops, and the bus becomes the performance bottleneck of the whole system. Calculating the maximum outstanding count of the bus therefore identifies the performance bottleneck and the optimization direction for the bus design.
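The bookkeeping of steps 3.2.6 and 3.2.7 can be sketched in software as follows. This is only an illustrative model, not the claimed hardware: the class name ChannelMaxima, its field names and the sample numbers are hypothetical, and the guard against a zero transmission-cycle count is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class ChannelMaxima:
    """Running maxima kept by one sub-layer (channel) state machine.

    Hypothetical software model; both maxima default to 0 as in step 3.2.7.
    """
    max_delay: float = 0.0
    max_outstanding: int = 0

    def update(self, delay_cycles: int, transfer_cycles: int, outstanding: int) -> float:
        # Step 3.2.6: average delay per transmission in the current transaction.
        # The zero check is an assumption added to avoid dividing by zero.
        avg_delay = delay_cycles / transfer_cycles if transfer_cycles else 0.0
        # Step 3.2.7: keep the larger of the stored and the current values.
        self.max_delay = max(self.max_delay, avg_delay)
        self.max_outstanding = max(self.max_outstanding, outstanding)
        return avg_delay


maxima = ChannelMaxima()
first = maxima.update(delay_cycles=3, transfer_cycles=4, outstanding=2)
second = maxima.update(delay_cycles=2, transfer_cycles=8, outstanding=3)
print(first, second, maxima.max_delay, maxima.max_outstanding)
```

In this sample the second transaction has a lower average delay but a higher outstanding count, so only the maximum outstanding count is updated.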
The invention also provides an AXI bus performance analysis device based on a hierarchical state machine, which comprises the following parts: a bus occupation analysis module, which uses the number of transmissions present on the AXI bus as the criterion for bus occupation and obtains a bus occupation signal, a read channel occupation signal and a write channel occupation signal; a hierarchical state machine setting module, which sets a hierarchical state machine according to the occupation of the read channel and the write channel, the hierarchical state machine comprising one parent layer state machine and two sub-layer state machines; and a bus performance analysis module, in which the parent layer state machine serves as the global state machine and analyzes the global performance of the AXI bus, and the sub-layer state machines serve as the real-time state machines of the read and write channels and analyze the real-time performance of the AXI read and write channels.
In the above technical solution, the hierarchical state machine setting module includes the following parts: a parent layer state machine setting unit: the parent layer state machine consists of 4 states, namely an initial state, a monitoring state, a monitoring counting state and an analysis state; the initial state is the default state and uses no counter, and when the monitoring start signal is 1 the state machine jumps to the monitoring state; the monitoring state uses the global clock cycle counter and counts no performance parameters, and when the bus occupation signal is 1, a transaction has been issued to the bus and the state machine jumps to the monitoring counting state; the monitoring counting state uses the global performance parameter counters, and the state machine jumps to the monitoring state when the bus occupation signal is 0 and to the analysis state when the monitoring stop signal is 1; the analysis state exports all parameters to accessible registers, and when the analysis end signal is 1 the state machine jumps to the initial state; a sub-layer state machine setting unit: each sub-layer state machine consists of 3 states, namely a channel initial state, a channel monitoring state and a channel analysis state; the channel initial state is the default state and uses no counter, and when the channel occupation signal is 1 the state machine jumps to the channel monitoring state; the channel monitoring state uses the real-time performance parameter counters, and when the channel occupation signal is 0 the state machine jumps to the channel analysis state; the channel analysis state is similar to the analysis state of the parent layer: each parameter is exported to an accessible register, and when the analysis end signal is 1 the state machine jumps to the channel initial state.
The AXI bus performance analysis method and device based on the hierarchical state machine have the following beneficial effects:
The number of transmissions present on the AXI bus is used as the criterion for bus occupation, which provides an analysis environment for the burst transmission and OT (outstanding) performance parameters. A hierarchical state machine comprising one parent layer state machine and two sub-layer state machines is generated according to the occupation of the read and write channels, which makes independent performance analysis of the read and write channels possible: the parent layer state machine serves as the global state machine and analyzes the global performance of the AXI bus, and the sub-layer state machines serve as the real-time state machines of the read and write channels and analyze the real-time performance of the AXI read and write channels. This arrangement achieves comprehensive and accurate analysis of AXI bus performance and solves the problem of analyzed parameters diverging from actual behavior because some characteristics cannot be analyzed or transmissions may be lost. The performance bottleneck can therefore be located accurately, analysis efficiency is improved, and the chip design cycle is shortened.
Detailed Description
The invention is described in further detail below with reference to the following figures and examples, which should not be construed as limiting the invention.
The AXI bus performance analysis parameters addressed by the invention comprise the global performance analysis parameters, namely the average delay, the bus occupancy rate, the discrete/continuous transmission ratio and the narrow-band transmission ratio, and the real-time performance analysis parameters, namely the average/maximum outstanding count of a single transaction and the maximum delay.
Referring to the flowchart shown in fig. 1, the invention first provides an AXI bus performance analysis method based on a hierarchical state machine, which includes the following steps. Step 101, bus occupation analysis: the number of transmissions present on the AXI bus is used as the criterion for bus occupation, and a bus occupation signal, a read channel occupation signal and a write channel occupation signal are obtained. Step 102, generating the hierarchical state machine: a hierarchical state machine comprising one parent layer state machine and two sub-layer state machines is generated according to the occupation of the read and write channels. Step 103, bus performance analysis: the parent layer state machine serves as the global state machine and analyzes the global performance of the AXI bus, and the sub-layer state machines serve as the real-time state machines of the read and write channels and analyze the real-time performance of the AXI read and write channels.
In its analysis method, the invention uses the number of transmissions on the bus to generate the occupation signal that starts the counting of performance parameters, whereas the prior art treats the issuing of transaction information as the indication that a transaction has started. Because it is based on the number of transmissions on the bus, the invention avoids the problem of lost transmissions, is unaffected by changes in bus behavior caused by different characteristics, and obtains an accurate occupation signal for analyzing bus performance.
As shown in fig. 2, step 101 comprises the following steps:
step 1011, judging the bus to be occupied at any moment when the number of transactions present on the bus is not 0, and judging the bus to be idle at any moment when that number is 0;
step 1012, monitoring the four channels of the AXI bus (write address, write response, read address and read data), recording the number of unfinished transmissions of the read/write channel when the address on the read address/write address channel is valid, and recording the number of finished transmissions of the read/write channel when the read data/write response is valid, wherein the difference between the unfinished transmission number and the finished transmission number is the number of transmissions present on the bus; when this number is 0 the bus is not occupied, and when it is not 0 the bus is occupied.
Using a hierarchical state machine separates the read and write channels more completely for analysis and allows transaction characteristics to be analyzed more flexibly. Modules with different functions have different read and write performance requirements, so analyzing the channels separately is more useful for the subsequent bus design. As for transaction characteristics, when OT (outstanding) is 1 the bus transactions are serial, and when OT is greater than 1 the bus transactions are parallel, so the number of transaction cycles is less than the accumulated cycles of the individual transmissions. OT has to be analyzed while the bus is occupied, and a single-level state machine cannot keep recording the global parameters while it analyzes OT. A hierarchical state machine is therefore adopted: the parent layer state machine serves as the global state machine and analyzes the global performance of the AXI bus, and the sub-layer state machines serve as the real-time state machines of the read and write channels and analyze the real-time performance of the AXI read and write channels.
As shown in fig. 3, the bus occupation analysis flow chart includes the following steps:
In step 201, when the configured monitoring start signal is 1, the parent layer state machine enters the monitoring state.
In step 202, the write address channel signals awready and awvalid, the write response channel signals bready and bvalid, the read address channel signals arready and arvalid, and the read data channel signals rready and rvalid are monitored, and the number of unfinished transmissions on the current read and write channels is analyzed from these 4 groups of 8 signals.
In step 203, when awready & awvalid is 1, the address on the write address channel is valid; the write address channel transmission length signal awlen is monitored and the write channel unfinished transmission number is recorded as wsignal0 = wsignal0 + (awlen + 1'b1), the initial value of wsignal0 being 0.
In step 204, when bready & bvalid is 1, the response on the write response channel is valid; if wsignal0 is not 0, the write channel finished transmission number is recorded as wsignal1 = wsignal1 + 1'b1; if wsignal0 is 0, the response belongs to a write transmission left unfinished on the bus from before monitoring began, and it is not counted as a finished write transmission.
In step 205, when arready & arvalid is 1, the address on the read address channel is valid; the read address channel transmission length signal arlen is monitored and the read channel unfinished transmission number is recorded as rsignal0 = rsignal0 + (arlen + 1'b1), the initial value of rsignal0 being 0.
In step 206, when rready & rvalid is 1, the data on the read data channel is valid; if rsignal0 is not 0, the read channel finished transmission number is recorded as rsignal1 = rsignal1 + 1'b1; if rsignal0 is 0, the data belongs to a read transmission left unfinished on the bus from before monitoring began, and it is not counted as a finished read transmission.
In step 207, wsignal0 - wsignal1 is computed: when the value is greater than 0, unfinished transmissions exist on the write channel, and the write channel occupation signal w_active is set to 1'b1; when the value equals 0, all transmissions on the write channel have finished, and w_active is set to 1'b0; when the value is less than 0, unfinished concurrent write transactions may have existed on the bus before monitoring began, and both transmission counters are cleared.
In step 208, rsignal0 - rsignal1 is computed: when the value is greater than 0, unfinished transmissions exist on the read channel, and the read channel occupation signal r_active is set to 1'b1; when the value equals 0, all transmissions on the read channel have finished, and r_active is set to 1'b0; when the value is less than 0, unfinished concurrent read transactions may have existed on the bus before monitoring began, and both transmission counters are cleared.
In step 209, (wsignal0 + rsignal0) - (wsignal1 + rsignal1) is computed: when the value is greater than 0, unfinished transmissions exist on the bus, and the bus occupation signal axi_active is set to 1'b1; when the value equals 0, all transmissions on the bus have finished, and axi_active is set to 1'b0.
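The counter rules of steps 203 to 209 can be modeled in software as in the sketch below. It is a behavioral illustration only, not the claimed circuit: the names ChannelCounter, on_address, on_completion and bus_active are hypothetical, and the counters follow the counting rules exactly as stated in the text. The example drives the read channel; the write channel would use the same object with awlen and the bready & bvalid handshake.

```python
from dataclasses import dataclass


@dataclass
class ChannelCounter:
    """Issued/finished transmission counters for one channel (hypothetical model)."""
    issued: int = 0     # plays the role of wsignal0 / rsignal0
    finished: int = 0   # plays the role of wsignal1 / rsignal1

    def on_address(self, length_field: int) -> None:
        # Address handshake (ready & valid): add awlen/arlen + 1 transmissions.
        self.issued += length_field + 1

    def on_completion(self) -> None:
        # Completion handshake. Ignored while nothing has been issued, because
        # it then belongs to traffic that started before monitoring began.
        if self.issued != 0:
            self.finished += 1

    def active(self) -> int:
        diff = self.issued - self.finished
        if diff < 0:
            # Pre-monitoring traffic skewed the counters: clear both.
            self.issued = 0
            self.finished = 0
            return 0
        return 1 if diff > 0 else 0


def bus_active(write: ChannelCounter, read: ChannelCounter) -> int:
    # Step 209: (wsignal0 + rsignal0) - (wsignal1 + rsignal1) > 0
    diff = (write.issued + read.issued) - (write.finished + read.finished)
    return 1 if diff > 0 else 0


read, write = ChannelCounter(), ChannelCounter()
read.on_completion()            # stray beat from before monitoring: not counted
read.on_address(3)              # arlen = 3, so 4 transmissions are expected
r_active = [read.active()]
for _ in range(4):              # four rready & rvalid handshakes
    read.on_completion()
    r_active.append(read.active())
print(r_active, bus_active(write, read))
```

The read channel occupation value stays at 1 until the fourth data handshake brings the difference back to 0.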
As shown in the hierarchical state machine transition diagram of fig. 4, the hierarchical state machine consists of 3 state machines: one parent layer state machine and two sub-layer state machines, namely a read channel sub-layer state machine and a write channel sub-layer state machine. The parent layer state machine comprises 4 states: an initial state, a monitoring state, a monitoring counting state and an analysis state; each sub-layer state machine comprises 3 states: an initial state, a counting (channel monitoring) state and an analysis state. The parent layer state machine analyzes the global parameters and may also be called the global state machine; the sub-layer state machines analyze the real-time parameters and are divided into a read channel real-time state machine and a write channel real-time state machine.
The parent layer state machine generates two sub-layer state machines, one for real-time read and one for real-time write, according to the occupation of the read/write channels; when the bus is occupied, the corresponding sub-layer state machine is entered from the parent layer state machine, and when the bus is idle, control returns from the corresponding sub-layer state machine to the parent layer state machine.
When the read channel occupation signal is 1, the sub-layer state machine (read channel real-time state machine) is started, and when the read channel occupation signal is 0, it is exited; when the write channel occupation signal is 1, the sub-layer state machine (write channel real-time state machine) is started, and when the write channel occupation signal is 0, it is exited.
The parent layer state machine consists of 4 states, namely an initial state, a monitoring state, a monitoring counting state and an analysis state. The initial state is the default state and uses no counter; when the monitoring start signal is 1, the state machine jumps to the monitoring state. The monitoring state uses the global clock cycle counter and counts no performance parameters; when the bus occupation signal is 1, a transaction has been issued to the bus and the state machine jumps to the monitoring counting state. The monitoring counting state uses the global performance parameter counters; the state machine jumps to the monitoring state when the bus occupation signal is 0 and to the analysis state when the monitoring stop signal is 1. The analysis state exports the parameters to accessible registers and jumps to the initial state when the analysis end signal is 1.
The sublayer state machine consists of 3 states, namely a channel initial state, a channel monitoring state and a channel analysis state; the initial state of the channel is used as a default state, no counter is used, and when the channel occupation signal is 1, the channel monitoring state is jumped to; in the channel monitoring state, a real-time performance parameter counter is used, and when a channel occupation signal is 0, a channel analysis state is jumped to; the channel analysis state is similar to the analysis state of the parent layer, each parameter is led out to an accessible register for use, and when an analysis end signal is 1, the channel analysis state jumps to the channel initial state.
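The transitions described above can be sketched as two next-state functions. This is a minimal software illustration under stated assumptions: the enum and function names are hypothetical, the monitoring stop signal is taken as active-high (like the monitoring start and analysis end signals), and giving the stop signal priority over the bus occupation signal in the monitoring counting state is an assumption, since the text does not state a priority.

```python
from enum import Enum, auto


class ParentState(Enum):
    INITIAL = auto()
    MONITOR = auto()
    MONITOR_COUNT = auto()
    ANALYSIS = auto()


class ChannelState(Enum):
    INITIAL = auto()
    MONITOR = auto()
    ANALYSIS = auto()


def parent_next(state, start, axi_active, stop, analysis_end):
    """Next state of the parent (global) state machine for one clock cycle."""
    if state is ParentState.INITIAL:
        return ParentState.MONITOR if start else state
    if state is ParentState.MONITOR:
        return ParentState.MONITOR_COUNT if axi_active else state
    if state is ParentState.MONITOR_COUNT:
        if stop:  # assumed to take priority over axi_active
            return ParentState.ANALYSIS
        return state if axi_active else ParentState.MONITOR
    return ParentState.INITIAL if analysis_end else state  # ANALYSIS


def channel_next(state, ch_active, analysis_end):
    """Next state of a read or write sub-layer state machine."""
    if state is ChannelState.INITIAL:
        return ChannelState.MONITOR if ch_active else state
    if state is ChannelState.MONITOR:
        return state if ch_active else ChannelState.ANALYSIS
    return ChannelState.INITIAL if analysis_end else state  # ANALYSIS


# Inputs per cycle: (start, axi_active, stop, analysis_end)
stimulus = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 0, 0),
            (0, 1, 0, 0), (0, 1, 1, 0), (0, 0, 0, 1)]
state = ParentState.INITIAL
trace = []
for inputs in stimulus:
    state = parent_next(state, *inputs)
    trace.append(state.name)
print(trace)
```

The stimulus walks the parent state machine through one idle gap between two transactions before the stop signal sends it to the analysis state.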
As shown in fig. 5, the specific steps of step 103 are as follows:
step 1031, performing AXI bus global performance analysis under the parent layer state machine: recording the number of clock cycles from entering the monitoring state until exiting the monitoring counting state; recording the number of bus-occupied cycles and the number of transmission cycles in the monitoring counting state, the difference between the two being the total number of delay cycles; recording the transmission length signal when an address is valid, a transmission length equal to 1 being a discrete transmission and a transmission length greater than 1 being a continuous transmission; and recording the data strobe signal when data is valid, the transmission being a narrow-band transmission when the strobe is not fully asserted;
step 1032, performing AXI bus real-time performance analysis under the sub-layer state machine: recording the number of transactions and the number of valid addresses from entering the counting state to exiting it, the number of valid addresses being the outstanding transaction count; and recording the number of counting-state cycles and the number of transmission cycles, the difference between the two being the number of delay cycles of the current transaction.
As shown in the flowchart of fig. 6, in step 301, the parent layer state machine jumps to the monitoring counting state when the bus occupation signal is 1.
In step 302, the parent layer state machine is in the monitoring counting state and records the number of clock cycles and the number of transmission cycles, which are used to analyze the bus occupancy rate and the average delay. When an address is valid, the transmission length signal awlen/arlen is recorded: a value equal to 0 means the transaction at that address has only one transmission, that is, one discrete transmission, and a value greater than 0 means the transaction is a burst, that is, a continuous transmission with the corresponding number of transmissions. When data is valid, the write channel strobe signal wstrb is recorded, and whether the full data width is valid is judged from this signal and the data bit width; if not, the transmission is a narrow-band transmission.
In step 303, when the bus occupation signal is 1, the transaction is not ended yet, the monitoring counting state is continuously maintained, and when the bus occupation signal is 0, the transaction is ended, and the parent-layer state machine jumps to the monitoring state.
In step 304, there are no outstanding transactions on the current bus, waiting for the next transaction to arrive, and all counters are not cleared.
In step 305, when the bus occupation signal is 1, a new transaction comes, the state machine jumps from the monitoring state to the monitoring counting state, and when the bus occupation signal is 0, no new transaction comes, and the state machine remains unchanged.
In step 306, which follows from step 302, the monitoring counting state is maintained while the monitoring stop signal is 0, and is ended when the monitoring stop signal is 1.
In step 307, the parent state machine jumps to the analysis state.
In step 308, the bus occupancy rate can be obtained by analyzing the number of clock cycles; the average delay of each transmission can be analyzed according to the clock period number and the transmission period number; the discrete continuous transmission ratio can be obtained by analyzing according to the discrete and continuous transmission numbers, the average delay can be greatly reduced by continuous transmission, and if too much discrete transmission exists and the bus performance is insufficient, the bus performance can be optimized by converting the combinable discrete transmission into the continuous transmission and the like; the narrow-band transmission ratio can be obtained through analysis according to the transmission number of the narrow-band transmission, when the narrow-band transmission ratio is large, the bus data bit width is not necessarily most suitable for the current use environment, and the bus performance can be optimized by combining the required bandwidth and bit width adjustment.
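The quantities derived in step 308 can be sketched as follows. The counter names and the dataclass are hypothetical, and two points are assumptions rather than statements of the source: each data transmission is taken to occupy exactly one transmission cycle, and the narrow-band ratio is taken over the recorded write transmissions.

```python
from dataclasses import dataclass


@dataclass
class GlobalCounters:
    clock_cycles: int         # cycles from entering monitoring to leaving monitoring counting
    occupied_cycles: int      # cycles with the bus occupation signal at 1
    transmission_cycles: int  # cycles in which data was actually transferred
    discrete_count: int       # transactions with a transmission length of 1
    continuous_count: int     # transactions with a transmission length above 1
    narrow_count: int         # write transmissions whose strobe was not all ones
    write_transmissions: int  # all recorded write transmissions


def _ratio(numerator: int, denominator: int) -> float:
    # Guard against empty measurement windows.
    return numerator / denominator if denominator else 0.0


def analyze_global(c: GlobalCounters) -> dict:
    total_delay = c.occupied_cycles - c.transmission_cycles
    transactions = c.discrete_count + c.continuous_count
    return {
        "occupancy": _ratio(c.occupied_cycles, c.clock_cycles),
        "total_delay_cycles": total_delay,
        "avg_delay_per_transmission": _ratio(total_delay, c.transmission_cycles),
        "discrete_ratio": _ratio(c.discrete_count, transactions),
        "narrow_ratio": _ratio(c.narrow_count, c.write_transmissions),
    }


if __name__ == "__main__":
    counters = GlobalCounters(1000, 400, 320, 10, 30, 8, 160)
    print(analyze_global(counters))
```

A high discrete ratio or narrow-band ratio in such a report would point to the optimizations mentioned above: merging combinable discrete transmissions, or revisiting the data bit width.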
As shown in the design flowchart of the sub-layer state machine in fig. 7, in step 401, the sub-layer state machine jumps to the channel monitoring state when the channel occupation signal is 1.
In step 402, real-time analysis of the current transaction is initiated, recording the number of clock cycles and the number of transmission cycles for analyzing the real-time delay.
In step 403, while the parameters are being recorded, the number of delay cycles is calculated as the difference between the number of clock cycles and the number of transmission cycles. Meanwhile, the address channel is monitored: because an unfinished transaction already exists on the current channel, a valid address indicates a concurrent transaction, that is, an outstanding transaction; the outstanding count is the number of concurrent transactions, outstanding + 1'b1 is recorded on each such address, and the initial value of outstanding is 1. Outstanding transactions (OT) are a feature that greatly improves bus performance by greatly reducing the waiting delay between transactions. If the channel occupation state generated by the hierarchical state machine is not used, then when a transaction arrives on the bus before the previous transaction has completed, a traditional analysis method loses all concurrent transactions that arrive before the response, and the subsequent counts may even be skewed, so that all parameters lose practical significance.
In step 404, when the channel occupancy signal is 1, the monitoring state is maintained, and when the channel occupancy signal is 0, the monitoring state is ended.
In step 405, the sublayer state machine jumps to the channel analysis state.
In step 406, the average delay of each transmission in the current transaction is calculated according to the number of delay cycles and the transmission cycle count.
In step 407, the average delay and the outstanding count of the current transaction are compared with the maximum delay and the maximum outstanding count, and the larger value in each case is assigned to the maximum delay and the maximum outstanding count, both of which default to 0. The maximum delay is the worst-path delay over all transmissions and is an important index of whether the bus meets the performance requirement. The maximum outstanding count is an important index of whether the bus performance matches the modules attached to the bus: if the outstanding capability of a module is larger than that of the bus, the bus cannot handle the concurrency when the module issues concurrent transactions, the transactions are blocked at the module interface, the system performance drops, and the bus becomes the performance bottleneck of the whole system. Calculating the maximum outstanding count of the bus therefore clearly identifies the performance bottleneck and the optimization direction for the bus design.
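Steps 402 to 407 can be illustrated with a small cycle-driven sketch. The class and method names are hypothetical; the sketch assumes that `tick` is called once per clock cycle while the channel occupation signal is 1, that `addr_valid` marks a further address accepted during that time, and that one data transmission takes one transmission cycle.

```python
class ChannelMonitor:
    """Per-channel real-time counters for one occupied period (illustrative only)."""

    def __init__(self) -> None:
        self.max_delay = 0.0        # default maximum delay is 0
        self.max_outstanding = 0    # default maximum outstanding count is 0
        self._reset()

    def _reset(self) -> None:
        self.clock_cycles = 0
        self.transmission_cycles = 0
        self.outstanding = 1        # initial value of outstanding is 1

    def tick(self, addr_valid: bool, data_transfer: bool) -> None:
        # Steps 402/403: count cycles and detect concurrent transactions.
        self.clock_cycles += 1
        if data_transfer:
            self.transmission_cycles += 1
        if addr_valid:
            self.outstanding += 1

    def finish(self) -> float:
        # Steps 405 to 407: channel occupation dropped to 0, analyze and update maxima.
        delay_cycles = self.clock_cycles - self.transmission_cycles
        avg_delay = (delay_cycles / self.transmission_cycles
                     if self.transmission_cycles else 0.0)
        self.max_delay = max(self.max_delay, avg_delay)
        self.max_outstanding = max(self.max_outstanding, self.outstanding)
        self._reset()
        return avg_delay


if __name__ == "__main__":
    mon = ChannelMonitor()
    # (addr_valid, data_transfer) for six cycles of one occupied period
    for addr, data in [(0, 0), (1, 0), (1, 1), (0, 1), (0, 1), (0, 1)]:
        mon.tick(bool(addr), bool(data))
    print(mon.finish(), mon.max_delay, mon.max_outstanding)
```

Because the counters are reset only when the occupied period ends, addresses that arrive before the first response are still attributed to the same period, which is the behaviour the channel occupation signal is meant to guarantee.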
As shown in fig. 8, the present invention further provides an AXI bus performance analysis apparatus based on a hierarchical state machine, including the following parts:
the bus occupation analysis module: taking the transmission number existing on the AXI bus as an analysis condition of the bus occupation condition to respectively obtain a bus occupation signal, a read channel occupation signal and a write channel occupation signal;
the hierarchical state machine setting module: setting a hierarchical state machine according to the occupation conditions of a read channel and a write channel, wherein the hierarchical state machine comprises a parent layer state machine and two sub-layer state machines;
the bus performance analysis module: the parent layer state machine corresponds to the global state machine to analyze the global performance of the AXI bus, and the sub-layer state machine corresponds to the read-write two-channel real-time state machine to analyze the real-time performance of the AXI read-write channel.
Wherein, the hierarchical state machine setting module comprises the following parts:
parent layer state machine setting unit: the parent layer state machine consists of 4 states, namely an initial state, a monitoring state, a monitoring counting state and an analysis state; the initial state is the default state, in which no counter is used, and the state machine jumps to the monitoring state when the monitoring start signal is 1; in the monitoring state, the global clock cycle counter is used and performance parameters are not counted, and when the bus occupation signal is 1, indicating that a transaction has been sent to the bus, the state machine jumps to the monitoring counting state; in the monitoring counting state, the global performance parameter counters are used, and the state machine jumps to the monitoring state when the bus occupation signal is 0 and to the analysis state when the monitoring stop signal is 1; the analysis state exports all parameters to accessible registers for use, and the state machine jumps to the initial state when the analysis end signal is 1;
sublayer state machine setting unit: the sublayer state machine consists of 3 states, namely a channel initial state, a channel monitoring state and a channel analysis state; the initial state of the channel is used as a default state, no counter is used, and when the channel occupation signal is 1, the channel monitoring state is jumped to; in the channel monitoring state, a real-time performance parameter counter is used, and when a channel occupation signal is 0, a channel analysis state is jumped to; the channel analysis state is similar to the analysis state of the parent layer, each parameter is led out to an accessible register for use, and when an analysis end signal is 1, the channel analysis state jumps to the channel initial state.
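The state transitions of the two setting units above can be sketched as pure next-state functions. The enum and function names are hypothetical. The sketch follows the flowchart description in which the monitoring counting state ends when the monitoring stop signal is 1, and it assumes that the stop signal takes priority over the bus occupation signal, which the source does not state.

```python
from enum import Enum, auto


class Parent(Enum):
    INITIAL = auto()
    MONITOR = auto()
    MONITOR_COUNT = auto()
    ANALYSIS = auto()


class Channel(Enum):
    INITIAL = auto()
    MONITOR = auto()
    ANALYSIS = auto()


def parent_next(state: Parent, start: bool = False, bus_busy: bool = False,
                stop: bool = False, analysis_done: bool = False) -> Parent:
    if state is Parent.INITIAL:
        return Parent.MONITOR if start else state
    if state is Parent.MONITOR:
        return Parent.MONITOR_COUNT if bus_busy else state
    if state is Parent.MONITOR_COUNT:
        if stop:                      # assumed priority: stop wins over bus_busy
            return Parent.ANALYSIS
        return state if bus_busy else Parent.MONITOR
    return Parent.INITIAL if analysis_done else state


def channel_next(state: Channel, channel_busy: bool = False,
                 analysis_done: bool = False) -> Channel:
    if state is Channel.INITIAL:
        return Channel.MONITOR if channel_busy else state
    if state is Channel.MONITOR:
        return state if channel_busy else Channel.ANALYSIS
    return Channel.INITIAL if analysis_done else state


if __name__ == "__main__":
    s = Parent.INITIAL
    for inputs in [{"start": True}, {"bus_busy": True}, {"bus_busy": False},
                   {"bus_busy": True}, {"bus_busy": True, "stop": True},
                   {"analysis_done": True}]:
        s = parent_next(s, **inputs)
        print(s.name)
```

One parent instance and two channel instances (read and write) would run side by side, each driven by its own occupation signal.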
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Matters not described in detail in this specification belong to the prior art known to those skilled in the art.