CN116680146A - Method and device for guaranteeing safe and reliable operation of application software

Publication number
CN116680146A
CN116680146A CN202310628804.5A CN202310628804A CN116680146A CN 116680146 A CN116680146 A CN 116680146A CN 202310628804 A CN202310628804 A CN 202310628804A CN 116680146 A CN116680146 A CN 116680146A
Authority
CN
China
Prior art keywords
data
software
application software
application
level index
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310628804.5A
Other languages
Chinese (zh)
Inventor
刘永伟
潘磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Huachuang Marco Intelligent Control System Co ltd
Original Assignee
Xi'an Huachuang Marco Intelligent Control System Co ltd
Application filed by Xi'an Huachuang Marco Intelligent Control System Co ltd
Priority to CN202310628804.5A
Publication of CN116680146A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a method and a device for guaranteeing safe and reliable operation of application software, and relates to the technical field of software. The method comprises the following steps: acquiring system-level index data and application-level index data while the application software is running; preprocessing the system-level index data and the application-level index data to obtain preprocessed data; extracting, from the preprocessed data, feature data that identify the performance and resource utilization of the system and the application software; taking the extracted feature data and the running state data of the system and the application software as the input of a software performance prediction model; and dynamically adjusting resource allocation based on the output of the software performance prediction model to ensure safe and reliable operation of the application software. The invention can monitor the state of the computer system and the hardware while a specific application software is running, and flexibly adjust the system and hardware resources allocated to that software at run time, thereby ensuring long-term stable operation of the specified application software in the computer system.

Description

Method and device for guaranteeing safe and reliable operation of application software
Technical Field
The invention relates to the technical field of software, in particular to a method and a device for guaranteeing safe and reliable operation of application software.
Background
In today's information technology age, software is used in an ever wider range of fields, and its safe and stable operation has become a major concern for software developers and users. However, because software and hardware systems are complex, it is often difficult to ensure that software runs stably over the long term. Problems such as resource exhaustion, overload, and abnormal software operation can lead to serious consequences such as system crashes and data loss, and can seriously disrupt normal work and development across industries.
To ensure stable operation of software, various methods have been proposed, including software testing, debugging, and monitoring. Traditional methods often depend on manual monitoring and intervention, which is inefficient and prone to human error. Accordingly, various automated monitoring and adjustment methods have been proposed in recent years to address these issues.
One example is fixed resource allocation based on a static configuration, in which resources are allocated according to a set of predetermined rules or thresholds. For instance, a rule may allocate more resources to an application when its resource usage exceeds a certain threshold, or allocate more memory when the application's memory usage reaches a certain level. These rules are typically preset and are not adjusted dynamically as conditions change. When a problem is detected, IT professionals must adjust the resource allocation or perform maintenance to resolve it. This approach is time-consuming, costly, and error-prone, because IT professionals must continually monitor and adjust system resources to keep everything running, which is especially problematic for systems that are constantly changing and growing. Furthermore, it fails to address problems caused by hardware anomalies, which can affect software performance and stability. Manual monitoring and maintenance are prone to human error: even the most experienced IT professionals may overlook small problems that eventually lead to more serious ones.
In addition, many schemes in the art use virtualization together with rule-based monitoring and adjustment to ensure reliable operation of software. Virtualization is a technique that allows multiple virtual machines to run on a single physical machine, each running its own operating system and applications in isolation from the others on the same host. In these schemes, virtual machines are created to run the software and isolate it from the underlying hardware, and rules are then set for monitoring and adjusting the system and hardware resources allocated to the software, so that the software runs under fixed resources and rules. The rules are based on specific thresholds or conditions, such as CPU usage, memory usage, or network bandwidth; if the application's usage exceeds these thresholds or conditions, the system adjusts the allocated resources. However, virtualization can degrade system performance, because creating and managing virtual machines adds overhead and increases the load on the system. Setting up and managing a virtualized environment can be complex and requires expertise, which can increase the cost of recruiting and training IT professionals. Virtualization is also not applicable to all hardware, applications, or scenarios: some hardware does not support virtualization or lacks sufficient resources to run multiple virtual machines.
It can be seen that existing methods of ensuring safe and reliable operation of software are not always effective, as they rely on manual intervention and can be time-consuming and costly. Furthermore, they cannot solve problems caused by hardware anomalies.
Disclosure of Invention
In view of the above, the present invention provides a method and apparatus for ensuring safe and reliable running of application software, so as to solve at least one of the above-mentioned problems.
In order to achieve the above purpose, the present invention adopts the following scheme:
according to a first aspect of the present invention, there is provided a method of ensuring safe and reliable operation of application software, the method comprising: acquiring system-level index data and application-level index data in the running process of application software; performing data preprocessing on the system-level index data and the application-level index data to obtain preprocessed data; extracting feature data for identifying system and application software performance and resource utilization rate from the preprocessing data; taking the extracted characteristic data, the system and the running state data of the application software as the input of a software performance prediction model; and dynamically adjusting resource allocation based on the output of the software performance prediction model to ensure safe and reliable operation of the application software.
As an embodiment of the present invention, the system level index data in the above method includes: CPU usage, memory usage, network usage, and disk I/O, the application level index data comprising: application response time, error rate, and transaction throughput.
As an embodiment of the present invention, the data preprocessing in the method includes: using a moving average calculation, calculating the average over a window of data points to smooth the data, thereby reducing noise in the data and highlighting trends; and weighting the most recent data points using an exponential moving average algorithm, so that recent trends in the data carry more weight.
As an embodiment of the present invention, extracting feature data for identifying system and application software performance and resource utilization from the preprocessing data in the above method includes: calculating the mean value, standard deviation and variance of the preprocessed data to obtain statistical features, wherein the statistical features are used for capturing the distribution and change of the data along with time; calculating power spectral density and Fourier coefficient from the preprocessed data to obtain frequency domain characteristics, wherein the frequency domain characteristics are used for representing the periodic characteristics and frequency characteristics of related systems, hardware and software carried in the original data, which change in the running process; computing autocorrelation and cross-correlation data from the pre-processing data to obtain time domain features identifying causal relationships between system resources, hardware loading and software performance and possible hysteresis effects; and carrying out wavelet transformation calculation on the preprocessed data to obtain transformation characteristics, wherein the transformation characteristics are used for identifying running states and anomalies under multiple scales.
As one embodiment of the invention, the software performance prediction model in the method is a machine learning model trained by utilizing a random forest algorithm based on historical characteristic data, historical running state data of a system and application software.
As an embodiment of the present invention, dynamically adjusting the resource allocation based on the output of the software performance prediction model in the above method to ensure safe and reliable operation of the application software includes: allocating system resources based on the predicted resource usage output by the software performance prediction model and the current resource usage in the system-level index data, wherein the allocation formula is as follows: allocation amount = (predicted resource usage − current resource usage) / total amount of available resources.
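The allocation formula can be written as a one-line function. The sketch below is illustrative only: the function name and the units are assumptions, and reading a negative result as resources that could be released is an interpretation that the text does not state.

```python
def allocation_amount(predicted_usage, current_usage, total_available):
    """Allocation amount as given by the formula in the text.

    All three arguments must be in the same unit (for example GiB of memory).
    The result is the predicted change in usage expressed as a fraction of
    the total available resources.
    """
    if total_available <= 0:
        raise ValueError("total_available must be positive")
    return (predicted_usage - current_usage) / total_available


# Hypothetical values: 6 GiB predicted, 4 GiB in use, 16 GiB available.
share = allocation_amount(6.0, 4.0, 16.0)
```

With these hypothetical values the function returns 0.125, that is, the predicted increase corresponds to one eighth of the available resources.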
According to a second aspect of the present invention, there is provided an apparatus for ensuring safe and reliable operation of application software, the apparatus comprising: an index data acquisition unit for acquiring system-level index data and application-level index data while the application software is running; a preprocessing unit for preprocessing the system-level index data and the application-level index data to obtain preprocessed data; a feature extraction unit for extracting, from the preprocessed data, feature data that identify the performance and resource utilization of the system and the application software; an input unit for taking the extracted feature data and the running state data of the system and the application software as the input of a software performance prediction model; and a dynamic adjustment unit for dynamically adjusting the resource allocation based on the output of the software performance prediction model so as to ensure safe and reliable operation of the application software.
As an embodiment of the present invention, the system level index data in the above apparatus includes: CPU usage, memory usage, network usage, and disk I/O, the application level index data comprising: application response time, error rate, and transaction throughput.
As an embodiment of the present invention, the preprocessing unit in the above apparatus includes: a moving average calculation module for calculating the average over a window of data points using a moving average calculation, to smooth the data, reduce noise in the data, and highlight trends; and an exponential moving average calculation module for weighting the most recent data points using an exponential moving average algorithm, so that recent trends in the data carry more weight.
As an embodiment of the present invention, the feature extraction unit in the above-described apparatus includes: the statistical feature extraction module is used for carrying out mean value, standard deviation and variance calculation on the preprocessed data to obtain statistical features, and the statistical features are used for capturing the distribution and change of the data along with time; the frequency domain feature extraction module is used for calculating power spectral density and Fourier coefficient from the preprocessed data to obtain frequency domain features, wherein the frequency domain features are used for representing the periodic characteristics and the frequency characteristics of related systems, hardware and software carried in the original data, which change in the running process; a time domain feature extraction module for calculating autocorrelation and cross-correlation data from the preprocessed data to obtain time domain features for identifying causal relationships between system resources, hardware loading and software performance and possible hysteresis effects; and the transformation feature extraction module is used for carrying out wavelet transformation calculation on the preprocessed data to obtain transformation features, and the transformation features are used for identifying running states and anomalies under multiple scales.
As one embodiment of the invention, the software performance prediction model in the device is a machine learning model trained by utilizing a random forest algorithm based on historical characteristic data, historical running state data of a system and application software.
As an embodiment of the present invention, the dynamic adjustment unit in the above apparatus is specifically configured to: allocate system resources based on the predicted resource usage output by the software performance prediction model and the current resource usage in the system-level index data, wherein the allocation formula is as follows: allocation amount = (predicted resource usage − current resource usage) / total amount of available resources.
According to a third aspect of the present invention there is provided an electronic device comprising a memory, a processor and a computer program stored on said memory and executable on said processor, the processor implementing the steps of the above method when executing said computer program.
According to a fourth aspect of the present invention there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the above method.
According to a fifth aspect of the present invention there is provided a computer program product comprising computer programs/instructions which when executed by a processor implement the steps of the above method.
According to the technical scheme, the method and the device for guaranteeing safe and reliable operation of the application software can monitor the states of the computer system and the hardware when the specific application software operates, and flexibly adjust the system and the hardware resources allocated to the software operation when the software operates, so that long-term stable operation of the specific application software in the computer system can be guaranteed.
Drawings
In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. In the drawings:
FIG. 1 is a schematic flow chart of a method for guaranteeing safe and reliable operation of application software according to an embodiment of the present application;
FIG. 2 is a schematic flow chart of a method for preprocessing index data according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of feature data extraction according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a device for guaranteeing safe and reliable operation of application software according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a preprocessing unit according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a feature extraction unit according to an embodiment of the present application;
FIG. 7 is a schematic block diagram of a system configuration of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings. The exemplary embodiments of the present application and their descriptions herein are for the purpose of explaining the present application, but are not to be construed as limiting the application.
Fig. 1 is a schematic flow chart of a method for guaranteeing safe and reliable operation of application software according to an embodiment of the present application, where the method includes the following steps:
Step S101: acquire system-level index data and application-level index data while the application software is running.
In this embodiment, the system-level index data are index data of the environment in which the application software runs, that is, of the running system, and the application-level index data are index data of the application software itself while it is running.
Preferably, the system level index data in the present embodiment includes, but is not limited to, the following indexes: CPU usage, memory usage, network usage, and disk I/O.
The system-level index data can be computed with the following formulas:
(1) CPU usage
CPU usage = 100 × (1 − (idle_time / total_time));
where idle_time is the time the CPU was idle and total_time is the total time elapsed.
(2) Memory usage
Memory usage = 100 × (1 − (free_memory / total_memory));
where free_memory is the amount of free memory available and total_memory is the total amount of memory installed.
(3) Network usage (network traffic)
Network traffic = (bytes_sent + bytes_received) / time_elapsed;
where bytes_sent and bytes_received are the amounts of data sent and received over the network, respectively, and time_elapsed is the time elapsed since the last measurement.
(4) Disk I/O
Disk I/O = (read_bytes + write_bytes) / time_elapsed;
where read_bytes and write_bytes are the amounts of data read from and written to disk, respectively, and time_elapsed is the time elapsed since the last measurement.
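The four formulas can be sketched in code as follows. This is an illustration under stated assumptions, not the embodiment's implementation: the `Sample` record and its field layout are hypothetical, the counters are assumed to be cumulative, and the CPU formula is applied to the difference between two samples so that idle_time and total_time cover the same interval.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Raw counters read at one instant (hypothetical record layout)."""
    idle_time: float      # cumulative CPU idle time, in seconds
    total_time: float     # cumulative CPU time, in seconds
    free_memory: int      # bytes currently free
    total_memory: int     # bytes installed
    bytes_sent: int       # cumulative network bytes sent
    bytes_received: int   # cumulative network bytes received
    read_bytes: int       # cumulative bytes read from disk
    write_bytes: int      # cumulative bytes written to disk
    timestamp: float      # time of the reading, in seconds


def system_metrics(prev: Sample, cur: Sample) -> dict:
    """Apply the four formulas to the change between two samples."""
    time_elapsed = cur.timestamp - prev.timestamp
    cpu_total = cur.total_time - prev.total_time
    if time_elapsed <= 0 or cpu_total <= 0:
        raise ValueError("samples must be taken at increasing times")
    cpu_idle = cur.idle_time - prev.idle_time
    net_bytes = (cur.bytes_sent - prev.bytes_sent) + (
        cur.bytes_received - prev.bytes_received)
    disk_bytes = (cur.read_bytes - prev.read_bytes) + (
        cur.write_bytes - prev.write_bytes)
    return {
        "cpu_usage": 100 * (1 - cpu_idle / cpu_total),                   # percent
        "memory_usage": 100 * (1 - cur.free_memory / cur.total_memory),  # percent
        "network_traffic": net_bytes / time_elapsed,                     # bytes/s
        "disk_io": disk_bytes / time_elapsed,                            # bytes/s
    }
```

How the raw counters are read depends on the operating system and is outside the scope of this sketch.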
Further preferably, the application level index data in the present embodiment includes, but is not limited to, the following indexes: application response time, error rate, and transaction throughput.
The system-level metrics above describe the overall performance and resource usage of the computer system. They may be collected at a fixed interval, such as every second or every minute, to capture dynamic changes in system performance. The application-level index data, which describe the performance and behavior of a particular application, may additionally be extracted by detecting system events generated by specific event-triggered actions or user interactions while the application code is running.
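The text names the three application-level indexes but does not define how they are computed. One plausible reading is sketched below; the definitions (mean request duration, failed fraction, requests per second), the function name, and the input format are all assumptions made for illustration.

```python
def application_metrics(requests, window_seconds):
    """Summarise one collection window of request records.

    requests: list of (duration_seconds, succeeded) tuples observed during
    the window (a hypothetical record format).
    """
    if not requests or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")
    count = len(requests)
    failures = sum(1 for _, succeeded in requests if not succeeded)
    return {
        "response_time": sum(duration for duration, _ in requests) / count,
        "error_rate": failures / count,
        "throughput": count / window_seconds,  # requests per second
    }
```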
Step S102: preprocess the system-level index data and the application-level index data to obtain preprocessed data. The purpose of this preprocessing is to reduce the influence of noise in the index data collected in step S101 and to highlight the trends in that data.
Preferably, as shown in fig. 2, the present step may specifically include the following sub-steps:
Step S1021: using a moving average calculation, compute the average over a window of data points to smooth the data, reducing noise in the data and highlighting trends.
The moving average method is a simple smoothing and prediction technique. It moves through the time-series data item by item and, at each position, computes the average of a fixed number of consecutive terms, so as to reflect the long-term trend of the data. When the values of a time series fluctuate strongly because of periodic and random variation, and the underlying development trend is therefore hard to see, the moving average method can remove the influence of these factors and reveal the direction and trend of development (that is, a trend line). The long-term trend of the series can then be predicted by analyzing the trend line. In this embodiment, this greatly helps when the acquired index data are used for subsequent performance prediction.
Step S1022: weight the most recent data points using an exponential moving average algorithm, so that recent trends in the data carry more weight.
The exponential moving average algorithm assigns a different weight to each data point, computes a moving average according to those weights, and determines a predicted value from the resulting moving average. It is used because more recent data points have a greater influence on the predicted value and better reflect the recent trend of change. More recent data points are therefore given larger weights and more distant data points correspondingly smaller weights. Adjusting the influence of each data point in this way lets the predicted value reflect the future development trend more closely.
In summary, the application first reduces noise in the data and highlights trends through the moving average calculation, and then weights recent data points with the exponential moving average algorithm to emphasize recent trends, so that the collected index data can be better used for predicting the performance of the application software in subsequent steps.
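The two smoothing sub-steps can be sketched as follows. This is a minimal illustration rather than the embodiment's implementation: the function names, the window size, and the smoothing factor `alpha` are assumptions, since the text does not fix them. The exponential moving average uses the common recursive form EMA_t = alpha · x_t + (1 − alpha) · EMA_{t−1}.

```python
def moving_average(values, window):
    """Simple moving average: the mean of each run of `window` consecutive points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def exponential_moving_average(values, alpha):
    """Exponential moving average; a larger alpha puts more weight on recent points."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not values:
        return []
    smoothed = [values[0]]  # seed the recursion with the first observation
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical CPU usage readings (percent), smoothed in the order described:
# moving average first, then exponential weighting of the result.
cpu_usage = [40, 42, 90, 41, 43, 45, 47]
preprocessed = exponential_moving_average(moving_average(cpu_usage, 3), alpha=0.5)
```

A wider window suppresses more noise but responds more slowly to real changes, so both parameters would need tuning against the actual data.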
Steps S101 and S102 above complete the data collection task. The data collection part of the present application collects different sets of metrics and events that together provide a comprehensive view of the operating state of the specified software and hardware resources. The collected index data form the basis of the subsequent analysis and prediction process, allowing the system to be managed proactively and potential faults or performance degradation of the application software to be prevented.
Step S103: extract, from the preprocessed data, feature data that identify the performance and resource utilization of the system and the application software.
The feature extraction part of the present application extracts meaningful feature data from the data collected by the data collection part described above, for use as input to the subsequent analysis and prediction process. Feature extraction selects raw data and converts them into a set of specific data that describe the operating state of the software and hardware resources; these data serve as an important basis for identifying and predicting software failures or performance degradation.
Preferably, as shown in fig. 3, the present step may include the following sub-steps:
Step S1031: calculate the mean, standard deviation, and variance of the preprocessed data to obtain statistical features, which capture the distribution of the data and its change over time.
In the subsequent performance prediction model, these statistical features can be used to detect abnormal behavior of the system, hardware, and software and to identify trends in the data.
In the present application, the statistical features can be obtained by the following formulas:
Mean: \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i;
Standard deviation: s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2};
Variance: s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.
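These three statistics map directly onto Python's standard `statistics` module, whose `stdev` and `variance` functions use the same n − 1 denominator as the formulas above. The wrapper function and its name are illustrative assumptions.

```python
import statistics


def statistical_features(x):
    """Mean, sample standard deviation, and sample variance of one metric window."""
    if len(x) < 2:
        raise ValueError("at least two data points are required")
    return {
        "mean": statistics.fmean(x),
        "std": statistics.stdev(x),          # divides by n - 1
        "variance": statistics.variance(x),  # divides by n - 1
    }
```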
Step S1032: calculate the power spectral density and the Fourier coefficients from the preprocessed data to obtain frequency domain features, which represent the periodic and frequency characteristics, carried in the original data, of how the system, hardware, and software change during operation. In this embodiment, these frequency domain features may be used to detect periodic or frequency-dependent anomalies in the system, hardware, and software.
In the present application, the frequency domain features can be obtained by the following formulas:
Power spectral density: S(f) = \lim_{T \to \infty} \frac{1}{T} \left| \int_{-T/2}^{T/2} x(t) e^{-2\pi i f t} dt \right|^2;
Fourier coefficients: c_k = \frac{1}{T} \int_0^T x(t) e^{-2\pi i k t/T} dt.
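For sampled data the integrals above have to be approximated by sums. The sketch below is one such discretization and is an assumption, not something the text specifies: it evaluates the Fourier coefficients with a direct discrete Fourier transform and estimates the power spectral density with a periodogram over a finite window instead of the limit T → ∞. The function names and the `sample_interval` parameter are illustrative.

```python
import cmath


def fourier_coefficients(x):
    """c_k = (1/N) * sum_t x[t] * exp(-2*pi*i*k*t/N), for k = 0 .. N-1."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]


def power_spectrum(x, sample_interval=1.0):
    """Periodogram estimate of S(f) at the frequencies f_k = k / T.

    T = N * sample_interval is the window length. The integral of
    x(t) * exp(-2*pi*i*f*t) over the window is approximated by c_k * T.
    """
    duration = len(x) * sample_interval
    return [abs(c * duration) ** 2 / duration for c in fourier_coefficients(x)]
```

The direct sum costs O(N²) operations; for long windows a fast Fourier transform would normally be used instead.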
Step S1033: calculate autocorrelation and cross-correlation data from the preprocessed data to obtain time domain features, which identify causal relationships between system resources, hardware load, and software performance, as well as possible hysteresis (lag) effects.
In the present application, the time domain features can be obtained by the following formulas:
Autocorrelation: r(k) = \frac{\sum_{t=k+1}^{n} (x_t - \bar{x})(x_{t-k} - \bar{x})}{\sum_{t=1}^{n} (x_t - \bar{x})^2};
Cross-correlation: r(k) = \frac{\sum_{t=k+1}^{n} (x_t - \bar{x})(y_{t-k} - \bar{y})}{\sqrt{\sum_{t=1}^{n} (x_t - \bar{x})^2} \sqrt{\sum_{t=1}^{n} (y_t - \bar{y})^2}}.
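The two correlation formulas translate directly into code once the 1-based index t = k+1 … n is shifted to the 0-based range k … n−1. The function names are assumptions made for this sketch.

```python
import math


def autocorrelation(x, k):
    """r(k) for a single series at lag k (0 <= k < len(x))."""
    n = len(x)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(x)")
    mean = sum(x) / n
    denominator = sum((v - mean) ** 2 for v in x)
    if denominator == 0:
        raise ValueError("series has zero variance")
    numerator = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return numerator / denominator


def cross_correlation(x, y, k):
    """r(k) between x at time t and y at the earlier time t - k."""
    n = len(x)
    if len(y) != n or not 0 <= k < n:
        raise ValueError("series must have equal length and 0 <= k < len(x)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    denominator = math.sqrt(sum((v - mean_x) ** 2 for v in x)) * math.sqrt(
        sum((v - mean_y) ** 2 for v in y))
    if denominator == 0:
        raise ValueError("a series has zero variance")
    numerator = sum((x[t] - mean_x) * (y[t - k] - mean_y) for t in range(k, n))
    return numerator / denominator
```

A cross-correlation that peaks at some lag k > 0 suggests that changes in y tend to precede changes in x by k samples, which is the kind of lag effect the text refers to. Correlation alone does not establish causation.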
Step S1034: perform a wavelet transform on the preprocessed data to obtain transform features, which identify operating states and anomalies at multiple scales. The wavelet transform calculation includes the continuous wavelet transform and the discrete wavelet transform. These decompose the collected original data into different frequency bands and time scales; features are extracted from each band and scale to capture the frequency and time characteristics of the data, and thereby to identify operating states and anomalies at multiple scales.
In the present application, the transform features can be obtained by the following formulas:
Continuous wavelet transform: CWT_{a,b} = \int_{-\infty}^{\infty} x(t) \frac{1}{\sqrt{a}} \psi^*\left(\frac{t-b}{a}\right) dt;
Discrete wavelet transform: DWT_{j,k} = \sum_{n=0}^{2^j-1} h_n x_{2^j k+n}.
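The discrete wavelet transform formula can be evaluated literally once a filter h_n is chosen. The text does not name a wavelet, so the sketch below assumes the Haar wavelet purely for illustration; the function names are likewise assumptions.

```python
def haar_filter(j):
    """Haar detail filter of length 2**j (an assumed choice of h_n).

    The first half of the taps is +2**(-j/2) and the second half is -2**(-j/2).
    """
    half = 2 ** (j - 1)
    scale = 2 ** (-j / 2)
    return [scale] * half + [-scale] * half


def dwt_coefficients(x, j, h=None):
    """DWT_{j,k} = sum_{n=0}^{2^j - 1} h_n * x_{2^j * k + n} for every full block k."""
    if j < 1:
        raise ValueError("scale j must be at least 1")
    size = 2 ** j
    if h is None:
        h = haar_filter(j)
    if len(h) != size:
        raise ValueError("filter length must be 2**j")
    return [
        sum(h[n] * x[size * k + n] for n in range(size))
        for k in range(len(x) // size)
    ]
```

With the Haar filter, a coefficient is close to zero where the signal is flat within a block and large in magnitude where the two halves of the block differ, so abrupt changes show up at the scale on which they occur.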
Step S104: take the extracted feature data and the running state data of the system and the application software as the input of a software performance prediction model.
The software performance prediction model of this embodiment can be used to predict possible faults or performance degradation of the monitored software. It may be a machine learning model trained with a random forest algorithm on historical feature data and historical running state data of the system and the application software. The random forest algorithm is the main algorithm used in the present application to train the machine learning model. It is an ensemble learning algorithm based on decision trees: multiple decision trees are constructed and their predictions are combined. Each tree is trained on a random subset of the data and a random subset of the features to avoid overfitting. Training comprises splitting the collected historical feature data and historical running state data into a training set and a validation set, and evaluating the trained model on the validation set.
In the present application, a supervised machine learning algorithm, namely the random forest algorithm, is used to train a classification model that distinguishes normal from abnormal operation of the software and the computer system. The trained model is then used in the prediction and decision-making process to determine whether the software and the computer system are operating normally or abnormally.
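The idea behind the random forest classifier (bootstrap samples, random feature subsets, majority voting) can be illustrated with a deliberately simplified stand-in. The sketch below uses depth-one trees (decision stumps) instead of full decision trees and is not the embodiment's model. The function names, the "normal"/"abnormal" labels, and the toy data are all assumptions.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature_ids):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left) + sum(
                y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no usable split: always predict the majority label
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    subset_size = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]       # bootstrap sample
        feats = rng.sample(range(n_features), subset_size)   # feature subset
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feats))
    return forest


def predict_proba(forest, row, positive="abnormal"):
    """Fraction of trees that vote for the positive class."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return votes.count(positive) / len(votes)


# Toy history: [mean CPU usage (%), error rate] with hypothetical labels.
rows = [[20, 0.01], [25, 0.02], [30, 0.01], [22, 0.03],
        [85, 0.20], [90, 0.25], [95, 0.30], [88, 0.22]]
labels = ["normal"] * 4 + ["abnormal"] * 4
forest = train_forest(rows, labels)
risk = predict_proba(forest, [92, 0.28])
```

In practice the historical data would first be split into a training set and a validation set, as described above, and a full random forest implementation from an established library would be the natural choice.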
Step S105: Dynamically adjust resource allocation based on the output of the software performance prediction model to ensure safe and reliable operation of the application software.
The prediction and resource allocation portion of the present application is responsible for predicting potential failures or performance degradation in the monitored software and allocating system resources accordingly. Using the software performance prediction model from the preceding steps, predictions are made in real time from the current state of the system and the behavior of the monitored application software to determine possible faults or performance degradation of that software.
When a possible failure or performance degradation is predicted, the present application can automatically issue a warning and allocate system and hardware resources to the monitored application software accordingly.
The resource allocation process involves dynamically adjusting the allocation of system and hardware resources to ensure stable operation of the software. Resource allocation is automated through predefined trigger conditions, policies for pausing or disabling software, and policies that adjust the resource allocation thresholds.
The prediction and resource allocation portion of this step can be specifically divided into three portions, namely, resource monitoring, failure prediction, and resource adjustment:
(1) Resource monitoring
The resource monitoring process may continuously monitor system resources, such as CPU, memory, hard disk, network bandwidth, etc., used by the monitored software, and may collect and process the monitoring data to generate real-time resource usage statistics for each software. Resource statistics may then be used to determine if system resources are being used efficiently and if there are bottlenecks that need to be resolved.
(2) Failure prediction
The fault prediction process predicts potential faults or performance degradation of the monitored software using the software performance prediction model described above. The input data of the software performance prediction model is the feature data extracted in the feature extraction step. The output of the model is a probability value that indicates the likelihood of failure in the near future.
In the invention, linear regression may be applied to the historical data collected by real-time monitoring: based on the historical data on system resources, hardware load and software running state acquired over a period of time, the running condition for the next hour or the next day is predicted. The following linear regression model is used:
Y=a+bX
Where Y is the dependent variable (e.g., CPU usage), X is the independent variable (e.g., time), a is the intercept, and b is the slope. The linear regression algorithm calculates the intercept and slope from the historical data and then extrapolates the fitted line to predict faults or performance degradation that may occur at a future time.
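A minimal sketch of fitting Y = a + bX by ordinary least squares and extrapolating one hour ahead is given below. The source does not state how a and b are estimated, so least squares is an assumption, as are the function names and the sample readings.

```python
def fit_line(xs, ys):
    """Estimate intercept a and slope b of Y = a + bX by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast_value(a, b, x):
    """Evaluate the fitted line at a future value of X."""
    return a + b * x


# Hypothetical history: CPU usage (%) sampled every 10 minutes.
minutes = [0, 10, 20, 30, 40]
cpu_percent = [40.0, 45.0, 50.0, 55.0, 60.0]

a, b = fit_line(minutes, cpu_percent)
cpu_in_one_hour = forecast_value(a, b, minutes[-1] + 60)
```

For this sample the usage grows by 0.5 percentage points per minute, so the extrapolated value one hour after the last sample is about 90%, which could then be compared against an alert threshold.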
(3) Resource adjustment
If the failure probability exceeds a predefined threshold, resource adjustment is automatically triggered and an alarm is sent to notify the system administrator. The alert may be sent as a pop-up window on the computer desktop, an email, a text message, or any other suitable notification. The system administrator may then take appropriate action to prevent or mitigate the failure.
The automatic resource adjustment process adjusts the system resources allocated to the monitored software based on the predictions made during the failure prediction process and the resource usage statistics collected during the resource monitoring process. If a failure is predicted, the resource allocation process may allocate more resources to the software to prevent the failure from occurring. On the other hand, if the resource usage statistics indicate that system resources are being over-utilized, the resource allocation process may reduce the resources allocated to the software to improve the overall performance of the system.
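The trigger logic described above can be sketched as a small decision function. The two threshold values, the `Decision` type, and the function name are hypothetical; the source only states that a predefined threshold exists.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    action: str   # "increase", "decrease", or "keep"
    alert: bool   # whether the system administrator should be notified


def decide_adjustment(failure_probability, system_utilization,
                      failure_threshold=0.8, overuse_threshold=0.9):
    """Map the model output and the usage statistics to a resource adjustment.

    Both thresholds are assumed example values, not values from the source.
    """
    if failure_probability > failure_threshold:
        # A failure is predicted: give the software more resources and raise an alarm.
        return Decision("increase", True)
    if system_utilization > overuse_threshold:
        # System resources are over-utilized: reduce what the software is allocated.
        return Decision("decrease", False)
    return Decision("keep", False)


predicted_failure = decide_adjustment(0.92, 0.50)
overloaded_system = decide_adjustment(0.10, 0.95)
steady_state = decide_adjustment(0.10, 0.50)
```

Checking the failure prediction first is a design choice of this sketch; the text does not say which condition takes priority when both hold.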
Preferably, the step may allocate system resources based on the predicted resource usage amount output by the software performance prediction model and the current resource usage amount in the system level index data, and the allocation formula is as follows:
allocation amount = (predicted resource usage amount - current resource usage amount) / total amount of available resources.
The allocation formula described above can be used to balance resource usage and ensure that the system operates in an optimal manner.
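As a worked illustration of the allocation formula, the following sketch computes the allocation amount for one resource. The function name, the units (GiB of memory), and the sample figures are assumptions; the guard against a non-positive total is an addition that is not part of the source formula.

```python
def allocation_amount(predicted_usage, current_usage, total_available):
    """(predicted usage - current usage) / total available resources."""
    if total_available <= 0:
        raise ValueError("total_available must be positive")
    return (predicted_usage - current_usage) / total_available


# Hypothetical memory figures in GiB.
grow = allocation_amount(predicted_usage=6.0, current_usage=4.0, total_available=8.0)
shrink = allocation_amount(predicted_usage=3.0, current_usage=4.0, total_available=8.0)
```

Here `grow` is 0.25, meaning the predicted increase equals a quarter of the available resources, while `shrink` is -0.125, a negative value that can be read as room to release resources.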
From the above, the method for guaranteeing safe and reliable operation of application software provided by the invention can monitor the state of the computer system and hardware while specific application software is running, and flexibly adjust the system and hardware resources allocated to that software at run time, thereby guaranteeing long-term stable operation of the designated application software in the computer system. It can further bring the following beneficial effects:
The system reliability is improved: the invention provides a real-time monitoring mechanism for the computer system, and collects and analyzes data related to system performance and software operation. By automatically detecting and predicting possible faults or performance degradation, the invention can take corresponding measures in advance to prevent system breakdown, improve system reliability and reduce downtime.
The system efficiency is improved: the invention can balance the allocation of system and hardware resources, optimize the resource allocation and prevent the resource from being exhausted. The invention can improve the utilization efficiency of system resources and the overall performance of the system by more effectively distributing the resources.
The maintenance cost is reduced: the invention can automatically detect and diagnose the system faults, identify the cause of the problems, take corresponding measures in advance, prevent the system from collapsing and reduce the manual intervention and maintenance cost. This can significantly reduce the workload of system administrators and reduce maintenance costs.
Enhancing user experience: the invention reduces the occurrence of software crash, improves the stability of the system and enhances the user experience by ensuring the stable operation of the appointed software in the computer system. This is particularly important for critical applications or systems that require uninterrupted operation.
The system safety is improved: the invention can detect abnormal software behavior in real time, analyze the cause of the problem and take corresponding measures to prevent potential safety risks. By monitoring the behavior of the software application, the invention can prevent software abnormalities and data loss caused by resource exhaustion, downtime caused by exhaustion of system resources, and overload threats to hardware and the network, thereby improving the safety of the whole system.
Fig. 4 is a schematic structural diagram of a device for guaranteeing safe and reliable operation of application software according to an embodiment of the present application, where the device includes an index data acquisition unit 410, a preprocessing unit 420, a feature extraction unit 430, an input unit 440, and a dynamic adjustment unit 450, which are connected in sequence.
The index data acquisition unit 410 is configured to obtain system level index data and application program level index data during an application software running process.
The preprocessing unit 420 is configured to perform data preprocessing on the system level index data and the application level index data to obtain preprocessed data.
The feature extraction unit 430 is used to extract feature data for identifying system and application software performance and resource utilization from the pre-processing data.
The input unit 440 is used for taking the extracted characteristic data, the operation state data of the system and the application software as the input of the software performance prediction model.
The dynamic adjustment unit 450 is configured to dynamically adjust resource allocation based on the output of the software performance prediction model to ensure safe and reliable operation of the application software.
Preferably, the system level index data includes: CPU usage, memory usage, network usage, and disk I/O, the application level index data comprising: application response time, error rate, and transaction throughput.
Preferably, as shown in fig. 5, the preprocessing unit 420 may further include:
a moving average calculation module 421 for calculating the average of a window of data points using a moving average calculation, so as to smooth the data, reduce noise in the data and highlight trends;
an exponential moving average calculation module 422 for weighting the most recent data points using an exponential moving average algorithm, so that recent trends in the data are emphasized.
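The two preprocessing modules can be sketched as follows. The window size, the smoothing factor `alpha`, and the function names are assumptions; the source does not give concrete values.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive data points to smooth the series."""
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def exponential_moving_average(values, alpha):
    """Weight recent points more heavily: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical raw samples.
sma = moving_average([1, 2, 3, 4, 5], window=3)
ema = exponential_moving_average([10, 20], alpha=0.5)
```

A larger `alpha` makes the exponential average follow recent changes more closely, while a larger `window` makes the simple average smoother but slower to react.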
Preferably, as shown in fig. 6, the feature extraction unit 430 may further include:
a statistical feature extraction module 431, configured to perform mean, standard deviation and variance calculation on the preprocessed data to obtain a statistical feature, where the statistical feature is used to capture distribution and variation of the data over time;
a frequency domain feature extraction module 432, configured to calculate a power spectral density and a fourier coefficient from the preprocessed data to obtain a frequency domain feature, where the frequency domain feature is used to characterize a periodic characteristic and a frequency characteristic of a related system, hardware and software carried in the original data that change during an operation process;
a time domain feature extraction module 433 for computing autocorrelation and cross-correlation data from the preprocessed data to obtain time domain features for identifying causal relationships between system resources, hardware loading and software performance and possible hysteresis effects;
a transformation feature extraction module 434, configured to perform wavelet transformation computation on the preprocessed data to obtain transformation features, where the transformation features are used to identify operating states and anomalies under multiple scales.
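Two of the feature types listed above, the statistical features and the autocorrelation, can be sketched with the Python standard library. The use of population (rather than sample) statistics, the normalization of the autocorrelation, and all names are assumptions; the frequency domain and wavelet features are omitted for brevity.

```python
import statistics


def statistical_features(values):
    """Mean, standard deviation and variance of a preprocessed series (population form)."""
    return {
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
        "variance": statistics.pvariance(values),
    }


def autocorrelation(values, lag):
    """Normalized autocorrelation of the series with itself shifted by `lag` samples."""
    mean = statistics.fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    num = sum(
        (values[i] - mean) * (values[i + lag] - mean)
        for i in range(len(values) - lag)
    )
    return num / denom


stats = statistical_features([2, 4, 4, 4, 5, 5, 7, 9])

# Hypothetical metric that alternates every sample (period of 2 samples).
periodic = [1, 2, 1, 2, 1, 2, 1, 2]
lag1 = autocorrelation(periodic, 1)
lag2 = autocorrelation(periodic, 2)
```

For the alternating series the autocorrelation is negative at lag 1 and positive at lag 2, which reflects its two-sample period; the same calculation applied across two different metrics would give a cross-correlation.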
Preferably, the software performance prediction model is a machine learning model trained by using a random forest algorithm based on historical characteristic data, historical running state data of a system and application software.
Preferably, the dynamic adjustment unit 450 is specifically configured to:
allocate system resources based on the predicted resource usage amount output by the software performance prediction model and the current resource usage amount in the system level index data, wherein the allocation formula is as follows:
allocation amount = (predicted resource usage amount - current resource usage amount) / total amount of available resources.
From the above, the device for guaranteeing safe and reliable operation of application software provided by the invention can monitor the states of the computer system and hardware when the specific application software is operated, and flexibly adjust the system and hardware resources allocated to the software operation when the software is operated, thereby guaranteeing long-term stable operation of the appointed application software in the computer system.
The embodiment of the invention also provides electronic equipment, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the method when executing the program.
The embodiment of the invention also provides a computer readable storage medium, and the computer readable storage medium stores a computer program for executing the method.
As shown in fig. 7, the electronic device 600 may further include: a communication module 110, an input unit 120, an audio processor 130, a display 160, and a power supply 170. It is noted that the electronic device 600 need not include all of the components shown in fig. 7; in addition, the electronic device 600 may further include components not shown in fig. 7, for which reference may be made to the related art.
As shown in fig. 7, the central processor 100, sometimes also referred to as a controller or operational control, may include a microprocessor or other processor device and/or logic device, which central processor 100 receives inputs and controls the operation of the various components of the electronic device 600.
The memory 140 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. It may store the failure-related information as well as the program that processes this information. The central processor 100 can execute the program stored in the memory 140 to store or process information.
The input unit 120 provides an input to the central processor 100. The input unit 120 is, for example, a key or a touch input device. The power supply 170 is used to provide power to the electronic device 600. The display 160 is used for displaying display objects such as images and characters. The display may be, for example, but not limited to, an LCD display.
The memory 140 may be a solid-state memory such as a read-only memory (ROM), a random access memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off and that can be selectively erased and provided with further data, an example of which is sometimes referred to as an EPROM. Memory 140 may also be some other type of device. Memory 140 includes a buffer memory 141 (sometimes referred to as a buffer). The memory 140 may include an application/function storage 142, which stores application programs and function programs, or the flow by which the central processor 100 executes the operations of the electronic device 600.
The memory 140 may also include a data store 143, the data store 143 for storing data, such as contacts, digital data, pictures, sounds, and/or any other data used by the electronic device. The driver storage 144 of the memory 140 may include various drivers of the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, address book applications, etc.).
The communication module 110 is a transmitter/receiver 110 that transmits and receives signals via an antenna 111. A communication module (transmitter/receiver) 110 is coupled to the central processor 100 to provide an input signal and receive an output signal, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, etc., may be provided in the same electronic device. The communication module (transmitter/receiver) 110 is also coupled to a speaker 131 and a microphone 132 via an audio processor 130 to provide audio output via the speaker 131 and to receive audio input from the microphone 132 to implement usual telecommunication functions. The audio processor 130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 130 is also coupled to the central processor 100 so that sound can be recorded locally through the microphone 132 and so that sound stored locally can be played through the speaker 131.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principles and embodiments of the present invention have been described in detail with reference to specific examples, which are provided only to facilitate understanding of the method and core ideas of the present invention. Those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present invention.

Claims (15)

1. A method for ensuring safe and reliable operation of application software, the method comprising:
acquiring system-level index data and application-level index data in the running process of application software;
performing data preprocessing on the system-level index data and the application-level index data to obtain preprocessed data;
Extracting feature data for identifying system and application software performance and resource utilization rate from the preprocessing data;
taking the extracted characteristic data, the system and the running state data of the application software as the input of a software performance prediction model;
and dynamically adjusting resource allocation based on the output of the software performance prediction model to ensure safe and reliable operation of the application software.
2. The method for ensuring safe and reliable operation of application software as recited in claim 1, wherein the system level index data comprises: CPU usage, memory usage, network usage, and disk I/O, the application level index data comprising: application response time, error rate, and transaction throughput.
3. The method for ensuring safe and reliable operation of application software according to claim 1, wherein the data preprocessing comprises:
using a moving average calculation, calculating an average of a window of data points to smooth the data to reduce noise in the data and highlight trends;
weighting the most recent data points using an exponential moving average algorithm to give more weight to recent trends in the data.
4. The method for ensuring safe and reliable operation of application software according to claim 1, wherein said extracting feature data for identifying system and application software performance and resource utilization from said pre-processing data comprises:
Calculating the mean value, standard deviation and variance of the preprocessed data to obtain statistical features, wherein the statistical features are used for capturing the distribution and change of the data along with time;
calculating power spectral density and Fourier coefficient from the preprocessed data to obtain frequency domain characteristics, wherein the frequency domain characteristics are used for representing the periodic characteristics and frequency characteristics of related systems, hardware and software carried in the original data, which change in the running process;
computing autocorrelation and cross-correlation data from the pre-processing data to obtain time domain features identifying causal relationships between system resources, hardware loading and software performance and possible hysteresis effects;
and carrying out wavelet transformation calculation on the preprocessed data to obtain transformation characteristics, wherein the transformation characteristics are used for identifying running states and anomalies under multiple scales.
5. The method for guaranteeing safe and reliable operation of application software according to claim 1, wherein the software performance prediction model is a machine learning model trained by using a random forest algorithm based on historical feature data, historical operation state data of the system and the application software.
6. The method for ensuring safe and reliable operation of application software according to claim 1, wherein dynamically adjusting resource allocation based on the output of the software performance prediction model to ensure safe and reliable operation of application software comprises:
And distributing system resources based on the predicted resource usage amount output by the software performance prediction model and the current resource usage amount in the system level index data, wherein a distribution formula is as follows:
allocation amount= (predicted resource usage amount-current resource usage amount)/total amount of available resources.
7. An apparatus for ensuring safe and reliable operation of application software, the apparatus comprising:
the index data acquisition unit is used for acquiring system-level index data and application program-level index data in the running process of the application software;
the preprocessing unit is used for preprocessing the system-level index data and the application-level index data to obtain preprocessed data;
the feature extraction unit is used for extracting feature data for identifying the performance and resource utilization rate of the system and the application software from the preprocessing data;
the input unit is used for taking the extracted characteristic data, the running state data of the system and the application software as the input of a software performance prediction model;
and the dynamic adjustment unit is used for dynamically adjusting the resource allocation based on the output of the software performance prediction model so as to ensure the safe and reliable operation of the application software.
8. The apparatus for ensuring safe and reliable operation of application software as recited in claim 7, wherein said system level index data comprises: CPU usage, memory usage, network usage, and disk I/O, the application level index data comprising: application response time, error rate, and transaction throughput.
9. The apparatus for ensuring safe and reliable operation of application software as recited in claim 7, wherein the preprocessing unit includes:
a moving average calculation module for calculating an average of the window of data points using a moving average calculation to smooth the data to reduce noise in the data and highlight trends;
an exponential moving average calculation module for weighting the most recent data points using an exponential moving average algorithm, so that recent trends in the data are emphasized.
10. The apparatus for guaranteeing safe and reliable operation of application software according to claim 7, wherein the feature extraction unit comprises:
the statistical feature extraction module is used for carrying out mean value, standard deviation and variance calculation on the preprocessed data to obtain statistical features, and the statistical features are used for capturing the distribution and change of the data along with time;
The frequency domain feature extraction module is used for calculating power spectral density and Fourier coefficient from the preprocessed data to obtain frequency domain features, wherein the frequency domain features are used for representing the periodic characteristics and the frequency characteristics of related systems, hardware and software carried in the original data, which change in the running process;
a time domain feature extraction module for calculating autocorrelation and cross-correlation data from the preprocessed data to obtain time domain features for identifying causal relationships between system resources, hardware loading and software performance and possible hysteresis effects;
and the transformation feature extraction module is used for carrying out wavelet transformation calculation on the preprocessed data to obtain transformation features, and the transformation features are used for identifying running states and anomalies under multiple scales.
11. The apparatus for ensuring safe and reliable operation of application software according to claim 7, wherein the software performance prediction model is a machine learning model trained by using a random forest algorithm based on historical feature data, historical operating state data of the system and the application software.
12. The apparatus for ensuring safe and reliable operation of application software according to claim 7, wherein the dynamic adjustment unit is specifically configured to:
And distributing system resources based on the predicted resource usage amount output by the software performance prediction model and the current resource usage amount in the system level index data, wherein a distribution formula is as follows:
allocation amount= (predicted resource usage amount-current resource usage amount)/total amount of available resources.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed by the processor.
14. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
15. A computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method of any of claims 1 to 6.
CN202310628804.5A 2023-05-30 2023-05-30 Method and device for guaranteeing safe and reliable operation of application software Pending CN116680146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310628804.5A CN116680146A (en) 2023-05-30 2023-05-30 Method and device for guaranteeing safe and reliable operation of application software

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310628804.5A CN116680146A (en) 2023-05-30 2023-05-30 Method and device for guaranteeing safe and reliable operation of application software

Publications (1)

Publication Number Publication Date
CN116680146A true CN116680146A (en) 2023-09-01

Family

ID=87788332

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310628804.5A Pending CN116680146A (en) 2023-05-30 2023-05-30 Method and device for guaranteeing safe and reliable operation of application software

Country Status (1)

Country Link
CN (1) CN116680146A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117116200A (en) * 2023-10-23 2023-11-24 广州市惠正信息科技有限公司 Method and system for adjusting resolution of LED display screen
CN117116200B (en) * 2023-10-23 2023-12-29 广州市惠正信息科技有限公司 Method and system for adjusting resolution of LED display screen
CN117234786A (en) * 2023-11-10 2023-12-15 江西师范大学 Computer performance control analysis system and method based on big data
CN117234786B (en) * 2023-11-10 2024-02-20 江西师范大学 Computer performance control analysis system and method based on big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination