WO2023093031A1 - Container group control method, device, and electronic equipment - Google Patents

Container group control method, device, and electronic equipment

Info

Publication number
WO2023093031A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
container group
busyness
container
regulated
Prior art date
Application number
PCT/CN2022/101384
Other languages
English (en)
French (fr)
Inventor
胡仲臣
杨军
卢道和
陈鉴镔
陈刚
程志峰
朱嘉伟
罗海湾
李勋棋
熊思清
周琪
郭英亚
李兴龙
周佳振
文玉茹
何勇彬
Original Assignee
深圳前海微众银行股份有限公司
Priority date
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2023093031A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the embodiments of the present application relate to the technical field of operation and maintenance, and in particular, to a container group control method, device, and electronic equipment.
  • the purpose of the present application is to provide a control method, device and electronic equipment for a container group, so as to improve the efficiency and accuracy of container control.
  • the embodiment of the present application provides a container group control method, including:
  • the container group to be regulated includes multiple containers, the initial container group data includes at least one data indicator of each container, and the initial container group data is normalized to obtain the container group data, including:
  • the container group data is obtained according to the values of the converted target data indicators.
  • the processing of the container group data according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated includes:
  • the processing of the container group data according to the pre-trained busyness algorithm to obtain the first busyness data corresponding to the container group to be regulated includes:
  • the busyness prediction is performed on the container group data in the form of a two-dimensional array to obtain the first busyness data, wherein the first busyness data is in the form of a one-dimensional array, and each element of the one-dimensional array represents how busy a container is.
  • performing decision processing on the first health degree data and the first busyness degree data according to pre-stored decision rules to obtain a first decision result corresponding to the container group to be regulated includes:
  • Decision processing is performed according to the discrete containers in the container group to be regulated, the average busyness value, and the first health degree data to obtain a first decision result corresponding to the container group to be regulated.
  • the processing of the initial container group data according to preset training data processing rules to obtain real-time training data includes:
  • Real-time training data is obtained according to the abnormal training data and the normal training data.
  • the initial container group data is analyzed and processed according to the pre-stored exception data processing rules, the second health data, the second busyness data, and the second decision result to obtain abnormal training data and normal training data.
  • the obtaining of the abnormal training data and the normal training data includes:
  • according to the pre-stored regression filtering rules and the second health degree data, first target container data whose health degree is lower than the preset health degree threshold is selected from the initial container group data, and at the same time, according to the pre-stored regression filtering rules and the second busyness data, second target container data whose busyness is higher than a preset busyness threshold is selected from the initial container group data;
  • the obtaining new training data according to pre-acquired historical training data and real-time training data includes:
  • the initial historical training data is screened to obtain historical training data
  • the embodiment of the present application provides a container group control device, including:
  • the obtaining module is used to obtain the initial container group data corresponding to the container group to be regulated;
  • a processing module configured to perform normalization processing on the initial container group data to obtain container group data
  • the processing module is further configured to process the container group data according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated, and at the same time to process the container group data according to the pre-trained busyness algorithm to obtain the first busyness data corresponding to the container group to be regulated;
  • the processing module is further configured to perform decision processing on the first health degree data and the first busyness data according to pre-stored decision rules to obtain a first decision result corresponding to the container group to be regulated, and to regulate the containers in the container group to be regulated according to the first decision result.
  • an embodiment of the present application provides an electronic device, including: a processor, and a memory communicatively connected to the processor;
  • the memory stores computer-executable instructions
  • the processor executes the computer-executed instructions stored in the memory to implement the container group regulation method according to any one of the first aspect.
  • the embodiment of the present application provides a computer-readable storage medium that stores computer-executable instructions; when a processor executes the computer-executable instructions, the container group control method according to any one of the first aspect is implemented.
  • an embodiment of the present application provides a computer program product, including a computer program; when the computer program is executed by a processor, the container group control method according to any one of the first aspect is implemented.
  • the embodiment of the present application provides a container group control method, device, and electronic equipment.
  • the initial container group data corresponding to the container group to be regulated can be obtained first, and then the initial container group data is normalized.
  • decision processing is then performed on the first health degree data and the first busyness data to obtain the first decision result corresponding to the container group to be regulated, and the containers in the container group to be regulated are regulated according to the first decision result. By making decisions comprehensively on the two dimensions of health degree data and busyness data corresponding to the container group data, the method not only increases the accuracy of the decision results but also reduces manpower consumption, thereby improving the accuracy and efficiency of container regulation and ensuring the normal operation of various financial services.
  • FIG. 1 is a schematic diagram of the architecture of the application system of the container group control method provided by the embodiment of the present application;
  • FIG. 2 is a schematic flow chart of a container group control method provided in an embodiment of the present application
  • Fig. 3 is the application schematic diagram of the algorithm module provided by the embodiment of the present application.
  • FIG. 4 is a schematic diagram of the application of the training model provided by the embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a container group regulating device provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • this application makes decisions comprehensively based on the two dimensions of health degree data and busyness data corresponding to the container group data, which not only increases the accuracy of the decision results but also reduces manpower consumption, thereby achieving the technical effect of improving the accuracy and efficiency of container regulation and ensuring the normal operation of various financial businesses.
  • Figure 1 is a schematic diagram of the architecture of the application system of the container group control method provided by the embodiment of the present application.
  • the server 102 can obtain the initial container group data from the container group 101 to be regulated, normalize the initial container group data to obtain container group data in a uniform format, process the container group data with the pre-trained health degree algorithm A and busyness algorithm B respectively to obtain processing results, further process those results according to the pre-stored decision rules to obtain decision results, and then, in turn, regulate the containers in the container group 101 according to the decision results.
  • the container group 101 to be regulated may include multiple containers, specifically a set of container instances responsible for the same business processing logic, and these containers have the property of being interchangeable and sharing processing traffic. And there may be one or more container groups 101 to be regulated.
  • the server 102 may simultaneously regulate multiple container groups 101 to be regulated.
  • FIG. 2 is a schematic flowchart of a method for controlling a container group provided by an embodiment of the present application, and the method of this embodiment may be executed by the server 102 . As shown in Figure 2, the method of this embodiment may include:
  • the container group to be regulated contains multiple containers, and the initial data of each container may be obtained before the container group to be regulated is regulated, so as to obtain the initial container group data corresponding to the container group to be regulated.
  • the initial container group data may be in the form of a two-dimensional array, each row of the two-dimensional array is the initial data of a container instance corresponding to a single container, and multiple rows of data of multiple containers form a two-dimensional array.
  • each container included in the two-dimensional array belongs to the same container group, and all instances of the container group have been included in the two-dimensional array.
  • the initial data may be real-time monitoring data
  • the real-time monitoring data includes at least one data indicator such as a host indicator, a container indicator, a process indicator, a service indicator, and basic container information.
  • the host indicator is the performance indicator of the host where the container runs, which can include: CPU (total CPU, CPU usage), memory (total memory, memory usage, cache size, cache memory size), disk (disk space, disk reads and writes, disk inodes), network (network traffic, number of network connections, number of available ports), etc.
  • Container indicators are performance indicators when the container is running, which can include: CPU (total CPU, CPU usage, CPU throttling rate), memory (total memory, memory usage, buffer or cache size), disk (disk space, Disk read and write, disk inode), network (network traffic, number of network connections, number of available ports), etc.
  • Process indicators are various indicators when the business process is running, which may include: the number of active threads, the number of open files, and so on.
  • Business indicators are indicators generated when container applications perform business processing, which can include: transaction volume (transaction volume per minute), success rate (system success rate, business success rate, number of failures), time consumption (average time consumption, maximum consumption time) etc.
  • the basic information of a container is the position of the container in the entire environment, which may include: the container group to which the container belongs, the system to which the container group belongs, the host to which the container belongs, and the physical location of the host, etc.
  • S202 Perform normalization processing on the initial container group data to obtain container group data.
  • the initial container group data can be normalized to obtain the container group data.
  • the container group to be regulated contains multiple containers, and the initial container group data contains at least one data indicator of each container, then the initial container group data is normalized to obtain the container group data, which may specifically include:
  • the value and preset value range of the target data indicator are acquired.
  • the container group data is obtained according to the values of the converted target data indicators.
  • data indexes of different dimensions may be converted into floating-point numbers with a fixed value range.
  • the fixed value range can be [0, 1].
  • the preset value range can be specified manually.
  • a value range of [0, 1000000] may be manually specified, with the unit being MB.
  • normalization conversion processing can be performed according to the value of the target data index and the preset value range of the target data index, and the converted value of the target data index can be determined.
  • the normalization processing can be performed according to the following expression:
  • the target data indicator may be a host indicator, a container indicator, a process indicator or a service indicator.
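  • The expression referred to above did not survive in this text. A minimal sketch of one common choice, min-max scaling into [0, 1], is given below. The formula, the clamping of out-of-range values, and the names `normalize_indicator`, `lower`, and `upper` are assumptions for illustration, not the expression from the application.

```python
def normalize_indicator(value: float, lower: float, upper: float) -> float:
    """Scale a raw indicator value into [0, 1] using its preset value range.

    Assumption: min-max scaling with clamping. The original expression is
    not available in this text.
    """
    if upper <= lower:
        raise ValueError("preset value range must satisfy lower < upper")
    scaled = (value - lower) / (upper - lower)
    # Clamp values that fall outside the manually specified range.
    return min(1.0, max(0.0, scaled))


# Memory usage of 250000 MB with a preset range of [0, 1000000] MB.
print(normalize_indicator(250_000, 0, 1_000_000))  # 0.25
```

  • Clamping is a design choice here: a manually written range may be exceeded in production, and clamping keeps the output inside the fixed value range expected by the downstream algorithms.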
  • for the basic container information, the string identifier can be converted into a unique value according to certain conversion rules, which can be as follows:
  • the physical location of the host can generally be expressed as computer room + row + cabinet + slot, and the original value is a fixed-length string.
  • for example, if the values of the computer room are [IDC1, IDC4, IDC2, IDC9], they are sorted and then numbered with natural numbers: IDC1 is converted into 1, IDC2 into 2, IDC4 into 3, and IDC9 into 4.
  • for the host to which the container belongs, the original value is generally a fixed-asset number, which can likewise be sorted according to the above method and then numbered with natural numbers.
  • the container group to which the container belongs can also be numbered with natural numbers after sorting by container group name.
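  • The sort-then-number conversion described above can be sketched as follows. The function name `number_by_sorted_order` is hypothetical, and plain lexicographic string sorting is an assumption (under it, an identifier such as IDC10 would sort before IDC2).

```python
def number_by_sorted_order(identifiers):
    """Map each distinct string identifier to a natural number by sort order."""
    ordered = sorted(set(identifiers))
    return {name: index for index, name in enumerate(ordered, start=1)}


rooms = ["IDC1", "IDC4", "IDC2", "IDC9"]
codes = number_by_sorted_order(rooms)
print([codes[room] for room in rooms])  # [1, 3, 2, 4]
```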
  • S203 Process the container group data according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated, and at the same time process the container group data according to the pre-trained busyness algorithm to obtain the first busyness data corresponding to the container group to be regulated.
  • the container group data may be comprehensively predicted from the two dimensions of health and busyness to obtain the first health degree data and the first busyness degree data corresponding to the container group data.
  • the container group data is processed according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated, which may specifically include:
  • the health status of each container in the container group to be regulated can be predicted according to the pre-trained health degree algorithm.
  • the container group data may be in the form of a two-dimensional array
  • the first health degree data may be in the form of a one-dimensional array.
  • the length of the one-dimensional array may be the same as the length of the input two-dimensional array, that is, for the real-time monitoring data of each container in the container group, a corresponding health degree prediction result may be obtained.
  • the value of the health degree prediction result may be 1: healthy, 2: sub-healthy, 3: abnormal, and then the prediction results of multiple containers may form a one-dimensional array, that is, the first health degree data.
  • the health degree algorithm may use an existing algorithm; for example, an SVM (Support Vector Machine) algorithm may be used.
  • the container group data is processed according to the pre-trained busyness algorithm to obtain the first busyness data corresponding to the container group to be regulated, which may specifically include:
  • the busyness prediction is performed on the container group data in the form of a two-dimensional array to obtain the first busyness data, wherein the first busyness data is in the form of a one-dimensional array, and each element of the one-dimensional array represents how busy a container is.
  • the busyness of each container in the container group to be regulated can be predicted according to the pre-trained busyness algorithm.
  • the container group data may be in the form of a two-dimensional array
  • the first busyness data may be in the form of a one-dimensional array.
  • the length of the one-dimensional array may be the same as the length of the input two-dimensional array, that is, for the real-time monitoring data of each container in the container group, a corresponding busyness prediction result may be obtained.
  • the value interval of the busyness prediction result can be (0, 100), the interval of (0, 20) indicates that the container is idle, the interval of (20, 60) indicates that the container is normal, and the interval of (60, 100) indicates that the container is busy, and the busyness prediction results of multiple containers can form a one-dimensional array, which is the first busyness data.
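  • As an illustration of the intervals above, the sketch below maps a busyness score to one of the three bands. The function name `busyness_band` is hypothetical, and assigning the boundary values 20 and 60 to the upper band is an assumption, since the text gives open intervals only.

```python
def busyness_band(score: float) -> str:
    """Classify a busyness score in (0, 100) as idle, normal, or busy."""
    if not 0 < score < 100:
        raise ValueError("busyness score must lie in (0, 100)")
    if score < 20:
        return "idle"
    if score < 60:
        # Assumption: the boundary value 20 is treated as normal.
        return "normal"
    # Assumption: the boundary value 60 is treated as busy.
    return "busy"


first_busyness_data = [12.5, 43.0, 87.2]
print([busyness_band(b) for b in first_busyness_data])  # ['idle', 'normal', 'busy']
```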
  • the busyness algorithm can adopt an existing algorithm; for example, a DTS (Decision Trees) algorithm can be adopted.
  • other classification and regression algorithms, such as K clustering and convolutional neural networks, may also be used.
  • S204 Perform decision processing on the first health degree data and the first busy degree data according to the pre-stored decision rules, obtain the first decision result corresponding to the container group to be regulated, and regulate the containers in the container group to be regulated according to the first decision result .
  • a decision can be made based on the two dimensions of health degree and busyness to obtain the decision result of the container group to be regulated, which determines whether the container group to be regulated needs to be regulated and, if so, how.
  • performing decision processing on the first health degree data and the first busyness degree data according to the pre-stored decision rules, to obtain the first decision result corresponding to the container group to be regulated which may specifically include:
  • the average busyness and the standard deviation of busyness are determined according to the first busyness data.
  • the discrete containers in the group of containers to be regulated are determined according to the average busyness and the standard deviation of busyness.
  • Decision processing is performed according to the discrete containers in the container group to be regulated, the average busyness value, and the first health degree data to obtain a first decision result corresponding to the container group to be regulated.
  • the average busyness and the standard deviation of the busyness can be determined from the first busyness data, and whether the container group to be regulated contains discrete containers can then be determined from the average busyness and the standard deviation of the busyness. If there are discrete containers in the container group to be regulated, the busyness of the container group is considered unbalanced, and decision processing can then be performed based on the discrete containers in the container group to be regulated, the average busyness, and the first health degree data to obtain the first decision result corresponding to the container group to be regulated.
  • for example, let the average busyness be a, the standard deviation of the busyness be d, and the busyness of a container be b. It can first be determined whether the busyness b of the container falls outside 2 standard deviations, that is, whether b - a > d * 2. Containers that fall outside 2 standard deviations can be called discrete containers.
  • the analysis is performed on the entire container group instead of a single container.
  • the standard deviation + average value filtering scheme can be used to quickly determine whether any container in the container group behaves inconsistently with the other containers.
  • the discrete container is where the fault point lies, which improves the efficiency and accuracy of locating the faulty container; the quick decision to stop the discrete container also speeds up recovery from system abnormalities, thereby ensuring the stable operation of all financial services.
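  • The standard deviation + average value check can be sketched as follows. The function name `find_discrete_containers` is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say which estimator is used. The one-sided test b - a > d * 2 follows the text.

```python
from statistics import mean, pstdev


def find_discrete_containers(busyness):
    """Return the indices of containers whose busyness b satisfies b - a > 2 * d."""
    a = mean(busyness)    # average busyness of the container group
    d = pstdev(busyness)  # assumption: population standard deviation
    return [i for i, b in enumerate(busyness) if b - a > 2 * d]


# Nine containers at busyness 30 and one at 95: a = 36.5, d = 19.5.
scores = [30] * 9 + [95]
print(find_discrete_containers(scores))  # [9]
```

  • With very small groups no container can exceed two population standard deviations above the mean (for five containers the largest possible ratio is about 1.79), so a check of this kind only becomes informative once the group has at least six instances.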
  • Table 1 is the first decision result table corresponding to each container in the container group to be regulated. As shown in Table 1, each container is regulated, for example shut down or expanded, according to the average busyness a, whether there are discrete containers, and the first health degree data of the container group to be regulated.
  • Table 1: The first decision result table corresponding to each container in the container group to be regulated
  • FIG. 3 is a schematic diagram of the application of the algorithm module provided by the embodiment of the present application.
  • the algorithm module may include five sub-modules: a data input sub-module, a normalization processing sub-module, a health degree algorithm sub-module, a busyness algorithm sub-module, and a decision-making sub-module.
  • the data input sub-module can accept a two-dimensional array as input. Each row of the two-dimensional array is the real-time monitoring data of a single container, that is, the initial container group data. The rows for multiple containers form the two-dimensional array, and every container contained in the two-dimensional array belongs to the same container group.
  • the normalization processing sub-module can convert data of various dimensions into floating-point numbers with a fixed value range, that is, normalize the initial container group data into container group data. The container group data is then input into the health degree algorithm sub-module and the busyness algorithm sub-module, which respectively produce the first health degree data and the first busyness data in the form of one-dimensional arrays. Finally, the first health degree data and the first busyness data are input to the decision-making sub-module for decision processing to obtain the first decision result. Dividing the prediction of container indicators into the two dimensions of health and busyness makes the prediction of the running state of the container more comprehensive and accurate.
  • the method may further include:
  • the initial container group data is processed according to preset training data processing rules to obtain real-time training data.
  • the health degree algorithm and the busyness algorithm can also be updated at the same time. That is, the newly obtained initial container group data corresponding to the container group to be regulated can be preprocessed to obtain real-time training data, the real-time training data can then be merged with the previous historical training data to obtain new training data, and the pre-trained health degree algorithm and busyness algorithm can be updated and trained on the new training data to obtain a new health degree algorithm and a new busyness algorithm.
  • the data of the current day, as calculated by the algorithm, is used as training data, which reduces the cost of creating training data and allows the training data to be continuously updated as the business develops.
  • the initial container group data is processed according to preset training data processing rules to obtain real-time training data, which may specifically include:
  • the initial container group data is processed according to the pre-trained health degree algorithm, busyness degree algorithm and decision rule to obtain second health degree data, second busyness degree data and a second decision result.
  • the initial container group data is analyzed and processed according to the pre-stored exception data processing rule, the second health degree data, the second busyness degree data and the second decision result to obtain abnormal training data and normal training data.
  • Real-time training data is obtained according to the abnormal training data and the normal training data.
  • the initial container group data can be obtained first, and the newly obtained initial container group data needs to be used, together with the historical training data, as training data to train the health degree algorithm and the busyness algorithm.
  • since the newly acquired initial container group data only includes monitoring data and configuration data, and does not include the health degree, busyness, and decision results of the containers to be regulated, the newly acquired initial container group data first needs to be passed through the algorithm module to obtain the second health degree, the second busyness, and the second decision result. The initial container group data is then processed according to the second health degree, the second busyness, and the second decision result to obtain real-time training data, and the real-time training data is merged with the historical training data to obtain new training data.
  • the historical training data has already gone through the above steps, so its three predicted values (health degree, busyness, and decision result) have been saved.
  • the initial container group data when the initial container group data is processed according to the second health degree, the second busyness degree and the second decision result to obtain real-time training data, the initial container group data can be classified to obtain abnormal training data and normal training data, Then, the health degree algorithm and the busy degree algorithm are jointly updated according to the abnormal training data and the normal training data, which avoids the overfitting of the health degree algorithm and the busy degree algorithm to the abnormal scene, but at the same time maintains a certain sensitivity.
  • analyzing and processing the initial container group data according to the pre-stored exception data processing rules, the second health degree data, the second busyness data, and the second decision result to obtain abnormal training data and normal training data may specifically include:
  • according to the pre-stored regression filtering rules and the second health degree data, first target container data whose health degree is lower than the preset health degree threshold is selected from the initial container group data, and at the same time, according to the pre-stored regression filtering rules and the second busyness data, second target container data whose busyness is higher than a preset busyness threshold is selected from the initial container group data.
  • the third target container data is acquired from the initial container group data according to a pre-stored threshold filtering rule.
  • the fourth target container data corresponding to the abnormality identifier is obtained from the initial container group data according to the pre-stored production abnormality filtering rules.
  • the initial container group data can be screened from three dimensions: regression filtering, threshold filtering, and production abnormal filtering.
  • the initial container group data can be sorted by busyness according to the pre-stored regression filtering rules and the second busyness data, and then the second target container data whose busyness is higher than the preset busyness threshold can be selected.
  • the first target container data whose health degree is lower than the preset health degree threshold can be selected from the initial container group data according to the pre-stored regression filtering rule and the second health degree data.
  • the busyness threshold and the health threshold can be defined according to the required sample volume.
  • the threshold filtering rules can be: the host CPU usage exceeds 80%, the host memory usage exceeds 80%, or the container CPU usage exceeds 90%.
  • the corresponding initial container group data can be obtained through threshold filtering to obtain the third target container data.
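  • A minimal sketch of threshold filtering with the three example rules above is given below. The record layout and the field names (`host_cpu_usage`, `host_memory_usage`, `container_cpu_usage`) are hypothetical, and usage is assumed to be stored as a fraction between 0 and 1.

```python
def exceeds_threshold(sample: dict) -> bool:
    """Return True if any of the three example threshold rules is triggered."""
    return (
        sample["host_cpu_usage"] > 0.80
        or sample["host_memory_usage"] > 0.80
        or sample["container_cpu_usage"] > 0.90
    )


def threshold_filter(samples):
    """Select the third target container data from the initial container group data."""
    return [sample for sample in samples if exceeds_threshold(sample)]


samples = [
    {"container": "c1", "host_cpu_usage": 0.85, "host_memory_usage": 0.40, "container_cpu_usage": 0.30},
    {"container": "c2", "host_cpu_usage": 0.50, "host_memory_usage": 0.50, "container_cpu_usage": 0.50},
    {"container": "c3", "host_cpu_usage": 0.10, "host_memory_usage": 0.20, "container_cpu_usage": 0.95},
]
print([s["container"] for s in threshold_filter(samples)])  # ['c1', 'c3']
```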
  • the production anomaly filtering rule may be that after a real event occurs, corresponding data is filtered from the initial container group data according to the abnormal identifier of the real event occurrence, so as to obtain the fourth target container data.
  • the initial container group data may use the exception identifier as an index for locating a piece of data.
  • the abnormality identifier may be a time period and the involved container group; once the time period in which the real event occurred and the involved container group are determined, the fourth target container data can be uniquely obtained from the initial container group data.
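  • The lookup by abnormality identifier can be sketched as follows, assuming each record carries a container group name and a timestamp. The function name `production_anomaly_filter` and the field names `group` and `time` are hypothetical, and treating the time period as a closed interval is an assumption.

```python
from datetime import datetime


def production_anomaly_filter(samples, group, start, end):
    """Select records of the involved container group within the event's time period."""
    return [
        sample
        for sample in samples
        if sample["group"] == group and start <= sample["time"] <= end
    ]


samples = [
    {"group": "pay", "time": datetime(2022, 1, 1, 10, 0)},
    {"group": "pay", "time": datetime(2022, 1, 1, 11, 30)},
    {"group": "auth", "time": datetime(2022, 1, 1, 10, 5)},
]
fourth_target = production_anomaly_filter(
    samples, "pay", datetime(2022, 1, 1, 9, 55), datetime(2022, 1, 1, 10, 15)
)
print(len(fourth_target))  # 1
```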
  • regression filtering finds anomalies in the dimensions of health and busyness, threshold filtering finds anomalies in the dimension of the original indicators, and production anomaly filtering finds anomalies in the dimension of actual anomalies that have occurred.
  • the specific method of aggregation can be as follows: for the target container data, if they belong to the same container group and the time difference does not exceed the preset duration, then the target container data with an earlier time can be selected as the real abnormal training data.
  • the preset duration can be any value in 3-10 minutes.
  • for regression filtering and threshold filtering, a single container is the filtering target, so the data of the container group associated with that single container can be obtained correspondingly and used as abnormal training data.
  • manual confirmation marking can also be performed, and manual confirmation marking can provide a user interface for operation and maintenance personnel.
  • the user interface can display the monitoring data and configuration data of the same container group (that is, the initial container group data).
  • prompts from the three abnormal data filtering schemes can also be displayed, so that the operation and maintenance personnel can confirm whether the prediction results (that is, the second health data, the second busyness data, and the second decision result) and the abnormal training data need to be adjusted.
  • this application does not require the operation and maintenance personnel to confirm and adjust all of the filtered abnormal training data; data that has not been confirmed can simply be output according to the prediction results calculated by the algorithm. This improves the accuracy of the abnormal training data and the prediction results, and in turn the accuracy of subsequent updates to the health degree algorithm and the busyness degree algorithm.
  • the real-time training data is the collection of abnormal training data and normal training data obtained after calculation by the algorithm module and manual confirmation marking. It is temporary data that will later be persisted and merged into the historical training database.
  • the source of the normal training data in the real-time training data is also the output of the algorithm module.
  • the specific acquisition process can be as follows: if the data volume of the abnormal training data obtained after manual confirmation marking is N, then N pieces of data that do not overlap with the abnormal training data can likewise be extracted from the output of the algorithm module and used as normal training data.
  • new training data is obtained according to pre-acquired historical training data and real-time training data, which may specifically include:
  • the initial historical training data is screened to obtain the historical training data.
  • the historical training data may include data accumulated in the past several days, plus real-time training data newly added on the current day.
  • the historical training data can contain a small amount of initial artificial training data.
  • the initial artificial training data fabricates several scenarios, so that the health degree algorithm and the busyness degree algorithm can be initialized and trained from scratch, yielding a health degree algorithm and a busyness degree algorithm with a certain predictive ability. The initial artificial training data is not strictly necessary, however: in the first few days, after several rounds of manual confirmation marking and training, a health degree algorithm and a busyness degree algorithm with a certain predictive ability can be formed quickly, so that self-updating of the two algorithms can begin.
  • in this way, the health degree algorithm and the busyness degree algorithm can be started and put into use quickly without manually creating initial training data, and they mature rapidly after a few rounds, which greatly reduces the training cost of both algorithms.
  • historical training data will accumulate more and more.
  • a part of historical training data can be selected to participate in training.
  • the historical training data can be obtained according to the pre-stored data ratio relationship in different time periods.
  • the data ratio relationship across time periods can be such that the data from [within 1 month, 1 to 3 months, 3 to 6 months, 6 to 12 months, more than 12 months] accounts for [50%, 30%, 10%, 5%, 5%] respectively. Alternatively, all data from the past month can participate in training in full, without screening.
  • historical training data can also be input into the health degree algorithm and the busyness degree algorithm respectively to train the health degree algorithm and the busyness degree algorithm.
  • the method of training optimization can be to randomly select 80% of the data as training, 20% of the data as verification, and select the model with the highest verification score after multiple trainings. It should be noted that this application only lists and illustrates a specific proportional relationship between training data and verification data, and methods of training and verifying according to other proportional relationships are also within the protection scope of this application.
  • Fig. 4 is a schematic diagram of the application of the training model provided by the embodiment of the present application. As shown in Fig. 4, this embodiment may include first obtaining monitoring data and configuration data, where the monitoring data may be the real-time monitoring indicators corresponding to the container group to be regulated: host indicators, container indicators, process indicators, business indicators, etc.
  • the configuration data can be the basic information of the container, specifically the container group to which the container belongs, the host to which the container belongs, and the physical location of the host.
  • Configuration data and monitoring data can be collectively referred to as initial container group data, and the initial container group data added every day can be imported in batches into the historical database, that is, the historical database contains (monitoring data + configuration data) at each point in time.
  • the newly acquired initial container group data can be input into the algorithm module with the same structure as that in Figure 3.
  • the algorithm module in this embodiment has exactly the same structure as the algorithm module in Figure 3, but the two are independent of each other and do not affect each other during operation.
  • the second health degree data, the second busy degree data and the second decision result can be obtained, and then the second health degree data, the second busy degree data and the second decision result can be written together into the historical database.
  • the initial container group data can be filtered according to the regression filtering rules, threshold filtering rules, and production abnormality filtering rules to obtain abnormal training data and normal training data, and from these the real-time training data; new training data can then be obtained from the real-time training data and the historical training data, and the algorithm is trained and updated according to the new training data.
  • the algorithm is used to calculate the data of the current day, which then serves as training data; this not only reduces the cost of creating training data, but also allows the training data to be updated continuously as the business develops.
  • Fig. 5 is a schematic structural diagram of the container group control device provided by the embodiment of the present application. As shown in Fig. 5, the device provided by this embodiment may include:
  • the obtaining module 501 is configured to obtain initial container group data corresponding to the container group to be regulated.
  • the processing module 502 is configured to perform normalization processing on the initial container group data to obtain container group data.
  • the container group to be regulated includes a plurality of containers
  • the initial container group data includes at least one data index of each container
  • the processing module 502 is further configured to:
  • the value and preset value range of the target data indicator are obtained.
  • the container group data is obtained according to the values of the converted target data indicators.
  • the processing module 502 is further configured to process the container group data according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated, and at the same time to process the container group data according to the pre-trained busyness degree algorithm to obtain the first busyness data corresponding to the container group to be regulated.
  • processing module 502 is further configured to:
  • processing module 502 is also used for:
  • the busyness prediction is performed on the container group data in the form of a two-dimensional array to obtain the first busyness data, wherein the first busyness data is in the form of a one-dimensional array, and each variable in the one-dimensional array represents how busy a container is.
  • the processing module 502 is further configured to perform decision processing on the first health degree data and the first busyness degree data according to pre-stored decision rules, to obtain a first decision result corresponding to the container group to be regulated, and according to The first decision result regulates the containers in the container group to be regulated.
  • processing module 502 is further configured to:
  • the average busyness and the standard deviation of busyness are determined according to the first busyness data.
  • the discrete containers in the group of containers to be regulated are determined according to the average busyness and the standard deviation of busyness.
  • Decision processing is performed according to the discrete containers in the container group to be regulated, the average busyness value, and the first health degree data to obtain a first decision result corresponding to the container group to be regulated.
  • processing module 502 is further configured to:
  • the initial container group data is processed according to preset training data processing rules to obtain real-time training data.
  • processing module 502 is further configured to:
  • the initial container group data is processed according to the pre-trained health degree algorithm, busyness degree algorithm and decision rule to obtain second health degree data, second busyness degree data and a second decision result.
  • the initial container group data is analyzed and processed according to the pre-stored exception data processing rule, the second health degree data, the second busyness degree data and the second decision result to obtain abnormal training data and normal training data.
  • Real-time training data is obtained according to the abnormal training data and the normal training data.
  • processing module 502 is also used for:
  • according to the pre-stored regression filtering rules and the second health degree data, the first target container data whose health degree is lower than the preset health degree threshold is selected from the initial container group data, and at the same time, according to the pre-stored regression filtering rules and the second busyness data, the second target container data whose busyness is higher than the preset busyness threshold is selected from the initial container group data.
  • the third target container data is acquired from the initial container group data according to a pre-stored threshold filtering rule.
  • the fourth target container data corresponding to the abnormality identifier is obtained from the initial container group data according to the pre-stored production abnormality filtering rule.
  • processing module 502 is further configured to:
  • the initial historical training data is screened to obtain the historical training data.
  • the device provided in the embodiment of the present application can implement the method in the above embodiment as shown in FIG. 2 , and its implementation principle and technical effect are similar, and will not be repeated here.
  • FIG. 6 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present application.
  • a device 600 provided by this embodiment includes a processor 601 and a memory 602 communicatively connected to the processor. The processor 601 and the memory 602 are connected through a bus 603.
  • the processor 601 executes the computer-executed instructions stored in the memory 602, so that the processor 601 executes the container group regulation method in the foregoing method embodiments.
  • the processor may be a central processing unit (English: Central Processing Unit, referred to as: CPU), can also be other general-purpose processors, digital signal processors (English: Digital Signal Processor, referred to as: DSP), application-specific integrated circuits (English: Application Specific Integrated Circuit, referred to as: ASIC), etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like. The steps of the method disclosed in conjunction with the invention can be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
  • the memory may include high-speed RAM memory, and may also include non-volatile storage NVM, such as at least one disk memory.
  • the bus can be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, etc.
  • the bus can be divided into address bus, data bus, control bus and so on.
  • the buses in the drawings of the present application are not limited to only one bus or one type of bus.
  • An embodiment of the present application further provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the container group control method in the foregoing method embodiment is implemented.
  • An embodiment of the present application further provides a computer program product, including a computer program, and when the computer program is executed by a processor, implements the container group control method as described above.
  • the above-mentioned computer-readable storage medium can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Readable storage media can be any available media that can be accessed by a general purpose or special purpose computer.
  • An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium.
  • the readable storage medium can also be a component of the processor.
  • the processor and the readable storage medium can be located in an application-specific integrated circuit (Application Specific Integrated Circuits, referred to as: ASIC).
  • the processor and the readable storage medium can also exist in the device as discrete components.
  • the aforementioned program can be stored in a computer-readable storage medium.
  • when executed, the program performs the steps of the above-mentioned method embodiments; the aforementioned storage medium includes ROM, RAM, magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

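An embodiment of the present application provides a container group regulation method, apparatus, and electronic device. The method includes: obtaining initial container group data corresponding to a container group to be regulated; normalizing the initial container group data to obtain container group data; processing the container group data according to a pre-trained health degree algorithm to obtain first health degree data corresponding to the container group to be regulated, and at the same time processing the container group data according to a pre-trained busyness degree algorithm to obtain first busyness data corresponding to the container group to be regulated; performing decision processing on the first health degree data and the first busyness data according to pre-stored decision rules to obtain a first decision result corresponding to the container group to be regulated; and regulating the containers in the container group to be regulated according to the first decision result. This embodiment both increases the accuracy of the decision result and reduces labor consumption, thereby improving the accuracy and efficiency of container regulation.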

Description

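Container group regulation method, apparatus, and electronic device
This application claims priority to Chinese patent application No. 2021114041878, entitled "Container group regulation method, apparatus, and electronic device", filed with the China Patent Office on November 24, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present application relate to the field of operation and maintenance technology, and in particular to a container group regulation method, apparatus, and electronic device.
Background
With the development of computer technology, more and more technologies are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (Fintech). Operation and maintenance technology is no exception, but the security and real-time requirements of the financial industry also place higher demands on it. As financial services grow, the number of container groups in the systems that run them keeps increasing, and the number of monitoring indicators is growing rapidly as well.
In the prior art, monitoring container groups requires alarm thresholds to be configured manually in advance; the system then runs each financial service according to the preconfigured alarm thresholds and produces container group monitoring results. Staff must then check the monitoring results at regular intervals to determine whether each financial service is running normally, and must also decide, based on the monitoring results, whether a container group needs to be scaled out, scaled in, or shut down.
However, relying purely on manual work to monitor and adjust container groups consumes a great deal of labor and is highly subjective, which reduces the efficiency and accuracy of container regulation and in turn affects the normal operation of the financial services.
Technical Solution
The purpose of the present application is to provide a container group regulation method, apparatus, and electronic device, so as to improve the efficiency and accuracy of container regulation.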
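In a first aspect, an embodiment of the present application provides a container group regulation method, including:
obtaining initial container group data corresponding to a container group to be regulated;
normalizing the initial container group data to obtain container group data;
processing the container group data according to a pre-trained health degree algorithm to obtain first health degree data corresponding to the container group to be regulated, and at the same time processing the container group data according to a pre-trained busyness degree algorithm to obtain first busyness data corresponding to the container group to be regulated;
performing decision processing on the first health degree data and the first busyness data according to pre-stored decision rules to obtain a first decision result corresponding to the container group to be regulated, and regulating the containers in the container group to be regulated according to the first decision result.
Optionally, the container group to be regulated includes a plurality of containers, the initial container group data includes at least one data indicator of each container, and normalizing the initial container group data to obtain container group data includes:
for each target data indicator, obtaining the value of the target data indicator and its preset value range;
performing normalization conversion according to the value of the target data indicator and the preset value range of the target data indicator, to determine the converted value of the target data indicator;
obtaining the container group data according to the converted values of the target data indicators.
Optionally, processing the container group data according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated includes:
performing health degree prediction on the container group data, which is in the form of a two-dimensional array, according to the pre-trained health degree algorithm to obtain the first health degree data, where the first health degree data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the health degree of one container.
Optionally, processing the container group data according to the pre-trained busyness degree algorithm to obtain the first busyness data corresponding to the container group to be regulated includes:
performing busyness prediction on the container group data, which is in the form of a two-dimensional array, according to the pre-trained busyness degree algorithm to obtain the first busyness data, where the first busyness data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the busyness of one container.
Optionally, performing decision processing on the first health degree data and the first busyness data according to the pre-stored decision rules to obtain the first decision result corresponding to the container group to be regulated includes:
determining a busyness average and a busyness standard deviation according to the first busyness data;
determining discrete containers in the container group to be regulated according to the busyness average and the busyness standard deviation;
performing decision processing according to the discrete containers in the container group to be regulated, the busyness average, and the first health degree data, to obtain the first decision result corresponding to the container group to be regulated.
Optionally, after performing decision processing on the first health degree data and the first busyness data according to the pre-stored decision rules to obtain the first decision result corresponding to the container group to be regulated, and regulating the containers in the container group to be regulated according to the first decision result, the method further includes:
obtaining the initial container group data corresponding to the container group to be regulated;
processing the initial container group data according to preset training data processing rules to obtain real-time training data;
obtaining new training data according to pre-acquired historical training data and the real-time training data, and performing update training on the pre-trained health degree algorithm and busyness degree algorithm according to the new training data, to obtain a new health degree algorithm and a new busyness degree algorithm.
Optionally, processing the initial container group data according to the preset training data processing rules to obtain real-time training data includes:
processing the initial container group data according to the pre-trained health degree algorithm, the busyness degree algorithm, and the decision rules to obtain second health degree data, second busyness data, and a second decision result;
analyzing the initial container group data according to pre-stored abnormal data processing rules, the second health degree data, the second busyness data, and the second decision result, to obtain abnormal training data and normal training data;
obtaining the real-time training data according to the abnormal training data and the normal training data.
Optionally, analyzing the initial container group data according to the pre-stored abnormal data processing rules, the second health degree data, the second busyness data, and the second decision result to obtain abnormal training data and normal training data includes:
selecting, from the initial container group data, first target container data whose health degree is lower than a preset health degree threshold according to pre-stored regression filtering rules and the second health degree data, and at the same time selecting, from the initial container group data, second target container data whose busyness is higher than a preset busyness threshold according to the pre-stored regression filtering rules and the second busyness data;
obtaining third target container data from the initial container group data according to pre-stored threshold filtering rules;
obtaining, from the initial container group data, fourth target container data corresponding to an abnormality identifier according to pre-stored production abnormality filtering rules;
performing data fusion on the first target container data, the second target container data, the third target container data, and the fourth target container data to obtain the abnormal training data;
obtaining the normal training data according to the initial container group data and the abnormal training data.
Optionally, obtaining new training data according to the pre-acquired historical training data and the real-time training data includes:
screening initial historical training data according to a pre-stored data ratio relationship across different time periods to obtain the historical training data;
obtaining the new training data according to the historical training data and the real-time training data.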
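In a second aspect, an embodiment of the present application provides a container group regulation apparatus, including:
an obtaining module, configured to obtain initial container group data corresponding to a container group to be regulated;
a processing module, configured to normalize the initial container group data to obtain container group data;
the processing module being further configured to process the container group data according to a pre-trained health degree algorithm to obtain first health degree data corresponding to the container group to be regulated, and at the same time to process the container group data according to a pre-trained busyness degree algorithm to obtain first busyness data corresponding to the container group to be regulated;
the processing module being further configured to perform decision processing on the first health degree data and the first busyness data according to pre-stored decision rules to obtain a first decision result corresponding to the container group to be regulated, and to regulate the containers in the container group to be regulated according to the first decision result.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory communicatively connected to the processor;
the memory stores computer-executable instructions;
the processor executes the computer-executable instructions stored in the memory to implement the container group regulation method according to any item of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the container group regulation method according to any item of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product including a computer program which, when executed by a processor, implements the container group regulation method according to any item of the first aspect.
The embodiments of the present application provide a container group regulation method, apparatus, and electronic device. With the above solution, the initial container group data corresponding to the container group to be regulated can first be obtained and normalized to obtain container group data. The container group data is then processed according to the pre-trained health degree algorithm to obtain the first health degree data, and at the same time processed according to the pre-trained busyness degree algorithm to obtain the first busyness data. Finally, decision processing is performed on the first health degree data and the first busyness data according to the pre-stored decision rules to obtain the first decision result corresponding to the container group to be regulated, and the containers in the container group to be regulated are regulated according to that result. By making decisions jointly from the two dimensions of health degree data and busyness data, this approach both increases the accuracy of the decision result and reduces labor consumption, thereby improving the accuracy and efficiency of container regulation and ensuring the normal operation of the financial services.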
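Brief Description of the Drawings
Fig. 1 is a schematic architecture diagram of an application system for the container group regulation method provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of the container group regulation method provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of the application of the algorithm module provided by an embodiment of the present application;
Fig. 4 is a schematic diagram of the application of the training model provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of the container group regulation apparatus provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present application.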
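Embodiments of the Invention
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", "third", "fourth", and so on (if present) in the specification, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments described here can also be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
In the prior art, each container group must be monitored to determine its working state and then regulated accordingly. When monitoring container groups, alarm thresholds must first be configured manually; the system then runs each financial service according to the preconfigured alarm thresholds and produces container group monitoring results. Operation and maintenance personnel must then check the monitoring results at regular intervals to determine whether each financial service is running normally, and must also decide, based on the monitoring results, whether a container group needs to be scaled out, scaled in, or shut down. However, as the business grows, the number of container groups to be monitored and regulated keeps increasing. Relying purely on manual work to monitor and adjust container groups consumes a great deal of labor and is highly subjective, which reduces the efficiency and accuracy of container regulation and in turn affects the normal operation of the financial services.
In view of the above technical problem, the present application makes decisions jointly from the two dimensions of the health degree data and the busyness data corresponding to the container group data. This both increases the accuracy of the decision result and reduces labor consumption, thereby improving the accuracy and efficiency of container regulation and ensuring the normal operation of the financial services.
Fig. 1 is a schematic architecture diagram of an application system for the container group regulation method provided by an embodiment of the present application. As shown in Fig. 1, the application system includes a container group to be regulated 101 and a server 102. A pre-trained health degree algorithm A and a pre-trained busyness degree algorithm B are deployed on the server 102. The server 102 can obtain initial container group data from the container group to be regulated 101 and normalize it to obtain container group data in a unified format. It then processes the container group data with the pre-trained health degree algorithm A and busyness degree algorithm B respectively to obtain processing results, further processes those results according to the pre-stored decision rules to obtain a decision result, and regulates the containers in the container group to be regulated 101 according to the decision result.
The container group to be regulated 101 may contain a plurality of containers; specifically, it is a set of container instances responsible for the same business processing logic, and these containers can substitute for one another and share the processing traffic. There may be one or more container groups to be regulated 101; in this example, the server 102 can regulate multiple container groups to be regulated 101 at the same time.
The technical solutions of the present application are described in detail below with specific embodiments. The following specific embodiments may be combined with one another, and identical or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flowchart of the container group regulation method provided by an embodiment of the present application. The method of this embodiment may be executed by the server 102. As shown in Fig. 2, the method of this embodiment may include: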
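S201: Obtain the initial container group data corresponding to the container group to be regulated.
In this embodiment, the container group to be regulated contains a plurality of containers. Before the container group is regulated, the initial data of each container can be obtained to form the initial container group data corresponding to the container group to be regulated.
Further, the initial container group data may be in the form of a two-dimensional array. Each row of the two-dimensional array is the initial data of the container instance corresponding to a single container, and the rows of multiple containers make up the two-dimensional array. In addition, every container included in the two-dimensional array belongs to the same container group, and all instances of that container group are included in the two-dimensional array.
The initial data may be real-time monitoring data, which contains at least one data indicator among host indicators, container indicators, process indicators, business indicators, and container basic information.
Correspondingly, host indicators are the performance indicators of the host on which the container runs, and may include: CPU (total CPU, CPU usage), memory (total memory, memory usage, buffer size, cache size), disk (disk space, disk reads and writes, disk inodes), network (network traffic, number of network connections, number of available ports), and so on.
Container indicators are the performance indicators of the container at run time, and may include: CPU (total CPU, CPU usage, CPU throttling rate), memory (total memory, memory usage, buffer or cache size), disk (disk space, disk reads and writes, disk inodes), network (network traffic, number of network connections, number of available ports), and so on.
Process indicators are the various indicators of the business process at run time, and may include the number of active threads, the number of open files, and so on.
Business indicators are the indicators produced when the container application performs business processing, and may include: transaction volume (transactions per minute), success rate (system success rate, business success rate, number of failures), elapsed time (average elapsed time, maximum elapsed time), and so on.
Container basic information describes where the container sits in the overall environment, and may include: the container group to which the container belongs, the system to which the container group belongs, the host to which the container belongs, the physical location of the host, and so on.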
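S202: Normalize the initial container group data to obtain container group data.
In this embodiment, after the initial container group data is obtained, the data types within it may differ from one another. To prevent data of different orders of magnitude from biasing the weights of different data during training, the initial container group data can be normalized to obtain the container group data.
Further, the container group to be regulated includes a plurality of containers and the initial container group data includes at least one data indicator of each container, so normalizing the initial container group data to obtain container group data may specifically include:
For each target data indicator, obtaining the value of the target data indicator and its preset value range.
Performing normalization conversion according to the value of the target data indicator and the preset value range of the target data indicator, to determine the converted value of the target data indicator.
Obtaining the container group data according to the converted values of the target data indicators.
Specifically, for each target data indicator, data indicators of different dimensions can be converted into floating-point numbers within a fixed value range. For example, the fixed value range may be [0, 1].
During conversion, the preset value range of each target data indicator, that is, [min, max], can be obtained first. In one implementation, the preset value range is entered manually; for example, depending on the actual application scenario, the value range may be manually specified as [0, 1000000] in MB. In another implementation, the preset value range is determined from historical values within a preset time period; for example, if the maximum value observed in the past month is 512 GB, the preset value range can be set to [0, 524288] in MB (where 524288 MB = 512 GB * 1024).
After the preset value range is obtained, normalization conversion can be performed according to the value of the target data indicator and its preset value range to determine the converted value. For example, let a be the value of the target data indicator and b the normalized value; normalization can then be performed according to the following expressions: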
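If a <= min, then b = 0.
If min < a < max, then b = (a - min) / (max - min).
If a >= max, then b = 1.
Here, the target data indicator may be a host indicator, a container indicator, a process indicator, or a business indicator.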
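The three-branch normalization rule above can be sketched in Python as follows. This is only an illustrative sketch: the function names `normalize` and `normalize_group` and the per-column list of `(min, max)` ranges are assumptions introduced here, not names taken from the application.

```python
def normalize(a: float, lo: float, hi: float) -> float:
    """Clamp-and-scale a raw indicator value a into [0, 1] using the preset range [lo, hi]."""
    if hi <= lo:
        raise ValueError("preset range must satisfy min < max")
    if a <= lo:          # at or below the preset minimum
        return 0.0
    if a >= hi:          # at or above the preset maximum
        return 1.0
    return (a - lo) / (hi - lo)


def normalize_group(rows, ranges):
    """Normalize a two-dimensional array: one row per container, one column per indicator.

    ranges holds one (min, max) pair per column (hypothetical layout).
    """
    return [[normalize(v, lo, hi) for v, (lo, hi) in zip(row, ranges)] for row in rows]
```

For instance, with the memory range [0, 524288] MB mentioned above, a reading of 262144 MB maps to 0.5, and any reading above the range is clamped to 1.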
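In addition, for container basic information, string identifiers can be converted into unique numeric values according to certain conversion rules, specifically as follows:
The physical location of a host can generally be expressed as machine room + row + cabinet + slot, and the original value is a fixed-length string. For example, if the machine room takes the values [IDC1, IDC4, IDC2, IDC9], they are sorted and numbered with natural numbers, so that IDC1 becomes 1, IDC2 becomes 2, IDC4 becomes 3, and IDC9 becomes 4. If the cabinet takes the values [A01, A02, B01, B02], they are likewise sorted and numbered with natural numbers, giving [A01=1, A02=2, B01=3, B02=4].
For the host to which a container belongs, the original value is generally a fixed-asset number, which can also be sorted and numbered with natural numbers in the same way.
For the system to which a container group belongs, a system number already exists and can be used directly.
For the container group to which a container belongs, the container group names can also be sorted and numbered with natural numbers.
With this conversion, for a given attribute the magnitude of the number does not matter; what matters is only whether two values belong to the same cabinet, host, subsystem, and so on. In other words, the membership relationship is preserved during conversion, which improves the efficiency and convenience of subsequent data identification.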
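The sort-then-number conversion described above can be sketched as follows. The helper name `encode_identifiers` is hypothetical, and plain lexicographic sorting is an assumption; it works for the fixed-length identifiers described in the text, but would order a value such as "IDC10" before "IDC2".

```python
def encode_identifiers(values):
    """Map each distinct string identifier to its 1-based rank in sorted order."""
    return {name: rank for rank, name in enumerate(sorted(set(values)), start=1)}


# Examples taken from the text: machine rooms and cabinets.
rooms = encode_identifiers(["IDC1", "IDC4", "IDC2", "IDC9"])
cabinets = encode_identifiers(["A01", "A02", "B01", "B02"])
```

Two containers receive the same code for an attribute exactly when they share the same original identifier, which is the membership property the text relies on.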
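S203: Process the container group data according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated, and at the same time process the container group data according to the pre-trained busyness degree algorithm to obtain the first busyness data corresponding to the container group to be regulated.
In this embodiment, after the container group data is obtained, prediction can be performed on it jointly from the two dimensions of health degree and busyness to obtain the first health degree data and the first busyness data corresponding to the container group data.
Further, processing the container group data according to the pre-trained health degree algorithm to obtain the first health degree data corresponding to the container group to be regulated may specifically include:
Performing health degree prediction on the container group data, which is in the form of a two-dimensional array, according to the pre-trained health degree algorithm to obtain the first health degree data, where the first health degree data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the health degree of one container.
Specifically, the health of each container in the container group to be regulated can be predicted with the pre-trained health degree algorithm. The container group data may be a two-dimensional array and the first health degree data a one-dimensional array. The length of the one-dimensional array may equal the number of rows of the input two-dimensional array, that is, one health degree prediction is obtained for the real-time monitoring data of each container in the container group. For example, a health degree prediction may take the value 1 (healthy), 2 (sub-healthy), or 3 (abnormal), and the predictions of multiple containers form a one-dimensional array, namely the first health degree data.
The health degree algorithm may be an existing algorithm; for example, the SVM (Support Vector Machine) algorithm may be used.
In addition, processing the container group data according to the pre-trained busyness degree algorithm to obtain the first busyness data corresponding to the container group to be regulated may specifically include:
Performing busyness prediction on the container group data, which is in the form of a two-dimensional array, according to the pre-trained busyness degree algorithm to obtain the first busyness data, where the first busyness data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the busyness of one container.
Specifically, the busyness of each container in the container group to be regulated can be predicted with the pre-trained busyness degree algorithm. The container group data may be a two-dimensional array and the first busyness data a one-dimensional array. The length of the one-dimensional array may equal the number of rows of the input two-dimensional array, that is, one busyness prediction is obtained for the real-time monitoring data of each container in the container group. For example, a busyness prediction may lie in the interval (0, 100): a value in (0, 20) indicates that the container is idle, a value in (20, 60) indicates that the container is normal, and a value in (60, 100) indicates that the container is busy. The busyness predictions of multiple containers form a one-dimensional array, namely the first busyness data.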
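To make the two output arrays concrete, the following sketch interprets a health degree array and a busyness array of equal length, one entry per container. The text does not say which band the boundary values 20 and 60 belong to, so the half-open bands used here are an assumption, as are the names `busyness_label`, `HEALTH_LABELS`, and `describe`.

```python
# Health codes as given in the text: 1 healthy, 2 sub-healthy, 3 abnormal.
HEALTH_LABELS = {1: "healthy", 2: "sub-healthy", 3: "abnormal"}


def busyness_label(score: float) -> str:
    """Map a busyness score in [0, 100] to a band (boundary handling is an assumption)."""
    if not 0 <= score <= 100:
        raise ValueError("busyness score must lie in [0, 100]")
    if score < 20:
        return "idle"
    if score < 60:
        return "normal"
    return "busy"


def describe(health, busyness):
    """Pair up the two one-dimensional arrays, one entry per container."""
    if len(health) != len(busyness):
        raise ValueError("both arrays must have one entry per container")
    return [(HEALTH_LABELS[h], busyness_label(b)) for h, b in zip(health, busyness)]
```

For example, `describe([1, 3], [10, 75])` labels the first container as healthy and idle, and the second as abnormal and busy.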
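The busyness degree algorithm may be an existing algorithm; for example, the DTS (Decision Trees) algorithm may be used.
In addition, other classification and regression algorithms, such as K-clustering or convolutional neural networks, may also be used to determine the health degree and busyness. These are not limited in detail here, and the different ways of determining the health degree and busyness all fall within the protection scope of the present application.
In the prior art, different strategy branches must be designed for prediction for different business types (such as high CPU, high IO, batch, and online). Different container groups have different indicator characteristics, and ordinary algorithms cannot dynamically adapt to the characteristics of each indicator; one can only design algorithms in advance, build different models, and then select suitable indicators for verification. This leads to many branches, the need for manual classification and constant adjustment, and insufficient precision. With the above solution, predicting jointly from the two dimensions of health degree and busyness adapts to the various indicator characteristics of container groups without distinguishing business types, which improves the convenience and accuracy of handling different business types.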
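S204: Perform decision processing on the first health degree data and the first busyness data according to the pre-stored decision rules to obtain the first decision result corresponding to the container group to be regulated, and regulate the containers in the container group to be regulated according to the first decision result.
In this embodiment, after the predicted first health degree data and first busyness data are obtained, a decision can be made on the basis of the two dimensions of health degree and busyness to obtain the decision result for the container group to be regulated, which determines whether the container group needs to be regulated and how.
Further, performing decision processing on the first health degree data and the first busyness data according to the pre-stored decision rules to obtain the first decision result corresponding to the container group to be regulated may specifically include:
Determining the busyness average and the busyness standard deviation according to the first busyness data.
Determining the discrete containers in the container group to be regulated according to the busyness average and the busyness standard deviation.
Performing decision processing according to the discrete containers in the container group to be regulated, the busyness average, and the first health degree data, to obtain the first decision result corresponding to the container group to be regulated.
Specifically, after the first busyness data is obtained, the busyness average and the busyness standard deviation can first be determined from it, and these two values can then be used to determine whether the container group to be regulated contains discrete containers. If it does, the busyness of the container group is judged to be insufficiently balanced, and decision processing can then be performed according to the discrete containers, the busyness average, and the first health degree data to obtain the first decision result corresponding to the container group to be regulated. For example, let the busyness average be a, the busyness standard deviation be d, and the busyness of a container be b. It can first be determined whether any container's busyness b falls more than two standard deviations above the average, that is, b - a > d * 2; containers that fall outside two standard deviations can be called discrete containers.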
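The discrete-container (outlier) test b - a > d * 2 can be sketched as below. The text does not state whether the standard deviation is the population or the sample form; the population form (`statistics.pstdev`) is assumed here, and the function name `find_discrete_containers` is hypothetical.

```python
from statistics import fmean, pstdev


def find_discrete_containers(busyness, k=2.0):
    """Return the indices of containers whose busyness b satisfies b - a > k * d.

    a is the group average and d the population standard deviation (an assumption).
    The test is one-sided, as written in the text: only unusually busy containers match.
    """
    a = fmean(busyness)
    d = pstdev(busyness)
    return [i for i, b in enumerate(busyness) if b - a > k * d]
```

With nine containers at busyness 30 and one at 90, the average is 36 and the standard deviation is 18, so only the last container exceeds 36 + 2 * 18 and is reported. One consequence of this choice is worth noting: with the population standard deviation, a single container can deviate by at most sqrt(n - 1) standard deviations in a group of n, so under the strict test a group of five or fewer containers never yields a discrete container.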
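With the above solution, the container group is analyzed as a whole rather than container by container. Combined with the standard deviation plus average filtering scheme, it becomes possible to quickly detect whether the container group contains a discrete container whose running state is inconsistent with the others. Such a discrete container is usually where the fault lies, so this improves the efficiency and accuracy of locating faulty containers; quickly deciding to stop the discrete container also speeds up recovery from system anomalies and thus ensures the stable operation of the financial services.
Further, Table 1 is the first decision result table for the containers in the container group to be regulated. According to this table, each container can be shut down, scaled out, or otherwise regulated based on the busyness average a, whether there are discrete containers, and the first health degree data of the container group to be regulated.
Table 1: First decision result table for the containers in the container group to be regulated
[Table 1 is provided as an image in the original publication: Figure 177742dest_path_image001]
With the above solution, the initial container group data corresponding to the container group to be regulated can first be obtained and normalized to obtain container group data. The container group data is then processed according to the pre-trained health degree algorithm to obtain the first health degree data, and at the same time processed according to the pre-trained busyness degree algorithm to obtain the first busyness data. Finally, decision processing is performed on the first health degree data and the first busyness data according to the pre-stored decision rules to obtain the first decision result corresponding to the container group to be regulated, and the containers in the container group to be regulated are regulated according to that result. By making decisions jointly from the two dimensions of health degree data and busyness data, this approach both increases the accuracy of the decision result and reduces labor consumption, thereby improving the accuracy and efficiency of container regulation and ensuring the normal operation of the financial services.
Based on the method of Fig. 2, the embodiments of this specification also provide some specific implementations of the method, which are described below.
In another embodiment, Fig. 3 is a schematic diagram of the application of the algorithm module provided by an embodiment of the present application. As shown in Fig. 3, in this embodiment the algorithm module may include five submodules: a data input submodule, a normalization submodule, a health degree algorithm submodule, a busyness degree algorithm submodule, and a decision submodule. The data input submodule accepts a two-dimensional array as input. Each row of the array is the real-time monitoring data of a single container, that is, the initial container group data; the rows of multiple containers make up the two-dimensional array, and every container in the array belongs to the same container group. The normalization submodule converts data of various dimensions into floating-point numbers within a fixed value range, that is, it normalizes the initial container group data into container group data. The container group data is then fed to the health degree algorithm submodule and the busyness degree algorithm submodule, which produce the first health degree data and the first busyness data as one-dimensional arrays. Finally, the first health degree data and the first busyness data are fed to the decision submodule for decision processing to obtain the first decision result. Splitting the prediction of container indicators into the two dimensions of health degree and busyness makes the prediction of a container's running state more rounded and accurate.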
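In addition, in another embodiment, after performing decision processing on the first health degree data and the first busyness data according to the pre-stored decision rules to obtain the first decision result corresponding to the container group to be regulated, and regulating the containers in the container group to be regulated according to the first decision result, the method may further include:
Obtaining the initial container group data corresponding to the container group to be regulated.
Processing the initial container group data according to preset training data processing rules to obtain real-time training data.
Obtaining new training data according to pre-acquired historical training data and the real-time training data, and performing update training on the pre-trained health degree algorithm and busyness degree algorithm according to the new training data, to obtain a new health degree algorithm and a new busyness degree algorithm.
In this embodiment, while containers are being regulated through the health degree algorithm and the busyness degree algorithm, the two algorithms can also be updated at the same time to improve their accuracy and timeliness. That is, the newly obtained initial container group data corresponding to the container group to be regulated can be preprocessed to obtain real-time training data, which is then merged with the previous historical training data to obtain new training data; the pre-trained health degree algorithm and busyness degree algorithm are then update-trained on the new training data to obtain a new health degree algorithm and a new busyness degree algorithm. With this self-updating strategy, the algorithm computes on the current day's data and the result is used as training data, which both lowers the cost of creating training data and lets the training data keep up with the development of the business.
In addition, to guarantee the computing capacity for real-time prediction and respond promptly to real-time prediction requests without being affected by computation on historical data, one copy of the algorithm model is deployed for actual application and another for algorithm updating; the two are identical but independent of each other. Therefore, after the new health degree algorithm and busyness degree algorithm are obtained, they can be pushed to update the health degree algorithm and busyness degree algorithm in both the algorithm module used for actual application and the algorithm module used for algorithm updating, so that the most recently trained model can be put to use promptly and keeps being updated.
Further, processing the initial container group data according to the preset training data processing rules to obtain real-time training data may specifically include:
Processing the initial container group data according to the pre-trained health degree algorithm, the busyness degree algorithm, and the decision rules to obtain second health degree data, second busyness data, and a second decision result.
Analyzing the initial container group data according to pre-stored abnormal data processing rules, the second health degree data, the second busyness data, and the second decision result, to obtain abnormal training data and normal training data.
Obtaining the real-time training data according to the abnormal training data and the normal training data.
Specifically, when updating the health degree algorithm and the busyness degree algorithm, the initial container group data can be obtained first; the newly obtained initial container group data needs to serve as training data and, together with the historical training data, train the two algorithms. However, the newly obtained initial container group data contains only monitoring data and configuration data, and does not contain the health degree, busyness, or decision result of the containers to be regulated. It must therefore first pass through the algorithm module to obtain the second health degree, the second busyness, and the second decision result; the initial container group data is then processed according to these three values to obtain the real-time training data, which is merged with the historical training data to obtain the new training data. The historical training data has already gone through the above steps earlier, so the three predicted values (health degree, busyness, and decision result) are already stored for it.
In addition, when processing the initial container group data according to the second health degree, the second busyness, and the second decision result to obtain the real-time training data, the initial container group data can be classified into abnormal training data and normal training data; the health degree algorithm and the busyness degree algorithm are then updated with both the abnormal and the normal training data. This prevents the two algorithms from overfitting to abnormal scenarios while still retaining a degree of sensitivity.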
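Further still, analyzing the initial container group data according to the pre-stored abnormal data processing rules, the second health degree data, the second busyness data, and the second decision result to obtain abnormal training data and normal training data may specifically include:
Selecting, from the initial container group data, first target container data whose health degree is lower than the preset health degree threshold according to the pre-stored regression filtering rules and the second health degree data, and at the same time selecting, from the initial container group data, second target container data whose busyness is higher than the preset busyness threshold according to the pre-stored regression filtering rules and the second busyness data.
Obtaining third target container data from the initial container group data according to the pre-stored threshold filtering rules.
Obtaining, from the initial container group data, fourth target container data corresponding to an abnormality identifier according to the pre-stored production abnormality filtering rules.
Performing data fusion on the first target container data, the second target container data, the third target container data, and the fourth target container data to obtain the abnormal training data.
Obtaining the normal training data according to the initial container group data and the abnormal training data.
Specifically, when determining the abnormal training data, the initial container group data can be screened from three dimensions: regression filtering, threshold filtering, and production abnormality filtering.
Correspondingly, for regression filtering, the initial container group data can be sorted by busyness according to the pre-stored regression filtering rules and the second busyness data, and the second target container data whose busyness is higher than the preset busyness threshold is then selected. Similarly, the first target container data whose health degree is lower than the preset health degree threshold can be selected from the initial container group data according to the pre-stored regression filtering rules and the second health degree data.
The higher the busyness, the more likely the container is abnormal; the lower the health degree, the more likely the container is abnormal. The busyness threshold and the health degree threshold can be defined according to the required sample size.
For threshold filtering, the threshold filtering rules can be customized according to the actual application scenario. For example, the threshold filtering rules may be: the host CPU usage exceeds 80%, the host memory usage exceeds 80%, or the container CPU usage exceeds 90%.
If any one of the above conditions is met, the corresponding initial container group data can be extracted through threshold filtering to obtain the third target container data.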
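The example threshold rules above can be sketched as a simple any-of filter. The record layout and the field names (`host_cpu_usage`, `host_mem_usage`, `container_cpu_usage`) are assumptions made for illustration; usage values are expressed here as fractions between 0 and 1.

```python
# Example limits from the text; the field names are hypothetical.
THRESHOLD_RULES = {
    "host_cpu_usage": 0.80,
    "host_mem_usage": 0.80,
    "container_cpu_usage": 0.90,
}


def threshold_filter(records, rules=THRESHOLD_RULES):
    """Keep the records in which at least one indicator strictly exceeds its limit."""
    return [
        r for r in records
        if any(r.get(name, 0.0) > limit for name, limit in rules.items())
    ]
```

Because the text says "exceeds", the comparison is strict: a record sitting exactly at 80% host CPU usage is not selected in this sketch.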
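For production abnormality filtering, the rule may be that, after a real event has occurred, the corresponding data is filtered out of the initial container group data according to the abnormality identifier of that real event, to obtain the fourth target container data. The initial container group data may use the abnormality identifier as the index for locating a piece of data. For example, the abnormality identifier may be a time period and the container group involved; once the time period in which the real event occurred and the container group involved are determined, the fourth target container data can be uniquely obtained from the initial container group data.
In addition, regression filtering, threshold filtering, and production abnormality filtering in effect select part of the data from three dimensions: regression filtering finds anomalies from the dimensions of health degree and busyness, threshold filtering finds anomalies from the dimension of the original indicators, and production abnormality filtering finds anomalies from the dimension of anomalies that have actually occurred. Together, the three filters pick out data that is very likely abnormal. These data may overlap in time and in the container group they belong to, and data aggregation merges the overlapping data to reduce redundancy. The specific aggregation method may be as follows: for target container data that belongs to the same container group and whose time difference does not exceed a preset duration, the target container data with the earlier time is selected as the actual abnormal training data. The preset duration may be any value between 3 and 10 minutes. Every data point that is finally output is provided as a whole in the form of (time point + container group). Regression filtering and threshold filtering take a single container as the filtering target, so the data of the container group associated with that single container can be obtained correspondingly and used as abnormal training data.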
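The aggregation rule (same container group, time difference within the preset duration, keep the earlier one) can be sketched as below. Representing each candidate as a `(container_group, timestamp)` pair and comparing each candidate against the most recently kept point of its group are assumptions of this sketch; the text does not specify how chains of closely spaced candidates are handled.

```python
from datetime import datetime, timedelta


def aggregate(candidates, window=timedelta(minutes=5)):
    """Merge candidates of the same container group that fall within `window` of each other.

    candidates: iterable of (container_group, timestamp) pairs (hypothetical layout).
    For each cluster of nearby candidates, the earliest one is kept.
    """
    kept = []
    last_kept = {}  # container_group -> timestamp of the last kept candidate
    for group, ts in sorted(candidates, key=lambda c: (c[0], c[1])):
        prev = last_kept.get(group)
        if prev is None or ts - prev > window:
            kept.append((group, ts))
            last_kept[group] = ts
    return kept
```

The default window of 5 minutes is just one value inside the 3 to 10 minute range given in the text.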
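In addition, after the abnormal training data is obtained, manual confirmation marking can also be performed, for which a user interface can be provided to operation and maintenance personnel. The user interface can display the monitoring data and configuration data of the same container group (that is, the initial container group data), and can also display prompts from the three abnormal data filtering schemes, making it easy for operation and maintenance personnel to confirm whether the prediction results (that is, the second health degree data, the second busyness data, and the second decision result) and the abnormal training data need to be adjusted. Note in particular that the present application does not require operation and maintenance personnel to confirm and adjust all of the filtered abnormal training data; data that has not been confirmed can simply be output according to the prediction results calculated by the algorithm. This improves the accuracy of the abnormal training data and the prediction results, and in turn the accuracy of subsequent updates to the health degree algorithm and the busyness degree algorithm.
In addition, the real-time training data is the collection of abnormal training data and normal training data obtained after calculation by the algorithm module and manual confirmation marking. It is temporary data that will later be persisted and merged into the historical training database. The normal training data in the real-time training data also comes from the output of the algorithm module. The specific acquisition process may be as follows: if the amount of abnormal training data obtained after manual confirmation marking is N, then N pieces of data that do not overlap with the abnormal training data can likewise be taken from the output of the algorithm module and used as normal training data.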
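Drawing N normal samples that do not overlap with the N abnormal samples can be sketched as follows. The text only says that N non-overlapping items are taken; drawing them at random, identifying data points by hashable keys, and capping N at the number of available items are assumptions of this sketch, as is the name `sample_normal`.

```python
import random


def sample_normal(all_keys, abnormal_keys, seed=None):
    """Pick as many non-abnormal data points as there are abnormal ones.

    all_keys: keys of every data point output by the algorithm module.
    abnormal_keys: keys of the confirmed abnormal training data.
    """
    abnormal = set(abnormal_keys)
    pool = [k for k in all_keys if k not in abnormal]
    n = min(len(abnormal), len(pool))  # cannot draw more than is available
    return random.Random(seed).sample(pool, n)
```

Balancing the two classes in this way is consistent with the stated goal of avoiding overfitting to abnormal scenarios.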
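In addition, in another embodiment, obtaining new training data according to the pre-acquired historical training data and the real-time training data may specifically include:
Screening the initial historical training data according to a pre-stored data ratio relationship across different time periods to obtain the historical training data.
Obtaining the new training data according to the historical training data and the real-time training data.
In this embodiment, the historical training data may contain the data accumulated over many past days, plus the real-time training data added on the current day. The historical training data may also contain a small amount of initial artificial training data. The initial artificial training data fabricates several scenarios so that the health degree algorithm and the busyness degree algorithm can be initialized and trained from scratch, yielding algorithms with a certain predictive ability. The initial artificial training data is not strictly necessary, however: in the first few days, after several rounds of manual confirmation marking and training, a health degree algorithm and a busyness degree algorithm with a certain predictive ability can be formed quickly, so that self-updating of the two algorithms can begin. In this way, the two algorithms can be started and put into use quickly without manually creating initial training data, and they mature rapidly after a few rounds, which greatly reduces their training cost.
In addition, the historical training data keeps accumulating. To keep training efficient, a portion of the historical training data can be selected to participate in training. To let the historical training data reflect both recent operation and long-term past operation, it can be obtained according to a pre-stored data ratio relationship across different time periods. For example, the ratio relationship may be such that the data from [within 1 month, 1 to 3 months, 3 to 6 months, 6 to 12 months, more than 12 months] accounts for [50%, 30%, 10%, 5%, 5%] respectively. Alternatively, all data from the most recent month may participate in training in full without screening, and then, based on the size of the most recent month's data, the corresponding amounts of data are randomly selected from the 1 to 3 month, 3 to 6 month, 6 to 12 month, and over 12 month data. It should be noted that the present application only lists some specific ways of obtaining historical training data; obtaining historical training data in other ways also falls within the protection scope of the present application.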
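The second variant above (keep all data from the most recent month, then sample older buckets in proportion) can be sketched as follows. Treating a month as 30 days, representing each sample as an `(age_days, payload)` pair, and capping each draw at the bucket size are assumptions of this sketch.

```python
import random

# Ratios from the text, in percent: <1 month, 1-3, 3-6, 6-12, >12 months.
RATIOS = [50, 30, 10, 5, 5]
# Upper age bound of each of the first four buckets, in days (30-day months assumed).
BOUNDS_DAYS = [30, 90, 180, 365]


def bucket_index(age_days):
    """Return the index of the time bucket that a sample of the given age falls into."""
    for i, bound in enumerate(BOUNDS_DAYS):
        if age_days < bound:
            return i
    return len(BOUNDS_DAYS)


def select_history(samples, seed=None):
    """samples: list of (age_days, payload). Keep the newest bucket whole, sample the rest."""
    rng = random.Random(seed)
    buckets = [[] for _ in RATIOS]
    for s in samples:
        buckets[bucket_index(s[0])].append(s)
    recent = buckets[0]
    selected = list(recent)
    for ratio, bucket in zip(RATIOS[1:], buckets[1:]):
        # Scale each older bucket against the size of the newest one.
        want = len(recent) * ratio // RATIOS[0]
        selected.extend(rng.sample(bucket, min(want, len(bucket))))
    return selected
```

With 100 samples in the most recent month, this selects 60, 20, 10, and 10 samples from the older buckets, reproducing the 50/30/10/5/5 split over 200 samples.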
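In addition, the historical training data can be fed separately into the health degree algorithm and the busyness degree algorithm to train them. Training can be optimized by randomly taking 80% of the data for training and 20% for validation, and selecting the model with the highest validation score after multiple training runs. It should be noted that the present application only gives one specific ratio between training data and validation data; training and validating with other ratios also falls within the protection scope of the present application.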
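The 80/20 split with best-of-several-runs selection can be sketched generically as below. The `fit` and `score` callables stand in for whatever model is used (the text mentions SVM and decision trees as examples); they, the number of rounds, and the function names are placeholders introduced for this sketch.

```python
import random


def split_80_20(data, rng):
    """Shuffle a copy of the data and split it 80% / 20%."""
    shuffled = list(data)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


def train_best(data, fit, score, rounds=5, seed=0):
    """Train `rounds` times on fresh random splits and keep the best-validating model.

    fit(train) -> model and score(model, validation) -> float are caller-supplied
    placeholders; a higher score is assumed to be better.
    """
    rng = random.Random(seed)
    best_model, best_score = None, float("-inf")
    for _ in range(rounds):
        train, validation = split_80_20(data, rng)
        model = fit(train)
        s = score(model, validation)
        if s > best_score:
            best_model, best_score = model, s
    return best_model, best_score
```

A toy usage: with `fit` returning the mean label of the training rows and `score` returning the negative mean absolute error on the validation rows, `train_best` returns the mean that validated best across the rounds.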
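Fig. 4 is a schematic diagram of the application of the training model provided by an embodiment of the present application. As shown in Fig. 4, this embodiment may include first obtaining monitoring data and configuration data, where the monitoring data may be the real-time monitoring indicators corresponding to the container group to be regulated: host indicators, container indicators, process indicators, business indicators, and so on. The configuration data may be the basic information of the container, specifically the container group to which the container belongs, the host to which the container belongs, and the physical location of the host. The configuration data and monitoring data are collectively referred to as the initial container group data, and the initial container group data added each day can be imported in batches into the historical database, so that the historical database contains (monitoring data + configuration data) for each point in time. The newly obtained initial container group data can then be fed into an algorithm module with the same structure as that of Fig. 3; the algorithm module in this embodiment is structurally identical to the one in Fig. 3, but the two are independent and do not affect each other during operation. Through the algorithm module in this embodiment, the second health degree data, the second busyness data, and the second decision result are obtained and written together into the historical database. The initial container group data can then be filtered according to the regression filtering rules, threshold filtering rules, and production abnormality filtering rules to obtain abnormal training data and normal training data, and from these the real-time training data. New training data is then obtained from the real-time training data and the historical training data, and the algorithms are trained and updated on the new training data. With this self-updating strategy, the algorithm computes on the current day's data and the result is used as training data, which both lowers the cost of creating training data and lets the training data keep up with the development of the business.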
基于同样的思路,本说明书实施例还提供了上述方法对应的装置,图5为本申请实施例提供的容器组调控装置的结构示意图,如图5所示,本实施例提供的装置,可以包括:
获取模块501,用于获取待调控容器组对应的初始容器组数据。
处理模块502,用于将所述初始容器组数据进行归一化处理,得到容器组数据。
在本实施例中,所述待调控容器组中包含多个容器,所述初始容器组数据中包含各容器的至少一数据指标,所述处理模块502,还用于:
针对每个目标数据指标,获取所述目标数据指标的数值以及预设取值范围。
根据所述目标数据指标的数值以及所述目标数据指标的预设取值范围进行归一化转换处理,确定转换后的目标数据指标的数值。
根据各转换后的目标数据指标的数值得到容器组数据。
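One plausible form of the normalization steps above is min-max scaling against each metric's preset value range, sketched below. The formula, the clamping of out-of-range values, the function names, and the metric names used in the usage note are assumptions; this passage does not state the exact conversion.

```python
def normalize(value, lo, hi):
    """Min-max scale value into [0, 1] using the preset range [lo, hi].

    Clamping values that fall outside the preset range is an assumption.
    """
    if hi <= lo:
        raise ValueError("preset range must satisfy lo < hi")
    clipped = min(max(value, lo), hi)
    return (clipped - lo) / (hi - lo)


def normalize_group(raw, ranges):
    """Build the two-dimensional container group data.

    raw: one dict of metric values per container (hypothetical layout).
    ranges: metric name -> (lo, hi) preset value range.
    Returns a list of rows, one row per container and one column per metric.
    """
    names = list(ranges)
    return [[normalize(c[n], *ranges[n]) for n in names] for c in raw]
```

For instance, with hypothetical ranges `{"cpu": (0, 100), "mem_mb": (0, 4096)}`, a container reporting `cpu=50` and `mem_mb=1024` maps to the row `[0.5, 0.25]`.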
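The processing module 502 is further configured to process the container group data according to a pre-trained health algorithm to obtain first health data corresponding to the container group to be regulated, and at the same time to process the container group data according to a pre-trained busyness algorithm to obtain first busyness data corresponding to the container group to be regulated.
In this embodiment, the processing module 502 is further configured to:
Perform health prediction on the container group data in the form of a two-dimensional array according to the pre-trained health algorithm to obtain the first health data, where the first health data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the health of one container.
In addition, the processing module 502 is further configured to:
Perform busyness prediction on the container group data in the form of a two-dimensional array according to the pre-trained busyness algorithm to obtain the first busyness data, where the first busyness data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the busyness of one container.
The processing module 502 is further configured to perform decision processing on the first health data and the first busyness data according to a pre-stored decision rule to obtain a first decision result corresponding to the container group to be regulated, and to regulate the containers in the container group to be regulated according to the first decision result.
In this embodiment, the processing module 502 is further configured to:
Determine a busyness mean and a busyness standard deviation according to the first busyness data.
Determine discrete containers in the container group to be regulated according to the busyness mean and the busyness standard deviation.
Perform decision processing according to the discrete containers in the container group to be regulated, the busyness mean, and the first health data to obtain the first decision result corresponding to the container group to be regulated.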
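The three decision steps above can be illustrated with the sketch below. It reads "discrete containers" as containers whose busyness deviates from the group mean by more than k standard deviations, which is an assumption. The function names, the thresholds `health_min`, `busy_high`, `busy_low`, and the action labels are purely hypothetical placeholders and are not the pre-stored decision rule of the application.

```python
from statistics import mean, pstdev


def find_discrete(busyness, k=1.5):
    """Return (mean, std, indices of containers more than k std from the mean)."""
    avg = mean(busyness)
    std = pstdev(busyness)  # population standard deviation of the group
    if std == 0:
        return avg, std, []
    outliers = [i for i, b in enumerate(busyness) if abs(b - avg) > k * std]
    return avg, std, outliers


def decide(health, busyness, k=1.5, health_min=0.6, busy_high=0.8, busy_low=0.2):
    """Combine discrete containers, mean busyness, and health into one result.

    All thresholds and action labels are hypothetical placeholders.
    """
    avg, std, discrete = find_discrete(busyness, k)
    unhealthy = [i for i, h in enumerate(health) if h < health_min]
    if avg > busy_high:
        scale = "scale_out"
    elif avg < busy_low:
        scale = "scale_in"
    else:
        scale = "hold"
    return {
        "busy_mean": avg,
        "busy_std": std,
        "discrete": discrete,
        "unhealthy": unhealthy,
        "scale": scale,
    }
```

Returning the intermediate statistics alongside the action keeps the result easy to inspect when a decision is later reviewed.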
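In addition, in another embodiment, the processing module 502 is further configured to:
Acquire the initial container group data corresponding to the container group to be regulated.
Process the initial container group data according to a preset training data processing rule to obtain real-time training data.
Obtain new training data according to pre-acquired historical training data and the real-time training data, and perform update training on the pre-trained health algorithm and busyness algorithm according to the new training data to obtain a new health algorithm and a new busyness algorithm.
In this embodiment, the processing module 502 is further configured to:
Process the initial container group data according to the pre-trained health algorithm, the busyness algorithm, and the decision rule to obtain second health data, second busyness data, and a second decision result.
Analyze the initial container group data according to a pre-stored abnormal data processing rule, the second health data, the second busyness data, and the second decision result to obtain abnormal training data and normal training data.
Obtain the real-time training data according to the abnormal training data and the normal training data.
Further, the processing module 502 is further configured to:
Select, from the initial container group data according to a pre-stored regression filtering rule and the second health data, first target container data whose health is below a preset health threshold, and at the same time select, from the initial container group data according to the pre-stored regression filtering rule and the second busyness data, second target container data whose busyness is above a preset busyness threshold.
Acquire third target container data from the initial container group data according to a pre-stored threshold filtering rule.
Acquire, from the initial container group data according to a pre-stored production anomaly filtering rule, fourth target container data corresponding to an anomaly identifier.
Perform data fusion on the first target container data, the second target container data, the third target container data, and the fourth target container data to obtain the abnormal training data.
Obtain the normal training data according to the initial container group data and the abnormal training data.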
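The filtering and fusion steps above could be sketched as follows. The record layout (`id` and `metrics` keys), the function name `split_training_data`, the threshold values, and the treatment of fusion as a simple set union are assumptions made for illustration.

```python
def split_training_data(records, health, busyness, health_min=0.6,
                        busy_max=0.8, metric_limits=None,
                        incident_ids=frozenset()):
    """Split records into (abnormal, normal) using the three filters.

    records: list of dicts with "id" and "metrics" keys (hypothetical layout).
    health / busyness: predicted values aligned with records.
    metric_limits: raw metric name -> upper limit (threshold filter).
    incident_ids: ids flagged by real production incidents.
    """
    metric_limits = metric_limits or {}
    # Regression filter: low predicted health or high predicted busyness.
    first = {r["id"] for r, h in zip(records, health) if h < health_min}
    second = {r["id"] for r, b in zip(records, busyness) if b > busy_max}
    # Threshold filter: any raw metric above its configured limit.
    third = {
        r["id"] for r in records
        if any(r["metrics"].get(m, 0) > lim for m, lim in metric_limits.items())
    }
    # Production anomaly filter: records carrying an anomaly identifier.
    fourth = {r["id"] for r in records if r["id"] in incident_ids}
    abnormal_ids = first | second | third | fourth  # fusion as a set union
    abnormal = [r for r in records if r["id"] in abnormal_ids]
    normal = [r for r in records if r["id"] not in abnormal_ids]
    return abnormal, normal
```

Because the four target sets are merged by identifier, a container picked up by several filters appears only once in the abnormal training data.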
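In addition, in another embodiment, the processing module 502 is further configured to:
Filter the initial historical training data according to a pre-stored data proportion relationship across different time periods to obtain the historical training data.
Obtain the new training data according to the historical training data and the real-time training data.
The apparatus provided by the embodiments of the present application can implement the method of the embodiment shown in FIG. 2 above; its implementation principle and technical effects are similar and are not repeated here.
FIG. 6 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of the present application. As shown in FIG. 6, the device 600 provided by this embodiment includes a processor 601 and a memory 602 communicatively connected to the processor. The processor 601 and the memory 602 are connected via a bus 603.
In a specific implementation, the processor 601 executes the computer-executable instructions stored in the memory 602, so that the processor 601 performs the container group regulation method in the above method embodiments.
For the specific implementation process of the processor 601, reference may be made to the above method embodiments; the implementation principle and technical effects are similar and are not repeated here.
In the embodiment shown in FIG. 6 above, it should be understood that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the present application may be performed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory may include high-speed RAM, and may also include non-volatile memory (NVM), for example at least one disk memory.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be classified into address buses, data buses, control buses, and so on. For ease of illustration, the bus in the drawings of the present application is not limited to only one bus or one type of bus.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the container group regulation method of the above method embodiments.
An embodiment of the present application further provides a computer program product including a computer program which, when executed by a processor, implements the container group regulation method described above.
The above computer-readable storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The readable storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.
An exemplary readable storage medium is coupled to the processor, so that the processor can read information from, and write information to, the readable storage medium. Of course, the readable storage medium may also be an integral part of the processor. The processor and the readable storage medium may reside in an application-specific integrated circuit (ASIC). Of course, the processor and the readable storage medium may also exist as discrete components in the device.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments; the storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical discs.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.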

Claims (13)

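  1. A container group regulation method, characterized by comprising:
    acquiring initial container group data corresponding to a container group to be regulated;
    normalizing the initial container group data to obtain container group data;
    processing the container group data according to a pre-trained health algorithm to obtain first health data corresponding to the container group to be regulated, and at the same time processing the container group data according to a pre-trained busyness algorithm to obtain first busyness data corresponding to the container group to be regulated;
    performing decision processing on the first health data and the first busyness data according to a pre-stored decision rule to obtain a first decision result corresponding to the container group to be regulated, and regulating the containers in the container group to be regulated according to the first decision result.
  2. The method according to claim 1, characterized in that the container group to be regulated contains a plurality of containers, the initial container group data contains at least one data metric of each container, and the normalizing the initial container group data to obtain container group data comprises:
    for each target data metric, acquiring the value of the target data metric and its preset value range;
    performing normalization conversion according to the value of the target data metric and the preset value range of the target data metric to determine the converted value of the target data metric;
    obtaining the container group data according to the converted values of the target data metrics.
  3. The method according to claim 1 or 2, characterized in that the processing the container group data according to a pre-trained health algorithm to obtain first health data corresponding to the container group to be regulated comprises:
    performing health prediction on the container group data in the form of a two-dimensional array according to the pre-trained health algorithm to obtain the first health data, wherein the first health data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the health of one container.
  4. The method according to any one of claims 1 to 3, characterized in that the processing the container group data according to a pre-trained busyness algorithm to obtain first busyness data corresponding to the container group to be regulated comprises:
    performing busyness prediction on the container group data in the form of a two-dimensional array according to the pre-trained busyness algorithm to obtain the first busyness data, wherein the first busyness data is in the form of a one-dimensional array and each variable in the one-dimensional array represents the busyness of one container.
  5. The method according to any one of claims 1 to 4, characterized in that the performing decision processing on the first health data and the first busyness data according to a pre-stored decision rule to obtain a first decision result corresponding to the container group to be regulated comprises:
    determining a busyness mean and a busyness standard deviation according to the first busyness data;
    determining discrete containers in the container group to be regulated according to the busyness mean and the busyness standard deviation;
    performing decision processing according to the discrete containers in the container group to be regulated, the busyness mean, and the first health data to obtain the first decision result corresponding to the container group to be regulated.
  6. The method according to any one of claims 1 to 5, characterized in that, after the performing decision processing on the first health data and the first busyness data according to a pre-stored decision rule to obtain a first decision result corresponding to the container group to be regulated, and regulating the containers in the container group to be regulated according to the first decision result, the method further comprises:
    acquiring the initial container group data corresponding to the container group to be regulated;
    processing the initial container group data according to a preset training data processing rule to obtain real-time training data;
    obtaining new training data according to pre-acquired historical training data and the real-time training data, and performing update training on the pre-trained health algorithm and busyness algorithm according to the new training data to obtain a new health algorithm and a new busyness algorithm.
  7. The method according to claim 6, characterized in that the processing the initial container group data according to a preset training data processing rule to obtain real-time training data comprises:
    processing the initial container group data according to the pre-trained health algorithm, the busyness algorithm, and the decision rule to obtain second health data, second busyness data, and a second decision result;
    analyzing the initial container group data according to a pre-stored abnormal data processing rule, the second health data, the second busyness data, and the second decision result to obtain abnormal training data and normal training data;
    obtaining the real-time training data according to the abnormal training data and the normal training data.
  8. The method according to claim 7, characterized in that the analyzing the initial container group data according to a pre-stored abnormal data processing rule, the second health data, the second busyness data, and the second decision result to obtain abnormal training data and normal training data comprises:
    selecting, from the initial container group data according to a pre-stored regression filtering rule and the second health data, first target container data whose health is below a preset health threshold, and at the same time selecting, from the initial container group data according to the pre-stored regression filtering rule and the second busyness data, second target container data whose busyness is above a preset busyness threshold;
    acquiring third target container data from the initial container group data according to a pre-stored threshold filtering rule;
    acquiring, from the initial container group data according to a pre-stored production anomaly filtering rule, fourth target container data corresponding to an anomaly identifier;
    performing data fusion on the first target container data, the second target container data, the third target container data, and the fourth target container data to obtain the abnormal training data;
    obtaining the normal training data according to the initial container group data and the abnormal training data.
  9. The method according to any one of claims 6 to 8, characterized in that the obtaining new training data according to pre-acquired historical training data and the real-time training data comprises:
    filtering initial historical training data according to a pre-stored data proportion relationship across different time periods to obtain the historical training data;
    obtaining the new training data according to the historical training data and the real-time training data.
  10. A container group regulation apparatus, characterized by comprising:
    an acquisition module, configured to acquire initial container group data corresponding to a container group to be regulated;
    a processing module, configured to normalize the initial container group data to obtain container group data;
    the processing module being further configured to process the container group data according to a pre-trained health algorithm to obtain first health data corresponding to the container group to be regulated, and at the same time to process the container group data according to a pre-trained busyness algorithm to obtain first busyness data corresponding to the container group to be regulated;
    the processing module being further configured to perform decision processing on the first health data and the first busyness data according to a pre-stored decision rule to obtain a first decision result corresponding to the container group to be regulated, and to regulate the containers in the container group to be regulated according to the first decision result.
  11. An electronic device, characterized by comprising: a processor, and a memory communicatively connected to the processor;
    the memory stores computer-executable instructions;
    the processor executes the computer-executable instructions stored in the memory to implement the container group regulation method according to any one of claims 1 to 9.
  12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the container group regulation method according to any one of claims 1 to 9.
  13. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the container group regulation method according to any one of claims 1 to 9.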
PCT/CN2022/101384 2021-11-24 2022-06-27 Container group regulation method and apparatus, and electronic device WO2023093031A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111404187.8A CN114064413B (zh) 2021-11-24 2021-11-24 Container group regulation method and apparatus, and electronic device
CN202111404187.8 2021-11-24

Publications (1)

Publication Number Publication Date
WO2023093031A1 true WO2023093031A1 (zh) 2023-06-01

Family

ID=80275838

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/101384 WO2023093031A1 (zh) 2021-11-24 2022-06-27 Container group regulation method and apparatus, and electronic device

Country Status (2)

Country Link
CN (1) CN114064413B (zh)
WO (1) WO2023093031A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114064413B (zh) * 2021-11-24 2023-06-16 深圳前海微众银行股份有限公司 Container group regulation method and apparatus, and electronic device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100325642A1 (en) * 2009-06-22 2010-12-23 Microsoft Corporation Automatically re-starting services
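CN102916831A (zh) * 2012-09-18 2013-02-06 冯晋阳 Method and system for obtaining the health of a business system
CN111221624A (zh) * 2019-12-31 2020-06-02 中国电力科学研究院有限公司 Container management method for a regulation cloud platform based on Docker container technology
CN114064413A (zh) * 2021-11-24 2022-02-18 深圳前海微众银行股份有限公司 Container group regulation method and apparatus, and electronic device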

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005022417A2 (en) * 2003-08-27 2005-03-10 Ascential Software Corporation Methods and systems for real time integration services
CN111796959B (zh) * 2020-06-30 2023-08-08 中国工商银行股份有限公司 Host container self-healing method, apparatus, and system

Also Published As

Publication number Publication date
CN114064413B (zh) 2023-06-16
CN114064413A (zh) 2022-02-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22897127

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE