CN113344282A - Method, system and computer readable medium for capacity data processing and allocation - Google Patents


Info

Publication number
CN113344282A
CN113344282A (application CN202110696330.9A; granted publication CN113344282B)
Authority
CN
China
Prior art keywords
data
service
growth rate
calculating
prediction
Prior art date
Legal status
Granted
Application number
CN202110696330.9A
Other languages
Chinese (zh)
Other versions
CN113344282B (en)
Inventor
王丽
史晨阳
彭晓
孙纪周
邢世伟
Current Assignee
China Everbright Bank Co Ltd
Original Assignee
China Everbright Bank Co Ltd
Priority date
Filing date
Publication date
Application filed by China Everbright Bank Co Ltd
Priority claimed from CN202110696330.9A
Publication of CN113344282A
Application granted
Publication of CN113344282B
Status: Active


Classifications

    (All under G — Physics; G06 — Computing, calculating or counting; G06Q — Information and communication technology [ICT] specially adapted for administrative, commercial, financial, managerial or supervisory purposes.)
    • G06Q 10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06Q 10/0631: Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q 10/06393: Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G06Q 10/103: Workflow collaboration or project management
    • G06Q 10/109: Time management, e.g. calendars, reminders, meetings or time accounting
    • G06Q 40/02: Banking, e.g. interest calculation or account maintenance

Abstract

The invention relates to a method, a system and a computer-readable medium for capacity data processing and allocation. The method queries system data features according to a system code and an index code and calculates the time length of the feature data used for prediction. When translation fitting is selected, it calculates the growth rate and writes it into an intermediate database, judges the translation structure and calculates the growth weight, then obtains the growth rate and growth weight and calculates the translation fit. Otherwise it constructs a sequence-decomposition fitting model and judges the adjustment mode according to the time length of the feature data: either the growth rate is adjusted according to monthly features, with the growth rate calculated and written into the intermediate database, or the prediction deviation is calculated as the adjustment scheme and stored, and the prediction-length deviation adjustment is applied. Finally the prediction data are stored. The method offers flexible prediction and high prediction accuracy, lets a user analyze more clearly what influences prediction accuracy, enhances the fitting and prediction effects, and predicts service index data well according to demand.

Description

Method, system and computer readable medium for capacity data processing and allocation
Technical Field
The invention relates to the technical field of computer application, in particular to a method, a system and a computer readable medium for processing and allocating capacity data of a financial computer system based on internet and large-scale data processing.
Background
In existing financial computer systems, including those of banks, the data of each application changes over time. Although these changes generally show growth, the data of a particular application may in some cases decrease or change back and forth. Newly added applications also fluctuate over time, whether through an increase or decrease in the number of applications or through fluctuations in their data.
In existing bank computer systems, each application typically occupies its hardware resources independently; in particular, expensive extensible resources such as CPU, hard disk and memory are used exclusively by each application. To pursue system robustness, IO performance and similar indexes, banks can only purchase higher-grade hardware. In a production environment, machine-room operators who manage each application adjust these hardware resources manually and extensively, based on indexes obtained from manual inspection or reported by a specific host, such as hard disk capacity, memory capacity or CPU saturation: for example, adding a sufficient number of hard disks after a shutdown (say 80% more), or removing 50% of the hard disks after a shutdown, with similar rule-of-thumb operations applied to CPU and memory. Such manual adjustment cannot keep up with the dynamic changes of a high-intensity, highly concurrent bank computer system involved in large-scale data processing, and much expensive hardware is wasted merely to maintain a sufficient margin of data-operation safety.
From the application side, especially in the financial field, scenarios such as internet lending, holiday merchant promotions, sales of high-yield wealth-management products and flash-sale ("second killing") events have pushed businesses such as banking to develop rapidly on every online channel. Business growth takes new forms, and these online applications make the capacity control and allocation of every system, including its data, extremely frequent.
Traditional commercial banks must invest large amounts of funds and resources, and their construction and development require many staff, including information-center operators, to support auxiliary information systems and technical support systems; the personnel cost is even higher than that of the expensive hardware resources. Yet in order to keep up with every newly proposed business type and service mode, banks can only tolerate this extensive, large-scale growth in consumption to meet continuously developing business needs. For example, as a bank's service varieties, delivery channels and technical implementations increase, the corresponding computer application systems inside the bank also increase, which leads to the following situation: each application system independently corresponds to background services, payment systems and other supporting systems, and a great number of application systems are equipped with a front-end processor to implement specific service processing, data processing or device control and management; a large number of front-end processor systems for different services end up being placed in the bank machine room. A system with this architecture, and its way of working, increases the investment in maintenance personnel and wastes the bank's equipment and software investment (for example, serious duplicated development in different regions and different systems of the same bank), and, more dangerously, the system may fail or crash because of manual configuration errors or disordered management of the application systems' capacity.
Disclosure of Invention
In view of the above-mentioned deficiencies in the prior art, the present invention provides a method, a system and a computer-readable medium for capacity data processing and allocation that dynamically control and schedule the capacity of a bank computer system, so as to address at least one of the above problems.
In a first aspect, the present invention provides a capacity data processing and scheduling method.
In one embodiment of the first aspect of the invention, the method comprises:
step s201, the general front-end system counts its own service supply situation and adjusts and distributes the service supply to the systems of each service flow;
step s202, the general front-end system connects, through channels, the channel system and the service system required by a complete service;
step s203, calculating bearing capacity data based on system pressure test, and collecting and counting service index data based on a monitoring device;
step s204, the data acquisition synchronizer manages the acquisition device, receives the data, classifies the data systematically and transmits the data to the capacity platform server;
step s205, selecting a target index based on the analysis; if the modeling requirement exists, jumping to step s301; if the predicted value exists, jumping to step s601;
step s601, feeding back the prediction result to the capacity platform server, displaying whether an alarm state exists and calculating and transmitting allocation parameters based on the relation between the threshold value and the prediction value;
step s602, the application service coordinator receives the allocation parameters, calculates a service adjustment value based on the bearing capacity data and the difference from the prediction threshold, and sends a service volume allocation request to the specified system; jumping to step s201;
step s301, the capacity platform server sends a prediction request of a target index to the algorithm platform analysis server according to the requirement;
step s302, the algorithm platform server identifies the index code of the target index and the system code of the target index;
step s303, inquiring system data characteristics according to the system code and the index code;
step s304, calculating the time length of the predicted characteristic data;
step s103, judging whether to perform translation fitting; jumping to step s104 when the translation fitting is determined, and jumping to step s107 when the non-translation fitting is determined;
step s104, calculating the growth rate, and writing the growth rate into an intermediate database;
step s105, judging the translation structure and calculating the growth weight;
step s106, obtaining the growth rate and the growth weight, calculating the translation fitting, jumping to step s116,
step s107, constructing a sequence decomposition fitting model;
step s108, fitting model optimization and realizing prediction;
step s109, judging the adjustment mode according to the time length of the feature data: if the growth rate is to be adjusted according to monthly features, jumping to step s110; if the prediction length is to be adjusted according to the deviation, jumping to step s113;
step s110, calculating a growth rate, and writing the growth rate into an intermediate database;
step s111, obtaining the monthly feature adjustment of the growth rate, and jumping to step s 116;
step s113, calculating the predicted deviation as an adjustment scheme, and warehousing;
step s114, obtaining a predicted length deviation adjustment;
step s115, adjusting twice based on the deviation amplitude;
step s116, the prediction data are saved and/or a jump is made directly to step s601.
In a second aspect, the present invention provides a system for processing and scheduling capacity data.
In one embodiment of the second aspect of the invention, it comprises:
various services; and/or a number of channel systems;
a total front system;
a plurality of business systems;
a data acquisition synchronizer;
an application service coordinator;
a capacity platform server;
an algorithm analysis server;
a database; and
a number of memories for storing one or more programs which, when executed by one or more processors, cause the one or more processors to implement the capacity data processing and scheduling method;
In a third aspect, the invention provides a computer-readable storage medium.
In an embodiment of the third aspect of the invention, a computer program is stored thereon, which when executed by a processor, implements the capacity data processing and scheduling method described above.
Compared with the prior art, the invention has the beneficial effects that:
the capacity data processing and allocating method, the capacity data processing and allocating system and the computer readable medium have the advantages of strong prediction flexibility and high prediction accuracy, and a user can more clearly analyze the influence on the prediction accuracy; enhancing the fitting and predicting effects; and (3) according to the time sequence data fitting model under different conditions, service index data can be well predicted according to requirements, and then the result is put into a warehouse to complete the whole time sequence prediction process. Enhancing the fitting and predicting effects; the model is fitted with the time sequence data under different conditions, the service index data can be well predicted according to requirements, and then the result is put into a warehouse to complete the whole time sequence prediction process; and based on such model prediction and calculation, capacity resources can be organically and dynamically adjusted and allocated, so that the capacity resources of the whole system can be automatically increased or decreased and automatically allocated.
Drawings
FIG. 1 is a schematic diagram of a prior art data processing and deployment process;
FIG. 2 is a schematic diagram of a deployment system architecture for the capacity data processing and deployment method, system, and computer-readable medium of the present invention;
FIG. 3 is a schematic diagram of the internal structure of an algorithmic analysis server of the capacity data processing and deployment method, system, and computer readable medium of the present invention;
FIG. 4 is a first portion of a data processing and deployment flow diagram of the method, system, and computer readable medium of capacity data processing and deployment of the present invention;
FIG. 5 is a second portion of a data processing and deployment flow diagram of the method, system, and computer-readable medium of capacity data processing and deployment of the present invention;
fig. 6 is a block diagram of a capacity data processing apparatus according to the present invention.
In the drawings, the main parts are illustrated by symbols:
Detailed Description
The invention is described in detail below with reference to the figures and examples:
Explanation of technical terms related to the present invention:
Service: a complete service is initiated by a user. A data request is sent from a device deployed by a channel system and reaches the general front-end system through a channel; after authentication, control and distribution by the general front-end system it reaches the designated service system through a channel; the service system processes the request, generates receipt information and returns it through the channel to the general front-end system, which delivers it through a channel to the designated channel system; the channel system then feeds the result back to the user. This completes one business process.
Pressure test: any system must undergo a pressure test when it first goes into operation. The pressure test is carried out while the system accesses the general front-end system; during access, indexes such as TPS and QPS (system throughput) of the accessing system are calculated mainly from the upper limits of the physical resources of the accessing system and of the general front-end system, and the bearing capacity data of the accessing system are obtained according to the application-service distribution logic of the general front-end system.
Application service: a group of CPU processes in the general front-end system that handles the messages of a specific access system. Application services can be quantified in the general front-end system, and the amount of application service allocated to an access system is determined by the pressure test performed when the system is first connected. The general front-end system has limited CPU and memory resources and, facing all application services in the active state, allocates its resources equally to each active application service. The application service volume allocated to an access system therefore determines that system's service capability under the control of the general front-end system, so application services need to be distributed to each access system reasonably and efficiently.
Application service volume provisioning: the expected service quota of an access system, expressed as a count. Under a prediction mechanism the system can allocate the service volume of a designated access system in advance, preventing the downtime caused by growth in business volume and the resource waste caused by a reduction in business volume.
Channel: a virtual communication channel between the general front-end system and another system, controlled by the general front-end system; its realization is premised on the application service volume that the general front-end system allocates to the communicating system. The larger the application service volume, the more unobstructed the channel.
Bearing capacity data: organized per access system (here mainly the accessed service systems). Each access system corresponds to its current application service volume, together with the threshold of every service index of that system and the bearing capacity per unit of application service. Each access system has one application service volume; each system has several indexes; each index corresponds to one threshold and one unit service capacity. The application service volume is a count, i.e. the number of application services currently allocated by the general front-end system to the corresponding system; the threshold is expressed in the index's unit, i.e. the upper limit of the index that the system must satisfy for normal operation, obtained from the pressure test; the unit service capacity is the index value, in the index's unit, that each additional application service can accommodate in the current state.
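For illustration only, a minimal sketch (in Python, which the patent itself does not use) of how the bearing capacity data described above could be represented; the class and field names are assumptions, and the example values reuse the EPAY/RJYL figures from the worked example later in the description.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class IndexCapacity:
    index_code: str       # e.g. "RJYL" for daily transaction volume
    threshold: float      # upper limit of the index for normal operation (from the pressure test)
    unit_capacity: float  # index value accommodated by each additional application service


@dataclass
class BearingCapacity:
    system_code: str                     # identifies the access (service) system, e.g. "EPAY"
    app_service_volume: int              # application services currently allocated by the general front-end system
    indexes: Dict[str, IndexCapacity]    # one entry per service index of the system


# A hypothetical record for one access system, using figures quoted later in the description:
epay = BearingCapacity(
    system_code="EPAY",
    app_service_volume=2,
    indexes={"RJYL": IndexCapacity("RJYL", threshold=600000, unit_capacity=100000)},
)
```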
Service index data: organized per access system (mainly the accessed service systems), these are daily-frequency time-series data collected from the designated service system with a specific method. Different indexes are formed depending on the collection method and content, and they reflect, from different angles, the capacity situation of the service system's application domain. The daily transaction volume is the simplest and most direct index, measured in transactions per day.
Allocation parameters: parameters directed at a given service system, relating to its bearing capacity data and to the relationship between its predicted values and thresholds.
Channel system: the service-related client-side system, responsible for receiving a user's service request and forwarding the request data to the systems involved in service processing, or for receiving a service-processing receipt and feeding the result back to the user.
Service system: the service-related server-side system, responsible for receiving service-processing tasks, completing the processing within the service system or through functions shared among service systems, and sending the receipt information back to the receiving channel after processing.
General front-end system: the middle-platform system of the financial industry, responsible for connecting foreground channel systems with background service systems and for routing business flows among third parties, branches and the data center. It supports hot-plugging of channels, which provides high expandability; by quantifying application services, distributing services uniformly and managing channels, it can handle a large volume of services.
Data acquisition synchronizer: organizes the collected bearing capacity data and service index data into two sets of data per service system. Each service system corresponds to one system code that identifies it; under a service system there are several service indexes, each corresponding to one index code that identifies the index. The two sets of data are organized into the daily-frequency time series covering the time range available for each index under the system, the threshold unique to each index, and the current application service volume of each system.
Capacity platform server: stores the daily-frequency time-series data and the thresholds in the database, issues prediction requests for the different systems in turn according to their index codes, manages every valid index of every system, and can display and give early warning on the time-series data of each index.
Algorithm platform server: locates its task from the received system code and index code, performs situation-dependent fitting and prediction on the historical time-series data of the index in the database to obtain the predicted future time series, and stores it into the database.
The capacity platform server compares the predicted value of each predicted index with its threshold. When the predicted value of one index of the current system exceeds the threshold, or the historical and predicted values are far below the threshold, it transmits the predicted index value, the current application service volume of the system and the index threshold to the application service coordinator as the system's application-service allocation parameters.
Database: stores two kinds of content, namely the historical data collected by the data acquisition synchronizer and processed by the capacity platform server, and the prediction data stored by the algorithm platform analysis server after data modeling and prediction.
Application service coordinator: based on the received parameters, the allocation amplitude is calculated from the predicted value and the threshold, and the system's bearing capacity per unit application service is calculated from the threshold and the system's current application service volume. These two quantities are then combined to obtain the service volume allocation, i.e. the increase or decrease in application service volume that the current system should make in the future on the basis of the prediction.
The algorithm analysis server of the invention comprises a historical data loading unit, a data preprocessing unit, a growth rate calculating unit, a growth rate storing unit, a prediction feedback unit, an optimization calculating unit, a model building unit, a deviation adjusting unit and a middleware container.
Referring to fig. 2 to 5, a first general embodiment of the present invention:
In the service circulation already configured and built in advance, after a front-end user initiates a service request, the request enters the bank's general front-end system through one of the various service channel systems, such as POS terminals, ATMs or online banking. In the general front-end system the request is standardized, distributed and scheduled, then enters the designated background service system through a channel for processing; the processing result is returned through the same channel to the general front-end system for parsing and is finally returned to the channel system as a receipt to be displayed to the user. Throughout this process, transmission over both channels uses the application service as the bearing unit, and under each complete service cycle every channel has a specific application-service characteristic, which is based on the specific traffic that a unit application service of the general front-end system can bear.
The number of application services in the general front-end system is fixed. When a system is connected, the maximum number of application services required by the channel between the access system and the general front-end system, and the service bearing capacity of a unit application service, are obtained once the pressure test is completed, and these data are transmitted to the data acquisition synchronizer of the device. At the same time, all relevant service index data in the service system are also transmitted to the synchronizer.
After the data are collected they are processed in the capacity platform server, and during this process the collected data are updated in real time for use in database calculations. A daily task sends a modeling request to the algorithm platform analysis server; on the algorithm platform a data prediction model for a given service index is built and stored, and when a prediction request arrives again this model is used to predict future values of the specified service index. After receiving the prediction result, the capacity platform server uses the bearing-capacity data collected at access time as the threshold to judge whether the change in business necessitates an adjustment of the application service volume.
When an adjustment of the application service volume is judged necessary, an alarm is raised and the adjustment parameters are automatically transmitted to the application service coordinator. All parameters are combined and calculated in that device, the resulting service volume allocation information is transmitted to the general front-end system, and the application services of the specified service are allocated uniformly through the general front-end system's standardized configuration interface.
The complete time-series prediction process of the invention is as follows. First, the data features of the system are queried according to the system code and index code; that is, the parameters involved in the prediction task and the complete, valid data for model training are sorted out through automatic calls. Then the time length of the existing data is calculated and compared with the prediction length in the request.
In general, when the ratio of the model's training length to the prediction length is small, a good prediction result cannot be obtained. Based on experience with the model, the decomposition fitting model is considered for prediction when this ratio exceeds 3, and the translation fitting model is considered otherwise. Since the prediction demand generally lies between one week and two months, the training-data boundary falls roughly between three weeks and half a year.
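A short sketch of the model-selection rule just described (training length versus prediction length, with the ratio threshold of 3); the function name is an assumption.

```python
def choose_fitting_model(train_days: int, predict_days: int) -> str:
    """Pick the fitting approach as described above: decomposition fitting when the training
    history is more than three times the required prediction length, translation fitting otherwise."""
    return "decomposition" if train_days > 3 * predict_days else "translation"


# Prediction demand is typically one week to two months:
print(choose_fitting_model(train_days=500, predict_days=30))  # decomposition (as in the worked example)
print(choose_fitting_model(train_days=60, predict_days=30))   # translation
```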
When the amount of training data is sufficient to apply the decomposition fitting model, the model is trained directly; the logic of the algorithm is described below.
In the field of time-series analysis there is a common method called Time Series Decomposition, which splits the time series y(t) into several parts: a trend term, a seasonal term and a remainder term. In real-life production settings, the effect of holidays h(t) is usually added alongside the seasonal, trend and remainder terms. The overall model then consists mainly of three parts: growth (the growth trend), seasonality (the seasonal trend) and holidays (the influence of holidays on the predicted values). Here g(t) is a growth function that fits the aperiodic variation of the predicted values in the time series; s(t) expresses periodic changes, such as weekly or seasonal periodicity; h(t) expresses the influence on the predicted values of holidays whose occurrence is not strictly periodic. Adding the final unpredictable residual term gives the formula:
y(t)=g(t)+s(t)+h(t)+ε
Therefore, the decomposition fitting approach applies the above decomposition to the original time-series data, fits each component in turn, and finally recombines them to obtain the prediction model of the original time series.
For the growth trend term g(t), two forms of trend are considered first: a limited (saturating) trend and an unlimited trend; the former follows a logistic regression function and the latter a piecewise linear regression function. With the characteristics of logistic regression, the function can be written as
g(t) = C / (1 + exp(-k·(t - m)))
where C is called the maximum asymptotic value of the curve, k is the growth rate of the curve and m is the midpoint of the curve, while linear regression is simpler, i.e. f(x) = kx + b. When service-domain indexes are predicted, the data features have no clear practical limiting condition, so the piecewise linear regression function is used. The segmentation is realized by designing change points: several change points are set during model construction. With S change points placed at the times s_j (1 ≤ j ≤ S), the vector
δ = (δ_1, …, δ_S)^T
represents the adjustment of the initial growth rate k at each timestamp s_j, and the indicator function
a_j(t) = 1 if t ≥ s_j, otherwise 0
marks the change points in effect at a given timestamp, so the growth rate at timestamp t is k + a(t)^T δ, and from this growth rate the expression of each linear segment can be calculated. The positions of the change points are generally chosen evenly spaced, and the growth-rate changes at the change points are drawn from a probability distribution; to ensure the robustness of the model a Laplace distribution is adopted, i.e. the change of the growth rate satisfies δ_j ~ Laplace(0, τ), where τ is a preset degree of variation that can be obtained from the ratio of the range to the average of the training data. For the piecewise linear regression function used in the invention, the overall trend function is finally obtained as:
g(t) = (k + a(t)^T δ)·t + (m + a(t)^T γ),
where k is the initial growth rate, δ is the change in growth rate, m is the initial offset, and γ = (γ_1, …, γ_S)^T with γ_j = -s_j·δ_j is the offset increment caused by the growth-rate changes, which together with m forms the total offset.
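A minimal numerical sketch, assuming NumPy, of the piecewise linear trend g(t) = (k + a(t)^T δ)·t + (m + a(t)^T γ) with evenly spaced change points and Laplace-distributed growth-rate changes; the constants reuse values from the worked example later in the description (k = 1.0243, m = 429257, τ = 0.105), but the code itself is illustrative, not the patent's implementation.

```python
import numpy as np


def piecewise_linear_trend(t, k, m, changepoints, delta):
    """g(t) = (k + a(t)^T delta) * t + (m + a(t)^T gamma), with gamma_j = -s_j * delta_j."""
    t = np.asarray(t, dtype=float)
    s = np.asarray(changepoints, dtype=float)
    delta = np.asarray(delta, dtype=float)
    a = (t[:, None] >= s[None, :]).astype(float)   # indicator a_j(t) = 1 when t >= s_j
    gamma = -s * delta                              # offset increments keeping the trend continuous
    return (k + a @ delta) * t + (m + a @ gamma)


# Illustrative use: 500 days of history, one change point every L = 30 days,
# growth-rate changes drawn from Laplace(0, tau) as in the description.
rng = np.random.default_rng(0)
t = np.arange(500)
s = np.arange(30, 500 - 30, 30)                     # 15 change points, last one >= L days before the business date
delta = rng.laplace(0.0, 0.105, size=len(s))        # tau = 0.105 as computed in the worked example
trend = piecewise_linear_trend(t, k=1.0243, m=429257, changepoints=s, delta=delta)
print(trend[:3], trend[-1])
```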
For the periodic seasonal term s(t), the factor most commonly considered in time-series prediction is the periodicity formed by the daily, weekly, monthly, yearly and similar seasonal characteristics of the series.
The periodic function can be expressed by a Fourier series, namely:
s(t) = Σ_{n=1}^{N} ( a_n·cos(2πnt/P) + b_n·sin(2πnt/P) )
where P is the period. Because the basic time unit of the series in the model is the day, the selectable periods are the week, the month and the year; to simplify the calculation the yearly period is represented by P = 365.25 and the weekly period by P = 7, and the weekly period is used when the amount of data does not cover a year. Considering the typical character of such time-series data, the order N is set to 10 for the yearly period and to 3 for the weekly period, so that for each period type a Fourier parameter vector β = (a_1, b_1, …, a_N, b_N)^T is formed. The seasonal-term model is then obtained as s(t) = X(t)·β, where X(t) is the vector of the corresponding trigonometric functions and β is drawn from a probability distribution, here the normal distribution β ~ Normal(0, σ²).
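A small sketch, assuming NumPy, of the seasonal term s(t) = X(t)·β built from Fourier features, with the weekly order 3 and a normally distributed β as described above.

```python
import numpy as np


def fourier_features(t, period: float, order: int):
    """X(t): columns cos(2*pi*n*t/P) and sin(2*pi*n*t/P) for n = 1..order."""
    t = np.asarray(t, dtype=float)
    n = np.arange(1, order + 1)
    angles = 2.0 * np.pi * np.outer(t, n) / period
    return np.hstack([np.cos(angles), np.sin(angles)])   # shape (len(t), 2*order)


def seasonal_term(t, period, order, beta):
    return fourier_features(t, period, order) @ beta      # s(t) = X(t) @ beta


t = np.arange(500)
# beta ~ Normal(0, sigma^2) with sigma = 1.773 as in the worked example, weekly order 3:
beta_week = np.random.default_rng(1).normal(0.0, 1.773, size=2 * 3)
s_week = seasonal_term(t, period=7, order=3, beta=beta_week)
print(s_week[:7])
```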
For the special-date and holiday effect h(t): because special dates occur irregularly and their influence on the time-series data cannot be ignored, an independent model is built for each special date, with different window sizes before and after the date; the window values reflect, in the manner of a normal distribution, the gradual effect that the special date has on the time-series data.
The special dates are then combined into an overall special-date model
h(t) = Z(t)·κ
where Z(t) is the indicator function (regressor matrix) of the special-date models and κ ~ Normal(0, υ²) is the window effect under the normal distribution.
The model for fitting the time series data is completely constructed up to this point:
y(t) = g(t) + s(t) + h(t) + ε, and the function is then solved with the L-BFGS algorithm. The algorithm is rooted in Newton's method for root finding: in the univariate case Newton's method obtains the solution by repeatedly iterating a second-order Taylor expansion; when optimizing the original function one can differentiate it directly, the point where the derivative equals 0 being the solution, and after eliminating the constant term the iteration solves f′(x_k) + f″(x_k)(x - x_k) = 0. The fitting algorithm is optimized on the basis of the BFGS algorithm: BFGS extends the Newton idea to the second-order Taylor expansion of a multivariate function and, during the iteration, synchronously approximates the second-derivative matrix that is difficult to compute, turning the original single iteration into a scheme that simultaneously approximates the root of the function and its second derivative. Because the intermediate memory consumption of this scheme rises sharply as the number of variables grows, the L-BFGS algorithm stores, instead of the full second-derivative results of each iteration, only the parameters needed to recompute them; although the precision of each iteration decreases somewhat, memory is saved effectively. After fitting, the required regression model is obtained.
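A hedged sketch, assuming NumPy and SciPy, of fitting a simplified composite model (a linear trend plus a weekly Fourier term; the holiday term is omitted) by minimizing the squared error with the limited-memory BFGS optimizer, as the description indicates.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic daily series standing in for a service index (illustration only).
rng = np.random.default_rng(2)
t = np.arange(200, dtype=float)
y = 1000 + 5 * t + 80 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 20, t.size)


def model(params, t):
    """Simplified y(t) = g(t) + s(t): linear trend plus weekly Fourier seasonality of order 3."""
    k, m = params[0], params[1]
    beta = params[2:]                           # (a1, b1, a2, b2, a3, b3)
    ang = 2 * np.pi * np.outer(t, np.arange(1, 4)) / 7.0
    X = np.empty((t.size, 6))
    X[:, 0::2] = np.cos(ang)
    X[:, 1::2] = np.sin(ang)
    return k * t + m + X @ beta


def loss(params):
    resid = y - model(params, t)
    return 0.5 * np.sum(resid ** 2)


res = minimize(loss, x0=np.zeros(8), method="L-BFGS-B")   # limited-memory BFGS, as in the description
print(res.x[:2])   # fitted growth rate and offset; should be close to the generating 5 and 1000
```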
After the sequence-decomposition fitting model is obtained, the fitted time-series data are first compared with the fitting result, i.e. the prediction deviation on the training time series is calculated and stored in the database as the deviation-adjustment scheme for this index at the current business time. Second, it is judged whether the training time series has long-period characteristics: data spanning more than two years are judged to be long-period, and for long-period data the fitted model is adjusted with the prediction-length deviation adjustment.
Two ways of model adjustment are detailed below.
For time-series data with a longer time span, the fitting effect of the model weighs all training data rather evenly. But for financial-business time series with a certain timeliness, the data closer to the prediction period have a stronger influence on the prediction result; that is, the real data immediately before the business date affect the values predicted after the business date the most. To compensate for the weak weight given to this low-order autocorrelation when fitting a single time series, the fitting result for data with a longer time span needs to be adjusted correspondingly to enhance the fitting and prediction effects. Two adjustment modes are available: adjustment by the prediction-length deviation, and adjustment of the growth rate according to monthly features.
The prediction-length deviation adjustment uses one extra iteration of model fitting and takes the deviation between a new prediction and the existing real data as the deviation of the future prediction. If the business date is T and the originally required prediction is for the period T to T+k, the original fitting problem is f(0 → T) = y_{T→T+k}. On this basis only the period is changed and one more model fit is added, predicting f(0 → T-k) = y_{T-k→T}; the deviation of this prediction from the real data in that period is then used as the adjustment of the future prediction:
ŷ_{T→T+k} = y_{T→T+k} + y_{T-k→T} - ytrue_{T-k→T}.
Because of the limits on data length and computational complexity, the deviation adjustment is iterated only once; since a single iteration cannot rule out a large deviation, a further adjustment is applied when the deviation amplitude at some moment exceeds 30% or 70%: conventional-identification adjustment and special-mode adjustment, respectively. In the conventional-identification adjustment, the features of the current date are checked in turn, i.e. whether it is a working day, a rest day or a special holiday, and the corresponding adjustment is made according to the average growth rate of the historical values with that feature, similar to the growth-rate adjustment of the translation algorithm described later. In the special-mode adjustment, it is first judged from the data-mutation features whether the date is a special holiday; if so, the adjustment adds the growth rate with reference to the same date of the previous year; otherwise a special date is temporarily added for the current day and adjusted in the conventional-identification way, and the added date is reported for technicians to judge whether the data are abnormal.
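A minimal sketch of the one-iteration prediction-length deviation adjustment described above; the forecasting routine is passed in as a parameter because the patent's own fitting code is not available, and the naive forecaster used in the example is purely illustrative.

```python
import numpy as np


def deviation_adjusted_forecast(history: np.ndarray, k: int, fit_predict) -> np.ndarray:
    """One-iteration prediction-length deviation adjustment.

    `fit_predict(series, horizon)` is any fitting routine returning `horizon` future
    values (an assumed interface, not the patent's API).
    """
    future = fit_predict(history, k)        # f(0 -> T)   = y_{T -> T+k}
    back = fit_predict(history[:-k], k)     # f(0 -> T-k) = y_{T-k -> T}
    true_tail = history[-k:]                # real data in the window T-k -> T
    return future + (back - true_tail)      # y_{T->T+k} + y_{T-k->T} - ytrue_{T-k->T}, per the formula above


# Illustration with a naive "repeat the mean of the last week" forecaster:
def naive_fit_predict(series, horizon):
    return np.full(horizon, series[-7:].mean())


rng = np.random.default_rng(3)
hist = 1000 + np.arange(120) * 3.0 + rng.normal(0, 10, 120)
print(deviation_adjusted_forecast(hist, k=14, fit_predict=naive_fit_predict)[:3])
```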
The monthly-feature growth-rate adjustment is mainly used for predicting non-long-period time series of less than two years. Before the adjustment, the daily, weekly, monthly and, where available, yearly growth rates of the non-long-period data must be calculated, as in the translation-fitting part, and stored in the database. Then, for a decomposition fitting model adjusted by the monthly-feature growth rate, it is first judged whether the current date is a special holiday; if so, the growth-rate adjustment for the special date is added with reference to the same date of the previous year; if not, one period of growth-rate adjustment is applied in turn, using the average growth of working days or rest days and the average monthly growth, as for daily data. Since the decomposition fitting does not take the alternation of working days and months into account in its periodicity, these two items are the main ones considered in the growth-rate adjustment; and in this data class the month-dimension data are predicted with roughly one year of data, hence the adjustment.
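A simplified sketch of a monthly growth-rate adjustment under stated assumptions: it only computes the average month-over-month growth of the series level and compounds it onto the forecast, ignoring the working-day/rest-day and special-holiday distinctions made in the description.

```python
import numpy as np


def monthly_growth_rate(history: np.ndarray, days_per_month: int = 30) -> float:
    """Average month-over-month growth of the mean level (a simplified reading of the adjustment)."""
    n_months = len(history) // days_per_month
    blocks = history[-n_months * days_per_month:].reshape(n_months, days_per_month)
    means = blocks.mean(axis=1)
    return float(np.mean(means[1:] / means[:-1] - 1.0))


def adjust_forecast_by_monthly_growth(forecast: np.ndarray, g_month: float,
                                      days_per_month: int = 30) -> np.ndarray:
    """Scale the forecast by the average monthly growth, compounded per predicted month."""
    months_ahead = np.arange(len(forecast)) // days_per_month + 1
    return forecast * (1.0 + g_month) ** months_ahead


rng = np.random.default_rng(4)
hist = 1000 * 1.02 ** (np.arange(360) / 30) + rng.normal(0, 5, 360)   # roughly 2% monthly growth
g = monthly_growth_rate(hist)
print(round(g, 3))                                                    # close to 0.02
print(adjust_forecast_by_monthly_growth(np.full(30, 1200.0), g)[:3])
```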
After the adjustment of the decomposition fitting model is completed, the model fitting process using the translation mode is described below.
Time-series data that use translation fitting are characterized by a short period, with a training length less than 3 times the prediction length, so the prediction demand cannot be met by ordinary machine learning or fitting. A training period that is too short makes the prediction highly random, and overfitting is common. The regularities that do exist in the time series are therefore taken as the key to prediction, i.e. the model is fitted on the basis of the growth effects of days, months, years and special holidays along the time dimension, using the formula
f(t) = D(t) + M(t)·1_{if cross-month} + Y(t)·1_{if cross-year} + S(t)·1_{if special}
For the daily growth part D(t), working days and non-working days are distinguished, i.e. the average historical growth rate is calculated separately for the partition to which the date belongs; once obtained, the growth is applied on top of the previous period in the training data and multiplied by the corresponding weight coefficient, giving
D(t) = d(t - T_d)·(1 + g_day)·r_d,
where T_d is one day and the weight coefficient r_d depends on whether the subsequent terms exist.
For the monthly growth part M(t), whether this term exists is decided by whether the training data cross a month. If it exists, the average monthly growth rate of the data is calculated and applied, together with the weight coefficient, to the earlier value, giving M(t) = m(t - T_m)·(1 + g_month)·r_m, where T_m is the number of days of the current month.
For the yearly growth part Y(t), whether this term exists is decided by whether the training data cross a year. If it exists, the average yearly growth rate of the data is calculated and applied, together with the weight coefficient, on the basis of the previous year, giving Y(t) = y(t - T_y)·(1 + g_year)·r_y, where T_y is the number of days of the current year.
For the special-date part S(t), the importance is high but the computability is limited, so when a value exists for the special date in the previous period it is used directly as the value of the current period and multiplied by the weight, S(t) = s(t - T_s)·r_s. When this term exists its weight is 0.5, i.e. the other terms share the remaining 0.5; on non-special dates the other terms share the full weight of 1. Among the first three terms, the weight is distributed in descending proportion over day, month and year according to which terms exist, and the construction of the translation fitting model is thus complete.
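A simplified, illustrative sketch of translation fitting, assuming pandas; it models only the daily part (working-day versus rest-day average growth applied to the previous value) and a monthly part when the training data cross a month, with illustrative weights, and omits the yearly and special-date parts as well as the exact weight-sharing rule.

```python
import numpy as np
import pandas as pd


def translation_forecast(series: pd.Series, horizon: int) -> pd.Series:
    """Simplified translation-fitting sketch for short histories (an illustrative reading
    of the description, not the patent's implementation)."""
    s = series.copy()
    is_workday = s.index.dayofweek < 5
    g_work = s[is_workday].pct_change().dropna().mean()    # average growth between consecutive working-day values
    g_rest = s[~is_workday].pct_change().dropna().mean()   # average growth between consecutive rest-day values
    g_day = s.pct_change().dropna().mean()                 # overall average daily growth
    crosses_month = s.index[0].month != s.index[-1].month

    idx = pd.date_range(s.index[-1] + pd.Timedelta(days=1), periods=horizon, freq="D")
    values, hist = [], s.copy()
    for day in idx:
        g = g_work if day.dayofweek < 5 else g_rest
        d_part = hist.iloc[-1] * (1 + g)                   # D(t): grow the previous value
        if crosses_month and len(hist) >= 7:
            m_part = hist.iloc[-7] * (1 + g_day)           # M(t): grow the value one week back
            value = 0.7 * d_part + 0.3 * m_part            # illustrative descending weights (day > month)
        else:
            value = d_part
        values.append(value)
        hist.loc[day] = value
    return pd.Series(values, index=idx)


# Example on a short synthetic history (too short for decomposition fitting):
hist = pd.Series(1000 + np.arange(40) * 8.0,
                 index=pd.date_range("2021-05-01", periods=40, freq="D"))
print(translation_forecast(hist, horizon=14).head())
```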
In conclusion, based on the various situations, time-series fitting models for the different situations are finally obtained; service index data can be predicted well according to demand, and the results are then stored to complete the whole time-series prediction process.
With the prediction function in place, the predicted content is application-domain data from production business, while the capacity related to that business is mainly embodied in the general front-end system's management and control of each service system. Therefore, to apply the prediction results to the essential requirement of capacity allocation, i.e. to aggregate and analyze the changes and information produced in any system and derive a resource-optimization proposal for each related system, the final resource configuration scheme is realized in two steps.
Step one, the future bearing requirement is calculated from the previously collected bearing capacity per unit application service. Because a bearing-capacity test is performed on every service system before it goes online, a theoretical service peak under the given application services can be obtained, and this peak is the service threshold at which the system can operate normally. Dividing the service threshold by the number of application services gives the bearing capacity of a unit application service under the current service system.
Step two, the service allocation is calculated in the application service coordinator from the predicted data period and the unit application-service bearing capacity, and the allocation value is fed back to the general front-end system as an instruction, so that the general front-end system can pre-allocate services to the specified service system and the service prediction and early warning of the service system are realized. The calculation depends on the type of the predicted index: service indexes such as daily transaction volume, average response time and daily throughput each have a unique threshold within a system. When the predicted data reach 80% of the threshold or exceed the threshold, a capacity-risk prompt and an alarm are raised respectively; at the same time, after authorization is obtained, the predicted peak is divided by the unit application-service bearing capacity, rounded up, and the resulting service volume allocation is transmitted to the general front-end system as an instruction. After the general front-end system obtains the allocation information, it issues instructions to allocate the application-service resources of the specified service system. This realizes a complete flow from service prediction to application resource allocation, guaranteeing capacity risk control while improving efficiency.
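A short sketch of the step-two decision logic under assumed names: a capacity-risk prompt at 80% of the threshold, an alarm when the threshold is exceeded, and an application-service allocation equal to the predicted peak divided by the unit bearing capacity, rounded up. The predicted peak in the example is hypothetical; the threshold and unit capacity reuse the EPAY/RJYL figures from the worked example below.

```python
import math


def capacity_decision(predicted_peak: float, threshold: float, unit_capacity: float):
    """Return (status, allocation): risk prompt at 80% of the threshold, alarm above it,
    and the rounded-up application-service allocation. Names and return shape are illustrative."""
    if predicted_peak > threshold:
        status = "alarm"
    elif predicted_peak >= 0.8 * threshold:
        status = "capacity risk prompt"
    else:
        status = "normal"
    allocation = math.ceil(predicted_peak / unit_capacity)   # predicted peak / unit bearing capacity, rounded up
    return status, allocation


# Hypothetical predicted peak with the EPAY/RJYL figures (threshold 600000, unit capacity 100000):
print(capacity_decision(predicted_peak=520000, threshold=600000, unit_capacity=100000))
# ('capacity risk prompt', 6)
```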
Referring to fig. 2 to 5, a second general embodiment of the present invention:
a capacity data processing and scheduling method, the method comprising:
step s201, the general front-end system counts its own service supply situation and adjusts and distributes the service supply to the systems of each service flow; step s202, the general front-end system connects, through channels, the channel system and the service system required by a complete service;
step s203, calculating bearing capacity data based on system pressure test, and collecting and counting service index data based on a monitoring device;
step s204, the data acquisition synchronizer manages the acquisition device, receives the data, classifies the data systematically and transmits the data to the capacity platform server;
step s205, selecting a target index based on the analysis; if the modeling requirement exists, jumping to step s301; if the predicted value exists, jumping to step s601;
step s601, feeding back the prediction result to the capacity platform server, displaying whether an alarm state exists and calculating and transmitting allocation parameters based on the relation between the threshold value and the prediction value;
step s602, the application service coordinator receives the allocation parameters, calculates a service adjustment value based on the bearing capacity data and the difference from the prediction threshold, and sends a service volume allocation request to the specified system; jumping to step s201;
step s301, the capacity platform server sends a prediction request of a target index to the algorithm platform analysis server according to the requirement;
step s302, the algorithm platform server identifies the index code of the target index and the system code of the target index;
step s303, inquiring system data characteristics according to the system code and the index code;
step s304, calculating the time length of the predicted characteristic data;
step s103, judging whether to perform translation fitting; jumping to step s104 when the translation fitting is determined, and jumping to step s107 when the non-translation fitting is determined;
step s104, calculating the growth rate, and writing the growth rate into an intermediate database;
step s105, judging the translation structure and calculating the growth weight;
step s106, obtaining the growth rate and the growth weight, calculating the translation fitting, jumping to step s116,
step s107, constructing a sequence decomposition fitting model;
step s108, fitting model optimization and realizing prediction;
step s109, judging the adjustment mode according to the time length of the feature data: if the growth rate is to be adjusted according to monthly features, jumping to step s110; if the prediction length is to be adjusted according to the deviation, jumping to step s113;
step s110, calculating a growth rate, and writing the growth rate into an intermediate database;
step s111, obtaining the monthly feature adjustment of the growth rate, and jumping to step s 116;
step s113, calculating the predicted deviation as an adjustment scheme, and warehousing;
step s114, obtaining a predicted length deviation adjustment;
step s115, adjusting twice based on the deviation amplitude;
step s116, the prediction data are saved and/or a jump is made directly to step s601.
An automatic capacity resource allocation system, comprising: various services; and/or a number of channel systems; a total front system; a plurality of business systems; a data acquisition synchronizer; an application service coordinator; a capacity platform server; an algorithm analysis server; a database and a plurality of memories for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the capacity data processing and scheduling method.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements the capacity data processing and scheduling method.
Referring to fig. 2 to 5, a third embodiment of the present invention:
Assume now that an electronic payment platform system, coded EPAY, is connected to the general front-end system. When this service system is connected a pressure test is required, after which the bearing capacity data are obtained: the general front-end system assigns the corresponding system a current application service volume of 2, and for a certain index of the electronic payment platform, the daily transaction volume coded RJYL, a threshold of 600000 is obtained for the EPAY system under the current conditions; at the same time, from the application service volume, the threshold and the parameter settings, the unit bearing capacity of the RJYL index under the EPAY system is calculated as C = 100000 per application service. These data are transmitted to the data acquisition synchronizer as the bearing capacity data.
For a period of time after the EPAY system is connected, the data acquisition synchronizer continuously collects the instantaneous transaction-volume index along the time dimension through a monitor built into the EPAY system. All instantaneous indexes are statistically processed in the acquisition synchronizer into processed indexes with the day as the unit; in this case the RJYL index is obtained. At the same time, the bearing capacity data and the system information of the indexes are organized and stored with the processed service index data, and both are transmitted to the capacity platform server.
After receiving the bearing capacity data and the service index data, the capacity platform server classifies and stores them by system and by index. When a prediction task is received, it sends a request to the algorithm platform server; the request contains the system code EPAY, the index code RJYL, the maximum historical data length and a prediction length of 30 days.
After receiving the prediction request, the algorithm platform server starts the prediction process. It first obtains the historical data, i.e. the processed service index data (in this example the time series of the RJYL index of the EPAY system), according to the request parameters. Because the request asks for the longest available history, the algorithm platform server queries all the target data from the database, obtaining a time-series list with one value per day, as shown in the table.
Date:  2018-08-18 | 2018-08-19 | 2019-01-01 | 2019-12-31
Value: 429257     | 1179578    | 821226     | 601519
Next, after the time-series data (y_1, …, y_p) are obtained, the difference in days between the earliest and the latest date is calculated as the time length P of the feature data; here P = 500. The relation between P and L is then judged to select the fitting model: sequence-decomposition fitting is used when P is more than three times L, and translation fitting is selected otherwise. Here the former is selected.
The decomposition fitting of the time-series data is divided into two steps: first the construction of the model, and second the optimization and solution. Constructing the model means building, from the current time series (y_1, …, y_p), a function y_t that satisfies all data points; with the last term being the unpredictable error term, the original time series is decomposed into three component series, namely the trend term, the period term and the special term.
For the trend term, a piecewise linear regression function is employed. In the time series data (y1, …, yP), one change point is set every L days, with the interval between the last change point and the service date being not less than L days. The change point positions are the breakpoints of the piecewise linear function; the sequence of change points is (2018-09-16, 2018-10-15, …, 2019-11-29), 15 change points in total. Since a linear function has the form y = k·t + b, the piecewise linear function defined by these breakpoints is expressed as
g(t) = (k + a(t)^T δ)·t + (m + a(t)^T γ),
where k is the initial growth rate, (s1, …, s15) are the 15 change points, and
δ = (δ1, …, δ15)^T
is the vector formed by the growth rate adjustments at the 15 change points. Each adjustment is set to follow a Laplace distribution, δj ~ Laplace(0, τ); the value of τ, calculated as the quotient of the range and the average of the historical time series data, is 0.105, so a randomly distributed set of growth rate adjustments is obtained:
δ = [0.56140769, 1.00103428, -0.04021555, 0.06693926, …, 0.6871913].
From this the growth rate of every piecewise linear segment relative to k, i.e. k + a(t)^T δ, can be derived, where the segment in which each growth rate applies is expressed through the indicator vector a(t). Here m is the initial offset and γ = (γ1, …, γ15)^T with γj = -sj·δj is the offset increment induced by the growth rate changes; a(t)^T γ gives the offset increment of each segment, which together with m forms the total offset. Setting the x-axis origin of the rectangular coordinate system to the date 2018-08-18, so that the service date corresponds to t = 500, m is the first date value 429257 and k is the average growth of the first segment, 1.0243. The expression of the trend term before model optimization is then
g(t) = (1.0243 + a(t)^T δ)·t + (429257 + a(t)^T γ).
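A minimal sketch of how such a trend term can be assembled is given below; the helper name, the evenly spaced change points and the synthetic δ values are illustrative assumptions, not the patented code.

import numpy as np

def trend_term(t, k, m, changepoints, delta):
    # Piecewise linear trend g(t) = (k + a(t)^T delta) * t + (m + a(t)^T gamma),
    # with gamma_j = -s_j * delta_j so that the segments join continuously.
    t = np.asarray(t, dtype=float)
    s = np.asarray(changepoints, dtype=float)        # change points, in days since t = 0
    delta = np.asarray(delta, dtype=float)
    a = (t[:, None] >= s[None, :]).astype(float)     # indicator matrix a(t)
    gamma = -s * delta                               # offset corrections
    return (k + a @ delta) * t + (m + a @ gamma)

# Illustrative use with the numbers quoted in the text.
rng = np.random.default_rng(0)
s = np.arange(29, 500, 30)[:15]                      # roughly one change point every 30 days
delta = rng.laplace(0.0, 0.105, size=15)             # delta_j ~ Laplace(0, tau), tau = 0.105
g = trend_term(np.arange(500), k=1.0243, m=429257, changepoints=s, delta=delta)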
For the period term, since P < 365 × 2, i.e. the history does not cover two years, only the weekly period is modelled, using a 3rd order Fourier series with period 7:
s(t) = Σ_{n=1..3} [ an·cos(2πnt/7) + bn·sin(2πnt/7) ].
The amplitude parameter vector of the series is initialized from a normal distribution, β = (a1, b1, …, a3, b3) ~ Normal(0, σ^2), where σ is obtained by sliding a window of the prediction length L over the historical time series, computing in each window the quotient of the difference and the average, and then averaging over all windows, giving 1.773. This yields the initial parameter values
[a1, b1, …, a3, b3] = [-4.1, 6.5, 2.1, -5.8, 0.56, 1.5]
and the initial expression of the period term model:
s(t) = -4.1·cos(2πt/7) + 6.5·sin(2πt/7) + 2.1·cos(4πt/7) - 5.8·sin(4πt/7) + 0.56·cos(6πt/7) + 1.5·sin(6πt/7).
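A minimal sketch of this Fourier period term, under illustrative naming of our own, could look as follows:

import numpy as np

def fourier_period_term(t, beta, period=7.0, order=3):
    # s(t) = sum over n of a_n*cos(2*pi*n*t/period) + b_n*sin(2*pi*n*t/period),
    # with beta packed as [a1, b1, ..., a_order, b_order].
    t = np.asarray(t, dtype=float)
    features = []
    for n in range(1, order + 1):
        features.append(np.cos(2 * np.pi * n * t / period))
        features.append(np.sin(2 * np.pi * n * t / period))
    X = np.stack(features, axis=1)                   # shape (len(t), 2*order)
    return X @ np.asarray(beta, dtype=float)

# Initial parameters quoted in the text.
beta0 = [-4.1, 6.5, 2.1, -5.8, 0.56, 1.5]
s_weekly = fourier_period_term(np.arange(500), beta0)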
For the special term, a separate model component is constructed for each different festival:
h(t) = Z(t)·κ,
where Z(t) indicates whether date t falls on each configured special date and κ ~ Normal(0, υ^2) is the effect of each special date on the data in the initial state; υ defaults to 10, but can be set larger for particularly influential holidays. For the current data, the Spring Festival, Double Eleven, National Day, Valentine's Day and Labour Day are added to the model. Although the prediction period contains no special date, the historical data do, so they must be taken into account as an influence, and the initial expression of the special term model is obtained as the sum of these per-holiday components.
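A minimal sketch of such a special-date term is shown below; the helper name, the indicator construction and the holiday subset are illustrative assumptions.

import numpy as np

def holiday_term(dates, holiday_dates, kappa):
    # h(t) = Z(t) @ kappa, where Z(t) indicates whether a date is one of the
    # configured special dates and kappa holds the per-holiday effects.
    dates = np.asarray(dates, dtype="datetime64[D]")
    Z = np.stack([dates == np.datetime64(h) for h in holiday_dates], axis=1).astype(float)
    return Z @ np.asarray(kappa, dtype=float)

# kappa_i ~ Normal(0, 10^2) in the initial state; the holiday list here is an
# illustrative subset of the festivals named in the text.
holidays = ["2019-02-05", "2019-10-01", "2019-11-11"]
kappa0 = np.random.default_rng(1).normal(0.0, 10.0, size=len(holidays))
h = holiday_term(np.arange("2018-08-18", "2019-12-31", dtype="datetime64[D]"), holidays, kappa0)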
The overall decomposition model before optimization fitting is thus obtained:
y_t = g(t) + s(t) + h(t) + ε_t,
where ε_t is the unpredictable error term.
Substituting each historical time point into the model gives a fitted value; once the model is built, the second step, the optimization solution, begins. The optimization objective is to make the true values and the model fitted values as close as possible at every historical time point, i.e. to minimize
F(X) = Σ_{t=1..P} ( y_t - (g(t) + s(t) + h(t)) )^2.
Because the function contains a large number of initialized variable parameters, namely the parameters of each decomposition term, g(t) = g(t; k, m, δ), s(t) = s(t; a1, b1, a2, b2, a3, b3) and h(t) = h(t; κ), minimizing F means optimizing over all of them:
min F = min F(X) = min F(k, m, δ, a1, b1, …, κ).
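A minimal sketch of this objective, reusing the trend_term, fourier_period_term and holiday indicator helpers sketched above and packing the parameters into one vector (the packing order is an assumption of ours), could be:

import numpy as np

def objective(params, t, y, changepoints, holidays_Z):
    # Least-squares objective F(X) summed over all historical points.
    n_cp = len(changepoints)
    k, m = params[0], params[1]
    delta = params[2:2 + n_cp]
    beta = params[2 + n_cp:8 + n_cp]                 # 6 Fourier amplitudes
    kappa = params[8 + n_cp:]                        # per-holiday effects
    y_hat = (trend_term(t, k, m, changepoints, delta)
             + fourier_period_term(t, beta)
             + holidays_Z @ kappa)
    return np.sum((np.asarray(y, dtype=float) - y_hat) ** 2)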
The solution uses the L-BFGS algorithm. The objective F(X) is first expanded in a second-order Taylor series around the current point X_k, the higher-order infinitesimal part is ignored, and differentiation gives
F'(X) = F'(X_k) + F''(X_k)·(X - X_k).
Setting F'(X) = 0 at the minimum gives the update
X_{k+1} = X_k - H^{-1}·g_k, k = 0, 1, …,
where H^{-1} is the inverse of the second derivative of the objective function (a matrix, because of the multiple parameters) and g_k is the first derivative function of the objective. When k = 0, the initialized model parameters X_0 are inserted into this formula to calculate the once-optimized parameters X_1, and the iteration then continues to the optimal parameters X* = (k*, m*, δ*, a1*, b1*, …, κ*), the parameter combination that gives the best model fit, which completes the model optimization. Each iteration requires the inverse of the second derivative and the first derivative of the objective function; the latter is simple to compute, while the former is obtained by a two-loop iterative approximation:
H_{k+1}^{-1} = (I - ρ_k·s_k·y_k^T)·H_k^{-1}·(I - ρ_k·y_k·s_k^T) + ρ_k·s_k·s_k^T, with ρ_k = 1/(y_k^T·s_k),
s_k = X_{k+1} - X_k, y_k = g_{k+1} - g_k.
In each iteration the matrix is not computed directly; instead only the s_k and y_k of the last 10 iterations are kept and used to rebuild it, which reduces the amount of computation.
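In practice an off-the-shelf L-BFGS routine can minimize the objective sketched above. The fragment below is only an illustrative sketch under the same assumed parameter packing; it is not the patented implementation.

import numpy as np
from scipy.optimize import minimize

def fit(t, y, changepoints, holidays_Z, tau=0.105, sigma=1.773):
    rng = np.random.default_rng(0)
    x0 = np.concatenate([
        [1.0243, y[0]],                                   # k, m
        rng.laplace(0.0, tau, size=len(changepoints)),    # delta_j ~ Laplace(0, tau)
        rng.normal(0.0, sigma, size=6),                   # beta ~ Normal(0, sigma^2)
        rng.normal(0.0, 10.0, size=holidays_Z.shape[1]),  # kappa ~ Normal(0, 10^2)
    ])
    res = minimize(objective, x0, args=(t, y, changepoints, holidays_Z),
                   method="L-BFGS-B", options={"maxcor": 10})  # keep 10 (s_k, y_k) pairs
    return res.x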
After model fitting is completed, the transaction amounts of the next L days can be predicted directly from the model as (46800, 518422, …, 539076), but the predicted values still require a certain adjustment. The choice of adjustment method depends on the time length of the historical data: it is judged whether the historical length exceeds the sum of two years and the prediction length; if so, prediction length deviation adjustment is used, and if not, monthly feature adjustment of the growth rate is used. Since P < 365 × 2, the growth rate is adjusted by monthly features here.
According to this adjustment scheme, the growth rates of the historical data must be calculated first. Over the 500 historical days, since the period term in the model fits only the weekly cycle and the fitted length does not exceed two years, the monthly growth rate and the workday/weekend growth rates are taken as the key adjustment quantities (the annual cycle is not considered), supplemented by an adjustment for special dates. The average monthly growth rate r_m is calculated as follows: in the historical data, the daily average R_month of each month is computed, giving 12 values (463272, 458594, …, 507193); the differences between consecutive monthly averages are taken as the month-to-month growth, and their average gives r_m = 2902. The average growth rate r_wd over all workdays and r_we over all weekends in the 500 days are calculated by dividing differences by a division dimension: the division dimension of the workdays is (1, 1, 1, 1, 3) and that of the weekends is (1, 6). All workday history values are arranged in order, consecutive differences are taken and divided by the corresponding division dimension (for example, the Friday-to-Monday difference is divided by 3); likewise all weekend history values are arranged in order, differenced and divided by their division dimension (the difference across the working week, from Sunday to the next Saturday, is divided by 6); the respective differences are then averaged, giving the two average growth rates 3300 and 740. For special dates, it is first checked whether a special date falls within the prediction window; in this example only one day qualifies, the Spring Festival on 2020-01-25. It is then checked whether the Spring Festival appears in the historical data; here it does, on 2019-02-05, so that historical value is applied directly to the prediction date 2020-01-25 and taken as the final predicted value for that day. If the historical data contain several occurrences of a special date, the historical value of the most recent occurrence is applied and then corrected by the growth rate calculated over the special dates to give the final value. For the non-special dates, the 29 remaining predicted values are adjusted upward according to their interval from the service date, using a total growth effect of half the workday (or weekend) growth rate plus half the monthly growth rate. The 29 prediction deviations for 2020-01-01 to 2020-01-30, with 2020-01-25 removed, are as follows:
(ε1, …, ε29): the 29 growth rate deviations (the numerical values appear only in the original formula images). After the calculation of the growth rate deviations is completed, the deviations are added to the original prediction result, giving the adjusted prediction result (likewise shown only in the original formula images).
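A minimal sketch of one possible reading of this monthly feature adjustment is given below; the per-day spreading of the monthly rate, the equal 0.5 weights and the stand-in prediction series are assumptions of ours rather than the patented formula.

import numpy as np
import pandas as pd

def growth_rate_adjustment(pred, pred_dates, business_date, r_month, r_workday, r_weekend):
    # Each non-special predicted day is shifted by half the workday/weekend growth
    # rate plus half the monthly growth rate (spread per day), scaled by its
    # distance in days from the business date.
    pred = np.asarray(pred, dtype=float)
    dates = pd.to_datetime(pred_dates)
    days_out = (dates - pd.Timestamp(business_date)).days.values
    daily_rate = np.where(dates.dayofweek >= 5, r_weekend, r_workday)
    deviation = days_out * (0.5 * daily_rate + 0.5 * r_month / 30.0)
    return pred + deviation

# Numbers quoted in case one: r_m = 2902, r_wd = 3300, r_we = 740.
dates = pd.date_range("2020-01-01", periods=30, freq="D")
raw_pred = np.linspace(468000, 539076, 30)       # illustrative stand-in for the model output
adjusted = growth_rate_adjustment(raw_pred, dates, "2019-12-31", 2902, 3300, 740)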
The adjusted result is stored as the final prediction result.
Case two:
Case two illustrates the use of prediction length deviation adjustment, again with the sequence decomposition fitting model. Suppose the historical data cover three years, as follows:
Date                              2017-01-01   2017-01-02   ...   2019-01-01   ...   2019-12-31
Daily transaction amount (RJYL)   327536       401822       ...   821226       ...   601519
When the demand is still to forecast the 30 daily transaction amount values from 2020-01-01 to 2020-01-31, the prediction length is L = 30 and the historical data time length is P = 1095. Since P > L × 3, the model selection step again chooses the sequence decomposition fitting model. The model construction differs from case one only in the choice of period terms: here both a weekly and a yearly period term are used, the yearly period term being a Fourier series with period 365.25 and order 10. The setting and optimization of the other parameters are similar to case one and are not repeated here.
After the model is constructed and optimized, the fitted prediction for the 30 days after the service date, (46800, 518422, …, 539076), is obtained, and the adjustment method is judged according to the time length of the historical data. Since P > 365 × 2 this time, the prediction length deviation adjustment method is used.
In prediction length deviation adjustment, the deviation between the true values in the historical data and the fitted predicted values is used as the adjustment for the corresponding future period. The service date T is moved back into the history by L days; using the 1065 days of history from 2017-01-01 to 2019-12-01, i.e. the history with the last 30 days removed, the daily transaction amounts of the 30 days after the new service date are fitted and predicted again by sequence decomposition under the same conditions. In other words, the daily transaction amounts of the 30 days after the new service date are fitted from the model.
Thereby obtaining the historical true values of the 30 days from the 2019-12-02 day to the 2019-12-31 day:
(437200,468382, …,518956), and decomposing the fit predictors:
(440020,438922, …,373933) and the difference between the predicted value and the true value is used to obtain the deviation (epsilon) of 30 daysT-29,…,εT) The 30 deviations are applied to future prediction results (2820, -29460, …, -185023) respectively, and prediction data of 30 days after adjustment can be obtained
(the adjusted 30-day prediction values; shown only in the original formula images)
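A minimal sketch of this deviation adjustment follows; the sign convention (subtracting the bias the refit showed on the held-out tail) and the stand-in future predictions are assumptions of ours.

import numpy as np

def length_deviation_adjustment(future_pred, history_true_tail, refit_pred_tail):
    # The service date is moved back L days, the last L days are re-predicted from
    # the shortened history, and the per-day deviation (refit prediction minus true
    # value) is applied to the corresponding future prediction.
    eps = np.asarray(refit_pred_tail, dtype=float) - np.asarray(history_true_tail, dtype=float)
    return np.asarray(future_pred, dtype=float) - eps

# Fragment with the numbers quoted in case two (first two of the 30 days).
true_tail  = [437200, 468382]
refit_tail = [440020, 438922]          # deviations 2820 and -29460, as in the text
future     = [468000, 518422]          # illustrative future predictions
adjusted = length_deviation_adjustment(future, true_tail, refit_tail)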
The ratio of each adjusted predicted value to the value before adjustment is then examined; when the deviation amplitude exceeds 30% or 70% of the pre-adjustment reference value, a conventional identification adjustment or a special adjustment is applied, respectively. Here 2020-01-30 requires conventional identification adjustment: the day is identified as a workday and a non-special day, so the average workday growth rate r_wd = 714 is calculated from all historical data, the growth is accumulated from the service date, and a secondary adjustment value of 714 × 29 is added to the adjusted forecast. 2020-01-24 requires special adjustment: if the day is a special date, the value of the same date in the most recent cycle plus the average growth rate is taken as the final predicted value for that day; if the day is not a special date, its data are provisionally regarded as abnormal, no secondary adjustment is made, and the situation is reported to the technician. Finally, the prediction data with the days 2020-01-24 and 2020-01-30 secondarily adjusted are obtained and stored as the final prediction result.
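The branching of this secondary adjustment, as we read it, can be sketched as follows; the concrete adjustment actions remain as described in the prose above.

def classify_secondary_adjustment(pred_before: float, pred_after: float) -> str:
    # Compare the adjusted prediction with the pre-adjustment reference value:
    # beyond a 70% deviation a special adjustment applies, beyond 30% a
    # conventional identification adjustment, otherwise no secondary adjustment.
    ratio = abs(pred_after - pred_before) / abs(pred_before)
    if ratio > 0.70:
        return "special adjustment"
    if ratio > 0.30:
        return "conventional identification adjustment"
    return "no secondary adjustment"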
In subsequent work it is judged whether any of the 30 predicted values exceeds the preset threshold; when prediction data exceeding the threshold exist, they are reported as early warning information to the application field management system.
Case three:
When the prediction demand is still 30 days but the available historical data are quite limited and do not reach three times the prediction length, i.e. P < 3 × L, translation fitting is used for prediction. Assume here that the historical data time length is 61 days, as follows.
Date                              2019-11-01   2019-11-02   ...   2019-12-01   ...   2019-12-31
Daily transaction amount (RJYL)   818657       481261       ...   425818       ...   601519
After the historical data are obtained, the workday growth rate, the weekend growth rate, the monthly growth rate and the annual growth rate of the index under this system are calculated where possible; in this example the annual growth rate cannot be calculated because only 61 days of history exist, while the others can. The calculation method is the one described in detail in the growth rate adjustment: the differences of the averages under the respective division dimensions are taken as growth, and the average of these differences is the average growth rate, so a rate can be calculated only when at least two periods exist. This gives an average monthly growth rate r_m = 28762, an average workday growth rate r_wd = 253 and an average weekend growth rate r_we = 3467. Next, following the first step of the prediction length deviation adjustment, the service date T is shifted back into the history by L days, and the 30 days of data between the new and the old service date are translated directly into the future prediction window, giving the translated reference 30-day prediction (945504, …, 601519). Each day of this reference prediction is then adjusted by the growth rates. The components of the growth rate are obtained from the formula; because the annual growth rate is unavailable, the weights of the workday growth rate, the weekend growth rate and the monthly growth rate are (0.5, 0.5, 0.5). From this the 30-day growth rate deviations can be calculated as
(the 30 growth rate deviation values; shown only in the original formula images)
Finally, by an operation similar to that in the growth rate adjustment, the growth rate deviations are added to the translated reference values, and the predicted daily transaction amounts for the 30 days after the service date are obtained, completing the translation fitting.
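A minimal sketch of this translation fitting follows; the equal 0.5 weights, the per-day spreading of the monthly rate and the stand-in history values are illustrative assumptions of ours.

import numpy as np
import pandas as pd

def translation_fitting(history, history_dates, horizon, r_month, r_workday, r_weekend):
    # The last `horizon` days of history are shifted forward as the reference
    # prediction, then each day is nudged by a growth rate deviation built from
    # the workday/weekend rate and the monthly rate.
    history = np.asarray(history, dtype=float)
    ref = history[-horizon:].copy()                   # translated reference prediction
    last = pd.Timestamp(history_dates[-1])
    dates = last + pd.to_timedelta(np.arange(1, horizon + 1), unit="D")
    daily_rate = np.where(dates.dayofweek >= 5, r_weekend, r_workday)
    deviation = np.arange(1, horizon + 1) * (0.5 * daily_rate + 0.5 * r_month / 30.0)
    return ref + deviation

# Numbers quoted in case three: r_m = 28762, r_wd = 253, r_we = 3467, 61 days of history.
hist_dates = pd.date_range("2019-11-01", periods=61, freq="D")
hist_values = np.linspace(818657, 601519, 61)         # illustrative stand-in for the real series
pred = translation_fitting(hist_values, hist_dates, 30, 28762, 253, 3467)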
In summary, after prediction and storage are realized, the early warning and optimization strategy feedback mechanism is introduced below by way of example.
Assume that, following the cases above, predicted values for the 30 days after the service date T exist, the applicable index being the daily transaction amount index of the EPAY system. First, based on the stored information, the threshold G and the unit bearing capacity C of the daily transaction amount index of the EPAY system are retrieved. The threshold is G = 600000, and the 30 predicted values are
(the 30 predicted values; shown only in the original formula images). The maximum of the predicted values is compared with the threshold, and it is found that the maximum predicted value exceeds G = 600000.
Therefore, the alarm of capacity prediction is needed, after the operator confirms and passes the allocation request, the allocation parameters can be transmitted to the application service allocator, the number of the application services needing to be increased is calculated to be 1 in the allocator according to the unit bearing capacity C of 100000/item, the numerical value is transmitted to the total front system as an instruction, the total front system is used as an entrance and exit of all the admission service systems, the number of the allocated application services of the EPAY system can be increased by 1 unit according to the instruction, and an early warning mechanism is realized.
After receiving the instruction, the total front system converts the application service allocation scheme into an internal instruction and addresses the marked system EPAY via the listener tag. To increase application services, the service configurator copies the specified number of services on the marked system and starts them; the CPU of the total front system then automatically balances process handling according to the different numbers of services of the different systems. To reduce application services, the service configurator only needs to terminate the specified number of application services under the marked system; the CPU of the total front system again balances process handling automatically, and the service configurator subsequently recycles the services that are no longer started.
After the application service volume of an access system under the total front system is adjusted, the channel capacity of that access system is raised or lowered. Whether or not the total processing capacity of the total front system changes, the response capability and service processing capability of the access system change with the changed proportion of application service volume, i.e. the usage proportion of the physical resources of the total front system changes.
Based on the same inventive concept, an embodiment of the present application provides an electronic device, as shown in FIG. 6. The electronic device includes a memory and a processor, and the memory is communicatively connected with the processor.
The memory stores a computer program, and when the computer program is executed by the processor, the capacity data processing and allocation method provided by the embodiments of the present application is implemented.
Those skilled in the art will appreciate that the electronic devices provided by the embodiments of the present application may be specially designed and manufactured for the required purposes, or may include known devices in general-purpose computers. For example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, and the like, and the embodiment of the present application does not limit the specific type of the electronic device. These devices have stored therein computer programs that are selectively activated or reconfigured. Such a computer program may be stored in a device (e.g., computer) readable medium or in any type of medium suitable for storing electronic instructions and respectively coupled to a bus.
The memory in the electronic device of the present application may be, but is not limited to, a ROM (Read-Only Memory) or another type of static storage device capable of storing static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device capable of storing information and instructions, an EEPROM (Electrically Erasable Programmable Read-Only Memory), a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The processor in the electronic device of the present application may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The electronic device provided by the embodiments of the present application is based on the same inventive concept as the foregoing embodiments; details not described for the electronic device may refer to the foregoing embodiments and are not repeated here.
Based on the same inventive concept, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor of a server, the capacity data processing and allocation method provided by the embodiments of the present application is implemented.
The computer-readable medium provided herein includes, but is not limited to, any type of disk, including floppy disks, hard disks, optical disks, CD-ROMs and magneto-optical disks, as well as ROMs, RAMs, EPROMs (Erasable Programmable Read-Only Memory), EEPROMs, flash memory, and magnetic or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
The computer-readable storage medium provided in the embodiments of the present application is based on the same inventive concept as the foregoing embodiments; contents not described in detail for the storage medium may refer to the foregoing embodiments and are not repeated here. Steps, measures and schemes in the various operations, methods and flows discussed in this application may be alternated, modified, rearranged, decomposed, combined or deleted; the same applies to steps, measures and schemes in the prior art that involve the operations, methods and flows disclosed in this application.
Terms such as "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated; thus, a feature defined with "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless otherwise specified. The steps are not necessarily performed in the order indicated by the arrows; unless explicitly stated herein, they are not restricted to the exact order shown and may be performed in other orders. Moreover, at least part of the steps in the flowcharts of the figures may include multiple sub-steps or stages, which need not be completed at the same moment but may be performed at different times, and which need not be performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
The above description is only a preferred embodiment of the present invention and is not intended to limit the structure of the present invention in any way. Any simple modification, equivalent change or variation made to the above embodiments according to the technical essence of the present invention falls within the technical scope of the present invention.

Claims (3)

1. A capacity data processing and scheduling method, the method comprising:
step s201, the pre-system counts its own service supply condition and adjusts and distributes the service supply to the systems of each service flow;
step s202, the main-front system accesses a channel system and a service system required by the complete service through a channel;
step s203, calculating bearing capacity data based on system pressure test, and collecting and counting service index data based on a monitoring device;
step s204, the data acquisition synchronizer manages the acquisition device, receives the data, classifies the data systematically and transmits the data to the capacity platform server;
step s205, selecting a target index based on the analysis; if the modeling requirement exists, jumping to step s 301; if the predicted value exists, jumping to step s 601;
step s601, feeding back the prediction result to the capacity platform server, displaying whether an alarm state exists and calculating and transmitting allocation parameters based on the relation between the threshold value and the prediction value;
step s602, the application service coordinator receives the allocation parameters, calculates and obtains a service adjustment value based on the bearing capacity data and the prediction threshold difference, and sends a service amount allocation request to the specified system; skipping to step s 201;
step s301, the capacity platform server sends a prediction request of a target index to the algorithm platform analysis server according to the requirement;
step s302, the algorithm platform server identifies the index code of the target index and the system code of the target index;
step s303, inquiring system data characteristics according to the system code and the index code;
step s304, calculating the time length of the predicted characteristic data;
step s103, judging whether to perform translation fitting; jumping to step s104 when the translation fitting is determined, and jumping to step s107 when the non-translation fitting is determined;
step s104, calculating the growth rate, and writing the growth rate into an intermediate database;
step s105, judging a translation structure and calculating an increase weight;
step s106, obtaining the growth rate and the growth weight, calculating the translation fitting, and jumping to step s116;
step s107, constructing a sequence decomposition fitting model;
step s108, fitting model optimization and realizing prediction;
step s109, judging an adjustment mode according to the time length of the feature data: if the growth rate is adjusted according to the monthly feature, jumping to step s110; if the prediction length is adjusted according to the deviation, jumping to step s113;
step s110, calculating a growth rate, and writing the growth rate into an intermediate database;
step s111, obtaining the monthly feature adjustment of the growth rate, and jumping to step s 116;
step s113, calculating the predicted deviation as an adjustment scheme, and warehousing;
step s114, obtaining a predicted length deviation adjustment;
step s115, adjusting twice based on the deviation amplitude;
step s116, saving the prediction data and/or jumping directly to step s601.
2. An automatic capacity resource allocation system, comprising:
various services; and/or a number of channel systems;
a total front system;
a plurality of business systems;
a data acquisition synchronizer;
an application service coordinator;
a capacity platform server;
an algorithm analysis server;
a database; and
a number of memories for storing one or more programs which,
when executed by one or more processors, cause the one or more processors to implement the capacity data processing and scheduling method of claim 1.
3. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the capacity data processing and scheduling method according to claim 1.
CN202110696330.9A 2021-06-23 2021-06-23 Method, system and computer readable medium for capacity data processing and allocation Active CN113344282B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110696330.9A CN113344282B (en) 2021-06-23 2021-06-23 Method, system and computer readable medium for capacity data processing and allocation


Publications (2)

Publication Number Publication Date
CN113344282A true CN113344282A (en) 2021-09-03
CN113344282B CN113344282B (en) 2023-01-17

Family

ID=77478018

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110696330.9A Active CN113344282B (en) 2021-06-23 2021-06-23 Method, system and computer readable medium for capacity data processing and allocation

Country Status (1)

Country Link
CN (1) CN113344282B (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120130760A1 (en) * 2009-10-26 2012-05-24 Jerry Shan Adjusting a point prediction that is part of the long-term product life cycle based forecast
CN104766144A (en) * 2015-04-22 2015-07-08 携程计算机技术(上海)有限公司 Order forecasting method and system
US20190318369A1 (en) * 2016-12-15 2019-10-17 Koubei Holding Limited Method and device for predicting business volume
CN109657831A (en) * 2017-10-11 2019-04-19 顺丰科技有限公司 A kind of Traffic prediction method, apparatus, equipment, storage medium
CN108764863A (en) * 2018-05-24 2018-11-06 腾讯科技(深圳)有限公司 A kind of virtual resource transfer method, device, server and storage medium
CN112633542A (en) * 2019-09-24 2021-04-09 顺丰科技有限公司 System performance index prediction method, device, server and storage medium
CN110990174A (en) * 2019-10-25 2020-04-10 苏州浪潮智能科技有限公司 Method, device and medium for predicting SSD available time based on Prophet model
CN110928748A (en) * 2019-12-04 2020-03-27 中国银行股份有限公司 Business system operation monitoring method and device
CN111045907A (en) * 2019-12-12 2020-04-21 苏州博纳讯动软件有限公司 System capacity prediction method based on traffic
CN111176575A (en) * 2019-12-28 2020-05-19 苏州浪潮智能科技有限公司 SSD (solid State disk) service life prediction method, system, terminal and storage medium based on Prophet model
CN112269811A (en) * 2020-10-13 2021-01-26 北京同创永益科技发展有限公司 IT capacity prediction method and system based on traffic
CN112256550A (en) * 2020-11-19 2021-01-22 深信服科技股份有限公司 Storage capacity prediction model generation method and storage capacity prediction method
CN112231193A (en) * 2020-12-10 2021-01-15 北京必示科技有限公司 Time series data capacity prediction method, time series data capacity prediction device, electronic equipment and storage medium
CN112541635A (en) * 2020-12-16 2021-03-23 平安养老保险股份有限公司 Service data statistical prediction method and device, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
孙书彤 (SUN Shutong): "Research on an Artificial Intelligence Method and System for Dynamic Cloud Resource Scheduling Based on Business Forecasting", Computer & Telecommunication
常润梅等 (CHANG Runmei et al.): "Capacity Management of Cloud Computing Data Centers in Telecom Enterprises", Journal of Liaoning Technical University (Natural Science Edition)
王林等 (WANG Lin et al.): "An Intelligent Host Capacity Management Platform Based on Machine Learning", China Financial Computer

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114518988A (en) * 2022-02-10 2022-05-20 中国光大银行股份有限公司 Resource capacity system, method of controlling the same, and computer-readable storage medium
CN114518988B (en) * 2022-02-10 2023-03-24 中国光大银行股份有限公司 Resource capacity system, control method thereof, and computer-readable storage medium
CN114548459A (en) * 2022-02-25 2022-05-27 江苏明月软件技术有限公司 Ticket data regulation and control method and system and computer readable storage medium

Also Published As

Publication number Publication date
CN113344282B (en) 2023-01-17

Legal Events

Code  Title
PB01  Publication
SE01  Entry into force of request for substantive examination
GR01  Patent grant