CN113344282B - Method, system and computer readable medium for capacity data processing and allocation - Google Patents

Info

Publication number
CN113344282B
Authority
CN
China
Prior art keywords
data
service
growth rate
prediction
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110696330.9A
Other languages
Chinese (zh)
Other versions
CN113344282A (en
Inventor
王丽
史晨阳
彭晓
孙纪周
邢世伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Everbright Bank Co Ltd
Original Assignee
China Everbright Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Everbright Bank Co Ltd
Priority to CN202110696330.9A
Publication of CN113344282A
Application granted
Publication of CN113344282B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/109 Time management, e.g. calendars, reminders, meetings or time accounting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance

Abstract

The invention relates to a method, a system and a computer-readable medium for processing and allocating capacity data. System data characteristics are queried according to a system code and an index code, and the time length of the characteristic data used for prediction is calculated. When translation fitting is selected, a growth rate is calculated and written into an intermediate database, the translation structure is determined, a growth weight is calculated, and translation fitting is performed using the growth rate and the growth weight. Otherwise, a sequence decomposition fitting model is constructed, and an adjustment mode is chosen according to the time length of the characteristic data: either the growth rate is calculated, written into the intermediate database and adjusted according to monthly features, or a prediction deviation is calculated and stored as an adjustment scheme and a deviation adjustment over the predicted length is obtained. The prediction data are then stored. The method offers flexible and accurate prediction, lets the user analyze more clearly what influences prediction accuracy, improves the fitting and prediction results, and predicts service index data well according to demand.

Description

Method, system and computer readable medium for capacity data processing and allocation
Technical Field
The invention relates to the technical field of computer application, in particular to a method, a system and a computer readable medium for processing and allocating financial computer system capacity data based on internet and large-scale data processing.
Background
The data of each application in existing financial computer systems, including those of banks, changes over time. Although the overall trend is usually growth, the data of a particular application may in some cases decrease or fluctuate repeatedly. The applications themselves also change over time: new applications are added, and the number of applications may increase, decrease or fluctuate.
Each application in an existing banking computer system usually depends on hardware resources independently of the others; in particular, expensive scalable resources such as CPUs, hard disks and memory are each used exclusively. Moreover, to ensure system robustness, I/O performance and the like, banks can only purchase higher-grade hardware. In a production environment, the machine-room operators who manage each application adjust hardware resources manually and coarsely, based on hardware indicators (hard disk capacity, memory capacity or CPU saturation) obtained from manual inspection or from a specific host. For example, after a shutdown they add 80% more hard disk capacity or remove 50% of it; similar experience-based operations are performed for CPU and memory. Such manual adjustment cannot keep up with the dynamic changes of a bank computer system that performs large-scale, high-intensity, highly concurrent data processing in actual production, and it wastes a great deal of expensive hardware in order to maintain a sufficient safety margin for data operations.
From the application perspective, especially in the financial field, scenarios such as internet lending, holiday merchant promotions, sales of high-yield wealth management products and flash sales have led banks and similar businesses to expand their online business lines rapidly. Business growth has taken a new form, and these online applications make capacity control and allocation for each system, including its data, extremely frequent.
Traditional commercial banks must invest large amounts of capital and resources. Building and developing auxiliary information systems and technical support systems requires many personnel, including information center operators, and personnel costs exceed even the already expensive hardware. Yet to keep adapting to each newly launched business type and service mode, banks can only tolerate this coarse, large-scale growth in consumption in order to meet continuously developing business requirements. For example, as a bank's services, delivery channels and technical implementations multiply, so do its computer application systems. The result is that each application system independently connects to background services, payment systems and other support systems, and many application systems have their own front-end processor for specific service processing, data processing or device control and management; a bank machine room therefore often holds a large number of front-end processors for different services. Such an architecture and its working method increase the investment in maintenance personnel and waste equipment and software investment; for example, repeated development across regions, systems and stages within the same bank is a serious problem. More dangerously, the system may fail or crash because of manual configuration errors or disorderly management of application system capacity.
Disclosure of Invention
In view of the above deficiencies in the prior art, the present invention provides a method, system and computer-readable medium for capacity data processing and allocation that dynamically control and allocate capacity for a bank computer system, addressing at least one of the above problems.
In a first aspect, the present invention provides a capacity data processing and scheduling method.
In one embodiment of the first aspect of the invention, the method comprises:
step s201, the total front system counts its own service supply and adjusts and distributes it to the systems of each service flow;
step s202, the total front system connects, through channels, the channel systems and service systems required for a complete service;
step s203, calculating bearing capacity data based on system pressure tests, and collecting and counting service index data by means of a monitoring device;
step s204, the data acquisition synchronizer manages the acquisition devices, receives the data, classifies it by system and transmits it to the capacity platform server;
step s205, selecting a target index based on the analysis; if modeling is required, jumping to step s301; if a predicted value already exists, jumping to step s601;
step s601, feeding back the prediction result to the capacity platform server, which displays whether an alarm state exists and, based on the relation between the threshold and the predicted value, calculates and transmits allocation parameters;
step s602, the application service allocator receives the allocation parameters, calculates a service adjustment value based on the bearing capacity data and the difference between the predicted value and the threshold, and sends a service volume allocation request to the specified system; jumping to step s201;
step s301, the capacity platform server sends a prediction request of a target index to the algorithm platform analysis server according to the requirement;
step s302, the algorithm platform server identifies the index code of the target index and the system code of the target index;
step s303, inquiring system data characteristics according to the system code and the index code;
step s304, calculating the time length of the predicted characteristic data;
step s103, judging whether to perform translation fitting; if so, jumping to step s104; if not, jumping to step s107;
step s104, calculating the growth rate, and writing the growth rate into an intermediate database;
step s105, judging the translation structure and calculating a growth weight;
step s106, obtaining the growth rate and the growth weight, performing the translation fitting, and jumping to step s116;
step s107, constructing a sequence decomposition fitting model;
step s108, optimizing the fitting model and performing the prediction;
step s109, judging an adjustment mode according to the time length of the feature data; if the growth rate is to be adjusted according to monthly features, jumping to step s110; if the prediction is to be adjusted according to the deviation over the predicted length, jumping to step s113;
step s110, calculating a growth rate, and writing the growth rate into the intermediate database;
step s111, obtaining the monthly feature adjustment of the growth rate, and jumping to step s116;
step s113, calculating the prediction deviation as an adjustment scheme, and storing it;
step s114, obtaining the deviation adjustment over the predicted length;
step s115, performing a secondary adjustment based on the deviation amplitude;
step s116, saving the prediction data and/or jumping directly to step s601.
In a second aspect, the present invention provides a system for processing and scheduling capacity data.
In one embodiment of the second aspect of the invention, it comprises:
various services; and/or a number of channel systems;
a total front system;
a plurality of business systems;
a data acquisition synchronizer;
an application service coordinator;
a capacity platform server;
an algorithm analysis server;
a database; and
one or more processors and a number of memories storing one or more programs which,
when executed by the one or more processors, cause the one or more processors to implement the capacity data processing and scheduling method.
In a third aspect, the invention provides a computer-readable storage medium.
In an embodiment of the third aspect of the invention, a computer program is stored thereon, which when executed by a processor, implements the capacity data processing and scheduling method described above.
Compared with the prior art, the invention has the beneficial effects that:
The capacity data processing and allocation method, system and computer-readable medium of the invention offer flexible and accurate prediction and let the user analyze more clearly what influences prediction accuracy. The fitting and prediction results are improved: the model is fitted to time series data under different conditions, service index data is predicted well according to demand, and the result is then stored to complete the whole time series prediction process. Based on this model prediction and calculation, capacity resources can be adjusted and allocated dynamically, so that the capacity resources of the whole system are increased, decreased and allocated automatically.
Drawings
FIG. 1 is a schematic diagram of a prior art data processing and deployment process;
FIG. 2 is a schematic diagram of a deployment system architecture for the capacity data processing and deployment method, system, and computer-readable medium of the present invention;
FIG. 3 is a schematic diagram of the internal structure of an algorithmic analysis server of the capacity data processing and deployment method, system, and computer readable medium of the present invention;
FIG. 4 is a first portion of a data processing and deployment flow diagram of the capacity data processing and deployment method, system, and computer-readable medium of the present invention;
FIG. 5 is a second portion of a data processing and deployment flow diagram of the method, system, and computer-readable medium of capacity data processing and deployment of the present invention;
FIG. 6 is a block diagram of a capacity data processing apparatus according to the present invention.
In the drawings, the main parts are illustrated by symbols:
Detailed Description
The invention is described in detail below with reference to the figures and examples:
Explanation of technical terms related to the present invention:
Service: a complete service is initiated by a user, who sends a data request from equipment deployed by a channel system. The request reaches the total front system through a channel and, after authentication, control and distribution by the total front system, reaches the specified service system through a channel. The service system processes the request and generates receipt information, which returns through the channel to the total front system; the total front system delivers it through a channel to the specified channel system, and the channel system feeds the result back to the user. This completes one business process.
Pressure test: every system requires a pressure test when it first goes into operation. The test is performed when the system is connected to the total front system. It calculates indicators such as TPS and QPS (system throughput) of the access system from the upper limits on the physical resources of the access system and of the total front system, and derives the bearing capacity data of the access system from the application service distribution logic of the total front system.
Application service: in the total front system, a group of CPU processes that handles the messages of a specific access system. Application services are quantifiable in the total front system, and the application service volume allocated to an access system depends on the pressure test performed when the system is first connected. The total front system has limited CPU and memory resources and divides them equally among all active application services. The application service volume allocated to an access system therefore determines that system's service capability under the control of the total front system, and application services need to be distributed reasonably and efficiently among the access systems.
Application service volume provisioning: the expected number of application services to be allotted to a specified access system, expressed as a count. Under the prediction mechanism, the system can allocate the service volume of a specified access system in advance, preventing downtime caused by growth in business volume and resource waste caused by its decline.
Channel: a virtual communication channel controlled by the total front system and located between the total front system and another system. It exists on the basis of the application service volume that the total front system allocates to the connected system. The larger the application service volume, the less congested the channel.
Bearing capacity data: organized per access system (mainly the connected service systems). Each access system has a current application service volume and several indicators, and each indicator has a threshold and a unit-service bearing capacity. The application service volume is a count, namely the number of application services that the total front system currently allocates to the system. The threshold is expressed in the indicator's unit and is the upper limit of the indicator, obtained under the pressure test, that a normally operating system must respect. The unit-service bearing capacity is also expressed in the indicator's unit and is the additional indicator value that can be accommodated for each application service added in the current state.
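As a rough illustration of the structure just described, the bearing capacity data of one access system could be modelled as below. This is only a sketch: the class names, field names, system code and numbers are hypothetical and do not come from the invention.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class IndicatorCapacity:
    """Per-indicator limits obtained from the pressure test (illustrative names)."""
    threshold: float              # upper limit of the indicator, in indicator units
    unit_service_capacity: float  # indicator value one extra application service can absorb


@dataclass
class BearingCapacity:
    """Bearing capacity data for one access system."""
    system_code: str
    application_service_volume: int  # number of application services currently allocated
    indicators: Dict[str, IndicatorCapacity] = field(default_factory=dict)


# Example with made-up numbers: one system with a single indicator.
capacity = BearingCapacity(
    system_code="SYS001",
    application_service_volume=20,
    indicators={
        "daily_txn": IndicatorCapacity(threshold=1_000_000, unit_service_capacity=50_000),
    },
)
```

Keeping the per-indicator values in a mapping keyed by index code mirrors the one-system, many-indicators relationship in the definition.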
Service index data: organized per access system (mainly the connected service systems). It is daily-frequency time series data collected from a specified service system by a specific method; different acquisition modes and contents form different indicators, which reflect the capacity condition of the service system's application domain from different angles. "Daily transaction volume" is the simplest and most direct indicator, measured in transactions per day.
Allocation parameters: the allocation parameters apply to a given service system and comprise its bearing capacity data and the relationship between its predicted values and thresholds.
Channel system: the client-side system of a service, responsible for receiving the user's service request and forwarding the request data to the systems that process the service, and for receiving the processing receipt and feeding the result back to the user.
Service system: the server-side system of a service, responsible for receiving service processing tasks, completing the processing within one service system or through functions across service systems, and returning the receipt information over the channel on which the request arrived.
Total front system: a middle-platform system of the financial industry, responsible for connecting the foreground channel systems and the background service systems and for routing the flow of services among third parties, branches and data centers. It supports hot plugging of channels for high scalability, quantifies application services, distributes services and manages channels uniformly, and can thus process a large number of services.
Data acquisition synchronizer: sorts the collected bearing capacity data and service index data per service system. Each service system has one system code that identifies it and several service indexes, each with one index code that identifies it. The two kinds of data are sorted into, for each index of each system, the daily-frequency time series data over the time range available for that index and the unique threshold of that index, and, for each system, its current application service volume.
Capacity platform server: stores the daily-frequency time series data and the thresholds in a database, issues prediction requests for the different systems in turn by index code, manages every valid index of every system, and can display the time series data of each index and raise early warnings.
Algorithm platform server: locates its task from the system code and index code it receives, performs fitting and prediction on the historical time series data of the index in the database, choosing the fitting method according to the situation, and stores the resulting future predicted time series data in the database.
The capacity platform server compares the threshold of the predicted index with its predicted values. When the predicted value of an index of the current system exceeds the threshold, or when both the historical and the predicted values are far below the threshold, it transmits the predicted index value, the current application service volume of the system and the index threshold to the application service allocator as the application service allocation parameters of the system.
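The comparison described above might be sketched as follows. The text does not quantify "far below", so the `low_fraction` cut-off (50% of the threshold) is purely an assumption, as are the function name, the use of the peak predicted value, and the dictionary keys.

```python
from typing import Optional, Sequence


def allocation_parameters(
    history: Sequence[float],
    predicted: Sequence[float],
    threshold: float,
    current_service_volume: int,
    low_fraction: float = 0.5,  # assumption: "far below" means under 50% of the threshold
) -> Optional[dict]:
    """Return allocation parameters if an adjustment looks necessary, else None."""
    peak_predicted = max(predicted)
    over = peak_predicted > threshold
    # Under-use is only flagged when both history and prediction stay far below the threshold.
    under = max(max(history), peak_predicted) < low_fraction * threshold
    if not (over or under):
        return None
    return {
        "predicted_value": peak_predicted,
        "threshold": threshold,
        "current_service_volume": current_service_volume,
        "state": "over" if over else "under",
    }
```

Returning `None` when neither condition holds corresponds to the case in which no parameters are sent to the allocator.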
Database: stores two kinds of content: historical data collected by the data acquisition synchronizer and processed by the capacity platform server, and prediction data produced by the algorithm platform analysis server after modeling and prediction.
Application service allocator: based on the received parameters, it calculates the allocation amplitude from the predicted value and the threshold, and the unit application service bearing capacity of the system from the threshold and the current application service volume. From these two values it calculates the service volume allocation, that is, the increase or decrease in application service volume that the current system will need according to the prediction.
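The text names the inputs of this calculation but not the formulas. The sketch below is one plausible reading under explicit assumptions: the amplitude is taken as the difference between predicted value and threshold, the unit bearing capacity as threshold divided by current volume, and the result is rounded up. The function name and rounding rule are hypothetical.

```python
import math


def service_volume_adjustment(predicted: float, threshold: float, current_volume: int) -> int:
    """Number of application services to add (positive) or remove (negative)."""
    if current_volume <= 0:
        raise ValueError("current_volume must be positive")
    amplitude = predicted - threshold           # assumed: amplitude is a simple difference
    unit_capacity = threshold / current_volume  # indicator value carried per application service
    # Rounding up adds enough services on growth and removes fewer on decline.
    return math.ceil(amplitude / unit_capacity)


# Made-up numbers: threshold 100 carried by 10 services, predicted peak 125.
extra_services = service_volume_adjustment(125, 100, 10)
```

Rounding up in both directions is a conservative choice for this sketch; a real deployment would likely also keep some headroom above the predicted value.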
The algorithm server of the invention comprises: a historical data loading unit, a data preprocessing unit, a growth rate calculating unit, a growth rate storing unit, a prediction feedback unit, an optimization calculating unit, a model building unit, a deviation adjusting unit and a middleware container.
Referring to FIGS. 2 to 5, a first general embodiment of the present invention is as follows:
In a service flow that has been configured and built in advance, a front-end user initiates a service request, which enters the bank's total front system through one of the service channel systems, such as POS terminals, ATMs or online banking. In the total front system the request is standardized and scheduled for distribution, then passes through a channel to the designated background service system for processing. The processing result returns to the total front system through the same channel for parsing and is returned to the channel system as a receipt to be shown to the user. In this process, both channels carry traffic in units of application services; within each complete service cycle, each channel has specific application service characteristics, which depend on the traffic that one application service of the total front system can bear.
Because the number of application services of the total front system is fixed, when a system is connected, the pressure test yields the maximum number of application services required by the channel between the access system and the total front system and the service bearing capacity of a unit application service. These data are transmitted to the data acquisition synchronizer of the apparatus. All relevant service index data of the service system are transmitted to the synchronizer as well.
After collection, the data is processed in the capacity platform server, where the collected data is updated in real time and serves as the database for calculation. By design, a daily task sends a modeling request to the algorithm platform analysis server; a data prediction model for a given service index is built on the algorithm platform and stored, and when a prediction request is sent again, the stored model is used to predict future values of the specified service index. After receiving the prediction result, the capacity platform server uses the previously collected bearing capacity data as the threshold to judge whether the change in business makes it necessary to adjust the application service volume.
When an adjustment of the application service volume is necessary, an alarm is raised and the allocation parameters are transmitted automatically to the application service allocator of the invention. After the allocator has combined the parameters and performed its calculation, the service volume allocation is transmitted to the total front system, and the application services of the specified service are allocated uniformly through the standardized setting interface of the total front system.
The complete time series prediction process of the invention begins by querying the data characteristics of the system according to the system code and index code, that is, by automatically collecting the parameters of the prediction task and the complete, valid data for model training. The time length of the existing data is then calculated and compared with the prediction length required.
Generally, a good prediction cannot be obtained when the ratio of the training data length to the prediction length is small. Based on experience with the models, the decomposition fitting model is used when this ratio exceeds 3, and the translation fitting model is used otherwise. Since the required prediction length is generally between one week and two months, the corresponding training data boundary lies roughly between three weeks and half a year.
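The selection rule above can be written as a small helper. The function name and the returned labels are hypothetical; only the ratio limit of 3 comes from the text.

```python
def choose_fitting_model(training_days: int, prediction_days: int, ratio_limit: float = 3.0) -> str:
    """Pick the fitting model from the ratio of training length to prediction length."""
    if prediction_days <= 0:
        raise ValueError("prediction_days must be positive")
    ratio = training_days / prediction_days
    # Decomposition fitting is used only when the ratio strictly exceeds the limit.
    return "decomposition" if ratio > ratio_limit else "translation"


# A 14-day forecast backed by 90 days of history has a ratio of about 6.4.
model_kind = choose_fitting_model(training_days=90, prediction_days=14)
```

With a one-week forecast the boundary falls at 21 days of history, and with a two-month forecast at about six months, which matches the three-weeks-to-half-a-year range given above.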
When the amount of training data is sufficient for the decomposition fitting model, the model is trained directly; the logic of the algorithm is described below.
In the field of time series analysis, a common method called time series decomposition divides the time series y(t) into a seasonal term, a trend term and a remainder term. In real life and production, a holiday effect h(t) is usually added to these terms. The model therefore consists mainly of three parts: growth (the growth trend), seasonality (the seasonal trend) and holidays (the influence of holidays on the predicted values). Here g(t) is the growth function, which fits the aperiodic changes of the predicted values in the time series; s(t) represents periodic changes, such as weekly or yearly seasonality; and h(t) represents the influence on the predicted values of holidays that do not recur with a fixed period. Adding the remaining unpredictable term ε gives the formula:
y(t)=g(t)+s(t)+h(t)+ε
The decomposition fitting mode therefore decomposes the original time series data as above, fits each component in turn, and finally recombines the components to obtain the prediction model for the original time series data.
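To make the decompose, fit and recombine idea concrete, here is a deliberately naive sketch. It is not the fitting procedure of the invention: it substitutes an ordinary least-squares line for g(t) and per-weekday averages for s(t), omits the holiday term h(t), and uses synthetic data. All names are hypothetical.

```python
from statistics import mean


def fit_additive(y, period=7):
    """Naive additive fit y(t) ~ g(t) + s(t): a linear trend plus a periodic term.

    The holiday term h(t) is omitted; whatever is left over plays the role of epsilon.
    """
    n = len(y)
    t = list(range(n))
    t_bar, y_bar = mean(t), mean(y)
    # Ordinary least squares for the trend g(t) = intercept + slope * t.
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    intercept = y_bar - slope * t_bar
    # Seasonal term s(t): mean detrended value at each position within the period.
    detrended = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    seasonal = [mean(detrended[p::period]) for p in range(period)]

    def predict(ti):
        # Recombine the fitted components to forecast time index ti.
        return intercept + slope * ti + seasonal[ti % period]

    return predict


# Synthetic daily series (made-up numbers): trend 100 + 2t plus a weekly pattern.
weekly = [3, -3, 0, 0, 0, -3, 3]
series = [100 + 2 * day + weekly[day % 7] for day in range(28)]
predict = fit_additive(series)
forecast = [predict(day) for day in range(28, 35)]  # forecast for the following week
```

Fitting the trend first and the seasonal term on the detrended values follows the "fit in turn, then recombine" order described above.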
For the growth trend term g (t), first consider the trends of the existence of two morphologies, namely, the extreme trend and the non-extreme trendThe trend is that the former fits a logistic regression function and the latter fits a piecewise linear regression function. For the characteristics of logistic regression, the function can be written as
f(x) = C / (1 + e^(-k(x - m)))
where C is the maximum asymptotic value of the curve, k is the growth rate of the curve, and m is the midpoint of the curve. The linear regression case is simpler, namely f(x) = kx + b. When a business index is predicted, the data have no clear practical limiting condition, so the piecewise linear regression function is used. Segmentation is realized through change points: S change points are set during model construction, at times s_j (1 ≤ j ≤ S). A vector
δ ∈ R^S
is assumed, in which δ_j represents the change in growth rate at timestamp s_j. With the initial growth rate k and
a_j(t) = 1 if t ≥ s_j, and 0 otherwise,
as the indicator function of the corresponding timestamp, the growth rate at time t is k + a(t)^T δ, and each piecewise function expression is calculated from this growth rate. The change point positions are generally evenly distributed, and the change in growth rate at each change point is drawn at random from a probability distribution. To ensure robustness of the model, a Laplace distribution is used, i.e. the change in growth rate follows δ_j ~ Laplace(0, τ), where τ is a preset degree of change that can be obtained from the ratio of the variance of the training data to its mean. For the piecewise linear regression function used in the present invention, the overall trend function is finally obtained as:
g(t) = (k + a(t)^T δ)·t + (m + a(t)^T γ),
where k is the initial growth rate, δ is the change in growth rate, m is the initial offset, γ = (γ_1, …, γ_S)^T, and γ_j = -s_j δ_j is the offset increment caused by the change in growth rate, which together with m forms the total offset.
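The trend function can be evaluated directly from these definitions. The sketch below is illustrative only: the function names are hypothetical, and it assumes the indicator a_j(t) equals 1 once t has reached change point s_j. The choice γ_j = -s_j δ_j keeps the piecewise line continuous at every change point.

```python
from typing import List, Sequence


def changepoint_indicator(t: float, changepoints: Sequence[float]) -> List[float]:
    """a_j(t): 1.0 once t has reached change point s_j, else 0.0 (assumed convention)."""
    return [1.0 if t >= s else 0.0 for s in changepoints]


def piecewise_linear_trend(t: float, k: float, m: float,
                           changepoints: Sequence[float],
                           deltas: Sequence[float]) -> float:
    """g(t) = (k + a(t)^T delta) * t + (m + a(t)^T gamma), with gamma_j = -s_j * delta_j."""
    if len(changepoints) != len(deltas):
        raise ValueError("one rate change is needed per change point")
    a = changepoint_indicator(t, changepoints)
    gammas = [-s * d for s, d in zip(changepoints, deltas)]
    rate = k + sum(ai * di for ai, di in zip(a, deltas))
    offset = m + sum(ai * gi for ai, gi in zip(a, gammas))
    return rate * t + offset
```

For example, with change points at t = 10 and t = 20, rate changes 0.5 and -1.0, k = 1 and m = 2, the value at t = 10 is 12 whether it is computed from the first or the second segment.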
For the periodic seasonal term s(t), the factor most commonly considered in time series prediction is the periodicity formed by the daily, weekly, monthly, and yearly seasonal characteristics of the time series.
A periodic function can be expressed by a Fourier series, namely:
s(t) = Σ_{n=1}^{N} ( a_n·cos(2πnt/P) + b_n·sin(2πnt/P) )
wherein P is the period.
Because the basic time unit of the time series in the model is the day, the selectable periods are week, month, and year. To simplify the calculation scheme, P = 365.25 is used for a yearly period and P = 7 for a weekly period, and the weekly period is used when the amount of data is not sufficient for the yearly period. Considering the generalization of the time series data, the number of series terms N is set to 10 for the yearly period and 3 for the weekly period. A Fourier coefficient vector β = (a_1, b_1, …, a_N, b_N)^T is then formed for each period type, and the seasonal term model is obtained as s(t) = X(t)β, where X(t) is the vector of the corresponding trigonometric functions and β is drawn from a probability distribution, here the normal distribution β ~ Normal(0, σ^2).
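A minimal sketch of the seasonal term s(t) = X(t)β follows. The function names are hypothetical, and the pairing of a_n with the cosine and b_n with the sine is an assumption made for this example, since the coefficient order (a_1, b_1, …, a_N, b_N) is given but the pairing is not spelled out in the text.

```python
import math
from typing import List, Sequence


def fourier_features(t: float, period: float, order: int) -> List[float]:
    """X(t): [cos(2*pi*1*t/P), sin(2*pi*1*t/P), ..., cos(2*pi*N*t/P), sin(2*pi*N*t/P)]."""
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.append(math.cos(angle))  # multiplied by a_n (assumed pairing)
        features.append(math.sin(angle))  # multiplied by b_n (assumed pairing)
    return features


def seasonal_term(t: float, period: float, beta: Sequence[float]) -> float:
    """s(t) = X(t) . beta, where beta = (a_1, b_1, ..., a_N, b_N)."""
    if len(beta) % 2 != 0:
        raise ValueError("beta must hold one (a_n, b_n) pair per order")
    x = fourier_features(t, period, len(beta) // 2)
    return sum(xi * bi for xi, bi in zip(x, beta))
```

A weekly term with N = 3 takes six coefficients, and by construction its value repeats every 7 days.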
For the special-date and holiday effect h(t), because special dates are irregular and their influence on the time series data cannot be ignored, an independent model is built for each special date, and separate windows before and after the date are set for it. The window values reflect, in the form of a normal distribution, the gradual change in the special date's influence on the time series data.
The overall special-date model is thus obtained as
h(t) = Z(t)κ
where Z(t) is the indicator function of the corresponding special-date model and κ ~ Normal(0, υ^2) is the window effect under the normal distribution.
Thus, a model for fitting time series data is completely constructed:
y(t) = g(t) + s(t) + h(t) + ε, after which the function is solved by the L-BFGS algorithm. The algorithm is based on Newton's method for root finding. For a single-variable function, Newton's method finds the solution by repeated iteration using the second-order Taylor expansion. When an extremum of the original function is sought, the derivative can be solved directly: the point where the derivative is 0 is the solution for the original function, and after eliminating the constant terms the solving formula f′(x_k) + f″(x_k)(x - x_k) = 0 is obtained. The fitting algorithm is an optimization of the BFGS algorithm. BFGS applies the idea of Newton's method to the second-order Taylor expansion of a multivariate function and, during the iteration, simultaneously approximates the second derivative, which is difficult to compute, so that the original single iteration becomes a simultaneous iterative approximation of both the root and the second derivative. Its intermediate memory consumption rises sharply as the number of variables being solved for increases. The L-BFGS algorithm replaces storing the computed second-derivative result in each iteration with storing the parameters from which it is computed; although this reduces the precision of each iteration, it saves memory effectively. The required regression model is obtained after fitting.
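The single-variable Newton step described above, x_{k+1} = x_k - f′(x_k)/f″(x_k), can be sketched as follows. The function name and the stopping rule are assumptions for illustration; this is not the fitting routine itself.

```python
from typing import Callable


def newton_minimize(grad: Callable[[float], float],
                    hess: Callable[[float], float],
                    x0: float, tol: float = 1e-10, max_iter: int = 50) -> float:
    """Solve f'(x) = 0 by iterating f'(x_k) + f''(x_k) * (x - x_k) = 0."""
    x = x0
    for _ in range(max_iter):
        h = hess(x)
        if h == 0.0:
            raise ZeroDivisionError("second derivative vanished; Newton step undefined")
        step = grad(x) / h
        x -= step
        if abs(step) < tol:  # stop once the update is negligible
            break
    return x
```

For f(x) = e^x - 2x, the derivative e^x - 2 vanishes at x = ln 2, and the iteration started from 0 converges there in a handful of steps.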
After the sequence decomposition fitting model is obtained, the training time series data is first compared with the fitted result, i.e. the prediction deviation on the training data is calculated and stored in the database as the deviation adjustment scheme for this index at the current service date. Second, it is determined whether the training time series data has long-period characteristics: data covering more than two years is judged to be long-period, and for long-period data the fitted model is adjusted using the predicted-length deviation adjustment.
Two ways of model adjustment are detailed below.
For time series data with a longer time span, the fitted model weighs all of the training data more evenly. For financial business time series, however, which are time-sensitive, data closer to the prediction period has more influence on the prediction result; that is, the real data just before the service date is more strongly related to the predicted values after the service date. To compensate for the strong influence of low-order autocorrelation when fitting a single time series, the fitting result for time series data with a longer time span must be adjusted accordingly to improve fitting and prediction. There are two adjustment modes: predicted-length deviation adjustment, and growth-rate adjustment according to monthly characteristics.
The predicted-length deviation adjustment performs one additional iteration of model fitting and uses the deviation between the new prediction result and the existing actual data as the deviation of the future prediction result. When the service date is T and the data to be predicted covers the period from T to T+k, the original model fitting problem is f(0→T) = y_{T→T+k}. On this basis only the time period is shifted and one more model fit is performed for the prediction problem f(0→T-k) = y_{T-k→T}; the deviation between this predicted result and the real data in that period is used as the adjustment value of the future prediction result
Figure BDA0003128567910000131
y_{T→T+k} + y_{T-k→T} - ytrue_{T-k→T}. Due to the limits of data length and computational complexity, the deviation adjustment is carried out only once. Because a single iteration is a special case, large deviations must be guarded against: when the deviation amplitude at some time point exceeds 30% or 70%, the conventional identification adjustment or the specialized adjustment, respectively, is further applied. In the conventional identification adjustment, the characteristics of the date are judged in turn, i.e. whether it is a working day, a rest day, or a special holiday, and the corresponding adjustment is made according to the average growth rate of the historical values with that characteristic, similar to the growth-rate adjustment and translation algorithm described later. In the specialized adjustment, based on the abrupt change in the data, it is first judged whether the date is a special holiday. If so, the value is adjusted by applying the growth rate to the value of the same date in the previous year. Otherwise, a temporary special date is added for that day, the value is adjusted in the conventional identification mode, and the added date is reported so that technicians can judge whether the data is abnormal.
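The 30% and 70% checks can be sketched as a small classifier. The description does not define how the deviation amplitude is measured; the relative error |predicted - actual| / |actual| used here is an assumption, as are the function name and the returned labels.

```python
def classify_deviation(predicted: float, actual: float,
                       conventional_limit: float = 0.30,
                       specialized_limit: float = 0.70) -> str:
    """Decide which follow-up adjustment a back-tested point needs.

    The deviation amplitude is taken as relative error (an assumption).
    """
    if actual == 0:
        raise ValueError("relative deviation is undefined for a zero actual value")
    amplitude = abs(predicted - actual) / abs(actual)
    if amplitude > specialized_limit:
        return "specialized"
    if amplitude > conventional_limit:
        return "conventional"
    return "none"
```

A point predicted at 140 against an actual value of 100 deviates by 40% and would receive the conventional identification adjustment; a prediction of 180 deviates by 80% and would receive the specialized adjustment.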
The growth-rate monthly characteristic adjustment is mainly used for predicting non-long-period time series data of less than two years. Before adjustment, the daily, weekly, and monthly growth rates (and the yearly growth rate where available) of the non-long-period data are calculated first, as is done for the translation fitting part, and stored in the database. Then, for the decomposition fitting model adjusted by growth rate according to monthly characteristics, it is first judged whether each date is a special holiday. If so, the special-date growth-rate adjustment is applied with reference to the same date in the previous year; if not, one period of growth-rate adjustment is applied to the daily data according to the average working-day or rest-day growth and the average monthly growth in turn. Because the working-day/rest-day distinction and the month-to-month change are not included in the periodicity of the decomposition fitting process, these two items are the main consideration in the growth-rate adjustment. Under the data classification above, the month-dimension data is predicted from about one year of data, which is why this adjustment is made.
After the adjustment of the decomposed and fitted model is completed, the model fitting process using the translation mode is described below.
Time series data for which translation fitting is used has a short period and a ratio of training time length to prediction time length that does not exceed 3, so the prediction requirement cannot be met by general machine learning or fitting methods. Training data covering too short a period makes the prediction result highly random, and overfitting is common. The regularity present in the time series data is therefore taken as the key to prediction, i.e. the model is fitted on the basis of the daily, monthly, yearly, and special-holiday growth effects in the time dimension, using the formula
f(t) = D(t) + M(t)·1_{if cross month} + Y(t)·1_{if cross year} + S(t)·1_{if special}
For the daily growth part D(t), working days and non-working days are distinguished: the average historical growth rate is calculated separately for the category to which each date belongs. Once the growth rate is obtained, it is applied on the basis of the previous period in the training data and multiplied by the corresponding weight coefficient to obtain the value of this part
Figure BDA0003128567910000151
where T_d is one day and the weight coefficient r_d depends on whether the subsequent terms exist.
For the monthly growth part M(t), whether this term exists depends on whether the training data spans more than one month. When it exists, the average monthly growth rate of the data is calculated, the growth rate is applied on the basis of the previous month, and the result is multiplied by the corresponding weight coefficient to obtain M(t) = M(t - T_m)·(1 + g_month)·r_m, where T_m is the number of days in the current month.
For the yearly growth part Y(t), whether this term exists depends on whether the training data spans more than one year. When it exists, the average yearly growth rate of the data is calculated, the growth rate is applied on the basis of the previous year, and the result is multiplied by the corresponding weight coefficient to obtain Y(t) = Y(t - T_y)·(1 + g_year)·r_y, where T_y is the number of days in the current year.
For the special-day growth part S(t), the importance is high but there is too little data to compute a growth rate, so when the special-date value of the previous period exists it is used directly as the value of the current period and multiplied by its weight: S(t) = S(t - T)·r_s. When this term exists its weight is 0.5 and the other terms share the remaining 0.5; on non-special dates the other terms share the full weight of 1. For the first three terms, the weight is distributed among the existing terms in descending rank proportion in the order day, month, year. This completes the translation fitting model.
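One possible reading of the weight allocation is sketched below. The description states only that the special term takes 0.5 when present and that the day, month, and year terms share the rest in descending rank proportion; the concrete ranks used here (n, n-1, …, 1 for the n terms that exist) are an assumption, as is the function name.

```python
from typing import Dict


def translation_weights(has_month: bool, has_year: bool,
                        is_special: bool) -> Dict[str, float]:
    """Weights r_d, r_m, r_y, r_s for the translation fitting terms.

    Rank proportion n : n-1 : ... : 1 over the existing terms is an assumed reading.
    """
    terms = ["day"]  # the daily term always exists
    if has_month:
        terms.append("month")
    if has_year:
        terms.append("year")
    budget = 0.5 if is_special else 1.0  # the special term reserves 0.5
    ranks = list(range(len(terms), 0, -1))
    total = sum(ranks)
    weights = {name: budget * rank / total for name, rank in zip(terms, ranks)}
    if is_special:
        weights["special"] = 0.5
    return weights
```

Under this reading, a special date with all three other terms present gives weights 0.25, 1/6, and 1/12 for day, month, and year, plus 0.5 for the special term, so the weights always sum to 1.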
In conclusion, time series fitting models for the different conditions are finally obtained, the service index data can be predicted well according to requirements, and the results are then stored in the database to complete the whole time series prediction process.
With the prediction function in place, the predicted content is application-domain data from production business, while the capacity associated with the business is mainly handled by the front-end system's management and control of each business system. Therefore, to apply the prediction result to capacity allocation, which is the essential requirement, i.e. to derive a resource configuration optimization proposal for each relevant system after integrating and analyzing the changes and information generated in any system, the final resource allocation scheme is realized by a two-step operation.
Step one: calculate the future bearing requirement from the pre-collected unit application service bearing capacity. Because a service bearing capacity test is carried out on every business system before it goes online, a theoretical service peak can be obtained for a given number of application services; this peak is the service threshold at which the system can still operate normally. Dividing the service threshold by the number of application services gives the unit application service bearing capacity of the current business system.
Step two: calculate the service allocation in the application service dispatcher based on the predicted data interval and the unit application service bearing capacity, and feed the allocation value back to the front-end system as an instruction, so that the front-end system can pre-allocate services to the specified business system, realizing service prediction and early warning for the business system. The calculation depends on the predicted index type: business indexes within a system, such as daily transaction volume, average response time, and daily throughput, each have their own threshold. A capacity risk prompt is issued when the predicted data reaches 80% of the threshold, and an alarm is issued when it exceeds the threshold. After permission is obtained, the predicted peak is divided by the unit application service bearing capacity and rounded up to give the service quantity to allocate, which is transmitted to the front-end system as an instruction. After the front-end system receives the allocation information, it allocates the application service resources of the specified business system according to the instruction. A complete process from business prediction to application resource allocation is thus realized, improving efficiency while safeguarding capacity risk control.
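The two steps reduce to a few arithmetic rules, sketched below. The function names and returned labels are hypothetical, and the figures in the usage note are illustrative values rather than data from the embodiments.

```python
import math


def unit_capacity(service_threshold: float, service_count: int) -> float:
    """Step one: bearing capacity of a single application service."""
    if service_count <= 0:
        raise ValueError("service_count must be positive")
    return service_threshold / service_count


def capacity_status(predicted_value: float, threshold: float,
                    risk_ratio: float = 0.8) -> str:
    """Risk prompt at 80% of the threshold, alarm once the threshold is exceeded."""
    if predicted_value > threshold:
        return "alarm"
    if predicted_value >= risk_ratio * threshold:
        return "risk"
    return "normal"


def required_services(predicted_peak: float, capacity_per_service: float) -> int:
    """Step two: predicted peak divided by unit capacity, rounded up."""
    if capacity_per_service <= 0:
        raise ValueError("capacity_per_service must be positive")
    return math.ceil(predicted_peak / capacity_per_service)
```

For instance, a hypothetical threshold of 900000 spread over 3 services gives 300000 per service, and a predicted peak of 1000000 would then call for 4 services.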
Referring to fig. 2 to 5, a second general embodiment of the present invention:
a capacity data processing and scheduling method, the method comprising:
step s201, the front-end system counts its own service supply condition and adjusts and distributes it to the system of each service flow;
step s202, the main front-end system accesses, through channels, the channel systems and business systems required by the complete service;
step s203, calculating bearing capacity data based on system pressure test, and collecting and counting service index data based on a monitoring device;
step s204, the data acquisition synchronizer manages the acquisition device, receives data, classifies the data by the system and transmits the data to the capacity platform server;
step s205, selecting a target index based on the analysis; if the modeling requirement exists, jumping to the step s301; if the predicted value exists, jumping to step s601;
step s601, feeding back the prediction result to the capacity platform server, displaying whether an alarm state exists and calculating and transmitting allocation parameters based on the relation between the threshold value and the prediction value;
step s602, the application service dispatcher receives the dispatching parameters, calculates and obtains a service regulation value based on the bearing capacity data and the prediction threshold difference, and sends a service quantity distribution request to a specified system; skipping to step s201;
step s301, the capacity platform server sends a prediction request of a target index to the algorithm platform analysis server according to the requirement;
step s302, the algorithm platform server identifies the index code of the target index and the system code of the target index;
step s303, inquiring system data characteristics according to the system code and the index code;
step s304, calculating the time length of the predicted characteristic data;
step s103, judging whether to perform translation fitting; jumping to step s104 when translation fitting is to be performed, and jumping to step s107 when it is not;
step s104, calculating the growth rate, and writing the growth rate into an intermediate database;
step s105, judging a translation structure and calculating an increase weight;
step s106, obtaining the growth rate and the growth weight, calculating the translation fitting, and jumping to step s116;
step s107, constructing a sequence decomposition fitting model;
step s108, fitting model optimization and realizing prediction;
step s109, judging the adjustment mode according to the time length of the feature data; if the growth rate is to be adjusted according to the monthly feature, jumping to step s110; if the predicted length is to be adjusted according to the deviation, jumping to step s113;
step s110, calculating a growth rate, and writing the growth rate into an intermediate database;
step s111, obtaining the monthly feature adjustment of the growth rate, and jumping to step s116;
step s113, calculating the predicted deviation as an adjustment scheme, and warehousing;
step s114, obtaining a predicted length deviation adjustment;
step s115, adjusting twice based on the deviation amplitude;
step s116, saving the prediction data and/or jumping directly to step s601.
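The branching among steps s103 to s116 can be traced with a small routing sketch. It is illustrative only: the function name is hypothetical, and the two decision rules (a length ratio above 3 for decomposition fitting, and a history longer than two years plus the prediction length for deviation adjustment) are taken from other parts of this description, with two years assumed to be 730 days.

```python
from typing import List


def prediction_route(history_days: int, prediction_days: int,
                     ratio_threshold: float = 3.0,
                     long_period_days: int = 730) -> List[str]:
    """Return the sequence of step labels visited for one prediction request."""
    route = ["s103"]
    if history_days / prediction_days <= ratio_threshold:
        # Translation fitting branch.
        return route + ["s104", "s105", "s106", "s116"]
    # Sequence decomposition branch.
    route += ["s107", "s108", "s109"]
    if history_days > long_period_days + prediction_days:
        route += ["s113", "s114", "s115"]  # predicted-length deviation adjustment
    else:
        route += ["s110", "s111"]          # growth-rate monthly feature adjustment
    return route + ["s116"]
```

A 500-day history with a 30-day forecast follows the decomposition branch and ends with the monthly feature adjustment, while a 1095-day history ends with the deviation adjustment.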
An automatic capacity resource allocation system, comprising: a plurality of services and/or a plurality of channel systems; a main front-end system; a plurality of business systems; a data acquisition synchronizer; an application service dispatcher; a capacity platform server; an algorithm analysis server; a database; and a plurality of memories for storing one or more programs which, when executed by one or more processors, cause the one or more processors to implement the capacity data processing and scheduling method.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements the capacity data processing and scheduling method.
Referring to fig. 2 to 5, a third embodiment of the present invention:
Assume that the main front-end system is connected to an electronic payment platform system, coded EPAY. When this business system is connected, a pressure test is required, and bearing capacity data is obtained from it. The main front-end system sets the current application service quantity of the corresponding system to 2. For the daily transaction volume index of the electronic payment platform, coded RJYL, the threshold under current conditions in the EPAY system is 600000. The unit bearing capacity of the RJYL index under the EPAY system, calculated from the application service quantity, the threshold, and the parameter settings, is C = 100000/item. These data are transmitted as bearing capacity data to the data acquisition synchronizer.
During a period after the EPAY system is connected, the data acquisition synchronizer continuously collects the instantaneous transaction volume index over time through the EPAY system's built-in monitor. All instantaneous indexes are statistically processed in the acquisition synchronizer into processed indexes in units of days, which in this example yields the RJYL index. The bearing capacity data and the system information of the index are also organized, stored with the processed service index data, and transmitted together with it to the capacity platform server.
After receiving the bearing capacity data and the service index data, the capacity platform server classifies and stores them by system and index. When a prediction task is received, it sends a request to the algorithm platform server; the request contains the system code EPAY, the index code RJYL, the maximum historical data length, and a predicted data length of 30 days.
After receiving the prediction request, the algorithm platform server starts the prediction process. It first obtains the historical data, i.e. the processed service index data, according to the request parameters; in this example this is the time series data of the RJYL index of the EPAY system. Because the request asks for the longest historical data, the algorithm platform server queries all target data from the database and obtains time series data with one value per day, listed in the following table.
Date          2018-08-18  2018-08-19  2019-01-01  2019-12-31
Daily value   429257      1179578     821226      601519
Next, after the time series data (y_1, …, y_P) is obtained, the difference between the earliest and latest dates is calculated as the time length P of the feature data; here P = 500. The relation between P and the prediction length L determines which fitting model is selected: sequence decomposition fitting is used when P is more than three times L, and translation fitting is selected otherwise. Here the former is selected.
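The value P = 500 can be checked with a short date calculation, using the first and last dates of the historical data in this example and the 30-day prediction length from the request. The helper name is hypothetical.

```python
from datetime import date


def history_length_days(first_day: date, last_day: date) -> int:
    """Time length P: number of days between the earliest and latest dates."""
    return (last_day - first_day).days


P = history_length_days(date(2018, 8, 18), date(2019, 12, 31))
L = 30  # predicted data length in days, from the prediction request
uses_decomposition = P > 3 * L  # sequence decomposition fitting when P exceeds 3L
```

Here P evaluates to 500, which exceeds 3 × 30 = 90, so sequence decomposition fitting applies.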
The decomposition fitting of the time series data is divided into two steps: the first is model construction and the second is the optimization solution. Model construction builds, for the current time series data (y_1, …, y_P), a function y_t = g(t) + s(t) + h(t) + ε that fits all data points, in which the last term is an unpredictable error term; the original time series data is thus decomposed into three parts: a trend term, a periodic term, and a special term.
For the trend term, a piecewise linear regression function is employed. First, in the time series data (y_1, …, y_P), one change point is set every L days, with the interval between the last change point and the service date not less than L days. The change points are the segmentation points of the piecewise linear function; the sequence of change points is (2018-09-16, 2018-10-15, …, 2019-11-29), 15 change points in total. Since the expression of a linear function is y = kt + b, the piecewise linear function is expressed using the segmentation points as:
g(t) = (k + a(t)^T δ)·t + (m + a(t)^T γ), where k is the initial growth rate, (s_1, …, s_15) are the 15 change points,
Figure BDA0003128567910000201
that is, the vector formed by combining the changes in growth rate at the 15 change points. Each change is set to follow the Laplace distribution, i.e. δ_j ~ Laplace(0, τ), where τ, calculated as the quotient of the range and the mean of the historical time series data, is 0.105. The changes in growth rate under random distribution are then obtained:
δ = [0.56140769, 1.00103428, -0.04021555, 0.06693926, …, 0.6871913]. From this, the growth rate of every piecewise linear function relative to k can be derived,
Figure BDA0003128567910000202
expresses, as an indicator function, the segment in which each growth rate lies. Here m is the initial offset, γ = (γ_1, …, γ_15)^T, and γ_j = -s_j δ_j is the offset increment under the influence of the growth rate; the offset increment of each segment is obtained through the transposed indicator function a_j(t)^T and together with m forms the total offset. Setting the origin of the x-axis of the rectangular coordinate system to the date 2018-08-18, so that the service date is at 500, m is the first-date value 429257 and k is the first-segment average growth 1.0243. The pre-optimization expression of the trend term model is finally obtained:
Figure BDA0003128567910000203
For the periodic term, since P < 365 × 2 does not cover two years, the period is the week, and a 3rd-order Fourier series with period 7 is obtained
Figure BDA0003128567910000204
The periodic amplitude parameter vector in the series is initialized from a normal distribution, i.e. β = (a_1, b_1, …, a_3, b_3) ~ Normal(0, σ^2), where σ is obtained by computing the quotient of the variance and the mean within time windows of the prediction length L over the historical time series data and then averaging over all windows, giving 1.773. The initial parameter values are thus obtained:
[a_1, b_1, …, a_3, b_3] = [-4.1, 6.5, 2.1, -5.8, 0.56, 1.5]. The initial expression of the periodic term model is finally obtained:
Figure BDA0003128567910000211
Figure BDA0003128567910000212
For the special term, a separate model must be constructed for each different holiday
Figure BDA0003128567910000213
where κ ~ Normal(0, υ^2) is the special influence effect of the data in the initial state for each special date. υ defaults to 10, but can be set larger for particular holidays to reflect their influence. For the current data, the Spring Festival, Double Eleven, National Day, Valentine's Day, and Labor Day are added to the model; although the prediction period contains no special date, the historical data includes all of them and they must be taken into account. The initial expression of the special term model is finally obtained:
Figure BDA0003128567910000214
The overall decomposition model before optimization fitting is thus obtained
Figure BDA0003128567910000215
The model yields a current fitted value for each point of the historical time series data, and the second step, the optimization solution, begins once the model is built. The optimization objective is to make the model's fitted value as close as possible to the true value at every point of the historical time series, i.e. to optimize a new function
Figure BDA0003128567910000216
The function contains a large number of initialized variable parameters, i.e. the parameters of each decomposition term, including:
g(t) = g(t, k, m, δ), s(t) = s(t, a_1, b_1, a_2, b_2, a_3, b_3), and h(t) = h(t, κ),
so the optimization of the function F is:
min F = min F(X) = min F(k, m, δ, a_1, b_1, …, κ).
The solving method adopts the L-BFGS algorithm. First, the objective function F(X) is expanded as a second-order Taylor series at X_k, the higher-order infinitesimal part is ignored, and differentiation gives:
F′(X) = F′(X_k) + F″(X_k)(X - X_k). Setting F′(X) = 0 for the minimum gives X_{k+1} = X_k - H^{-1}·g_k, k = 0, 1, …, where H^{-1} is the inverse of the second derivative of the objective function, which takes matrix form because there are multiple parameters, and g_k is the first derivative of the objective function. When k is 0, the once-optimized parameters X_1 are computed by this formula from the initialized model parameters X_0, and iteration then continues to the optimal parameters X* = (k*, m*, δ*, a_1*, b_1*, …, κ*), the parameter combination that gives the best model fit, which completes model optimization. Each iteration requires the inverse of the second derivative and the first derivative of the objective function. The latter is simple to compute; the former is obtained by an iterative approximation carried out within the iteration, namely:
Figure BDA0003128567910000221
where s_k = X_{k+1} - X_k and y_k = g_{k+1} - g_k. In each iteration the matrix is not computed iteratively in full; instead, the s_k and y_k from the most recent 10 iterations are used in the calculation, to reduce the amount of computation.
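The idea of keeping only recent (s_k, y_k) pairs can be illustrated with the standard L-BFGS two-loop recursion, sketched below on a toy function. This is a generic textbook form and not the implementation used in the embodiment; the function names, the backtracking line search, and the tolerances are assumptions, and only the memory of 10 pairs is taken from the text.

```python
from collections import deque
from typing import Callable, List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad: Vector, history) -> Vector:
    """Two-loop recursion: approximates -H^{-1} g from the stored (s, y) pairs."""
    q = list(grad)
    coeffs = []
    for s, y in reversed(history):  # newest pair first
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        coeffs.append((alpha, rho))
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    scale = 1.0
    if history:
        s, y = history[-1]
        scale = dot(s, y) / dot(y, y)  # initial inverse-Hessian scaling
    r = [scale * qi for qi in q]
    for (s, y), (alpha, rho) in zip(history, reversed(coeffs)):  # oldest pair first
        beta = rho * dot(y, r)
        r = [ri + si * (alpha - beta) for ri, si in zip(r, s)]
    return [-ri for ri in r]


def lbfgs_minimize(f: Callable[[Vector], float],
                   grad_f: Callable[[Vector], Vector],
                   x0: Sequence[float], memory: int = 10,
                   tol: float = 1e-8, max_iter: int = 200) -> Vector:
    x = list(x0)
    g = grad_f(x)
    history = deque(maxlen=memory)  # only the last `memory` pairs are kept
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = lbfgs_direction(g, history)
        step, fx, slope = 1.0, f(x), dot(g, d)
        # Backtracking line search with the Armijo sufficient-decrease test.
        while (f([xi + step * di for xi, di in zip(x, d)]) > fx + 1e-4 * step * slope
               and step > 1e-16):
            step *= 0.5
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad_f(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:  # keep only pairs with positive curvature
            history.append((s, y))
        x, g = x_new, g_new
    return x
```

On the toy function F(x, y) = (x - 1)^2 + 10(y + 2)^2 the routine should approach the minimizer (1, -2) without ever forming the Hessian matrix.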
After model fitting is completed, the transaction volume values for the L days can be predicted directly from the model (46800, 518422, …, 539076), but the predicted values still need some adjustment. The choice of adjustment method depends on the time length of the historical data: it is judged whether the historical time length exceeds the sum of two years and the prediction length; if so, the predicted-length deviation adjustment is used, and if not, the growth-rate monthly characteristic adjustment is used. Since P < 365 × 2, the growth rate is adjusted by monthly characteristics here.
According to the adjustment scheme, the growth rates of the historical data are calculated first. Over the 500 days of history, because the periodic term in the model is fitted only to the week and the fitted length is less than two years, the yearly cycle is not considered; the monthly growth rate and the working-day/rest-day growth rates are the focus of the adjustment, supplemented by the special-date adjustment. The average monthly growth rate r_m is calculated as follows: the daily average of each month in the historical data is computed first, R_month = (463272, 458594, …, 507193), 12 values in total; the differences between successive daily averages are taken as the monthly growth, and averaging these differences gives the average monthly growth rate r_m = 2902. Next, the average growth rate r_wd over all working days and the average growth rate r_we over all weekends in the 500 days are calculated. These growth rates are computed by dividing differences by an interval dimension, which is (1, 3) for working days and (1, 6) for weekends. All working-day historical data is arranged in order and successive differences are taken and divided by the respective interval, with the difference from Friday to Monday divided by 3; all weekend historical data is arranged in order and differenced in the same way, with the difference from Sunday to the following Saturday divided by 6. The respective differences are then averaged to give the average growth rates, 3300 and 330740 respectively. For special dates, it is first checked whether any special date falls within the prediction dates; in this case only the Spring Festival, 2020-01-25, is a special date. It is then checked whether the Spring Festival exists in the historical data; in this case the date 2019-02-05 exists, so the historical value of that date is applied directly to the prediction date 2020-01-25, and this is the final predicted value for that day. If the historical data contains the special date in several periods, the value of the most recent one is applied directly and then modified by the special-date growth rate to give the final value. For non-special dates, the 29 predicted values remaining after the special date is removed are adjusted, according to their interval from the service date, by a total growth effect consisting of half of the working-day or rest-day growth and half of the monthly growth. The prediction deviations for the 29 days from 2020-01-01 to 2020-01-30, excluding 2020-01-25, are as follows:
Figure BDA0003128567910000231
Figure BDA0003128567910000232
After the calculation of the growth rate deviation is completed, the deviation is added to the original prediction result to obtain the adjusted prediction result:
Figure BDA0003128567910000233
The adjusted prediction result is stored as the final prediction result.
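The growth rate calculations used in case one can be sketched as follows. This is a minimal sketch with hypothetical function names and synthetic data. It assumes that the "division dimension" is the number of calendar days between consecutive observations of the same day type (1 or 3 for working days, 1 or 6 for weekends), and it ignores public holidays. Each function needs at least two periods, otherwise the mean of an empty list raises an error.

```python
from datetime import date, timedelta
from statistics import mean


def monthly_growth_rate(series):
    """Mean of the differences between successive monthly daily averages.

    series: dict mapping date -> daily value.
    """
    by_month = {}
    for day, value in series.items():
        by_month.setdefault((day.year, day.month), []).append(value)
    monthly_means = [mean(by_month[key]) for key in sorted(by_month)]
    diffs = [b - a for a, b in zip(monthly_means, monthly_means[1:])]
    return mean(diffs)


def day_type_growth_rate(series, weekend):
    """Average per-day growth over working days or weekend days.

    Consecutive same-type days are differenced and each difference is
    divided by the gap in calendar days (assumed division dimension).
    """
    days = sorted(d for d in series if (d.weekday() >= 5) == weekend)
    rates = [(series[b] - series[a]) / (b - a).days for a, b in zip(days, days[1:])]
    return mean(rates)


# Synthetic data: a series that grows by exactly 100 per day for 90 days.
start = date(2019, 1, 1)
series = {start + timedelta(days=i): 100.0 * i for i in range(90)}

print(monthly_growth_rate(series))
print(day_type_growth_rate(series, weekend=False))
print(day_type_growth_rate(series, weekend=True))
```

On this synthetic series both day-type rates recover the per-day slope, which is a quick check that dividing by the gap behaves as intended.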
Case two:
This case illustrates the prediction length deviation adjustment, again using the sequence decomposition fitting model. Suppose there are three years of historical data, as follows:
2017-01-01 2017-01-02 2019-01-01 2019-12-31
327536 401822 821226 601519
The demand is again to forecast the 30 daily transaction amounts from 2020-01-01 to 2020-01-31, so the prediction length L is 30 and the historical data time length P is 1095. Since P > L × 3, the sequence decomposition fitting model is again selected. The model construction differs from case one in the choice of period terms: here both a weekly and an annual period term are used, and the annual period term is a Fourier series with period 365.25 and order 10. The setting and optimization of the other parameters are similar to those in case one and are not repeated here.
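The annual period term can be illustrated by the Fourier features it is built from. This is a sketch only: the function name and the sine-then-cosine column layout are assumptions. The period 365.25 and order 10 come from the text. The seasonal term itself would be the dot product of such a feature row with fitted coefficients.

```python
import math


def fourier_features(t_days, period=365.25, order=10):
    """Return [sin(2*pi*1*t/P), cos(2*pi*1*t/P), ..., sin(2*pi*N*t/P), cos(2*pi*N*t/P)].

    t_days: time in days from an arbitrary origin.
    The column order is an illustrative choice, not taken from the source.
    """
    features = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t_days / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


row = fourier_features(0.0)
print(len(row))  # 2 * order columns
```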
After the model is constructed and optimized, the predicted data for the 30 days after the service date, (46800, 518422, …, 539076), are obtained from the fitted model, and the adjustment method is chosen according to the time length of the historical data. Since P > 365 × 2 this time, the prediction length deviation adjustment is used.
The prediction length deviation adjustment uses the deviation between the true values in the historical data and the predicted values obtained by fitting as the adjustment for the corresponding period in the future. The service date T is moved back into the history by L days, that is, the last 30 days of history are removed. Using the 1065 days of historical data from 2017-01-01 to 2019-12-01, the model is fitted again by sequence decomposition and the daily transaction amounts for the 30 days after the new service date are predicted.
This gives the historical true values for the 30 days from 2019-12-02 to 2019-12-31, (437200, 468382, …, 518956), and the decomposition-fitted predicted values, (440020, 438922, …, 373933). The differences between the predicted values and the true values give the 30 daily deviations (ε_{T-29}, …, ε_T) = (2820, 29460, …, -185023). Applying these 30 deviations to the corresponding future prediction results gives the adjusted values for the 30 days:
Figure BDA0003128567910000241
Figure BDA0003128567910000242
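The prediction length deviation adjustment described above amounts to a backtest on the last L days followed by a correction of the real forecast. The sketch below uses a hypothetical `fit_predict` callback in place of the sequence decomposition model. It also assumes a sign convention (deviation = true value minus backtest prediction, added to the forecast), because the source formula is only available as an image.

```python
def deviation_adjust(history, horizon, fit_predict):
    """Adjust a forecast by the errors of a backtest over the last `horizon` days.

    history: list of daily values up to the service date T.
    fit_predict(train, horizon): hypothetical stand-in for the fitted model;
        it returns `horizon` predictions following `train`.
    Assumed sign convention: deviation = actual - backtest prediction.
    """
    train, holdout = history[:-horizon], history[-horizon:]
    backtest = fit_predict(train, horizon)
    deviations = [actual - predicted for actual, predicted in zip(holdout, backtest)]
    forecast = fit_predict(history, horizon)
    adjusted = [f + e for f, e in zip(forecast, deviations)]
    return adjusted, deviations


def last_value_model(train, horizon):
    """Toy model used only for the demonstration: repeat the last observed value."""
    return [train[-1]] * horizon


adjusted, deviations = deviation_adjust([1, 2, 3, 4, 5, 6], 2, last_value_model)
print(deviations)
print(adjusted)
```

With the toy model, the backtest under-predicts a rising series and the same shortfall is added back to the real forecast.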
The ratio of change between the adjusted and unadjusted predicted values is then examined. When the deviation amplitude exceeds 30% of the unadjusted predicted value, a regular identification adjustment is performed; when it exceeds 70%, a special adjustment is performed. Here 2020-01-30 needs a regular identification adjustment. The day is identified as a working day and a non-special date, so the average working-day growth rate is calculated from all the historical data, r_wd = 714, the growth rate is accumulated from the business date, and the secondary adjustment value 714 × 29 is added to the adjusted predicted value. The date 2020-01-24 needs a special adjustment: the data of the most recent date in the same division, plus the average growth rate of that division, are taken as the final prediction result. If the date in that division is not a special date, the data for that date are provisionally treated as abnormal, no secondary adjustment is made, and the situation is reported to the technician. Finally, the secondarily adjusted prediction data for 2020-01-24 and 2020-01-30 are obtained and stored as the final prediction result.
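The secondary adjustment check can be sketched as follows. The function names are hypothetical, and the sketch assumes that the "deviation amplitude" is the absolute change relative to the unadjusted prediction. The 30% and 70% thresholds and the 714 × 29 example come from the text; the value 500000 is made up for the demonstration.

```python
def classify_change(before, after, regular=0.30, special=0.70):
    """Classify how far an adjusted prediction moved from the unadjusted one.

    Assumption: the ratio is |after - before| / |before|, with before != 0.
    """
    ratio = abs(after - before) / abs(before)
    if ratio > special:
        return "special"
    if ratio > regular:
        return "regular"
    return "none"


def regular_secondary_adjust(adjusted, avg_growth_rate, days_from_service_date):
    """Add the growth accumulated since the service date to the adjusted value."""
    return adjusted + avg_growth_rate * days_from_service_date


print(classify_change(100.0, 140.0))
# Worked example from the text: r_wd = 714 accumulated over 29 days.
print(regular_secondary_adjust(500000, 714, 29))
```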
In subsequent work, it is judged whether any of the 30 predicted values exceeds a preset threshold. When prediction data exceeding the threshold exist, they are reported as early-warning information to the application field management system.
Case three:
The prediction demand is again 30 days, but the usable historical data are very short and do not reach three times the prediction length, that is, P < 3 × L, so the translation fitting method is used for prediction. Assume that the historical data time length is 61, as follows:
2019-11-01 2019-11-02 2019-12-01 2019-12-31
818657 481261 425818 601519
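The model selection rule applied across the three cases compares the history length P with three times the prediction length L. A minimal sketch follows, with hypothetical names. The text does not state what happens when P equals 3 × L, so assigning that boundary to the decomposition model is an assumption.

```python
def choose_model(history_days: int, horizon_days: int) -> str:
    """Select the forecasting approach from the history length P and horizon L.

    Assumption: P == 3 * L is treated like P > 3 * L.
    """
    if history_days < 3 * horizon_days:
        return "translation_fitting"
    return "sequence_decomposition"


for p in (500, 1095, 61):  # history lengths of cases one, two and three
    print(p, choose_model(p, 30))
```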
After the historical data are obtained, the working-day growth rate, weekend growth rate, monthly growth rate, and annual growth rate of the index's historical data are calculated first. In this example the annual growth rate cannot be calculated because there are only 61 days of history; the others can. The calculation method is described in detail under the growth rate adjustment: the differences between averages under each division dimension are taken as growth rates, and the mean of those differences is the average growth rate, so the calculation requires at least two periods. This gives the average monthly growth rate r_m = 28762, the average growth rate over all working days r_wd = 253, and the average growth rate over all weekends r_we = 3467.

Next, as in the first step of the prediction length deviation adjustment, the service date T is moved back into the history by L days, and the 30 days of data between the new and old service dates are translated directly onto the future prediction period, giving the translated baseline predicted values for the 30 days, (945504, …, 601519). The baseline predicted values are then adjusted day by day with each growth rate. The growth rate components are obtained from the formula given earlier; because the annual growth rate is unavailable, the weights of the working-day or weekend growth rate and the monthly growth rate are (0.5, 0.5). The growth rate deviations for the 30 days are therefore:
Figure BDA0003128567910000261
Figure BDA0003128567910000262
Finally, as in the growth rate adjustment, the growth rate deviations are added to the translated baseline values to obtain the predicted daily transaction amounts for the 30 days after the service date.
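The translation fitting of case three can be sketched as follows. The function name is hypothetical. The per-day growth contributions are passed in as already-accumulated lists, because the exact accumulation formula is only available as an image in the source. Only the translation of the last L days and the (0.5, 0.5) weighting are taken from the text.

```python
def translation_forecast(history, horizon, day_growth, month_growth, weights=(0.5, 0.5)):
    """Translate the last `horizon` days forward and add weighted growth deviations.

    history: list of daily values ending at the service date.
    day_growth / month_growth: per-day accumulated growth contributions
        (hypothetical inputs, one value per forecast day).
    weights: (day-type weight, monthly weight); (0.5, 0.5) when no annual rate exists.
    """
    if len(history) < horizon:
        raise ValueError("history is shorter than the prediction length")
    baseline = history[-horizon:]
    w_day, w_month = weights
    return [
        base + w_day * dg + w_month * mg
        for base, dg, mg in zip(baseline, day_growth, month_growth)
    ]


history = list(range(61))  # 61 synthetic daily values
forecast = translation_forecast(history, 30, [2.0] * 30, [4.0] * 30)
print(forecast[0], forecast[-1])
```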
With prediction and storage complete, the early warning and optimization strategy feedback are described below by way of example.
In the cases above, predicted values for the 30 days after the business date T were obtained for the daily transaction amount index of the EPAY system. First, based on the existing information, the threshold G and the unit bearing capacity C of the daily transaction amount index of the EPAY system are retrieved. For threshold G = 600000, the 30 predicted values are
Figure BDA0003128567910000263
The maximum of the predicted values is then compared with the threshold, where the maximum is
Figure BDA0003128567910000264
Due to the fact that
Figure BDA0003128567910000265
Therefore, a capacity early-warning alarm is raised. After the operator confirms it and approves the scheduling request, it is transmitted to the application service coordinator. Based on the unit bearing capacity C = 100000 per service, the coordinator calculates that 1 additional application service is needed and transmits this number as an instruction to the total front system. The total front system is the entrance to all the hosted service systems and, according to the instruction, increases the number of application services allocated to the EPAY system by 1 unit, which completes the early warning mechanism.
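The early-warning calculation can be sketched as follows. The function name is hypothetical, and rounding the excess over the threshold up to whole units of capacity is an assumption consistent with the result of 1 in the text. The predicted values in the example are made up, because the actual values appear only as images in the source.

```python
import math


def services_to_add(predictions, threshold, unit_capacity):
    """Number of extra application services needed to cover the predicted peak.

    Assumption: the excess of the peak over the threshold is rounded up
    to whole units of bearing capacity.
    """
    peak = max(predictions)
    if peak <= threshold:
        return 0
    return math.ceil((peak - threshold) / unit_capacity)


# G = 600000 and C = 100000 as in the text; the predictions are illustrative.
print(services_to_add([580000, 650000, 610000], 600000, 100000))
```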
After receiving the instruction, the total front system converts the application service allocation scheme into internal system instructions and locates the access system EPAY through its listener tag. To increase application services, the service configurator copies the specified number of services under the tagged system and starts them, and the CPU of the total front system then automatically balances process handling according to the service counts of the different systems. To reduce application services, the service configurator only needs to end the specified number of application services under the tagged system; the CPU of the total front system automatically balances process handling, and the service configurator then reclaims the services that are no longer running.
After the number of application services of an access system under the total front system is adjusted, the channel of that access system is widened or narrowed. Whether or not the total processing capacity of the total front system changes, the response capability and business processing capability of the access system change with its share of the application services; that is, the proportion of the total front system's physical resources that it uses changes.
Then, with k as the initial growth rate in the formula, 15 change points are taken, and the growth rates at the 15 change points are combined into a vector. Each growth rate is set to follow a Laplace distribution whose parameter is the quotient of the range and the mean of the historical time series data, here 0.105. The growth rates under this random distribution can then be obtained, and from them the growth rates of all the piecewise linear functions related to k; the segment in which each growth rate applies is expressed by an indicator function. In the formula, m is the initial offset. The offset increment of each segment under the influence of the growth rate is obtained through the transpose of the indicator function, and these increments together with m form the total offset. The zero point of the x-axis of the rectangular coordinate system is set to the date 2018-08-18, so that adding 500 gives the service date, and m is set to the first date value, 4.

For the special items, a separate model h must be constructed for each distinct festival, in the same way as the initial expression of the period term model. The special-influence effect of the data on each special date defaults to 10 in the initial state, but it can be set larger for particular holidays to reflect their influence. For the current data, the Spring Festival, Double Eleven, National Day, Valentine's Day, and Labor Day are added to the model. Although the prediction period does not contain these special dates, the historical data fully contain them and they must be taken into account. The initial expression of the special item model is finally obtained as their sum, and the optimization function F must then be optimized and solved.
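The piecewise linear trend described above can be sketched as follows. The function and parameter names are hypothetical. The offset correction at each change point (subtracting the change point position times the rate change) is a common way of keeping such a trend continuous; it is an assumption here, since the source formula is not reproduced in the text. The last two lines check the date arithmetic stated in the paragraph.

```python
from datetime import date, timedelta


def piecewise_trend(t, k, m, changepoints, deltas):
    """Piecewise linear trend with rate changes at the given change points.

    k: initial growth rate; m: initial offset.
    changepoints: times s_j at which the rate changes by deltas[j].
    Assumption: the offset is corrected by -s_j * delta_j so the trend stays continuous.
    """
    rate, offset = k, m
    for s, delta in zip(changepoints, deltas):
        if t >= s:  # indicator: this change point is already active at time t
            rate += delta
            offset -= s * delta
    return rate * t + offset


# One change point at t = 10 that raises the rate from 1.0 to 3.0.
print(piecewise_trend(5, 1.0, 4.0, [10], [2.0]))
print(piecewise_trend(12, 1.0, 4.0, [10], [2.0]))

# Date arithmetic from the text: origin 2018-08-18 plus 500 days.
origin = date(2018, 8, 18)
service_date = origin + timedelta(days=500)
print(service_date)
```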
To minimize the objective function, each update uses the inverse of the second derivative of the objective function, which takes matrix form because there are multiple parameters, together with the first derivative of the objective function. When the iteration index k is 0, the parameters after one optimization step are calculated from the formula and the initialized model parameters. Iteration then continues until the optimal parameters are reached, giving the parameter combination that optimizes the model fit and completing the model optimization. Each iteration requires the inverse second derivative and the first derivative of the objective function. The latter is simple to calculate. The former is obtained by a double-iteration approximation: the matrix is not computed directly in each iteration but is approximated iteratively from the results of at most the last 10 iterations, which reduces the amount of calculation.

Based on the same inventive concept, an embodiment of the present application provides an electronic device. As shown in Fig. 6, the electronic device includes a memory and a processor, and the memory is communicatively connected to the processor.
The memory stores a computer program that, when executed by the processor, implements the capacity data processing and allocation method provided by the embodiments of the present application.
Those skilled in the art will appreciate that the electronic devices provided by the embodiments of the present application may be specially designed and manufactured for the required purposes, or may include known devices in general-purpose computers. For example, the electronic device may be a mobile phone, a tablet computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, and the like, and the embodiment of the present application does not limit the specific type of the electronic device. These devices have stored therein computer programs that are selectively activated or reconfigured. Such a computer program may be stored in a device (e.g., computer) readable medium or in any type of medium suitable for storing electronic instructions and respectively coupled to a bus.
The memory in the electronic device of the present application may be a ROM (Read-Only Memory) or another type of static storage device that can store static information and instructions; a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions; an EEPROM (Electrically Erasable Programmable Read-Only Memory); a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.); a magnetic disk storage medium or other magnetic storage device; or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
The processor in the electronic device of the present application may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. A processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The electronic device provided by the embodiment of the present application has the same inventive concept as the embodiments described above, and the details that are not shown in detail in the electronic device may refer to the embodiments described above, and are not described herein again.
Based on the same inventive concept, embodiments of the present application provide a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor of a server, the capacity data processing and allocation method provided by the embodiments of the present application is implemented.
The computer-readable medium provided herein includes, but is not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROMs, RAMs, EPROMs (Erasable Programmable Read-Only Memories), EEPROMs, flash memory, and magnetic or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
The computer-readable storage medium provided in the embodiments of the present application has the same inventive concept as the embodiments described above. For details not shown for the computer-readable storage medium, refer to the embodiments described above; they are not repeated here. The steps, measures, and schemes in the various operations, methods, and flows discussed in this application may be alternated, modified, combined, or deleted. Further, other steps, measures, or schemes in the various operations, methods, or flows discussed in this application may be alternated, altered, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art that correspond to the various operations, methods, and flows disclosed in the present application may also be alternated, modified, rearranged, decomposed, combined, or deleted.
The terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present application, "a plurality" means two or more unless otherwise specified. Although the steps in the flowcharts are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless otherwise indicated herein, the steps are not restricted to the exact order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the structure of the present invention in any way. Any simple modification, equivalent change and modification of the above embodiments according to the technical essence of the present invention are within the technical scope of the present invention.

Claims (3)

1. A capacity data processing and scheduling method, the method comprising:
step s201, the total front system counts its own service supply and adjusts and distributes it to the system of each service flow;
step s202, the total front system accesses, through channels, the channel systems and service systems required by the complete service;
step s203, calculating bearing capacity data based on system pressure test, and collecting and counting service index data based on a monitoring device;
step s204, the data acquisition synchronizer manages the acquisition device, receives the data, classifies the data systematically and transmits the data to the capacity platform server;
step s205, selecting a target index based on the analysis; if the modeling requirement exists, jumping to the step s301; if the predicted value exists, jumping to step s601;
step s601, feeding back the prediction result to the capacity platform server, displaying whether an alarm state exists and calculating and transmitting allocation parameters based on the relation between the threshold value and the prediction value;
step s602, the application service coordinator receives the allocation parameters, calculates and obtains a service adjustment value based on the bearing capacity data and the prediction threshold difference, and sends a service amount allocation request to the specified system; skipping to step s201;
step s301, the capacity platform server sends a prediction request of a target index to the algorithm platform analysis server according to the requirement;
step s302, the algorithm platform server identifies the index code of the target index and the system code of the target index;
step s303, inquiring system data characteristics according to the system codes and the index codes;
step s304, calculating the time length of the predicted characteristic data;
step s103, judging whether to perform translation fitting; jumping to step s104 if translation fitting is to be performed, and jumping to step s107 if it is not;
step s104, calculating the growth rate, and writing the growth rate into an intermediate database;
step s105, judging a translation structure and calculating an increase weight;
step s106, obtaining the growth rate and the growth weight, calculating the translation fitting, and jumping to step s116;
step s107, constructing a sequence decomposition fitting model;
step s108, fitting model optimization and realizing prediction;
step s109, judging the adjustment mode according to the time length of the feature data; if the growth rate is adjusted according to monthly features, jumping to step s110; if the prediction length is adjusted according to the deviation, jumping to step s113;
step s110, calculating the growth rate, and writing the growth rate into an intermediate database;
step s111, obtaining the growth rate and adjusting according to monthly characteristics, and jumping to step s116;
step s113, calculating the predicted deviation as an adjustment scheme, and warehousing;
step s114, obtaining a predicted length deviation adjustment;
step s115, adjusting twice based on the deviation amplitude;
step s116, the prediction data is saved and/or a jump is made directly to step s601.
2. An automatic capacity resource allocation system, comprising:
various services; and/or a number of channel systems;
a total front system;
a plurality of business systems;
a data acquisition synchronizer;
an application service coordinator;
a capacity platform server;
an algorithm analysis server;
a database; and
a number of memories for storing one or more programs which,
when executed by one or more processors, cause the one or more processors to implement the capacity data processing and scheduling method of claim 1.
3. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the capacity data processing and scheduling method according to claim 1.
CN202110696330.9A 2021-06-23 2021-06-23 Method, system and computer readable medium for capacity data processing and allocation Active CN113344282B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110696330.9A CN113344282B (en) 2021-06-23 2021-06-23 Method, system and computer readable medium for capacity data processing and allocation

Publications (2)

Publication Number Publication Date
CN113344282A 2021-09-03
CN113344282B 2023-01-17

Family

ID=77478018

Country Status (1)

Country Link
CN (1) CN113344282B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant