WO2022121219A1 - Prediction method, apparatus, device and storage medium based on a distribution curve - Google Patents

Prediction method, apparatus, device and storage medium based on a distribution curve

Info

Publication number
WO2022121219A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
historical
business
period
preset
Prior art date
Application number
PCT/CN2021/090828
Other languages
English (en)
French (fr)
Inventor
揭珍
周跃斌
甘嘉成
张海波
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2022121219A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/04: Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G06Q30/0202: Market predictions or forecasting for commercial activities
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04: INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S: SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00: Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50: Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Definitions

  • The present application relates to the field of intelligent decision-making in artificial intelligence, and in particular to a prediction method, apparatus, device and storage medium based on a distribution curve.
  • Current business data forecasting methods generally use a time series forecasting algorithm and historical business data to predict business data at a preset time, thereby obtaining forecast business data.
  • However, because historical business order data is easily affected by the date and by the distribution law between the number of business orders and the business expense data, the forecast business data has a high deviation rate, so the prediction accuracy of business data is low.
  • The present application provides a prediction method, apparatus, device and storage medium based on a distribution curve, which are used to improve the prediction accuracy of business expense data.
  • A first aspect of the present application provides a prediction method based on a distribution curve, including:
  • obtaining first historical business order data of a first preset historical period after data preprocessing, and second historical business order data of a second preset historical period, where the first preset historical period is included in the second preset historical period, the end date of the second preset historical period is the day before the start date of a preset forecast period, the first historical business order data includes historical-period working-day average daily business expense data and historical-period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data;
  • obtaining multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and calculating the difference between each signing date and its corresponding start date to obtain multiple date differences;
  • calculating, by means of a preset prediction model and the second historical business order data, the business expense data ratio values corresponding to the multiple date differences, and calculating, by means of the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values;
  • predicting the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical-period working-day average daily business expense data, a second prediction data set corresponding to the historical-period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • merging the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.
  • A second aspect of the present application provides a distribution-curve-based prediction device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the following steps when executing the computer-readable instructions:
  • obtaining first historical business order data of a first preset historical period after data preprocessing, and second historical business order data of a second preset historical period, where the first preset historical period is included in the second preset historical period, the end date of the second preset historical period is the day before the start date of a preset forecast period, the first historical business order data includes historical-period working-day average daily business expense data and historical-period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data;
  • obtaining multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and calculating the difference between each signing date and its corresponding start date to obtain multiple date differences;
  • calculating, by means of a preset prediction model and the second historical business order data, the business expense data ratio values corresponding to the multiple date differences, and calculating, by means of the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values;
  • predicting the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical-period working-day average daily business expense data, a second prediction data set corresponding to the historical-period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • merging the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.
  • A third aspect of the present application provides a computer-readable storage medium storing computer instructions which, when executed on a computer, cause the computer to perform the following steps:
  • obtaining first historical business order data of a first preset historical period after data preprocessing, and second historical business order data of a second preset historical period, where the first preset historical period is included in the second preset historical period, the end date of the second preset historical period is the day before the start date of a preset forecast period, the first historical business order data includes historical-period working-day average daily business expense data and historical-period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data;
  • obtaining multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and calculating the difference between each signing date and its corresponding start date to obtain multiple date differences;
  • calculating, by means of a preset prediction model and the second historical business order data, the business expense data ratio values corresponding to the multiple date differences, and calculating, by means of the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values;
  • predicting the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical-period working-day average daily business expense data, a second prediction data set corresponding to the historical-period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • merging the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.
  • a fourth aspect of the present application provides a prediction device based on a distribution curve, including:
  • an acquisition module, configured to obtain first historical business order data of a first preset historical period after data preprocessing, and second historical business order data of a second preset historical period, where the first preset historical period is included in the second preset historical period, the end date of the second preset historical period is the day before the start date of a preset forecast period, the first historical business order data includes historical-period working-day average daily business expense data and historical-period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data;
  • a first calculation module, configured to obtain multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and to calculate the difference between each signing date and its corresponding start date to obtain multiple date differences;
  • a second calculation module, configured to calculate, by means of a preset prediction model and the second historical business order data, the business expense data ratio values corresponding to the multiple date differences, and to calculate, by means of the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values;
  • a prediction module, configured to predict the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical-period working-day average daily business expense data, a second prediction data set corresponding to the historical-period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • a merging module, configured to merge the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.
  • In the technical solution provided by the present application, the business expense data of the preset forecast period is predicted to obtain the first, second, third and fourth prediction data sets, and these four data sets are merged to obtain the business expense prediction data. The influence of working-day and holiday data, and the distribution law between the number of business orders and the business expense data, are thereby incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data, being easily affected by the date and by the relationship between the number of business orders and the business expense data, raises the deviation rate of the forecast business expense data, and so improves the prediction accuracy of business expense data.
  • FIG. 1 is a schematic diagram of an embodiment of a prediction method based on a distribution curve in an embodiment of the present application
  • FIG. 2 is a schematic diagram of another embodiment of a prediction method based on a distribution curve in an embodiment of the present application
  • FIG. 3 is a schematic diagram of an embodiment of a prediction device based on a distribution curve in an embodiment of the present application
  • FIG. 4 is a schematic diagram of another embodiment of a prediction device based on a distribution curve in an embodiment of the present application
  • FIG. 5 is a schematic diagram of an embodiment of a prediction device based on a distribution curve in an embodiment of the present application.
  • Embodiments of the present application provide a distribution-curve-based prediction method, apparatus, device and storage medium, which improve the prediction accuracy of business expense data.
  • An embodiment of the prediction method based on a distribution curve in the embodiments of the present application includes:
  • Obtain first historical business order data of a first preset historical period after data preprocessing, and second historical business order data of a second preset historical period, where the first preset historical period is included in the second preset historical period, the end date of the second preset historical period is the day before the start date of the preset forecast period, the first historical business order data includes historical-period working-day average daily business expense data and historical-period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data.
  • the execution subject of the present application may be a prediction device based on a distribution curve, and may also be a terminal or a server, which is not specifically limited here.
  • the embodiments of the present application take the server as an execution subject as an example for description.
  • For example, the first historical business order data may be first historical auto insurance policy data, and the second historical business order data may be second historical auto insurance policy data.
  • According to the first preset historical period, the server may retrieve and extract the preprocessed historical auto insurance policy data in a preset database to obtain the first historical auto insurance policy data; according to the second preset historical period, it may likewise retrieve and extract the preprocessed historical auto insurance policy data in the preset database to obtain the second historical auto insurance policy data.
  • In addition to the historical-period working-day average daily amount data and the historical-period holiday average daily amount data, the first historical auto insurance policy data includes auto insurance policy information: the signing date and start date of each policy and the vehicle type corresponding to the policy, where the vehicle type is either new or used. The content of the second historical auto insurance policy data follows in the same way.
  • The preset forecast period consists of the remaining dates in the later month covered by the second preset historical period, excluding the dates that already belong to the second preset historical period.
  • The end date of the first preset historical period is the same as the end date of the second preset historical period.
  • The historical-period working-day average daily business expense data may be historical-period working-day average daily amount data, and the historical-period holiday average daily business expense data may be historical-period holiday average daily amount data.
  • The historical same-period working-day average daily business expense data may be historical same-period working-day average daily amount data, and the historical same-period holiday average daily business expense data may be historical same-period holiday average daily amount data.
  • For example, if the preset forecast period is 7.20-7.31, the first preset historical period is the 7 days from 7.13 to 7.19, and the second preset historical period is the 30 days from 6.20 to 7.19.
  • The historical-period working-day average daily amount data is then the historical average daily auto insurance premium data for 7.13-7.17, and the historical-period holiday average daily amount data is the historical average daily auto insurance premium data for 7.18-7.19.
  • The historical same-period working-day average daily amount data is the historical average daily auto insurance premium data for 6.22-6.26, 6.29-7.3, 7.6-7.10 and 7.13-7.17, 20 days in total. The historical same-period holiday average daily amount data is the historical average daily auto insurance premium data for 6.20-6.21, 6.27-6.28, 7.4-7.5, 7.11-7.12 and 7.18-7.19, 10 days in total.
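  • The period layout in this example can be sketched in Python. This is a minimal sketch under assumptions that are not in the source: the year 2020 is chosen only because 13 July 2020 is a Monday, which matches the working-day ranges in the example, and "holiday" is approximated by Saturday and Sunday, ignoring public holidays. The helper names `days_between` and `split_by_day_type` are hypothetical.

```python
from datetime import date, timedelta


def days_between(start, end):
    """Return every date from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def split_by_day_type(days):
    """Split dates into working days and holidays.

    Assumption: holidays are approximated by weekends only.
    """
    working = [d for d in days if d.weekday() < 5]
    holidays = [d for d in days if d.weekday() >= 5]
    return working, holidays


YEAR = 2020  # assumed year; 13 July 2020 is a Monday
forecast_start = date(YEAR, 7, 20)
second_end = forecast_start - timedelta(days=1)   # day before the forecast period
second_start = second_end - timedelta(days=29)    # 30-day window
first_start = second_end - timedelta(days=6)      # 7-day window, same end date

first_period = days_between(first_start, second_end)
second_period = days_between(second_start, second_end)

first_work, first_hol = split_by_day_type(first_period)
second_work, second_hol = split_by_day_type(second_period)
```

  • Both windows are derived from the forecast start date, which reflects the two constraints stated in the text: the second period ends the day before the forecast period begins, and the first period shares that end date.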
  • The server obtains in advance training data of a first preset period, training data of a second preset period, training distribution curve values and training date differences; inputs them into an initial prediction model, which predicts the business expense data of a training forecast period to obtain a prediction result; and iteratively adjusts the parameters of the initial prediction model according to a preset loss function and the prediction result to obtain the preset prediction model.
  • That is, the server trains the initial prediction model on the training data of the first preset period, the training data of the second preset period, the training distribution curve values and the training date differences to obtain a prediction result, and iteratively adjusts the weight values or structural parameters of the model until the preset loss function converges. Adjustment then stops and the preset prediction model is obtained.
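  • The adjust-until-convergence loop can be illustrated with a deliberately simple stand-in. The source does not specify the model structure, the loss function or the optimizer, so everything below is an assumption: a one-parameter model `w * x`, a mean squared error loss, plain gradient descent, and made-up training numbers.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, lr=0.01, tol=1e-10, max_iter=10_000):
    """Adjust w iteratively until the loss stops changing (convergence)."""
    w = 0.0
    prev_loss = mse(w, xs, ys)
    for _ in range(max_iter):
        # Gradient of the MSE with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
        loss = mse(w, xs, ys)
        if abs(prev_loss - loss) < tol:  # loss has converged: stop adjusting
            break
        prev_loss = loss
    return w


# Illustrative data only: the targets are exactly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = train(xs, ys)
```

  • The stopping rule compares successive loss values rather than running a fixed number of iterations, which is one common way to read "until the preset loss function converges".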
  • Obtain multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and calculate the difference between each signing date and its corresponding start date to obtain multiple date differences.
  • For example, the second historical business order data may be the second historical auto insurance policy data, and the start date may be the date on which coverage under the policy starts. Each policy in the second historical auto insurance policy data has a corresponding signing date and start date.
  • Take second historical auto insurance policy data A, B and C as an example. The signing date and start date of A are 7.10 and 7.10, those of B are 6.20 and 6.21, and those of C are 6.20 and 6.22. The date difference of A is therefore 0, that of B is 1, and that of C is 2, so A is marked T+0, B is marked T+1 and C is marked T+2.
  • The date differences of the other second historical auto insurance policy data are obtained in the same way, and each date difference is marked on the corresponding policy data in the form T+date difference.
  • The preferred maximum date difference is 30, but the date difference need not be limited and may exceed 30.
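  • The date-difference labelling for policies A, B and C can be sketched as follows. The year is an assumption (the source gives only month and day), and the function name `date_difference_label` is hypothetical.

```python
from datetime import date

YEAR = 2020  # assumed year; the source gives only month and day

# (policy id, signing date, start date) taken from the example in the text.
policies = [
    ("A", date(YEAR, 7, 10), date(YEAR, 7, 10)),
    ("B", date(YEAR, 6, 20), date(YEAR, 6, 21)),
    ("C", date(YEAR, 6, 20), date(YEAR, 6, 22)),
]


def date_difference_label(signing, start):
    """Return the 'T+n' label, where n is start date minus signing date in days."""
    return f"T+{(start - signing).days}"


labels = {pid: date_difference_label(s, e) for pid, s, e in policies}
```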
  • Calculate, by means of the preset prediction model and the second historical business order data, the business expense data ratio values corresponding to the multiple date differences, and calculate, by means of the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values.
  • For example, the second historical business order data may be the second historical auto insurance policy data, and the business expense data ratio value may be the premium income ratio value. The server calls the preset prediction model to calculate the premium income ratio values corresponding to the multiple date differences.
  • The exponential function in the preset prediction model takes T+date difference as the x-axis and the distribution curve value y_j as the y-axis, where j and n both denote the date difference and k_n denotes the premium income ratio value. With this function the server calculates the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the premium income ratio values.
  • Each of these four kinds of distribution curve value comprises multiple values, that is, one date corresponds to multiple distribution curve values.
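  • The source does not spell out how the premium income ratio value k_n is computed, and the exponential function itself is not reproduced here. As a hedged illustration only, the sketch below assumes one plausible reading: k_n is the share of total premium contributed by policies whose date difference is n. The function name and the sample premiums are made up.

```python
from collections import defaultdict


def premium_ratio_by_date_difference(policies):
    """Share of total premium contributed by each date difference n.

    policies: iterable of (date_difference, premium) pairs.
    Assumption: k_n = premium with date difference n / total premium.
    """
    totals = defaultdict(float)
    for diff, premium in policies:
        totals[diff] += premium
    grand_total = sum(totals.values())
    return {n: totals[n] / grand_total for n in sorted(totals)}


# Made-up premiums, for illustration only.
sample = [(0, 500.0), (0, 300.0), (1, 150.0), (2, 50.0)]
k = premium_ratio_by_date_difference(sample)
```

  • Under this reading the ratios sum to 1 across all date differences, so they can be treated as a discrete distribution over T+0, T+1, T+2 and so on.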
  • Predict the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical-period working-day average daily business expense data, a second prediction data set corresponding to the historical-period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data.
  • Specifically, the server predicts the business expense data of the preset forecast period using the preset prediction model, the historical-period working-day average daily business expense data and the periodic working-day distribution curve value to obtain the first prediction data set; using the model, the historical-period holiday average daily business expense data and the periodic holiday distribution curve value to obtain the second prediction data set; using the model, the historical same-period working-day average daily business expense data and the same-period working-day distribution curve value to obtain the third prediction data set; and using the model, the historical same-period holiday average daily business expense data and the same-period holiday distribution curve value to obtain the fourth prediction data set.
  • The server may convert the first, second, third and fourth prediction data sets into matrices to obtain a first, second, third and fourth matrix, add the four matrices to obtain a business expense forecast matrix, and convert the business expense forecast matrix into a vector to obtain the business expense prediction data.
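  • The merge step (convert to matrices, add, convert back to a vector) can be sketched in plain Python. The matrix shape, the helper names and the sample numbers are assumptions for illustration; the source does not state how the data sets are laid out.

```python
def to_matrix(values, n_cols):
    """Reshape a flat list of predictions into rows of n_cols values."""
    return [values[i:i + n_cols] for i in range(0, len(values), n_cols)]


def add_matrices(*matrices):
    """Element-wise sum of equally shaped matrices."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*matrices)]


def flatten(matrix):
    """Turn the summed matrix back into a flat vector."""
    return [cell for row in matrix for cell in row]


# Made-up prediction sets of equal length, for illustration only.
first = [1.0, 2.0, 3.0, 4.0]
second = [0.5, 0.5, 0.5, 0.5]
third = [2.0, 0.0, 2.0, 0.0]
fourth = [0.0, 1.0, 0.0, 1.0]

matrices = [to_matrix(s, n_cols=2) for s in (first, second, third, fourth)]
forecast = flatten(add_matrices(*matrices))
```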
  • In this embodiment of the present application, the influence of working-day and holiday data, and the distribution law between the number of business orders and the business expense data, are incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data, being easily affected by the date and by the relationship between the number of business orders and the business expense data, raises the deviation rate of the forecast business expense data, and so improves the prediction accuracy of business expense data.
  • Another embodiment of the prediction method based on a distribution curve in the embodiments of the present application includes:
  • Obtain first historical business order data of a first preset historical period after data preprocessing, and second historical business order data of a second preset historical period, where the first preset historical period is included in the second preset historical period, the end date of the second preset historical period is the day before the start date of the preset forecast period, the first historical business order data includes historical-period working-day average daily business expense data and historical-period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data.
  • Specifically, the server obtains initial historical business order data of a target preset time period and performs data cleaning on it to obtain candidate historical business order data.
  • The server then classifies the candidate historical business order data by time period and by date type in turn, to obtain target business order data of the first preset historical period and target business order data of the second preset historical period. The date types are working days and holidays, and the business order data includes business order information.
  • The server calculates the historical-period working-day average daily business expense data and the historical-period holiday average daily business expense data of the target business order data of the first preset historical period, and the historical same-period working-day average daily business expense data and the historical same-period holiday average daily business expense data of the target business order data of the second preset historical period.
  • The server determines the business order information of the first preset historical period, the historical-period working-day average daily business expense data and the historical-period holiday average daily business expense data as the first historical business order data, and determines the business order information of the second preset historical period, the historical same-period working-day average daily business expense data and the historical same-period holiday average daily business expense data as the second historical business order data.
  • For example, the initial historical business order data may be initial historical auto insurance policy data, the candidate historical business order data may be candidate historical auto insurance policy data, the target business order data may be target auto insurance policy data, and the business order information may be auto insurance policy information.
  • The historical-period working-day average daily business expense data may be historical-period working-day average daily amount data, and the historical-period holiday average daily business expense data may be historical-period holiday average daily amount data.
  • The historical same-period working-day average daily business expense data may be historical same-period working-day average daily amount data, and the historical same-period holiday average daily business expense data may be historical same-period holiday average daily amount data.
  • The first historical business order data may be first historical auto insurance policy data, and the second historical business order data may be second historical auto insurance policy data.
  • the server cleans the initial historical auto insurance policy data by removing outliers, filling empty values, deduplicating and changing dimension values, and obtains candidate historical auto insurance policy data
  • according to the first preset historical period and the second preset historical period, the candidate historical auto insurance policy data is classified into the initial auto insurance policy data of the first preset historical period and the initial auto insurance policy data of the second preset historical period; a preset label extraction algorithm extracts the label information of the initial auto insurance policy data of the first preset historical period, and this data is classified into working-day data and holiday data according to the working-day identifier and holiday identifier in the label information, thereby obtaining the target auto insurance policy data of the first preset historical period; the target auto insurance policy data of the second preset historical period is obtained in the same way
  • the server calculates the daily average amount of the working-day data in the target auto insurance policy data of the first preset historical period to obtain the historical period working-day daily average amount data, and calculates the daily average amount of the holiday data in the same target data to obtain the historical period holiday daily average amount data; the historical same-period working-day daily average amount data and the historical same-period holiday daily average amount data of the target auto insurance policy data of the second preset historical period are obtained in the same way
  • the auto insurance policy information of the first preset historical period, the historical period working-day daily average amount data and the historical period holiday daily average amount data are determined as the first historical auto insurance policy data, and the auto insurance policy information of the second preset historical period, the historical same-period working-day daily average amount data and the historical same-period holiday daily average amount data are determined as the second historical auto insurance policy data.
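The cleaning and averaging described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the application's implementation: the record layout, the outlier cap `max_amount`, the fill value, the use of weekends as a stand-in for the holiday identifier, and the year 2020 are all hypothetical.

```python
from datetime import date
from statistics import mean

# Hypothetical policy records: (policy_id, signing date, premium amount).
RAW = [
    ("P1", date(2020, 7, 13), 100.0),
    ("P1", date(2020, 7, 13), 100.0),   # duplicate row
    ("P2", date(2020, 7, 14), 300.0),
    ("P3", date(2020, 7, 18), 50.0),
    ("P4", date(2020, 7, 19), None),    # empty amount
    ("P5", date(2020, 7, 15), 1e9),     # outlier
]

def clean(records, max_amount=1e6, fill_value=0.0):
    """Deduplicate, fill empty amounts, and drop outliers (assumed rules)."""
    seen, out = set(), []
    for rec in records:
        if rec in seen:
            continue
        seen.add(rec)
        pid, day, amount = rec
        amount = fill_value if amount is None else amount
        if amount > max_amount:
            continue
        out.append((pid, day, amount))
    return out

def is_holiday(day):
    """Assumption: weekends stand in for the holiday identifier."""
    return day.weekday() >= 5

def daily_averages(records, start, end):
    """Average the per-day totals of working days and of holidays in [start, end].

    Only days that have at least one record contribute to the average.
    """
    totals = {}  # date -> summed amount
    for _, day, amount in records:
        if start <= day <= end:
            totals[day] = totals.get(day, 0.0) + amount
    work = [v for d, v in totals.items() if not is_holiday(d)]
    hol = [v for d, v in totals.items() if is_holiday(d)]
    return (mean(work) if work else 0.0, mean(hol) if hol else 0.0)

workday_avg, holiday_avg = daily_averages(
    clean(RAW), date(2020, 7, 13), date(2020, 7, 19)
)
```

Calling `daily_averages` a second time with the bounds of the second preset historical period would give the same-period averages in the same way.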
  • The execution of steps 202-203 is similar to that of steps 102-103 above and is not repeated here.
  • Predict the daily average business expense data of the preset forecast period by using the preset prediction model, the first historical business order data, the second historical business order data, the multiple date differences and the business expense data ratio values, and obtain multiple daily business expense estimation data sets for the preset forecast period.
  • the server uses the preset prediction model, the first historical business order data and the second historical business order data to predict and average, in turn, the daily average business expense data of the preset forecast period, and obtains the first daily-average prediction data corresponding to the historical period working-day average daily business expense data, the second daily-average prediction data corresponding to the historical period holiday average daily business expense data, the third daily-average prediction data corresponding to the historical same-period working-day average daily business expense data, and the fourth daily-average prediction data corresponding to the historical same-period holiday average daily business expense data; based on the first, second, third and fourth daily-average prediction data, the multiple date differences and the business expense data ratio values, it then predicts the daily average business expense data of the preset forecast period and obtains multiple daily business expense estimation data sets for the preset forecast period.
  • the multiple daily business expense estimation data sets include the first, second, third and fourth daily business expense estimation data sets, which correspond to the first, second, third and fourth daily-average prediction data, respectively.
  • the first historical business order data may be the first historical auto insurance policy data
  • the preset forecast period is from 7.20 to 7.31.
  • the server uses the preset triple exponential smoothing prediction algorithm together with the historical period data in the first historical auto insurance policy data.
  • Using the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, predict the business expense data of the preset forecast period, and obtain the first prediction data set corresponding to the historical period working-day average daily business expense data, the second prediction data set corresponding to the historical period holiday average daily business expense data, the third prediction data set corresponding to the historical same-period working-day average daily business expense data, and the fourth prediction data set corresponding to the historical same-period holiday average daily business expense data.
  • the server obtains the historical period working-day accumulated business expense data and the historical period holiday accumulated business expense data of the first historical business order data, as well as the historical same-period working-day accumulated business expense data and the historical same-period holiday accumulated business expense data of the second historical business order data.
  • Based on the historical period working-day accumulated business expense data, the historical period holiday accumulated business expense data, the multiple daily business expense estimation data sets, the periodic working-day distribution curve value, the periodic holiday distribution curve value and the prediction algorithm, the server calculates the business expense data of the preset forecast period and obtains the first prediction data set corresponding to the historical period working-day average daily business expense data and the second prediction data set corresponding to the historical period holiday average daily business expense data.
  • Based on the historical same-period working-day accumulated business expense data, the historical same-period holiday accumulated business expense data, the multiple daily business expense estimation data sets, the same-period working-day distribution curve value, the same-period holiday distribution curve value and the prediction algorithm, the server calculates the business expense data of the preset forecast period and obtains the third prediction data set corresponding to the historical same-period working-day average daily business expense data and the fourth prediction data set corresponding to the historical same-period holiday average daily business expense data.
  • the business expense can be auto insurance premium income
  • the preset forecast period is 7.20-7.31
  • the first preset historical period is 7.13-7.19
  • the second preset historical period is 6.20-7.19
  • the historical period working-day accumulated business expense data is the accumulated auto insurance premium income of policies signed from 7.13 to 7.17 whose coverage starts in the statistical month.
  • the historical period holiday accumulated business expense data is the accumulated auto insurance premium income of signed policies whose coverage starts in the statistical month.
  • the historical same-period working-day accumulated business expense data is the accumulated auto insurance premium income of policies signed on 6.22-6.26, 6.29-7.3, 7.6-7.10 and 7.13-7.17 whose coverage starts in the statistical month.
  • the historical same-period holiday accumulated business expense data is the accumulated auto insurance premium income of policies signed on 6.20-6.21, 6.27-6.28, 7.4-7.5, 7.11-7.12 and 7.18-7.19 whose coverage starts in August; the accumulated amounts can be obtained through the preset summation function.
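The split of the two historical periods into working days and holidays, and the summation over each group, can be illustrated with the short sketch below. It assumes the year 2020 (the text gives only month and day), treats weekends as the only holidays, and uses a made-up flat premium income per signing day; the filter on the coverage start month is left out for brevity.

```python
from datetime import date, timedelta

def date_range(start, end):
    """Yield every calendar day from start to end inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)

def split_period(start, end):
    """Split a period into working days and holidays (assumption: weekends only)."""
    work, hol = [], []
    for day in date_range(start, end):
        (hol if day.weekday() >= 5 else work).append(day)
    return work, hol

def accumulate(amount_by_date, days):
    """Sum the premium income signed on the given days (missing days count as 0)."""
    return sum(amount_by_date.get(d, 0.0) for d in days)

YEAR = 2020  # assumed year, not stated in the text
first_work, first_hol = split_period(date(YEAR, 7, 13), date(YEAR, 7, 19))
second_work, second_hol = split_period(date(YEAR, 6, 20), date(YEAR, 7, 19))

# Hypothetical premium income: a flat 10.0 per signing day.
income = {d: 10.0 for d in date_range(date(YEAR, 6, 20), date(YEAR, 7, 19))}
first_work_total = accumulate(income, first_work)
second_hol_total = accumulate(income, second_hol)
```

Under these assumptions the first period yields the five working days 7.13-7.17 and the two holidays 7.18-7.19, and the second period yields twenty working days and ten holidays, matching the date groups listed above.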
  • the server generates the first, second, third and fourth prediction sequences from the first, second, third and fourth prediction data sets, respectively; adds the first prediction sequence, the second prediction sequence, the third prediction sequence and the fourth prediction sequence in turn to obtain a merged sequence; and calculates the arithmetic mean of the merged sequence to obtain the business expense prediction data.
  • the first prediction sequence, the second prediction sequence, the third prediction sequence, and the fourth prediction sequence are prediction data sorted in ascending order of the date difference, and the sorted prediction data are marked with the date difference.
  • the server generates the corresponding first prediction sequence, second prediction sequence, third prediction sequence and fourth prediction sequence from the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set, respectively [expressions of the four prediction sequences missing]. The four sequences are added to obtain the merged sequence, the arithmetic mean of the merged sequence is calculated, and the target business expense prediction data is obtained.
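Since the expressions of the sequences are not available here, the following is only one possible reading of the merge step: each prediction data set is sorted by date difference, the four sequences are added entry by entry, and each sum is divided by the number of sequences. The dictionary layout and the sample values are hypothetical.

```python
def to_sequence(data_set):
    """Sort a {date_difference: value} prediction data set by ascending date difference."""
    return sorted(data_set.items())

def merge(sequences):
    """Add the sequences element-wise, matching entries by date difference."""
    merged = {}
    for seq in sequences:
        for diff, value in seq:
            merged[diff] = merged.get(diff, 0.0) + value
    return sorted(merged.items())

def arithmetic_mean(merged, count):
    """Divide each merged entry by the number of sequences that were added."""
    return [(diff, total / count) for diff, total in merged]

# Hypothetical prediction data sets keyed by date difference (in days).
data_sets = [
    {0: 100.0, 1: 80.0},
    {1: 90.0, 0: 110.0},
    {0: 90.0, 1: 70.0},
    {0: 100.0, 1: 80.0},
]
sequences = [to_sequence(ds) for ds in data_sets]
merged = merge(sequences)
forecast = arithmetic_mean(merged, len(sequences))
```

Keeping the date difference attached to every entry mirrors the statement that the sorted prediction data are marked with the date difference, and it lets the sequences be added even if the data sets list their entries in different orders.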
  • after merging the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain the business expense prediction data, the server also obtains the deviation rate of the business expense prediction data and optimizes the preset prediction model according to the deviation rate and a preset optimization algorithm.
  • the server obtains the actual business expense data corresponding to the business expense prediction data, calculates the deviation rate from the prediction data and the actual data, and optimizes and updates the preset prediction model according to the deviation rate and the preset optimization algorithm. The update can target the model parameters or weight values of the preset prediction model, its model structure, network layers and algorithms, or the execution process by which the model predicts the business expense data.
  • the preset optimization algorithm can be any one of gradient descent, Newton's method, the momentum method (Momentum), Nesterov momentum, the adaptive gradient algorithm (Adagrad) and the Adam optimization algorithm (Adam), or a combination of several of them. Optimizing and updating the preset prediction model improves its prediction accuracy.
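As a rough illustration of this feedback loop, the sketch below computes a deviation rate and tunes a single scalar weight with plain gradient descent. The relative-error definition of the deviation rate, the single-weight model and the sample values are all assumptions; the text does not specify which parameters are updated or how.

```python
def deviation_rate(predicted, actual):
    """Relative deviation of a prediction from the actual value (assumed definition)."""
    return abs(predicted - actual) / abs(actual)

def tune_scale(raw_predictions, actuals, lr=0.1, steps=200):
    """Fit a weight w so that w * raw approximates actual, by gradient descent
    on the mean squared error. Inputs are assumed to be scaled to order 1 so
    that the fixed learning rate stays stable."""
    w = 1.0
    n = len(raw_predictions)
    for _ in range(steps):
        grad = sum(2 * (w * p - a) * p for p, a in zip(raw_predictions, actuals)) / n
        w -= lr * grad
    return w

# Hypothetical scaled predictions that run 10% below the actual values.
raw = [1.0, 2.0]
actual = [1.1, 2.2]
w = tune_scale(raw, actual)
before = deviation_rate(raw[0], actual[0])
after = deviation_rate(w * raw[0], actual[0])
```

Any of the optimizers named above could replace the plain update `w -= lr * grad`; gradient descent is used here only because it is the shortest to write down.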
  • the influence of working days and holidays on the data, as well as the distribution pattern between the number of business orders and the business expense data, can be incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the relationship between the number of business orders and the business expense data on a given date, reduces the deviation rate of the predicted business expense data, and thereby improves the prediction accuracy of the business expense data.
  • the prediction method based on the distribution curve in the embodiment of the present application is described above.
  • the following describes the distribution curve-based prediction apparatus in the embodiment of the present application.
  • an embodiment of the distribution curve-based prediction apparatus in the embodiment of the present application includes:
  • the acquisition module 301 is configured to acquire the first historical business order data of the first preset historical period and the second historical business order data of the second preset historical period, both having undergone data preprocessing, where the first preset historical period is contained in the second preset historical period, the end date of the second preset historical period is the day before the start date of the preset forecast period, the first historical business order data includes the historical period working-day average daily business expense data and the historical period holiday average daily business expense data, and the second historical business order data includes the historical same-period working-day average daily business expense data and the historical same-period holiday average daily business expense data;
  • the first calculation module 302 is configured to obtain multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and to calculate the differences between the signing dates and their corresponding start dates to obtain multiple date differences;
  • the second calculation module 303 is configured to calculate the business expense data ratio values corresponding to the multiple date differences by using the preset prediction model and the second historical business order data, and to calculate, by using the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values;
  • the prediction module 304 is configured to predict the business expense data of the preset forecast period by using the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, and to obtain the first prediction data set corresponding to the historical period working-day average daily business expense data, the second prediction data set corresponding to the historical period holiday average daily business expense data, the third prediction data set corresponding to the historical same-period working-day average daily business expense data, and the fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • the merging processing module 305 is configured to perform merging processing on the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.
  • each module in the foregoing distribution curve-based prediction apparatus corresponds to each step in the foregoing distribution curve-based prediction method embodiment, and the functions and implementation processes thereof will not be repeated here.
  • the influence of working days and holidays on the data, as well as the distribution pattern between the number of business orders and the business expense data, can be incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the relationship between the number of business orders and the business expense data on a given date, reduces the deviation rate of the predicted business expense data, and thereby improves the prediction accuracy of the business expense data.
  • another embodiment of the distribution curve-based prediction apparatus in the embodiment of the present application includes:
  • the acquisition module 301 is configured to acquire the first historical business order data of the first preset historical period and the second historical business order data of the second preset historical period, both having undergone data preprocessing, where the first preset historical period is contained in the second preset historical period, the end date of the second preset historical period is the day before the start date of the preset forecast period, the first historical business order data includes the historical period working-day average daily business expense data and the historical period holiday average daily business expense data, and the second historical business order data includes the historical same-period working-day average daily business expense data and the historical same-period holiday average daily business expense data;
  • the first calculation module 302 is configured to obtain multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and to calculate the differences between the signing dates and their corresponding start dates to obtain multiple date differences;
  • the second calculation module 303 is configured to calculate the business expense data ratio values corresponding to the multiple date differences by using the preset prediction model and the second historical business order data, and to calculate, by using the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values;
  • the prediction module 304 is configured to predict the business expense data of the preset forecast period by using the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, and to obtain the first prediction data set corresponding to the historical period working-day average daily business expense data, the second prediction data set corresponding to the historical period holiday average daily business expense data, the third prediction data set corresponding to the historical same-period working-day average daily business expense data, and the fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • the prediction module 304 specifically includes:
  • the first prediction unit 3041 is configured to predict the daily average business expense data of the preset forecast period by using the preset prediction model, the first historical business order data, the second historical business order data, the multiple date differences and the business expense data ratio values, and to obtain multiple daily business expense estimation data sets for the preset forecast period;
  • the second prediction unit 3042 is configured to predict the business expense data of the preset forecast period according to the multiple daily business expense estimation data sets of the preset forecast period, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, and to obtain the first prediction data set corresponding to the historical period working-day average daily business expense data, the second prediction data set corresponding to the historical period holiday average daily business expense data, the third prediction data set corresponding to the historical same-period working-day average daily business expense data, and the fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • the merging processing module 305 is configured to perform merging processing on the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.
  • the first prediction unit 3041 can also be specifically used for:
  • using the preset prediction model, the first historical business order data and the second historical business order data, predict and average in turn the daily average business expense data of the preset forecast period, and obtain the first daily-average prediction data corresponding to the historical period working-day average daily business expense data, the second daily-average prediction data corresponding to the historical period holiday average daily business expense data, the third daily-average prediction data corresponding to the historical same-period working-day average daily business expense data, and the fourth daily-average prediction data corresponding to the historical same-period holiday average daily business expense data;
  • according to the first, second, third and fourth daily-average prediction data, the multiple date differences and the business expense data ratio values, predict the daily average business expense data of the preset forecast period, and obtain multiple daily business expense estimation data sets for the preset forecast period.
  • the second prediction unit 3042 can also be specifically used for:
  • based on the historical period working-day accumulated business expense data, calculate the business expense data of the preset forecast period, and obtain the first prediction data set corresponding to the historical period working-day average daily business expense data and the second prediction data set corresponding to the historical period holiday average daily business expense data;
  • based on the historical same-period working-day accumulated business expense data, calculate the business expense data of the preset forecast period, and obtain the third prediction data set corresponding to the historical same-period working-day average daily business expense data and the fourth prediction data set corresponding to the historical same-period holiday average daily business expense data.
  • the obtaining module 301 may also be specifically used for:
  • classify the candidate historical business order data by time period and then by date type to obtain the target business order data of the first preset historical period and the target business order data of the second preset historical period, where the date types include working days and holidays, and the business order data includes business order information;
  • determine the business order information of the first preset historical period, the historical period working-day average daily business expense data and the historical period holiday average daily business expense data as the first historical business order data, and determine the business order information of the second preset historical period, the historical same-period working-day average daily business expense data and the historical same-period holiday average daily business expense data as the second historical business order data.
  • the merging processing module 305 can also be specifically used for:
  • generate the first prediction sequence, the second prediction sequence, the third prediction sequence and the fourth prediction sequence from the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set, respectively;
  • the distribution curve-based prediction apparatus further includes:
  • the optimization module 306 is configured to obtain the deviation rate of the business expense forecast data, and optimize the preset prediction model according to the deviation rate and the preset optimization algorithm.
  • each module and each unit in the above-mentioned distribution curve-based prediction apparatus corresponds to each step in the above-mentioned distribution curve-based prediction method embodiment, and the functions and implementation process thereof will not be repeated here.
  • the influence of working days and holidays on the data, as well as the distribution pattern between the number of business orders and the business expense data, can be incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the relationship between the number of business orders and the business expense data on a given date, reduces the deviation rate of the predicted business expense data, and thereby improves the prediction accuracy of the business expense data.
  • FIGS 3 and 4 above describe in detail the distribution curve-based prediction apparatus in the embodiment of the present application from the perspective of modular functional entities, and the following describes the distribution curve-based prediction device in the embodiment of the present application in detail from the perspective of hardware processing.
  • FIG. 5 is a schematic structural diagram of a distribution curve-based prediction device provided by an embodiment of the present application.
  • the distribution curve-based prediction device 500 may vary greatly with configuration or performance, and may include one or more central processing units (CPUs) 510 (e.g., one or more processors), memory 520, and one or more storage media 530 (e.g., one or more mass storage devices) that store application programs 533 or data 532.
  • the memory 520 and the storage medium 530 may be short-term storage or persistent storage.
  • the program stored in the storage medium 530 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the distribution curve-based prediction device 500.
  • the processor 510 may be configured to communicate with the storage medium 530 and to execute, on the distribution curve-based prediction device 500, the series of instruction operations in the storage medium 530.
  • the distribution curve-based prediction device 500 may also include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux or FreeBSD.
  • the present application also provides a distribution curve-based prediction device, comprising a memory and at least one processor, where instructions are stored in the memory and the memory and the at least one processor are interconnected by lines; the at least one processor invokes the instructions in the memory to cause the distribution curve-based prediction device to perform the steps of the distribution curve-based prediction method described above.
  • the present application also provides a computer-readable storage medium, and the computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • the computer-readable storage medium stores computer instructions, and when the computer instructions are executed on the computer, the computer performs the following steps:
  • acquire the first historical business order data of the first preset historical period and the second historical business order data of the second preset historical period, both having undergone data preprocessing, where the first preset historical period is contained in the second preset historical period, the end date of the second preset historical period is the day before the start date of the preset forecast period, the first historical business order data includes the historical period working-day average daily business expense data and the historical period holiday average daily business expense data, and the second historical business order data includes the historical same-period working-day average daily business expense data and the historical same-period holiday average daily business expense data;
  • calculate the business expense data ratio values corresponding to the multiple date differences by using the preset prediction model and the second historical business order data, and calculate, by using the exponential function in the preset prediction model and the multiple date differences, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value of the business expense data ratio values;
  • predict the business expense data of the preset forecast period by using the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, and obtain the first prediction data set corresponding to the historical period working-day average daily business expense data, the second prediction data set corresponding to the historical period holiday average daily business expense data, the third prediction data set corresponding to the historical same-period working-day average daily business expense data, and the fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
  • the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set are combined to obtain business expense prediction data.
  • the integrated unit if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present application in essence, or the parts that contribute to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.

Abstract

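The present application relates to the field of artificial intelligence and provides a distribution curve-based prediction method, apparatus, device and storage medium for improving the prediction accuracy of business expense data. The distribution curve-based prediction method includes: calculating multiple date differences between multiple signing dates in second historical business order data and the start date corresponding to each signing date; calculating distribution curve values by means of a preset prediction model, the second historical business order data and the multiple date differences; predicting the business expense data of a preset forecast period by means of the preset prediction model, first historical business order data, the second historical business order data and the distribution curve values, to obtain multiple prediction data sets; and merging the multiple prediction data sets to obtain business expense prediction data. In addition, the present application relates to blockchain technology: the first historical business order data and the second historical business order data may be stored in a blockchain.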

Description

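Distribution curve-based prediction method, apparatus, device and storage medium
This application claims priority to Chinese patent application No. 202011425186.7, filed with the China Patent Office on December 9, 2020 and entitled "Distribution curve-based prediction method, apparatus, device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of intelligent decision-making in artificial intelligence, and in particular to a distribution curve-based prediction method, apparatus, device and storage medium.
Background
As businesses become widespread, the statistics and prediction of the various kinds of business data become particularly important, for example the prediction of business data. Current approaches generally predict the business data for a preset time by means of a time series prediction algorithm and historical business data, thereby obtaining predicted business data.
However, the inventors realized that this approach predicts directly after merging the historical business data of all historical dates. Holidays and working days occur in a certain order, and the historical business data of the holidays and working days of one cycle month do not correspond one-to-one to those of the previous cycle month. Moreover, historical business data is easily affected by the relationship between the number of business orders and the business expense data on a given date, and the approach ignores the distribution pattern between the number of business orders and the business expense data. As a result, the deviation rate of the predicted business data is high, and the prediction accuracy of business data is low.
Summary
The present application provides a distribution curve-based prediction method, apparatus, device and storage medium for improving the prediction accuracy of business expense data.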
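A first aspect of the present application provides a distribution curve-based prediction method, including:
acquiring first historical business order data of a first preset historical period and second historical business order data of a second preset historical period, both having undergone data preprocessing, wherein the first preset historical period is contained in the second preset historical period, the end date of the second preset historical period is the day before the start date of a preset forecast period, the first historical business order data includes historical period working-day average daily business expense data and historical period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data;
acquiring multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and calculating the differences between the multiple signing dates and their corresponding start dates to obtain multiple date differences;
calculating, by means of a preset prediction model and the second historical business order data, the business expense data ratio values respectively corresponding to the multiple date differences, and calculating, by means of an exponential function in the preset prediction model and the multiple date differences, a periodic working-day distribution curve value, a periodic holiday distribution curve value, a same-period working-day distribution curve value and a same-period holiday distribution curve value of the business expense data ratio values;
predicting the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical period working-day average daily business expense data, a second prediction data set corresponding to the historical period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
merging the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.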
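The date-difference, ratio and exponential-curve steps can be sketched as below. The text only states that an exponential function in the preset prediction model is used; the specific form `ratio ≈ a * exp(b * diff)`, the log-linear least-squares fit, the order tuples and the sample amounts are assumptions made for illustration.

```python
import math
from datetime import date

# Hypothetical orders: (signing date, coverage start date, premium amount).
ORDERS = [
    (date(2020, 7, 1), date(2020, 7, 1), 400.0),
    (date(2020, 7, 1), date(2020, 7, 2), 200.0),
    (date(2020, 7, 1), date(2020, 7, 3), 100.0),
]

def date_differences(orders):
    """Days between each signing date and its corresponding start date."""
    return [(start - signed).days for signed, start, _ in orders]

def ratio_by_difference(orders):
    """Share of the total amount that falls on each date difference."""
    total = sum(amount for _, _, amount in orders)
    ratios = {}
    for signed, start, amount in orders:
        diff = (start - signed).days
        ratios[diff] = ratios.get(diff, 0.0) + amount / total
    return ratios

def fit_exponential(ratios):
    """Least-squares fit of log(ratio) = log(a) + b * diff.

    Requires at least two distinct date differences and positive ratios.
    """
    xs = list(ratios)
    ys = [math.log(ratios[x]) for x in xs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = math.exp(my - b * mx)
    return a, b

def curve_value(a, b, diff):
    """Distribution curve value at a given date difference."""
    return a * math.exp(b * diff)

diffs = date_differences(ORDERS)
ratios = ratio_by_difference(ORDERS)
a, b = fit_exponential(ratios)
```

In a fuller version, the same fit would be run separately on working-day and holiday orders of each historical period to obtain the four distribution curve values named above.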
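A second aspect of the present application provides a distribution curve-based prediction device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
acquiring first historical business order data of a first preset historical period and second historical business order data of a second preset historical period, both having undergone data preprocessing, wherein the first preset historical period is contained in the second preset historical period, the end date of the second preset historical period is the day before the start date of a preset forecast period, the first historical business order data includes historical period working-day average daily business expense data and historical period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data;
acquiring multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and calculating the differences between the multiple signing dates and their corresponding start dates to obtain multiple date differences;
calculating, by means of a preset prediction model and the second historical business order data, the business expense data ratio values respectively corresponding to the multiple date differences, and calculating, by means of an exponential function in the preset prediction model and the multiple date differences, a periodic working-day distribution curve value, a periodic holiday distribution curve value, a same-period working-day distribution curve value and a same-period holiday distribution curve value of the business expense data ratio values;
predicting the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical period working-day average daily business expense data, a second prediction data set corresponding to the historical period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
merging the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.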
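A third aspect of the present application provides a computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the following steps:
acquiring first historical business order data of a first preset historical period and second historical business order data of a second preset historical period, both having undergone data preprocessing, wherein the first preset historical period is contained in the second preset historical period, the end date of the second preset historical period is the day before the start date of a preset forecast period, the first historical business order data includes historical period working-day average daily business expense data and historical period holiday average daily business expense data, and the second historical business order data includes historical same-period working-day average daily business expense data and historical same-period holiday average daily business expense data;
acquiring multiple signing dates in the second historical business order data and the start date corresponding to each signing date, and calculating the differences between the multiple signing dates and their corresponding start dates to obtain multiple date differences;
calculating, by means of a preset prediction model and the second historical business order data, the business expense data ratio values respectively corresponding to the multiple date differences, and calculating, by means of an exponential function in the preset prediction model and the multiple date differences, a periodic working-day distribution curve value, a periodic holiday distribution curve value, a same-period working-day distribution curve value and a same-period holiday distribution curve value of the business expense data ratio values;
predicting the business expense data of the preset forecast period by means of the preset prediction model, the first historical business order data, the second historical business order data, the periodic working-day distribution curve value, the periodic holiday distribution curve value, the same-period working-day distribution curve value and the same-period holiday distribution curve value, to obtain a first prediction data set corresponding to the historical period working-day average daily business expense data, a second prediction data set corresponding to the historical period holiday average daily business expense data, a third prediction data set corresponding to the historical same-period working-day average daily business expense data, and a fourth prediction data set corresponding to the historical same-period holiday average daily business expense data;
merging the first prediction data set, the second prediction data set, the third prediction data set and the fourth prediction data set to obtain business expense prediction data.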
本申请第四方面提供了一种基于分布曲线的预测装置,包括:
获取模块,用于获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,所述第一预设历史时段包含于所述第二预设历史时段,所述第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,所述第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,所述第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
第一计算模块,用于获取所述第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算所述多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
第二计算模块,用于通过预置预测模型和所述第二历史业务订单数据,计算所述多个日期差值分别对应的业务费用数据比例值,通过所述预置预测模型中的指数函数和所述多个日期差值,计算所述业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值;
预测模块,用于通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集;
合并处理模块,用于将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据。
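In the embodiments of the present application, the business fee data of the preset prediction period is predicted by combining workdays, holidays, two preset historical periods, multiple date differences, multiple distribution curve values and the preset prediction model, yielding the first, second, third and fourth prediction data sets, which are then merged to obtain the business fee prediction data. In this way, the influence of workdays and holidays, as well as the distribution pattern between business order quantity and business fee data, is incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the date-dependent relationship between business order quantity and business fee data, reduces the deviation rate of the predicted business fee data, and thereby improves the prediction accuracy for business fee data.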
附图说明
图1为本申请实施例中基于分布曲线的预测方法的一个实施例示意图;
图2为本申请实施例中基于分布曲线的预测方法的另一个实施例示意图;
图3为本申请实施例中基于分布曲线的预测装置的一个实施例示意图;
图4为本申请实施例中基于分布曲线的预测装置的另一个实施例示意图;
图5为本申请实施例中基于分布曲线的预测设备的一个实施例示意图。
具体实施方式
本申请实施例提供了一种基于分布曲线的预测方法、装置、设备及存储介质,提高了业务费用数据的预测准确性。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”、“第四”等(如果存在)是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。此外,术语“包括”或“具有”及其任何变形,意图在于覆盖不排他的包含,例如,包含了一系列步骤或单元的过程、方法、系统、产品或设备不必限于清楚地列出的那些步骤或单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它步骤或单元。
为便于理解,下面对本申请实施例的具体流程进行描述,请参阅图1,本申请实施例中基于分布曲线的预测方法的一个实施例包括:
101、获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,第一预设历史时段包含于第二预设历史时段,第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据。
可以理解的是,本申请的执行主体可以为基于分布曲线的预测装置,还可以是终端或者服务器,具体此处不做限定。本申请实施例以服务器为执行主体为例进行说明。
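For example, the first historical business order data may be first historical auto insurance policy data, and the second historical business order data may be second historical auto insurance policy data. According to the first preset historical period, the server searches the preprocessed historical auto insurance policy data in a preset database and extracts the first historical auto insurance policy data; according to the second preset historical period, it searches the same preprocessed data and extracts the second historical auto insurance policy data. In addition to the historical-cycle workday daily-average amount data and the historical-cycle holiday daily-average amount data, the first historical auto insurance policy data also includes auto insurance policy information, which comprises the signing date and coverage start date of each policy and the vehicle type of the policy (new vehicle or used vehicle). The second historical auto insurance policy data is structured analogously.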
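The preset prediction period consists of the remaining dates, in the later of the months covered by the second preset historical period, that fall outside the second preset historical period. The end date of the first preset historical period is the same as the end date of the second preset historical period. For example, the historical-cycle workday daily-average business fee data may be historical-cycle workday daily-average amount data, the historical-cycle holiday daily-average business fee data may be historical-cycle holiday daily-average amount data, the historical same-period workday daily-average business fee data may be historical same-period workday daily-average amount data, and the historical same-period holiday daily-average business fee data may be historical same-period holiday daily-average amount data. Suppose the preset prediction period is 7.20-7.31, the first preset historical period is the 7 days 7.13-7.19, and the second preset historical period is the 30 days 6.20-7.19. Then the historical-cycle workday daily-average amount data is the historical daily-average auto insurance premium data of 7.13-7.17; the historical-cycle holiday daily-average amount data is that of 7.18-7.19; the historical same-period workday daily-average amount data is that of the 20 days 6.22-6.26, 6.29-7.3, 7.6-7.10 and 7.13-7.17; and the historical same-period holiday daily-average amount data is that of the 10 days 6.20-6.21, 6.27-6.28, 7.4-7.5, 7.11-7.12 and 7.18-7.19.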
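The window arithmetic in this example can be checked with a short sketch. It is only an illustration under two assumptions that the text does not state: the year is taken to be 2020, and a "holiday" is treated as a Saturday or Sunday (a production calendar would also need public holidays and make-up workdays). The function name `split_days` is hypothetical.

```python
from datetime import date, timedelta


def split_days(start, end):
    """Split an inclusive date range into (workdays, holidays).

    Assumption: a holiday is simply a Saturday or Sunday.
    """
    workdays, holidays = [], []
    day = start
    while day <= end:
        (holidays if day.weekday() >= 5 else workdays).append(day)
        day += timedelta(days=1)
    return workdays, holidays


# Assumed year 2020; the text only gives month and day.
forecast_start = date(2020, 7, 20)
second_end = forecast_start - timedelta(days=1)   # 7.19, day before the forecast
first_start = second_end - timedelta(days=6)      # 7.13, 7-day window
second_start = second_end - timedelta(days=29)    # 6.20, 30-day window

cycle_workdays, cycle_holidays = split_days(first_start, second_end)
same_workdays, same_holidays = split_days(second_start, second_end)
```

Under these assumptions the first window splits into 5 workdays and 2 holidays, and the second window into 20 workdays and 10 holidays, matching the day counts given in the example.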
需要说明的是,服务器预先获取第一预设时段训练数据、第二预设时段训练数据、训练分布曲线值和训练日期差值;将第一预设时段训练数据、第二预设时段训练数据、训练分布曲线值和训练日期差值,输入初始预测模型,通过初始预测模型对训练预测时段的业务费用数据进行预测,得到预测结果;根据预置的损失函数和预测结果,对初始预测模型的参数进行迭代调整,得到预置预测模型。
服务器通过第一预设时段训练数据、第二预设时段训练数据、训练分布曲线值和训练日期差值,对初始预测模型进行训练,得到预测结果,通过预测结果和损失函数,对初始预测模型的权重值或结构参数进行迭代调整,直至预置的损失函数收敛,停止调整,得到预置预测模型。
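The iterative adjustment described above (predict, evaluate a loss, adjust parameters, stop when the loss converges) can be sketched with a deliberately tiny model. Everything here is an assumption for illustration: the single-weight model, the mean-squared-error loss, the learning rate and the function name `fit_scale` are not taken from the application, which does not disclose the model structure or the loss function.

```python
def fit_scale(inputs, targets, lr=0.05, tol=1e-12, max_steps=10_000):
    """Fit prediction = w * input by gradient descent on mean squared error.

    Iteration stops once the loss changes by less than `tol`
    between two consecutive steps (a simple convergence test).
    """
    w, prev_loss = 0.0, float("inf")
    n = len(inputs)
    loss = prev_loss
    for _ in range(max_steps):
        errors = [w * x - t for x, t in zip(inputs, targets)]
        loss = sum(e * e for e in errors) / n
        if abs(prev_loss - loss) < tol:
            break  # loss has converged
        grad = 2.0 * sum(e * x for e, x in zip(errors, inputs)) / n
        w -= lr * grad
        prev_loss = loss
    return w, loss


# Hypothetical training pairs whose exact relation is target = 2 * input.
w, final_loss = fit_scale([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The learning rate has to suit the scale of the inputs; with much larger fee amounts the data would normally be normalized first.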
102、获取第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值。
例如,第二历史业务订单数据可为第二历史车险保单数据,起始日期可为起保日期,第二历史车险保单数据中每个保单都有对应的订立日期和起保日期,以第二历史车险保单数据A、B和C为例,第二历史车险保单数据A中的订立日期和起保日期分别为7.10和7.10,第二历史车险保单数据B中的订立日期和起保日期分别为6.20和6.21,第二历史车险保单数据C中的订立日期和起保日期分别为6.20和6.22,则第二历史车险保单数据A的日期差值为0,第二历史车险保单数据B的日期差值为1,第二历史车险保单数据C的日期差值为2,将第二历史车险保单数据A标记为T+0,将第二历史车险保单数据B标记为T+1,将第二历史车险保单数据C标记为T+2,以此类推,可得其他第二历史车险保单数据的日期差值,并将日期差值以T+日期差值的形式标记在第二历史车险保单数据上,以便于对第二历史车险保单数据的日期差值的读取和计算,其中,日期差值的优选最大值为30,日期差值也可以不限制,即可超过30。
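As a minimal sketch of the date-difference labelling just described, the following computes the difference in days between the signing date and the coverage start date and formats it as a T+n label. The year 2020 and the function name `date_difference_label` are assumptions made for the example.

```python
from datetime import date


def date_difference_label(signing, start):
    """Return (difference in days, 'T+n' label) for one policy."""
    diff = (start - signing).days
    return diff, f"T+{diff}"


# Policies A, B and C from the example; the year 2020 is assumed.
policies = {
    "A": (date(2020, 7, 10), date(2020, 7, 10)),
    "B": (date(2020, 6, 20), date(2020, 6, 21)),
    "C": (date(2020, 6, 20), date(2020, 6, 22)),
}
labels = {
    name: date_difference_label(signing, start)
    for name, (signing, start) in policies.items()
}
```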
103、通过预置预测模型和第二历史业务订单数据,计算多个日期差值分别对应的业务费用数据比例值,通过预置预测模型中的指数函数和多个日期差值,计算业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值。
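For example, the second historical business order data may be second historical auto insurance policy data, and the business fee data ratio value may be a premium income ratio value. The server invokes the preset prediction model, which may consist of a ratio algorithm, a distribution curve algorithm and a prediction algorithm. The ratio algorithm computes the premium income ratio values. The distribution curve algorithm computes the target curve values. The prediction algorithm performs multi-dimensional prediction and merging of the premium income data of the preset prediction period according to the first historical auto insurance policy data, the second historical auto insurance policy data, the premium income ratio values and the computed distribution curve values, obtaining the final predicted value. The dimensions are the workdays and holidays of the first preset historical period and the workdays and holidays of the second preset historical period.
Through the preset prediction model, the premium income ratio value for each date difference is computed as the amount data labelled T+date difference in the second historical auto insurance policy data divided by the second historical auto insurance policy data. That is, the premium income ratio value $k_n$ is the amount data $S_{T+n}$ divided by the second historical auto insurance policy data $S$:
$$k_n = \frac{S_{T+n}}{S}$$
[formula image: Figure PCTCN2021090828-appb-000001]
where $n$ denotes the date difference. Evaluating this for each date difference [formula image: Figure PCTCN2021090828-appb-000002] yields $k_0$ to $k_{30}$.
Through the exponential function in the preset prediction model [formula image: Figure PCTCN2021090828-appb-000003], with T+date difference as the x-axis and the distribution curve value $y_j$ as the y-axis, where $j$ and $n$ both denote date differences and $k_n$ denotes the premium income ratio value, the server computes the cycle workday distribution curve values, cycle holiday distribution curve values, same-period workday distribution curve values and same-period holiday distribution curve values of the premium income ratio values corresponding to the multiple date differences. Each of these four kinds of distribution curve value comprises multiple values, i.e. one date corresponds to multiple distribution curve values. For example, taking the same-period workday distribution curve values $y_0$ to $y_2$ for 6.20 in the second historical auto insurance policy data: the value for date difference T+0 is $y_0 = k_0$, the value for T+1 is $y_1 = k_0 + k_1$, and the value for T+2 is $y_2 = k_0 + k_1 + k_2$. The same-period workday distribution curve values tend toward 1.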
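The ratio values and the worked distribution curve values can be reproduced with a short sketch. It assumes that each curve value is the running sum of the ratio values up to that date difference, which is what the three worked values show; the formula image itself is not available, so this is an interpretation rather than the disclosed function. The amounts and the names `ratio_values` and `curve_values` are hypothetical.

```python
from itertools import accumulate


def ratio_values(amount_by_diff, max_diff):
    """Ratio per date difference: amount at T+n divided by the total amount."""
    total = sum(amount_by_diff.values())
    return [amount_by_diff.get(n, 0.0) / total for n in range(max_diff + 1)]


def curve_values(ratios):
    """Running sum of the ratios, as in the worked values for T+0 to T+2."""
    return list(accumulate(ratios))


# Hypothetical amounts keyed by date difference n (T+0, T+1, T+2).
amounts = {0: 50.0, 1: 30.0, 2: 20.0}
k = ratio_values(amounts, max_diff=2)
y = curve_values(k)
```

Because the ratios sum to 1 when every date difference is covered, the last curve value reaches 1, which agrees with the remark that the curve values tend toward 1.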
104、通过预置预测模型、第一历史业务订单数据、第二历史业务订单数据、周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值,对预设预测时段的业务费用数据进行预测,得到历史周期工作日日均业务费用数据对应的第一预测数据集、历史周期节假日日均业务费用数据对应的第二预测数据集、历史同期工作日日均业务费用数据对应的第三预测数据集,以及历史同期节假日日均业务费用数据对应的第四预测数据集。
服务器通过预置预测模型、历史周期工作日日均业务费用数据和周期工作日分布曲线值,对预设预测时段的业务费用数据进行预测,得到第一预测数据集,通过预测模型、历史周期节假日日均业务费用数据和周期节假日分布曲线值,对预设预测时段的业务费用数据进行预测,得到第二预测数据集,通过预测模型、历史同期工作日日均业务费用数据和同期工作日分布曲线值,对预设预测时段的业务费用数据进行预测,得到第三预测数据集,通过预测模型、历史同期节假日日均业务费用数据和同期节假日分布曲线值,对预设预测时段的业务费用数据进行预测,得到第四预测数据集。
105、将第一预测数据集、第二预测数据集、第三预测数据集和第四预测数据集进行合并处理,得到业务费用预测数据。
服务器可分别将第一预测数据集、第二预测数据集、第三预测数据集和第四预测数据集转换为矩阵,得到第一矩阵、第二矩阵、第三矩阵和第四矩阵,将第一矩阵、第二矩阵、第三矩阵和第四矩阵进行矩阵相加,从而得到业务费用预测矩阵,将业务费用预测矩阵进行向量转换,得到业务费用预测数据。
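In the embodiments of the present application, the influence of workdays and holidays, as well as the distribution pattern between business order quantity and business fee data, can be incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the date-dependent relationship between business order quantity and business fee data, reduces the deviation rate of the predicted business fee data, and thereby improves the prediction accuracy for business fee data.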
请参阅图2,本申请实施例中基于分布曲线的预测方法的另一个实施例包括:
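201. Obtain the preprocessed first historical business order data of a first preset historical period and the second historical business order data of a second preset historical period, where the first preset historical period is contained in the second preset historical period, the end date of the second preset historical period is the day before the start date of the preset prediction period, the first historical business order data includes historical-cycle workday daily-average business fee data and historical-cycle holiday daily-average business fee data, and the second historical business order data includes historical same-period workday daily-average business fee data and historical same-period holiday daily-average business fee data.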
具体地,服务器获取目标预设时段的初始历史业务订单数据,对初始历史业务订单数据进行数据清洗,得到候选历史业务订单数据;将候选历史业务订单数据依次进行时段分类和日期类型分类,得到第一预设历史时段的目标业务订单数据和第二预设历史时段的目标业务订单数据,日期类型包括工作日和节假日,业务订单数据包括业务订单信息;计算第一预设历史时段的目标业务订单数据的历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,以及第二预设历史时段的目标业务订单数据的历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;将第一预设历史时段的业务订单信息、历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据确定为第一历史业务订单数据,将第二预设历史时段的业务订单信息、历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据确定为第二历史业务订单数据。
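For example, the initial historical business order data may be initial historical auto insurance policy data, the candidate historical business order data may be candidate historical auto insurance policy data, the target business order data may be target auto insurance policy data, the business order information may be auto insurance policy information, the historical-cycle workday daily-average business fee data may be historical-cycle workday daily-average amount data, the historical-cycle holiday daily-average business fee data may be historical-cycle holiday daily-average amount data, the historical same-period workday daily-average business fee data may be historical same-period workday daily-average amount data, the historical same-period holiday daily-average business fee data may be historical same-period holiday daily-average amount data, the first historical business order data may be first historical auto insurance policy data, and the second historical business order data may be second historical auto insurance policy data. The server cleans the initial historical auto insurance policy data by outlier removal, null-value filling, deduplication and dimension-value modification, obtaining the candidate historical auto insurance policy data. According to the first preset historical period and the second preset historical period, the candidate data is classified into initial auto insurance policy data of the first preset historical period and initial auto insurance policy data of the second preset historical period. Using a preset label extraction algorithm, the server extracts the label information of the initial auto insurance policy data of the first preset historical period and, according to the workday identifier and holiday identifier in the label information, classifies that data into workday data and holiday data, yielding the target auto insurance policy data of the first preset historical period. The target auto insurance policy data of the second preset historical period is obtained in the same way.
The server computes the daily-average amount data of the workday data in the target auto insurance policy data of the first preset historical period to obtain the historical-cycle workday daily-average amount data, and computes the daily-average amount data of the holiday data in the same target data to obtain the historical-cycle holiday daily-average amount data. The historical same-period workday daily-average amount data and historical same-period holiday daily-average amount data of the target auto insurance policy data of the second preset historical period are obtained in the same way. The auto insurance policy information, historical-cycle workday daily-average amount data and historical-cycle holiday daily-average amount data of the first preset historical period are determined as the first historical auto insurance policy data, and the auto insurance policy information, historical same-period workday daily-average amount data and historical same-period holiday daily-average amount data of the second preset historical period are determined as the second historical auto insurance policy data.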
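The cleaning, day-type classification and daily averaging described above could look roughly like the sketch below. It simplifies several points and these simplifications are assumptions: records with a missing amount are dropped rather than filled, duplicates are detected by a hypothetical policy id, holidays are Saturdays and Sundays only, and days without any policy are not counted in the average. The record layout and the names `clean` and `daily_averages` are illustrative.

```python
from collections import defaultdict
from datetime import date


def clean(records):
    """Drop duplicate policy ids and rows with a missing or negative amount."""
    seen, kept = set(), []
    for policy_id, day, amount in records:
        if amount is None or amount < 0 or policy_id in seen:
            continue
        seen.add(policy_id)
        kept.append((policy_id, day, amount))
    return kept


def daily_averages(records):
    """Return (workday daily average, holiday daily average) of the amounts."""
    per_day = defaultdict(float)
    for _, day, amount in records:
        per_day[day] += amount
    groups = {"workday": [], "holiday": []}
    for day, total in per_day.items():
        groups["holiday" if day.weekday() >= 5 else "workday"].append(total)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(groups["workday"]), mean(groups["holiday"])


# Hypothetical policies in the 7.13-7.19 window (year 2020 assumed).
raw = [
    ("p1", date(2020, 7, 13), 100.0),
    ("p2", date(2020, 7, 13), 50.0),
    ("p2", date(2020, 7, 13), 50.0),   # duplicate row
    ("p3", date(2020, 7, 14), 250.0),
    ("p4", date(2020, 7, 18), 80.0),
    ("p5", date(2020, 7, 19), None),   # missing amount
]
workday_avg, holiday_avg = daily_averages(clean(raw))
```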
202、获取第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值。
203、通过预置预测模型和第二历史业务订单数据,计算多个日期差值分别对应的业务费用数据比例值,通过预置预测模型中的指数函数和多个日期差值,计算业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值。
Steps 202-203 are performed similarly to steps 102-103 above and are not described again here.
204、通过预置预测模型、第一历史业务订单数据、第二历史业务订单数据、多个日期差值和业务费用数据比例值,对预设预测时段的日均业务费用数据进行预测,得到预设预测时段的多个日业务费用预估数据集。
具体地,服务器通过预置预测模型、第一历史业务订单数据和第二历史业务订单数据,对预设预测时段的日均业务费用数据依次进行预测和均值计算,得到历史周期工作日日均业务费用数据对应的第一日均预测数据、历史周期节假日日均业务费用数据对应的第二日均预测数据、历史同期工作日日均业务费用数据对应的第三日均预测数据和历史同期节假日日均业务费用数据对应的第四日均预测数据;根据第一日均预测数据、第二日均预测数据、第三日均预测数据、第四日均预测数据、多个日期差值和业务费用数据比例值,分别对预设预测时段的日均业务费用数据进行预测,得到预设预测时段的多个日业务费用预估数据集。
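The multiple daily business fee estimate data sets comprise a first, a second, a third and a fourth daily business fee estimate data set, which correspond to the first, second, third and fourth daily-average prediction data respectively. For example, the first historical business order data may be first historical auto insurance policy data and the preset prediction period is 7.20-7.31. Using a preset triple exponential smoothing prediction algorithm and the historical-cycle workday daily-average business fee data in the first historical auto insurance policy data, the server predicts daily prediction data 1 to 12 for the individual days 7.20-7.31 and computes (daily prediction data 1 + daily prediction data 2 + ... + daily prediction data 12)/12 to obtain the first daily-average prediction data. The second, third and fourth daily-average prediction data are obtained analogously. With daily business fee estimate = daily-average prediction data × date difference × distribution curve ratio value, the server computes, for 7.20 and the first daily-average prediction data, daily business fee estimate 0 for T+0 (first daily-average prediction data × 0 × y_0), daily business fee estimate 1 for T+1 (first daily-average prediction data × 1 × y_1), daily business fee estimate 2 for T+2 (first daily-average prediction data × 2 × y_2), and so on up to daily business fee estimate 30 for T+30 (first daily-average prediction data × 30 × y_30). In the same way, daily business fee estimates 0 to 30 corresponding to the second, third and fourth daily-average prediction data are obtained for each of the days 7.21-7.31, giving the multiple daily business fee estimate data sets of the preset prediction period.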
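The averaging step and the estimate formula as written (daily-average prediction × date difference × curve value) can be sketched as follows. The twelve daily forecasts are hypothetical placeholders standing in for the output of the triple exponential smoothing step, which is not implemented here, and the curve values and function names are likewise assumptions.

```python
def daily_mean(daily_forecasts):
    """Arithmetic mean of the per-day forecasts of the prediction period."""
    return sum(daily_forecasts) / len(daily_forecasts)


def daily_estimates(mean_forecast, curve):
    """Estimate for each date difference n: mean_forecast * n * curve[n]."""
    return [mean_forecast * n * y_n for n, y_n in enumerate(curve)]


# Hypothetical forecasts for the 12 days 7.20-7.31.
forecasts = [120.0] * 6 + [80.0] * 6
first_daily_mean = daily_mean(forecasts)

# Hypothetical distribution curve values for T+0, T+1 and T+2.
curve = [0.5, 0.8, 1.0]
estimates = daily_estimates(first_daily_mean, curve)
```

Taken literally, the formula makes the T+0 estimate zero for any input, because the date difference enters as a multiplicative factor.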
205、根据预设预测时段的多个日业务费用预估数据集、周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值,对预设预测时段的业务费用数据进行预测,得到历史周期工作日日均业务费用数据对应的第一预测数据集、历史周期节假日日均业务费用数据对应的第二预测数据集、历史同期工作日日均业务费用数据对应的第三预测数据集,以及历史同期节假日日均业务费用数据对应的第四预测数据集。
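Specifically, the server obtains the historical-cycle workday cumulative business fee data and historical-cycle holiday cumulative business fee data of the first historical business order data, and the historical same-period workday cumulative business fee data and historical same-period holiday cumulative business fee data of the second historical business order data. According to the historical-cycle workday cumulative business fee data, the historical-cycle holiday cumulative business fee data, the multiple daily business fee estimate data sets, the cycle workday distribution curve values, the cycle holiday distribution curve values and the prediction algorithm, it computes the business fee data of the preset prediction period to obtain the first prediction data set corresponding to the historical-cycle workday daily-average business fee data and the second prediction data set corresponding to the historical-cycle holiday daily-average business fee data. According to the historical same-period workday cumulative business fee data, the historical same-period holiday cumulative business fee data, the multiple daily business fee estimate data sets, the same-period workday distribution curve values, the same-period holiday distribution curve values and the prediction algorithm, it computes the business fee data of the preset prediction period to obtain the third prediction data set corresponding to the historical same-period workday daily-average business fee data and the fourth prediction data set corresponding to the historical same-period holiday daily-average business fee data.
For example, the business fee may be auto insurance premium income, the preset prediction period is 7.20-7.31, the first preset historical period is 7.13-7.19, and the second preset historical period is 6.20-7.19. The historical-cycle workday cumulative business fee data is the cumulative amount of auto insurance premium income for policies signed on 7.13-7.17 whose coverage starts in the statistics month. The historical-cycle holiday cumulative business fee data is the cumulative amount for policies signed on 7.18-7.19 whose coverage starts in the statistics month. The historical same-period workday cumulative business fee data is the cumulative amount for policies signed on 6.22-6.26, 6.29-7.3, 7.6-7.10 and 7.13-7.17 whose coverage starts in the statistics month. The historical same-period holiday cumulative business fee data is the cumulative amount for policies signed on 6.20-6.21, 6.27-6.28, 7.4-7.5, 7.11-7.12 and 7.18-7.19 whose coverage starts in August. These four cumulative values can be obtained with a preset summation function. Through the prediction algorithm in the preset prediction model (prediction data = cumulative business fee data + daily business fee estimate data set × distribution curve value), the server computes, for 7.13 in the first prediction data set, first prediction data 0 (first prediction data 0 = historical-cycle workday cumulative business fee data × daily business fee estimate 0 corresponding to the first daily-average prediction data × cycle workday distribution curve value y_0 for T+0), first prediction data 1 (first prediction data 1 = historical-cycle workday cumulative business fee data × daily business fee estimate 1 corresponding to the first daily-average prediction data × cycle workday distribution curve value y_1 corresponding to the first daily-average prediction data), and so on up to first prediction data 30 (first prediction data 30 = historical-cycle workday cumulative business fee data × daily business fee estimate 30 corresponding to the first daily-average prediction data × cycle workday distribution curve value y_30 corresponding to the first daily-average prediction data). Proceeding likewise gives first prediction data 0 to 30 for each day of 7.14-7.17 in the first prediction data set, and the second, third and fourth prediction data sets are obtained in the same way.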
206、将第一预测数据集、第二预测数据集、第三预测数据集和第四预测数据集进行合并处理,得到业务费用预测数据。
具体地,服务器根据第一预测数据集、第二预测数据集、第三预测数据集和第四预测数据集,分别生成第一预测序列、第二预测序列、第三预测序列和第四预测序列;将第一预测序列、第二预测序列、第三预测序列和第四预测序列依次相加,得到合并序列;计算合并序列的算术均值,得到业务费用预测数据。
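The first, second, third and fourth prediction sequences each consist of prediction data sorted in ascending order of date difference, and each item of the sorted prediction data is labelled with its date difference. For example, the server generates the first, second, third and fourth prediction sequences from the first, second, third and fourth prediction data sets respectively. The first prediction sequence is [formula image: Figure PCTCN2021090828-appb-000004]; the second prediction sequence is [formula images: Figure PCTCN2021090828-appb-000005, Figure PCTCN2021090828-appb-000006]; the third prediction sequence is [formula image: Figure PCTCN2021090828-appb-000007]; the fourth prediction sequence and the summation of the sequences are [formula images: Figure PCTCN2021090828-appb-000008 to Figure PCTCN2021090828-appb-000011]. The summation gives the merged sequence, and the arithmetic mean of the merged sequence is computed to obtain the target business fee prediction data. Specifically, after the server merges the first, second, third and fourth prediction data sets to obtain the business fee prediction data, it also obtains the deviation rate of the business fee prediction data and optimizes the preset prediction model according to the deviation rate and a preset optimization algorithm.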
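The merge step (element-wise addition of the four prediction sequences followed by an arithmetic mean) can be sketched as below. The sequence values are hypothetical, and the assumption that all four sequences have equal length and are already ordered by date difference is made explicit by a length check; the name `merge_sequences` is illustrative.

```python
def merge_sequences(*sequences):
    """Add sequences element-wise and return (merged sequence, arithmetic mean)."""
    lengths = {len(seq) for seq in sequences}
    if len(lengths) != 1:
        raise ValueError("all prediction sequences must have the same length")
    merged = [sum(values) for values in zip(*sequences)]
    return merged, sum(merged) / len(merged)


# Hypothetical prediction sequences, each ordered by date difference T+0..T+2.
first_seq = [1.0, 2.0, 3.0]
second_seq = [2.0, 2.0, 2.0]
third_seq = [0.0, 1.0, 0.0]
fourth_seq = [1.0, 1.0, 1.0]

merged, forecast = merge_sequences(first_seq, second_seq, third_seq, fourth_seq)
```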
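The server obtains the actual business fee data corresponding to the business fee prediction data and computes the deviation rate of the business fee prediction data from the predicted and actual data. It then optimizes and updates the preset prediction model according to the deviation rate and the preset optimization algorithm: the model parameters or weights of the preset prediction model may be updated, its model structure, network layers and algorithms may be updated, and the execution process by which it predicts the business fee prediction data may also be updated. The preset optimization algorithm may be any one of gradient descent, Newton's method, Momentum, Nesterov Momentum, the adaptive gradient algorithm Adagrad and the Adam optimization algorithm (Adam), or a combination of several of them. Optimizing and updating the preset prediction model improves its prediction accuracy.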
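The application does not define how the deviation rate is calculated. A common choice, shown here purely as an assumption, is the relative deviation of the prediction from the actual value; the function name and the sample figures are hypothetical.

```python
def deviation_rate(predicted, actual):
    """Relative deviation |predicted - actual| / |actual| (assumed definition)."""
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return abs(predicted - actual) / abs(actual)


# Hypothetical predicted and actual premium income for one prediction period.
rate = deviation_rate(predicted=1050.0, actual=1000.0)
```

A rate like this could then serve as the signal that triggers or guides the model update described above.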
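In the embodiments of the present application, the influence of workdays and holidays, as well as the distribution pattern between business order quantity and business fee data, can be incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the date-dependent relationship between business order quantity and business fee data, reduces the deviation rate of the predicted business fee data, and thereby improves the prediction accuracy for business fee data.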
上面对本申请实施例中基于分布曲线的预测方法进行了描述,下面对本申请实施例中基于分布曲线的预测装置进行描述,请参阅图3,本申请实施例中基于分布曲线的预测装置一个实施例包括:
获取模块301,用于获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,第一预设历史时段包含于第二预设历史时段,第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
第一计算模块302,用于获取第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
第二计算模块303,用于通过预置预测模型和第二历史业务订单数据,计算多个日期差值分别对应的业务费用数据比例值,通过预置预测模型中的指数函数和多个日期差值,计算业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值;
预测模块304,用于通过预置预测模型、第一历史业务订单数据、第二历史业务订单数据、周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值,对预设预测时段的业务费用数据进行预测,得到历史周期工作日日均业务费用数据对应的第一预测数据集、历史周期节假日日均业务费用数据对应的第二预测数据集、历史同期工作日日均业务费用数据对应的第三预测数据集,以及历史同期节假日日均业务费用数据对应的第四预测数据集;
合并处理模块305,用于将第一预测数据集、第二预测数据集、第三预测数据集和第四预测数据集进行合并处理,得到业务费用预测数据。
上述基于分布曲线的预测装置中各个模块的功能实现与上述基于分布曲线的预测方法实施例中各步骤相对应,其功能和实现过程在此处不再一一赘述。
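In the embodiments of the present application, the influence of workdays and holidays, as well as the distribution pattern between business order quantity and business fee data, can be incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the date-dependent relationship between business order quantity and business fee data, reduces the deviation rate of the predicted business fee data, and thereby improves the prediction accuracy for business fee data.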
请参阅图4,本申请实施例中基于分布曲线的预测装置的另一个实施例包括:
获取模块301,用于获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,第一预设历史时段包含于第二预设历史时段,第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
第一计算模块302,用于获取第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
第二计算模块303,用于通过预置预测模型和第二历史业务订单数据,计算多个日期差值分别对应的业务费用数据比例值,通过预置预测模型中的指数函数和多个日期差值,计算业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值;
预测模块304,用于通过预置预测模型、第一历史业务订单数据、第二历史业务订单数据、周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值,对预设预测时段的业务费用数据进行预测,得到历史周期工作日日均业务费用数据对应的第一预测数据集、历史周期节假日日均业务费用数据对应的第二预测数据集、历史同期工作日日均业务费用数据对应的第三预测数据集,以及历史同期节假日日均业务费用数据对应的第四预测数据集;
其中,预测模块304具体包括:
第一预测单元3041,用于通过预置预测模型、第一历史业务订单数据、第二历史业务订单数据、多个日期差值和业务费用数据比例值,对预设预测时段的日均业务费用数据进行预测,得到预设预测时段的多个日业务费用预估数据集;
第二预测单元3042,用于根据预设预测时段的多个日业务费用预估数据集、周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值,对预设预测时段的业务费用数据进行预测,得到历史周期工作日日均业务费用数据对应的第一预测数据集、历史周期节假日日均业务费用数据对应的第二预测数据集、历史同期工作日日均业务费用数据对应的第三预测数据集,以及历史同期节假日日均业务费用数据对应的第四预测数据集;
合并处理模块305,用于将第一预测数据集、第二预测数据集、第三预测数据集和第四预测数据集进行合并处理,得到业务费用预测数据。
可选的,第一预测单元3041还可以具体用于:
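predicting and then averaging the daily-average business fee data of the preset prediction period through the preset prediction model, the first historical business order data and the second historical business order data, to obtain first daily-average prediction data corresponding to the historical-cycle workday daily-average business fee data, second daily-average prediction data corresponding to the historical-cycle holiday daily-average business fee data, third daily-average prediction data corresponding to the historical same-period workday daily-average business fee data, and fourth daily-average prediction data corresponding to the historical same-period holiday daily-average business fee data;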
根据第一日均预测数据、第二日均预测数据、第三日均预测数据、第四日均预测数据、多个日期差值和业务费用数据比例值,分别对预设预测时段的日均业务费用数据进行预测,得到预设预测时段的多个日业务费用预估数据集。
可选的,第二预测单元3042还可以具体用于:
获取第一历史业务订单数据的历史周期工作日累计业务费用数据和历史周期节假日累计业务费用数据,以及第二历史业务订单数据的历史同期工作日累计业务费用数据和历史同期节假日累计业务费用数据;
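computing the business fee data of the preset prediction period according to the historical-cycle workday cumulative business fee data, the historical-cycle holiday cumulative business fee data, the multiple daily business fee estimate data sets, the cycle workday distribution curve values, the cycle holiday distribution curve values and the prediction algorithm, to obtain the first prediction data set corresponding to the historical-cycle workday daily-average business fee data and the second prediction data set corresponding to the historical-cycle holiday daily-average business fee data;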
根据历史同期工作日累计业务费用数据、历史同期节假日累计业务费用数据、多个日业务费用预估数据集、同期工作日分布曲线值、同期节假日分布曲线值和预测算法,计算预设预测时段的业务费用数据,得到历史同期工作日日均业务费用数据对应的第三预测数据集和历史同期节假日日均业务费用数据对应的第四预测数据集。
可选的,获取模块301还可以具体用于:
获取目标预设时段的初始历史业务订单数据,对初始历史业务订单数据进行数据清洗,得到候选历史业务订单数据;
将候选历史业务订单数据依次进行时段分类和日期类型分类,得到第一预设历史时段的目标业务订单数据和第二预设历史时段的目标业务订单数据,日期类型包括工作日和节假日,业务订单数据包括业务订单信息;
计算第一预设历史时段的目标业务订单数据的历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,以及第二预设历史时段的目标业务订单数据的历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
将第一预设历史时段的业务订单信息、历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据确定为第一历史业务订单数据,将第二预设历史时段的业务订单信息、历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据确定为第二历史业务订单数据。
可选的,合并处理模块305还可以具体用于:
根据第一预测数据集、第二预测数据集、第三预测数据集和第四预测数据集,分别生成第一预测序列、第二预测序列、第三预测序列和第四预测序列;
将第一预测序列、第二预测序列、第三预测序列和第四预测序列依次相加,得到合并序列;
计算合并序列的算术均值,得到业务费用预测数据。
可选的,基于分布曲线的预测装置,还包括:
优化模块306,用于获取业务费用预测数据的偏差率,并根据偏差率和预置的优化算法,对预置预测模型进行优化。
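The functions of the modules and units in the above distribution-curve-based prediction apparatus correspond to the steps in the above embodiments of the distribution-curve-based prediction method; their functions and implementation are not repeated here.
In the embodiments of the present application, the influence of workdays and holidays, as well as the distribution pattern between business order quantity and business fee data, can be incorporated into the prediction of the preset prediction model. This avoids the problem that historical business order data is easily affected by the date-dependent relationship between business order quantity and business fee data, reduces the deviation rate of the predicted business fee data, and thereby improves the prediction accuracy for business fee data.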
上面图3和图4从模块化功能实体的角度对本申请实施例中的基于分布曲线的预测装置进行详细描述,下面从硬件处理的角度对本申请实施例中基于分布曲线的预测设备进行详细描述。
图5是本申请实施例提供的一种基于分布曲线的预测设备的结构示意图,该基于分布曲线的预测设备500可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上处理器(central processing units,CPU)510(例如,一个或一个以上处理器)和存储器520,一个或一个以上存储应用程序533或数据532的存储介质530(例如一个或一个以上海量存储设备)。其中,存储器520和存储介质530可以是短暂存储或持久存储。存储在存储介质530的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对基于分布曲线的预测设备500中的一系列指令操作。更进一步地,处理器510可以设置为与存储介质530通信,在基于分布曲线的预测设备500上执行存储介质530中的一系列指令操作。
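The distribution-curve-based prediction device 500 may further include one or more power supplies 540, one or more wired or wireless network interfaces 550, one or more input/output interfaces 560, and/or one or more operating systems 531, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD and so on. Those skilled in the art will understand that the structure shown in FIG. 5 does not limit the distribution-curve-based prediction device, which may include more or fewer components than shown, combine certain components, or arrange the components differently.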
本申请还提供一种基于分布曲线的预测设备,包括:存储器和至少一个处理器,所述存储器中存储有指令,所述存储器和所述至少一个处理器通过线路互连;所述至少一个处理器调用所述存储器中的所述指令,以使得所述基于分布曲线的预测设备执行上述基于分布曲线的预测方法中的步骤。
本申请还提供一种计算机可读存储介质,该计算机可读存储介质可以为非易失性计算机可读存储介质,也可以为易失性计算机可读存储介质。计算机可读存储介质存储有计算机指令,当所述计算机指令在计算机上运行时,使得计算机执行如下步骤:
获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,所述第一预设历史时段包含于所述第二预设历史时段,所述第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,所述第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,所述第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
获取所述第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算所述多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
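computing, through a preset prediction model and the second historical business order data, the business fee data ratio values respectively corresponding to the multiple date differences, and computing, through the exponential function in the preset prediction model and the multiple date differences, the cycle workday distribution curve values, cycle holiday distribution curve values, same-period workday distribution curve values and same-period holiday distribution curve values of the business fee data ratio values;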
通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集;
将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统,装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
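The above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.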

Claims (20)

  1. 一种基于分布曲线的预测方法,其中,所述基于分布曲线的预测方法包括:
    获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,所述第一预设历史时段包含于所述第二预设历史时段,所述第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,所述第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,所述第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
    获取所述第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算所述多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
    通过预置预测模型和所述第二历史业务订单数据,计算所述多个日期差值分别对应的业务费用数据比例值,通过所述预置预测模型中的指数函数和所述多个日期差值,计算所述业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值;
    通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集;
    将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据。
  2. 根据权利要求1所述的基于分布曲线的预测方法,其中,所述通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集,包括:
    通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述多个日期差值和所述业务费用数据比例值,对所述预设预测时段的日均业务费用数据进行预测,得到所述预设预测时段的多个日业务费用预估数据集;
    根据所述预设预测时段的多个日业务费用预估数据集、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集。
  3. 根据权利要求2所述的基于分布曲线的预测方法,其中,所述通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述多个日期差值和所述业务费用数据比例值,对所述预设预测时段的日均业务费用数据进行预测,得到所述预设预测时段的多个日业务费用预估数据集,包括:
    通过所述预置预测模型、所述第一历史业务订单数据和所述第二历史业务订单数据,对所述预设预测时段的日均业务费用数据依次进行预测和均值计算,得到所述历史周期工作日日均业务费用数据对应的第一日均预测数据、所述历史周期节假日日均业务费用数据对应的第二日均预测数据、所述历史同期工作日日均业务费用数据对应的第三日均预测数据和所述历史同期节假日日均业务费用数据对应的第四日均预测数据;
    根据所述第一日均预测数据、所述第二日均预测数据、所述第三日均预测数据、所述第四日均预测数据、所述多个日期差值和所述业务费用数据比例值,分别对所述预设预测时段的日均业务费用数据进行预测,得到所述预设预测时段的多个日业务费用预估数据集。
  4. 根据权利要求2所述的基于分布曲线的预测方法,其中,所述根据所述预设预测时段的多个日业务费用预估数据集、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集,包括:
    获取所述第一历史业务订单数据的历史周期工作日累计业务费用数据和历史周期节假日累计业务费用数据,以及所述第二历史业务订单数据的历史同期工作日累计业务费用数据和历史同期节假日累计业务费用数据;
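    computing the business fee data of the preset prediction period according to the historical-cycle workday cumulative business fee data, the historical-cycle holiday cumulative business fee data, the multiple daily business fee estimate data sets, the cycle workday distribution curve values, the cycle holiday distribution curve values and a prediction algorithm, to obtain the first prediction data set corresponding to the historical-cycle workday daily-average business fee data and the second prediction data set corresponding to the historical-cycle holiday daily-average business fee data;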
    根据所述历史同期工作日累计业务费用数据、所述历史同期节假日累计业务费用数据、所述多个日业务费用预估数据集、所述同期工作日分布曲线值、所述同期节假日分布曲线值和预测算法,计算所述预设预测时段的业务费用数据,得到所述历史同期工作日日均业务费用数据对应的第三预测数据集和所述历史同期节假日日均业务费用数据对应的第四预测数据集。
  5. 根据权利要求1所述的基于分布曲线的预测方法,其中,所述获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,包括:
    获取目标预设时段的初始历史业务订单数据,对所述初始历史业务订单数据进行数据清洗,得到候选历史业务订单数据;
    将所述候选历史业务订单数据依次进行时段分类和日期类型分类,得到第一预设历史时段的目标业务订单数据和第二预设历史时段的目标业务订单数据,所述日期类型包括工作日和节假日,所述业务订单数据包括业务订单信息;
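    calculating the historical-cycle workday daily-average business fee data and historical-cycle holiday daily-average business fee data of the target business order data of the first preset historical period, and the historical same-period workday daily-average business fee data and historical same-period holiday daily-average business fee data of the target business order data of the second preset historical period;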
    将所述第一预设历史时段的业务订单信息、历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据确定为第一历史业务订单数据,将所述第二预设历史时段的业务订单信息、历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据确定为第二历史业务订单数据。
  6. 根据权利要求1所述的基于分布曲线的预测方法,其中,所述将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据,包括:
    根据所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集,分别生成第一预测序列、第二预测序列、第三预测序列和第四预测序列;
    将所述第一预测序列、所述第二预测序列、所述第三预测序列和所述第四预测序列依次相加,得到合并序列;
    计算所述合并序列的算术均值,得到业务费用预测数据。
  7. 根据权利要求1-6中任一项所述的基于分布曲线的预测方法,其中,所述将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据之后,还包括:
    获取所述业务费用预测数据的偏差率,并根据所述偏差率和预置的优化算法,对所述预置预测模型进行优化。
  8. 一种基于分布曲线的预测设备,包括存储器、处理器及存储在所述存储器上并可在所述处理器上运行的计算机可读指令,所述处理器执行所述计算机可读指令时实现如下步骤:
    获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,所述第一预设历史时段包含于所述第二预设历史时段,所述第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,所述第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,所述第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
    获取所述第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算所述多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
    通过预置预测模型和所述第二历史业务订单数据,计算所述多个日期差值分别对应的业务费用数据比例值,通过所述预置预测模型中的指数函数和所述多个日期差值,计算所述业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值;
    通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集;
    将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据。
  9. 根据权利要求8所述的基于分布曲线的预测设备,所述处理器执行所述计算机程序时还实现以下步骤:
    通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述多个日期差值和所述业务费用数据比例值,对所述预设预测时段的日均业务费用数据进行预测,得到所述预设预测时段的多个日业务费用预估数据集;
    根据所述预设预测时段的多个日业务费用预估数据集、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集。
  10. 根据权利要求9所述的基于分布曲线的预测设备,所述处理器执行所述计算机程序时还实现以下步骤:
    通过所述预置预测模型、所述第一历史业务订单数据和所述第二历史业务订单数据,对所述预设预测时段的日均业务费用数据依次进行预测和均值计算,得到所述历史周期工作日日均业务费用数据对应的第一日均预测数据、所述历史周期节假日日均业务费用数据对应的第二日均预测数据、所述历史同期工作日日均业务费用数据对应的第三日均预测数据和所述历史同期节假日日均业务费用数据对应的第四日均预测数据;
    根据所述第一日均预测数据、所述第二日均预测数据、所述第三日均预测数据、所述第四日均预测数据、所述多个日期差值和所述业务费用数据比例值,分别对所述预设预测时段的日均业务费用数据进行预测,得到所述预设预测时段的多个日业务费用预估数据集。
  11. 根据权利要求9所述的基于分布曲线的预测设备,所述处理器执行所述计算机程序时还实现以下步骤:
    获取所述第一历史业务订单数据的历史周期工作日累计业务费用数据和历史周期节假日累计业务费用数据,以及所述第二历史业务订单数据的历史同期工作日累计业务费用数据和历史同期节假日累计业务费用数据;
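    computing the business fee data of the preset prediction period according to the historical-cycle workday cumulative business fee data, the historical-cycle holiday cumulative business fee data, the multiple daily business fee estimate data sets, the cycle workday distribution curve values, the cycle holiday distribution curve values and a prediction algorithm, to obtain the first prediction data set corresponding to the historical-cycle workday daily-average business fee data and the second prediction data set corresponding to the historical-cycle holiday daily-average business fee data;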
    根据所述历史同期工作日累计业务费用数据、所述历史同期节假日累计业务费用数据、所述多个日业务费用预估数据集、所述同期工作日分布曲线值、所述同期节假日分布曲线值和预测算法,计算所述预设预测时段的业务费用数据,得到所述历史同期工作日日均业务费用数据对应的第三预测数据集和所述历史同期节假日日均业务费用数据对应的第四预测数据集。
  12. 根据权利要求8所述的基于分布曲线的预测设备,所述处理器执行所述计算机程序时还实现以下步骤:
    获取目标预设时段的初始历史业务订单数据,对所述初始历史业务订单数据进行数据清洗,得到候选历史业务订单数据;
    将所述候选历史业务订单数据依次进行时段分类和日期类型分类,得到第一预设历史时段的目标业务订单数据和第二预设历史时段的目标业务订单数据,所述日期类型包括工作日和节假日,所述业务订单数据包括业务订单信息;
    计算所述第一预设历史时段的目标业务订单数据的历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,以及所述第二预设历史时段的目标业务订单数据的历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
    将所述第一预设历史时段的业务订单信息、历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据确定为第一历史业务订单数据,将所述第二预设历史时段的业务订单信息、历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据确定为第二历史业务订单数据。
  13. 根据权利要求8所述的基于分布曲线的预测设备,所述处理器执行所述计算机程序时还实现以下步骤:
    根据所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集,分别生成第一预测序列、第二预测序列、第三预测序列和第四预测序列;
    将所述第一预测序列、所述第二预测序列、所述第三预测序列和所述第四预测序列依次相加,得到合并序列;
    计算所述合并序列的算术均值,得到业务费用预测数据。
  14. 根据权利要求8-13中任一项所述的基于分布曲线的预测设备,所述处理器执行所述计算机程序时还实现以下步骤:
    获取所述业务费用预测数据的偏差率,并根据所述偏差率和预置的优化算法,对所述预置预测模型进行优化。
  15. 一种计算机可读存储介质,所述计算机可读存储介质中存储计算机指令,当所述计算机指令在计算机上运行时,使得计算机执行如下步骤:
    获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,所述第一预设历史时段包含于所述第二预设历史时段,所述第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,所述第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,所述第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
    获取所述第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算所述多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
    通过预置预测模型和所述第二历史业务订单数据,计算所述多个日期差值分别对应的业务费用数据比例值,通过所述预置预测模型中的指数函数和所述多个日期差值,计算所述业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值;
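    predicting the business fee data of the preset prediction period through the preset prediction model, the first historical business order data, the second historical business order data, the cycle workday distribution curve values, the cycle holiday distribution curve values, the same-period workday distribution curve values and the same-period holiday distribution curve values, to obtain a first prediction data set corresponding to the historical-cycle workday daily-average business fee data, a second prediction data set corresponding to the historical-cycle holiday daily-average business fee data, a third prediction data set corresponding to the historical same-period workday daily-average business fee data, and a fourth prediction data set corresponding to the historical same-period holiday daily-average business fee data;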
    将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据。
  16. 根据权利要求15所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行如下步骤:
    通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述多个日期差值和所述业务费用数据比例值,对所述预设预测时段的日均业务费用数据进行预测,得到所述预设预测时段的多个日业务费用预估数据集;
    根据所述预设预测时段的多个日业务费用预估数据集、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集。
  17. 根据权利要求16所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行如下步骤:
    通过所述预置预测模型、所述第一历史业务订单数据和所述第二历史业务订单数据,对所述预设预测时段的日均业务费用数据依次进行预测和均值计算,得到所述历史周期工作日日均业务费用数据对应的第一日均预测数据、所述历史周期节假日日均业务费用数据对应的第二日均预测数据、所述历史同期工作日日均业务费用数据对应的第三日均预测数据和所述历史同期节假日日均业务费用数据对应的第四日均预测数据;
    根据所述第一日均预测数据、所述第二日均预测数据、所述第三日均预测数据、所述第四日均预测数据、所述多个日期差值和所述业务费用数据比例值,分别对所述预设预测时段的日均业务费用数据进行预测,得到所述预设预测时段的多个日业务费用预估数据集。
  18. 根据权利要求16所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行如下步骤:
    获取所述第一历史业务订单数据的历史周期工作日累计业务费用数据和历史周期节假日累计业务费用数据,以及所述第二历史业务订单数据的历史同期工作日累计业务费用数据和历史同期节假日累计业务费用数据;
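    computing the business fee data of the preset prediction period according to the historical-cycle workday cumulative business fee data, the historical-cycle holiday cumulative business fee data, the multiple daily business fee estimate data sets, the cycle workday distribution curve values, the cycle holiday distribution curve values and a prediction algorithm, to obtain the first prediction data set corresponding to the historical-cycle workday daily-average business fee data and the second prediction data set corresponding to the historical-cycle holiday daily-average business fee data;
    computing the business fee data of the preset prediction period according to the historical same-period workday cumulative business fee data, the historical same-period holiday cumulative business fee data, the multiple daily business fee estimate data sets, the same-period workday distribution curve values, the same-period holiday distribution curve values and the prediction algorithm, to obtain the third prediction data set corresponding to the historical same-period workday daily-average business fee data and the fourth prediction data set corresponding to the historical same-period holiday daily-average business fee data.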
  19. 根据权利要求15所述的计算机可读存储介质,当所述计算机指令在计算机上运行时,使得计算机还执行如下步骤:
    获取目标预设时段的初始历史业务订单数据,对所述初始历史业务订单数据进行数据清洗,得到候选历史业务订单数据;
    将所述候选历史业务订单数据依次进行时段分类和日期类型分类,得到第一预设历史时段的目标业务订单数据和第二预设历史时段的目标业务订单数据,所述日期类型包括工作日和节假日,所述业务订单数据包括业务订单信息;
    计算所述第一预设历史时段的目标业务订单数据的历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,以及所述第二预设历史时段的目标业务订单数据的历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
    将所述第一预设历史时段的业务订单信息、历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据确定为第一历史业务订单数据,将所述第二预设历史时段的业务订单信息、历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据确定为第二历史业务订单数据。
  20. 一种基于分布曲线的预测装置,其中,所述基于分布曲线的预测装置包括:
    获取模块,用于获取经过数据预处理的第一预设历史时段的第一历史业务订单数据,以及第二预设历史时段的第二历史业务订单数据,所述第一预设历史时段包含于所述第二预设历史时段,所述第二预设历史时段的结束日期为预设预测时段的起始日期的前一天,所述第一历史业务订单数据包括历史周期工作日日均业务费用数据和历史周期节假日日均业务费用数据,所述第二历史业务订单数据包括历史同期工作日日均业务费用数据和历史同期节假日日均业务费用数据;
    第一计算模块,用于获取所述第二历史业务订单数据中的多个订立日期,以及各订立日期对应的起始日期,并计算所述多个订立日期和各订立日期对应的起始日期之间的差值,得到多个日期差值;
    第二计算模块,用于通过预置预测模型和所述第二历史业务订单数据,计算所述多个日期差值分别对应的业务费用数据比例值,通过所述预置预测模型中的指数函数和所述多个日期差值,计算所述业务费用数据比例值的周期工作日分布曲线值、周期节假日分布曲线值、同期工作日分布曲线值和同期节假日分布曲线值;
    预测模块,用于通过所述预置预测模型、所述第一历史业务订单数据、所述第二历史业务订单数据、所述周期工作日分布曲线值、所述周期节假日分布曲线值、所述同期工作日分布曲线值和所述同期节假日分布曲线值,对所述预设预测时段的业务费用数据进行预测,得到所述历史周期工作日日均业务费用数据对应的第一预测数据集、所述历史周期节假日日均业务费用数据对应的第二预测数据集、所述历史同期工作日日均业务费用数据对应的第三预测数据集,以及所述历史同期节假日日均业务费用数据对应的第四预测数据集;
    合并处理模块,用于将所述第一预测数据集、所述第二预测数据集、所述第三预测数据集和所述第四预测数据集进行合并处理,得到业务费用预测数据。
PCT/CN2021/090828 2020-12-09 2021-04-29 基于分布曲线的预测方法、装置、设备及存储介质 WO2022121219A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011425186.7A CN112215444B (zh) 2020-12-09 2020-12-09 基于分布曲线的预测方法、装置、设备及存储介质
CN202011425186.7 2020-12-09

Publications (1)

Publication Number Publication Date
WO2022121219A1 true WO2022121219A1 (zh) 2022-06-16

Family

ID=74068162

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/090828 WO2022121219A1 (zh) 2020-12-09 2021-04-29 基于分布曲线的预测方法、装置、设备及存储介质

Country Status (2)

Country Link
CN (1) CN112215444B (zh)
WO (1) WO2022121219A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111784008A (zh) * 2020-06-30 2020-10-16 北京金山安全软件有限公司 产品生命周期预估方法、装置、电子设备及存储介质

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215444B (zh) * 2020-12-09 2021-04-02 平安科技(深圳)有限公司 基于分布曲线的预测方法、装置、设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7587330B1 (en) * 2003-01-31 2009-09-08 Hewlett-Packard Development Company, L.P. Method and system for constructing prediction interval based on historical forecast errors
US8065098B2 (en) * 2008-12-12 2011-11-22 Schneider Electric USA, Inc. Progressive humidity filter for load data forecasting
CN104156786A (zh) * 2014-08-18 2014-11-19 广西电网有限责任公司 一种考虑气象多因素影响的非工作日最大日负荷预测系统
CN110929941A (zh) * 2019-11-26 2020-03-27 广东电网有限责任公司 基于多负荷模式的短期电力负荷预测方法及系统
CN111045907A (zh) * 2019-12-12 2020-04-21 苏州博纳讯动软件有限公司 一种基于业务量的系统容量预测方法
CN112215444A (zh) * 2020-12-09 2021-01-12 平安科技(深圳)有限公司 基于分布曲线的预测方法、装置、设备及存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8583470B1 (en) * 2010-11-02 2013-11-12 Mindjet Llc Participant utility extraction for prediction market based on region of difference between probability functions
CN105306539B (zh) * 2015-09-22 2018-09-11 北京金山安全软件有限公司 业务信息展现控制方法、装置和互联网业务信息显示平台
CN109726872B (zh) * 2018-12-29 2021-03-02 华润电力技术研究院有限公司 一种能耗预测方法、装置、存储介质及电子设备
CN110689163B (zh) * 2019-08-16 2022-06-17 深圳市跨越新科技有限公司 一种节假日期间货量智能预测方法和系统
CN110766232B (zh) * 2019-10-30 2022-04-29 支付宝(杭州)信息技术有限公司 动态预测方法及其系统

Also Published As

Publication number Publication date
CN112215444A (zh) 2021-01-12
CN112215444B (zh) 2021-04-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21901934

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21901934

Country of ref document: EP

Kind code of ref document: A1