WO2020211245A1 - Development trend data acquisition method and device - Google Patents

Development trend data acquisition method and device

Info

Publication number
WO2020211245A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
prediction
historical
factor
development
Prior art date
Application number
PCT/CN2019/103060
Other languages
French (fr)
Chinese (zh)
Inventor
张翔
刘媛源
郑子欧
于修铭
汪伟
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020211245A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls

Definitions

  • This application relates to the field of data processing technology, and in particular to a method and device for acquiring development trend data.
  • Listed companies are obliged to forecast their revenues and make public announcements. Therefore, analysts of listed companies need to make regular revenue forecasts for their listed companies. Under normal circumstances, analysts forecast the revenue of listed companies on a quarterly basis.
  • Analysts mainly use the following method to predict the revenue of a listed company: determine multiple factors that directly affect the company's revenue according to the industry to which it belongs, perform linear fitting or polynomial fitting on real-time data of those factors, and then predict the company's revenue from the fitted linear or polynomial function.
  • Because real-time data of each factor is used for the linear or polynomial fitting, the timeliness of the data corresponding to each factor has a great influence on the forecast result.
  • Analysts therefore need to collect real-time data for each factor promptly, which raises the cost of forecasting the revenue of listed companies.
  • This application provides a method and device for acquiring development trend data. Its main purpose is to train a machine learning model using the historical development data of multiple sample objects, to obtain the first prediction data output by the trained machine learning model by inputting the historical development data of the prediction object into it, and then to determine the development trend data of the prediction object according to the first prediction data.
  • an embodiment of the present application provides a method for obtaining development trend data, including:
  • the development trend data used to characterize the development trend of the prediction object is determined according to the first prediction data.
  • the method further includes:
  • the development trend data of the prediction object is determined according to the first prediction data and the second prediction data.
  • the method further includes:
  • the determining the development trend data of the prediction object according to the first prediction data and the second prediction data includes:
  • the development trend data of the prediction object is determined according to the first prediction data, the second prediction data, and the third prediction data.
  • the fitting a polynomial function using historical development data of at least two of the sample objects includes:
  • the following polynomial function is fitted according to each of the first historical development data corresponding to each of the statistical periods, where each of the first historical development data corresponding to each of the sample objects satisfies the polynomial function;
  • the utilizing the historical development data of at least two of the sample objects to fit a time series model includes:
  • a time series model is fitted according to each of the second historical development data corresponding to each of the statistical periods, wherein the change rule over time of each of the second historical development data corresponding to each of the sample objects satisfies the time series model;
  • $\Delta M_t$ characterizes the difference between the third prediction data relative to the current time and the second historical development data corresponding to the last statistical period before the current time;
  • $\Delta M_{t-1}$ characterizes the difference between the second historical development data corresponding to the last statistical period before the current time and the second historical development data corresponding to the second statistical period before the current time;
  • $\Delta M_{t-2}$ characterizes the difference between the second historical development data corresponding to the second statistical period before the current time and the second historical development data corresponding to the third statistical period before the current time;
  • the ⁇ t characterizes the third prediction data relative to the current time;
  • the ⁇ t-1 characterizes the second historical development data corresponding to the last statistical period of the current time;
  • $K$, $k_1$, $k_2$ and $k_3$ are all weight coefficients fitted by machine learning.
  • the fitting a time series model according to each second historical development data corresponding to each statistical period includes:
  • a list method is used to define the target equation corresponding to the model
  • the model is determined as the time series model.
  • the determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data includes:
  • the training of the machine learning model corresponding to the object category through each of the extracted first historical factor data includes:
  • $M'$ represents the first prediction data;
  • $n$ represents the number of the factors;
  • $m$ represents the number of historical years covered by the first historical factor data;
  • $x_{(i,1)}$ characterizes the factor data corresponding to the i-th factor of the prediction object in the previous year;
  • $x_{(i,2)}$ characterizes the factor data corresponding to the i-th factor of the prediction object in the second previous year;
  • $k_i$ characterizes the factor coefficient corresponding to the i-th factor at the current time;
  • $x_{(i,j)}$ characterizes the factor data corresponding to the i-th factor of the prediction object in the j-th previous year;
  • the machine learning model including the formula is constructed.
  • an embodiment of the present application also provides a development trend data acquisition device, including:
  • a category recognition module, a factor identification module, a data acquisition module, a first data extraction module, a model training module, a second data extraction module, a model processing module, and a data processing module;
  • the category recognition module is used to determine the object category to which the prediction object belongs;
  • the factor identification module is configured to determine at least one factor corresponding to the object category determined by the category recognition module, wherein different factors correspond to different data statistics rules;
  • the data acquisition module is configured to use at least two objects belonging to the object category determined by the category recognition module as sample objects, and obtain historical development data of each sample object respectively;
  • the first data extraction module is configured to extract the first historical factor data corresponding to each of the factors determined by the factor identification module from the historical development data of each of the sample objects acquired by the data acquisition module;
  • the model training module is configured to train a machine learning model corresponding to the object category through each of the first historical factor data extracted by the first data extraction module;
  • the second data extraction module is configured to extract second historical factor data corresponding to each of the factors determined by the factor identification module from the historical development data of the prediction object;
  • the model processing module is configured to input each of the second historical factor data extracted by the second data extraction module into the machine learning model trained by the model training module to obtain the first prediction data output by the machine learning model;
  • the data processing module is configured to determine development trend data used to characterize the development trend of the prediction object according to the first prediction data acquired by the model processing module.
  • an embodiment of the present application also provides a computer device, including a memory and a processor, where the memory stores a computer program, and when the processor executes the computer program, it implements any of the foregoing development trend data acquisition methods.
  • embodiments of the present application also provide a non-volatile computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the development trend data acquisition method described in any of the above-mentioned first aspects is implemented.
  • The development trend data acquisition method, device, computer equipment, and non-volatile computer-readable storage medium provided by the embodiments of the present application determine the object category to which the prediction object belongs, and then determine one or more factors corresponding to that object category. Historical development data of at least two sample objects belonging to the object category is obtained, first historical factor data corresponding to each factor is extracted from the historical development data of each sample object, and second historical factor data corresponding to each factor is extracted from the historical development data of the prediction object. The first historical factor data is then used to train the machine learning model corresponding to the object category to which the prediction object belongs. After the second historical factor data is input into the trained machine learning model, the first prediction data is obtained, and the development trend data of the prediction object can then be determined according to the first prediction data.
  • FIG. 1 is a flowchart of a method for forecasting the revenue of a listed company according to an embodiment of the present application
  • FIG. 3 is a flowchart of a time series model fitting method provided by an embodiment of the present application.
  • Fig. 5 is a schematic diagram of a revenue forecasting device for listed companies provided by an embodiment of the present application.
  • In the following, the method and device for acquiring development trend data provided by the embodiments of the present application are described in detail, taking as an example the case where the prediction object is a listed company and the development trend data is the revenue forecast data of the listed company.
  • In this example, the method for acquiring development trend data corresponds to a method for forecasting the revenue of listed companies.
  • an embodiment of the present application provides a method for forecasting the revenue of a listed company, including:
  • Step 101: Determine the industry category of the listed company that needs to perform revenue forecasting;
  • Step 102: Determine at least one factor corresponding to the industry category, where different factors correspond to different data statistics rules;
  • Step 103: Take at least two companies belonging to the industry category as sample companies, and obtain historical revenue data of each sample company respectively;
  • Step 104: Extract the first historical factor data corresponding to each factor from the historical revenue data of each sample company;
  • Step 105: Train a machine learning model corresponding to the industry category through each extracted first historical factor data;
  • Step 106: Extract the second historical factor data corresponding to each factor from the historical revenue data of the listed company;
  • Step 107: Input each second historical factor data into the machine learning model, and obtain the first revenue prediction result output by the machine learning model;
  • Step 108: Determine the predicted revenue data of the listed company according to the first revenue prediction result.
  • In the method for forecasting the revenue of listed companies provided by this embodiment, after the industry category of the listed company that needs to perform revenue forecasting is determined, one or more factors corresponding to the industry category are determined. Historical revenue data of at least two sample companies belonging to the industry category is then obtained, first historical factor data corresponding to each factor is extracted from the historical revenue data of each sample company, and second historical factor data corresponding to each factor is extracted from the historical revenue data of the listed company. The first historical factor data is used to train the machine learning model corresponding to the industry category of the listed company. After the second historical factor data is input into the trained machine learning model, the first revenue forecast result is obtained, and the forecasted revenue data of the listed company can then be determined according to the first revenue forecast result.
  • For example, companies in the industry category of the listed company usually compile revenue data on a quarterly basis, so the forecast usually targets the revenue data of the listed company in the next quarter.
  • Accordingly, last quarter's revenue, last quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year are identified as the four factors corresponding to the industry category of the listed company.
  • 3,000 companies belonging to the industry category of the listed company are determined as sample companies, and the historical revenue data of each sample company over the past ten years is obtained. The quarterly revenue and quarterly total assets of each sample company for every quarter of the past ten years are then extracted from that data as the first historical factor data.
  • the machine learning model is trained through the extracted 240,000 (3000*10*4*2) first historical factor data, and the machine learning model corresponding to the industry category of the listed company is obtained.
  • The last quarter's revenue, the last quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year of the listed company to be forecasted are input into the machine learning model, and the forecasted next-quarter revenue of the listed company output by the machine learning model is used as the first revenue forecast result.
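  • The factor extraction in this example can be illustrated with a minimal sketch. The record layout (one revenue and total-assets pair per quarter, oldest first) and the names `factors_for_target` and `training_rows` are assumptions made for illustration; they do not come from the application.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: one (revenue, total_assets) pair per quarter,
# ordered from oldest to newest.
Quarter = Tuple[float, float]

FACTOR_NAMES = (
    "revenue_last_quarter",
    "assets_last_quarter",
    "revenue_same_quarter_last_year",
    "assets_same_quarter_last_year",
)


def factors_for_target(history: List[Quarter], target: int) -> Dict[str, float]:
    """Return the four factor values used to predict quarter `target`.

    `target` is an index on the quarterly timeline; it may equal
    len(history), meaning the next quarter that has not been reported yet.
    """
    if target < 4 or target > len(history):
        raise ValueError("need at least four earlier quarters")
    last_quarter = history[target - 1]
    same_quarter_last_year = history[target - 4]
    values = (
        last_quarter[0],
        last_quarter[1],
        same_quarter_last_year[0],
        same_quarter_last_year[1],
    )
    return dict(zip(FACTOR_NAMES, values))


def training_rows(history: List[Quarter]) -> List[Tuple[Dict[str, float], float]]:
    """Pair each quarter's factor values with the revenue actually reported."""
    return [
        (factors_for_target(history, t), history[t][0])
        for t in range(4, len(history))
    ]
```

  • Calling `training_rows` on each sample company's history would yield the training samples, while `factors_for_target(history, len(history))` on the listed company's own history would yield the inputs for the next-quarter forecast.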
  • Because each first historical factor data reflects the past revenue of a sample company, the factor coefficients corresponding to the different factors can be determined from the first historical factor data. The determined factor coefficients reflect the revenue trend of the sample companies over the years, and can then be used to construct a machine learning model. The constructed machine learning model processes the second historical factor data to predict the revenue of the listed company and obtain the first revenue prediction result.
  • the specific method of constructing a machine learning model may include the following steps:
  • $M'$ represents the first revenue forecast result;
  • $n$ represents the number of factors;
  • $m$ represents the number of historical years covered by the first historical factor data;
  • $x_{(i,1)}$ represents the factor data corresponding to the i-th factor of the listed company in the first previous year;
  • $x_{(i,2)}$ represents the factor data corresponding to the i-th factor of the listed company in the second previous year;
  • $k_i$ represents the factor coefficient corresponding to the i-th factor at the current time;
  • $x_{(i,j)}$ represents the factor data corresponding to the i-th factor of the listed company in the j-th previous year.
  • the four factors identified for listed companies are the revenue of the previous quarter, the total assets of the previous quarter, the revenue of the same quarter last year, and the total assets of the same quarter last year.
  • The first historical factor data obtained is the revenue data of 3,000 companies over the past 10 years. A total of 60,000 (10*3000*2) factor data can be determined for the factor of the previous quarter's revenue, and 60,000 factor data can likewise be determined for each of the previous quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year. The resulting 240,000 factor data are then used as sample data for machine learning, and 4 factor coefficients corresponding to the above 4 factors are fitted. After the four fitted factor coefficients are substituted into the above formula, the historical data of the listed company corresponding to the above four factors is substituted into the formula, and the first revenue forecast result for the listed company can be calculated.
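  • As an illustration of fitting one coefficient per factor, the following sketch uses ordinary least squares solved through the normal equations. This is an assumption: the formula of the application is not reproduced in this text, so the model here is a plain linear combination of the factor values without an intercept, and the function names are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: factors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            ratio = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= ratio * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_factor_coefficients(
    rows: Sequence[Sequence[float]], targets: Sequence[float]
) -> List[float]:
    """Least-squares coefficients k_i from the normal equations (X^T X) k = X^T y."""
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(n)]
    return solve_linear_system(xtx, xty)


def predict(coefficients: Sequence[float], factors: Sequence[float]) -> float:
    """Assumed model: forecast = sum of k_i times the i-th factor value."""
    return sum(k * x for k, x in zip(coefficients, factors))
```

  • Each row holds the four factor values of one sample quarter and each target is the revenue reported for that quarter; `predict` then applies the fitted coefficients to the listed company's own factor values.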
  • The historical revenue data of each sample company is used to fit the factor coefficients corresponding to each factor, and the fitted factor coefficients are then used to construct a machine learning model corresponding to the industry category of the listed company. The machine learning model reflects the revenue change trend of the industry category of the listed company, and can therefore be used to predict the revenue of the listed company. Because the prediction draws on both the revenue change trend of other companies in the same industry category and the historical revenue of the listed company to be predicted, the revenue of the listed company can be predicted more accurately.
  • After the first revenue prediction result is obtained, it can be used directly as the forecasted revenue data of the listed company, or the forecasted revenue data can be determined by combining it with revenue forecast results obtained through other forecasting methods.
  • When the forecasted revenue data is determined by combining revenue forecast results obtained through other forecasting methods, either of the following two methods can be used:
  • Method 1: Combine the first revenue prediction result obtained through the machine learning model with the second revenue prediction result obtained through a polynomial function to determine the predicted revenue data of the listed company;
  • Method 2: Combine the first revenue forecast result obtained through the machine learning model, the second revenue forecast result obtained through the polynomial function, and the third revenue forecast result obtained through the time series model to determine the forecast revenue data of the listed company.
  • On the basis of the listed company revenue forecasting method shown in Figure 1, after the first revenue prediction result is obtained through the machine learning model in step 107 and before the predicted revenue data of the listed company is determined according to the first revenue prediction result in step 108, the historical revenue data of each sample company can be used to fit a polynomial function, so that the historical revenue data of each sample company satisfies the fitted polynomial function. The historical revenue data of the listed company is then input into the fitted polynomial function to obtain the second revenue forecast result output by the polynomial function.
  • Correspondingly, in step 108 the predicted revenue data of the listed company may be determined according to both the first revenue forecast result and the second revenue forecast result. Specifically, the weighted average of the first revenue forecast result and the second revenue forecast result can be calculated and used as the forecast revenue data of the listed company.
  • When the historical revenue data of each sample company is used to fit the polynomial function, the fitting process can be realized by the following steps:
  • Step 201: Determine the forecast period for the listed company's revenue forecast;
  • Step 202: Extract the first historical revenue data corresponding to each statistical period from the historical revenue data of each sample company according to the determined prediction period, where the statistical period corresponds to the prediction period in time span;
  • Step 203: Fit the following polynomial function according to each first historical revenue data corresponding to each statistical period, so that each first historical revenue data corresponding to each sample company satisfies the polynomial function;
  • $M$ represents the revenue prediction result relative to the current time;
  • $k_i$ represents the weight coefficient fitted by machine learning;
  • $x$ represents the first historical revenue data corresponding to the previous statistical period relative to the current time;
  • $x_i$ represents the first historical revenue data corresponding to the (i+1)-th statistical period before the current time;
  • $t+1$ represents the number of statistical periods before the current time.
  • For example, the forecast period for the revenue forecast of the listed company is a quarter. Correspondingly, the statistical period for the historical revenue data of each sample company is also a quarter, and the revenue of each sample company for each statistical period is extracted from its historical revenue data as the first historical revenue data.
  • After the polynomial function is fitted, the listed company's data corresponding to each statistical period is obtained from the listed company's historical revenue data and input into the polynomial function to obtain the second revenue forecast result output by the polynomial function. For example, after determining that the forecast period for the listed company's revenue forecast is a quarter, the listed company's revenue data for each quarter is obtained from its historical revenue data and input into the fitted polynomial function to obtain the second revenue forecast result corresponding to the listed company.
  • When the predicted revenue data is determined from the first revenue prediction result and the second revenue prediction result, the two results are weighted according to the weight values set in advance for them, and the calculation result is used as the forecast revenue data of the listed company.
  • For example, the weight values for the first revenue prediction result and the second revenue prediction result can both be set to 0.5, that is, the average of the first revenue prediction result and the second revenue prediction result is taken as the listed company's forecast revenue data.
  • The historical revenue data of each sample company can also be used to fit a time series model, so that the historical revenue data of each sample company conforms to the fitted time series model. The historical revenue data of the listed company is then input into the fitted time series model to obtain the third revenue prediction result.
  • the predicted revenue data of the listed company can be determined according to the first revenue prediction result, the second revenue prediction result, and the third revenue prediction result.
  • After the machine learning model, the polynomial function, and the time series model are obtained, the historical revenue data of the listed company is input into each of them to obtain the corresponding first, second, and third revenue forecast results, and the forecasted revenue data of the listed company is then determined based on these three results.
  • Because three revenue forecasting methods (the machine learning model, the polynomial function, and the time series model) are combined to predict the revenue of the listed company, the accuracy of the forecast can be further improved.
  • When the historical revenue data of each sample company is used to fit the time series model, the fitting process can be realized by the following steps:
  • Step 301: Determine the forecast period for the listed company's revenue forecast;
  • Step 302: Extract the second historical revenue data corresponding to each statistical period from the historical revenue data of each sample company according to the prediction period, where the statistical period corresponds to the prediction period in time span;
  • Step 303: Fit a time series model according to each second historical revenue data corresponding to each statistical period, so that the change over time of each second historical revenue data corresponding to each sample company satisfies the time series model;
  • $\Delta M_t$ represents the difference between the revenue forecast result relative to the current time and the second historical revenue data corresponding to the previous statistical period before the current time;
  • $\Delta M_{t-1}$ represents the difference between the second historical revenue data corresponding to the previous statistical period before the current time and the second historical revenue data corresponding to the second statistical period before the current time;
  • $\Delta M_{t-2}$ represents the difference between the second historical revenue data corresponding to the second statistical period before the current time and the second historical revenue data corresponding to the third statistical period before the current time;
  • ⁇ t represents the revenue forecast result relative to the current time;
  • ⁇ t-1 represents the second historical revenue data corresponding to the previous statistical period at the current time ;
  • $K$, $k_1$, $k_2$ and $k_3$ are weight coefficients fitted by machine learning.
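  • A model built on period-to-period differences, as the symbol definitions above suggest, might be sketched as follows. The model equation of the application is not reproduced in this text, so the sketch assumes a plain autoregression on the two most recent differences with a constant term, and it leaves out the term weighted by the fourth coefficient because the quantity it multiplies cannot be recovered here. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough variation in the data")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            ratio = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= ratio * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def differences(series: Sequence[float]) -> List[float]:
    """Period-to-period changes: d[t] = series[t + 1] - series[t]."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_difference_model(series: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of d[t] = K + k1 * d[t-1] + k2 * d[t-2] (assumed form)."""
    d = differences(series)
    if len(d) < 5:
        raise ValueError("need at least six observations")
    rows = [[1.0, d[t - 1], d[t - 2]] for t in range(2, len(d))]
    targets = [d[t] for t in range(2, len(d))]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    big_k, k1, k2 = solve_linear_system(xtx, xty)
    return big_k, k1, k2


def predict_next(series: Sequence[float], coeffs: Tuple[float, float, float]) -> float:
    """Forecast the next value as the last value plus the predicted change."""
    big_k, k1, k2 = coeffs
    d = differences(series)
    return series[-1] + big_k + k1 * d[-1] + k2 * d[-2]
```

  • The forecast for the next period is recovered by adding the predicted difference to the most recent observed value, which mirrors the way the differences above are defined relative to the previous statistical period.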
  • For example, the forecast period for the revenue forecast of the listed company is a quarter. Correspondingly, the statistical period for the historical revenue data of each sample company is also a quarter, and the revenue of each sample company for each statistical period is extracted from its historical revenue data as the second historical revenue data.
  • After the time series model is fitted, the listed company's data corresponding to each statistical period is obtained from the listed company's historical revenue data and input into the time series model to obtain the third revenue forecast result output by the time series model. For example, after determining that the forecast period for the listed company's revenue forecast is a quarter, the listed company's revenue data for each quarter is obtained from its historical revenue data and input into the fitted time series model to obtain the third revenue forecast result corresponding to the listed company.
  • The first historical revenue data used for fitting the polynomial function and the second historical revenue data used for fitting the time series model may be the same data.
  • the first revenue prediction result, the second revenue prediction result, and the third revenue prediction result are weighted, and the result of the weighted calculation is used as the predicted revenue data of the listed company.
  • Corresponding weighting coefficients can be set for the first revenue forecast result, the second revenue forecast result, and the third revenue forecast result according to actual needs, to control the weight of each revenue forecast result in the final forecast revenue data.
  • For example, the weighting coefficient corresponding to the first revenue prediction result can be set to 0.4, and the weighting coefficients corresponding to the second and third revenue prediction results can each be set to 0.3. The three revenue forecast results are then weighted with these coefficients, and the result of the weighted calculation is used as the forecast revenue data of the listed company.
  • By setting weighting coefficients, on the one hand, the influence of the three revenue forecast results on the calculation result can be balanced and the pros and cons of the three forecasting methods weighed, making the final forecasted revenue data more accurate. On the other hand, users can adjust the weighting coefficients as needed, which meets the individual needs of different users and improves the applicability of the listed company revenue forecasting method.
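  • The weighted combination described above reduces to a short calculation. The sketch below is illustrative only; the function name `combine_forecasts` is hypothetical, and dividing by the sum of the weights is an added safeguard so that weights which do not sum to 1 still give a weighted average.

```python
from typing import Sequence


def combine_forecasts(forecasts: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of several revenue forecast results."""
    if not forecasts or len(forecasts) != len(weights):
        raise ValueError("forecasts and weights must be non-empty and equally long")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    # Normalise by the weight total so the result stays on the revenue scale.
    return sum(f * w for f, w in zip(forecasts, weights)) / total
```

  • With the coefficients from the example, `combine_forecasts([first, second, third], [0.4, 0.3, 0.3])` gives the three-way combination, and `combine_forecasts([first, second], [0.5, 0.5])` gives the two-way average of Method 1.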
  • When step 303 fits the time series model according to each second historical revenue data corresponding to each statistical period, an ARIMA time series model can be fitted.
  • the fitting process of the ARIMA time series model can be implemented in the following ways:
  • The fitting effect of the model is first checked based on goodness of fit. Once the fitting effect reaches the preset target, the residual of the model is checked, and once the residual is within the preset fluctuation range, the model is determined as the time series model.
  • Determining the model as the time series model only after these checks ensures the accuracy of the generated time series model, and in turn the accuracy of the third revenue forecast result obtained through it.
  • the following takes the method for obtaining forecasted revenue data provided in the second method as an example to further describe the method for forecasting the revenue of a listed company provided in the embodiment of the present application. As shown in FIG. 4, the method may include the following steps:
  • Step 401 Determine the industry category of the listed company that needs to perform revenue forecasting.
  • the industry category to which the listed company belongs must first be determined.
  • Step 402 Obtain historical revenue data of multiple sample companies belonging to the determined industry category.
  • For example, 3,000 companies are selected from the companies belonging to industry category A as the sample companies, and the historical revenue data of each of the 3,000 sample companies for the past 10 years is then obtained. If a sample company has been established for less than 10 years, all of its historical data is obtained.
  • Step 403 Train a machine learning model based on historical revenue data of each sample company.
  • the machine learning model can predict future revenue data based on historical revenue data.
  • the machine learning model may include the following formula:
  • $M'$ represents the first revenue forecast result
  • $n$ represents the number of factors
  • $m$ represents the number of historical years covered by the first historical factor data
  • $x_{(i,1)}$ represents the factor data of the listed company corresponding to the i-th factor in the previous 1st year
  • $x_{(i,2)}$ represents the factor data of the listed company corresponding to the i-th factor in the previous 2nd year
  • $k_i$ represents the factor coefficient corresponding to the i-th factor at the current time
  • $x_{(i,j)}$ represents the factor data of the listed company corresponding to the i-th factor in the previous j-th year
  • the last quarter's revenue, last quarter's total assets, last year's same quarter revenue, and last year's same quarter total assets are identified as the four factors corresponding to industry category A.
  • For each sample company, the quarterly revenue and quarterly total assets for every quarter of the past 10 years are extracted from that company's historical revenue data as the first historical factor data. This yields 3000*10*4*2 = 240,000 items of first historical factor data in total, which are then used to train machine learning model A.
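As a rough sketch of the extraction described above, the snippet below derives the four factors named in the text (last quarter's revenue, last quarter's total assets, and the revenue and total assets of the same quarter last year) from one company's quarterly series, and reproduces the 240,000 count. The function name `build_factor_rows` and the synthetic series are hypothetical and only illustrate the bookkeeping.

```python
def build_factor_rows(revenue, assets):
    """Pair each quarter's revenue with the four lagged factor values.

    Both lists hold one value per quarter in chronological order.
    """
    if len(revenue) != len(assets):
        raise ValueError("revenue and assets must cover the same quarters")
    rows = []
    # The first four quarters have no same-quarter-last-year value.
    for t in range(4, len(revenue)):
        factors = (
            revenue[t - 1],  # last quarter's revenue
            assets[t - 1],   # last quarter's total assets
            revenue[t - 4],  # revenue of the same quarter last year
            assets[t - 4],   # total assets of the same quarter last year
        )
        rows.append((factors, revenue[t]))
    return rows


# Synthetic data for one sample company: 10 years * 4 quarters.
revenue = [float(q) for q in range(1, 41)]
assets = [10.0 * q for q in range(1, 41)]
rows = build_factor_rows(revenue, assets)

# Raw values per company (revenue plus total assets for each quarter),
# then the total across 3000 sample companies as counted in the text.
raw_values_per_company = len(revenue) + len(assets)
total_raw_values = 3000 * raw_values_per_company
```

Here each company contributes 80 raw values (40 quarters, two quantities per quarter), which matches the 3000*10*4*2 total in the text.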
  • Step 404 Use the machine learning model to process the historical revenue data of the listed company to obtain the first revenue prediction result.
  • the second historical factor data corresponding to each factor is extracted from the historical revenue data of the listed company, and then each extracted second historical factor data is input into the trained machine learning model, and the machine learning The model predicts the revenue of the listed company based on each second historical factor data, and obtains the first revenue prediction result output by the machine learning model.
  • Step 405 Fit a polynomial function through historical revenue data of each sample company.
  • The first historical revenue data of each statistical period is extracted from the historical revenue data of each sample company, and the extracted first historical revenue data is then used to fit a polynomial function, so that each first historical revenue data corresponding to each sample company satisfies the fitted polynomial function.
  • the statistical period and the forecast period correspond in time span.
  • $M$ represents the revenue prediction result relative to the current time
  • $k_i$ represents the weight coefficient fitted by machine learning
  • $x$ represents the first historical revenue data corresponding to the statistical period immediately preceding the current time
  • $x_i$ represents the first historical revenue data corresponding to the (i+1)-th statistical period before the current time
  • $t+1$ represents the number of statistical periods before the current time.
  • For example, if the forecast period for the revenue forecast of listed company A is quarterly, then for each of the 3,000 sample companies the quarterly revenue data for the past 10 years is extracted from that company's historical revenue data as the first historical revenue data, which yields 3000*10*4 = 120,000 items of first historical revenue data in total.
  • The 120,000 items of first historical revenue data are then used to fit a polynomial function, so that each first historical revenue data corresponding to each sample company satisfies the polynomial function.
  • Step 406 Use a polynomial function to process historical revenue data of the listed company to obtain a second revenue forecast result.
  • The historical revenue data of the listed company is extracted according to the statistical period to obtain the historical revenue data corresponding to each statistical period. The extracted data is then input into the polynomial function, which predicts the revenue of the listed company from the input historical revenue data, and the second revenue prediction result output by the polynomial function is obtained.
  • $M$ represents the second revenue forecast result
  • $k_i$ represents the weight coefficient fitted by machine learning
  • $x$ represents the quarterly revenue of listed company A in the previous quarter
  • $x_i$ represents the quarterly revenue of listed company A in the (i+1)-th quarter before the current time.
  • Step 407 Fit a time series model through historical revenue data of each sample company.
  • The second historical revenue data of each statistical period is extracted from the historical revenue data of each sample company, and the extracted second historical revenue data is then used to fit the time series model, so that the change over time of each second historical revenue data corresponding to each sample company conforms to the time series model.
  • the statistical period and the forecast period correspond in time span.
  • $\Delta M_t$ represents the difference between the revenue forecast result relative to the current time and the second historical revenue data corresponding to the statistical period immediately preceding the current time;
  • $\Delta M_{t-1}$ represents the difference between the second historical revenue data corresponding to the statistical period immediately preceding the current time and the second historical revenue data corresponding to the second statistical period before the current time;
  • $\Delta M_{t-2}$ represents the difference between the second historical revenue data corresponding to the second statistical period before the current time and the second historical revenue data corresponding to the third statistical period before the current time;
  • $\varepsilon_t$ represents the revenue forecast result relative to the current time;
  • $\varepsilon_{t-1}$ represents the second historical revenue data corresponding to the statistical period immediately preceding the current time;
  • $K$, $k_1$, $k_2$ and $k_3$ are weight coefficients fitted by machine learning.
  • For example, if the forecast period for the revenue forecast of listed company A is quarterly, the quarterly revenue data of each sample company for the past 10 years is extracted from that company's historical revenue data as the second historical revenue data, which yields 3000*10*4 = 120,000 items of second historical revenue data in total.
  • The 120,000 items of second historical revenue data are then used to fit the time series model, so that each second historical revenue data corresponding to each sample company conforms to the time series model.
  • Step 408 Use the time series model to process the historical revenue data of the listed company to obtain the third revenue forecast result.
  • The historical revenue data of the listed company is extracted according to the statistical period to obtain the historical revenue data corresponding to each statistical period. The extracted data is then input into the time series model, which predicts the revenue of the listed company from the input historical revenue data, and the third revenue prediction result output by the time series model is obtained.
  • $\Delta M_{40}$ represents the difference between the third revenue forecast result and the quarterly revenue of listed company A in the previous quarter; $\Delta M_{39}$ represents the difference between the quarterly revenue of listed company A in the previous quarter and its quarterly revenue in the second quarter before the current time.
  • K, k 1 , k 2 and k 3 are all weight coefficients fitted by machine learning.
  • Step 409 Determine the predicted revenue data of the listed company according to the first revenue prediction result, the second revenue prediction result, and the third revenue prediction result.
  • The first revenue prediction result, the second revenue prediction result, and the third revenue prediction result are weighted, and the result of the weighted calculation is used as the predicted revenue data of the listed company.
  • A corresponding weighting coefficient can be set in advance for each of the first revenue prediction result, the second revenue prediction result, and the third revenue prediction result.
  • the three weighting coefficients corresponding to the first revenue prediction result, the second revenue prediction result, and the third revenue prediction result can be equal, and the three weighting factors are all 1/3.
  • different weighting coefficients can be set for the first revenue prediction result, the second revenue prediction result, and the third revenue prediction result.
  • When different weighting coefficients are set, the methods for obtaining the first, second and third revenue prediction results can each be used to predict the revenue of the listed company for a past period, and the weighting coefficients are then determined by comparing the three predictions with the actual revenue for that period. For example, the method for obtaining the first revenue forecast result predicts last quarter's revenue as X1, the method for obtaining the second revenue forecast result predicts it as X2, the method for obtaining the third revenue forecast result predicts it as X3, and the actual revenue of last quarter is X4.
  • The absolute values of the differences between X1, X2, X3 and X4 can then be used to determine the weighting coefficients of the first, second and third revenue prediction results respectively: the larger the absolute value of the difference, the smaller the corresponding weighting coefficient.
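One way to turn the rule above into numbers is sketched below. The text only requires that a larger absolute error gives a smaller coefficient; the inverse-error normalisation, the function name `error_based_weights`, the `eps` guard and the sample values are assumptions made for illustration.

```python
def error_based_weights(predictions, actual, eps=1e-9):
    """Derive weighting coefficients from back-test errors.

    Each coefficient is proportional to 1 / |prediction - actual|, so a
    larger absolute error yields a smaller coefficient. `eps` avoids a
    division by zero when a prediction matches the actual value exactly.
    """
    inverse_errors = [1.0 / (abs(p - actual) + eps) for p in predictions]
    total = sum(inverse_errors)
    return [v / total for v in inverse_errors]


# Hypothetical back-test: X1, X2, X3 predicted for last quarter, X4 actual.
x1, x2, x3, x4 = 105.0, 90.0, 120.0, 100.0
weights = error_based_weights([x1, x2, x3], x4)
```

With these values the absolute errors are 5, 10 and 20, so the first method receives the largest coefficient and the third the smallest, and the coefficients sum to 1.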
  • an embodiment of the present application provides a revenue forecasting device for listed companies, including: a category recognition module 501, a factor recognition module 502, a data acquisition module 503, a first data extraction module 504, a model training module 505, a second data extraction module 506, a model processing module 507 and a data processing module 508;
  • the category identification module 501 is used to determine the industry category of the listed company that needs to perform revenue forecasting
  • the factor identification module 502 is configured to determine at least one factor corresponding to the industry category determined by the category identification module 501, wherein different factors correspond to different data statistics rules;
  • the data acquisition module 503 is configured to use at least two companies belonging to the industry category determined by the category identification module 501 as sample companies, and obtain historical revenue data of each sample company respectively;
  • the first data extraction module 504 is configured to extract the first historical factor data corresponding to each factor determined by the factor identification module 502 from the historical revenue data of each sample company and acquired by the data acquisition module 503;
  • the model training module 505 is configured to train a machine learning model corresponding to the industry category through each first historical factor data extracted by the first data extraction module 504;
  • the second data extraction module 506 is configured to extract, from the historical revenue data of the listed company, the second historical factor data corresponding to each factor determined by the factor identification module 502;
  • the model processing module 507 is configured to input each second historical factor data extracted by the second data extraction module 506 into the machine learning model trained by the model training module 505 to obtain the first revenue prediction result output by the machine learning model;
  • the data processing module 508 is configured to determine the predicted revenue data of the listed company according to the first revenue prediction result obtained by the model processing module 507.
  • the category identification module 501 can be used to perform step 101 in the above method embodiment
  • the factor identification module 502 can be used to perform step 102 in the above method embodiment
  • the data acquisition module 503 can be used to perform step 103 in the above method embodiment
  • the first data extraction module 504 can be used to perform step 104 in the above method embodiment
  • the model training module 505 can be used to perform step 105 in the above method embodiment
  • the second data extraction module 506 can be used to perform step 106 in the above method embodiment
  • the model processing module 507 can be used to perform step 107 in the above method embodiment
  • the data processing module 508 can be used to perform step 108 in the above method embodiment.
  • this device embodiment may also include other modules for executing each step in the foregoing method embodiment.
  • The embodiments of the present application also provide a computer device, including a memory and a processor, with a computer program stored on the memory. When the processor executes the computer program stored on the memory, the listed company revenue forecasting method provided by the foregoing embodiments can be implemented.
  • The embodiments of the present application also provide a non-volatile computer-readable storage medium on which a computer program is stored. When the stored computer program is executed, the listed company revenue forecasting method provided by the foregoing embodiments can be implemented.
  • With the listed company revenue forecasting method, device, computer equipment, and non-volatile computer-readable storage medium provided above, one or more factors corresponding to the industry category are determined after the industry category of the listed company that needs revenue forecasting has been determined. Historical revenue data of at least two sample companies belonging to the industry category is then obtained, the first historical factor data corresponding to each factor is extracted from the historical revenue data of each sample company, and the second historical factor data corresponding to each factor is extracted from the historical revenue data of the listed company. Each first historical factor data is then used to train the machine learning model corresponding to the industry category of the listed company.
  • Each second historical factor data is input into the trained machine learning model to obtain the first revenue forecast result, and the predicted revenue data of the listed company can then be determined according to the first revenue forecast result. Because the revenue of the listed company is predicted from the historical revenue data of multiple sample companies in the same industry category, the timeliness requirement on the first historical factor data corresponding to each factor is relatively low. Analysts therefore do not need to collect real-time data for each factor promptly, which reduces the cost of revenue forecasting for listed companies.

Abstract

A development trend data acquisition method and device. The method comprises: determining an object category to which a prediction object belongs; determining at least one factor corresponding to the object category; taking at least two objects belonging to the object category as sample objects, and separately obtaining historical development data of each sample object; separately extracting first historical factor data corresponding to each factor from within the historical development data of each sample object; by means of each piece of the extracted first historical factor data, training a machine learning model corresponding to the object category; extracting second historical factor data corresponding to each factor from within the historical development data of the prediction object; inputting each piece of second historical factor data into the machine learning model to obtain first prediction data; and according to the first prediction data, determining development trend data used for representing the development trend of the prediction object. The costs paid by analysis personnel for predicting the revenue of publicly listed companies may thus be reduced.

Description

Method and device for acquiring development trend data
Technical field
This application relates to the field of data processing technology, and in particular to a method and device for acquiring development trend data.
Background
Listed companies are obliged to forecast their revenue and announce the forecasts publicly, so analysts at a listed company need to forecast its revenue regularly. Analysts usually forecast the revenue of a listed company on a quarterly basis.
At present, analysts mainly forecast the revenue of a listed company as follows: they determine, according to the industry to which the listed company belongs, multiple factors that directly affect its revenue, perform linear fitting or polynomial fitting on the real-time data of those factors, and then forecast the revenue of the listed company with the fitted linear or polynomial function.
Because the current forecasting method performs linear or polynomial fitting on the real-time data of each factor, the timeliness of the data for each factor strongly affects the forecast result. To keep the revenue forecast accurate, analysts must collect the real-time data for each factor promptly, which makes revenue forecasting for listed companies costly for analysts.
Summary of the invention
This application provides a method and device for acquiring development trend data. Its main purpose is to train a machine learning model with the historical development data of multiple sample objects, input the historical development data of the prediction object into the trained machine learning model to obtain the first prediction data output by the model, and then determine the development trend data of the prediction object according to the first prediction data. When this method is applied to the revenue forecast of a listed company, analysts do not need to collect real-time data for the listed company promptly, which reduces the cost of forecasting the revenue of listed companies.
In a first aspect, an embodiment of the present application provides a method for acquiring development trend data, including:
determining the object category to which a prediction object belongs;
determining at least one factor corresponding to the object category, wherein different factors correspond to different data statistics rules;
taking at least two objects belonging to the object category as sample objects, and separately obtaining the historical development data of each sample object;
separately extracting, from the historical development data of each sample object, the first historical factor data corresponding to each factor;
training a machine learning model corresponding to the object category with each piece of extracted first historical factor data;
extracting, from the historical development data of the prediction object, the second historical factor data corresponding to each factor;
inputting each piece of second historical factor data into the machine learning model to obtain the first prediction data output by the machine learning model;
determining, according to the first prediction data, development trend data used to characterize the development trend of the prediction object.
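The steps of the first aspect can be sketched end to end as follows. This is a hedged illustration only: an ordinary least-squares fit of one coefficient per factor stands in for the machine learning model, whose type is not fixed by the steps above, and every name and number in the snippet (`fit_factor_coefficients`, `sample_history`, the factor values) is hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("factor data is collinear; cannot fit coefficients")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_factor_coefficients(rows, targets):
    """Least-squares coefficients (normal equations), one per factor."""
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, factors):
    """First prediction data: weighted sum of the second historical factor data."""
    return sum(k * x for k, x in zip(coefficients, factors))


# First historical factor data of two sample objects in the same category:
# each entry is (factor values, observed development value).
sample_history = {
    "sample_a": [((1.0, 2.0), 8.0), ((2.0, 1.0), 7.0)],
    "sample_b": [((3.0, 5.0), 21.0), ((4.0, 3.0), 17.0)],
}
rows = [f for records in sample_history.values() for f, _ in records]
targets = [y for records in sample_history.values() for _, y in records]

coefficients = fit_factor_coefficients(rows, targets)
# Second historical factor data of the prediction object (hypothetical).
first_prediction = predict(coefficients, (5.0, 4.0))
```

The synthetic targets were generated as 2 times the first factor plus 3 times the second, so the fitted coefficients recover those values and the prediction for (5, 4) is 22.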
Optionally,
before the determining, according to the first prediction data, of the development trend data used to characterize the development trend of the prediction object, the method further includes:
fitting a polynomial function with the historical development data of at least two of the sample objects, wherein the historical development data of each sample object satisfies the polynomial function;
inputting the historical development data of the prediction object into the polynomial function to obtain the second prediction data output by the polynomial function;
the determining, according to the first prediction data, of the development trend data used to characterize the development trend of the prediction object includes:
determining the development trend data of the prediction object according to the first prediction data and the second prediction data.
Optionally,
before the determining of the development trend data of the prediction object according to the first prediction data and the second prediction data, the method further includes:
fitting a time series model with the historical development data of at least two of the sample objects, wherein the change over time of the historical development data of each sample object conforms to the time series model;
inputting the historical development data of the prediction object into the time series model to obtain the third prediction data output by the time series model;
the determining of the development trend data of the prediction object according to the first prediction data and the second prediction data includes:
determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data.
Optionally,
the fitting of a polynomial function with the historical development data of at least two of the sample objects includes:
determining the prediction period for predicting the development trend of the prediction object;
separately extracting, according to the prediction period, the first historical development data corresponding to each statistical period from the historical development data of each sample object, wherein the statistical period corresponds to the prediction period in time span;
fitting the following polynomial function according to each piece of first historical development data corresponding to each statistical period, wherein each piece of first historical development data corresponding to each sample object satisfies the polynomial function;
Figure PCTCN2019103060-appb-000001
where $M$ represents the second prediction data relative to the current time; $k_i$ represents a weight coefficient fitted by machine learning; $x$ represents the first historical development data corresponding to the statistical period immediately preceding the current time; $x_i$ represents the first historical development data corresponding to the (i+1)-th statistical period before the current time; and $t+1$ represents the number of statistical periods before the current time.
Optionally,
the fitting of a time series model with the historical development data of at least two of the sample objects includes:
determining the prediction period for predicting the development trend of the prediction object;
separately extracting, according to the prediction period, the second historical development data corresponding to each statistical period from the historical development data of each sample object, wherein the statistical period corresponds to the prediction period in time span;
fitting a time series model according to each piece of second historical development data corresponding to each statistical period, wherein the change over time of each piece of second historical development data corresponding to each sample object conforms to the time series model;
the form of the time series model is as follows:
$$(\Delta M_t)^2 = K + k_1(\Delta M_{t-1})^2 - k_2(\Delta M_{t-2})^2 + \varepsilon_t - k_3\,\varepsilon_{t-1}$$
where $\Delta M_t$ represents the difference between the third prediction data relative to the current time and the second historical development data corresponding to the statistical period immediately preceding the current time; $\Delta M_{t-1}$ represents the difference between the second historical development data corresponding to the statistical period immediately preceding the current time and the second historical development data corresponding to the second statistical period before the current time; $\Delta M_{t-2}$ represents the difference between the second historical development data corresponding to the second statistical period before the current time and the second historical development data corresponding to the third statistical period before the current time; $\varepsilon_t$ represents the third prediction data relative to the current time; $\varepsilon_{t-1}$ represents the second historical development data corresponding to the statistical period immediately preceding the current time; and $K$, $k_1$, $k_2$ and $k_3$ are all weight coefficients fitted by machine learning.
Optionally, the fitting of a time series model according to each piece of second historical development data corresponding to each statistical period includes:
performing second-order differencing on each piece of second historical development data corresponding to each statistical period to obtain the corresponding difference sequence;
defining, according to the difference sequence, the target equation corresponding to the model by the list method;
solving the target equation to obtain the estimation result of the model;
checking the fitting effect of the model based on goodness of fit;
checking the residual of the model after it is determined that the fitting effect of the model reaches the preset target;
determining the model as the time series model when it is determined that the residual fluctuation of the model is within the preset fluctuation range.
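The sequence of checks above can be sketched as follows. This is a simplified illustration under stated assumptions: a first-order autoregression on the twice-differenced series stands in for the model estimation, goodness of fit is measured by R squared, and the residual check uses the largest absolute residual. The function names, the thresholds `min_r2` and `max_abs_residual`, and the sample series are all hypothetical.

```python
def second_difference(series):
    """Apply differencing twice to a chronological series."""
    first = [b - a for a, b in zip(series, series[1:])]
    return [b - a for a, b in zip(first, first[1:])]


def fit_ar1(diffs):
    """Least-squares fit of d[t] = c + phi * d[t-1] on the difference sequence."""
    x, y = diffs[:-1], diffs[1:]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((v - mean_x) ** 2 for v in x)
    if sxx == 0:
        raise ValueError("difference sequence is constant; nothing to fit")
    phi = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - phi * mean_x, phi


def accept_model(diffs, c, phi, min_r2, max_abs_residual):
    """Check goodness of fit first, then the residual range."""
    y = diffs[1:]
    residuals = [obs - (c + phi * prev) for obs, prev in zip(y, diffs[:-1])]
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum(r * r for r in residuals)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    if r2 < min_r2:
        return False  # fitting effect has not reached the preset target
    return max(abs(r) for r in residuals) <= max_abs_residual


# Hypothetical quarterly values whose second differences follow d[t] = 1 + 0.5 * d[t-1].
series = [100.0, 100.0, 110.0, 126.0, 146.0, 169.0, 194.5, 222.25]
diffs = second_difference(series)
c, phi = fit_ar1(diffs)
accepted = accept_model(diffs, c, phi, min_r2=0.9, max_abs_residual=0.01)
```

In practice a library ARIMA implementation would replace `fit_ar1`; the sketch only shows the order of operations: difference, estimate, check goodness of fit, then check the residuals before accepting the model.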
Optionally, the determining of the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data includes:
performing a weighted calculation on the first prediction data, the second prediction data, and the third prediction data to obtain the development trend data of the prediction object.
Optionally, the training of the machine learning model corresponding to the object category with each piece of extracted first historical factor data includes:
for each factor, obtaining, from the first historical factor data corresponding to that factor, at least one piece of factor data corresponding to that factor for each of at least the past two years;
using the factor data corresponding to each factor as samples to train the factor coefficient corresponding to each factor;
using the obtained factor coefficients to construct the following formula for calculating the first prediction data;
Figure PCTCN2019103060-appb-000002
where $M'$ represents the first prediction data; $n$ represents the number of factors; $m$ represents the number of historical years covered by the first historical factor data; $x_{(i,1)}$ represents the factor data of the prediction object corresponding to the $i$-th factor in the first preceding year; $x_{(i,2)}$ represents the factor data of the prediction object corresponding to the $i$-th factor in the second preceding year; $k_i$ represents the factor coefficient corresponding to the $i$-th factor at the current time; $x_{(i,j)}$ represents the factor data of the prediction object corresponding to the $i$-th factor in the $j$-th preceding year;
constructing the machine learning model that includes the formula.
In a second aspect, an embodiment of the present application further provides a development trend data acquisition device, including:
a category identification module, a factor identification module, a data acquisition module, a first data extraction module, a model training module, a second data extraction module, a model processing module, and a data processing module;
the category identification module is configured to determine the object category to which a prediction object belongs;
the factor identification module is configured to determine at least one factor corresponding to the object category determined by the category identification module, wherein different factors correspond to different data statistics rules;
the data acquisition module is configured to take at least two objects belonging to the object category determined by the category identification module as sample objects, and to obtain the historical development data of each sample object;
the first data extraction module is configured to extract, from the historical development data of each sample object obtained by the data acquisition module, the first historical factor data corresponding to each factor determined by the factor identification module;
the model training module is configured to train a machine learning model corresponding to the object category with the first historical factor data extracted by the first data extraction module;
the second data extraction module is configured to extract, from the historical development data of the prediction object, the second historical factor data corresponding to each factor determined by the factor identification module;
the model processing module is configured to input the second historical factor data extracted by the second data extraction module into the machine learning model trained by the model training module, and to obtain the first prediction data output by the machine learning model;
the data processing module is configured to determine, according to the first prediction data obtained by the model processing module, development trend data characterizing the development trend of the prediction object.
In a third aspect, an embodiment of the present application further provides a computer device including a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the development trend data acquisition method according to any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application further provides a non-volatile computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the development trend data acquisition method according to any implementation of the first aspect.
With the development trend data acquisition method, device, computer device, and non-volatile computer-readable storage medium provided by the embodiments of the present application, after the object category to which the prediction object belongs is determined, one or more factors corresponding to the object category are determined. The historical development data of at least two sample objects belonging to the object category is then obtained, the first historical factor data corresponding to each factor is extracted from the historical development data of each sample object, and the second historical factor data corresponding to each factor is extracted from the historical development data of the prediction object. The first historical factor data is then used to train a machine learning model corresponding to the object category of the prediction object, and the second historical factor data is input into the trained machine learning model to obtain the first prediction data, from which the development trend data of the prediction object can be determined. Thus, when a listed company is the prediction object and revenue forecast data is the development trend data, the revenue of the listed company is forecast from the historical revenue data of multiple sample companies in the same industry category as the listed company. This places low timeliness requirements on the first historical factor data corresponding to each factor, so analysts do not need to collect real-time data for each factor promptly, which reduces the cost to analysts of forecasting the revenue of listed companies.
Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for forecasting the revenue of a listed company provided by an embodiment of the present application;
FIG. 2 is a flowchart of a polynomial function fitting method provided by an embodiment of the present application;
FIG. 3 is a flowchart of a time series model fitting method provided by an embodiment of the present application;
FIG. 4 is a flowchart of another method for forecasting the revenue of a listed company provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a device for forecasting the revenue of a listed company provided by an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application. It should be understood that the specific embodiments described here are only used to explain the present application and are not intended to limit it.
The development trend data acquisition method and device provided by the embodiments of the present application are described in detail below, taking a listed company as the prediction object and the revenue forecast data of the listed company as the development trend data. Specifically, the development trend data acquisition method corresponds to a method for forecasting the revenue of a listed company, and the development trend data acquisition device corresponds to a device for forecasting the revenue of a listed company.
As shown in FIG. 1, an embodiment of the present application provides a method for forecasting the revenue of a listed company, including:
Step 101: determine the industry category of the listed company whose revenue is to be forecast;
Step 102: determine at least one factor corresponding to the industry category, wherein different factors correspond to different data statistics rules;
Step 103: take at least two companies belonging to the industry category as sample companies, and obtain the historical revenue data of each sample company;
Step 104: extract the first historical factor data corresponding to each factor from the historical revenue data of each sample company;
Step 105: train a machine learning model corresponding to the industry category with the extracted first historical factor data;
Step 106: extract the second historical factor data corresponding to each factor from the historical revenue data of the listed company;
Step 107: input the second historical factor data into the machine learning model, and obtain the first revenue forecast result output by the machine learning model;
Step 108: determine the forecast revenue data of the listed company according to the first revenue forecast result.
With the method for forecasting the revenue of a listed company provided by the embodiments of the present application, after the industry category of the listed company whose revenue is to be forecast is determined, one or more factors corresponding to the industry category are determined. The historical revenue data of at least two sample companies belonging to the industry category is then obtained, the first historical factor data corresponding to each factor is extracted from the historical revenue data of each sample company, and the second historical factor data corresponding to each factor is extracted from the historical revenue data of the listed company. The first historical factor data is then used to train a machine learning model corresponding to the industry category of the listed company, and the second historical factor data is input into the trained machine learning model to obtain the first revenue forecast result, from which the forecast revenue data of the listed company can be determined. Thus, the revenue of the listed company is forecast from the historical revenue data of multiple sample companies in the same industry category. This places low timeliness requirements on the first historical factor data corresponding to each factor, so analysts do not need to collect real-time data for each factor promptly, which reduces the cost to analysts of forecasting the revenue of listed companies.
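The control flow of steps 104 to 108 can be sketched as follows. This is only an illustrative skeleton: the function name `forecast_revenue` and the two callables `extract_factors` and `train_model` are hypothetical placeholders, steps 101 to 103 are assumed to have been completed by the caller (the sample histories already belong to the right industry category), and step 108 is shown in its simplest form, where the first revenue forecast result is used directly.

```python
from typing import Callable, List, Sequence

Factors = List[float]
Model = Callable[[Factors], float]


def forecast_revenue(
    sample_histories: Sequence[Sequence[float]],
    target_history: Sequence[float],
    extract_factors: Callable[[Sequence[float]], Factors],
    train_model: Callable[[List[Factors]], Model],
) -> float:
    """Hypothetical skeleton of steps 104-108 for one listed company."""
    # Step 104: first historical factor data from every sample company.
    first_factor_data = [extract_factors(h) for h in sample_histories]
    # Step 105: train the model for this industry category.
    model = train_model(first_factor_data)
    # Step 106: second historical factor data from the listed company itself.
    second_factor_data = extract_factors(target_history)
    # Step 107: first revenue forecast result output by the model.
    first_result = model(second_factor_data)
    # Step 108 (simplest case): use the first result as the forecast revenue.
    return first_result
```

Passing the extraction and training routines as parameters keeps the skeleton independent of any particular factor definition or model type.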
In the embodiments of the present application, the at least one factor corresponding to the industry category of the listed company is determined according to the cycle on which revenue is forecast for that industry category. For example, if revenue data in the listed company's industry category is usually compiled quarterly, forecasting the listed company's revenue usually means forecasting its revenue for the next quarter. Accordingly, the previous quarter's revenue, the previous quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year are determined as the four factors corresponding to the industry category of the listed company.
For example, after the previous quarter's revenue, the previous quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year are determined as the four factors, 3,000 companies belonging to the industry category of the listed company are determined as sample companies, and the historical revenue data of each sample company for the past ten years is obtained. From this data, the quarterly revenue and quarterly total assets of each sample company for every quarter of each of the past ten years are obtained as the first historical factor data. The machine learning model is then trained with the 240,000 (3000*10*4*2) extracted first historical factor data, yielding the machine learning model corresponding to the industry category of the listed company. After that, the previous quarter's revenue, the previous quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year of the listed company to be forecast are input into the machine learning model, and the next-quarter forecast revenue output by the machine learning model is taken as the first revenue forecast result.
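As a minimal sketch of the example above, the following code checks the sample count and extracts the four factors for one company. The data layout (a dictionary keyed by `(year, quarter)` holding `(revenue, total_assets)`) and the function names are assumptions made for illustration, as is the reading that "the same quarter last year" is taken relative to the quarter being forecast.

```python
from typing import Dict, List, Tuple

Quarter = Tuple[int, int]              # (year, quarter in 1..4)
Record = Tuple[float, float]           # (revenue, total_assets)


def previous_quarter(q: Quarter) -> Quarter:
    """Return the quarter immediately before q."""
    year, quarter = q
    return (year - 1, 4) if quarter == 1 else (year, quarter - 1)


def extract_four_factors(history: Dict[Quarter, Record], target: Quarter) -> List[float]:
    """Return [previous-quarter revenue, previous-quarter total assets,
    same-quarter-last-year revenue, same-quarter-last-year total assets]."""
    prev_revenue, prev_assets = history[previous_quarter(target)]
    ly_revenue, ly_assets = history[(target[0] - 1, target[1])]
    return [prev_revenue, prev_assets, ly_revenue, ly_assets]


# 3,000 companies x 10 years x 4 quarters x 2 values (revenue, total assets)
sample_count = 3000 * 10 * 4 * 2
```

For instance, with records for 2018 Q2 and 2019 Q1, `extract_four_factors(history, (2019, 2))` returns the 2019 Q1 values followed by the 2018 Q2 values.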
Optionally, on the basis of the method for forecasting the revenue of a listed company shown in FIG. 1, since the first historical revenue data reflects the past revenue of each sample company, factor coefficients corresponding to the different factors can be determined from it. The determined factor coefficients reflect the revenue trend of the sample companies over the years and can be used to construct the machine learning model. The constructed model then processes the second historical factor data of the listed company to forecast its revenue and obtain the first revenue forecast result. Specifically, the method for constructing the machine learning model may include the following steps:
S1: for each factor, obtain, from the first historical factor data corresponding to that factor, at least one factor data corresponding to that factor for each of at least two past years;
S2: use the obtained factor data as samples to train a factor coefficient corresponding to each factor;
S3: use the obtained factor coefficients to construct the following formula for calculating the first revenue forecast result:
Figure PCTCN2019103060-appb-000003
where $M'$ represents the first revenue forecast result; $n$ represents the number of factors; $m$ represents the number of historical years covered by the first historical factor data; $x_{(i,1)}$ represents the factor data of the listed company corresponding to the $i$-th factor in the first preceding year; $x_{(i,2)}$ represents the factor data of the listed company corresponding to the $i$-th factor in the second preceding year; $k_i$ represents the factor coefficient corresponding to the $i$-th factor at the current time; $x_{(i,j)}$ represents the factor data of the listed company corresponding to the $i$-th factor in the $j$-th preceding year;
S4: construct the machine learning model that includes the above formula.
For example, suppose the four factors determined for the listed company are the previous quarter's revenue, the previous quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year, and the first historical factor data obtained is the revenue data of 3,000 companies over the past 10 years. Then 10*3000*2, a total of 60,000, factor data can be determined for the previous quarter's revenue factor, and likewise 60,000 factor data can be determined for each of the previous quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year. These 240,000 factor data are then used as sample data for machine learning to fit four factor coefficients, one for each of the four factors. The four fitted factor coefficients are then substituted into the above formula, together with the listed company's historical revenue data for the four factors over the preceding years, to calculate the first revenue forecast result for the listed company.
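One way to fit one coefficient per factor is ordinary least squares. The sketch below is an assumption for illustration only: the source does not state which fitting procedure is used, and its formula is given only as a figure, so the linear form (forecast = sum of coefficient times factor value) and the function names here are hypothetical. It uses only the standard library.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; factors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            ratio = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= ratio * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_factor_coefficients(rows: Sequence[Sequence[float]],
                            targets: Sequence[float]) -> List[float]:
    """Least-squares fit via the normal equations (X^T X) k = X^T y."""
    n = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(n)]
    return solve_linear_system(xtx, xty)


def predict(coefficients: Sequence[float], factors: Sequence[float]) -> float:
    """Assumed linear form: sum of coefficient * factor value."""
    return sum(k * x for k, x in zip(coefficients, factors))
```

Each row holds the factor values of one sample observation and the matching target is the revenue that followed it. With four factors, `fit_factor_coefficients` returns four coefficients, matching the count in the example above.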
In the embodiments of the present application, the historical revenue data of the sample companies is used to fit a factor coefficient for each factor, and the fitted factor coefficients are used to construct the machine learning model corresponding to the industry category of the listed company. This model reflects the revenue trend of that industry category and can therefore be used to forecast the revenue of the listed company. Because the forecast draws on both the revenue trends of other companies in the same industry category and the historical revenue of the listed company itself, the revenue of the listed company can be forecast more accurately.
Optionally, after the machine learning model is trained and the second historical factor data extracted from the historical revenue data of the listed company is input into it to obtain the first revenue forecast result, the first revenue forecast result can be used directly as the forecast revenue data of the listed company, or it can be combined with revenue forecast results obtained by other forecasting methods to determine the forecast revenue data. When results from other forecasting methods are combined, the forecast revenue data of the listed company can be determined in either of the following two ways:
Method 1: combine the first revenue forecast result obtained through the machine learning model with a second revenue forecast result obtained through a polynomial function to determine the forecast revenue data of the listed company;
Method 2: combine the first revenue forecast result obtained through the machine learning model, the second revenue forecast result obtained through the polynomial function, and a third revenue forecast result obtained through a time series model to determine the forecast revenue data of the listed company.
The two methods of determining forecast revenue data by combining multiple revenue forecast results are described separately below.
For Method 1:
On the basis of the method for forecasting the revenue of a listed company shown in FIG. 1, after the first revenue forecast result is obtained through the machine learning model in step 107 and before the forecast revenue data of the listed company is determined in step 108, a polynomial function can be fitted to the historical revenue data of the sample companies so that the historical revenue data of every sample company satisfies the fitted polynomial function. The historical revenue data of the listed company is then input into the fitted polynomial function to obtain the second revenue forecast result output by the polynomial function. Accordingly, in step 108 the forecast revenue data of the listed company can be determined from both the first and second revenue forecast results. Specifically, the weighted average of the first and second revenue forecast results can be calculated and used as the forecast revenue data of the listed company.
A polynomial function is fitted to the historical revenue data of the sample companies, the historical revenue data of the listed company is input into the polynomial function to obtain the second revenue forecast result, and the forecast revenue data of the listed company is then determined from the first and second revenue forecast results. Because two forecasting methods, the machine learning model and the polynomial function, are combined, the accuracy of the revenue forecast for the listed company can be improved.
In the embodiments of the present application, the polynomial function is fitted to the historical revenue data of the sample companies. As shown in FIG. 2, the fitting process can be implemented through the following steps:
Step 201: determine the forecast period for forecasting the revenue of the listed company;
Step 202: according to the determined forecast period, extract from the historical revenue data of each sample company the first historical revenue data corresponding to each statistical period, wherein the statistical period corresponds to the forecast period in time span;
Step 203: according to the first historical revenue data corresponding to each statistical period, fit the following polynomial function so that every first historical revenue data of every sample company satisfies the polynomial function:
Figure PCTCN2019103060-appb-000004
where $M$ represents the revenue forecast result relative to the current time; $k_i$ represents a weight coefficient fitted by machine learning; $x$ represents the first historical revenue data of the statistical period immediately preceding the current time; $x_i$ represents the first historical revenue data of the $(i+1)$-th statistical period preceding the current time; $t+1$ represents the number of statistical periods before the current time.
For example, if the revenue of the listed company for the next quarter is to be forecast, the forecast period is a quarter, and accordingly the statistical period for compiling the historical revenue data of the sample companies is also a quarter. For each sample company, its revenue in each past quarter is obtained from its historical revenue data as the first historical revenue data. By fitting the first historical revenue data of all sample companies, a polynomial function is obtained that every first historical revenue data of every sample company satisfies.
After the polynomial function is fitted, the data of the listed company for each statistical period is obtained from its historical revenue data and input into the polynomial function to obtain the second revenue forecast result output by the polynomial function. For example, after the forecast period is determined to be a quarter, the revenue data of the listed company for each past quarter is obtained from its historical revenue data and input into the fitted polynomial function to obtain the second revenue forecast result for the listed company.
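The data preparation for steps 202 and 203 can be sketched as below. The fitted function itself appears in the source only as a figure, so this sketch assumes, based on the symbol definitions alone, that the forecast is a weighted sum of the most recent period revenues; that form, the function names, and the example weights are all hypothetical.

```python
from typing import List, Sequence, Tuple


def lagged_rows(revenues: Sequence[float], lags: int) -> List[Tuple[List[float], float]]:
    """Turn one company's per-period revenues (oldest first) into training
    pairs: (the `lags` preceding revenues, most recent first; the revenue
    that followed them)."""
    rows = []
    for end in range(lags, len(revenues)):
        window = list(reversed(revenues[end - lags:end]))
        rows.append((window, revenues[end]))
    return rows


def weighted_lag_forecast(revenues: Sequence[float], weights: Sequence[float]) -> float:
    """Assumed form: weights[0] applies to the latest period, weights[1] to
    the one before it, and so on."""
    if len(weights) > len(revenues):
        raise ValueError("not enough history for the given number of weights")
    recent = list(reversed(revenues[-len(weights):]))
    return sum(k * x for k, x in zip(weights, recent))
```

Training pairs from all sample companies would be pooled before the weights are fitted; the fitted weights are then applied to the listed company's own quarterly revenues to obtain the second revenue forecast result.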
After the second revenue forecast result is obtained through the fitted polynomial function, a weighted operation is performed on the first and second revenue forecast results according to weight values set in advance for each, and the result is taken as the forecast revenue data of the listed company. Specifically, the weight values for the first and second revenue forecast results can both be set to 0.5, in which case the equally weighted average of the two results is taken as the forecast revenue data of the listed company.
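The weighted operation can be sketched as a normalized weighted average. The function name `combine_forecasts` is hypothetical, and dividing by the sum of the weights is an assumption made so that weights that do not sum to 1 still produce an average.

```python
from typing import Sequence


def combine_forecasts(results: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of several revenue forecast results."""
    if len(results) != len(weights):
        raise ValueError("each forecast result needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(r * w for r, w in zip(results, weights)) / total_weight
```

With both weights set to 0.5, `combine_forecasts([100.0, 120.0], [0.5, 0.5])` returns 110.0. The same function accepts three results when a third forecast is also available.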
For Method 2:
On the basis of the method for determining forecast revenue data provided in Method 1, after the first revenue forecast result is obtained through the machine learning model and the second revenue forecast result is obtained through the polynomial function, a time series model can be fitted to the historical revenue data of the sample companies so that the way each sample company's historical revenue data changes over time conforms to the fitted time series model. The historical revenue data of the listed company is then input into the fitted time series model to obtain the third revenue forecast result output by the time series model. Accordingly, the forecast revenue data of the listed company can be determined from the first, second, and third revenue forecast results.
The machine learning model, the polynomial function, and the time series model are obtained from the historical revenue data of the sample companies, and the historical revenue data of the listed company is input into each of them to obtain the first, second, and third revenue forecast results for the listed company, from which its forecast revenue data is determined. Because three forecasting methods, the machine learning model, the polynomial function, and the time series model, are combined, the accuracy of the revenue forecast for the listed company can be further improved.
In the embodiments of the present application, the time series model can be fitted to the historical revenue data of the sample companies. As shown in FIG. 3, the fitting process can be implemented through the following steps:
Step 301: determine the forecast period for forecasting the revenue of the listed company;
Step 302: according to the forecast period, extract from the historical revenue data of each sample company the second historical revenue data corresponding to each statistical period, wherein the statistical period corresponds to the forecast period in time span;
Step 303: fit a time series model to the second historical revenue data corresponding to each statistical period, so that the way the second historical revenue data of each sample company changes over time satisfies the time series model;
The time series model has the following form:
$$(\Delta M_t)^2 = K + k_1 (\Delta M_{t-1})^2 - k_2 (\Delta M_{t-2})^2 + \varepsilon_t - k_3 \varepsilon_{t-1}$$
where $\Delta M_t$ represents the difference between the revenue forecast result relative to the current time and the second historical revenue data of the statistical period immediately preceding the current time; $\Delta M_{t-1}$ represents the difference between the second historical revenue data of the statistical period immediately preceding the current time and that of the second statistical period before the current time; $\Delta M_{t-2}$ represents the difference between the second historical revenue data of the second statistical period before the current time and that of the third statistical period before the current time; $\varepsilon_t$ represents the revenue forecast result relative to the current time; $\varepsilon_{t-1}$ represents the second historical revenue data of the statistical period immediately preceding the current time; $K$, $k_1$, $k_2$, and $k_3$ are weight coefficients fitted by machine learning.
For example, if the revenue of the listed company for the next quarter is to be forecast, the forecast period is a quarter, and accordingly the statistical period for compiling the historical revenue data of the sample companies is also a quarter. For each sample company, its revenue in each past quarter is obtained from its historical revenue data as the second historical revenue data. By fitting the second historical revenue data of all sample companies, a time series model is obtained that the change over time of every sample company's second historical revenue data satisfies.
After the time series model is fitted, the data of the listed company for each statistical period is obtained from its historical revenue data and input into the time series model to obtain the third revenue forecast result output by the time series model. For example, after the forecast period is determined to be a quarter, the revenue data of the listed company for each past quarter is obtained from its historical revenue data and input into the fitted time series model to obtain the third revenue forecast result for the listed company.
需要说明的是,在实际业务实现过程中,用户拟合多项式函数的第一历史营收数据和用于拟合时间序列模型的第二历史营收数据可以是相同的数据。It should be noted that, in the actual business realization process, the first historical revenue data for the user fitting a polynomial function and the second historical revenue data for fitting the time series model may be the same data.
可选地,在上述方式二所提供的确定上市公司预测营收数据的方法的基础上,在获得第一营收预测结果、第二营收预测结果和第三营收预测结果之后,可以对第一营收预测结果、第二营收预测结果和第三营收预测结果进行加权运算,将加权运算的结果作为上市公司的预测营收数据。Optionally, on the basis of the method for determining the predicted revenue data of listed companies provided in the second method, after obtaining the first revenue prediction result, the second revenue prediction result, and the third revenue prediction result, the The first revenue prediction result, the second revenue prediction result, and the third revenue prediction result are weighted, and the result of the weighted calculation is used as the predicted revenue data of the listed company.
具体地,可以根据实际需求对第一营收预测结果、第二营收预测结果和第三营收预测结果设置相对应的加权系数,以控制各个营收预测结果对最终预测营收数据的权重。例如,可以将第一营收预测结果对应的加权系数设置为0.4,将第二营收预测结果对应的加权系数设置为0.3,将第三营收预测结果对应的加权系数设置为0.3。Specifically, corresponding weighting coefficients can be set for the first revenue forecast result, the second revenue forecast result, and the third revenue forecast result according to actual needs to control the weight of each revenue forecast result to the final forecast revenue data . For example, the weighting coefficient corresponding to the first revenue prediction result can be set to 0.4, the weighting coefficient corresponding to the second revenue prediction result is set to 0.3, and the weighting coefficient corresponding to the third revenue prediction result is set to 0.3.
分别为第一营收预测结果、第二营收预测结果和第三营收预测结果设置相对应的加权系数,根据预先所设定的加权系数对第一营收预测结果、第二营收预测结果和第三营收预测结果进行加权运算,将加权运算的结果作为上市公司的预测营收数据。通过设置加权系统,一方面可以平衡三个营收预测结果对运算结果的影响程度,权衡三种营收预测方法的利弊,使得最终所获得的预测营收数据更加准确,另一方面使得用户可以根据需求自行调节加权系数,从而满足不同用户的个性化需求,提升该上市公司营收预测方法的适用性。Set the corresponding weighting coefficients for the first revenue forecast result, the second revenue forecast result and the third revenue forecast result respectively, and the first revenue forecast result and the second revenue forecast result according to the preset weighting coefficient The result and the third revenue forecast result are weighted, and the result of the weighted calculation is used as the forecast revenue data of the listed company. By setting up a weighting system, on the one hand, it is possible to balance the impact of the three revenue forecast results on the calculation results, weigh the pros and cons of the three revenue forecast methods, and make the final forecasted revenue data more accurate. Adjust the weighting coefficient according to the needs, so as to meet the individual needs of different users, and improve the applicability of the listed company's revenue forecasting method.
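The weighted operation described above can be illustrated with a minimal Python sketch. The function name `combine_forecasts`, the sample forecast values, and the check that the weights sum to 1 are assumptions made for illustration; the application only states that a weighting coefficient is set for each prediction result.

```python
def combine_forecasts(forecasts, weights):
    """Return the weighted sum of several revenue prediction results."""
    if len(forecasts) != len(weights):
        raise ValueError("one weighting coefficient is needed per forecast")
    # Assumption: the coefficients are normalized so that they sum to 1.
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weighting coefficients are expected to sum to 1")
    return sum(f * w for f, w in zip(forecasts, weights))


# Hypothetical first, second and third prediction results (same unit),
# combined with the example coefficients 0.4, 0.3 and 0.3.
predicted_revenue = combine_forecasts([120.0, 110.0, 100.0], [0.4, 0.3, 0.3])
```

With these hypothetical inputs the result is 0.4 × 120 + 0.3 × 110 + 0.3 × 100 = 111.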
Optionally, on the basis of the method for fitting the time series model shown in FIG. 3, when the time series model is fitted in step 303 to the second historical revenue data corresponding to the respective statistical periods, an ARIMA time series model may specifically be fitted, and the fitting process of the ARIMA time series model may be implemented as follows:
Perform second-order differencing on the second historical revenue data corresponding to the respective statistical periods to obtain a corresponding difference sequence;
According to the difference sequence, define a target equation corresponding to the model by using the tabulation method;
Solve the target equation to obtain an estimation result of the model;
Check the fitting effect of the model based on the goodness of fit;
After it is determined that the fitting effect of the model reaches a preset target, check the residuals of the model;
When it is determined that the residual fluctuation of the model is within a preset fluctuation range, determine the model as the time series model.
In the embodiments of the present application, in the process of fitting the time series model, the fitting effect of the model is first checked based on the goodness of fit; after the fitting effect reaches the preset target, the linear residuals of the model are checked; and after the linear residuals of the model fall within the preset fluctuation range, the model is determined as the time series model. By checking both the fitting effect and the linear residuals of the model, and determining the model as the time series model only after both satisfy the preset conditions, the accuracy of the generated time series model is ensured, which in turn ensures the accuracy of the third revenue prediction result obtained through the time series model.
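The differencing and the two acceptance checks in the procedure above can be sketched in Python as follows. This is a simplified illustration, not the fitting procedure of the application: the estimation of the model itself is omitted, the coefficient of determination R² is assumed as the goodness-of-fit measure, and the thresholds `min_r2` and `max_abs_residual` are hypothetical stand-ins for the "preset target" and "preset fluctuation range".

```python
def second_difference(series):
    """Difference a series twice: d2[t] = x[t] - 2*x[t-1] + x[t-2]."""
    first = [b - a for a, b in zip(series, series[1:])]
    return [b - a for a, b in zip(first, first[1:])]


def r_squared(actual, fitted):
    """Coefficient of determination, used here as the goodness of fit."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((a - mean) ** 2 for a in actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return 1.0 - ss_res / ss_tot if ss_tot else 0.0


def accept_model(actual, fitted, min_r2=0.9, max_abs_residual=5.0):
    """Accept a candidate model only if both checks pass, in order."""
    # Check 1: the fitting effect must reach the preset target.
    if r_squared(actual, fitted) < min_r2:
        return False
    # Check 2: every residual must stay inside the preset range.
    residuals = [a - f for a, f in zip(actual, fitted)]
    return all(abs(r) <= max_abs_residual for r in residuals)


# Hypothetical quarterly revenues and the fitted values of some model.
diffs = second_difference([1, 4, 9, 16, 25])
accepted = accept_model([10, 20, 30, 40], [11, 19, 31, 39])
```

Checking the goodness of fit before the residuals mirrors the order given in the text: a model that fits poorly overall is rejected without inspecting individual residuals.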
Taking the method for obtaining predicted revenue data provided in Manner 2 above as an example, the listed company revenue prediction method provided in the embodiments of the present application is described in further detail below. As shown in FIG. 4, the method may include the following steps:
Step 401: Determine the industry category to which the listed company requiring revenue prediction belongs.
In the embodiments of the present application, when the revenue of a listed company needs to be predicted, the industry category to which the listed company belongs is determined first.
For example, if the revenue of listed company A in the next quarter needs to be predicted, the industry category A to which listed company A belongs is determined first.
Step 402: Obtain historical revenue data of multiple sample companies belonging to the determined industry category.
In the embodiments of the present application, after the industry category to which the listed company requiring revenue prediction belongs is determined, at least two companies are randomly selected as sample companies from the companies belonging to the determined industry category, and the historical revenue data of each sample company is then obtained.
For example, 3,000 companies are selected as sample companies from the companies belonging to industry category A, and the historical revenue data of each of the 3,000 sample companies over the past 10 years is then obtained. If a sample company was established less than 10 years ago, all of the historical revenue data of that sample company is obtained.
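The sample selection in step 402 can be sketched in Python as follows. The function names `select_samples` and `history_window`, the `seed` parameter, and the representation of a company's history as a list of yearly records ordered from oldest to newest are assumptions made for illustration.

```python
import random


def select_samples(companies, k, seed=None):
    """Randomly pick k sample companies from one industry category."""
    if k < 2:
        raise ValueError("at least two sample companies are required")
    rng = random.Random(seed)
    return rng.sample(companies, min(k, len(companies)))


def history_window(yearly_records, years=10):
    """Keep the most recent `years` records, or all of them if the
    company has a shorter history."""
    return yearly_records[-years:]


# Hypothetical industry category with 100 company identifiers.
samples = select_samples(list(range(100)), 5, seed=0)
recent = history_window(list(range(2005, 2020)))
short_history = history_window([2017, 2018, 2019])
```

The slice `yearly_records[-years:]` returns the whole list when fewer than `years` records exist, which matches the rule for companies established less than 10 years ago.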
Step 403: Train a machine learning model with the historical revenue data of each sample company.
In the embodiments of the present application, after the industry category to which the listed company requiring revenue prediction belongs is determined, at least one factor corresponding to the industry category is determined. Then, for each sample company, the first historical factor data corresponding to each factor is extracted from the historical revenue data of the sample company. The extracted first historical factor data is then used to train a machine learning model for the determined industry category. The trained machine learning model can predict future revenue data from historical revenue data. Specifically, the machine learning model may include the following formula:
Figure PCTCN2019103060-appb-000005
Where $M'$ represents the first revenue prediction result; $n$ represents the number of factors; $m$ represents the number of historical years covered by the first historical factor data; $x_{(i,1)}$ represents the factor data of the listed company corresponding to the $i$-th factor in the first preceding year; $x_{(i,2)}$ represents the factor data of the listed company corresponding to the $i$-th factor in the second preceding year; $k_i$ represents the factor coefficient corresponding to the $i$-th factor at the current time; and $x_{(i,j)}$ represents the factor data of the listed company corresponding to the $i$-th factor in the $j$-th preceding year.
For example, the previous quarter's revenue, the previous quarter's total assets, the revenue of the same quarter last year, and the total assets of the same quarter last year are determined as the four factors corresponding to industry category A. Then, for each of the 3,000 sample companies, the quarterly revenue and quarterly total assets of every quarter of every year in the past 10 years are extracted from the sample company's historical revenue data of the past 10 years as the first historical factor data. In this way, 3000*10*4*2 = 240,000 items of first historical factor data can be extracted, and these 240,000 items are then used to train machine learning model A.
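The factor extraction in this example can be illustrated with a short Python sketch. The function name `factor_row`, the list-based layout (quarterly values ordered from oldest to newest), and the sample numbers are assumptions made for illustration; only the four factors and the data count come from the example above.

```python
def factor_row(revenue, assets, t):
    """Return the four factor values used to predict quarter index t.

    revenue and assets hold one value per quarter, oldest first, so
    index t - 1 is the previous quarter and t - 4 is the same quarter
    of the previous year.
    """
    if t < 4:
        raise ValueError("at least four earlier quarters are needed")
    return [revenue[t - 1], assets[t - 1], revenue[t - 4], assets[t - 4]]


# Hypothetical 10-year history (40 quarters) of one company.
revenue = list(range(100, 140))
assets = list(range(500, 540))

# Factors for the next, not yet observed quarter (index 40).
next_quarter_factors = factor_row(revenue, assets, 40)

# Data count from the example: companies * years * quarters * values.
total_items = 3000 * 10 * 4 * 2
```

Here `next_quarter_factors` is `[139, 539, 136, 536]` and `total_items` is 240,000, matching the count stated in the example.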
Step 404: Process the historical revenue data of the listed company with the machine learning model to obtain the first revenue prediction result.
In the embodiments of the present application, the second historical factor data corresponding to each factor is extracted from the historical revenue data of the listed company, and the extracted second historical factor data is then input into the trained machine learning model. The machine learning model predicts the revenue of the listed company from the second historical factor data, and the first revenue prediction result output by the machine learning model is obtained.
For example, the historical revenue data of listed company A over the past 10 years is obtained, and the quarterly revenue and quarterly total assets of listed company A for every quarter of every year in the past 10 years are extracted from it as the second historical factor data, so that 10*4*2 = 80 items of second historical factor data can be extracted. These 80 items of second historical factor data are then input into machine learning model A to obtain the first revenue prediction result output by machine learning model A. Specifically, the 80 items of second historical factor data may be substituted into the above formula to calculate the first revenue prediction result.
Step 405: Fit a polynomial function with the historical revenue data of each sample company.
In the embodiments of the present application, according to the prediction period for the revenue prediction of the listed company, the first historical revenue data of each statistical period is extracted from the historical revenue data of each sample company, and the extracted first historical revenue data is then used to fit a polynomial function, so that the first historical revenue data corresponding to each sample company satisfies the fitted polynomial function. The statistical period corresponds to the prediction period in time span.
The form of the fitted polynomial function is as follows:
Figure PCTCN2019103060-appb-000006
Where $M$ represents the revenue prediction result for the current time; $k_i$ represents a weight coefficient fitted by machine learning; $x$ represents the first historical revenue data corresponding to the statistical period immediately before the current time; $x_i$ represents the first historical revenue data corresponding to the $(i+1)$-th statistical period before the current time; and $t+1$ represents the number of statistical periods before the current time.
For example, since the prediction period for the revenue prediction of listed company A is a quarter, for each of the 3,000 sample companies, the revenue data of every quarter of every year in the past 10 years is extracted from the sample company's historical revenue data as the first historical revenue data, so that 3000*10*4 = 120,000 items of first historical revenue data can be extracted. These 120,000 items of first historical revenue data are then used to fit a polynomial function, so that the first historical revenue data corresponding to each sample company satisfies the polynomial function.
Step 406: Process the historical revenue data of the listed company with the polynomial function to obtain the second revenue prediction result.
In the embodiments of the present application, the historical revenue data of the listed company is extracted by statistical period to obtain the historical revenue data corresponding to each statistical period, and the extracted data is then input into the polynomial function. The polynomial function predicts the revenue of the listed company from the input historical revenue data, and the second revenue prediction result output by the polynomial function is obtained.
For example, the historical revenue data of listed company A over the past 10 years is obtained, and the quarterly revenue of listed company A for every quarter of every year in the past 10 years is extracted from it, so that 10*4 = 40 quarterly revenues can be extracted. These 40 quarterly revenues are then input into the following polynomial function A to obtain the second revenue prediction result output by polynomial function A:
Figure PCTCN2019103060-appb-000007
Where, in polynomial function A, $M$ represents the second revenue prediction result; $k_i$ represents a weight coefficient fitted by machine learning; $x$ represents the quarterly revenue of listed company A in the previous quarter; and $x_i$ represents the quarterly revenue of listed company A in the $(i+1)$-th preceding quarter.
Step 407: Fit a time series model with the historical revenue data of each sample company.
In the embodiments of the present application, according to the prediction period for the revenue prediction of the listed company, the second historical revenue data of each statistical period is extracted from the historical revenue data of each sample company, and the extracted second historical revenue data is then used to fit a time series model, so that the variation over time of the second historical revenue data corresponding to each sample company satisfies the time series model. The statistical period corresponds to the prediction period in time span.
The form of the fitted time series model is as follows:
$$(\Delta M_t)^2 = K + k_1(\Delta M_{t-1})^2 - k_2(\Delta M_{t-2})^2 + \varepsilon_t - k_3\varepsilon_{t-1}$$
Where $\Delta M_t$ represents the difference between the revenue prediction result for the current time and the second historical revenue data corresponding to the statistical period immediately before the current time; $\Delta M_{t-1}$ represents the difference between the second historical revenue data corresponding to the statistical period immediately before the current time and the second historical revenue data corresponding to the second statistical period before the current time; $\Delta M_{t-2}$ represents the difference between the second historical revenue data corresponding to the second statistical period before the current time and the second historical revenue data corresponding to the third statistical period before the current time; $\varepsilon_t$ represents the revenue prediction result for the current time; $\varepsilon_{t-1}$ represents the second historical revenue data corresponding to the statistical period immediately before the current time; and $K$, $k_1$, $k_2$ and $k_3$ are weight coefficients fitted by machine learning.
For example, since the prediction period for the revenue prediction of listed company A is a quarter, for each of the 3,000 sample companies, the revenue data of every quarter of every year in the past 10 years is extracted from the sample company's historical revenue data as the second historical revenue data, so that 3000*10*4 = 120,000 items of second historical revenue data can be extracted. These 120,000 items of second historical revenue data are then used to fit a time series model, so that the second historical revenue data corresponding to each sample company satisfies the time series model.
Step 408: Process the historical revenue data of the listed company with the time series model to obtain the third revenue prediction result.
In the embodiments of the present application, the historical revenue data of the listed company is extracted by statistical period to obtain the historical revenue data corresponding to each statistical period, and the extracted data is then input into the time series model. The time series model predicts the revenue of the listed company from the input historical revenue data, and the third revenue prediction result output by the time series model is obtained.
For example, the historical revenue data of listed company A over the past 10 years is obtained, and the quarterly revenue of listed company A for every quarter of every year in the past 10 years is extracted from it, so that 10*4 = 40 quarterly revenues can be extracted. These 40 quarterly revenues are then input into the following time series model A to obtain the third revenue prediction result output by time series model A:
$$(\Delta M_{40})^2 = K + k_1(\Delta M_{39})^2 - k_2(\Delta M_{38})^2 + \varepsilon_{40} - k_3\varepsilon_{39}$$
Where $\Delta M_{40}$ represents the difference between the third revenue prediction result and the quarterly revenue of listed company A in the previous quarter; $\Delta M_{39}$ represents the difference between the quarterly revenue of listed company A in the previous quarter and its quarterly revenue in the second preceding quarter; $\Delta M_{38}$ represents the difference between the quarterly revenue of listed company A in the second preceding quarter and its quarterly revenue in the third preceding quarter; $\varepsilon_{40}$ represents the third revenue prediction result of listed company A; $\varepsilon_{39}$ represents the quarterly revenue of listed company A in the previous quarter; and $K$, $k_1$, $k_2$ and $k_3$ are weight coefficients fitted by machine learning.
Step 409: Determine the predicted revenue data of the listed company according to the first revenue prediction result, the second revenue prediction result and the third revenue prediction result.
In the embodiments of the present application, after the first revenue prediction result, the second revenue prediction result and the third revenue prediction result are obtained, a weighted operation is performed on the three results, and the result of the weighted operation is used as the predicted revenue data of the listed company.
Specifically, when the weighted operation is performed on the first revenue prediction result, the second revenue prediction result and the third revenue prediction result, a corresponding weighting coefficient may be set in advance for each of the three results. The three weighting coefficients may be equal, in which case each of them is 1/3. Alternatively, different weighting coefficients may be set for the three results. Specifically, the methods used to obtain the first, second and third revenue prediction results may each be used to predict the revenue of the listed company for a previous historical period, and the weighting coefficients are then determined from the three predictions and the actual revenue of that historical period. For example, the method for obtaining the first revenue prediction result predicts the revenue of the previous quarter as X1, the method for obtaining the second revenue prediction result predicts it as X2, the method for obtaining the third revenue prediction result predicts it as X3, and the actual revenue of the previous quarter is X4. The weighting coefficients corresponding to the first, second and third revenue prediction results can then be determined from the absolute values of the differences between X4 and each of X1, X2 and X3: the larger the absolute value of the difference, the smaller the corresponding weighting coefficient.
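One way to turn the back-test errors above into weighting coefficients is sketched below in Python. The application only requires that a larger absolute difference yields a smaller coefficient; the inverse-error rule, the normalization to a sum of 1, the function name `error_based_weights`, and the small constant `eps` (which avoids division by zero for a perfect back-test) are all assumptions made for illustration.

```python
def error_based_weights(backtests, actual, eps=1e-9):
    """Derive weighting coefficients from back-test errors.

    backtests: the predictions X1, X2, X3 for a past period.
    actual: the real revenue X4 of that period.
    """
    errors = [abs(x - actual) for x in backtests]
    # Assumed rule: weight proportional to 1 / absolute error.
    inverse = [1.0 / (e + eps) for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical back-test: X1 = 90, X2 = 105, X3 = 120, X4 = 100.
weights = error_based_weights([90.0, 105.0, 120.0], 100.0)
```

In this hypothetical case the second method has the smallest error and therefore receives the largest coefficient, and the third method, with the largest error, receives the smallest.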
For example, the predicted revenue data of the listed company is calculated by the following formula:
Figure PCTCN2019103060-appb-000008
(first revenue prediction result + second revenue prediction result + third revenue prediction result).
As shown in FIG. 5, an embodiment of the present application provides a listed company revenue prediction apparatus, including: a category identification module 501, a factor identification module 502, a data acquisition module 503, a first data extraction module 504, a model training module 505, a second data extraction module 506, a model processing module 507 and a data processing module 508;
The category identification module 501 is configured to determine the industry category to which the listed company requiring revenue prediction belongs;
The factor identification module 502 is configured to determine at least one factor corresponding to the industry category determined by the category identification module 501, where different factors correspond to different data statistics rules;
The data acquisition module 503 is configured to use at least two companies belonging to the industry category determined by the category identification module 501 as sample companies, and to obtain the historical revenue data of each sample company respectively;
The first data extraction module 504 is configured to extract, from the historical revenue data of each sample company obtained by the data acquisition module 503, the first historical factor data corresponding to each factor determined by the factor identification module 502;
The model training module 505 is configured to train a machine learning model corresponding to the industry category with the first historical factor data extracted by the first data extraction module 504;
The second data extraction module 506 is configured to extract, from the historical revenue data of the listed company, the second historical factor data corresponding to each factor determined by the factor identification module 502;
The model processing module 507 is configured to input the second historical factor data extracted by the second data extraction module 506 into the machine learning model trained by the model training module 505, and to obtain the first revenue prediction result output by the machine learning model;
The data processing module 508 is configured to determine the predicted revenue data of the listed company according to the first revenue prediction result obtained by the model processing module 507.
在本申请实施例中,类别识别模块501可用于执行上述方法实施例中的步骤101,因子识别模块502可用于执行上述方法实施例中的步骤102,数据获取模块503可用于执行上述方法实施例中的步骤103,第一数据提取模型504可用于执行上述方法实施例中的步骤104,模型训练模块505可用于执行上述方法实施例中的步骤105,第二数据提取模块506可用于执行上述方法实施例中的步骤106,模型处理模块507可用于执行上述方法实施例中的步骤107,数据处理模块508可用于执行上述方法实施例中的步骤108。In the embodiment of the present application, the category identification module 501 can be used to perform step 101 in the above method embodiment, the factor identification module 502 can be used to perform step 102 in the above method embodiment, and the data acquisition module 503 can be used to perform the above method embodiment. In step 103, the first data extraction model 504 can be used to perform step 104 in the above method embodiment, the model training module 505 can be used to perform step 105 in the above method embodiment, and the second data extraction module 506 can be used to perform the above method. In step 106 in the embodiment, the model processing module 507 can be used to execute step 107 in the above method embodiment, and the data processing module 508 can be used to execute step 108 in the above method embodiment.
需要说明的是,本装置实施例所包括的各个模块之间的信息交互、执行过程等内容,由于与上述方法实施例基于同一发明构思,具体内容可以参见上述方法实施例中的叙述,此处不再赘述。另外,本装置实施例还可以包括其他的模块,用于执行上述方法实施例中的各个步骤。It should be noted that the information interaction and execution process among the various modules included in the device embodiment are based on the same inventive concept as the above method embodiment, and the specific content can be referred to the description in the above method embodiment. No longer. In addition, this device embodiment may also include other modules for executing each step in the foregoing method embodiment.
本申请实施例还提供了一种计算机设备,包括有存储器和处理器,存储器上存储有计算机程序,当处理器执行存储器上所存储的计算机程序时可以实现上述各个实施例所提供的上市公司营收预测方法。The embodiments of the present application also provide a computer device, including a memory and a processor, and a computer program is stored on the memory. When the processor executes the computer program stored on the memory, the listed company operation provided by the foregoing embodiments can be implemented. Method of income forecasting.
本申请实施例还提供了一种非易失性计算机可读存储介质,该非易失性计算机可读存储介质上存储有计算机程序,在其所存储的计算机存储被执行时可以实现上述各个实施例提供的上市公司营收预测方法。The embodiments of the present application also provide a non-volatile computer-readable storage medium with a computer program stored on the non-volatile computer-readable storage medium, and the above-mentioned various implementations can be implemented when the stored computer storage is executed. Examples of listed companies’ revenue forecasting methods.
In summary, the listed-company revenue forecasting method, device, computer device, and non-volatile computer-readable storage medium provided by the embodiments of this application operate as follows. After the industry category of the listed company whose revenue is to be forecast has been determined, one or more factors corresponding to that industry category are determined. Historical revenue data is then acquired for at least two sample companies belonging to the industry category, first historical factor data corresponding to each factor is extracted from the historical revenue data of each sample company, and second historical factor data corresponding to each factor is extracted from the historical revenue data of the listed company. The first historical factor data is used to train a machine learning model corresponding to the industry category of the listed company. The second historical factor data is input into the trained model to obtain a first revenue forecast result, from which the forecast revenue data of the listed company can be determined.
It can be seen that, because the revenue of the listed company is forecast from the historical revenue data of multiple sample companies in the same industry category, the timeliness requirement on the first historical factor data for each factor is relatively low. Analysts therefore do not need to collect real-time data for each factor promptly, which reduces the cost of forecasting revenue for listed companies.
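As a rough illustration of the flow summarized above, the following Python sketch trains on sample companies from the same industry category and then forecasts for the target company. It is a minimal sketch under stated assumptions, not the implementation of this application: it uses a single hypothetical factor named `units_sold`, a one-variable least-squares line stands in for the machine learning model, and the record layout and all function names are illustrative only.

```python
from statistics import mean


def extract_factor(history, key):
    """Pull one column (a factor or the revenue) out of yearly records."""
    return [record[key] for record in history]


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for the model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("factor values must not all be identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def train(sample_histories, factor, target="revenue"):
    """Train on the first historical factor data of all sample companies."""
    xs, ys = [], []
    for history in sample_histories:
        f = extract_factor(history, factor)
        r = extract_factor(history, target)
        # Assumption: the factor in year t explains revenue in year t+1.
        xs.extend(f[:-1])
        ys.extend(r[1:])
    return fit_line(xs, ys)


def predict(model, history, factor):
    """Feed the target company's latest factor value into the model."""
    a, b = model
    return a * extract_factor(history, factor)[-1] + b
```

A real system would use several factors per industry category and a richer model; the sketch only shows how sample-company data and target-company data play different roles.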
It should be noted that, in this document, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article, or method comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, device, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, device, article, or method that comprises the element.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments. From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (20)

  1. A method for acquiring development trend data, comprising:
    determining an object category to which a prediction object belongs;
    determining at least one factor corresponding to the object category, wherein different factors correspond to different data statistics rules;
    taking at least two objects belonging to the object category as sample objects, and acquiring historical development data of each of the sample objects;
    extracting, from the historical development data of each of the sample objects, first historical factor data corresponding to each of the factors;
    training a machine learning model corresponding to the object category using the extracted first historical factor data;
    extracting, from historical development data of the prediction object, second historical factor data corresponding to each of the factors;
    inputting the second historical factor data into the machine learning model to obtain first prediction data output by the machine learning model; and
    determining, according to the first prediction data, development trend data that characterizes a development trend of the prediction object.
  2. The method according to claim 1,
    further comprising, before determining the development trend data that characterizes the development trend of the prediction object according to the first prediction data:
    fitting a polynomial function using the historical development data of at least two of the sample objects, wherein the historical development data of each of the sample objects satisfies the polynomial function; and
    inputting the historical development data of the prediction object into the polynomial function to obtain second prediction data output by the polynomial function;
    wherein determining the development trend data that characterizes the development trend of the prediction object according to the first prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data and the second prediction data.
  3. The method according to claim 2,
    further comprising, before determining the development trend data of the prediction object according to the first prediction data and the second prediction data:
    fitting a time series model using the historical development data of at least two of the sample objects, wherein the variation over time of the historical development data of each of the sample objects conforms to the time series model; and
    inputting the historical development data of the prediction object into the time series model to obtain third prediction data output by the time series model;
    wherein determining the development trend data of the prediction object according to the first prediction data and the second prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data.
  4. The method according to claim 2 or 3, wherein fitting the polynomial function using the historical development data of at least two of the sample objects comprises:
    determining a prediction period for predicting the development trend of the prediction object;
    extracting, according to the prediction period, first historical development data corresponding to each statistical period from the historical development data of each of the sample objects, wherein the statistical period corresponds to the prediction period in time span; and
    fitting, according to the first historical development data corresponding to each statistical period, the following polynomial function, wherein each item of first historical development data of each of the sample objects satisfies the polynomial function:
    Figure PCTCN2019103060-appb-100001
    wherein M denotes the second prediction data relative to the current time; k_i denotes a weight coefficient fitted by machine learning; x denotes the first historical development data corresponding to the statistical period immediately preceding the current time; x_i denotes the first historical development data corresponding to the (i+1)-th statistical period before the current time; and t+1 denotes the number of statistical periods before the current time.
  5. The method according to claim 3, wherein fitting the time series model using the historical development data of at least two of the sample objects comprises:
    determining a prediction period for predicting the development trend of the prediction object;
    extracting, according to the prediction period, second historical development data corresponding to each statistical period from the historical development data of each of the sample objects, wherein the statistical period corresponds to the prediction period in time span; and
    fitting the time series model according to the second historical development data corresponding to each statistical period, wherein the variation over time of the second historical development data of each of the sample objects satisfies the time series model;
    the time series model having the following form:
    (ΔM_t)^2 = K + k_1 (ΔM_{t-1})^2 - k_2 (ΔM_{t-2})^2 + ε_t - k_3 ε_{t-1}
    wherein ΔM_t denotes the difference between the third prediction data relative to the current time and the second historical development data corresponding to the statistical period immediately preceding the current time; ΔM_{t-1} denotes the difference between the second historical development data corresponding to the statistical period immediately preceding the current time and that corresponding to the second statistical period before the current time; ΔM_{t-2} denotes the difference between the second historical development data corresponding to the second statistical period before the current time and that corresponding to the third statistical period before the current time; ε_t denotes the third prediction data relative to the current time; ε_{t-1} denotes the second historical development data corresponding to the statistical period immediately preceding the current time; and K, k_1, k_2, and k_3 are weight coefficients fitted by machine learning.
  6. The method according to claim 5, wherein fitting the time series model according to the second historical development data corresponding to each statistical period comprises:
    performing second-order differencing on the second historical development data corresponding to each statistical period to obtain a corresponding difference sequence;
    defining, according to the difference sequence and using a list method, a target equation corresponding to a model;
    solving the target equation to obtain an estimation result of the model;
    checking the fitting effect of the model based on goodness of fit;
    checking the residuals of the model after determining that the fitting effect of the model reaches a preset target; and
    determining the model as the time series model when the residual fluctuation of the model is determined to be within a preset fluctuation range.
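The differencing and acceptance checks recited in this claim can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the coefficient of determination (R squared) is assumed as the goodness-of-fit measure, a bound on the absolute residual is assumed as the "preset fluctuation range", and all function names and thresholds are hypothetical. The step that defines and solves the target equation is deliberately omitted.

```python
def second_difference(series):
    """Difference the series twice (second-order differencing)."""
    first = [b - a for a, b in zip(series, series[1:])]
    return [b - a for a, b in zip(first, first[1:])]


def r_squared(actual, fitted):
    """Coefficient of determination, assumed here as the goodness of fit."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values must not all be identical")
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return 1.0 - ss_res / ss_tot


def accept_model(actual, fitted, min_r2, max_abs_residual):
    """Accept the candidate model only if both checks pass, in order."""
    # Check 1: the fit must reach the preset goodness-of-fit target.
    if r_squared(actual, fitted) < min_r2:
        return False
    # Check 2: every residual must stay within the preset range.
    return all(abs(a - f) <= max_abs_residual for a, f in zip(actual, fitted))
```

For example, `second_difference([1, 4, 9, 16, 25])` yields a constant sequence, which is the kind of stabilized series the later fitting steps would work on.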
  7. The method according to claim 3, 5, or 6,
    wherein determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data comprises:
    performing a weighted operation on the first prediction data, the second prediction data, and the third prediction data to obtain the development trend data of the prediction object;
    and/or
    wherein training the machine learning model corresponding to the object category using the extracted first historical factor data comprises:
    for each of the factors, obtaining, from the first historical factor data corresponding to that factor, at least one item of factor data corresponding to that factor for each of at least the past two years;
    using the factor data corresponding to each of the factors as samples to train a factor coefficient corresponding to each of the factors;
    constructing, using the obtained factor coefficients, the following formula for calculating the first prediction data:
    Figure PCTCN2019103060-appb-100002
    wherein M′ denotes the first prediction data; n denotes the number of factors; m denotes the number of historical years covered by the first historical factor data; x_(i,1) denotes the factor data of the prediction object for the i-th factor in the first preceding year; x_(i,2) denotes the factor data of the prediction object for the i-th factor in the second preceding year; k_i denotes the factor coefficient of the i-th factor at the current time; and x_(i,j) denotes the factor data of the prediction object for the i-th factor in the j-th preceding year; and
    constructing the machine learning model that includes the formula.
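The weighted operation on the three predictions recited in this claim could look like the following Python sketch. The claim does not specify how the weights are chosen or whether they are normalized, so the normalization by the weight sum, the function name `combine_predictions`, and the example weights are all assumptions made for illustration.

```python
def combine_predictions(predictions, weights):
    """Weighted average of the first, second, and third prediction data.

    Assumption: weights are non-negative and are normalized by their sum.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(p * w for p, w in zip(predictions, weights)) / total


# Hypothetical values: model, polynomial, and time series outputs.
trend = combine_predictions([100.0, 110.0, 120.0], [1.0, 1.0, 2.0])
```

Normalizing keeps the result on the same scale as the individual predictions regardless of how the weights are expressed.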
  8. A device for acquiring development trend data, comprising: a category identification module, a factor identification module, a data acquisition module, a first data extraction module, a model training module, a second data extraction module, a model processing module, and a data processing module;
    the category identification module being configured to determine an object category to which a prediction object belongs;
    the factor identification module being configured to determine at least one factor corresponding to the object category determined by the category identification module, wherein different factors correspond to different data statistics rules;
    the data acquisition module being configured to take at least two objects belonging to the object category determined by the category identification module as sample objects, and to acquire historical development data of each of the sample objects;
    the first data extraction module being configured to extract, from the historical development data of each of the sample objects acquired by the data acquisition module, first historical factor data corresponding to each of the factors determined by the factor identification module;
    the model training module being configured to train a machine learning model corresponding to the object category using the first historical factor data extracted by the first data extraction module;
    the second data extraction module being configured to extract, from historical development data of the prediction object, second historical factor data corresponding to each of the factors determined by the factor identification module;
    the model processing module being configured to input the second historical factor data extracted by the second data extraction module into the machine learning model trained by the model training module, to obtain first prediction data output by the machine learning model; and
    the data processing module being configured to determine, according to the first prediction data obtained by the model processing module, development trend data that characterizes a development trend of the prediction object.
  9. The device according to claim 8, further configured to, before determining the development trend data that characterizes the development trend of the prediction object according to the first prediction data:
    fit a polynomial function using the historical development data of at least two of the sample objects, wherein the historical development data of each of the sample objects satisfies the polynomial function; and
    input the historical development data of the prediction object into the polynomial function to obtain second prediction data output by the polynomial function;
    wherein determining the development trend data that characterizes the development trend of the prediction object according to the first prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data and the second prediction data.
  10. The device according to claim 9, further configured to,
    before determining the development trend data of the prediction object according to the first prediction data and the second prediction data:
    fit a time series model using the historical development data of at least two of the sample objects, wherein the variation over time of the historical development data of each of the sample objects conforms to the time series model; and
    input the historical development data of the prediction object into the time series model to obtain third prediction data output by the time series model;
    wherein determining the development trend data of the prediction object according to the first prediction data and the second prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data.
  11. The device according to claim 9 or 10, wherein the device fitting the polynomial function using the historical development data of at least two of the sample objects further comprises:
    determining a prediction period for predicting the development trend of the prediction object;
    extracting, according to the prediction period, first historical development data corresponding to each statistical period from the historical development data of each of the sample objects, wherein the statistical period corresponds to the prediction period in time span; and
    fitting, according to the first historical development data corresponding to each statistical period, the following polynomial function, wherein each item of first historical development data of each of the sample objects satisfies the polynomial function:
    Figure PCTCN2019103060-appb-100003
    wherein M denotes the second prediction data relative to the current time; k_i denotes a weight coefficient fitted by machine learning; x denotes the first historical development data corresponding to the statistical period immediately preceding the current time; x_i denotes the first historical development data corresponding to the (i+1)-th statistical period before the current time; and t+1 denotes the number of statistical periods before the current time.
  12. The device according to claim 10, wherein the device fitting the time series model using the historical development data of at least two of the sample objects further comprises:
    determining a prediction period for predicting the development trend of the prediction object;
    extracting, according to the prediction period, second historical development data corresponding to each statistical period from the historical development data of each of the sample objects, wherein the statistical period corresponds to the prediction period in time span; and
    fitting the time series model according to the second historical development data corresponding to each statistical period, wherein the variation over time of the second historical development data of each of the sample objects satisfies the time series model;
    the time series model having the following form:
    (ΔM_t)^2 = K + k_1 (ΔM_{t-1})^2 - k_2 (ΔM_{t-2})^2 + ε_t - k_3 ε_{t-1}
    wherein ΔM_t denotes the difference between the third prediction data relative to the current time and the second historical development data corresponding to the statistical period immediately preceding the current time; ΔM_{t-1} denotes the difference between the second historical development data corresponding to the statistical period immediately preceding the current time and that corresponding to the second statistical period before the current time; ΔM_{t-2} denotes the difference between the second historical development data corresponding to the second statistical period before the current time and that corresponding to the third statistical period before the current time; ε_t denotes the third prediction data relative to the current time; ε_{t-1} denotes the second historical development data corresponding to the statistical period immediately preceding the current time; and K, k_1, k_2, and k_3 are weight coefficients fitted by machine learning.
  13. The device according to claim 12, wherein the device fitting the time series model according to the second historical development data corresponding to each statistical period further comprises:
    performing second-order differencing on the second historical development data corresponding to each statistical period to obtain a corresponding difference sequence;
    defining, according to the difference sequence and using a list method, a target equation corresponding to a model;
    solving the target equation to obtain an estimation result of the model;
    checking the fitting effect of the model based on goodness of fit;
    checking the residuals of the model after determining that the fitting effect of the model reaches a preset target; and
    determining the model as the time series model when the residual fluctuation of the model is determined to be within a preset fluctuation range.
  14. The device according to claim 10, 12, or 13,
    wherein the device determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data further comprises:
    performing a weighted operation on the first prediction data, the second prediction data, and the third prediction data to obtain the development trend data of the prediction object;
    and/or
    wherein training the machine learning model corresponding to the object category using the extracted first historical factor data comprises:
    for each of the factors, obtaining, from the first historical factor data corresponding to that factor, at least one item of factor data corresponding to that factor for each of at least the past two years;
    using the factor data corresponding to each of the factors as samples to train a factor coefficient corresponding to each of the factors;
    constructing, using the obtained factor coefficients, the following formula for calculating the first prediction data:
    Figure PCTCN2019103060-appb-100004
    wherein M′ denotes the first prediction data; n denotes the number of factors; m denotes the number of historical years covered by the first historical factor data; x_(i,1) denotes the factor data of the prediction object for the i-th factor in the first preceding year; x_(i,2) denotes the factor data of the prediction object for the i-th factor in the second preceding year; k_i denotes the factor coefficient of the i-th factor at the current time; and x_(i,j) denotes the factor data of the prediction object for the i-th factor in the j-th preceding year; and
    constructing the machine learning model that includes the formula.
  15. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of a method for acquiring development trend data, the steps comprising:
    determining an object category to which a prediction object belongs;
    determining at least one factor corresponding to the object category, wherein different factors correspond to different data statistics rules;
    taking at least two objects belonging to the object category as sample objects, and acquiring historical development data of each of the sample objects;
    extracting, from the historical development data of each of the sample objects, first historical factor data corresponding to each of the factors;
    training a machine learning model corresponding to the object category using the extracted first historical factor data;
    extracting, from historical development data of the prediction object, second historical factor data corresponding to each of the factors;
    inputting the second historical factor data into the machine learning model to obtain first prediction data output by the machine learning model; and
    determining, according to the first prediction data, development trend data that characterizes a development trend of the prediction object.
  16. The computer device according to claim 1, further comprising, before determining the development trend data that characterizes the development trend of the prediction object according to the first prediction data:
    fitting a polynomial function using the historical development data of at least two of the sample objects, wherein the historical development data of each of the sample objects satisfies the polynomial function; and
    inputting the historical development data of the prediction object into the polynomial function to obtain second prediction data output by the polynomial function;
    wherein determining the development trend data that characterizes the development trend of the prediction object according to the first prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data and the second prediction data.
  17. The computer device according to claim 2, further comprising, before determining the development trend data of the prediction object according to the first prediction data and the second prediction data:
    fitting a time series model to the historical development data of at least two of the sample objects, wherein the variation over time of the historical development data of each sample object conforms to the time series model; and
    inputting the historical development data of the prediction object into the time series model to obtain third prediction data output by the time series model;
    wherein determining the development trend data of the prediction object according to the first prediction data and the second prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data.
  18. A non-volatile computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of a development trend data acquisition method, the steps comprising:
    determining the object category to which a prediction object belongs;
    determining at least one factor corresponding to the object category, wherein different factors correspond to different data statistics rules;
    taking at least two objects belonging to the object category as sample objects, and obtaining the historical development data of each sample object;
    extracting, from the historical development data of each sample object, first historical factor data corresponding to each factor;
    training a machine learning model corresponding to the object category on the extracted first historical factor data;
    extracting, from the historical development data of the prediction object, second historical factor data corresponding to each factor;
    inputting the second historical factor data into the machine learning model to obtain first prediction data output by the machine learning model; and
    determining, according to the first prediction data, development trend data characterizing the development trend of the prediction object.
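The steps of claim 18 can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the claims: the two factors (`average_level`, `latest_change`), their statistics rules, the choice of a nearest-neighbour regressor as the machine learning model, and the rule that the training target is the last value of each sample history.

```python
# Hypothetical factors for one object category; each has its own statistics rule.
FACTOR_RULES = {
    "average_level": lambda history: sum(history) / len(history),
    "latest_change": lambda history: history[-1] - history[-2],
}


def extract_factor_data(history):
    """Apply every factor's statistics rule to one object's historical data."""
    return [rule(history) for rule in FACTOR_RULES.values()]


def train_model(sample_histories):
    """Build training pairs: factors of the earlier values -> the final value.

    A nearest-neighbour regressor only memorises its training pairs, so
    "training" here amounts to returning them (an illustrative stand-in
    for whatever model an implementation would actually use).
    """
    return [(extract_factor_data(h[:-1]), h[-1]) for h in sample_histories]


def squared_distance(a, b):
    """Squared Euclidean distance between two factor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(model, history, k=1):
    """Average the targets of the k training rows closest in factor space."""
    query = extract_factor_data(history)
    nearest = sorted(model, key=lambda pair: squared_distance(pair[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


if __name__ == "__main__":
    # Historical development data of three sample objects in the same category.
    samples = [[10, 12, 14, 16], [20, 21, 22, 23], [5, 8, 11, 14]]
    model = train_model(samples)
    # Second historical factor data comes from the prediction object's history.
    first_prediction = predict(model, [11, 13, 15])
    print(first_prediction)
```

Keeping the factor rules in a single table means the same extraction code yields both the first historical factor data (from sample objects) and the second historical factor data (from the prediction object), which is what keeps training and prediction inputs consistent.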
  19. The storage medium according to claim 18, further comprising, before determining the development trend data characterizing the development trend of the prediction object according to the first prediction data:
    fitting a polynomial function to the historical development data of at least two of the sample objects, wherein the historical development data of each sample object satisfies the polynomial function; and
    inputting the historical development data of the prediction object into the polynomial function to obtain second prediction data output by the polynomial function;
    wherein determining the development trend data characterizing the development trend of the prediction object according to the first prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data and the second prediction data.
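The polynomial step of claim 19 can be sketched as follows. The claim does not say what the polynomial's input and output are, so this sketch assumes one possible reading: the polynomial maps a value in a history to the next value, it is fitted by least squares on consecutive pairs pooled from all sample objects, and the second prediction is obtained by evaluating it at the prediction object's latest value. The function names and the choice of degree are hypothetical.

```python
def consecutive_pairs(histories):
    """Pool (current value, next value) pairs from every sample history."""
    return [(h[i], h[i + 1]) for h in histories for i in range(len(h) - 1)]


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... + c[degree]*x**degree."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x**(i+j)), b[i] = sum(y * x**i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


if __name__ == "__main__":
    # Both sample histories follow next = 2 * current + 1.
    samples = [[1, 3, 7, 15], [2, 5, 11, 23]]
    pairs = consecutive_pairs(samples)
    coeffs = fit_polynomial([x for x, _ in pairs], [y for _, y in pairs], degree=1)
    # Second prediction data: apply the polynomial to the latest observed value.
    second_prediction = evaluate(coeffs, [4, 9][-1])
    print(coeffs, second_prediction)
```

The normal-equation solver is adequate for the low degrees used here; for higher degrees the system becomes ill-conditioned and a numerical library would be a safer choice.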
  20. The storage medium according to claim 19, further comprising, before determining the development trend data of the prediction object according to the first prediction data and the second prediction data:
    fitting a time series model to the historical development data of at least two of the sample objects, wherein the variation over time of the historical development data of each sample object conforms to the time series model; and
    inputting the historical development data of the prediction object into the time series model to obtain third prediction data output by the time series model;
    wherein determining the development trend data of the prediction object according to the first prediction data and the second prediction data comprises:
    determining the development trend data of the prediction object according to the first prediction data, the second prediction data, and the third prediction data.
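The time series step of claim 20 and the final combination of the three predictions can be sketched as below. The claims name neither the time series model nor the combination rule, so both are assumptions here: simple exponential smoothing whose smoothing constant `alpha` is chosen by a grid search over the sample histories, and a weighted average of the three predictions with hypothetical weights.

```python
def one_step_error(history, alpha):
    """Sum of squared one-step-ahead errors of exponential smoothing."""
    level, error = history[0], 0.0
    for value in history[1:]:
        error += (value - level) ** 2
        level = alpha * value + (1 - alpha) * level
    return error


def fit_alpha(sample_histories, grid=None):
    """Pick the smoothing constant that best fits all sample histories."""
    grid = grid or [i / 10 for i in range(1, 11)]
    return min(grid, key=lambda a: sum(one_step_error(h, a) for h in sample_histories))


def ses_forecast(history, alpha):
    """One-step-ahead forecast: the smoothed level after the last observation."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def combine(predictions, weights=None):
    """Weighted average of the individual predictions (equal weights by default)."""
    weights = weights or [1 / len(predictions)] * len(predictions)
    return sum(w * p for w, p in zip(weights, predictions))


if __name__ == "__main__":
    samples = [[1, 2, 3, 4], [10, 12, 14, 16]]
    alpha = fit_alpha(samples)
    third_prediction = ses_forecast([11, 13, 15], alpha)
    # Illustrative first and second predictions from the other two models.
    trend_value = combine([16, 19, third_prediction], weights=[0.5, 0.25, 0.25])
    print(alpha, third_prediction, trend_value)
```

Combining several models in this way lets an error in any single prediction be damped by the others; the weights would in practice be tuned on held-out history rather than fixed by hand.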
PCT/CN2019/103060 2019-04-19 2019-08-28 Development trend data acquisition method and device WO2020211245A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910319456.7A CN110210645A (en) 2019-04-19 2019-04-19 Development trend data acquisition method, device and readable storage medium
CN201910319456.7 2019-04-19

Publications (1)

Publication Number Publication Date
WO2020211245A1 (en) 2020-10-22

Family

ID=67786096

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103060 WO2020211245A1 (en) 2019-04-19 2019-08-28 Development trend data acquisition method and device

Country Status (2)

Country Link
CN (1) CN110210645A (en)
WO (1) WO2020211245A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112668772B (en) * 2020-12-24 2024-03-12 润电能源科学技术有限公司 State development trend prediction method, device, equipment and storage medium
CN112767008A (en) * 2020-12-31 2021-05-07 平安科技(深圳)有限公司 Enterprise revenue trend prediction method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968670A (en) * 2012-10-23 2013-03-13 北京京东世纪贸易有限公司 Method and device for predicting data
CN107194489A (en) * 2016-03-14 2017-09-22 阿里巴巴集团控股有限公司 Data predication method and device
CN108550047A (en) * 2018-03-20 2018-09-18 阿里巴巴集团控股有限公司 The prediction technique and device of trading volume
US10078337B1 (en) * 2017-07-14 2018-09-18 Uber Technologies, Inc. Generation of trip estimates using real-time data and historical data

Also Published As

Publication number Publication date
CN110210645A (en) 2019-09-06

Similar Documents

Publication Publication Date Title
CN110417721B (en) Security risk assessment method, device, equipment and computer readable storage medium
CN108665159A (en) A kind of methods of risk assessment, device, terminal device and storage medium
JP5226746B2 (en) Model optimization system using variable scoring
JP2015222596A (en) System and method for forecasting frequencies associated to future loss and for related automatic processing of loss determination unit
CN108876545A (en) Order recognition methods, device and readable storage medium storing program for executing
WO2016084642A1 (en) Credit examination server, credit examination system, and credit examination program
WO2020211245A1 (en) Development trend data acquisition method and device
WO2021004318A1 (en) Resource data processing method and apparatus, computer device and storage medium
CN107798029A (en) Disparage client's Forecasting Methodology and device
CN110751326A (en) Photovoltaic day-ahead power prediction method and device and storage medium
CN111181757A (en) Information security risk prediction method and device, computing equipment and storage medium
CN111506876A (en) Data prediction analysis method, system, equipment and readable storage medium
CN113312578B (en) Fluctuation attribution method, device, equipment and medium of data index
CN114022221A (en) Sales prediction method, acquisition method and device of model thereof, and electronic equipment
CN114004691A (en) Line scoring method, device, equipment and storage medium based on fusion algorithm
CN111178498B (en) Stock fluctuation prediction method and device
CN112308293A (en) Default probability prediction method and device
CN106803192A (en) A kind of supporting impact evaluation method of the environment and surrounding of real estate
CN111951008A (en) Risk prediction method and device, electronic equipment and readable storage medium
CN113537631B (en) Medicine demand prediction method, device, electronic equipment and storage medium
JP2020135434A (en) Enterprise information processing device, enterprise event prediction method and prediction program
CN111899093B (en) Method and device for predicting default loss rate
JP6617605B6 (en) Demand amount prediction program, demand amount prediction method, and information processing device
CN114282881A (en) Depreciation measuring and calculating method and device, storage medium and computer equipment
Lyu et al. An intelligent hybrid cloud-based ANP and AI model for Development Site Selection

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19925261; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 19925261; Country of ref document: EP; Kind code of ref document: A1