CN109243619A - Method and device for generating a prediction model, and computer-readable storage medium - Google Patents

Method and device for generating a prediction model, and computer-readable storage medium

Info

Publication number
CN109243619A
Authority
CN
China
Prior art keywords
influenza
sequence
data sequence
predetermined period
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810768332.2A
Other languages
Chinese (zh)
Other versions
CN109243619B (en
Inventor
李弦
徐亮
阮晓雯
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN201810768332.2A (CN109243619B)
Priority to PCT/CN2018/107488 (WO2020010710A1)
Publication of CN109243619A
Application granted
Publication of CN109243619B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for generating a prediction model. The method comprises: determining a target region, a target time unit to be predicted, and a preset period; obtaining the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit; according to the preset period and a preset order k, lagging the ILI percentage data sequence by 0 to k orders before and after the preset period to obtain 2k+1 data sequences; calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the ILI percentage data sequence, and determining the prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold; and calculating model parameters and establishing an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period. The invention also proposes a device for generating a prediction model and a computer-readable storage medium. The invention improves the prediction accuracy of the ARIMA model.

Description

Method and device for generating a prediction model, and computer-readable storage medium
Technical field
The present invention relates to the field of computer technology, and in particular to a method and device for generating a prediction model and a computer-readable storage medium.
Background art
For influenza prediction, a method now common in epidemiology is to predict the influenza-like illness (ILI) percentage with an autoregressive integrated moving average (ARIMA) model. When an ARIMA model is used for influenza prediction, a fixed period, such as one year or half a year, is usually set for the region according to the pattern of its historical ILI percentage data, and the model is built on that period. However, a fixed period may ignore the influence of factors that do not recur periodically, such as differences in the number of weeks per year and shifts in the solar terms, which causes large deviations in the prediction results. For example, years differ in length as samples: some years have 53 weeks, such as 2013, so the length of the yearly period can change. Modeling with a fixed period can therefore lead to large deviations in the model's predictions.
Summary of the invention
The present invention provides a method and device for generating a prediction model and a computer-readable storage medium, the main purpose of which is to improve the prediction accuracy of the autoregressive integrated moving average (ARIMA) model.
To achieve the above object, the present invention provides a method for generating a prediction model, the method comprising:
Determining a target region and a target time unit to be predicted, and obtaining a preset period;
Obtaining the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit;
According to the preset period and a preset order k, lagging the ILI percentage data sequence by 0 to k orders before and after the preset period, respectively, to obtain 2k+1 data sequences;
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the ILI percentage data sequence, and, following the lag order, determining the prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold;
Calculating model parameters from the ILI percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period.
Optionally, after the step of obtaining the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit, the method further comprises the steps of:
Detecting whether the ILI percentage data sequence is a stationary sequence;
If so, executing the step of obtaining the preset period determined according to the periodicity exhibited by the ILI percentage data;
If not, converting the ILI percentage data sequence into a stationary sequence by differencing.
Optionally, the step of detecting whether the ILI percentage data sequence is a stationary sequence comprises:
Performing a unit root test on the ILI percentage data to detect whether the ILI percentage data sequence is a stationary sequence, wherein if a unit root is detected in the sequence, the sequence is determined to be non-stationary; otherwise, the sequence is determined to be stationary.
Optionally, the step of obtaining the preset period comprises:
Determining the preset period according to the periodicity exhibited by the ILI percentage data.
Optionally, the step of calculating model parameters from the ILI percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period, comprises:
Calculating the autocorrelation coefficients and partial autocorrelation coefficients of the stationary ILI percentage data sequence for which the prediction period has been determined, and drawing the autocorrelation plot and the partial autocorrelation plot;
According to the autocorrelation plot and the partial autocorrelation plot, judging whether the calculated partial autocorrelation coefficients and autocorrelation coefficients tail off or cut off, and selecting an ARIMA model according to the result to fit the stationary ILI percentage data sequence, so as to obtain the prediction model.
In addition, to achieve the above object, the present invention also provides a device for generating a prediction model. The device comprises a memory and a processor, the memory stores a model generation program that can run on the processor, and the model generation program, when executed by the processor, implements the following steps:
Determining a target region and a target time unit to be predicted, and obtaining a preset period;
Obtaining the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit;
According to the preset period and a preset order k, lagging the ILI percentage data sequence by 0 to k orders before and after the preset period, respectively, to obtain 2k+1 data sequences;
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the ILI percentage data sequence, and, following the lag order, determining the prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold;
Calculating model parameters from the ILI percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period.
Optionally, the model generation program can also be executed by the processor so that, after the step of obtaining the ILI percentage data sequence of the target region for multiple consecutive time units before the target time unit, the following steps are further implemented:
Detecting whether the ILI percentage data sequence is a stationary sequence;
If so, executing the step of obtaining the preset period determined according to the periodicity exhibited by the ILI percentage data;
If not, converting the ILI percentage data sequence into a stationary sequence by differencing.
Optionally, the step of detecting whether the ILI percentage data sequence is a stationary sequence comprises:
Performing a unit root test on the ILI percentage data to detect whether the ILI percentage data sequence is a stationary sequence, wherein if a unit root is detected in the sequence, the sequence is determined to be non-stationary; otherwise, the sequence is determined to be stationary.
Optionally, the step of obtaining the preset period comprises:
Determining the preset period according to the periodicity exhibited by the ILI percentage data.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium. A model generation program is stored on the computer-readable storage medium, and the model generation program can be executed by one or more processors to implement the steps of the method for generating a prediction model described above.
The method and device for generating a prediction model and the computer-readable storage medium proposed by the present invention determine a target region and a target time unit to be predicted, and obtain a preset period; obtain the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit; according to the preset period and a preset order k, lag the ILI percentage data sequence by 0 to k orders before and after the preset period, respectively, to obtain 2k+1 data sequences; calculate the autocorrelation coefficient between each of the 2k+1 data sequences and the ILI percentage data sequence and, following the lag order, determine the prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold; and calculate model parameters from the ILI percentage sequence for which the prediction period has been determined, and establish an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period. When predicting the data of a target time unit, the present invention lags the data sequence before that target time unit by multiple orders, calculates the autocorrelation coefficients between the sequences, determines from those coefficients a prediction period suited to the current target time unit, and builds the model with that prediction period, which improves the prediction accuracy of the ARIMA model.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for generating a prediction model provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the internal structure of the device for generating a prediction model provided by an embodiment of the present invention;
Fig. 3 is a module diagram of the model generation program in the device for generating a prediction model provided by an embodiment of the present invention.
The realization of the object, the functional features and the advantages of the present invention will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
The present invention provides a method for generating a prediction model. Fig. 1 is a flow diagram of the method for generating a prediction model provided by an embodiment of the present invention. The method can be executed by a device, and the device can be implemented in software and/or hardware.
In this embodiment, the method for generating a prediction model comprises:
Step S10: determine a target region and a target time unit to be predicted, and obtain a preset period.
Step S20: obtain the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit.
In this embodiment, the target region is a region for which the influenza-like illness (ILI) percentage is to be predicted, for example a city or a province. The scheme of this embodiment is illustrated with the week as the time unit. To predict the ILI percentage of the current week, an ARIMA (Autoregressive Integrated Moving Average) model is built from the historical ILI percentage data sequence of each week in the 5 years before the current week; that is, to predict the ILI percentage of week 261, the ILI percentage data of the 260 weeks before it are used to build the ARIMA model. As time passes, the target time unit moves forward and its historical data change. In addition, because the number of weeks per year and the solar terms may differ from year to year, the periodicity that the historical data actually exhibit can change. Without these factors, long-term observation of the ILI percentage data sequence shows an annual periodicity, that is, a period of 52 weeks; when these factors are present, the period may be shorter or longer than 52 weeks. Therefore, in the method of this embodiment, before the prediction period is dynamically adjusted, the preset period can be determined to be 52 weeks from the periodicity exhibited by the ILI percentage data, and the prediction period is then adjusted with 52 weeks as the baseline. For example, to predict the ILI percentage of the first week of June 2018, the ILI percentage data sequence of each week from 2013 to 2017 needs to be obtained. In other embodiments, the number of time units in the historical data is set by the user according to actual prediction needs.
Further, to improve the prediction accuracy of the ARIMA model to be built, after the above data sequence is obtained, it is detected whether the data sequence is a stationary sequence. If it is, the subsequent steps continue; if not, the data sequence is converted into a stationary influenza-like illness (ILI) percentage data sequence by differencing. Specifically, the step of detecting whether the ILI percentage data sequence is a stationary sequence comprises: performing a unit root test on the ILI percentage data to detect whether the ILI percentage data sequence is stationary, wherein if a unit root is detected in the sequence, the sequence is determined to be non-stationary; otherwise, the sequence is determined to be stationary.
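The differencing step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `difference` and the sample values are assumptions, and the unit root test itself is left to a statistics library (for instance, the `adfuller` function in `statsmodels.tsa.stattools` performs an augmented Dickey-Fuller test). The number of differencing passes that makes the sequence stationary is the value later used as the parameter d.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times and return a new list."""
    result = list(series)
    for _ in range(order):
        # Each pass replaces the sequence by the changes between neighbours.
        result = [b - a for a, b in zip(result, result[1:])]
    return result


# Made-up ILI percentage values with an upward trend (illustrative only).
ili = [2.0, 2.5, 3.5, 5.0, 7.0]
print(difference(ili, order=1))  # [0.5, 1.0, 1.5, 2.0]
print(difference(ili, order=2))  # [0.5, 0.5, 0.5]
```

Each pass shortens the sequence by one element, so a sequence of n values differenced d times has n - d values left for fitting.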
Step S30: according to the preset period and a preset order k, lag the influenza-like illness (ILI) percentage data sequence by 0 to k orders before and after the preset period, respectively, to obtain 2k+1 data sequences.
Step S40: calculate the autocorrelation coefficient between each of the 2k+1 data sequences and the ILI percentage data sequence, and, following the lag order, determine the prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold.
In this embodiment, the preset order k is a positive integer, preferably between 2 and 6. Taking the preset period T0 = 52 and k = 2 as an example, the data sequences obtained by lagging the influenza-like illness (ILI) percentage data sequence by up to k orders around the preset period T0 are collected, giving 2k+1 data sequences. Suppose the original data sequence formed by the historical data from 2013 to 2017 is [W1, W2, W3 ... W260]. According to the preset period T0 = 52, the base data sequence L0 = [W1, W2, W3 ... W206] is extracted, and this data sequence is lagged by 0 to 2 orders before and after the preset period, giving the following 5 sequences:
L1 is [W51, W52, W53 ... W256];
L2 is [W52, W53, W54 ... W257];
L3 is [W53, W54, W55 ... W258];
L4 is [W54, W55, W56 ... W259];
L5 is [W55, W56, W57 ... W260].
Lagging the original data sequence by 50 weeks gives data sequence L1, by 51 weeks gives L2, by 52 weeks gives L3, by 53 weeks gives L4, and by 54 weeks gives L5. The autocorrelation coefficients between sequence L0 and each of the sequences L1, L2, L3, L4 and L5 are then calculated.
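The construction of L0 and its lagged counterparts can be sketched as below. The function names `lagged_windows` and `pearson` are illustrative assumptions, and the Pearson coefficient between L0 and each lagged window is used here as the "autocorrelation coefficient"; the source does not state which estimator is used. The demo series is synthetic, with a true period of 53 weeks.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant window carries no correlation information
    return cov / math.sqrt(vx * vy)


def lagged_windows(series, preset_period, k):
    """Return (L0, {lag: window}) for lags preset_period-k .. preset_period+k."""
    base_len = len(series) - (preset_period + k)
    if base_len < 2:
        raise ValueError("series too short for the requested lags")
    base = series[:base_len]
    lags = range(preset_period - k, preset_period + k + 1)
    return base, {lag: series[lag:lag + base_len] for lag in lags}


# Synthetic 260-week series whose true period is 53 weeks (illustrative only).
weekly = [math.sin(2 * math.pi * t / 53) for t in range(260)]
base, windows = lagged_windows(weekly, preset_period=52, k=2)
for lag, window in windows.items():
    print(lag, round(pearson(base, window), 4))
```

With 260 weeks, T0 = 52 and k = 2, the base window has 206 elements and the five lagged windows start at W51, W52, W53, W54 and W55, matching L1 to L5 above.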
When the period is not affected by factors such as year length or solar terms, that is, when the prediction period is 52 weeks, the data sequence of weeks 1 to 206 is the same as that of weeks 53 to 258 within a small error range, and the autocorrelation coefficient between L0 and L3 is the largest. If other factors cause the period of the data sequence to change, however, the correlation between L0 and L3 weakens and the correlation between L0 and another data sequence strengthens. Therefore, using the autocorrelation coefficients calculated above and following the order from L1 to L5, the first sequence whose autocorrelation coefficient is greater than the preset threshold is determined, and the number of weeks by which that sequence is lagged is taken as the prediction period. For example, if the autocorrelation coefficient between L0 and L4 is the first among L1 to L5 to exceed the preset threshold, the current prediction period is determined to be 53 weeks.
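The selection rule in the preceding paragraph can be sketched as a small function. The name `select_prediction_period`, the coefficient values and the threshold of 0.9 are all made up for illustration; the source does not give a threshold value, nor does it say what happens when no sequence exceeds the threshold, so returning `None` in that case is an assumption.

```python
def select_prediction_period(correlations, threshold):
    """Return the first lag, in ascending lag order, whose coefficient exceeds threshold.

    `correlations` maps a lag in weeks to the coefficient between L0 and the
    sequence lagged by that many weeks. Returns None if no lag qualifies
    (fallback behaviour is not specified in the source).
    """
    for lag in sorted(correlations):
        if correlations[lag] > threshold:
            return lag
    return None


# Made-up coefficients for lags 50..54, i.e. sequences L1..L5.
coefficients = {50: 0.61, 51: 0.70, 52: 0.78, 53: 0.91, 54: 0.93}
print(select_prediction_period(coefficients, threshold=0.9))  # 53
```

Note that the rule picks the first lag above the threshold, not the lag with the largest coefficient: in the example, lag 54 has a higher coefficient but lag 53 is selected because it comes first in the order L1 to L5.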
Step S50: calculate model parameters from the influenza-like illness (ILI) percentage sequence for which the prediction period has been determined, and establish an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period.
For the stationary ILI sequence for which the prediction period has been determined, the autocorrelation coefficients and partial autocorrelation coefficients are computed to determine the parameters of the model, that is, the values of p, d and q, and the ARIMA model is then built from these parameters and the prediction period. Specifically, the stationary data sequence obtained by differencing is taken, and the order of differencing used in the conversion to a stationary sequence is taken as the value of the parameter d of the ARIMA model. For the stationary data sequence for which the prediction period has been determined, the autocorrelation coefficients and partial autocorrelation coefficients are computed and the autocorrelation plot and partial autocorrelation plot are drawn. From these plots it is judged whether the partial autocorrelation coefficients and autocorrelation coefficients tail off or cut off, and the corresponding ARIMA model is selected accordingly to fit the stationarized data sequence. Parameter estimation is performed on the fitted ARIMA model to obtain the optimal orders p and q, and the model is then checked for validity to determine the final prediction model. Specifically, under the exponential smoothing model, it is observed whether the residuals of the ARIMA model follow a normal distribution with zero mean and constant variance, and whether successive residuals are correlated; if the residuals follow such a distribution and successive residuals are uncorrelated, the model is judged to pass the check.
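As a rough illustration of how autocorrelation coefficients feed the tail-off or cut-off judgement, the sketch below computes the sample autocorrelation function and lists the lags that fall outside the common approximate 95% band of ±1.96/√n. The function names and the use of that band are assumptions rather than details from the source; the partial autocorrelation function is not implemented here and would normally come from a statistics library (for example, `acf` and `pacf` in `statsmodels.tsa.stattools`).

```python
import math


def acf(series, nlags):
    """Sample autocorrelation coefficients for lags 0..nlags."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("constant series has no defined autocorrelation")
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / c0
        for lag in range(nlags + 1)
    ]


def significant_lags(coefficients, n):
    """Lags >= 1 whose coefficient lies outside the approximate 95% band."""
    band = 1.96 / math.sqrt(n)
    return [lag for lag, r in enumerate(coefficients) if lag >= 1 and abs(r) > band]


# Illustrative alternating series; real input would be the stationary ILI sequence.
demo = [1.0 if t % 2 == 0 else -1.0 for t in range(20)]
coeffs = acf(demo, nlags=2)
print(significant_lags(coeffs, len(demo)))  # [1, 2]
```

Read against the usual identification heuristics, significant lags that stop abruptly after lag q suggest a cut-off, while significant lags that decay gradually suggest tailing off.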
When the resulting ARIMA model is used for real-time prediction, the influenza-like illness (ILI) percentage data are forecast one week ahead by rolling forecasting. As time passes, the period of the historical ILI percentage data sequence may vary between prediction points. Therefore, each time a forecast is made one week ahead, a new prediction period needs to be determined from the ILI percentage data of the consecutive weeks before that week, and the model is updated dynamically, to improve the prediction accuracy of the ARIMA model.
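The rolling scheme described above can be sketched as a loop that, for each target week, takes the preceding window of history, re-selects the prediction period and refits the model. Everything here is hypothetical scaffolding: `rolling_forecast`, `fixed_period` and `seasonal_naive` are illustrative names, and the two callbacks are simple stand-ins for the period-selection step and the ARIMA fit, which the sketch does not implement.

```python
def rolling_forecast(series, window, n_targets, choose_period, fit_and_forecast):
    """Forecast the last `n_targets` points one step ahead, re-selecting the period each time."""
    first_target = len(series) - n_targets
    if first_target < window:
        raise ValueError("not enough history before the first target")
    results = []
    for target in range(first_target, len(series)):
        history = series[target - window:target]  # the `window` values before the target
        period = choose_period(history)           # period re-determined for every target
        results.append((target, period, fit_and_forecast(history, period)))
    return results


def fixed_period(history):
    """Stand-in for the dynamic period selection; always returns 52."""
    return 52


def seasonal_naive(history, period):
    """Stand-in for fitting ARIMA: repeat the value observed one period earlier."""
    return history[-period]


weekly = list(range(300))  # placeholder data, not real ILI percentages
for row in rolling_forecast(weekly, 260, 3, fixed_period, seasonal_naive):
    print(row)
# (297, 52, 245)
# (298, 52, 246)
# (299, 52, 247)
```

Passing the period selection and the model fit as callbacks keeps the sliding-window logic separate from the modeling choices, so either can be swapped without touching the loop.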
The method for generating a prediction model proposed in this embodiment determines a target region and a target time unit to be predicted, and obtains a preset period; obtains the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit; according to the preset period and a preset order k, lags the ILI percentage data sequence by 0 to k orders before and after the preset period, respectively, to obtain 2k+1 data sequences; calculates the autocorrelation coefficient between each of the 2k+1 data sequences and the ILI percentage data sequence and, following the lag order, determines the prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold; and calculates model parameters from the ILI percentage sequence for which the prediction period has been determined, and establishes an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period. When predicting the data of a target time unit, the present invention lags the data sequence before that target time unit by multiple orders, calculates the autocorrelation coefficients between the sequences, determines from those coefficients a prediction period suited to the current target time unit, and builds the model with that prediction period, which improves the prediction accuracy of the ARIMA model.
The present invention also provides a device for generating a prediction model. Fig. 2 is a schematic diagram of the internal structure of the device for generating a prediction model provided by an embodiment of the present invention.
In this embodiment, the device 1 for generating a prediction model can be a PC (personal computer), or a terminal device such as a smartphone, tablet computer or portable computer. The device 1 comprises at least a memory 11, a processor 12, a network interface 13 and a communication bus.
The memory 11 comprises at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, magnetic disk, optical disc and the like. In some embodiments the memory 11 can be an internal storage unit of the device 1, such as the hard disk of the device 1. In other embodiments the memory 11 can be an external storage device of the device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the device 1. Further, the memory 11 can comprise both an internal storage unit of the device 1 and an external storage device. The memory 11 can be used not only to store the application software installed on the device 1 and various types of data, such as the code of the model generation program 01, but also to temporarily store data that have been output or are to be output.
In some embodiments the processor 12 can be a central processing unit (CPU), controller, microcontroller, microprocessor or other data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the model generation program 01.
The network interface 13 can optionally comprise a standard wired interface and a wireless interface (such as a WI-FI interface), and is typically used to establish a communication connection between the device 1 and other electronic equipment.
The communication bus is used to realize the connection and communication between these components.
Optionally, the device 1 can also comprise a user interface. The user interface can comprise a display and an input unit such as a keyboard, and can optionally also comprise a standard wired interface and a wireless interface. Optionally, in some embodiments, the display can be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display can also be called a display screen or display unit where appropriate, and is used to display the information processed in the device 1 and to display a visual user interface.
Fig. 2 shows only the device 1 for generating a prediction model with the components 11-13 and the model generation program 01. Those skilled in the art will understand that the structure shown in Fig. 2 does not limit the device 1, which can comprise fewer or more components than shown, combine certain components, or arrange the components differently.
In the embodiment of the device 1 shown in Fig. 2, the model generation program 01 is stored in the memory 11; the processor 12 implements the following steps when executing the model generation program 01 stored in the memory 11:
Determining a target region and a target time unit to be predicted, and obtaining a preset period.
Obtaining the influenza-like illness (ILI) percentage data sequence of the target region for multiple consecutive time units before the target time unit.
According to the preset period and a preset order k, lagging the ILI percentage data sequence by 0 to k orders before and after the preset period, respectively, to obtain 2k+1 data sequences.
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the ILI percentage data sequence, and, following the lag order, determining the prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold.
Calculating model parameters from the ILI percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average (ARIMA) model as the prediction model according to the model parameters and the prediction period.
In this embodiment, the target region is a region for which the influenza-like illness (ILI) percentage is to be predicted, for example a city or a province. The scheme of this embodiment is illustrated with the week as the time unit. To predict the ILI percentage of the current week, an ARIMA (Autoregressive Integrated Moving Average) model is built from the historical ILI percentage data sequence of each week in the 5 years before the current week; that is, to predict the ILI percentage of week 261, the ILI percentage data of the 260 weeks before it are used to build the ARIMA model. As time passes, the target time unit moves forward and its historical data change. In addition, because the number of weeks per year and the solar terms may differ from year to year, the periodicity that the historical data actually exhibit can change. Without these factors, long-term observation of the ILI percentage data sequence shows an annual periodicity, that is, a period of 52 weeks; when these factors are present, the period may be shorter or longer than 52 weeks. Therefore, in the method of this embodiment, before the prediction period is dynamically adjusted, the preset period can be determined to be 52 weeks from the periodicity exhibited by the ILI percentage data, and the prediction period is then adjusted with 52 weeks as the baseline. For example, to predict the ILI percentage of the first week of June 2018, the ILI percentage data sequence of each week from 2013 to 2017 needs to be obtained. In other embodiments, the number of time units in the historical data is set by the user according to actual prediction needs.
Further, to improve the prediction accuracy of the ARIMA model to be established, the data sequence is checked for stationarity after it is obtained. If it is a stationary sequence, the subsequent steps are performed; if not, it is converted into a stationary influenza-like case percentage data sequence by differencing. Specifically, the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence includes: performing a unit root test on the influenza-like case percentage data sequence, wherein the sequence is determined to be non-stationary if a unit root is detected in it, and stationary otherwise.
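The stationarity check and differencing described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of this embodiment: it assumes a plain (non-augmented) Dickey-Fuller regression with an intercept, and the critical value of -2.86 is the approximate large-sample 5% value, used here as an assumption. The names `difference`, `dickey_fuller_t` and `make_stationary`, and the `max_order` limit, are hypothetical.

```python
import math


def difference(series, order=1):
    """Apply first differencing `order` times."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def dickey_fuller_t(series):
    """t-statistic of rho in: delta_y[t] = alpha + rho * y[t-1] + e[t]."""
    x = list(series[:-1])                            # y[t-1]
    y = [b - a for a, b in zip(series, series[1:])]  # delta_y[t]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0:
        return float("-inf")  # constant series: treated as stationary
    rho = sxy / sxx
    sse = sum(((b - mean_y) - rho * (a - mean_x)) ** 2 for a, b in zip(x, y))
    if sse == 0:
        # Perfect fit: the sign of rho decides (rho == 0 means a unit root).
        return float("-inf") if rho < 0 else float("inf")
    return rho / math.sqrt(sse / (n - 2) / sxx)


def make_stationary(series, critical_value=-2.86, max_order=2):
    """Difference until the unit root hypothesis is rejected.

    Returns (stationary_series, differencing_order).
    """
    current = list(series)
    for d in range(max_order + 1):
        if dickey_fuller_t(current) < critical_value:
            return current, d
        if d < max_order:
            current = difference(current)
    raise ValueError("series is still non-stationary after max_order differences")


# Usage: a trending series needs differencing, an oscillating one does not.
trending = [t + math.sin(t) for t in range(200)]
stationary_series, d = make_stationary(trending)
```

The returned order `d` plays the role of the differencing order that the description uses as parameter d of the ARIMA model. A production system would more likely rely on an augmented test from a statistics library.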
In this embodiment, the preset order k is a positive integer, preferably in the range 2 to 6. Taking predetermined period T0 = 52 and k = 2 as an example, the influenza-like case percentage data sequence is lagged by up to k orders on either side of the predetermined period T0, which yields 2k+1 data sequences. Assume that the original data sequence formed by the historical data from 2013 to 2017 is [W1, W2, W3 ... W260]. According to the predetermined period T0 = 52, the sequence L0 = [W1, W2, W3 ... W206] is extracted from the original data sequence, and this sequence is lagged by 0 to 2 orders before and after the predetermined period, giving the following 5 sequences:
L1 is [W51, W52, W53 ... W256];
L2 is [W52, W53, W54 ... W257];
L3 is [W53, W54, W55 ... W258];
L4 is [W54, W55, W56 ... W259];
L5 is [W55, W56, W57 ... W260].
Data sequence L1 is obtained by lagging L0 by 50 weeks within the original data sequence, L2 by lagging it 51 weeks, L3 by lagging it 52 weeks, L4 by lagging it 53 weeks, and L5 by lagging it 54 weeks. The autocorrelation coefficients between each of the sequences L1, L2, L3, L4, L5 and the sequence L0 are then calculated.
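The construction of the lagged sequences can be illustrated with a short Python sketch. It assumes that every window has the same length as L0 and that the longest lag (T0 + k) fixes that length, which is consistent with the 206-week sequences in the example; the function name `lagged_windows` is hypothetical.

```python
def lagged_windows(series, period, k):
    """Return (base, windows), where windows maps each lag in
    period-k .. period+k to the slice of `series` shifted by that lag."""
    length = len(series) - (period + k)  # longest lag limits the window size
    if length <= 0:
        raise ValueError("series is too short for the requested lags")
    base = series[:length]
    windows = {
        lag: series[lag:lag + length]
        for lag in range(period - k, period + k + 1)
    }
    return base, windows


# Usage with the 260 weekly labels of the example (T0 = 52, k = 2).
weeks = [f"W{i}" for i in range(1, 261)]
base, windows = lagged_windows(weeks, period=52, k=2)
print(base[0], base[-1])                 # W1 W206
print(windows[50][0], windows[50][-1])   # W51 W256
print(windows[54][0], windows[54][-1])   # W55 W260
```

Keying the windows by lag rather than by the labels L1 to L5 keeps the lag, in weeks, directly available when the prediction period is chosen.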
When the period is not affected by factors such as time or solar terms, that is, when the prediction period is 52 weeks, the data sequence for weeks 1 to 206 matches the data sequence for weeks 53 to 258 within a small error range, and the autocorrelation coefficient between L0 and L3 is then the largest. If other factors cause the period of the data sequence to change, the correlation between L0 and L3 weakens and the correlation between L0 and one of the other data sequences strengthens. Therefore, the calculated autocorrelation coefficients are examined in order from L1 to L5, the first sequence whose autocorrelation coefficient is greater than the preset threshold is identified, and the number of weeks by which that sequence is lagged is taken as the prediction period. For example, if the autocorrelation coefficient between L0 and L4 is the first among L1 to L5 to exceed the preset threshold, the current prediction period is determined to be 53 weeks.
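The selection rule described above can be sketched in Python as follows. The sketch assumes that the "autocorrelation coefficient" between L0 and a lagged sequence is the Pearson correlation of the two equal-length windows, and that the predetermined period is kept when no lag exceeds the threshold; the text does not specify either point, so both are assumptions. The names `pearson` and `select_prediction_period` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_prediction_period(series, period, k, threshold):
    """Scan lags period-k .. period+k in order and return the first lag whose
    correlation with the base window exceeds `threshold`."""
    length = len(series) - (period + k)
    base = series[:length]
    for lag in range(period - k, period + k + 1):
        if pearson(base, series[lag:lag + length]) > threshold:
            return lag
    return period  # assumption: fall back to the predetermined period


# Usage: a synthetic 260-week series that repeats every 53 weeks.
synthetic = [1.0 if t % 53 == 0 else 0.0 for t in range(260)]
chosen = select_prediction_period(synthetic, period=52, k=2, threshold=0.8)
```

Because the scan stops at the first lag above the threshold rather than at the maximum coefficient, the result depends on the threshold: on smooth data a low threshold can select a lag shorter than the true period.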
For the stationary influenza-like illness (ILI) sequence whose prediction period has been determined, the autocorrelation coefficients and partial autocorrelation coefficients are calculated in order to determine the model parameters p, d and q, where p and q are the AR and MA orders, and the ARIMA model is then established from these parameters and the prediction period. Specifically, the stationary data sequence obtained by differencing is acquired, and the order of differencing used in the conversion to a stationary sequence is taken as the value of parameter d of the ARIMA model. For the stationary data sequence whose prediction period has been determined, the autocorrelation and partial autocorrelation coefficients are calculated, and the autocorrelation plot and partial autocorrelation plot are drawn. From these plots it is judged whether the partial autocorrelation coefficients and the autocorrelation coefficients tail off or cut off, and a corresponding ARIMA model is selected accordingly to fit the stationarized data sequence. Parameter estimation is performed on the fitted ARIMA model to obtain the optimal orders p and q, and the model is then validated to determine the final prediction model. Specifically, under the exponential smoothing model, it is observed whether the residuals of the ARIMA model follow a normal distribution with mean 0 and constant variance, and whether successive residuals are uncorrelated; if so, the model is judged to pass validation.
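As an illustration of the quantities behind the autocorrelation and partial autocorrelation plots, the sketch below computes the sample autocorrelation function directly and derives the partial autocorrelations with the Durbin-Levinson recursion. The cut-off rule, which takes the largest lag whose coefficient lies outside ±1.96/√n, is a common rule of thumb added here as an assumption; the text itself only says that the plots are inspected. The names `acf`, `pacf` and `cutoff_lag` are hypothetical.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation coefficients r[0..max_lag]."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(series, max_lag)
    phi_prev = []  # phi_prev[j] holds phi_{k-1, j+1}
    out = [1.0]
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        out.append(phi_kk)
    return out


def cutoff_lag(coefficients, n):
    """Largest lag whose coefficient is outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(n)
    significant = [
        lag for lag, c in enumerate(coefficients) if lag > 0 and abs(c) > bound
    ]
    return max(significant) if significant else 0


# Usage on a toy alternating series; real input would be the stationary
# ILI sequence described above.
toy = [1.0, -1.0] * 50
r = acf(toy, 2)
partial = pacf(toy, 2)
```

Under the usual reading of such plots, a partial autocorrelation function that cuts off after lag p while the autocorrelation function tails off suggests an AR order of p, and an autocorrelation function that cuts off after lag q suggests an MA order of q; `cutoff_lag` only gives a starting candidate for either order.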
When the resulting ARIMA model is used for real-time prediction, the influenza-like case percentage data is forecast one week ahead by rolling prediction. As time passes, the period of the historical influenza-like case percentage data sequence may differ from one prediction point to the next. Therefore, each time a one-week-ahead prediction is made, a new prediction period must be determined from the influenza-like case percentage data of the consecutive weeks preceding that week, and the model is dynamically updated, so as to improve the prediction accuracy of the autoregressive integrated moving average model.
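The rolling update described above can be outlined as a loop that re-determines the period and refits before every one-week-ahead forecast. In the Python sketch below, the period selection and the model fit are passed in as callables so that the loop stays independent of any particular library. The name `rolling_forecast` and its parameters are hypothetical, and the usage example substitutes a fixed period and a seasonal-naive forecast for the period search and the ARIMA fit, purely to keep the example self-contained.

```python
def rolling_forecast(series, start, history, choose_period, fit_predict):
    """One-step-ahead rolling forecasts for indices start .. len(series)-1.

    choose_period(window)        -> prediction period for this window
    fit_predict(window, period)  -> forecast for the next time unit
    Returns a list of (target_index, period, forecast) tuples.
    """
    if start < history:
        raise ValueError("not enough history before the first target")
    results = []
    for target in range(start, len(series)):
        window = series[target - history:target]  # data before the target
        period = choose_period(window)            # re-determined each step
        results.append((target, period, fit_predict(window, period)))
    return results


# Usage with stand-ins: a fixed period of 4 and a seasonal-naive forecast
# (the value one period back) in place of period selection and ARIMA.
demo = [t % 4 for t in range(20)]
forecasts = rolling_forecast(
    demo,
    start=12,
    history=12,
    choose_period=lambda window: 4,
    fit_predict=lambda window, period: window[-period],
)
```

Each tuple records the period that was in force for that forecast, which makes it easy to see when the dynamically determined period drifts away from the 52-week baseline.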
The prediction model generating device proposed in this embodiment determines the target area and target time unit to be predicted and obtains a predetermined period; obtains the influenza-like case percentage data sequence of the target area for a plurality of consecutive time units before the target time unit; according to the predetermined period and the preset order k, lags the influenza-like case percentage data sequence by 0 to k orders before and after the predetermined period, respectively, to obtain 2k+1 data sequences; calculates the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determines the prediction period from the first data sequence whose autocorrelation coefficient is greater than the preset threshold; and calculates model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishes an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model. When predicting the data of the target time unit, the present invention lags the data sequence preceding the target time unit by multiple orders, calculates the autocorrelation coefficients between the sequences, determines from those coefficients a prediction period suited to the current target time unit, and models with that period, thereby improving the prediction accuracy of the autoregressive integrated moving average model.
Optionally, in other embodiments, the model generation program may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present invention. A module in the present invention refers to a series of computer program instruction segments capable of performing a specific function, and is used to describe the execution of the model generation program in the prediction model generating device.
For example, Fig. 3 is a schematic diagram of the program modules of the model generation program in one embodiment of the prediction model generating device of the present invention. In this embodiment, the model generation program may be divided into a data determining module 10, a retrieval module 20, a data computation module 30 and a model generation module 40. Illustratively:
The data determining module 10 is configured to: determine the target area and target time unit to be predicted, and obtain a predetermined period;
and obtain the influenza-like case percentage data sequence of the target area for a plurality of consecutive time units before the target time unit;
The retrieval module 20 is configured to: according to the predetermined period and the preset order k, lag the influenza-like case percentage data sequence by 0 to k orders before and after the predetermined period, respectively, to obtain 2k+1 data sequences;
The data computation module 30 is configured to: calculate the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determine the prediction period from the first data sequence whose autocorrelation coefficient is greater than the preset threshold;
The model generation module 40 is configured to: calculate model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establish an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model.
The functions or operation steps implemented when program modules such as the data determining module 10, the retrieval module 20, the data computation module 30 and the model generation module 40 are executed are substantially the same as those of the above embodiments and are not repeated here.
In addition, an embodiment of the present invention further proposes a computer readable storage medium on which a model generation program is stored, the model generation program being executable by one or more processors to implement the following operations:
Determining the target area and target time unit to be predicted, and obtaining a predetermined period;
Obtaining the influenza-like case percentage data sequence of the target area for a plurality of consecutive time units before the target time unit;
According to the predetermined period and the preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the predetermined period, respectively, to obtain 2k+1 data sequences;
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determining the prediction period from the first data sequence whose autocorrelation coefficient is greater than the preset threshold;
Calculating model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model. The specific embodiments of the computer readable storage medium of the present invention are substantially the same as the embodiments of the prediction model generating device and method described above and are not repeated here.
It should be noted that the serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments. The terms "include", "comprise" or any other variant thereof herein are intended to cover non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, device, article or method that includes the element.
From the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, and can of course also be implemented by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A method for generating a prediction model, characterized in that the method comprises:
Determining a target area and a target time unit to be predicted, and obtaining a predetermined period;
Obtaining an influenza-like case percentage data sequence of the target area for a plurality of consecutive time units before the target time unit;
According to the predetermined period and a preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the predetermined period, respectively, to obtain 2k+1 data sequences;
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determining a prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold;
Calculating model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model.
2. The method for generating a prediction model according to claim 1, characterized in that, after the step of obtaining the influenza-like case percentage data sequence of the target area for a plurality of consecutive time units before the target time unit, the method further comprises the steps of:
Detecting whether the influenza-like case percentage data sequence is a stationary sequence;
If so, performing the step of obtaining the predetermined period determined according to the periodicity exhibited by the influenza-like case percentage data;
If not, converting the influenza-like case percentage data sequence into a stationary sequence by differencing.
3. The method for generating a prediction model according to claim 2, characterized in that the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence comprises:
Performing a unit root test on the influenza-like case percentage data to detect whether the influenza-like case percentage data sequence is a stationary sequence, wherein the sequence is determined to be non-stationary if a unit root is detected in it, and stationary otherwise.
4. The method for generating a prediction model according to any one of claims 1 to 3, characterized in that the step of obtaining a predetermined period comprises:
Determining the predetermined period according to the periodicity exhibited by the influenza-like case percentage data.
5. The method for generating a prediction model according to any one of claims 1 to 3, characterized in that the step of calculating model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model, comprises:
Calculating the autocorrelation coefficients and partial autocorrelation coefficients of the stationary influenza-like case percentage data sequence for which the prediction period has been determined, and drawing the autocorrelation plot and the partial autocorrelation plot;
Judging, from the autocorrelation plot and the partial autocorrelation plot, whether the calculated partial autocorrelation coefficients and autocorrelation coefficients tail off or cut off, and selecting an autoregressive integrated moving average model according to the result to fit the stationary influenza-like case percentage data sequence, so as to obtain the prediction model.
6. A device for generating a prediction model, characterized in that the device comprises a memory and a processor, the memory storing a model generation program executable on the processor, the model generation program implementing the following steps when executed by the processor:
Determining a target area and a target time unit to be predicted, and obtaining a predetermined period;
Obtaining an influenza-like case percentage data sequence of the target area for a plurality of consecutive time units before the target time unit;
According to the predetermined period and a preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the predetermined period, respectively, to obtain 2k+1 data sequences;
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determining a prediction period from the first data sequence whose autocorrelation coefficient is greater than a preset threshold;
Calculating model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model.
7. The device for generating a prediction model according to claim 6, characterized in that the model generation program is further executable by the processor to implement, after the step of obtaining the influenza-like case percentage data sequence of the target area for a plurality of consecutive time units before the target time unit, the following steps:
Detecting whether the influenza-like case percentage data sequence is a stationary sequence;
If so, performing the step of obtaining the predetermined period determined according to the periodicity exhibited by the influenza-like case percentage data;
If not, converting the influenza-like case percentage data sequence into a stationary sequence by differencing.
8. The device for generating a prediction model according to claim 7, characterized in that the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence comprises:
Performing a unit root test on the influenza-like case percentage data to detect whether the influenza-like case percentage data sequence is a stationary sequence, wherein the sequence is determined to be non-stationary if a unit root is detected in it, and stationary otherwise.
9. The device for generating a prediction model according to any one of claims 6 to 8, characterized in that the step of obtaining a predetermined period comprises:
Determining the predetermined period according to the periodicity exhibited by the influenza-like case percentage data.
10. A computer readable storage medium, characterized in that a model generation program is stored on the computer readable storage medium, the model generation program being executable by one or more processors to implement the steps of the method for generating a prediction model according to any one of claims 1 to 5.
CN201810768332.2A 2018-07-13 2018-07-13 Generation method and device of prediction model and computer readable storage medium Active CN109243619B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810768332.2A CN109243619B (en) 2018-07-13 2018-07-13 Generation method and device of prediction model and computer readable storage medium
PCT/CN2018/107488 WO2020010710A1 (en) 2018-07-13 2018-09-26 Method and apparatus for generating prediction model, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810768332.2A CN109243619B (en) 2018-07-13 2018-07-13 Generation method and device of prediction model and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN109243619A true CN109243619A (en) 2019-01-18
CN109243619B CN109243619B (en) 2023-03-31

Family

ID=65072559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810768332.2A Active CN109243619B (en) 2018-07-13 2018-07-13 Generation method and device of prediction model and computer readable storage medium

Country Status (2)

Country Link
CN (1) CN109243619B (en)
WO (1) WO2020010710A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112259239B (en) * 2020-10-21 2023-07-11 平安科技(深圳)有限公司 Parameter processing method and device, electronic equipment and storage medium
CN113380423B (en) * 2021-05-24 2024-06-18 首都医科大学 Epidemic situation scale prediction method and device, electronic equipment and storage medium
CN117147807B (en) * 2023-11-01 2024-01-26 中海(天津)能源科技有限公司 Oil quality monitoring system and method for petroleum exploration
CN117457096B (en) * 2023-12-26 2024-03-22 山东省海洋资源与环境研究院(山东省海洋环境监测中心、山东省水产品质量检验中心) Dynamic monitoring and adjusting system for simulating carbon dioxide dissolution in ocean acidification device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633254A (en) * 2017-07-25 2018-01-26 平安科技(深圳)有限公司 Establish device, method and the computer-readable recording medium of forecast model
CN107688872A (en) * 2017-08-20 2018-02-13 平安科技(深圳)有限公司 Forecast model establishes device, method and computer-readable recording medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108268967B (en) * 2017-01-04 2021-01-26 北京京东尚科信息技术有限公司 Method and system for predicting telephone traffic
CN107145714B (en) * 2017-04-07 2020-05-22 浙江大学城市学院 Multi-factor-based public bicycle usage amount prediction method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
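姜世强 等 (Jiang Shiqiang et al.): "基于ARIMA模型的流感样病例预警预测分析" (Early warning and prediction analysis of influenza-like cases based on an ARIMA model) *
李广智 等 (Li Guangzhi et al.): "自回归求和移动平均模型在流感发病预测中的应用" (Application of the autoregressive integrated moving average model in predicting influenza incidence) *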

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109916361A (en) * 2019-03-04 2019-06-21 中国计量科学研究院 A kind of roundness measurement signal processing method without angular position information
CN113035368A (en) * 2021-04-13 2021-06-25 桂林电子科技大学 Disease propagation prediction method based on differential migration diagram neural network
CN113537631A (en) * 2021-08-04 2021-10-22 北方健康医疗大数据科技有限公司 Method and device for predicting medicine demand, electronic equipment and storage medium
CN113537631B (en) * 2021-08-04 2023-11-10 北方健康医疗大数据科技有限公司 Medicine demand prediction method, device, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2020010710A1 (en) 2020-01-16
CN109243619B (en) 2023-03-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant