CN109243619A - Method and device for generating a prediction model, and computer-readable storage medium - Google Patents
- Publication number
- CN109243619A (application number CN201810768332.2A)
- Authority
- CN
- China
- Prior art keywords
- influenza
- sequence
- data sequence
- predetermined period
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/80—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Biomedical Technology (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method for generating a prediction model. The method comprises: determining a target region, a target time unit to be predicted, and a preset period; obtaining the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit; according to the preset period and a preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the preset period to obtain 2k+1 data sequences; calculating the autocorrelation coefficients between the 2k+1 data sequences and the influenza-like case percentage data sequence, and determining the prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold; and calculating model parameters and establishing an autoregressive integrated moving average (ARIMA) model from the model parameters and the prediction period as the prediction model. The invention also proposes a device for generating a prediction model and a computer-readable storage medium. The invention improves the prediction accuracy of the ARIMA model.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method and device for generating a prediction model and a computer-readable storage medium.
Background
For influenza prediction, a common approach at present is to predict the influenza-like case percentage with an autoregressive integrated moving average (ARIMA) model. When an ARIMA model is used for influenza prediction, a fixed period, such as one year or half a year, is usually set for the region according to the pattern of its historical influenza-like case percentage data, and the model is built on that period. However, a fixed period may ignore factors that do not recur regularly, such as the differing number of weeks in each year and the shifting of solar terms, which causes large deviations in the prediction results. For example, different years contain different numbers of samples: some years, such as 2013, have 53 weeks, so the length of the yearly period varies. Modeling with a fixed period can therefore lead to large deviations in the model's predictions.
Summary of the invention
The present invention provides a method and device for generating a prediction model and a computer-readable storage medium, whose main purpose is to improve the prediction accuracy of the autoregressive integrated moving average (ARIMA) model.
To achieve the above object, the present invention provides a method for generating a prediction model, the method comprising:
Determining a target region and a target time unit to be predicted, and obtaining a preset period;
Obtaining the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit;
According to the preset period and a preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the preset period respectively, to obtain 2k+1 data sequences;
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence, and, in lag order, determining the prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold;
Calculating model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishing an ARIMA model from the model parameters and the prediction period as the prediction model.
Optionally, after the step of obtaining the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit, the method further comprises the steps of:
Detecting whether the influenza-like case percentage data sequence is a stationary sequence;
If so, executing the step of obtaining the preset period determined according to the periodicity exhibited by the influenza-like case percentage data;
If not, converting the influenza-like case percentage data sequence into a stationary sequence by differencing.
Optionally, the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence includes:
Performing a unit root test on the influenza-like case percentage data to detect whether the influenza-like case percentage data sequence is stationary, wherein if a unit root is detected in the sequence, the sequence is judged to be non-stationary; otherwise, the sequence is judged to be stationary.
Optionally, the step of obtaining the preset period includes:
Determining the preset period according to the periodicity exhibited by the influenza-like case percentage data.
Optionally, the step of calculating model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishing an ARIMA model from the model parameters and the prediction period as the prediction model, includes:
Calculating the autocorrelation coefficients and partial autocorrelation coefficients of the stationary influenza-like case percentage data sequence for which the prediction period has been determined, and drawing the autocorrelation plot and the partial autocorrelation plot;
Judging, from the autocorrelation plot and the partial autocorrelation plot, whether the calculated partial autocorrelation coefficients and autocorrelation coefficients tail off or cut off, and selecting an ARIMA model according to the result to fit the stationary influenza-like case percentage data sequence, so as to obtain the prediction model.
In addition, to achieve the above object, the present invention also provides a device for generating a prediction model. The device includes a memory and a processor; the memory stores a model generation program that can run on the processor, and the model generation program implements the following steps when executed by the processor:
Determining a target region and a target time unit to be predicted, and obtaining a preset period;
Obtaining the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit;
According to the preset period and a preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the preset period respectively, to obtain 2k+1 data sequences;
Calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence, and, in lag order, determining the prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold;
Calculating model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishing an ARIMA model from the model parameters and the prediction period as the prediction model.
Optionally, the model generation program can also be executed by the processor so that, after the step of obtaining the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit, the following steps are also implemented:
Detecting whether the influenza-like case percentage data sequence is a stationary sequence;
If so, executing the step of obtaining the preset period determined according to the periodicity exhibited by the influenza-like case percentage data;
If not, converting the influenza-like case percentage data sequence into a stationary sequence by differencing.
Optionally, the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence includes:
Performing a unit root test on the influenza-like case percentage data to detect whether the influenza-like case percentage data sequence is stationary, wherein if a unit root is detected in the sequence, the sequence is judged to be non-stationary; otherwise, the sequence is judged to be stationary.
Optionally, the step of obtaining the preset period includes:
Determining the preset period according to the periodicity exhibited by the influenza-like case percentage data.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium. A model generation program is stored on the computer-readable storage medium, and the model generation program can be executed by one or more processors to implement the steps of the method for generating a prediction model described above.
The method and device for generating a prediction model and the computer-readable storage medium proposed by the present invention determine a target region and a target time unit to be predicted, and obtain a preset period; obtain the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit; according to the preset period and a preset order k, lag the influenza-like case percentage data sequence by 0 to k orders before and after the preset period respectively, to obtain 2k+1 data sequences; calculate the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, in lag order, determine the prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold; and calculate model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establish an ARIMA model from the model parameters and the prediction period as the prediction model. When predicting the data of the target time unit, the present invention lags the data sequence before the target time unit by multiple orders, calculates the autocorrelation coefficients between the sequences, determines from those coefficients a prediction period suited to the current target time unit, and models with that prediction period, which improves the prediction accuracy of the ARIMA model.
Brief description of the drawings
Fig. 1 is a flow diagram of the method for generating a prediction model provided by one embodiment of the present invention;
Fig. 2 is a schematic diagram of the internal structure of the device for generating a prediction model provided by one embodiment of the present invention;
Fig. 3 is a module diagram of the model generation program in the device for generating a prediction model provided by one embodiment of the present invention.
The realization of the object, the functional features and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein merely illustrate the present invention and are not intended to limit it.
The present invention provides a method for generating a prediction model. Fig. 1 is a flow diagram of the method for generating a prediction model provided by one embodiment of the present invention. The method can be executed by a device, and the device can be implemented in software and/or hardware.
In this embodiment, the method for generating a prediction model includes:
Step S10: determine a target region and a target time unit to be predicted, and obtain a preset period.
Step S20: obtain the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit.
The target region in this embodiment is a region for which the influenza-like case percentage is to be predicted, for example a city or a province. The scheme of this embodiment is explained below with a week as the time unit. Suppose the influenza-like case percentage of the current week is to be predicted; an ARIMA (Autoregressive Integrated Moving Average) model is then built from the historical influenza-like case percentage data sequence for each week in the 5 years before this week. That is, to predict the influenza-like case percentage of the 261st week, the influenza-like case percentage data of the 260 weeks before it are used to build the ARIMA model. As time passes, the target time unit moves forward and its historical data change. In addition, because the number of weeks per year and the solar terms vary from year to year, the periodicity that the historical data actually exhibit may change. Without these factors, long-term observation of the influenza-like case percentage data sequence shows an annual periodicity, i.e. a period of 52 weeks; with them, the period may be slightly shorter or longer than 52 weeks. Therefore, in the method of this embodiment, before the prediction period is dynamically adjusted, the preset period can be determined to be 52 weeks according to the periodicity exhibited by the influenza-like case percentage data, and the prediction period is then adjusted using 52 weeks as the baseline. Suppose the influenza-like case percentage of the first week of June 2018 is to be predicted; the influenza-like case percentage data sequence for each week from 2013 to 2017 then needs to be obtained. In other embodiments, the number of time units in the historical data is set by the user according to actual prediction needs.
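The window selection described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `history_window`, the 0-based indexing, and the default of 260 weeks (taken from the 5-year example) are assumptions made for the sketch.

```python
def history_window(series, target_index, n_units=260):
    """Return the n_units observations immediately before target_index.

    series       -- weekly values, oldest first (0-based list)
    target_index -- 0-based position of the week to be predicted
    n_units      -- assumed window length; 260 weeks mirrors the 5-year example
    """
    start = target_index - n_units
    if start < 0:
        raise ValueError("not enough history before the target time unit")
    return series[start:target_index]


# Week 261 sits at 0-based index 260, so its history is weeks 1..260.
weeks = [f"W{i}" for i in range(1, 301)]
window = history_window(weeks, target_index=260)
```

When the target moves forward by one week, the same call with `target_index=261` returns weeks 2 to 261, which is the sliding behaviour the paragraph describes.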
Further, to improve the prediction accuracy of the ARIMA model to be established, after the above data sequence is obtained, it is checked whether the data sequence is stationary. If it is, the subsequent steps continue; if not, the data sequence is converted into a stationary influenza-like case percentage data sequence by differencing. Specifically, the step of detecting whether the influenza-like case percentage data sequence is stationary includes: performing a unit root test on the influenza-like case percentage data to detect whether the sequence is stationary, wherein if a unit root is detected in the sequence, the sequence is judged to be non-stationary; otherwise, the sequence is judged to be stationary.
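The stationarity check and differencing step can be sketched as below. The patent does not say which unit root test is used, so this sketch assumes the simplest Dickey-Fuller form (a regression of the first difference on a constant and the previous value, with no extra lag terms) and the commonly tabulated large-sample 5% critical value of about -2.86. The names `dickey_fuller_t`, `is_stationary`, `difference` and `make_stationary`, and the cap `max_d=2`, are hypothetical. A production system would more likely call a library routine such as an augmented Dickey-Fuller test.

```python
import math

# Approximate 5% critical value for the Dickey-Fuller regression with a
# constant (large-sample tables); an assumption of this sketch.
DF_CRITICAL_5PCT = -2.86


def dickey_fuller_t(series):
    """t-statistic of b in the regression  dy_t = a + b * y_(t-1) + e_t."""
    x = series[:-1]
    dy = [nxt - cur for cur, nxt in zip(series[:-1], series[1:])]
    n = len(dy)
    mean_x, mean_dy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    b = sxy / sxx
    a = mean_dy - b * mean_x
    rss = sum((di - a - b * xi) ** 2 for xi, di in zip(x, dy))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return b / std_err


def is_stationary(series):
    """Reject the unit-root hypothesis when the statistic is below the bound."""
    return dickey_fuller_t(series) < DF_CRITICAL_5PCT


def difference(series):
    """First-order differencing: y_t - y_(t-1)."""
    return [nxt - cur for cur, nxt in zip(series[:-1], series[1:])]


def make_stationary(series, max_d=2):
    """Difference until the test passes; return (sequence, differencing order d)."""
    current, d = list(series), 0
    while d < max_d and not is_stationary(current):
        current = difference(current)
        d += 1
    return current, d
```

The returned order `d` is the quantity that the description later uses as the parameter d of the ARIMA model. The sketch does not guard against degenerate inputs such as a perfectly linear series, where the residual variance is zero.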
Step S30: according to the preset period and a preset order k, lag the influenza-like case percentage data sequence by 0 to k orders before and after the preset period respectively, to obtain 2k+1 data sequences.
Step S40: calculate the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, in lag order, determine the prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold.
In this embodiment, the preset order k is a positive integer, preferably between 2 and 6. Taking a preset period T0 = 52 and k = 2 as an example, the data sequences obtained by lagging the influenza-like case percentage data sequence by up to k orders around the preset period T0 are taken, which gives 2k+1 data sequences. Suppose the original data sequence formed by the historical data from 2013 to 2017 is [W1, W2, W3 ... W260]. According to the preset period T0 = 52, the sequence L0 = [W1, W2, W3 ... W206] is extracted from the original data sequence, and this sequence is lagged by 0 to 2 orders before and after the preset period respectively, giving the following 5 sequences:
L1 is [W51, W52, W53 ... W256];
L2 is [W52, W53, W54 ... W257];
L3 is [W53, W54, W55 ... W258];
L4 is [W54, W55, W56 ... W259];
L5 is [W55, W56, W57 ... W260].
Lagging the original data sequence by 50 weeks gives data sequence L1, by 51 weeks gives L2, by 52 weeks gives L3, by 53 weeks gives L4, and by 54 weeks gives L5. The autocorrelation coefficients between each of the sequences L1, L2, L3, L4, L5 and the sequence L0 are then calculated.
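The construction of L0 and the 2k+1 lagged sequences can be sketched as follows. The function name `lagged_sequences` and the choice to key the result by lag in weeks are assumptions of this sketch; the slicing follows the T0 = 52, k = 2 example above, where every sequence has 260 - 54 = 206 elements.

```python
def lagged_sequences(series, preset_period, k):
    """Build the base sequence L0 and the 2k+1 sequences lagged around the preset period.

    Returns (base, lagged) where lagged maps each lag in
    preset_period - k .. preset_period + k to a sequence of the same length as base.
    """
    max_lag = preset_period + k
    length = len(series) - max_lag          # 260 - 54 = 206 in the example
    if length <= 0:
        raise ValueError("series is too short for the requested lags")
    base = series[:length]
    lagged = {
        lag: series[lag:lag + length]
        for lag in range(preset_period - k, preset_period + k + 1)
    }
    return base, lagged


weeks = [f"W{i}" for i in range(1, 261)]
L0, by_lag = lagged_sequences(weeks, preset_period=52, k=2)
# by_lag[50] corresponds to L1, by_lag[52] to L3, by_lag[54] to L5.
```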
When the period is not affected by factors such as the year or the solar terms, i.e. when the prediction period is 52 weeks, the data sequence from week 1 to week 206 and the data sequence from week 53 to week 258 are the same within a small error range, and the autocorrelation coefficient between L0 and L3 is then the largest. If other factors cause the period of the data sequence to change, however, the correlation between L0 and L3 weakens and the correlation between L0 and one of the other data sequences becomes stronger. Therefore, from the autocorrelation coefficients calculated above, the first sequence whose autocorrelation coefficient exceeds the preset threshold is determined in order from L1 to L5, and the number of weeks by which that sequence is lagged is used as the prediction period. For example, if the autocorrelation coefficient between L0 and L4 is the first among L1 to L5 to exceed the preset threshold, the current prediction period is determined to be 53 weeks.
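The selection rule can be sketched as below. Two points are assumptions of this sketch rather than statements from the patent: the "autocorrelation coefficient" between L0 and each lagged sequence is computed as a Pearson correlation, and the function returns `None` when no lag exceeds the threshold (the text does not say what happens in that case; falling back to the preset period would be one option). The names `pearson` and `select_prediction_period` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)


def select_prediction_period(series, preset_period, k, threshold):
    """Scan lags preset_period-k .. preset_period+k in order and return the
    first lag whose correlation with the base sequence exceeds threshold."""
    length = len(series) - (preset_period + k)
    base = series[:length]
    for lag in range(preset_period - k, preset_period + k + 1):
        if pearson(base, series[lag:lag + length]) > threshold:
            return lag
    return None  # assumption: no lag qualified; the caller decides the fallback


# Synthetic series with a one-week peak every 53 weeks.
demo = [5.0 if t % 53 == 0 else 2.0 for t in range(260)]
period = select_prediction_period(demo, preset_period=52, k=2, threshold=0.9)
```

Because the scan stops at the first lag above the threshold rather than at the maximum, the threshold has to be strict enough that neighbouring lags of a smooth seasonal curve do not qualify early.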
Step S50: calculate model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establish an ARIMA model from the model parameters and the prediction period as the prediction model.
For the stationary influenza-like case percentage sequence for which the prediction period has been determined, the autocorrelation coefficients and partial autocorrelation coefficients are calculated to determine the AR and MA parameters of the model, i.e. the values of p, d and q, and the ARIMA model is then established from these parameters and the prediction period. Specifically, the stationary data sequence obtained by differencing is taken, and the order of differencing used in the conversion to a stationary sequence is used as the value of the parameter d of the ARIMA model. For the stationary data sequence for which the prediction period has been determined, the autocorrelation coefficients and partial autocorrelation coefficients are calculated and the autocorrelation plot and partial autocorrelation plot are drawn. From these plots it is judged whether the partial autocorrelation coefficients and the autocorrelation coefficients tail off or cut off, and a corresponding ARIMA model is selected accordingly to fit the stationarized data sequence. Parameter estimation is performed on the fitted ARIMA model to obtain the optimal orders p and q, and the model is then checked for validity to determine the final prediction model. Specifically, under the exponential smoothing model, it is observed whether the residuals of the ARIMA model follow a normal distribution with mean 0 and constant variance, and whether successive residuals are uncorrelated; if so, the model is judged to pass the check.
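The autocorrelation and partial autocorrelation calculation can be sketched as follows. The sample autocorrelation and the Durbin-Levinson recursion for the partial autocorrelation are standard; the `cutoff_lag` helper, which treats the largest lag outside an approximate 95% band of 1.96 / sqrt(n) as the cut-off point, is only a rough stand-in, assumed for this sketch, for the visual "tails off or cuts off" judgement the text describes. All function names are hypothetical.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / c0
        for lag in range(max_lag + 1)
    ]


def pacf(rho):
    """Partial autocorrelations from autocorrelations rho[0..m] (Durbin-Levinson)."""
    result = [1.0]
    prev = []                      # prev[j] holds phi_(m-1, j+1)
    for m in range(1, len(rho)):
        num = rho[m] - sum(prev[j] * rho[m - 1 - j] for j in range(m - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(m - 1))
        phi_mm = num / den
        prev = [prev[j] - phi_mm * prev[m - 2 - j] for j in range(m - 1)] + [phi_mm]
        result.append(phi_mm)
    return result


def cutoff_lag(coeffs, n_obs):
    """Largest lag >= 1 whose coefficient lies outside +/- 1.96 / sqrt(n_obs); 0 if none."""
    bound = 1.96 / math.sqrt(n_obs)
    significant = [lag for lag, c in enumerate(coeffs) if lag > 0 and abs(c) > bound]
    return max(significant) if significant else 0


def suggest_orders(stationary_series, max_lag=10):
    """Heuristic (p, q): p from where the PACF cuts off, q from where the ACF cuts off."""
    rho = acf(stationary_series, max_lag)
    n = len(stationary_series)
    return cutoff_lag(pacf(rho), n), cutoff_lag(rho, n)
```

Under the usual reading of these plots, a partial autocorrelation that cuts off after lag p while the autocorrelation tails off points to an AR(p) term, and the reverse points to an MA(q) term. The suggested orders are only starting values for the parameter estimation and validity check described above.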
When the resulting ARIMA model is used for real-time prediction, the influenza-like case percentage data are predicted one week ahead by rolling forecasting. As time passes, the period of the historical influenza-like case percentage data sequence may differ for different prediction points. Therefore, each time a one-week-ahead prediction is made, a new prediction period needs to be determined from the influenza-like case percentage data of the consecutive weeks before that week, and the model is updated dynamically, so as to improve the prediction accuracy of the ARIMA model.
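The rolling one-week-ahead procedure can be sketched as a loop that re-determines the period for every target week. The period selection and the model fitting are passed in as callables because their details are covered elsewhere; `rolling_forecast`, `choose_period` and `fit_and_predict` are hypothetical names, and the 260-week window is taken from the earlier example.

```python
def rolling_forecast(series, first_target, choose_period, fit_and_predict, window=260):
    """One-step-ahead rolling forecast with a freshly determined period per target.

    series          -- observed weekly values, oldest first
    first_target    -- 0-based index of the first week to forecast
    choose_period   -- callable(history) -> prediction period for this target
    fit_and_predict -- callable(history, period) -> forecast for the next week
    Returns a list of (target_index, period, forecast) tuples.
    """
    if first_target < window:
        raise ValueError("not enough history before the first target")
    forecasts = []
    # The last target is len(series): the first week that has not been observed yet.
    for target in range(first_target, len(series) + 1):
        history = series[target - window:target]
        period = choose_period(history)            # dynamic period update
        forecasts.append((target, period, fit_and_predict(history, period)))
    return forecasts


# Stand-ins for illustration only: a fixed 52-week period and a seasonal-naive
# forecast (repeat the value observed one period earlier) instead of an ARIMA fit.
observed = list(range(300))
results = rolling_forecast(
    observed,
    first_target=260,
    choose_period=lambda history: 52,
    fit_and_predict=lambda history, period: history[-period],
)
```

In a real system the two callables would be replaced by the lag-correlation period selection and by an ARIMA fit on the stationarized history.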
The method for generating a prediction model proposed in this embodiment determines a target region and a target time unit to be predicted, and obtains a preset period; obtains the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit; according to the preset period and a preset order k, lags the influenza-like case percentage data sequence by 0 to k orders before and after the preset period respectively, to obtain 2k+1 data sequences; calculates the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, in lag order, determines the prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold; and calculates model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establishes an ARIMA model from the model parameters and the prediction period as the prediction model. When predicting the data of the target time unit, the present invention lags the data sequence before the target time unit by multiple orders, calculates the autocorrelation coefficients between the sequences, determines from those coefficients a prediction period suited to the current target time unit, and models with that prediction period, which improves the prediction accuracy of the ARIMA model.
The present invention also provides a device for generating a prediction model. Fig. 2 is a schematic diagram of the internal structure of the device for generating a prediction model provided by one embodiment of the present invention.
In this embodiment, the device 1 for generating a prediction model may be a PC (Personal Computer), or a terminal device such as a smartphone, a tablet computer or a portable computer. The device 1 includes at least a memory 11, a processor 12, a network interface 13 and a communication bus.
The memory 11 includes at least one type of readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk or an optical disc. In some embodiments the memory 11 may be an internal storage unit of the device 1, such as the hard disk of the device 1. In other embodiments the memory 11 may be an external storage device of the device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the device 1. Further, the memory 11 may include both an internal storage unit of the device 1 and an external storage device. The memory 11 can be used not only to store the application software installed on the device 1 and various types of data, such as the code of the model generation program 01, but also to temporarily store data that has been output or is to be output.
In some embodiments the processor 12 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the model generation program 01.
The network interface 13 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is typically used to establish a communication connection between the device 1 and other electronic equipment.
The communication bus is used to connect these components so that they can communicate with one another.
Optionally, the device 1 may also include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. Optionally, in some embodiments the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be called a display screen or display unit, and is used to show the information processed in the device 1 and to present a visual user interface.
Fig. 2 shows only the device 1 for generating a prediction model with components 11-13 and the model generation program 01. Those skilled in the art will understand that the structure shown in Fig. 2 does not limit the device 1, which may include fewer or more components than shown, combine certain components, or use a different arrangement of components.
In the embodiment of the device 1 shown in Fig. 2, the model generation program 01 is stored in the memory 11, and the processor 12 implements the following steps when executing the model generation program 01 stored in the memory 11:
Determine a target region and a target time unit to be predicted, and obtain a preset period.
Obtain the influenza-like case percentage data sequence of the target region for multiple consecutive time units before the target time unit.
According to the preset period and a preset order k, lag the influenza-like case percentage data sequence by 0 to k orders before and after the preset period respectively, to obtain 2k+1 data sequences.
Calculate the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, in lag order, determine the prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold.
Calculate model parameters from the influenza-like case percentage sequence for which the prediction period has been determined, and establish an ARIMA model from the model parameters and the prediction period as the prediction model.
The target region in this embodiment is a region for which the influenza-like case percentage is to be predicted, for example a city or a province. The scheme of this embodiment is explained below with a week as the time unit. Suppose the influenza-like case percentage of the current week is to be predicted; an ARIMA (Autoregressive Integrated Moving Average) model is then built from the historical influenza-like case percentage data sequence for each week in the 5 years before this week. That is, to predict the influenza-like case percentage of the 261st week, the influenza-like case percentage data of the 260 weeks before it are used to build the ARIMA model. As time passes, the target time unit moves forward and its historical data change. In addition, because the number of weeks per year and the solar terms vary from year to year, the periodicity that the historical data actually exhibit may change. Without these factors, long-term observation of the influenza-like case percentage data sequence shows an annual periodicity, i.e. a period of 52 weeks; with them, the period may be slightly shorter or longer than 52 weeks. Therefore, in the method of this embodiment, before the prediction period is dynamically adjusted, the preset period can be determined to be 52 weeks according to the periodicity exhibited by the influenza-like case percentage data, and the prediction period is then adjusted using 52 weeks as the baseline. Suppose the influenza-like case percentage of the first week of June 2018 is to be predicted; the influenza-like case percentage data sequence for each week from 2013 to 2017 then needs to be obtained. In other embodiments, the number of time units in the historical data is set by the user according to actual prediction needs.
Further, to improve the prediction accuracy of the ARIMA model to be built, after the above data sequence is obtained, it is checked whether the data sequence is a stationary sequence. If it is, the subsequent steps continue; if it is not, the data sequence is converted into a stationary influenza-like case percentage data sequence by differencing. Specifically, the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence includes: performing a unit root test on the influenza-like case percentage data; if a unit root is detected in the sequence, the sequence is judged to be non-stationary, otherwise the sequence is judged to be stationary.
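As a rough illustration of the stationarity check and differencing described above, the following Python sketch uses a simplified Dickey-Fuller regression (with a constant and no lagged difference terms) as the unit root test. The text does not say which unit root test is used, so the test variant, the 5% critical value of -2.86, the cap `max_d` and all function names are assumptions made for illustration only.

```python
from math import sqrt

# Asymptotic 5% Dickey-Fuller critical value for a regression with a constant.
# Assumption: the source does not name a test variant or significance level.
DF_CRITICAL_5PCT = -2.86


def dickey_fuller_t(y):
    """t-statistic of gamma in  dy_t = alpha + gamma * y_{t-1} + e_t."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(dy)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    alpha = mean_dy - gamma * mean_x
    rss = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, dy))
    std_err = sqrt(rss / (n - 2) / sxx)
    return gamma / std_err


def is_stationary(y):
    """Reject the unit root (treat the series as stationary) below the critical value."""
    return dickey_fuller_t(y) < DF_CRITICAL_5PCT


def difference(y):
    """First-order differencing: y_t - y_{t-1}."""
    return [b - a for a, b in zip(y, y[1:])]


def make_stationary(y, max_d=2):
    """Difference until the test passes; return the series and the order d used."""
    d = 0
    while not is_stationary(y) and d < max_d:
        y = difference(y)
        d += 1
    return y, d
```

The order `d` returned here plays the role of the differencing order that the text later uses as the ARIMA parameter d. A production implementation would more likely use an augmented test from a statistics library.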
In this embodiment, the preset order K is a positive integer, preferably between 2 and 6. Taking the preset period T0 = 52 and K = 2 as an example, the influenza-like case percentage data sequence is lagged by up to K orders on either side of the preset period T0, which yields 2K+1 data sequences. Suppose the original data sequence formed by the historical data from 2013 to 2017 is [W1, W2, W3 ... W260]. According to the preset period T0 = 52, the sequence L0 = [W1, W2, W3 ... W206] is extracted from the original data sequence, and lagging by 0 to 2 orders before and after the preset period gives the following 5 sequences:
L1 is [W51, W52, W53 ... W256];
L2 is [W52, W53, W54 ... W257];
L3 is [W53, W54, W55 ... W258];
L4 is [W54, W55, W56 ... W259];
L5 is [W55, W56, W57 ... W260].
That is, data sequence L1 is obtained by lagging the original data sequence by 50 weeks, L2 by lagging it 51 weeks, L3 by lagging it 52 weeks, L4 by lagging it 53 weeks, and L5 by lagging it 54 weeks. The autocorrelation coefficient between each of the sequences L1, L2, L3, L4, L5 and the sequence L0 is then calculated.
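The construction of L0 and the 2K+1 lagged sequences can be sketched in Python as follows. This is only an illustration of the indexing in the example above; the function name `lagged_windows` and the choice to key the result by lag in weeks are assumptions, not part of the original text.

```python
def lagged_windows(series, preset_period, k):
    """Return the base window L0 and the 2k+1 windows lagged by T0-k .. T0+k.

    All windows share one length so that each can be compared with L0
    element by element.
    """
    max_lag = preset_period + k
    length = len(series) - max_lag  # 260 - 54 = 206 in the text's example
    if length <= 0:
        raise ValueError("series is too short for the requested lags")
    base = series[:length]
    lagged = {
        lag: series[lag:lag + length]
        for lag in range(preset_period - k, preset_period + k + 1)
    }
    return base, lagged


# Usage with placeholder labels instead of real percentages.
weeks = [f"W{i}" for i in range(1, 261)]
l0, windows = lagged_windows(weeks, preset_period=52, k=2)
```

With 260 weekly values, T0 = 52 and K = 2, the base window covers W1 to W206 and the window for lag 52 covers W53 to W258, matching the example in the text.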
When the period is not affected by factors such as time or the solar terms, i.e., when the prediction period is 52 weeks, the data of weeks 1 to 206 and the data of weeks 53 to 258 agree within a small error range, and the autocorrelation coefficient between L0 and L3 is then the largest. If other factors cause the period of the data sequence to change, however, the correlation between L0 and L3 weakens and the correlation between L0 and one of the other data sequences strengthens. Therefore, the autocorrelation coefficients calculated above are examined in order from L1 to L5, the first sequence whose autocorrelation coefficient exceeds the preset threshold is identified, and the lag of that sequence in weeks is taken as the prediction period. For example, if the autocorrelation coefficient between L0 and L4 is the first among L1 to L5 to exceed the preset threshold, the current prediction period is determined to be 53 weeks.
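A minimal Python sketch of this selection rule is shown below. It assumes the "autocorrelation coefficient" between L0 and a lagged sequence is the Pearson correlation of the two equal-length windows, and it falls back to the preset period when no lag exceeds the threshold; the text specifies neither detail, and the function names are hypothetical.

```python
from math import sqrt


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def choose_prediction_period(series, preset_period, k, threshold):
    """Scan lags T0-k .. T0+k in order; return the first lag above the threshold."""
    length = len(series) - (preset_period + k)
    if length < 2:
        raise ValueError("series is too short for the requested lags")
    base = series[:length]
    for lag in range(preset_period - k, preset_period + k + 1):
        if correlation(base, series[lag:lag + length]) > threshold:
            return lag
    # Assumption: keep the preset period if no lag passes the threshold.
    return preset_period
```

Because the scan stops at the first lag above the threshold rather than at the maximum, the result depends on how the threshold is chosen; a threshold that is too low would tend to select the smallest lag.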
For the stationary influenza-like illness (ILI) sequence whose prediction period has been determined, the autocorrelation coefficients and partial autocorrelation coefficients are computed to determine the AR and MA parameters of the model, i.e., the values of p, d and q, and the ARIMA model is then established from these parameters and the prediction period. Specifically, the stationary data sequence obtained by differencing is taken, and the order of differencing used in the conversion to a stationary sequence serves as the value of the parameter d of the ARIMA model. For the stationary data sequence whose prediction period has been determined, the autocorrelation coefficients and partial autocorrelation coefficients are computed, and the autocorrelation plot and partial autocorrelation plot are drawn. From these plots it is judged whether the partial autocorrelation coefficients and the autocorrelation coefficients tail off or cut off, and a corresponding ARIMA model is selected accordingly to fit the stationarized data sequence. Parameter estimation is performed on the fitted ARIMA model to obtain the optimal orders p and q, and the model is then checked for validity to determine the final prediction model. Specifically, under the exponential smoothing model, it is observed whether the residuals of the ARIMA model follow a normal distribution with mean 0 and constant variance, and whether successive residuals are correlated; if the residuals follow such a distribution and are uncorrelated, the model is judged to pass the check.
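To make the autocorrelation and partial autocorrelation step concrete, the sketch below computes sample autocorrelations and derives partial autocorrelations from them with the Durbin-Levinson recursion. The text only says the coefficients are computed and plotted, so the recursion, the approximate 95% band of 1.96/sqrt(n), and the helper `last_significant_lag` are illustrative assumptions rather than the method actually used.

```python
from math import sqrt


def acf(series, max_lag):
    """Sample autocorrelations rho_0 .. rho_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / c0
        for lag in range(max_lag + 1)
    ]


def pacf_from_acf(rho):
    """Partial autocorrelations phi_11 .. phi_KK via Durbin-Levinson (rho[0] == 1)."""
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1}
    for k in range(1, len(rho)):
        if k == 1:
            phi_kk = rho[1]
            cur = [phi_kk]
        else:
            num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
            cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf


def last_significant_lag(values, n):
    """Largest lag (1-based) whose coefficient lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / sqrt(n)
    significant = [i + 1 for i, v in enumerate(values) if abs(v) > bound]
    return max(significant) if significant else 0
```

Under the usual reading of such plots, a partial autocorrelation that cuts off after lag p while the autocorrelation tails off suggests an AR(p) component, and an autocorrelation that cuts off after lag q while the partial autocorrelation tails off suggests an MA(q) component. The helper gives only a crude numeric stand-in for that visual judgment.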
When the resulting ARIMA model is used for real-time prediction, the influenza-like case percentage data are predicted one week ahead by rolling forecasting. As time passes, the period of the historical influenza-like case percentage data sequence may differ from one prediction point to the next. Therefore, each time a one-week-ahead prediction is made, a new prediction period is determined from the influenza-like case percentage data of the consecutive weeks preceding that week, and the model is dynamically updated, so as to improve the prediction accuracy of the autoregressive integrated moving average model.
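The rolling, one-week-ahead procedure can be outlined as a loop that re-determines the period before every forecast. In the Python sketch below, the period detector and the forecaster are passed in as functions; `seasonal_naive` is a deliberately simple stand-in that repeats the value from one period earlier and is not the ARIMA model of this embodiment. All names are hypothetical.

```python
def rolling_forecast(series, start, choose_period, fit_and_forecast):
    """One-step-ahead forecasts for series[start:], refreshing the period each step.

    choose_period(history) returns the period in weeks for the current history;
    fit_and_forecast(history, period) returns the forecast for the next week.
    """
    predictions = []
    periods = []
    for target in range(start, len(series)):
        history = series[:target]  # only data before the target week
        period = choose_period(history)
        periods.append(period)
        predictions.append(fit_and_forecast(history, period))
    return predictions, periods


def seasonal_naive(history, period):
    """Stand-in forecaster (not ARIMA): repeat the value from one period ago."""
    return history[-period]
```

In a fuller implementation, `choose_period` would apply the lag-and-threshold rule described earlier and `fit_and_forecast` would refit the ARIMA model on the history with that period.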
The device for generating a prediction model proposed in this embodiment determines the target area and the target time unit to be predicted, and obtains the preset period; obtains the influenza-like case percentage data sequence of the target area for multiple consecutive time units before the target time unit; according to the preset period and a preset order k, lags the influenza-like case percentage data sequence by 0 to k orders before and after the preset period, obtaining 2k+1 data sequences; calculates the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determines the prediction period from the first data sequence whose autocorrelation coefficient exceeds the preset threshold; and computes model parameters from the influenza-like case percentage sequence whose prediction period has been determined, and establishes an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model. When predicting the data of a target time unit, the present invention lags the data sequence preceding that time unit by multiple orders, calculates the autocorrelation coefficients between the sequences, determines from them a prediction period suited to the current target time unit, and models with that prediction period, which improves the prediction accuracy of the autoregressive integrated moving average model.
Optionally, in other embodiments, the model generation program may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present invention. A module, as referred to in the present invention, is a series of computer program instruction segments capable of performing a specific function, and is used to describe the execution of the model generation program in the device for generating a prediction model.
For example, Fig. 3 is a schematic diagram of the program modules of the model generation program in one embodiment of the device for generating a prediction model of the present invention. In this embodiment, the model generation program may be divided into a data determining module 10, a retrieval module 20, a data computation module 30 and a model generation module 40. By way of example:
The data determining module 10 is configured to: determine the target area and the target time unit to be predicted, and obtain the preset period;
and obtain the influenza-like case percentage data sequence of the target area for multiple consecutive time units before the target time unit;
The retrieval module 20 is configured to: according to the preset period and the preset order k, lag the influenza-like case percentage data sequence by 0 to k orders before and after the preset period, obtaining 2k+1 data sequences;
The data computation module 30 is configured to: calculate the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determine the prediction period from the first data sequence whose autocorrelation coefficient exceeds the preset threshold;
The model generation module 40 is configured to: compute model parameters from the influenza-like case percentage sequence whose prediction period has been determined, and establish an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model.
The functions or operation steps implemented when program modules such as the data determining module 10, the retrieval module 20, the data computation module 30 and the model generation module 40 are executed are substantially the same as those of the above embodiments and are not repeated here.
In addition, an embodiment of the present invention also proposes a computer readable storage medium on which a model generation program is stored. The model generation program can be executed by one or more processors to implement the following operations:
determining the target area and the target time unit to be predicted, and obtaining the preset period;
obtaining the influenza-like case percentage data sequence of the target area for multiple consecutive time units before the target time unit;
according to the preset period and the preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the preset period, obtaining 2k+1 data sequences;
calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determining the prediction period from the first data sequence whose autocorrelation coefficient exceeds the preset threshold;
computing model parameters from the influenza-like case percentage sequence whose prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model.
The specific embodiments of the computer readable storage medium of the present invention are substantially the same as the embodiments of the device and method for generating a prediction model described above and are not repeated here.
It should be noted that the serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments. The terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, device, article or method that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, device, article or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, device, article or method that includes the element.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present invention.
Claims (10)
1. A method for generating a prediction model, characterized in that the method comprises:
determining a target area and a target time unit to be predicted, and obtaining a preset period;
obtaining an influenza-like case percentage data sequence of the target area for multiple consecutive time units before the target time unit;
according to the preset period and a preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the preset period, obtaining 2k+1 data sequences;
calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determining a prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold;
computing model parameters from the influenza-like case percentage sequence whose prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model.
2. The method for generating a prediction model according to claim 1, characterized in that, after the step of obtaining the influenza-like case percentage data sequence of the target area for multiple consecutive time units before the target time unit, the method further comprises the steps of:
detecting whether the influenza-like case percentage data sequence is a stationary sequence;
if so, performing the step of obtaining the preset period determined according to the periodicity exhibited by the influenza-like case percentage data;
if not, converting the influenza-like case percentage data sequence into a stationary sequence by differencing.
3. The method for generating a prediction model according to claim 2, characterized in that the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence comprises:
performing a unit root test on the influenza-like case percentage data to detect whether the influenza-like case percentage data sequence is a stationary sequence, wherein if a unit root is detected in the sequence, the sequence is judged to be non-stationary, and otherwise the sequence is judged to be stationary.
4. The method for generating a prediction model according to any one of claims 1 to 3, characterized in that the step of obtaining the preset period comprises:
determining the preset period according to the periodicity exhibited by the influenza-like case percentage data.
5. The method for generating a prediction model according to any one of claims 1 to 3, characterized in that the step of computing model parameters from the influenza-like case percentage sequence whose prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model, comprises:
calculating the autocorrelation coefficients and partial autocorrelation coefficients of the stationary influenza-like case percentage data sequence whose prediction period has been determined, and drawing an autocorrelation plot and a partial autocorrelation plot;
judging, from the autocorrelation plot and the partial autocorrelation plot, whether the calculated partial autocorrelation coefficients and autocorrelation coefficients tail off or cut off, and selecting an autoregressive integrated moving average model according to the result to fit the stationary influenza-like case percentage data sequence, so as to obtain the prediction model.
6. A device for generating a prediction model, characterized in that the device comprises a memory and a processor, the memory stores a model generation program that can run on the processor, and the model generation program, when executed by the processor, implements the following steps:
determining a target area and a target time unit to be predicted, and obtaining a preset period;
obtaining an influenza-like case percentage data sequence of the target area for multiple consecutive time units before the target time unit;
according to the preset period and a preset order k, lagging the influenza-like case percentage data sequence by 0 to k orders before and after the preset period, obtaining 2k+1 data sequences;
calculating the autocorrelation coefficient between each of the 2k+1 data sequences and the influenza-like case percentage data sequence and, following the lag order, determining a prediction period from the first data sequence whose autocorrelation coefficient exceeds a preset threshold;
computing model parameters from the influenza-like case percentage sequence whose prediction period has been determined, and establishing an autoregressive integrated moving average model from the model parameters and the prediction period as the prediction model.
7. The device for generating a prediction model according to claim 6, characterized in that the model generation program can further be executed by the processor so that, after the step of obtaining the influenza-like case percentage data sequence of the target area for multiple consecutive time units before the target time unit, the following steps are also implemented:
detecting whether the influenza-like case percentage data sequence is a stationary sequence;
if so, performing the step of obtaining the preset period determined according to the periodicity exhibited by the influenza-like case percentage data;
if not, converting the influenza-like case percentage data sequence into a stationary sequence by differencing.
8. The device for generating a prediction model according to claim 7, characterized in that the step of detecting whether the influenza-like case percentage data sequence is a stationary sequence comprises:
performing a unit root test on the influenza-like case percentage data to detect whether the influenza-like case percentage data sequence is a stationary sequence, wherein if a unit root is detected in the sequence, the sequence is judged to be non-stationary, and otherwise the sequence is judged to be stationary.
9. The device for generating a prediction model according to any one of claims 6 to 8, characterized in that the step of obtaining the preset period comprises:
determining the preset period according to the periodicity exhibited by the influenza-like case percentage data.
10. A computer readable storage medium, characterized in that a model generation program is stored on the computer readable storage medium, and the model generation program can be executed by one or more processors to implement the steps of the method for generating a prediction model according to any one of claims 1 to 5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810768332.2A CN109243619B (en) | 2018-07-13 | 2018-07-13 | Generation method and device of prediction model and computer readable storage medium |
PCT/CN2018/107488 WO2020010710A1 (en) | 2018-07-13 | 2018-09-26 | Method and apparatus for generating prediction model, and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109243619A (en) | 2019-01-18 |
CN109243619B (en) | 2023-03-31 |
Family
ID=65072559
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810768332.2A Active CN109243619B (en) | 2018-07-13 | 2018-07-13 | Generation method and device of prediction model and computer readable storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109243619B (en) |
WO (1) | WO2020010710A1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112259239B (en) * | 2020-10-21 | 2023-07-11 | 平安科技(深圳)有限公司 | Parameter processing method and device, electronic equipment and storage medium |
CN113380423B (en) * | 2021-05-24 | 2024-06-18 | 首都医科大学 | Epidemic situation scale prediction method and device, electronic equipment and storage medium |
CN117147807B (en) * | 2023-11-01 | 2024-01-26 | 中海(天津)能源科技有限公司 | Oil quality monitoring system and method for petroleum exploration |
CN117457096B (en) * | 2023-12-26 | 2024-03-22 | 山东省海洋资源与环境研究院(山东省海洋环境监测中心、山东省水产品质量检验中心) | Dynamic monitoring and adjusting system for simulating carbon dioxide dissolution in ocean acidification device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107633254A (en) * | 2017-07-25 | 2018-01-26 | 平安科技(深圳)有限公司 | Establish device, method and the computer-readable recording medium of forecast model |
CN107688872A (en) * | 2017-08-20 | 2018-02-13 | 平安科技(深圳)有限公司 | Forecast model establishes device, method and computer-readable recording medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108268967B (en) * | 2017-01-04 | 2021-01-26 | 北京京东尚科信息技术有限公司 | Method and system for predicting telephone traffic |
CN107145714B (en) * | 2017-04-07 | 2020-05-22 | 浙江大学城市学院 | Multi-factor-based public bicycle usage amount prediction method |
Non-Patent Citations (2)
Title |
---|
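姜世强 等 (Jiang Shiqiang et al.): "基于ARIMA模型的流感样病例预警预测分析" (Early-warning and prediction analysis of influenza-like cases based on the ARIMA model) * |
李广智 等 (Li Guangzhi et al.): "自回归求和移动平均模型在流感发病预测中的应用" (Application of the autoregressive integrated moving average model in predicting influenza incidence) * |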
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109916361A (en) * | 2019-03-04 | 2019-06-21 | 中国计量科学研究院 | A kind of roundness measurement signal processing method without angular position information |
CN113035368A (en) * | 2021-04-13 | 2021-06-25 | 桂林电子科技大学 | Disease propagation prediction method based on differential migration diagram neural network |
CN113537631A (en) * | 2021-08-04 | 2021-10-22 | 北方健康医疗大数据科技有限公司 | Method and device for predicting medicine demand, electronic equipment and storage medium |
CN113537631B (en) * | 2021-08-04 | 2023-11-10 | 北方健康医疗大数据科技有限公司 | Medicine demand prediction method, device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2020010710A1 (en) | 2020-01-16 |
CN109243619B (en) | 2023-03-31 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||