CN111863276B - Hand, foot and mouth disease prediction method, electronic equipment and medium using fine-grained data - Google Patents


Info

Publication number
CN111863276B
Authority
CN
China
Prior art keywords
data
time
granularity
fine
grained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010704454.2A
Other languages
Chinese (zh)
Other versions
CN111863276A (en
Inventor
王智谨
黄耀辉
付永钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jimei University
Original Assignee
Jimei University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jimei University
Priority to CN202010704454.2A
Publication of CN111863276A
Application granted
Publication of CN111863276B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application provides a hand, foot and mouth disease (HFMD) prediction method and apparatus using fine-grained data, an electronic device, and a computer-readable medium. The method includes: obtaining historical case data of HFMD; preprocessing the historical case data and counting it into time-series data at two different time intervals; aggregating the time-series data at the two time intervals by time to obtain multivariate time-series data, and converting the multivariate time-series data into supervised data; training an HFMD prediction model on the supervised data; and inputting HFMD case data collected in real time into the trained HFMD prediction model to obtain a real-time prediction of the number of HFMD cases. The scheme counts the historical case data at time intervals of different granularity and uses the finer-grained time series to improve the accuracy of predicting the number of HFMD cases at the equal (target) time granularity. Because no additional auxiliary data is required, prediction efficiency also improves.

Figure 202010704454

Description

Hand, foot and mouth disease prediction method, electronic equipment and medium using fine-grained data

Technical Field

The present application relates to the technical field of public health forecasting, and in particular to a hand, foot and mouth disease (HFMD) prediction method and apparatus using fine-grained data, an electronic device, and a computer-readable medium.

Background

As global economic integration accelerates, economic activity and exchanges increase and population movement becomes ever more frequent. This creates favorable conditions for the spread and outbreak of disease, and public health problems are becoming increasingly serious. At the same time, the social and natural environments are changing: the growing number of events that affect public health, such as environmental pollution and natural disasters, also raises the likelihood of public health emergencies.

HFMD is one of the infectious diseases with the highest incidence and affects public health security worldwide. Early prediction plays an important role in early warning and decision support for the prevention and control of infectious diseases. The control of HFMD is also a concern for governments, medical institutions, and the general public.

When predicting the number of HFMD cases in a future time interval, traditional methods rely on the case counts observed in several earlier intervals of the same length, for example using the case counts for the same period in past years to predict the case count for the same period this year. The predictions of such traditional methods, however, are often unsatisfactory.

Summary of the Invention

The purpose of the present application is to provide an HFMD prediction method and apparatus using fine-grained data, an electronic device, and a computer-readable medium.

A first aspect of the present application provides an HFMD prediction method using fine-grained data, including:

S1. Obtaining historical case data of HFMD;

S2. Preprocessing the historical case data and counting it into time-series data at two different time intervals;

S3. Aggregating the time-series data at the two different time intervals by time to obtain multivariate time-series data, and converting the multivariate time-series data into supervised data;

S4. Training an HFMD prediction model on the supervised data;

S5. Inputting HFMD case data collected in real time into the trained HFMD prediction model to obtain a real-time prediction of the number of HFMD cases.

A second aspect of the present application provides an HFMD prediction apparatus using fine-grained data, including:

an acquisition module, configured to obtain historical case data of HFMD;

a preprocessing module, configured to preprocess the historical case data and count it into time-series data at two different time intervals;

an aggregation module, configured to aggregate the time-series data at the two different time intervals by time to obtain multivariate time-series data, and to convert the multivariate time-series data into supervised data;

a model training module, configured to train an HFMD prediction model on the supervised data;

a prediction module, configured to input HFMD case data collected in real time into the trained HFMD prediction model to obtain a real-time prediction of the number of HFMD cases.

A third aspect of the present application provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when running the computer program, performs the method of the first aspect of the present application.

A fourth aspect of the present application provides a computer-readable medium storing computer-readable instructions that can be executed by a processor to implement the method of the first aspect of the present application.

Compared with the prior art, the HFMD prediction method, apparatus, electronic device, and medium using fine-grained data provided by the present application obtain historical case data of HFMD; preprocess the historical case data and count it into time-series data at two different time intervals; aggregate the time-series data at the two time intervals by time to obtain multivariate time-series data and convert it into supervised data; train an HFMD prediction model on the supervised data; and input HFMD case data collected in real time into the trained model to obtain a real-time prediction of the number of HFMD cases. The scheme counts the historical case data at time intervals of different granularity to train the prediction model. By incorporating the finer-grained time series, the model improves the accuracy of predicting the number of HFMD cases at the equal (target) time granularity, and because no additional auxiliary data is required, prediction efficiency also improves.

Description of the Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered as limiting the present application. Throughout the drawings, the same reference numerals denote the same components. In the drawings:

Fig. 1 shows a flowchart of an HFMD prediction method using fine-grained data provided by some embodiments of the present application;

Fig. 2 shows a flowchart of a specific HFMD prediction method using fine-grained data provided by some embodiments of the present application;

Fig. 3 shows a schematic diagram of an HFMD prediction apparatus using fine-grained data provided by some embodiments of the present application;

Fig. 4 shows a schematic diagram of an electronic device provided by some embodiments of the present application.

Detailed Description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope conveyed fully to those skilled in the art.

It should be noted that, unless otherwise specified, technical or scientific terms used in the present application have the ordinary meaning understood by those skilled in the art to which the present application belongs.

In addition, the terms "first", "second", and the like are used to distinguish different objects rather than to describe a particular order. The terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or other steps or units inherent to that process, method, product, or device.

Because the disease has an incubation period, the vast majority of current HFMD prediction models cannot accurately predict HFMD outbreak periods. Mining the potential number of outbreak cases from fine-grained data is therefore a viable alternative.

The present application accordingly collects historical HFMD cases and sets two time intervals: one equal to the time interval of the prediction target and one finer-grained than it. It counts the cases to obtain a case-count time series at each of the two granularities, aggregates the two time series, and converts the time-series data into supervised data. The supervised data is used to train a time-series neural network that fuses the two kinds of data. The trained model then provides more accurate predictions than a model that uses only the time series whose interval equals that of the prediction target.

Specifically, the embodiments of the present application provide an HFMD prediction method and apparatus using fine-grained data, an electronic device, and a computer-readable medium, which are described below with reference to the accompanying drawings.

Referring to Fig. 1, which shows a flowchart of an HFMD prediction method using fine-grained data provided by some embodiments of the present application, the method may include the following steps:

Step S101: Obtain historical case data of HFMD;

Step S102: Preprocess the historical case data and count it into time-series data at two different time intervals;

Step S103: Aggregate the time-series data at the two different time intervals by time to obtain multivariate time-series data, and convert the multivariate time-series data into supervised data;

Step S104: Train an HFMD prediction model on the supervised data;

Step S105: Input HFMD case data collected in real time into the trained HFMD prediction model to obtain a real-time prediction of the number of HFMD cases.

Specifically, in step S101, the historical case data of HFMD may include the number of cases and the time of onset, for example the number of cases for each month, each quarter, or each year.

In step S102, preliminary data cleaning is performed on the historical case data: zero values are filtered out and invalid samples are removed. The cleaned data is then normalized, compressing the values of the historical case data into a fixed interval. The normalized data is then counted into time-series data at two different time intervals. Alternatively, the data may first be counted into time-series data at the two time intervals and then normalized; the present application does not limit the order.

The two different time intervals may be as follows. The first time interval is the same as the target time interval, and time-series data counted at the first interval is called equal-granularity time-series data. The second time interval is shorter than the target time interval, and time-series data counted at the second interval is called fine-grained time-series data.

For example, if the target time interval is one year, the first time interval is one year and the second time interval may be set to one month; the second time interval is a finer-grained subdivision of the first.
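The counting in step S102 can be illustrated with a minimal sketch, assuming a one-year target interval and a one-month fine-grained interval as in the example above. The record format (one onset date per case), the function name `count_two_granularities`, and the sample dates are all hypothetical and are not taken from the patent.

```python
from collections import Counter
from datetime import date


def count_two_granularities(onset_dates, months_per_year=12):
    """Count case records into yearly totals (equal granularity)
    and monthly counts (fine granularity)."""
    monthly = Counter((d.year, d.month) for d in onset_dates)
    # Only years that contain at least one record are listed; a real
    # pipeline would probably fill in empty years explicitly.
    years = sorted({d.year for d in onset_dates})
    # x[t][i] is the count for month i + 1 of year years[t].
    x = [[monthly.get((yr, m), 0) for m in range(1, months_per_year + 1)]
         for yr in years]
    # y[t] is the yearly total, i.e. the sum of that year's monthly counts.
    y = [sum(row) for row in x]
    return years, y, x


# Hypothetical records: one onset date per reported case.
records = [date(2018, 5, 3), date(2018, 5, 20), date(2018, 6, 1), date(2019, 5, 9)]
years, y, x = count_two_granularities(records)
print(years, y)   # yearly (equal-granularity) series
print(x[0])       # monthly (fine-grained) counts for the first year
```

Each yearly value then has twelve monthly values attached to it, which is the pairing the later aggregation step relies on.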

Let the number of first-type time intervals be $M$. The equal-granularity time-series data can then be written as $[y_1, y_2, \dots, y_M]$, and the number of cases in the upcoming time interval is denoted

Figure BDA0002594179870000051

Let each first-type time interval contain $N$ second-type time intervals. The fine-grained time-series data can then be written as $[x_1, x_2, \dots, x_M]$, where any

Figure BDA0002594179870000052

satisfies

Figure BDA0002594179870000053

and $x_{t,i}$ denotes the $i$-th fine-grained data point within equal-granularity time step $t$.

Because the data ranges differ markedly across time periods, the present invention uses Min-Max scaling to normalize the data to [0, 1]. The normalization formula is:

$$d' = \frac{d - \min(d)}{\max(d) - \min(d)},$$

where $d$ denotes the time-series data of a given granularity, $d'$ denotes the normalized time-series data, and $\min(\cdot)$ and $\max(\cdot)$ return the minimum and maximum of the input vector.
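A small worked example may make the scaling concrete. The sketch below uses the standard Min-Max formula, which matches the denormalization formula $d = d' * (\max(d) - \min(d)) + \min(d)$ that the document applies to the model output. The function names and the sample case counts are illustrative assumptions.

```python
def min_max_normalize(d):
    """Scale a series to [0, 1]; also return the min and max
    so that the scaling can be undone later."""
    lo, hi = min(d), max(d)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in d], lo, hi


def denormalize(d_norm, lo, hi):
    """Inverse of min_max_normalize: d = d' * (max - min) + min."""
    return [v * (hi - lo) + lo for v in d_norm]


# Hypothetical case counts for four consecutive intervals.
cases = [120, 340, 560, 230]
scaled, lo, hi = min_max_normalize(cases)
restored = denormalize(scaled, lo, hi)
print(scaled)    # [0.0, 0.5, 1.0, 0.25]
print(restored)  # [120.0, 340.0, 560.0, 230.0]
```

Keeping `lo` and `hi` from the training data is what allows a normalized model output to be mapped back to a case count.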

In step S103:

To capture temporal characteristics from the time series, the present invention converts the time-series data into supervised data. Given a time series, during the conversion the preceding several values serve as the model's input variables and the value that follows serves as the model's output.

Given the equal-granularity case-count time series $[y_1, y_2, \dots, y_M]$, let the constant $T$ be the number of time intervals that influence the case count of the next time interval. The series is converted into supervised data as:

Figure BDA0002594179870000056

Given the time series of the $i$-th fine-grained time step, $[x_{1,i}, x_{2,i}, \dots, x_{M,i}]$, with the same constant $T$, the series is converted into supervised data as:

Figure BDA0002594179870000057

where $i \in \{1, \dots, N\}$.

The time-series data at the two different time intervals is aggregated by time to obtain multivariate time-series data, which is converted into supervised data, expressed as:

Figure BDA0002594179870000058
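The conversion to supervised data can be sketched as a sliding window. The patent gives its own expressions only as equation images, so the layout below is an assumption: each sample pairs the previous `T` equal-granularity values and their fine-grained rows with the next equal-granularity value as the target. The function name `to_supervised` and the sample values are hypothetical.

```python
def to_supervised(y, x, T):
    """Build (equal_window, fine_window, target) samples.

    y: list of M equal-granularity values.
    x: list of M rows, each holding the N fine-grained values of one interval.
    T: number of past intervals assumed to influence the next one.
    """
    if len(y) != len(x):
        raise ValueError("y and x must cover the same intervals")
    if not 0 < T < len(y):
        raise ValueError("T must satisfy 0 < T < len(y)")
    samples = []
    for t in range(len(y) - T):
        # Inputs: intervals t .. t+T-1; output: interval t+T.
        samples.append((y[t:t + T], x[t:t + T], y[t + T]))
    return samples


# Hypothetical normalized data with M = 5 intervals and N = 2 sub-intervals.
y = [0.1, 0.4, 0.9, 0.3, 0.6]
x = [[0.0, 0.1], [0.2, 0.2], [0.5, 0.4], [0.1, 0.2], [0.3, 0.3]]
samples = to_supervised(y, x, T=3)
for equal_window, fine_window, target in samples:
    print(equal_window, fine_window, target)
```

With `M = 5` and `T = 3` this yields two samples, since a series of length `M` supports `M - T` windows.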

In step S104, the HFMD prediction model is trained on the supervised data described above. Refer to Fig. 2, which shows a flowchart of a specific HFMD prediction method using fine-grained data provided by some embodiments of the present application:

As shown in the figure, the HFMD prediction model includes an input layer, an equal-granularity sequence processing unit, a fine-grained sequence processing unit, a merge layer, a fully connected layer, and an output layer.

The input layer receives the supervised data. The equal-granularity sequence processing unit processes the equal-granularity time-series data, and the fine-grained sequence processing unit processes the fine-grained time-series data. The merge layer and the fully connected layer then correlate the outputs of the two processing units to generate an initial prediction, and the output layer denormalizes the initial prediction to produce the final prediction.

The specific processing is as follows:

I) The components of the equal-granularity sequence processing unit include a first GRU layer and a first linear layer. The first GRU layer processes the input equal-granularity time-series data and outputs its hidden state. The first linear layer combines the hidden state and outputs the predicted value of the equal-granularity sequence processing unit.

The GRU transformation is denoted $f_1$:

$$h_t = f_1(h_{t-1}, [y_1, y_2, \dots, y_T]),$$

where $h_t$ is the hidden state, $n$ is the number of hidden states, and $h_{t-1}$ is the hidden state at the previous time step. The predicted value from the equal-granularity time-series data is obtained through the linear layer:

$$c_y = w_y h_t + b_y,$$

where $c_y$ is the predicted value on the equal-granularity side, $w_y$ is the weight parameter, and $b_y$ is the bias term.
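As a rough illustration of the equal-granularity unit, the sketch below runs a GRU over a window $[y_1, \dots, y_T]$ and then applies $c_y = w_y h_t + b_y$. The text does not spell out the gate equations, so the standard GRU formulation (update gate, reset gate, candidate state) is assumed here. The names `gru_step`, `init_params`, and `equal_granularity_unit`, the random initialization, and the scalar-input shape are likewise illustrative assumptions, and no training is performed.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x_t, h_prev, p):
    """One standard GRU update for a scalar input x_t and a hidden state of size n."""
    n = len(h_prev)

    def affine(W, U, b, h):
        # W: n input weights, U: n x n recurrent weights, b: n biases.
        return [W[j] * x_t + sum(U[j][k] * h[k] for k in range(n)) + b[j]
                for j in range(n)]

    z = [sigmoid(v) for v in affine(p["Wz"], p["Uz"], p["bz"], h_prev)]  # update gate
    r = [sigmoid(v) for v in affine(p["Wr"], p["Ur"], p["br"], h_prev)]  # reset gate
    gated = [r[k] * h_prev[k] for k in range(n)]
    cand = [math.tanh(v) for v in affine(p["Wh"], p["Uh"], p["bh"], gated)]
    # Convention assumed here: h_t = (1 - z) * h_{t-1} + z * candidate.
    return [(1.0 - z[j]) * h_prev[j] + z[j] * cand[j] for j in range(n)]


def init_params(n, seed=0):
    """Random, untrained parameters; for illustration only."""
    rng = random.Random(seed)

    def vec():
        return [rng.uniform(-0.5, 0.5) for _ in range(n)]

    def mat():
        return [vec() for _ in range(n)]

    return {"Wz": vec(), "Uz": mat(), "bz": vec(),
            "Wr": vec(), "Ur": mat(), "br": vec(),
            "Wh": vec(), "Uh": mat(), "bh": vec()}


def equal_granularity_unit(window, params, w_y, b_y):
    """Run the GRU over the window, then apply c_y = w_y . h_T + b_y."""
    h = [0.0] * len(w_y)
    for y_t in window:
        h = gru_step(y_t, h, params)
    c_y = sum(w * v for w, v in zip(w_y, h)) + b_y
    return h, c_y


n = 4  # number of hidden states
params = init_params(n)
h, c_y = equal_granularity_unit([0.1, 0.4, 0.9], params, w_y=[0.25] * n, b_y=0.0)
print(len(h), c_y)
```

In practice a deep-learning framework's GRU layer would replace this hand-written cell; the sketch only shows how the window, the hidden state of size `n`, and the final linear map fit together.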

II) The components of the fine-grained sequence processing unit include a second linear layer, a softmax layer, a second GRU layer, and a third linear layer. The second linear layer applies a linear transformation to the input fine-grained data; the softmax layer emphasizes the fine-grained inputs from important time steps; the second GRU layer captures temporal characteristics; and the third linear layer combines the hidden state and outputs the predicted value of the fine-grained unit.

Highlighting periodic events through linear weighting is a key way to strengthen the extraction of input information. The linear weighting of the input features can be summarized as:

$$u = \sum w_u * [x_1, x_2, \dots, x_T] + b_u,$$

where $[x_1, x_2, \dots, x_T]$ is the fine-grained data corresponding to the $T$ input equal-granularity time intervals, $w_u$ is the weight, $b_u$ is the bias term, and $u$ is the linearized output.

After linear weighting, some of the data is only weakly correlated, or uncorrelated, with the prediction target time step. To ensure that the model can extract the latent information from all input features, softmax is used to weaken the influence of weakly correlated or uncorrelated data and to raise the weight of relevant data:

$$p = \mathrm{softmax}(e),$$

where $p$ is the output of this layer. For any element of $p$:

Figure BDA0002594179870000072

where $\exp(\cdot)$ is the exponential function and $e_{j,t}$ is an element of $e$.
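The re-weighting effect of softmax can be seen in a short sketch. The axis over which the patent normalizes is given only in an equation image, so the example simply applies softmax to one vector of scores. Subtracting the maximum before exponentiating is a common numerical-stability step added here, not something the text specifies, and the sample scores are hypothetical.

```python
import math


def softmax(scores):
    """Map scores to positive weights that sum to 1."""
    m = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]


# Hypothetical linearly weighted inputs for three time steps.
e = [0.2, 1.5, 0.1]
p = softmax(e)
print(p)
```

The largest score receives the largest weight and the weights sum to 1, so weakly relevant inputs are damped rather than discarded.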

The second GRU layer dynamically extracts the temporal characteristics in $p$:

$$h'_t = f_2(h'_{t-1}, p),$$

where $h'_t$ is the hidden state and $h'_{t-1}$ is the hidden state at the previous time step. The predicted value from the fine-grained time-series data is obtained through the linear layer:

$$c_e = w_e h'_t + b_e,$$

where $c_e$ is the predicted value on the fine-grained side, $w_e$ is the weight, and $b_e$ is the bias term.

III) In the merge layer, the output representations of the two processing units (the equal-granularity sequence processing unit and the fine-grained sequence processing unit) are merged. A fully connected layer then combines the equal-granularity and fine-grained time-series data, correlating the outputs of the two sides (the two processing units), and the output layer outputs a prediction that combines both granularities. This step can be summarized as:

Figure BDA0002594179870000076

where

Figure BDA0002594179870000077

is the merged vector of the two-sided outputs,

Figure BDA0002594179870000078

is the weight of these outputs,

Figure BDA0002594179870000079

is the predicted number of outpatients in the next time interval, and

Figure BDA00025941798700000710

is the bias term.
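A minimal sketch of the merge and fully connected step follows. It assumes the two side outputs are scalars that are concatenated and passed through a single linear map; the exact form in the patent is available only as an equation image, so this shape, the function name `merge_and_predict`, and the numeric values are illustrative assumptions.

```python
def merge_and_predict(c_y, c_e, w, b):
    """Concatenate the two side outputs and apply one fully connected (linear) map."""
    merged = [c_y, c_e]  # merged vector of the two-sided outputs
    return sum(w_i * c_i for w_i, c_i in zip(w, merged)) + b


# Hypothetical side outputs and weights, all in normalized units.
y_norm = merge_and_predict(c_y=0.5, c_e=0.25, w=[0.5, 0.5], b=0.125)
print(y_norm)  # 0.5
```

The result is still on the normalized scale and has to be denormalized before it can be read as a case count.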

The output prediction is then denormalized:

$$d = d' * (\max(d) - \min(d)) + \min(d).$$

The denormalization formula is applied in the method to restore the final prediction produced by the prediction model to the original scale of case counts.

V) The HFMD prediction model above uses the mean squared error (MSE) as its loss function, that is, its objective function. Iterative training minimizes the loss of the objective function and yields the optimal prediction model.

Specifically, the objective function is set to MSE, and the time-window size $T$ and the GRU hidden-state parameter $n$ are set. The data and the corresponding prediction model are generated according to steps S103 and S104, and the prediction model is trained. The denormalization function restores the model output to a predicted number of cases. The model parameters $T$ and $n$ are then adjusted according to the output to obtain the optimal model.
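The role of MSE in choosing `T` and `n` can be illustrated with a small sketch. The candidate settings and their predictions below are made-up numbers used only to show the comparison; they are not results from the patent, and the selection loop is an assumed reading of "adjust T and n according to the output".

```python
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed case counts on a held-out period.
actual = [120.0, 340.0, 560.0]

# Hypothetical denormalized predictions from two settings of (T, n).
candidates = {
    (3, 8): [100.0, 300.0, 600.0],
    (4, 16): [130.0, 330.0, 540.0],
}

losses = {cfg: mse(actual, pred) for cfg, pred in candidates.items()}
best = min(losses, key=losses.get)
print(losses)  # {(3, 8): 1200.0, (4, 16): 200.0}
print(best)    # (4, 16)
```

The setting with the lowest error on held-out data is kept; during training itself, the same MSE is the quantity that the iterative optimization minimizes.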

In step S105, hand, foot and mouth disease case data collected in real time are input into the trained hand, foot and mouth disease prediction model to obtain the number of cases predicted in real time.

As shown in Figure 2, after the output of the prediction model is denormalized, the prediction results are presented to public health personnel for use in disease prevention.

The above hand, foot and mouth disease prediction method using fine-grained data can be applied to a client. In the embodiments of the present application, the client may include hardware or software. When the client includes hardware, it may be any of various electronic devices that have a display screen and support information interaction, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When the client includes software, it may be installed on the above electronic devices, and may be implemented as multiple pieces of software or software modules, or as a single piece of software or software module. No specific limitation is made here.

Compared with the prior art, the hand, foot and mouth disease prediction method using fine-grained data provided by the present invention has the following advantages:

1. No external data source required

The present invention compiles the historical incidence data of hand, foot and mouth disease at two time intervals: one equal to the time interval of the prediction target, and one finer-grained than it. The finer-grained time series data are used to improve the accuracy of predicting the equal-granularity target. From the data perspective, the invention requires no additional auxiliary data; from the algorithmic perspective, it captures fine-grained incidence patterns to judge future trends.

2. Detecting crises from the details

The only data used in this method are hand, foot and mouth disease case data. The essence of the invention is to count the number of infections over finer-grained time periods in order to predict outbreaks over longer future periods, that is, to see the large from the small. Because the disease has an incubation period, it becomes possible to mine the potential number of outbreak cases from fine-grained data.

The above embodiments provide a hand, foot and mouth disease prediction method using fine-grained data; correspondingly, the present application also provides a hand, foot and mouth disease prediction device using fine-grained data. The device provided by the embodiments of the present application can implement the above method, and can be realized in software, hardware, or a combination of the two. For example, the device may include integrated or separate functional modules or units that perform the corresponding steps of the above methods. Refer to Figure 3, which shows a schematic diagram of a hand, foot and mouth disease prediction device using fine-grained data provided by some embodiments of the present application. Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments. The device embodiments described below are merely illustrative.

As shown in Figure 3, the hand, foot and mouth disease prediction device 10 using fine-grained data may include:

an acquisition module 101, configured to acquire historical case data of hand, foot and mouth disease;

a preprocessing module 102, configured to preprocess the historical case data and compile them into time series data at two different time intervals;

an aggregation module 103, configured to aggregate the time series data of the two different time intervals by time to obtain multivariate time series data, and to convert the multivariate time series data into supervised data;

a model training module 104, configured to train a hand, foot and mouth disease prediction model according to the supervised data;

a prediction module 105, configured to input hand, foot and mouth disease case data collected in real time into the trained hand, foot and mouth disease prediction model to obtain the number of cases predicted in real time.
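As a non-limiting illustration, the conversion to supervised data performed by the aggregation module 103 can be sketched in Python as a sliding window of T past intervals paired with the next interval's count. The exact layout of the supervised data is given only in formula images, so the tuple layout, the function name and the parameter names below are assumptions.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], List[List[float]], float]


def to_supervised(y: Sequence[float],
                  x: Sequence[Sequence[float]],
                  T: int) -> List[Sample]:
    """Build (equal-granularity window, fine-grained window, target) samples.

    y: equal-granularity series, one count per interval.
    x: fine-grained series, x[t] holding the N sub-interval counts of interval t.
    T: window size, i.e. the number of past intervals used as input.
    """
    if len(y) != len(x):
        raise ValueError("y and x must cover the same intervals")
    if not 0 < T < len(y):
        raise ValueError("T must be positive and smaller than the series length")
    samples: List[Sample] = []
    for t in range(T, len(y)):
        y_window = list(y[t - T:t])                 # past T coarse counts
        x_window = [list(r) for r in x[t - T:t]]    # matching fine-grained counts
        samples.append((y_window, x_window, y[t]))  # target: next interval
    return samples


# Hypothetical counts with N = 2 sub-intervals per interval.
demo = to_supervised([7, 9, 12, 8], [[3, 4], [4, 5], [6, 6], [5, 3]], T=2)
```

A series of M intervals yields M - T samples, so a larger T gives the model more context per sample but fewer samples to train on.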

In some implementations of the embodiments of the present application, the preprocessing module 102 is specifically configured to clean the historical case data and normalize the cleaned data.

In some implementations of the embodiments of the present application, of the two different time intervals, the first time interval is the same as the target time interval, and the time series data compiled at the first time interval are called equal-granularity time series data; the second time interval is smaller than the target time interval, and the time series data compiled at the second time interval are called fine-grained time series data.
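As a non-limiting illustration, the two granularities can be derived from a single series of counts recorded at the finer interval, as sketched below in Python. The sketch assumes that each equal-granularity count is the sum of its N fine-grained counts; the relation itself is given only in a formula image, so this is an assumption, as are the function and parameter names.

```python
from typing import List, Sequence, Tuple


def split_granularities(fine_counts: Sequence[float],
                        N: int) -> Tuple[List[float], List[List[float]]]:
    """Group fine-grained counts into intervals of N sub-intervals each.

    Returns (y, x): y[t] is the equal-granularity count of interval t and
    x[t] is the list of its N fine-grained counts.
    """
    if N <= 0 or len(fine_counts) % N != 0:
        raise ValueError("series length must be a positive multiple of N")
    x = [list(fine_counts[i:i + N]) for i in range(0, len(fine_counts), N)]
    y = [sum(row) for row in x]  # assumed relation: y_t is the sum of x_{t,i}
    return y, x


# Hypothetical counts: two intervals of N = 3 sub-intervals each.
y, x = split_granularities([1, 2, 3, 4, 5, 6], N=3)
```

For instance, with daily counts and weekly prediction targets, N would be 7 and each row of x would hold one week of daily counts.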

The hand, foot and mouth disease prediction device 10 using fine-grained data provided by the embodiments of the present application is based on the same inventive concept as the hand, foot and mouth disease prediction method using fine-grained data provided by the foregoing embodiments, and has the same beneficial effects.

The embodiments of the present application also provide an electronic device corresponding to the hand, foot and mouth disease prediction method using fine-grained data provided by the foregoing embodiments. The electronic device may be a client device, such as a mobile phone, notebook computer, tablet computer, or desktop computer, for executing the above method.

Refer to Figure 4, which shows a schematic diagram of an electronic device provided by some embodiments of the present application. As shown in Figure 4, the electronic device 20 includes a processor 200, a memory 201, a bus 202, and a communication interface 203; the processor 200, the communication interface 203, and the memory 201 are connected via the bus 202. The memory 201 stores a computer program executable on the processor 200, and when the processor 200 runs the computer program, it executes the hand, foot and mouth disease prediction method using fine-grained data provided by any of the foregoing embodiments of the present application.

The memory 201 may include high-speed random access memory (RAM), and may also include non-volatile memory, such as at least one disk storage device. The communication connection between this system network element and at least one other network element is realized through at least one communication interface 203 (wired or wireless), and may use the Internet, a wide area network, a local area network, a metropolitan area network, and so on.

The bus 202 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. The memory 201 is used to store a program, and the processor 200 executes the program after receiving an execution instruction. The hand, foot and mouth disease prediction method using fine-grained data disclosed in any of the foregoing embodiments of the present application may be applied in, or implemented by, the processor 200.

The processor 200 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 200 or by instructions in the form of software. The processor 200 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may reside in storage media well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory 201; the processor 200 reads the information in the memory 201 and completes the steps of the above method in combination with its hardware.

The electronic device provided by the embodiments of the present application is based on the same inventive concept as the hand, foot and mouth disease prediction method using fine-grained data provided by the embodiments of the present application, and has the same beneficial effects as the method it adopts, runs, or implements.

The embodiments of the present application also provide a computer-readable medium corresponding to the hand, foot and mouth disease prediction method using fine-grained data provided by the foregoing embodiments. A computer program (i.e., a program product) is stored on the medium, and when run by a processor, the computer program executes the hand, foot and mouth disease prediction method using fine-grained data provided by any of the foregoing embodiments.

It should be noted that examples of the computer-readable storage medium may also include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory, or other optical or magnetic storage media, which are not enumerated here one by one.

The computer-readable storage medium provided by the above embodiments of the present application is based on the same inventive concept as the hand, foot and mouth disease prediction method using fine-grained data provided by the embodiments of the present application, and has the same beneficial effects as the method adopted, run, or implemented by the application program stored on it.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application, and they shall all fall within the scope of the claims and the description of the present application.

Claims (6)

1. A hand-foot-and-mouth disease prediction method using fine-grained data is characterized by comprising the following steps:
S1, acquiring historical case data of hand-foot-and-mouth disease;
S2, preprocessing the historical case data, and compiling the historical case data into time series data at two different time intervals;
S3, aggregating the time series data of the two different time intervals by time to obtain multivariate time series data, and converting the multivariate time series data into supervised data;
S4, training a hand-foot-and-mouth disease prediction model according to the supervised data;
S5, inputting hand-foot-and-mouth disease case data collected in real time into the trained hand-foot-and-mouth disease prediction model to obtain the number of hand-foot-and-mouth disease cases predicted in real time;
wherein, of the two different time intervals in step S2, the first time interval is the same as the target time interval, and the time series data compiled at the first time interval are called equal-granularity time series data; the second time interval is smaller than the target time interval, and the time series data compiled at the second time interval are called fine-grained time series data;
letting the number of first time intervals be M, [y_1, y_2, …, y_M] denotes the equal-granularity time series data;
letting each first time interval contain N second time intervals, [x_1, x_2, …, x_M] denotes the fine-grained data corresponding to the M input equal-granularity time intervals, where any
Figure FDA0003991309840000011
satisfies
Figure FDA0003991309840000012
wherein x_{t,i} denotes the i-th fine-grained datum within the equal-granularity time interval t;
the preprocessing step in step S2 includes:
carrying out data cleaning on the historical case data, and carrying out normalization processing on the cleaned data;
the step S3 includes:
setting a constant T as the number of time intervals that influence the number of cases in the next time interval, and converting the equal-granularity time series data, expressed as:
Figure FDA0003991309840000013
letting [x_{1,i}, x_{2,i}, …, x_{M,i}] be the time series data of the i-th fine-grained moment, the converted time series data are expressed as:
Figure FDA0003991309840000021
wherein i ∈ {1, …, N};
aggregating the time series data of the two different time intervals by time to obtain multivariate time series data, and converting the multivariate time series data into supervised data, expressed as:
Figure FDA0003991309840000022
the specific processing is as follows:
I) the representation component of the equal-granularity sequence processing unit comprises: a first GRU layer and a first linear layer; the first GRU layer is used for processing the input equal-granularity time series data and outputting its hidden state; the first linear layer merges the hidden states and outputs the predicted value of the equal-granularity sequence processing unit;
the transformation of the GRU is denoted by f_1, and its formula is as follows:
h_t = f_1(h_{t-1}, [y_1, y_2, …, y_T]),
wherein
Figure FDA0003991309840000023
is the hidden state, and n represents the number of hidden states; h_{t-1} represents the hidden state at the previous time; the equal-granularity time series data are used to generate a predicted value, which can be obtained through a linear layer, with the formula:
c_y = w_y h_t + b_y
wherein
Figure FDA0003991309840000024
is the predicted value of the equal-granularity data side,
Figure FDA0003991309840000025
is the weight parameter, and
Figure FDA0003991309840000026
is the bias term;
II) the representation component of the fine-granularity sequence processing unit comprises: a second linear layer, a softmax layer, a second GRU layer, and a third linear layer; the second linear layer applies a linear transformation to the input fine-grained data, the softmax layer is used for strengthening the fine-grained data input at important moments, the second GRU layer is used for capturing the time series characteristics, and the third linear layer is used for merging the hidden states and outputting the predicted value of the fine-grained component;
highlighting periodic events by linear weighting is a key method for enhancing the extraction of input information; the linear weighting of the input elements can be summarized as follows:
u = ∑ w_u * [x_1, x_2, …, x_T] + b_u
wherein
Figure FDA0003991309840000027
represents the fine-grained data corresponding to the T input equal-granularity time intervals,
Figure FDA0003991309840000028
is the weight, b_u is the bias term, and
Figure FDA0003991309840000029
is the linearized output;
after linear weighting, the softmax layer is used to weaken the influence of weakly correlated or uncorrelated data and to increase the weight of correlated data, with the formula:
p = softmax(e),
wherein
Figure FDA0003991309840000031
represents the softmax layer output; for any element in p:
Figure FDA0003991309840000032
wherein exp(·) represents the exponential function;
the second GRU layer is used to dynamically extract the time series characteristics in p, formulated as follows:
h'_t = f_2(h'_{t-1}, p)
wherein
Figure FDA0003991309840000033
is the hidden state; h'_{t-1} represents the hidden state at the previous time; the fine-grained time series data are used to generate a predicted value, which can be obtained through a linear layer, with the formula:
c_e = w_e h'_t + b_e
wherein
Figure FDA0003991309840000034
is the predicted value of the fine-grained data side,
Figure FDA0003991309840000035
is the weight, and b_e is the bias term;
III) in the merging layer, the output representations of the equal-granularity sequence processing unit and the fine-granularity sequence processing unit are merged; a fully connected layer then combines the equal-granularity time series data and the fine-grained time series data to correlate the outputs of the two processing units, and an output layer outputs a prediction result combining the fine granularity and the equal granularity; this step can be summarized as follows:
Figure FDA0003991309840000036
wherein
Figure FDA0003991309840000037
is the merged vector of the outputs of both sides,
Figure FDA0003991309840000038
is the weight of these outputs,
Figure FDA0003991309840000039
is the predicted number of outpatients for the next time interval, and
Figure FDA00039913098400000310
is the bias term;
performing a denormalization operation on the output prediction result, with the calculation formula:
d = d' * (max(d) - min(d)) + min(d),
the denormalization formula is applied in the method to correct the final prediction result generated by the prediction model.
2. The method of claim 1, wherein the hand-foot-and-mouth disease prediction model comprises: an input layer, an equal-granularity sequence processing unit, a fine-granularity sequence processing unit, a merging layer, a fully connected layer, and an output layer;
the input layer is used for inputting the supervised data, the equal-granularity sequence processing unit is used for processing the equal-granularity time series data, the fine-granularity sequence processing unit is used for processing the fine-grained time series data, the merging layer and the fully connected layer then correlate the data output by the equal-granularity sequence processing unit with the data output by the fine-granularity sequence processing unit to generate an initial prediction result, and the output layer is used for denormalizing the initial prediction result to generate the final output prediction result.
3. The method according to claim 2, characterized by using the mean square error as a loss function of the hand-foot-and-mouth disease prediction model.
4. An apparatus for predicting hand-foot-and-mouth disease using fine-grained data, comprising:
an acquisition module, used for acquiring historical case data of hand-foot-and-mouth disease;
a preprocessing module, used for preprocessing the historical case data and compiling the historical case data into time series data at two different time intervals;
an aggregation module, used for aggregating the time series data of the two different time intervals by time to obtain multivariate time series data, and converting the multivariate time series data into supervised data;
a model training module, used for training a hand-foot-and-mouth disease prediction model according to the supervised data;
a prediction module, used for inputting hand-foot-and-mouth disease case data collected in real time into the trained hand-foot-and-mouth disease prediction model to obtain the number of hand-foot-and-mouth disease cases predicted in real time;
wherein, of the two different time intervals, the first time interval is the same as the target time interval, and the preprocessing module is also used for designating the time series data compiled at the first time interval as equal-granularity time series data; the second time interval is smaller than the target time interval, and the time series data compiled at the second time interval are called fine-grained time series data;
letting the number of first time intervals be M, [y_1, y_2, …, y_M] denotes the equal-granularity time series data;
letting each first time interval contain N second time intervals, [x_1, x_2, …, x_M] denotes the fine-grained data corresponding to the M input equal-granularity time intervals, where any
Figure FDA0003991309840000041
satisfies
Figure FDA0003991309840000042
wherein x_{t,i} denotes the i-th fine-grained datum within the equal-granularity time interval t;
the preprocessing module is also used for cleaning the historical case data and normalizing the cleaned data;
the aggregation module is also used for setting a constant T as the number of time intervals that influence the number of cases in the next time interval, and converting the equal-granularity time series data, expressed as:
Figure FDA0003991309840000043
letting [x_{1,i}, x_{2,i}, …, x_{M,i}] be the time series data of the i-th fine-grained moment, the converted time series data are expressed as:
Figure FDA0003991309840000051
wherein i ∈ {1, …, N};
aggregating the time series data of the two different time intervals by time to obtain multivariate time series data, and converting the multivariate time series data into supervised data, expressed as:
Figure FDA0003991309840000052
the specific processing is as follows:
I) the representation component of the equal-granularity sequence processing unit comprises: a first GRU layer and a first linear layer; the first GRU layer is used for processing the input equal-granularity time series data and outputting its hidden state; the first linear layer merges the hidden states and outputs the predicted value of the equal-granularity sequence processing unit;
the transformation of the GRU is denoted by f_1, and its formula is as follows:
h_t = f_1(h_{t-1}, [y_1, y_2, …, y_T]),
wherein
Figure FDA0003991309840000053
is the hidden state, and n represents the number of hidden states; h_{t-1} represents the hidden state at the previous time; the equal-granularity time series data are used to generate a predicted value, which can be obtained through a linear layer, with the formula:
c_y = w_y h_t + b_y
wherein
Figure FDA0003991309840000054
is the predicted value of the equal-granularity data side,
Figure FDA0003991309840000055
is the weight parameter, and
Figure FDA0003991309840000056
is the bias term;
II) the representation component of the fine-granularity sequence processing unit comprises: a second linear layer, a softmax layer, a second GRU layer, and a third linear layer; the second linear layer applies a linear transformation to the input fine-grained data, the softmax layer is used for strengthening the fine-grained data input at important moments, the second GRU layer is used for capturing the time series characteristics, and the third linear layer is used for merging the hidden states and outputting the predicted value of the fine-grained component;
highlighting periodic events by linear weighting is a key method for enhancing the extraction of input information; the linear weighting of the input elements can be summarized as follows:
u = ∑ w_u * [x_1, x_2, …, x_T] + b_u
wherein
Figure FDA0003991309840000057
represents the fine-grained data corresponding to the T input equal-granularity time intervals,
Figure FDA0003991309840000058
is the weight, b_u is the bias term, and
Figure FDA0003991309840000059
is the linearized output;
after linear weighting, the softmax layer is used to weaken the influence of weakly correlated or uncorrelated data and to increase the weight of correlated data, with the formula:
p = softmax(e),
wherein
Figure FDA0003991309840000061
represents the softmax layer output; for any element in p:
Figure FDA0003991309840000062
wherein exp(·) represents the exponential function;
the second GRU layer is used to dynamically extract the time series characteristics in p, formulated as follows:
h'_t = f_2(h'_{t-1}, p)
wherein
Figure FDA0003991309840000063
is the hidden state; h'_{t-1} represents the hidden state at the previous time; the fine-grained time series data are used to generate a predicted value, which can be obtained through a linear layer, with the formula:
c_e = w_e h'_t + b_e
wherein
Figure FDA0003991309840000064
is the predicted value of the fine-grained data side,
Figure FDA0003991309840000065
is the weight, and b_e is the bias term;
III) in the merging layer, the output representations of the two processing units (the equal-granularity sequence processing unit and the fine-granularity sequence processing unit) are merged; a fully connected layer then combines the equal-granularity time series data and the fine-grained time series data to correlate the outputs of the two sides (the two processing units), and an output layer outputs a prediction result combining the fine granularity and the equal granularity; this step can be summarized as follows:
Figure FDA0003991309840000066
wherein
Figure FDA0003991309840000067
is the merged vector of the outputs of both sides,
Figure FDA0003991309840000068
is the weight of these outputs,
Figure FDA0003991309840000069
is the predicted number of outpatients for the next time interval, and
Figure FDA00039913098400000610
is the bias term;
performing a denormalization operation on the output prediction result, with the calculation formula:
d = d' * (max(d) - min(d)) + min(d),
the denormalization formula is applied in the method to correct the final prediction result generated by the prediction model.
5. An electronic device, comprising: memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor executes the computer program to implement the method according to any of claims 1 to 3.
6. A computer readable medium having computer readable instructions stored thereon which are executable by a processor to implement the method of any one of claims 1 to 3.
CN202010704454.2A 2020-07-21 2020-07-21 Hand, foot and mouth disease prediction method, electronic equipment and medium using fine-grained data Active CN111863276B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010704454.2A CN111863276B (en) 2020-07-21 2020-07-21 Hand, foot and mouth disease prediction method, electronic equipment and medium using fine-grained data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010704454.2A CN111863276B (en) 2020-07-21 2020-07-21 Hand, foot and mouth disease prediction method, electronic equipment and medium using fine-grained data

Publications (2)

Publication Number Publication Date
CN111863276A CN111863276A (en) 2020-10-30
CN111863276B true CN111863276B (en) 2023-02-14

Family

ID=73000779

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010704454.2A Active CN111863276B (en) 2020-07-21 2020-07-21 Hand, foot and mouth disease prediction method, electronic equipment and medium using fine-grained data

Country Status (1)

Country Link
CN (1) CN111863276B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112331349B (en) * 2020-11-03 2023-04-07 四川大学华西医院 Cerebral apoplexy relapse monitoring system
CN112562861B (en) * 2020-11-19 2022-09-09 集美大学 A method and apparatus for training an infectious disease prediction model
CN113223721B (en) * 2021-03-23 2022-07-12 杭州电子科技大学 A predictive control model for novel coronavirus pneumonia

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280443A (en) * 2018-02-23 2018-07-13 深圳市唯特视科技有限公司 A kind of action identification method based on deep feature extraction asynchronous fusion network
CN109859854A (en) * 2018-12-17 2019-06-07 中国科学院深圳先进技术研究院 Infectious disease prediction method, apparatus, electronic device and computer readable medium
CN110070923A (en) * 2019-03-07 2019-07-30 浙江大学 A kind of residual hydrogenation model and method for building up based on semi-supervised depth GRU
CN111400366A (en) * 2020-02-27 2020-07-10 西安交通大学 A visual analysis method and system for interactive outpatient volume prediction based on CatBoost model
CN111415752A (en) * 2020-03-01 2020-07-14 集美大学 A prediction method of hand, foot and mouth disease integrating meteorological factors and search index

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180082036A1 (en) * 2016-09-22 2018-03-22 General Electric Company Systems And Methods Of Medical Device Data Collection And Processing

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
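Application of the ARIMA model to hand, foot and mouth disease prediction in Gaoming District, Foshan City; Zhang Jinjiang (张金奖) et al.; Chinese Journal of Public Health Management (《中国公共卫生管理》); 20180831; Vol. 34 (No. 4); pp. 529-533 *
Dilated Recurrent Neural Network for Epidemiological Predictions; Jianxiang Luo et al.; 2019 3rd International Conference on Electronic Information Technology and Computer Engineering; 20191020; pp. 1728-1731 *
Dual-grained representation for hand, foot, and mouth disease prediction within public health cyber-physical systems; Zhijin Wang; SOFTWARE-PRACTICE & EXPERIENCE; 20201215; Vol. 51 (No. 11); pp. 2290-2305 *
Analysis and prediction of hand, foot and mouth disease epidemic trends in Liaoning using time series models; Wang Ling (王伶) et al.; Chinese Journal of Health Statistics (《中国卫生统计》); 20161031; Vol. 33 (No. 5); pp. 847-849 *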

Also Published As

Publication number Publication date
CN111863276A (en) 2020-10-30

Similar Documents

Publication Publication Date Title
CN111863276B (en) Hand, foot and mouth disease prediction method, electronic equipment and medium using fine-grained data
CN105320957B (en) Classifier training method and device
CN110929785B (en) Data classification method, device, terminal equipment and readable storage medium
TW201946013A (en) Credit risk prediction method and device based on LSTM (Long Short Term Memory) model
CN111898675B (en) Credit wind control model generation method and device, scoring card generation method, machine readable medium and equipment
CN110334881A (en) A financial time series prediction method, device and server based on long short memory network and deep data cleaning
CN113536139B (en) Interest-based content recommendation method, apparatus, computer equipment and storage medium
CN111382930B (en) Time sequence data-oriented risk prediction method and system
Maaliw et al. Time-series forecasting of COVID-19 cases using stacked long short-term memory networks
Qin et al. A new one-layer recurrent neural network for nonsmooth pseudoconvex optimization
CN112183881A (en) A social network-based public opinion event prediction method, device and storage medium
CN115829172B (en) Pollution prediction method, pollution prediction device, computer equipment and storage medium
CN117220277A (en) Wind power generation power prediction method and device, electronic equipment and storage medium
CN117095541A (en) Method, device, equipment and storage medium for predicting space-time feature fusion traffic flow
Alqahtani et al. Digital-twin-assisted healthcare framework for adult
Li et al. Video anomaly detection based on a multi-layer reconstruction autoencoder with a variance attention strategy
CN114912354A (en) A method, device and medium for predicting the risk of mosquito-borne infectious diseases
CN114625477A (en) Service node capacity adjusting method, equipment and computer readable storage medium
Khairuddin et al. Comparative study on artificial intelligence techniques in crime forecasting
Kong et al. A novel ConvLSTM with multifeature fusion for financial intelligent trading
CN116842238B (en) Method and system for realizing enterprise data visualization based on big data analysis
CN110909706A (en) Method and device for judging person during daytime and night, electronic equipment and storage medium
CN115796382A (en) Regional heating load prediction method, device, equipment and storage medium
CN116576504A (en) An interpretable regional heat load prediction method, device, equipment and storage medium
CN116525136A (en) Medical insurance drug auditing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant