CN114388138A - Artificial intelligence based epidemic situation prediction method, device, equipment and storage medium - Google Patents


Info

Publication number
CN114388138A
Authority
CN
China
Prior art keywords
sample data
coefficient
data
product
person number
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210026852.2A
Other languages
Chinese (zh)
Inventor
王玮璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202210026852.2A
Publication of CN114388138A

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G16H50/50: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for simulation or modelling of medical disorders

Abstract

The application discloses an artificial intelligence based epidemic situation prediction method, apparatus, device and storage medium, and relates to the technical field of artificial intelligence. The method comprises: reading a plurality of electronic medical record data of a target area, and extracting a plurality of first sample data and a plurality of second sample data; calculating influence weights between the plurality of first sample data and the plurality of second sample data based on a blocking coefficient; performing time sequence data processing on the plurality of first sample data and the plurality of second sample data according to the influence weights to obtain a first time sequence function and a second time sequence function respectively, and from them a propagation prediction simulation model; and extracting test data from the plurality of electronic medical record data, and inputting the test data into the propagation prediction simulation model for epidemic propagation prediction to obtain an epidemic propagation prediction result of the target area.

Description

Artificial intelligence based epidemic situation prediction method, device, equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technology, and in particular, to a method, an apparatus, a device, and a storage medium for epidemic situation prediction based on artificial intelligence.
Background
Since the outbreak of the novel coronavirus worldwide, the global epidemic situation has not improved, despite epidemic prevention measures of varying strictness in various countries. The novel coronavirus has a long incubation period and hidden asymptomatic infections, which makes epidemic prevention and control very difficult. Establishing a reasonable mathematical model is therefore of practical significance for scientifically analyzing the propagation characteristics of the virus and predicting the inflection point and the end date of the epidemic.
In the related art, a worker performing propagation prediction collects epidemic propagation data of a sample area as input data of a traditional infectious disease prediction model and trains the model to obtain a final epidemic propagation prediction model. The data to be predicted, namely the number of infected persons and the number of cured persons in the area to be predicted, are then input into the epidemic propagation prediction model, and the result output by the model is taken as the epidemic prediction result.
In carrying out the present application, the applicant has found that the related art has at least the following problems:
the prior art predicts the epidemic situation with a single unified standard, so the epidemic data predicted by the model are undifferentiated. In some environments, however, special epidemic conditions exist, so the predicted results are inaccurate and unrealistic, and are difficult to put to practical use in subsequent epidemic prevention.
Disclosure of Invention
In view of the above, the present application provides an artificial intelligence based epidemic situation prediction method, apparatus, device and storage medium, and mainly aims to solve the problem that, because special epidemic conditions exist in some environments, the predicted results are inaccurate and unrealistic and are difficult to put to practical use in subsequent epidemic prevention.
According to a first aspect of the present application, there is provided an artificial intelligence based epidemic situation prediction method, comprising:
reading a plurality of electronic medical record data of a target area, extracting the electronic medical record data whose address label is in the target blocking area from the plurality of electronic medical record data as a plurality of first sample data, and taking the plurality of electronic medical record data left after extraction as a plurality of second sample data;
determining a blocking coefficient, based on which an influence weight between the plurality of first sample data and the plurality of second sample data is calculated;
respectively performing time sequence data processing on the plurality of first sample data and the plurality of second sample data according to the influence weights to respectively obtain a first time sequence function and a second time sequence function, and linearly combining the first time sequence function and the second time sequence function to obtain a propagation prediction simulation model;
and extracting test data from the plurality of electronic medical record data, inputting the test data into the propagation prediction simulation model for epidemic propagation prediction, and obtaining an epidemic propagation prediction result of the target area.
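The first step above, splitting the medical records into first and second sample data by address label, can be sketched as follows. This is only an illustration under stated assumptions: the record layout (a `dict` per record), the field name `address_label` and the function name `split_by_blocking_area` are hypothetical and do not come from the application.

```python
from typing import Iterable, List, Tuple


def split_by_blocking_area(
    records: Iterable[dict],
    blocking_area: str,
    address_key: str = "address_label",  # hypothetical field name
) -> Tuple[List[dict], List[dict]]:
    """Split records into (first sample data, second sample data).

    Records whose address label equals the target blocking area become
    first sample data; every remaining record becomes second sample data.
    """
    first_samples: List[dict] = []
    second_samples: List[dict] = []
    for record in records:
        if record.get(address_key) == blocking_area:
            first_samples.append(record)
        else:
            second_samples.append(record)
    return first_samples, second_samples
```

Matching on exact equality is the simplest choice; a real address label would likely need normalisation or a containment test, which the application does not specify.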
Optionally, the determining a blocking coefficient, based on which the influence weights between the plurality of first sample data and the plurality of second sample data are calculated, comprises:
determining a baseline coefficient, a fatigue coefficient, a deviation coefficient and a city sealing rate coefficient in a blocking state, and taking the baseline coefficient, the fatigue coefficient, the deviation coefficient and the city sealing rate coefficient as the blocking coefficient;
querying a first influence coefficient generated by the plurality of first sample data, a second influence coefficient generated by the plurality of first sample data on the plurality of second sample data, a third influence coefficient generated by the plurality of second sample data, and a fourth influence coefficient generated by the plurality of second sample data on the plurality of first sample data;
taking a natural constant as an exponential base, taking the product of the fatigue coefficient and a first coefficient as an exponent, and performing exponential operation to obtain an exponential value, wherein the first coefficient is the difference value between a time parameter and the deviation coefficient;
calculating a product of the exponent value and a second coefficient, and taking a sum of the product and the second coefficient as a spline coefficient, wherein the second coefficient is a difference value between a first preset value and the baseline coefficient;
calculating a product of the first coefficient and the city sealing rate coefficient, calculating a sum of the product and the spline coefficient, and calculating a hyperbolic tangent value of the sum;
calculating a product of the hyperbolic tangent value, the second coefficient and the influence function, and calculating a difference value between the product and a second preset value to obtain an influence parameter;
inputting the first influence coefficient, the second influence coefficient, the third influence coefficient and the fourth influence coefficient into the influence function respectively to obtain first influence weights generated by the plurality of first sample data, second influence weights generated by the plurality of first sample data on the plurality of second sample data, third influence weights generated by the plurality of second sample data and fourth influence weights generated by the plurality of second sample data on the plurality of first sample data;
taking the first, second, third and fourth impact weights as the impact weights between the plurality of first sample data and the plurality of second sample data.
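The arithmetic in the steps above can be followed literally in a short sketch. Several points are assumptions rather than statements of the application: the two preset values are not given in the text, so `first_preset=1.0` and `second_preset=0.0` are placeholders; the term that the text calls "the influence function" inside the final product is read here as the influence coefficient being evaluated; and all function and parameter names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def influence_weight(
    coefficient: float,
    t: float,
    baseline: float,
    fatigue: float,
    deviation: float,
    sealing_rate: float,
    first_preset: float = 1.0,   # placeholder, not given in the text
    second_preset: float = 0.0,  # placeholder, not given in the text
) -> float:
    """Evaluate one influence weight at time t, following the text step by step."""
    first = t - deviation                      # first coefficient
    exp_value = math.exp(fatigue * first)      # e ** (fatigue * first coefficient)
    second = first_preset - baseline           # second coefficient
    spline = exp_value * second + second       # spline coefficient
    tanh_value = math.tanh(first * sealing_rate + spline)
    # Assumed reading: multiply by the influence coefficient being evaluated.
    return tanh_value * second * coefficient - second_preset


def influence_weights(coefficients: Sequence[float], **blocking: float) -> Tuple[float, ...]:
    """Apply the same blocking coefficients to the four influence coefficients."""
    return tuple(influence_weight(k, **blocking) for k in coefficients)
```

Under this reading, only the blocking coefficients need to change to model a different prevention and control state; the four influence coefficients stay as queried from the sample data.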
Optionally, the performing, according to the influence weights, time-series data processing on the plurality of first sample data and the plurality of second sample data respectively to obtain a first time-series function and a second time-series function respectively includes:
querying the plurality of first sample data and the plurality of second sample data to obtain a first propagation parameter and a first sample total number in the target blocking area, and a second propagation parameter and a second sample total number outside the target blocking area;
extracting a second infected person number and a second suspected infected person number from the second propagation parameter, and constructing the first time sequence function for the first propagation parameter based on the second infected person number, the second suspected infected person number, the first propagation parameter, the first sample total number and the influence weight;
extracting a first infected person number and a first suspected infected person number from the first propagation parameter, and constructing the second time sequence function for the second propagation parameter based on the first infected person number, the first suspected infected person number, the second propagation parameter, the second sample total number and the influence weight.
Optionally, the constructing the first timing function for the first propagation parameter based on the second infected person number, the second suspected infected person number, the first propagation parameter, the first sample total number, and the influence weight includes:
querying the first propagation parameter to obtain a first infected person number, a first suspected infected person number, a first contact person number, a first death person number, a first cured person number, a first infection rate, a first cure rate and a first death rate;
calculating a first product of the conversion rate of the first suspected infected person number converted into the first infected person number, the first influence weight and the first infected person number, calculating a product of the conversion rate of the second suspected infected person number converted into the first infected person number, the fourth influence weight and the second infected person number, calculating a sum of the product and the first product, calculating a ratio of the first suspected infected person number to the first sample total number, and multiplying the ratio and the sum to obtain a time sequence function of the first suspected infected person number;
calculating a product of the ratio and a third coefficient, and subtracting the product from the time series function of the first suspected infected person number to obtain the time series function of the first contact person number, wherein the third coefficient is the product of the first contact person number and the first infection rate;
calculating a second product of the first infected person number and the first cure rate, and subtracting the second product from the third coefficient to obtain a time sequence function of the first infected person number;
calculating the product of the first infected person number and the first death rate, and subtracting the product from the second product to obtain a time sequence function of the first cured person number;
taking the product as a time sequence function of the first death person number;
and taking the time sequence function of the first suspected infected person number, the time sequence function of the first contact person number, the time sequence function of the first infected person number, the time sequence function of the first cured person number and the time sequence function of the first death person number as the first time sequence function.
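The five expressions that make up the first time sequence function can be written out as a literal transcription of the steps above. This is a sketch only: the `AreaState` container, the function name `first_area_rates` and the argument names are hypothetical, the signs follow the wording of the text rather than any textbook compartment model, and the influence weights are passed in as already computed numbers.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AreaState:
    """Propagation parameters of one area (hypothetical container)."""
    suspected: float       # suspected infected person number
    contact: float         # contact person number
    infected: float        # infected person number
    total: float           # sample total number
    infection_rate: float
    cure_rate: float
    death_rate: float


def first_area_rates(
    inside: AreaState,
    outside_infected: float,  # second infected person number
    w1: float,                # first influence weight
    w4: float,                # fourth influence weight
    conv_inside: float,       # conversion rate, first suspected -> first infected
    conv_outside: float,      # conversion rate, second suspected -> first infected
) -> Dict[str, float]:
    """Evaluate the first time sequence function term by term."""
    ratio = inside.suspected / inside.total
    first_product = conv_inside * w1 * inside.infected
    cross_product = conv_outside * w4 * outside_infected
    suspected_rate = ratio * (first_product + cross_product)

    third_coefficient = inside.contact * inside.infection_rate
    contact_rate = suspected_rate - ratio * third_coefficient

    second_product = inside.infected * inside.cure_rate
    infected_rate = third_coefficient - second_product

    death_product = inside.infected * inside.death_rate
    cured_rate = second_product - death_product

    return {
        "suspected": suspected_rate,
        "contact": contact_rate,
        "infected": infected_rate,
        "cured": cured_rate,
        "dead": death_product,
    }
```

The only coupling to the area outside the blockade is the `cross_product` term, which is where the fourth influence weight enters.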
Optionally, the constructing the second time series function for the second propagation parameter based on the first infected person number, the first suspected infected person number, the second propagation parameter, the second sample total number, and the influence weight includes:
querying the second propagation parameter to obtain a second infected person number, a second suspected infected person number, a second contact person number, a second death person number, a second cured person number, a second infection rate, a second cure rate and a second death rate;
calculating a third product of the conversion rate of the first suspected infected person number converted into the second infected person number, the second influence weight and the first infected person number, calculating a product of the conversion rate of the second suspected infected person number converted into the second infected person number, the third influence weight and the second infected person number, calculating a sum of the product and the third product, calculating a ratio of the second suspected infected person number to the second sample total number, and multiplying the ratio and the sum to obtain a time sequence function of the second suspected infected person number;
calculating a product of the ratio and a fourth coefficient, and subtracting the product from the time sequence function of the second suspected infected person number to obtain the time sequence function of the second contact person number, wherein the fourth coefficient is the product of the second contact person number and the second infection rate;
calculating a fourth product of the second infected person number and the second cure rate, and subtracting the fourth product from the fourth coefficient to obtain a time sequence function of the second infected person number;
calculating the product of the second infected person number and the second death rate, and subtracting the fourth product from the product to obtain a time sequence function of the second cured person number;
taking the product as a time sequence function of the second death person number;
and taking the time sequence function of the second suspected infected person number, the time sequence function of the second contact person number, the time sequence function of the second infected person number, the time sequence function of the second cured person number and the time sequence function of the second death person number as the second time sequence function.
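Written as equations, the second time sequence function described above reads roughly as follows. The symbols are introduced here for illustration and are not the application's notation: $S_2, E_2, I_2, R_2, D_2$ are the second suspected infected, contact, infected, cured and death person numbers, $N_2$ the second sample total number, $I_1$ the first infected person number, $w_2, w_3$ the second and third influence weights, $c_{12}, c_{22}$ the two conversion rates, and $\beta_2, \gamma_2, \mu_2$ the second infection, cure and death rates. Treating each "time sequence function" as a time derivative is also an assumption.

```latex
\begin{aligned}
\frac{dS_2}{dt} &= \frac{S_2}{N_2}\left(c_{12}\, w_2\, I_1 + c_{22}\, w_3\, I_2\right) \\
\frac{dE_2}{dt} &= \frac{dS_2}{dt} - \frac{S_2}{N_2}\,\beta_2 E_2 \\
\frac{dI_2}{dt} &= \beta_2 E_2 - \gamma_2 I_2 \\
\frac{dR_2}{dt} &= \mu_2 I_2 - \gamma_2 I_2 \\
\frac{dD_2}{dt} &= \mu_2 I_2
\end{aligned}
```

The order of subtraction in the cured equation follows the wording of this passage, which is the reverse of the order stated for the first cured person number.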
Optionally, after the reading of the plurality of electronic medical record data of the target area, the extracting of the electronic medical record data whose address label is in the target blocking area from the plurality of electronic medical record data as a plurality of first sample data, and the taking of the plurality of electronic medical record data remaining after the extracting as a plurality of second sample data, the method further includes:
numbering the plurality of first sample data respectively, adding a first test sample label to first sample data with the number consistent with a target number, adding a first training sample label to other first sample data, wherein the value of the target number is an arbitrary value, and the other first sample data are first sample data except the first sample data indicated by the target number;
numbering the plurality of second sample data respectively, adding a second test sample label to second sample data with the number consistent with the target number, and adding a second training sample label to other second sample data, wherein the other second sample data are second sample data except the second sample data indicated by the target number;
and using the first sample data added with the first test sample label and the second sample data added with the second test sample label as the test data.
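The labelling scheme above amounts to holding out, in each group, the sample whose number equals the target number. A minimal sketch, assuming each sample is a `dict` and numbering starts at 1; the label strings and the function names `label_samples` and `collect_test_data` are hypothetical.

```python
from typing import List


def label_samples(samples: List[dict], target_number: int, prefix: str) -> List[dict]:
    """Number the samples from 1 and attach a test or training label.

    The sample whose number equals target_number gets the test label;
    all other samples get the training label.
    """
    labelled = []
    for number, sample in enumerate(samples, start=1):
        role = "test" if number == target_number else "training"
        labelled.append({**sample, "number": number, "label": f"{prefix}_{role}"})
    return labelled


def collect_test_data(first_labelled: List[dict], second_labelled: List[dict]) -> List[dict]:
    """Gather the samples carrying a first or second test sample label."""
    return [s for s in first_labelled + second_labelled if s["label"].endswith("_test")]
```

The same target number is applied to both groups, as the text describes, so the test data contains at most one sample from each group.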
Optionally, before extracting test data from the plurality of electronic medical record data and inputting the test data to the propagation prediction simulation model for epidemic propagation prediction to obtain an epidemic propagation prediction result of the target region, the method further includes:
based on maximum likelihood estimation and an optimization method, setting all unknown parameters as a model vector;
extracting, from the plurality of first sample data and the plurality of second sample data, the first sample data and the second sample data carrying training sample labels as training data, inputting the training data into the propagation prediction simulation model, and determining the value range of the parameters in the propagation prediction simulation model;
and determining the value of the model vector according to the Markov chain Monte Carlo algorithm, completing model training and obtaining the propagation prediction simulation model.
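The application names Markov chain Monte Carlo but does not specify the sampler, the likelihood or the prior. The sketch below therefore swaps in one common choice, a random-walk Metropolis sampler, with a uniform prior over the value range and a Gaussian unit-variance error model; all of these, together with the function names, are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def log_posterior(
    theta: Sequence[float],
    observed: Sequence[float],
    predict: Callable[[Sequence[float]], Sequence[float]],
    bounds: Sequence[Tuple[float, float]],
) -> float:
    """Uniform prior on the value range plus Gaussian log-likelihood (assumed)."""
    if not all(lo <= x <= hi for x, (lo, hi) in zip(theta, bounds)):
        return -math.inf
    predicted = predict(theta)
    return -0.5 * sum((o - p) ** 2 for o, p in zip(observed, predicted))


def metropolis(
    observed: Sequence[float],
    predict: Callable[[Sequence[float]], Sequence[float]],
    bounds: Sequence[Tuple[float, float]],
    init: Sequence[float],
    steps: int = 5000,
    step_size: float = 0.05,
    seed: int = 0,
) -> List[List[float]]:
    """Random-walk Metropolis over the model vector; returns the full chain."""
    rng = random.Random(seed)
    theta = list(init)
    current = log_posterior(theta, observed, predict, bounds)
    chain: List[List[float]] = []
    for _ in range(steps):
        proposal = [x + rng.gauss(0.0, step_size) for x in theta]
        candidate = log_posterior(proposal, observed, predict, bounds)
        delta = candidate - current
        # Accept uphill moves always, downhill moves with probability exp(delta).
        if delta >= 0 or rng.random() < math.exp(delta):
            theta, current = proposal, candidate
        chain.append(list(theta))
    return chain
```

In use, `predict` would wrap the propagation prediction simulation model, `bounds` would hold the value ranges determined from the training data, and the model vector could be estimated from the chain after discarding an initial burn-in portion.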
According to a second aspect of the present application, there is provided an artificial intelligence based epidemic situation prediction apparatus, comprising:
the reading module is used for reading a plurality of electronic medical record data of a target area, extracting the electronic medical record data whose address label is in the target blocking area from the plurality of electronic medical record data as a plurality of first sample data, and taking the plurality of electronic medical record data left after extraction as a plurality of second sample data;
a first determination module for determining a blocking coefficient, based on which an influence weight between the plurality of first sample data and the plurality of second sample data is calculated;
the calculation module is used for respectively carrying out time sequence data processing on the plurality of first sample data and the plurality of second sample data according to the influence weights to respectively obtain a first time sequence function and a second time sequence function, and carrying out linear combination on the first time sequence function and the second time sequence function to obtain a propagation prediction simulation model;
and the test module is used for extracting test data from the plurality of first sample data and the plurality of second sample data, inputting the test data into the propagation prediction simulation model for predicting the propagation of the epidemic situation, and obtaining the prediction result of the propagation of the epidemic situation in the target area.
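The calculation module's linear combination of the two time sequence functions is not spelled out in the text. One possible reading, shown here purely as an assumption, is a weighted sum of two functions that return equal-length vectors; the combination coefficients `a` and `b` and the name `linear_combination` are hypothetical.

```python
from typing import Callable, List

# A time sequence function is modelled here as: time -> list of values.
RateFn = Callable[[float], List[float]]


def linear_combination(first_fn: RateFn, second_fn: RateFn,
                       a: float = 1.0, b: float = 1.0) -> RateFn:
    """Return the function t -> a * first_fn(t) + b * second_fn(t), element-wise."""
    def combined(t: float) -> List[float]:
        return [a * x + b * y for x, y in zip(first_fn(t), second_fn(t))]
    return combined
```

With `a = b = 1` the combined function simply adds the values inside and outside the blocking area, giving totals for the whole target area.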
Optionally, the first determining module is configured to determine a baseline coefficient, a fatigue coefficient, a deviation coefficient and a city sealing rate coefficient in a blocking state, and take the baseline coefficient, the fatigue coefficient, the deviation coefficient and the city sealing rate coefficient as the blocking coefficient; query a first influence coefficient generated by the plurality of first sample data, a second influence coefficient generated by the plurality of first sample data on the plurality of second sample data, a third influence coefficient generated by the plurality of second sample data, and a fourth influence coefficient generated by the plurality of second sample data on the plurality of first sample data; take a natural constant as an exponential base, take the product of the fatigue coefficient and a first coefficient as an exponent, and perform exponential operation to obtain an exponential value, wherein the first coefficient is the difference value between a time parameter and the deviation coefficient; calculate a product of the exponential value and a second coefficient, and take a sum of the product and the second coefficient as a spline coefficient, wherein the second coefficient is a difference value between a first preset value and the baseline coefficient; calculate a product of the first coefficient and the city sealing rate coefficient, calculate a sum of the product and the spline coefficient, and calculate a hyperbolic tangent value of the sum; calculate a product of the hyperbolic tangent value, the second coefficient and the influence function, and calculate a difference value between the product and a second preset value to obtain an influence parameter; input the first influence coefficient, the second influence coefficient, the third influence coefficient and the fourth influence coefficient into the influence function respectively to obtain first influence weights generated by the plurality of first sample data, second influence weights generated by the plurality of first sample data on the plurality of second sample data, third influence weights generated by the plurality of second sample data and fourth influence weights generated by the plurality of second sample data on the plurality of first sample data; and take the first, second, third and fourth influence weights as the influence weights between the plurality of first sample data and the plurality of second sample data.
Optionally, the calculation module is configured to query the plurality of first sample data and the plurality of second sample data to obtain a first propagation parameter, a first total number of samples in the target blocking area, and a second propagation parameter and a second total number of samples outside the target blocking area; extracting a second infected person number and a second suspected infected person number from the second propagation parameter, and constructing the first time sequence function for the first propagation parameter based on the second infected person number, the second suspected infected person number, the first propagation parameter, the first sample total number and the influence weight; extracting a first infected person number and a first suspected infected person number from the first propagation parameter, and constructing the second time sequence function for the second propagation parameter based on the first infected person number, the first suspected infected person number, the second propagation parameter, the second sample total number and the influence weight.
Optionally, the calculating module is configured to query the first propagation parameter to obtain a first infected person number, a first suspected infected person number, a first contact person number, a first death person number, a first cured person number, a first infection rate, a first cure rate, and a first death rate; calculate a first product of the conversion rate of the first suspected infected person number converted into the first infected person number, the first influence weight and the first infected person number, calculate a product of the conversion rate of the second suspected infected person number converted into the first infected person number, the fourth influence weight and the second infected person number, calculate a sum of the product and the first product, calculate a ratio of the first suspected infected person number to the first sample total number, and multiply the ratio and the sum to obtain a time sequence function of the first suspected infected person number; calculate a product of the ratio and a third coefficient, and subtract the product from the time sequence function of the first suspected infected person number to obtain the time sequence function of the first contact person number, wherein the third coefficient is the product of the first contact person number and the first infection rate; calculate a second product of the first infected person number and the first cure rate, and subtract the second product from the third coefficient to obtain a time sequence function of the first infected person number; calculate the product of the first infected person number and the first death rate, and subtract the product from the second product to obtain a time sequence function of the first cured person number; take the product as a time sequence function of the first death person number; and take the time sequence function of the first suspected infected person number, the time sequence function of the first contact person number, the time sequence function of the first infected person number, the time sequence function of the first cured person number and the time sequence function of the first death person number as the first time sequence function.
Optionally, the calculating module is configured to query the second propagation parameter to obtain a second infected person number, a second suspected infected person number, a second contact person number, a second death person number, a second cured person number, a second infection rate, a second cure rate, and a second death rate; calculate a third product of the conversion rate of the first suspected infected person number converted into the second infected person number, the second influence weight and the first infected person number, calculate a product of the conversion rate of the second suspected infected person number converted into the second infected person number, the third influence weight and the second infected person number, calculate a sum of the product and the third product, calculate a ratio of the second suspected infected person number to the second sample total number, and multiply the ratio and the sum to obtain a time sequence function of the second suspected infected person number; calculate a product of the ratio and a fourth coefficient, and subtract the product from the time sequence function of the second suspected infected person number to obtain the time sequence function of the second contact person number, wherein the fourth coefficient is the product of the second contact person number and the second infection rate; calculate a fourth product of the second infected person number and the second cure rate, and subtract the fourth product from the fourth coefficient to obtain a time sequence function of the second infected person number; calculate the product of the second infected person number and the second death rate, and subtract the fourth product from the product to obtain a time sequence function of the second cured person number; take the product as a time sequence function of the second death person number; and take the time sequence function of the second suspected infected person number, the time sequence function of the second contact person number, the time sequence function of the second infected person number, the time sequence function of the second cured person number and the time sequence function of the second death person number as the second time sequence function.
Optionally, the apparatus further comprises:
the marking module is used for numbering the plurality of first sample data respectively, adding a first test sample label to the first sample data with the same number as a target number, and adding a first training sample label to other first sample data, wherein the value of the target number is an arbitrary value, and the other first sample data are first sample data except the first sample data indicated by the target number;
the marking module is further configured to number the plurality of second sample data, add a second test sample label to second sample data with a number consistent with the target number, and add a second training sample label to other second sample data, where the other second sample data is second sample data, except for the second sample data indicated by the target number, in the plurality of second sample data;
an extracting module, configured to use the first sample data added with the first test sample label and the second sample data added with the second test sample label as the test data.
Optionally, the apparatus further comprises:
the setting module is used for setting all unknown parameters as a model vector based on maximum likelihood estimation and an optimization method;
the training module is used for extracting, from the plurality of first sample data and the plurality of second sample data, the first sample data and the second sample data carrying training sample labels as training data, inputting the training data into the propagation prediction simulation model, and determining the value range of the parameters in the propagation prediction simulation model;
and the second determination module is used for determining the value of the model vector according to the Markov chain Monte Carlo algorithm, completing model training and obtaining the propagation prediction simulation model.
According to a third aspect of the present application, there is provided a computer device comprising a memory storing a computer program and a processor implementing the steps of the method of any of the first aspects when the computer program is executed.
According to a fourth aspect of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any of the first aspects described above.
According to the above technical solution, the artificial intelligence based epidemic situation prediction method, device, equipment and storage medium provided by the present application read a plurality of electronic medical record data of a target area from an electronic medical record database, extract the electronic medical record data whose address label is the target blocking area as a plurality of first sample data, and take the electronic medical record data remaining after extraction as a plurality of second sample data. A blocking coefficient is then determined, and influence weights between the plurality of first sample data and the plurality of second sample data are calculated based on the blocking coefficient. Next, time sequence data processing is performed on the plurality of first sample data and the plurality of second sample data according to the influence weights to obtain a first time sequence function and a second time sequence function, which are linearly combined to obtain a propagation prediction simulation model. Finally, test data are extracted from the plurality of electronic medical record data and input into the propagation prediction simulation model to predict epidemic propagation, yielding an epidemic propagation prediction result for the target area.
Because the influence weights between the two groups of sample data inside and outside the target blocking area are calculated from the blocking parameters before the propagation prediction simulation model is constructed, a subsequent system only needs to update the blocking parameters of the area to be predicted and input that area's current epidemic propagation data into the model to obtain its propagation prediction result. The model can therefore be applied to epidemic propagation scenarios under different prevention and control states, which improves the accuracy and authenticity of the prediction result.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the description, and so that the above and other objects, features, and advantages of the present application are more readily apparent, specific embodiments of the present application are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic flow chart of an artificial intelligence-based epidemic situation prediction method provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart of an artificial intelligence-based epidemic situation prediction method provided by an embodiment of the present application;
FIG. 3A is a schematic structural diagram of an artificial intelligence-based epidemic situation prediction apparatus provided by an embodiment of the present application;
FIG. 3B is a schematic structural diagram of an artificial intelligence-based epidemic situation prediction apparatus provided by an embodiment of the present application;
FIG. 3C is a schematic structural diagram of an artificial intelligence-based epidemic situation prediction apparatus provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the application provides an epidemic situation prediction method based on artificial intelligence, and as shown in fig. 1, the method comprises the following steps:
101. Reading a plurality of electronic medical record data of the target area, extracting from them the electronic medical record data whose address label is in the target blocking area as a plurality of first sample data, and taking the electronic medical record data remaining after extraction as a plurality of second sample data.
102. Determining a blocking coefficient, and calculating influence weights between the plurality of first sample data and the plurality of second sample data based on the blocking coefficient.
103. Performing time sequence data processing on the plurality of first sample data and the plurality of second sample data respectively according to the influence weights to obtain a first time sequence function and a second time sequence function, and linearly combining the first time sequence function and the second time sequence function to obtain a propagation prediction simulation model.
104. Extracting test data from the plurality of electronic medical record data, and inputting the test data into the propagation prediction simulation model for epidemic propagation prediction to obtain an epidemic propagation prediction result of the target area.
The method provided by the embodiment of the application first reads a plurality of electronic medical record data of a target area from an electronic medical record database, extracts the electronic medical record data whose address label is in the target blocking area as a plurality of first sample data, and takes the electronic medical record data remaining after extraction as a plurality of second sample data. A blocking coefficient is then determined, and influence weights between the plurality of first sample data and the plurality of second sample data are calculated based on the blocking coefficient. Next, time sequence data processing is performed on the plurality of first sample data and the plurality of second sample data according to the influence weights to obtain a first time sequence function and a second time sequence function, which are linearly combined to obtain a propagation prediction simulation model. Finally, test data are extracted from the plurality of electronic medical record data and input into the propagation prediction simulation model to predict epidemic propagation, yielding an epidemic propagation prediction result for the target area.
Because the influence weights between the two groups of sample data inside and outside the target blocking area are calculated from the blocking parameters before the propagation prediction simulation model is constructed, a subsequent system only needs to update the blocking parameters of the area to be predicted and input that area's current epidemic propagation data into the model to obtain its propagation prediction result. The model can therefore be applied to epidemic propagation scenarios under different prevention and control states, which improves the accuracy and authenticity of the prediction result.
The embodiment of the application provides an epidemic situation prediction method based on artificial intelligence, and as shown in fig. 2, the method comprises the following steps:
201. Reading a plurality of electronic medical record data of the target area, extracting from them the electronic medical record data whose address label is in the target blocking area as a plurality of first sample data, and taking the electronic medical record data remaining after extraction as a plurality of second sample data.
Since the worldwide outbreak of the novel coronavirus, the global epidemic situation has not improved, despite epidemic prevention measures of varying strictness in different countries. The novel coronavirus has a long incubation period and can spread through hidden asymptomatic infection, which makes epidemic prevention and control very difficult. Establishing a reasonable mathematical model is therefore of practical significance for scientifically analyzing the propagation characteristics of the virus and for predicting the inflection point and end date of the epidemic. At present, a worker performing propagation prediction collects epidemic propagation data of a sample area as input data for a traditional SEIRD model, trains the model to obtain a final epidemic propagation prediction model, inputs the existing numbers of infected and cured people in the area to be predicted into that model, and takes the model output as the epidemic prediction result. However, the applicant recognizes that the prior art predicts the epidemic situation using a single unified standard, so the predicted data are undifferentiated. In environments where the epidemic situation is special, the predictions are inaccurate and unrealistic, and are difficult to use in subsequent epidemic prevention work.
In the embodiment of the application, the system first selects a target area that contains an area in which blocking prevention and control is being carried out; that is, the target area is divided into an inner blocking area, where the blocking action is carried out, and the remaining areas outside it. The plurality of electronic medical record data in the target area are then extracted and split to obtain a plurality of first sample data and a plurality of second sample data. It should be noted that the electronic medical record data include basic data such as contact time, infection time, and whether the patient was cured or died. Epidemic propagation data for the inner and outer areas, such as the mortality rate, infection rate and cure rate, can be obtained by compiling statistics over the plurality of electronic medical record data. The type of epidemic propagation data is not specifically limited in the present application.
Specifically, the system connects to the electronic medical record system through an external system connection interface, accesses the electronic medical record database, identifies the address labels of the electronic medical records stored there, and extracts the plurality of electronic medical record data whose address labels match the target area. Next, the electronic medical record data whose address labels match the address of the target blocking area are extracted from them as the plurality of first sample data, and the electronic medical record data remaining after extraction are used as the plurality of second sample data. For example, suppose the selected target area is area A, which contains the area A nursing home, where blocking prevention and control is carried out. The system queries the address labels of the electronic medical record data in the electronic medical record database and extracts the plurality of electronic medical record data whose address labels indicate area A. It then extracts those whose address labels indicate the area A nursing home as the plurality of first sample data, and uses the electronic medical record data remaining after extraction as the plurality of second sample data.
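The two-stage extraction described above can be sketched as follows. This is a minimal illustration only: it assumes each record is a dictionary with a hypothetical `address` field, and the `split_records` helper, the field names and the sample addresses are invented for the example rather than taken from the application.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def split_records(records: List[Record], target_area: str,
                  blocked_area: str) -> Tuple[List[Record], List[Record]]:
    """Return (first_samples, second_samples) for one target area."""
    # Stage 1: keep only records whose address label lies in the target area.
    in_target = [r for r in records if r["address"].startswith(target_area)]
    # Stage 2: records addressed to the blocking area are the first sample data.
    first = [r for r in in_target if r["address"] == blocked_area]
    # Everything left in the target area is the second sample data.
    second = [r for r in in_target if r["address"] != blocked_area]
    return first, second


# Hypothetical records; "Area A/Nursing Home" plays the role of the blocking area.
records = [
    {"id": "r1", "address": "Area A/Nursing Home"},
    {"id": "r2", "address": "Area A/District 1"},
    {"id": "r3", "address": "Area B/District 9"},
    {"id": "r4", "address": "Area A/Nursing Home"},
]
first, second = split_records(records, "Area A", "Area A/Nursing Home")
```

Records outside the target area (here `r3`) fall into neither group, which matches the description that only records of the target area are split.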
In the practical application process, the system splits the acquired multiple first sample data and multiple second sample data to obtain training data and test data. Training data is used for training the propagation prediction simulation model to obtain an optimal parameter value, testing data is used for testing the propagation prediction simulation model after training is completed to obtain a prediction result and obtain the accuracy rate of the prediction result, and the specific process of sample data splitting is as follows:
Firstly, the plurality of first sample data are numbered, a first test sample label is added to the first sample data whose number matches a target number, and a first training sample label is added to the remaining first sample data, that is, the first sample data other than the one indicated by the target number. The target number may take an arbitrary value, and the way in which it is chosen is not specifically limited in the present application. Then the plurality of second sample data are numbered, a second test sample label is added to the second sample data whose number matches the target number, and a second training sample label is added to the second sample data other than the one indicated by the target number. Finally, the first sample data carrying the first test sample label and the second sample data carrying the second test sample label are taken as the test data.
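The numbering and labeling procedure can be illustrated with a short sketch. It assumes that numbering starts at 1 and that labels are plain strings; the `label_samples` helper, the label names and the placeholder samples are hypothetical and only serve the example.

```python
from typing import Any, Dict, List


def label_samples(samples: List[Any], target_number: int,
                  prefix: str) -> List[Dict[str, Any]]:
    """Number the samples and attach a test or training label to each."""
    labeled = []
    for number, sample in enumerate(samples, start=1):
        # Only the sample whose number equals the target number becomes test data.
        kind = "test" if number == target_number else "train"
        labeled.append({"number": number, "label": f"{prefix}_{kind}", "data": sample})
    return labeled


# Placeholder samples; the same target number is applied to both groups.
first = label_samples(["a", "b", "c"], target_number=2, prefix="first")
second = label_samples(["x", "y", "z", "w"], target_number=2, prefix="second")

test_data = [s for s in first + second if s["label"].endswith("_test")]
train_data = [s for s in first + second if s["label"].endswith("_train")]
```

With a single target number, each group contributes exactly one test sample; a larger test set would need a set of target numbers, which the text leaves open.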
The method comprises the steps of obtaining and calling an electronic medical record database, obtaining a plurality of electronic medical record data in a target area, splitting the plurality of electronic medical record data to obtain electronic medical record data of two groups of people inside and outside the target blocking area, enabling a subsequent system to calculate influences caused by personnel flow in a blocking prevention and control state based on the electronic medical record data, namely influence weights, and establishing a propagation prediction simulation model for first sample data and second sample data according to the influence weights, so that the propagation prediction simulation model can be suitable for application scenes of different blocking parameters, and accuracy and authenticity of prediction results are improved.
202. Determining a blocking coefficient, and calculating influence weights between the plurality of first sample data and the plurality of second sample data based on the blocking coefficient.
In the embodiment of the application, the system acquires the blocking coefficient under the current application scene to construct a spline function, and then calculates the influence weight between the plurality of first sample data and the plurality of second sample data by inputting different influence coefficients indicated in the plurality of first sample data and the plurality of second sample data into the spline function.
The specific process of calculating the influence weight is as follows:
Firstly, a baseline coefficient, a fatigue coefficient, a deviation coefficient and a blocking rate coefficient in the blocking state are determined and taken as the blocking coefficients. The propagation data are then queried for a first influence coefficient generated within the plurality of first sample data, a second influence coefficient generated by the plurality of first sample data on the plurality of second sample data, a third influence coefficient generated within the plurality of second sample data, and a fourth influence coefficient generated by the plurality of second sample data on the plurality of first sample data.
And then, taking a natural constant as an exponential base, taking the product of the fatigue coefficient and a first coefficient as an exponent, and performing exponential operation to obtain an exponential value, wherein the first coefficient is the difference value between the time parameter and the deviation coefficient. And calculating the product of the exponent value and the second coefficient, and taking the sum of the product and the second coefficient as a spline coefficient, wherein the second coefficient is the difference value of the first preset value and the baseline coefficient. The process of calculating spline coefficients can be implemented based on the following formula 1:
Equation 1:
$$f(t) = (1 - \alpha_b)\, e^{\alpha_f (t - \alpha_o)} + (1 - \alpha_b)$$
where $\alpha_b$ is the baseline coefficient; $\alpha_f$ is the fatigue coefficient; $\alpha_o$ is the deviation coefficient; $t - \alpha_o$ is the first coefficient; and $1 - \alpha_b$ is the second coefficient.
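The spline coefficient calculation can be followed step by step in a small sketch. It is built from the prose description of formula 1 and takes the first preset value to be 1, as implied by the second coefficient $1 - \alpha_b$; the function name and the numeric inputs are illustrative assumptions.

```python
import math


def spline_coefficient(t: float, alpha_b: float, alpha_f: float,
                       alpha_o: float) -> float:
    """Spline coefficient f(t) as described in the text for formula 1."""
    first_coefficient = t - alpha_o        # time parameter minus deviation coefficient
    second_coefficient = 1.0 - alpha_b     # first preset value (assumed 1) minus baseline
    # Natural constant raised to (fatigue coefficient * first coefficient).
    exponent_value = math.exp(alpha_f * first_coefficient)
    # Product of the exponent value and the second coefficient, plus the second coefficient.
    return exponent_value * second_coefficient + second_coefficient


# Illustrative values only: at t == alpha_o the exponent value is exactly 1.
value_at_offset = spline_coefficient(t=10.0, alpha_b=0.5, alpha_f=-0.1, alpha_o=10.0)
```

At `t == alpha_o` the result reduces to twice the second coefficient, which gives a quick sanity check on any chosen parameter set.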
Finally, the product of the first coefficient and the blocking rate coefficient is calculated, the sum of this product and the spline coefficient is calculated, and the hyperbolic tangent of the sum is taken. The product of the hyperbolic tangent value, the second coefficient and the influence function is then calculated, and the difference between this product and a second preset value gives the influence parameter. The first, second, third and fourth influence coefficients are respectively input into the influence function to obtain a first influence weight generated within the plurality of first sample data, a second influence weight generated by the plurality of first sample data on the plurality of second sample data, a third influence weight generated within the plurality of second sample data, and a fourth influence weight generated by the plurality of second sample data on the plurality of first sample data. These four influence weights are taken as the influence weights between the plurality of first sample data and the plurality of second sample data. The calculation of the influence weights can be implemented based on the following formula 2:
Equation 2:
$$G(p(t), t) = 1 - p(t)\,\bigl(0.5 \times (1 - \alpha_b) \times (\tanh(\alpha_r (t - \alpha_o)) + 1 + f(t))\bigr)$$
where $f(t)$ is the spline coefficient; $p(t)$ is the influence function, which takes values between 0 and 1 and adjusts the interaction between the two populations; and $\alpha_r$ is the blocking rate coefficient. The first influence coefficient $p_{cc}$ generated within the plurality of first sample data, the second influence coefficient $p_{cn}$ generated by the plurality of first sample data on the plurality of second sample data, the third influence coefficient $p_{nn}$ generated within the plurality of second sample data, and the fourth influence coefficient $p_{nc}$ generated by the plurality of second sample data on the plurality of first sample data are each substituted into the influence function $p(t)$ to obtain the influence weights. For example, a first influence coefficient $p_{cc}$ with a value of 0 inside the nursing home, a second influence coefficient $p_{cn}$ with a value of 0.5 of the inside of the nursing home on the outside, a third influence coefficient $p_{nn}$ with a value of 1 outside the nursing home, and a fourth influence coefficient $p_{nc}$ with a value of 0.5 of the outside of the nursing home on the inside are substituted into the influence function $p(t)$. This yields the first influence weight $G(p_{cc}, t)$ generated within the plurality of first sample data, the second influence weight $G(p_{cn}, t)$ generated by the plurality of first sample data on the plurality of second sample data, the third influence weight $G(p_{nn}, t)$ generated within the plurality of second sample data, and the fourth influence weight $G(p_{nc}, t)$ generated by the plurality of second sample data on the plurality of first sample data.
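A numerical sketch of the four influence weights may help here. It rests on an assumed reading of formula 2, namely G(p, t) = 1 - p * 0.5 * (1 - alpha_b) * (tanh(alpha_r * (t - alpha_o)) + 1 + f(t)), with f(t) computed as described for formula 1; the printed formula is not fully legible, so this reading is an assumption rather than the definitive form. The influence coefficient values 0, 0.5, 1 and 0.5 come from the nursing home example, while the blocking coefficient values and function names are illustrative.

```python
import math


def spline_coefficient(t: float, alpha_b: float, alpha_f: float,
                       alpha_o: float) -> float:
    """Spline coefficient f(t), following the prose description of formula 1."""
    return (1.0 - alpha_b) * (math.exp(alpha_f * (t - alpha_o)) + 1.0)


def influence_weight(p: float, t: float, alpha_b: float, alpha_f: float,
                     alpha_o: float, alpha_r: float) -> float:
    """Influence weight G(p, t) under the assumed reading of formula 2."""
    f_t = spline_coefficient(t, alpha_b, alpha_f, alpha_o)
    # Hyperbolic tangent of (blocking rate coefficient * first coefficient).
    step = math.tanh(alpha_r * (t - alpha_o))
    return 1.0 - p * 0.5 * (1.0 - alpha_b) * (step + 1.0 + f_t)


# Influence coefficients from the nursing home example in the text.
coefficients = {"cc": 0.0, "cn": 0.5, "nn": 1.0, "nc": 0.5}
# Blocking coefficients below are made-up illustrative values.
params = dict(alpha_b=0.5, alpha_f=-0.1, alpha_o=10.0, alpha_r=0.3)

weights = {key: influence_weight(p, t=10.0, **params)
           for key, p in coefficients.items()}
```

With an influence coefficient of 0 the weight stays at 1 regardless of the blocking coefficients, so in this reading only the nonzero coefficients respond to changes in the blocking state.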
Influence weights between the plurality of first sample data and the plurality of second sample data are calculated by setting the blocking parameters, and then the prediction result of epidemic situation propagation generated under the blocking prevention and control scene is realized. Furthermore, different blocking prevention and control scenes correspond to different blocking parameters, and a propagation prediction result under the current scene can be obtained by using the blocking parameters corresponding to the current blocking scene and actual propagation data under the current scene, so that the generated simulation prediction model has the capability of adapting to different blocking prevention and control scenes.
203. Performing time sequence data processing on the plurality of first sample data and the plurality of second sample data respectively according to the influence weights to obtain a first time sequence function and a second time sequence function, and linearly combining the first time sequence function and the second time sequence function to obtain a propagation prediction simulation model.
In the embodiment of the application, a plurality of first sample data and a plurality of second sample data are utilized to express a first contact rate, a second contact rate, a first infection rate, a second infection rate, a first suspected infection rate, a second suspected infection rate, a first death rate, a second death rate, a first cure rate and a second cure rate as time sequence functions of time t, and the plurality of time sequence functions are linearly combined to obtain a propagation prediction simulation model. The specific process of generating the propagation prediction simulation model is as follows:
Firstly, the plurality of first sample data and the plurality of second sample data are queried to obtain a first propagation parameter and a first sample total number inside the target blocking area, and a second propagation parameter and a second sample total number outside the target blocking area. The second infected person number and the second suspected infected person number are extracted from the second propagation parameter, and a first time sequence function is constructed for the first propagation parameter based on the second infected person number, the second suspected infected person number, the first propagation parameter, the first sample total number and the influence weights. Specifically, the first propagation parameter is queried to obtain a first infected person number, a first suspected infected person number, a first contact person number, a first death person number, a first cured person number, a first infection rate, a first cure rate and a first death rate. A first product is calculated from the conversion rate of the first suspected infected person number into the first infected person number, the first influence weight and the first infected person number. A further product is calculated from the conversion rate of the second suspected infected person number into the first infected person number, the fourth influence weight and the second infected person number, and the sum of this product and the first product is calculated. The ratio of the first suspected infected person number to the first sample total number is then calculated, and the ratio is multiplied by the sum to obtain the time sequence function of the first suspected infected person number. 
And calculating the product of the ratio and a third coefficient, and subtracting the product from the time sequence function of the first suspected infected person number to obtain the time sequence function of the first contact person number, wherein the third coefficient is the product of the first contact person number and the first infection rate. And calculating a second product of the first infected person number and the first cure rate, and subtracting the second product from the third coefficient to obtain a time sequence function of the first infected person number. And calculating the product of the first infected person number and the first death rate, subtracting the product from the second product to obtain a time sequence function of the first cured person number, and taking the product as the time sequence function of the first death person number. Further, a time series function of the first suspected number of infected persons, a time series function of the first number of persons who contacted, a time series function of the first number of infected persons, a time series function of the first number of cured persons, and a time series function of the first number of dead persons are taken as the first time series function. The process of specifically calculating the first timing function may be implemented based on the following formula 3:
Equation 3:
Figure BDA0003465030790000161
where $N_c$ is the first sample total number; $I_n$ is the second infected person number; $I_c$ is the first infected person number; $E_c$ is the first contact person number; $R_c$ is the first cured person number; $S_c$ is the first suspected infected person number; $D_c$ is the first death person number; $\beta_{nc}$ is the conversion rate of the second suspected infected person number $S_n$ into the first infected person number $I_c$; $\beta_{cc}$ is the conversion rate of the first suspected infected person number $S_c$ into the first infected person number $I_c$; $\alpha_c$ is the first infection rate; $\gamma_c$ is the first cure rate; and $E_c(t) \cdot \alpha_c$ is the third coefficient. The initial values of the parameters may be system defaults or may be set by workers based on the actual transmission of the infectious disease; the present application does not specifically limit how the initial parameters are set or what values they take.
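As a reading aid, the first time sequence function described above can be written out as a system of rate equations. This is a sketch assembled from the prose description only and is not a reproduction of the formula figure. It assumes that the suspected infected person number decreases by the infection term (the usual SEIRD sign convention, which the text does not state) and that the third coefficient is subtracted directly when forming the rate of the contact person number. The shorthand $\Lambda_c$ for the infection term and the symbol $\mu_c$ for the first death rate are introduced here for convenience and do not appear in the source.

```latex
\begin{aligned}
\Lambda_c(t) &= \frac{S_c(t)}{N_c}\Bigl(\beta_{cc}\,G(p_{cc},t)\,I_c(t) + \beta_{nc}\,G(p_{nc},t)\,I_n(t)\Bigr) \\
\frac{dS_c}{dt} &= -\Lambda_c(t) \\
\frac{dE_c}{dt} &= \Lambda_c(t) - \alpha_c E_c(t) \\
\frac{dI_c}{dt} &= \alpha_c E_c(t) - \gamma_c I_c(t) \\
\frac{dR_c}{dt} &= \gamma_c I_c(t) - \mu_c I_c(t) \\
\frac{dD_c}{dt} &= \mu_c I_c(t)
\end{aligned}
```

Under these assumptions the five rates sum to zero, so the first sample total number $N_c$ stays constant over time, which is a useful consistency check when values are plugged in.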
And extracting the first infected person number and the first suspected infected person number from the first transmission parameters, and constructing a second time sequence function for the second transmission parameters based on the first infected person number, the first suspected infected person number, the second transmission parameters, the total number of the second samples and the influence weight. Specifically, the second propagation parameter is queried to obtain a second infected person number, a second suspected infected person number, a second contact person number, a second death person number, a second cure person number, a second infection rate, a second cure rate and a second death rate. Calculating a third product of the conversion rate of the first suspected infected person number converted into the second infected person number, the second influence weight and the first infected person number, calculating a product of the conversion rate of the second suspected infected person number converted into the second infected person number, the third influence weight and the second infected person number, calculating a sum of the product and the third product, calculating a ratio of the second suspected infected person number to the total number of the second sample, and multiplying the ratio and the sum to obtain a time sequence function of the second suspected infected person number. And calculating the product of the ratio and a fourth coefficient, and subtracting the product from the time sequence function of the second suspected infected person number to obtain the time sequence function of the second contact person number, wherein the fourth coefficient is the product of the second contact person number and the second infection rate. 
And calculating a fourth product of the second infected person number and the second cure rate, and subtracting the fourth product from the fourth coefficient to obtain a time sequence function of the second infected person number. Calculating the product of the second infected person number and the second death rate, subtracting the fourth product from the product to obtain a time sequence function of the second cured person number, and taking the product as the time sequence function of the second death person number. Further, a time series function of the second suspected number of infected persons, a time series function of the second number of persons who contacted, a time series function of the second number of infected persons, a time series function of the second number of cured persons, and a time series function of the second number of dead persons are set as the second time series function. The process of specifically calculating the second timing function can be implemented based on the following formula 4:
Equation 4:
Figure BDA0003465030790000171
where $N_n$ is the second sample total number; $I_n$ is the second infected person number; $I_c$ is the first infected person number; $E_n$ is the second contact person number; $R_n$ is the second cured person number; $S_n$ is the second suspected infected person number; $D_n$ is the second death person number; $\beta_{cn}$ is the conversion rate of the first suspected infected person number $S_c$ into the second infected person number $I_n$; $\beta_{nn}$ is the conversion rate of the second suspected infected person number $S_n$ into the second infected person number $I_n$; $\alpha_n$ is the second infection rate; $\gamma_n$ is the second cure rate; and $E_n(t) \cdot \alpha_n$ is the fourth coefficient.
The first time sequence function and the second time sequence function constructed through the above steps take into account the influence weights generated by the flow of people between the two groups inside and outside the target blocking area. Because different degrees of prevention and control correspond to different blocking parameters, the calculated influence weights differ: the stricter the blocking, the smaller the flow of people and the smaller the influence weights. Therefore, the propagation prediction simulation model obtained by linearly combining the first time sequence function and the second time sequence function can change the influence weights of the people inside and outside the target area simply by modifying the blocking parameters of the target area, so that the model is applicable to various prevention and control scenarios, which improves its prediction accuracy and realism.
204. Test data are extracted from the plurality of electronic medical record data and input into the propagation prediction simulation model for epidemic propagation prediction, to obtain an epidemic propagation prediction result of the target area.
In the embodiment of the application, the system identifies the sample labels of the plurality of electronic medical record data, inputs the first sample data carrying the first test sample label and the second sample data carrying the second test sample label into the propagation prediction simulation model as test data for model testing, and finally outputs a prediction result. In practical application, the propagation prediction simulation model can be continuously trained with the test data, and training is complete when the prediction result of the model on the test data reaches the accuracy threshold. Further, the propagation data and the blocking parameters of the region to be predicted are input into the propagation prediction simulation model to obtain prediction results such as the inflection point and the ending time of the epidemic in the target area, and epidemic prevention measures are adjusted based on the prediction results.
In practice, before the test data are input into the propagation prediction simulation model to obtain a prediction result, the system trains the model with training data to determine the values of its unknown parameters. Specifically, each of the unknown parameters is set as a model vector based on maximum likelihood estimation and an optimization method. The first sample data and second sample data whose sample labels are training sample labels are extracted from the plurality of first sample data and the plurality of second sample data as training data, the training data are input into the propagation prediction simulation model, and the value ranges of the parameters in the model are determined. Finally, the value of the model vector is determined by the Markov chain Monte Carlo (MCMC) algorithm, which completes model training and yields the propagation prediction simulation model.
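The source names maximum likelihood estimation and Markov chain Monte Carlo but does not specify the likelihood, the prior or the sampler. As a minimal illustration of the idea only, the sketch below estimates a single unknown rate with a random-walk Metropolis sampler. The Poisson likelihood, the flat prior, the step size and the daily counts are all assumptions made for this example.

```python
import math
import random


def log_likelihood(rate, counts):
    """Poisson log-likelihood of the counts, dropping terms constant in rate."""
    if rate <= 0:
        return float("-inf")
    return sum(k * math.log(rate) - rate for k in counts)


def metropolis(counts, start=1.0, step=0.5, n_samples=20000, burn_in=2000, seed=0):
    """Random-walk Metropolis sampling of one positive parameter."""
    rng = random.Random(seed)
    current = start
    current_ll = log_likelihood(current, counts)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = current + rng.gauss(0.0, step)
        proposal_ll = log_likelihood(proposal, counts)
        # Accept with probability min(1, L(proposal) / L(current)).
        accept_probability = math.exp(min(0.0, proposal_ll - current_ll))
        if rng.random() < accept_probability:
            current, current_ll = proposal, proposal_ll
        if i >= burn_in:
            samples.append(current)
    return samples


daily_new_cases = [4, 6, 5, 7, 3, 5, 6, 4]   # made-up counts, not from the source
samples = metropolis(daily_new_cases)
posterior_mean = sum(samples) / len(samples)
```

In a full model each unknown rate and weight would be one component of the model vector and the likelihood would compare the simulated compartments with the training data; the accept/reject step stays the same.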
In the method provided by the embodiment of the application, a plurality of electronic medical record data of a target area are first read from an electronic medical record database, the electronic medical record data whose address label is in the target blocking area are extracted as a plurality of first sample data, and the electronic medical record data remaining after extraction are taken as a plurality of second sample data. Then, a blocking coefficient is determined, and the influence weights between the plurality of first sample data and the plurality of second sample data are calculated based on the blocking coefficient. Next, time sequence data processing is performed on the plurality of first sample data and the plurality of second sample data respectively according to the influence weights to obtain a first time sequence function and a second time sequence function, and the two functions are linearly combined to obtain a propagation prediction simulation model. Finally, test data are extracted from the plurality of electronic medical record data and input into the propagation prediction simulation model for epidemic propagation prediction, to obtain an epidemic propagation prediction result of the target area. In this way, the influence weights between the two groups of sample data inside and outside the target blocking area are calculated from the configured blocking parameters before the propagation prediction simulation model is constructed. Subsequently, the system only needs to update the blocking parameters of the region to be predicted and input the current epidemic propagation data of that region into the model to obtain its propagation prediction result, so that the model is applicable to epidemic propagation scenarios under different prevention and control states, which improves the accuracy and realism of the prediction result.
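The first step of the method summarized above, dividing the records by address label, can be sketched as follows. The record layout (a dictionary with an `address_tag` field) and the set of addresses that make up the target blocking area are hypothetical; the source does not describe how the electronic medical record data are stored.

```python
def split_by_blocking_area(records, blocking_area_addresses):
    """Split records into first sample data (inside) and second sample data (outside)."""
    first_sample_data = [
        r for r in records if r["address_tag"] in blocking_area_addresses
    ]
    second_sample_data = [
        r for r in records if r["address_tag"] not in blocking_area_addresses
    ]
    return first_sample_data, second_sample_data


# Illustrative records only.
records = [
    {"id": 1, "address_tag": "district_a"},
    {"id": 2, "address_tag": "district_b"},
    {"id": 3, "address_tag": "district_a"},
]
first_sample_data, second_sample_data = split_by_blocking_area(records, {"district_a"})
```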
Further, as a specific implementation of the method shown in fig. 1, an embodiment of the present application provides an artificial intelligence-based epidemic situation prediction apparatus, as shown in fig. 3A, the apparatus includes: a reading module 301, a first determining module 302, a calculating module 303, and a testing module 304.
The reading module 301 is configured to read a plurality of electronic medical record data of a target area, extract the electronic medical record data whose address tag is in the target blocking area from the plurality of electronic medical record data as a plurality of first sample data, and use the electronic medical record data remaining after extraction as a plurality of second sample data;
the first determining module 302 is configured to determine a blocking coefficient, and based on the blocking coefficient, calculate an influence weight between the plurality of first sample data and the plurality of second sample data;
the calculating module 303 is configured to perform time sequence data processing on the plurality of first sample data and the plurality of second sample data respectively according to the influence weights, to obtain a first time sequence function and a second time sequence function respectively, and perform linear combination on the first time sequence function and the second time sequence function to obtain a propagation prediction simulation model;
the test module 304 is configured to extract test data from the plurality of first sample data and the plurality of second sample data, and input the test data to the propagation prediction simulation model to perform epidemic propagation prediction, so as to obtain an epidemic propagation prediction result of the target area.
In a specific application scenario, the first determining module 302 is configured to determine a baseline coefficient, a fatigue coefficient, a deviation coefficient and a lockdown rate coefficient in a blocking state, and use the baseline coefficient, the fatigue coefficient, the deviation coefficient and the lockdown rate coefficient as the blocking coefficients; query a first influence coefficient generated by the plurality of first sample data, a second influence coefficient generated by the plurality of first sample data on the plurality of second sample data, a third influence coefficient generated by the plurality of second sample data, and a fourth influence coefficient generated by the plurality of second sample data on the plurality of first sample data; take a natural constant as the base and the product of the fatigue coefficient and a first coefficient as the exponent, and perform an exponential operation to obtain an exponent value, wherein the first coefficient is the difference between a time parameter and the deviation coefficient; calculate a product of the exponent value and a second coefficient, and take the sum of the product and the second coefficient as a spline coefficient, wherein the second coefficient is the difference between a first preset value and the baseline coefficient; calculate a product of the first coefficient and the lockdown rate coefficient, calculate the sum of the product and the spline coefficient, and calculate the hyperbolic tangent value of the sum; calculate a product of the hyperbolic tangent value, the second coefficient and the influence function, and calculate the difference between the product and a second preset value to obtain an influence parameter; input the first influence coefficient, the second influence coefficient, the third influence coefficient and the fourth influence coefficient into the influence function respectively to obtain a first influence weight generated by the plurality of first sample data, a second influence weight generated by the plurality of first sample data on the plurality of second sample data, a third influence weight generated by the plurality of second sample data and a fourth influence weight generated by the plurality of second sample data on the plurality of first sample data; and take the first, second, third and fourth influence weights as the influence weights between the plurality of first sample data and the plurality of second sample data.
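The arithmetic described above can be followed step by step in a short Python sketch. It is only one literal reading of the text: the source does not define the influence function or the two preset values, so the sketch assumes that the influence function simply contributes the influence coefficient passed to it, and it leaves both preset values as explicit arguments. All names are hypothetical, and `lockdown_rate` stands for the closing (lockdown) rate coefficient.

```python
import math


def influence_weight(influence_coefficient, t, baseline, fatigue, deviation,
                     lockdown_rate, preset1, preset2):
    """One literal reading of the influence-weight steps for a single coefficient."""
    first_coefficient = t - deviation                  # time parameter minus deviation
    exponent_value = math.exp(fatigue * first_coefficient)
    second_coefficient = preset1 - baseline            # first preset value minus baseline
    spline = exponent_value * second_coefficient + second_coefficient
    tanh_value = math.tanh(first_coefficient * lockdown_rate + spline)
    # Assumption: the influence function contributes the coefficient itself.
    return tanh_value * second_coefficient * influence_coefficient - preset2


# Illustrative blocking coefficients and presets; none come from the source.
blocking = dict(t=10.0, baseline=0.4, fatigue=-0.05, deviation=3.0,
                lockdown_rate=0.2, preset1=1.0, preset2=0.0)
influence_coefficients = [0.5, 0.2, 0.6, 0.1]          # first to fourth
weights = [influence_weight(k, **blocking) for k in influence_coefficients]
```

With the second preset value set to zero, each weight is proportional to its influence coefficient and bounded in magnitude by the second coefficient times that coefficient, because the hyperbolic tangent lies between -1 and 1.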
In a specific application scenario, the calculating module 303 is configured to query the multiple first sample data and the multiple second sample data to obtain a first propagation parameter, a first total number of samples in the target blocking area, and a second propagation parameter and a second total number of samples outside the target blocking area; extracting a second infected person number and a second suspected infected person number from the second propagation parameter, and constructing the first time sequence function for the first propagation parameter based on the second infected person number, the second suspected infected person number, the first propagation parameter, the first sample total number and the influence weight; extracting a first infected person number and a first suspected infected person number from the first propagation parameter, and constructing the second time sequence function for the second propagation parameter based on the first infected person number, the first suspected infected person number, the second propagation parameter, the second sample total number and the influence weight.
In a specific application scenario, the calculating module 303 is configured to query the first propagation parameter to obtain a first infected person number, a first suspected infected person number, a first contact person number, a first death person number, a first cured person number, a first infection rate, a first cure rate and a first death rate; calculate a first product of the conversion rate at which the first suspected infected persons are converted into first infected persons, the first influence weight and the first infected person number, calculate a product of the conversion rate at which the second suspected infected persons are converted into first infected persons, the fourth influence weight and the second infected person number, calculate a sum of the product and the first product, calculate a ratio of the first suspected infected person number to the first sample total number, and multiply the ratio by the sum to obtain a time sequence function of the first suspected infected person number; calculate a product of the ratio and a third coefficient, and subtract the product from the time sequence function of the first suspected infected person number to obtain a time sequence function of the first contact person number, wherein the third coefficient is the product of the first contact person number and the first infection rate; calculate a second product of the first infected person number and the first cure rate, and subtract the second product from the third coefficient to obtain a time sequence function of the first infected person number; calculate a product of the first infected person number and the first death rate, and subtract the product from the second product to obtain a time sequence function of the first cured person number; take the product as a time sequence function of the first death person number; and take the time sequence function of the first suspected infected person number, the time sequence function of the first contact person number, the time sequence function of the first infected person number, the time sequence function of the first cured person number and the time sequence function of the first death person number as the first time sequence function.
In a specific application scenario, the calculating module 303 is configured to query the second propagation parameter to obtain a second infected person number, a second suspected infected person number, a second contact person number, a second death person number, a second cured person number, a second infection rate, a second cure rate and a second death rate; calculate a third product of the conversion rate at which the first suspected infected persons are converted into second infected persons, the second influence weight and the first infected person number, calculate a product of the conversion rate at which the second suspected infected persons are converted into second infected persons, the third influence weight and the second infected person number, calculate a sum of the product and the third product, calculate a ratio of the second suspected infected person number to the second sample total number, and multiply the ratio by the sum to obtain a time sequence function of the second suspected infected person number; calculate a product of the ratio and a fourth coefficient, and subtract the product from the time sequence function of the second suspected infected person number to obtain a time sequence function of the second contact person number, wherein the fourth coefficient is the product of the second contact person number and the second infection rate; calculate a fourth product of the second infected person number and the second cure rate, and subtract the fourth product from the fourth coefficient to obtain a time sequence function of the second infected person number; calculate a product of the second infected person number and the second death rate, and subtract the fourth product from the product to obtain a time sequence function of the second cured person number; take the product as a time sequence function of the second death person number; and take the time sequence function of the second suspected infected person number, the time sequence function of the second contact person number, the time sequence function of the second infected person number, the time sequence function of the second cured person number and the time sequence function of the second death person number as the second time sequence function.
In a specific application scenario, as shown in fig. 3B, the apparatus further includes: a marking module 305 and an extraction module 306.
The marking module 305 is configured to number the plurality of first sample data, add a first test sample label to the first sample data with the same number as a target number, add a first training sample label to other first sample data, where a value of the target number is an arbitrary value, and the other first sample data is first sample data of the plurality of first sample data except the first sample data indicated by the target number;
the marking module 305 is further configured to number the plurality of second sample data, add a second test sample tag to second sample data with a number consistent with the target number, and add a second training sample tag to other second sample data, where the other second sample data is second sample data in the plurality of second sample data except for the second sample data indicated by the target number;
the extracting module 306 is configured to use the first sample data added with the first test sample label and the second sample data added with the second test sample label as the test data.
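The marking and extraction modules described above can be sketched in Python as follows. The source does not say how numbers are assigned, so sequential numbering starting at 1 is an assumption; the function names, the label strings and the dictionary layout are hypothetical as well.

```python
def label_samples(samples, target_number, group):
    """Number the samples and attach a test or training label to each one."""
    labeled = []
    for number, sample in enumerate(samples, start=1):   # numbering scheme is assumed
        kind = "test" if number == target_number else "training"
        labeled.append({"number": number, "label": f"{group}_{kind}", "data": sample})
    return labeled


def extract_test_data(first_labeled, second_labeled):
    """Collect the samples that carry the first or second test sample label."""
    return [s for s in first_labeled + second_labeled if s["label"].endswith("_test")]


# Illustrative sample data only.
first_labeled = label_samples(["a", "b", "c"], target_number=2, group="first")
second_labeled = label_samples(["x", "y"], target_number=2, group="second")
test_data = extract_test_data(first_labeled, second_labeled)
```

A numbering scheme in which several samples share a number (for example, cycling through a fixed set of numbers) would make each test set larger and would only change the line that assigns `number`.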
In a specific application scenario, as shown in fig. 3C, the apparatus further includes: a setting module 307, a training module 308, a second determining module 309.
The setting module 307 is configured to set each of the unknown parameters as a model vector based on maximum likelihood estimation and an optimization method;
the training module 308 is configured to extract, from the plurality of first sample data and the plurality of second sample data, the first sample data and second sample data whose sample labels are training sample labels as training data, input the training data into the propagation prediction simulation model, and determine the value ranges of the parameters in the propagation prediction simulation model;
the second determining module 309 is configured to determine the value of the model vector according to the Markov chain Monte Carlo (MCMC) algorithm, complete model training, and obtain the propagation prediction simulation model.
In the device provided by the embodiment of the application, a plurality of electronic medical record data of a target area are first read from an electronic medical record database, the electronic medical record data whose address label is in the target blocking area are extracted as a plurality of first sample data, and the electronic medical record data remaining after extraction are taken as a plurality of second sample data. Then, a blocking coefficient is determined, and the influence weights between the plurality of first sample data and the plurality of second sample data are calculated based on the blocking coefficient. Next, time sequence data processing is performed on the plurality of first sample data and the plurality of second sample data respectively according to the influence weights to obtain a first time sequence function and a second time sequence function, and the two functions are linearly combined to obtain a propagation prediction simulation model. Finally, test data are extracted from the plurality of electronic medical record data and input into the propagation prediction simulation model for epidemic propagation prediction, to obtain an epidemic propagation prediction result of the target area. In this way, the influence weights between the two groups of sample data inside and outside the target blocking area are calculated from the configured blocking parameters before the propagation prediction simulation model is constructed. Subsequently, the system only needs to update the blocking parameters of the region to be predicted and input the current epidemic propagation data of that region into the model to obtain its propagation prediction result, so that the model is applicable to epidemic propagation scenarios under different prevention and control states, which improves the accuracy and realism of the prediction result.
It should be noted that other corresponding descriptions of the functional units related to the artificial intelligence based epidemic situation prediction apparatus provided in the embodiment of the present application may refer to the corresponding descriptions in fig. 1 and fig. 2, and are not described herein again.
In an exemplary embodiment, referring to fig. 4, there is further provided a device, which includes a bus, a processor, a memory, and a communication interface, and may further include an input/output interface and a display device, wherein the functional units may communicate with each other through the bus. The memory stores computer programs, and the processor is used for executing the programs stored in the memory and executing the artificial intelligence based epidemic situation prediction method in the embodiment.
In an exemplary embodiment, there is further provided a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program carries out the steps of the artificial intelligence based epidemic situation prediction method.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present application can be implemented by hardware, and also by software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.), and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the implementation scenarios of the present application.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present application.
Those skilled in the art will appreciate that the modules in the devices in the implementation scenario may be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or may be located in one or more devices different from the present implementation scenario with corresponding changes. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above serial numbers of the present application are for description only and do not represent the superiority or inferiority of the implementation scenarios.
The above discloses only a few specific implementation scenarios of the present application; the present application is not limited thereto, and any variation that can be conceived by those skilled in the art is intended to fall within the scope of the present application.

Claims (10)

1. An epidemic situation prediction method based on artificial intelligence is characterized by comprising the following steps:
reading a plurality of electronic medical record data of a target area, extracting the electronic medical record data whose address label is in the target blocking area from the plurality of electronic medical record data as a plurality of first sample data, and taking the electronic medical record data remaining after extraction as a plurality of second sample data;
determining a blocking coefficient, based on which an influence weight between the plurality of first sample data and the plurality of second sample data is calculated;
respectively performing time sequence data processing on the plurality of first sample data and the plurality of second sample data according to the influence weights to respectively obtain a first time sequence function and a second time sequence function, and linearly combining the first time sequence function and the second time sequence function to obtain a propagation prediction simulation model;
and extracting test data from the plurality of electronic medical record data, inputting the test data into the propagation prediction simulation model for epidemic propagation prediction, and obtaining an epidemic propagation prediction result of the target area.
2. The method of claim 1, wherein said determining a blocking coefficient, based on which impact weights between said plurality of first sample data and said plurality of second sample data are calculated, comprises:
determining a baseline coefficient, a fatigue coefficient, a deviation coefficient and a lockdown rate coefficient in a blocking state, and taking the baseline coefficient, the fatigue coefficient, the deviation coefficient and the lockdown rate coefficient as the blocking coefficients;
querying a first influence coefficient generated by the plurality of first sample data, a second influence coefficient generated by the plurality of first sample data on the plurality of second sample data, a third influence coefficient generated by the plurality of second sample data, and a fourth influence coefficient generated by the plurality of second sample data on the plurality of first sample data;
taking a natural constant as the base and the product of the fatigue coefficient and a first coefficient as the exponent, and performing an exponential operation to obtain an exponent value, wherein the first coefficient is the difference between a time parameter and the deviation coefficient;
calculating a product of the exponent value and a second coefficient, and taking a sum of the product and the second coefficient as a spline coefficient, wherein the second coefficient is the difference between a first preset value and the baseline coefficient;
calculating a product of the first coefficient and the lockdown rate coefficient, calculating a sum of the product and the spline coefficient, and calculating a hyperbolic tangent value of the sum;
calculating a product of the hyperbolic tangent value, the second coefficient and the influence function, and calculating a difference value between the product and a second preset value to obtain an influence parameter;
inputting the first influence coefficient, the second influence coefficient, the third influence coefficient and the fourth influence coefficient into the influence function respectively to obtain first influence weights generated by the plurality of first sample data, second influence weights generated by the plurality of first sample data on the plurality of second sample data, third influence weights generated by the plurality of second sample data and fourth influence weights generated by the plurality of second sample data on the plurality of first sample data;
taking the first, second, third and fourth impact weights as the impact weights between the plurality of first sample data and the plurality of second sample data.
3. The method according to claim 1, wherein the performing time series data processing on the plurality of first sample data and the plurality of second sample data respectively according to the influence weights to obtain a first time series function and a second time series function respectively comprises:
querying the plurality of first sample data and the plurality of second sample data to obtain a first propagation parameter and a first sample total number in the target blocking area, and a second propagation parameter and a second sample total number outside the target blocking area;
extracting a second infected person number and a second suspected infected person number from the second propagation parameter, and constructing the first time sequence function for the first propagation parameter based on the second infected person number, the second suspected infected person number, the first propagation parameter, the first sample total number and the influence weight;
extracting a first infected person number and a first suspected infected person number from the first propagation parameter, and constructing the second time sequence function for the second propagation parameter based on the first infected person number, the first suspected infected person number, the second propagation parameter, the second sample total number and the influence weight.
4. The method of claim 3, wherein constructing the first timing function for the first propagation parameter based on the second infected person number, the second suspected infected person number, the first propagation parameter, the first sample total number, and the impact weight comprises:
querying the first propagation parameter to obtain a first infected person number, a first suspected infected person number, a first contact person number, a first death person number, a first cure person number, a first infection rate, a first cure rate and a first death rate;
calculating a first product of the conversion rate of the first suspected infected person number converted into the first infected person number, the first influence weight and the first infected person number, calculating a product of the conversion rate of the second suspected infected person number converted into the first infected person number, the fourth influence weight and the second infected person number, calculating a sum of the product and the first product, calculating a ratio of the first suspected infected person number to the first sample total number, and multiplying the ratio and the sum to obtain a time sequence function of the first suspected infected person number;
calculating a product of the ratio and a third coefficient, and subtracting the product from the time series function of the first suspected infected person number to obtain the time series function of the first contact person number, wherein the third coefficient is the product of the first contact person number and the first infection rate;
calculating a second product of the first infected person number and the first cure rate, and subtracting the second product from the third coefficient to obtain a time sequence function of the first infected person number;
calculating a product of the first infected person number and the first death rate, and subtracting the product from the second product to obtain a time sequence function of the first cured person number;
taking the product as a time sequence function of the first death person number;
and taking the time sequence function of the first suspected infected person number, the time sequence function of the first contact person number, the time sequence function of the first infected person number, the time sequence function of the first cured person number and the time sequence function of the first death person number as the first time sequence function.
5. The method of claim 3, wherein constructing the second time sequence function of the second propagation parameter based on the first infected person number, the first suspected infected person number, the second propagation parameter, the second sample total number, and the influence weight comprises:
querying the second propagation parameter to obtain a second infected person number, a second suspected infected person number, a second contact person number, a second death person number, a second cured person number, a second infection rate, a second cure rate and a second death rate;
calculating a third product of the conversion rate at which first suspected infected persons are converted into second infected persons, the second influence weight and the first infected person number; calculating a product of the conversion rate at which second suspected infected persons are converted into second infected persons, the third influence weight and the second infected person number; calculating a sum of the product and the third product; calculating a ratio of the second suspected infected person number to the second sample total number; and multiplying the ratio by the sum to obtain a time sequence function of the second suspected infected person number;
calculating a product of the ratio and a fourth coefficient, and subtracting the product from the time sequence function of the second suspected infected person number to obtain a time sequence function of the second contact person number, wherein the fourth coefficient is the product of the second contact person number and the second infection rate;
calculating a fourth product of the second infected person number and the second cure rate, and subtracting the fourth product from the fourth coefficient to obtain a time sequence function of the second infected person number;
calculating a product of the second infected person number and the second death rate, and subtracting the fourth product from the product to obtain a time sequence function of the second cured person number;
taking the product as a time sequence function of the second death person number;
and taking the time sequence function of the second suspected infected person number, the time sequence function of the second contact person number, the time sequence function of the second infected person number, the time sequence function of the second cured person number and the time sequence function of the second death person number as the second time sequence function.
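The arithmetic recited in claim 5 can be traced with a minimal sketch. It is an illustration only, not the applicant's implementation: the class name, field names and function name are hypothetical, the two conversion rates are passed in directly as numbers, and every sign follows the claim wording as translated, which may differ from a conventional compartment model.

```python
from dataclasses import dataclass


@dataclass
class SecondRegionState:
    """Hypothetical container for the values queried from the second propagation parameter."""
    suspected: float       # second suspected infected person number
    contact: float         # second contact person number
    infected: float        # second infected person number
    infection_rate: float  # second infection rate
    cure_rate: float       # second cure rate
    death_rate: float      # second death rate
    total: float           # second sample total number


def second_time_sequence_function(state, first_infected,
                                  rate_first_to_second, rate_second_to_second,
                                  second_weight, third_weight):
    """Evaluate the five component functions in the order the claim lists them."""
    # Third product: conversion rate * second influence weight * first infected person number.
    third_product = rate_first_to_second * second_weight * first_infected
    # Product: conversion rate * third influence weight * second infected person number.
    product = rate_second_to_second * third_weight * state.infected
    # Ratio of the second suspected infected person number to the second sample total number.
    ratio = state.suspected / state.total
    d_suspected = ratio * (product + third_product)

    # Fourth coefficient: second contact person number * second infection rate.
    fourth_coefficient = state.contact * state.infection_rate
    d_contact = d_suspected - ratio * fourth_coefficient

    # Fourth product: second infected person number * second cure rate.
    fourth_product = state.infected * state.cure_rate
    d_infected = fourth_coefficient - fourth_product

    # Product of the second infected person number and the second death rate.
    death_product = state.infected * state.death_rate
    d_cured = death_product - fourth_product  # as worded in the claim
    d_death = death_product

    return {
        "suspected": d_suspected,
        "contact": d_contact,
        "infected": d_infected,
        "cured": d_cured,
        "death": d_death,
    }
```

The returned dictionary plays the role of the second time sequence function: one value per compartment for the current state.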
6. The method according to claim 1, wherein after reading the plurality of electronic medical record data in the target area, extracting the electronic medical record data with an address tag in the target locking area from the plurality of electronic medical record data as a plurality of first sample data, and taking the plurality of electronic medical record data remaining after the extraction as a plurality of second sample data, the method further comprises:
numbering the plurality of first sample data respectively, adding a first test sample label to the first sample data whose number matches a target number, and adding a first training sample label to the other first sample data, wherein the target number may take an arbitrary value, and the other first sample data are the first sample data other than the first sample data indicated by the target number;
numbering the plurality of second sample data respectively, adding a second test sample label to the second sample data whose number matches the target number, and adding a second training sample label to the other second sample data, wherein the other second sample data are the second sample data other than the second sample data indicated by the target number;
and using the first sample data added with the first test sample label and the second sample data added with the second test sample label as the test data.
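The labelling scheme of claim 6 can be sketched as follows. This is an illustration under assumptions: the function names, the dictionary layout, the label strings and numbering from 1 are hypothetical choices, since the claim fixes none of them.

```python
def label_samples(samples, target_number, prefix):
    """Number the samples from 1 and tag the one matching target_number as the test sample."""
    labeled = []
    for number, sample in enumerate(samples, start=1):
        kind = "test" if number == target_number else "training"
        labeled.append({"number": number, "data": sample, "label": f"{prefix}_{kind}"})
    return labeled


def build_test_data(first_samples, second_samples, target_number):
    """Collect the first and second sample data that carry a test sample label."""
    first = label_samples(first_samples, target_number, "first")
    second = label_samples(second_samples, target_number, "second")
    return [record for record in first + second if record["label"].endswith("_test")]
```

Because a single target number is shared by both sets, the test data holds at most one record from each set, and every other record keeps a training label.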
7. The method according to claim 1, wherein before extracting test data from the plurality of electronic medical record data and inputting the test data to the propagation prediction simulation model for epidemic propagation prediction to obtain the epidemic propagation prediction result of the target region, the method further comprises:
collecting, based on maximum likelihood estimation and an optimization method, all unknown parameters into a model vector;
extracting, from the plurality of first sample data and the plurality of second sample data, the first sample data and the second sample data carrying training sample labels as training data, inputting the training data into the propagation prediction simulation model, and determining the value range of the parameters in the propagation prediction simulation model;
and determining the value of the model vector using the Markov chain Monte Carlo algorithm, an artificial intelligence algorithm, thereby completing model training and obtaining the propagation prediction simulation model.
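As a rough illustration of the training step in claim 7, the sketch below runs a random-walk Metropolis sampler over a two-element model vector restricted to a value range. Everything specific here is an assumption made for the example and is not taken from the patent: the toy growth curve, the Poisson likelihood, the proposal step sizes and all function names.

```python
import math
import random


def log_likelihood(theta, counts):
    """Poisson log-likelihood of daily counts under a toy curve scale * exp(growth * day)."""
    scale, growth = theta
    total = 0.0
    for day, observed in enumerate(counts):
        expected = scale * math.exp(growth * day)
        total += observed * math.log(expected) - expected - math.lgamma(observed + 1)
    return total


def metropolis_hastings(counts, bounds, step_sizes, steps=20000, seed=0):
    """Sample the model vector; proposals outside the value range are rejected."""
    rng = random.Random(seed)
    current = [(low + high) / 2.0 for low, high in bounds]  # start at the range midpoint
    current_ll = log_likelihood(current, counts)
    samples = []
    for _ in range(steps):
        proposal = [value + rng.gauss(0.0, step)
                    for value, step in zip(current, step_sizes)]
        inside = all(low <= value <= high
                     for value, (low, high) in zip(proposal, bounds))
        if inside:
            proposal_ll = log_likelihood(proposal, counts)
            # Accept with probability min(1, likelihood ratio).
            if rng.random() < math.exp(min(0.0, proposal_ll - current_ll)):
                current, current_ll = proposal, proposal_ll
        samples.append(list(current))
    return samples


def posterior_mean(samples, burn_in):
    """Average the retained samples to obtain one value per model-vector element."""
    kept = samples[burn_in:]
    return [sum(column) / len(kept) for column in zip(*kept)]
```

The `bounds` argument stands in for the value range determined from the training data, and the posterior mean after burn-in is one simple way to pick the final value of the model vector.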
8. An artificial intelligence based epidemic situation prediction device, comprising:
a reading module, configured to read a plurality of electronic medical record data of a target area, extract the electronic medical record data with an address label in the target locking area from the plurality of electronic medical record data as a plurality of first sample data, and take the plurality of electronic medical record data remaining after extraction as a plurality of second sample data;
a first determination module, configured to determine a blocking coefficient and to calculate, based on the blocking coefficient, an influence weight between the plurality of first sample data and the plurality of second sample data;
a calculation module, configured to perform time sequence data processing on the plurality of first sample data and the plurality of second sample data respectively according to the influence weight to obtain a first time sequence function and a second time sequence function, and to linearly combine the first time sequence function and the second time sequence function to obtain a propagation prediction simulation model;
and a test module, configured to extract test data from the plurality of first sample data and the plurality of second sample data, and to input the test data into the propagation prediction simulation model for epidemic propagation prediction, to obtain an epidemic propagation prediction result of the target area.
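The partition performed by the reading module can be sketched in a few lines. The record layout, the `address_label` key and the function name are hypothetical; the claim only requires that records whose address label falls in the target locking area become first sample data and the rest become second sample data.

```python
def split_by_locking_area(records, locking_area_labels):
    """Split medical records into first (inside the locking area) and second (remaining) sample data."""
    first_samples, second_samples = [], []
    for record in records:
        if record.get("address_label") in locking_area_labels:
            first_samples.append(record)
        else:
            second_samples.append(record)
    return first_samples, second_samples
```

Records without an address label fall into the second set in this sketch, which is a design choice of the example rather than something the claim specifies.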
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210026852.2A CN114388138A (en) 2022-01-11 2022-01-11 Artificial intelligence based epidemic situation prediction method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210026852.2A CN114388138A (en) 2022-01-11 2022-01-11 Artificial intelligence based epidemic situation prediction method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114388138A (en) 2022-04-22

Family

ID=81201851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210026852.2A Pending CN114388138A (en) 2022-01-11 2022-01-11 Artificial intelligence based epidemic situation prediction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114388138A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110093249A1 (en) * 2009-10-19 2011-04-21 Theranos, Inc. Integrated health data capture and analysis system
CN111242395A (en) * 2020-04-26 2020-06-05 北京全路通信信号研究设计院集团有限公司 Method and device for constructing prediction model for OD (origin-destination) data
CN112435759A (en) * 2020-11-24 2021-03-02 医渡云(北京)技术有限公司 Epidemic situation data prediction method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
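刘向阳; 宋玉蓉; 孟繁荣: "Accurate prediction simulation of rumor information diffusion in microblog network events", 计算机仿真 (Computer Simulation), no. 08, 15 August 2018 (2018-08-15), pages 458-462 *
李兴兵: "Modeling and simulation of group psychology evolution in network information dissemination", 网络安全技术与应用 (Network Security Technology & Application), no. 03, 15 March 2020 (2020-03-15), pages 43-46 *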

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116168847A (en) * 2023-04-26 2023-05-26 南京邮电大学 Infectious disease prediction method based on optimized next-generation reservoir computing
CN116168847B (en) * 2023-04-26 2023-08-11 南京邮电大学 Infectious disease prediction method based on optimized next-generation reservoir computing

Similar Documents

Publication Publication Date Title
CN110211690A (en) Disease risks prediction technique, device, computer equipment and computer storage medium
JP6815708B2 (en) Influenza prediction model generation method, equipment and computer readable storage medium
Saulnier et al. Inferring epidemiological parameters from phylogenies using regression-ABC: A comparative study
CN108491714A (en) The man-machine recognition methods of identifying code
CN108052979A (en) The method, apparatus and equipment merged to model predication value
Craiu et al. Inference based on the EM algorithm for the competing risks model with masked causes of failure
CN109784015A (en) A kind of authentication identifying method and device
CN109981749A (en) A kind of cloud workflow task running time prediction method promoted based on limit gradient
CN114388138A (en) Artificial intelligence based epidemic situation prediction method, device, equipment and storage medium
CN111178537A (en) Feature extraction model training method and device
CN108446841A (en) A kind of systems approach determining accident factor hierarchical structure using grey correlation
Wu et al. SQEIR: An epidemic virus spread analysis and prediction model
CN110610140A (en) Training method, device and equipment of face recognition model and readable storage medium
CN110210522A (en) The training method and device of picture quality Fraction Model
Gasparrini et al. Package ‘mvmeta’
Liu et al. Maximum likelihood abundance estimation from capture-recapture data when covariates are missing at random
CN114881124A (en) Method and device for constructing cause-and-effect relationship diagram, electronic equipment and medium
CN111523685B (en) Method for reducing performance modeling overhead based on active learning
CN110110280B (en) Curve integral calculation method, device and equipment for coordinates and storage medium
OSullivan et al. Canonical correlation analysis for detecting changes in network structure
Maddumage et al. R programming for Social Network Analysis-A Review
Huang Probabilistic model checking of disease spread and prevention
Vorgul et al. Pseudo Random Value Generation in STM32 Cube
CN113658713B (en) Infection tendency prediction method, device, equipment and storage medium
CN111522644B (en) Method for predicting running time of parallel program based on historical running data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination