CN113192640A - COVID-19 risk stage assessment method and system based on transfer learning - Google Patents

Publication number
CN113192640A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110492146.2A
Other languages
Chinese (zh)
Inventor
沈国江
李宁
郦鹏飞
孔祥杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

The COVID-19 risk stage assessment method based on transfer learning comprises the following steps: 1) proposing COVID-19 risk stages, defining the risk stage of the epidemic from the perspective of the epidemic cycle; 2) pre-training a decoder to obtain a standard feature-space mapping method; 3) classifying the COVID-19 data of the countries by degree of similarity using the decoder; 4) quantitatively analyzing the characteristics of each category of data; 5) matching the country to be evaluated to its corresponding country category according to the data characteristics; 6) assessing the COVID-19 risk stage based on transfer learning, using two methods, instance-based transfer learning and instance-based transfer learning after normalization, the former being used with a fixed data volume and the latter with a self-determined data volume. The invention also comprises a system for implementing the COVID-19 risk stage assessment method based on transfer learning. An assessment experiment on the COVID-19 risk stages of 9 countries affected by the epidemic shows that the invention performs well on this problem.

Description

COVID-19 risk stage assessment method and system based on transfer learning
Technical Field
The invention relates to the field of infectious diseases such as COVID-19, and in particular to a COVID-19 risk stage assessment method and system based on transfer learning.
Background
Current assessment of COVID-19 stages has the following shortcomings. First, there is no well-defined index for assessing the COVID-19 risk stage. Second, in the early stage of an outbreak a country lacks epidemic data of its own, so data from other countries are needed to help determine its risk stage; however, epidemic data differ greatly between countries, so a transfer learning method is required. Assessing the risk stage of the COVID-19 epidemic is therefore highly challenging.
Disclosure of Invention
The invention provides a COVID-19 risk stage assessment method and system based on transfer learning, aiming to overcome the above shortcomings of the prior art.
To this end, the invention first defines the COVID-19 risk stages and provides a method for assessing them. It further provides a complete data-transfer assessment workflow that assesses the COVID-19 risk stage and provides an effective basis for government decision-making.
The invention achieves this aim through the following technical scheme. The COVID-19 risk stage assessment method based on transfer learning comprises the following steps:
(1) proposing COVID-19 risk stages;
(2) designing a decoder;
(3) pre-training the decoder to obtain a standard feature-space mapping method;
(4) classifying the COVID-19 data of the countries by degree of similarity using the decoder;
(5) quantitatively analyzing the characteristics of each category of data;
(6) matching the country to be evaluated to its corresponding country category according to the data characteristics;
(7) instance-based transfer learning assessment;
(8) instance-based transfer learning assessment after normalization;
Step (1) specifically comprises:
At present, countries have no unified standard for assessing the risk stage of COVID-19. Existing assessment standards are generally based on the number of confirmed cases. However, quantity-based risk stage assessment has many problems; one is that the future development trend of the epidemic is unclear, so the risk stage cannot be assessed over a complete confirmed-case cycle. Our assessment standard is based on the complete confirmed-case cycle and takes the future development trend of the epidemic into account.
The specific assessment standard is defined as follows.
Definition: COVID-19 risk stage. We introduce a measure, recent_to_max (rtm), to describe the infection status of a country.
Figure BDA0003052827510000021
rdc represents the average number of confirmed infections over the last three days: a shorter window is too sensitive to daily fluctuations, while a longer window yields too little variation for countries with high early growth rates. A three-day window is reasonable and captures the latest growth trend. mdc represents the maximum daily confirmed count over the whole infection cycle; to reduce errors as far as possible, we likewise use the average of the maximum three days of confirmed cases over the entire COVID-19 infection cycle. case_n denotes the number of confirmed cases on a given day. The specific COVID-19 stages are determined as shown in Table 1. After several attempts, this classification standard proved the most suitable for our present study. All countries are labeled over their whole COVID-19 cycle as shown in FIG. 2, where the distribution of the labels can be seen to agree closely with the development of the COVID-19 cycle.
Range of rtm    COVID-19 risk stage
[0, 0.2)        Low risk
[0.2, 0.5)      Medium risk
[0.5, 0.8)      High risk
[0.8, +∞)       Severe
TABLE 1
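As an illustration of the definition and of Table 1, the following minimal Python sketch computes rtm for a series of daily confirmed counts and maps it to a stage. It assumes that rtm is the ratio rdc / mdc (the formula itself appears only as an image in the source) and that mdc is the mean of the three largest daily counts. The function names `recent_to_max` and `risk_stage` are hypothetical; the thresholds are those of Table 1.

```python
from statistics import mean

# Exclusive upper bounds and stage names, following Table 1.
STAGE_BOUNDS = [(0.2, "low risk"), (0.5, "medium risk"), (0.8, "high risk")]


def recent_to_max(daily_cases):
    """Return rtm, assumed here to be rdc / mdc."""
    if len(daily_cases) < 3:
        raise ValueError("need at least three days of data")
    rdc = mean(daily_cases[-3:])          # mean of the last three days
    mdc = mean(sorted(daily_cases)[-3:])  # mean of the three largest days (assumption)
    return rdc / mdc if mdc > 0 else 0.0


def risk_stage(rtm):
    """Map an rtm value to the stage whose half-open range contains it."""
    for upper, name in STAGE_BOUNDS:
        if rtm < upper:
            return name
    return "severe"  # rtm in [0.8, +inf)


# Illustrative (made-up) daily counts: rdc = 12, mdc = 90.
cases = [10, 50, 100, 90, 80, 20, 10, 6]
print(risk_stage(recent_to_max(cases)))
```

Because each range is closed on the left, a value exactly on a boundary (for example 0.2) falls into the higher stage.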
Step (2) specifically comprises:
An LSTM can capture the quantitative characteristics and trends of its input. As shown in FIG. 1, the decoder consists of an LSTM followed by one fully connected layer, whose output is passed through an argmax function. The input
Figure BDA0003052827510000036
is the confirmed COVID-19 infection data for the previous 4 days and for the following day, and the output is the stage COVID-19 is currently in. The specific formula is as follows:
Figure BDA0003052827510000032
The last output of the LSTM,
Figure BDA0003052827510000033
is taken, and the final output is obtained:
Figure BDA0003052827510000034
Y_label is the risk stage assessment result we need.
The loss function is shown in the following equation. The first half minimizes the error between the true and predicted values. The second half, L_reg, is an L2 regularization term used to avoid overfitting, where λ is a hyperparameter.
Figure BDA0003052827510000035
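The formula images of step (2) are not reproduced in this text. One formulation consistent with the surrounding description, offered only as a sketch, is given below. The symbols $x_t$, $h_t$, $T$, $W$, $b$, $\hat{y}$ and $\theta$ (the trainable parameters), the generic error term $\ell$, and the squared-norm form of $L_{reg}$ are assumptions and are not taken from the source.

```latex
\begin{aligned}
h_t &= \mathrm{LSTM}(x_t,\, h_{t-1}), \qquad t = 1, \dots, T \\
\hat{y} &= W h_T + b \\
Y_{label} &= \arg\max_k \hat{y}_k \\
L &= \ell(Y, \hat{y}) + \lambda L_{reg}, \qquad L_{reg} = \lVert \theta \rVert_2^2
\end{aligned}
```

Here $h_T$ is the last LSTM output, the fully connected layer produces one score per risk stage, and the argmax selects the stage; the source does not state which error term is used for the first half of the loss.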
Step (3) specifically comprises:
31. Data preprocessing;
From the 187 countries in the global COVID-19 confirmed-case dataset, we select those whose recent_to_max is less than or equal to 0.1 and whose total number of confirmed cases is greater than 3000; these countries are considered to be at the end of, or to have passed, a complete cycle. Countries with obvious data errors are removed in advance, and the data of the selected countries are labeled according to the rules above.
32. Formal pre-training;
We train on the confirmed COVID-19 data of one country at a time, and generally stop training when the trained decoder can correctly decode 100% of the data of the country it was trained on.
Step (4) specifically comprises:
The specific classification process is as follows: the decoder is used to decode all countries to obtain their degree of similarity, and the data whose similarity exceeds a given threshold are grouped into one class. This is repeated until all data are classified.
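The source does not spell out how the degree of similarity is computed. A minimal sketch, assuming it is the fraction of days on which a decoder trained on the benchmark country reproduces the labeled stages of another country (which would be consistent with the 100% stopping rule of step (3)), is shown below. The function name `decoding_similarity` is hypothetical.

```python
def decoding_similarity(predicted_stages, labeled_stages):
    """Fraction of days whose predicted stage equals the labeled stage.

    Assumption: 'similarity' means decoding accuracy of a decoder trained
    on the benchmark country when applied to another country's data.
    """
    if len(predicted_stages) != len(labeled_stages):
        raise ValueError("sequences must have the same length")
    if not labeled_stages:
        return 0.0
    matches = sum(p == y for p, y in zip(predicted_stages, labeled_stages))
    return matches / len(labeled_stages)
```

Under this reading, a similarity of 1.0 means the decoder assigns every day of the other country to the same stage as the rtm-based labels.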
Step (5) specifically comprises:
The differences between categories of data are mainly reflected in the number of confirmed cases. Countries of the same category have similar COVID-19 cycle variations and similar distributions of maximum values. The average number of confirmed cases in countries of the same category is approximately the same. Therefore, a characteristic trend should be found for each class of data, which can in turn help classify data of unknown category. We obtain these features through statistical analysis of the data.
Step (6) specifically comprises:
Based on these characteristics, the category to which the country to be evaluated belongs is determined.
Step (7) specifically comprises:
Data of the same category can be transferred directly, which solves the problem of insufficient data. The data of the same category are transferred directly to train a decoder, which is then used to decode the current COVID-19 risk stage of the country to be evaluated.
Step (8) specifically comprises:
The instance-based assessment after normalization is shown in FIG. 7. The data of every country are normalized by the maximum value of that country's own data to the range [0, 1], that is, mapped into the same distribution space. These data are then used as source data to train the decoder. Because the maximum number of confirmed COVID-19 cases of the country to be evaluated is not yet known, that country is normalized according to the normalization rule of the country category matched by its features. The normalized data are input to the decoder to obtain the COVID-19 risk stage of the country.
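The normalization of step (8) can be sketched as follows. Source countries are scaled by their own maximum, as described above. For the country to be evaluated, the source only says that the normalization rule of the matched category is used; the sketch assumes that this rule supplies a reference peak value (`category_peak`, a hypothetical parameter) and that scaled values are clipped at 1.0 so they stay in [0, 1]. Both function names are hypothetical.

```python
def normalize_by_own_max(daily_cases):
    """Scale a source country's series to [0, 1] by its own maximum."""
    peak = max(daily_cases)
    if peak <= 0:
        return [0.0 for _ in daily_cases]
    return [c / peak for c in daily_cases]


def normalize_target(daily_cases, category_peak):
    """Scale the target country's series by a category-level reference peak.

    Assumption: the matched category provides `category_peak`; values above
    it are clipped to 1.0 to remain in the same [0, 1] space.
    """
    if category_peak <= 0:
        raise ValueError("category_peak must be positive")
    return [min(c / category_peak, 1.0) for c in daily_cases]
```

For example, a source series [0, 50, 100, 25] maps to [0.0, 0.5, 1.0, 0.25], so countries with very different case counts share one input range for the decoder.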
The system for implementing the COVID-19 risk stage assessment method based on transfer learning comprises, connected in sequence, a COVID-19 risk stage assessment standard module, a decoder pre-training module, a national COVID-19 epidemic data classification module, a data analysis module, a country category matching module, a transfer learning assessment module, and a post-normalization transfer learning assessment module.
The advantages of the invention are: 1. The COVID-19 risk level is divided according to the COVID-19 cycle, making the abstract risk of COVID-19 concrete. 2. By analyzing the characteristics of the confirmed-case data of different epidemics, the data of other countries are fully utilized to
Drawings
FIG. 1 is an overall flow diagram of the method of the present invention.
FIGS. 2(a)-2(p) show the COVID-19 risk stages defined in the example of the invention, where FIG. 2(a) is country A; FIG. 2(b) is country B; FIG. 2(c) is country C; FIG. 2(d) is country D; FIG. 2(e) is country E; FIG. 2(f) is country F; FIG. 2(g) is country G; FIG. 2(h) is country H; FIG. 2(i) is country I; FIG. 2(j) is country J; FIG. 2(k) is country K; FIG. 2(m) is country M; FIG. 2(n) is country N; FIG. 2(o) is country O; FIG. 2(p) is country P.
FIG. 3 shows the similarity of the countries' COVID-19 data in the example of the invention.
FIG. 4 shows the selection of the classification threshold in the example of the invention.
FIG. 5 shows the confirmed epidemic cases after classification of the countries' COVID-19 data in the example of the invention.
FIG. 6 shows the mean number of confirmed cases per cycle after classification of the countries' COVID-19 data in the example of the invention.
FIG. 7 illustrates the instance-based transfer learning method after normalization of the invention.
FIGS. 8(a)-8(i) show the actual assessment results for 9 countries in the example of the invention, where FIG. 8(a) is country A; FIG. 8(b) is country B; FIG. 8(c) is country E; FIG. 8(d) is country H; FIG. 8(e) is country J; FIG. 8(f) is country K; FIG. 8(g) is country Q; FIG. 8(h) is country M; FIG. 8(i) is country O.
Detailed description of the preferred embodiments
The invention is further described below with reference to an example assessing the COVID-19 risk stages of 9 countries.
The overall procedure of the COVID-19 risk stage assessment method in this example is shown in FIG. 1 and specifically comprises the following steps:
(1) Proposing COVID-19 risk stages:
At present, countries have no unified standard for assessing the risk stage of COVID-19. Existing assessment standards are generally based on the number of confirmed cases. However, quantity-based risk stage assessment has many problems; one is that the future development trend of the epidemic is unclear, so the risk stage cannot be assessed over a complete confirmed-case cycle. Our assessment standard is based on the complete confirmed-case cycle and takes the future development trend of the epidemic into account.
The specific assessment standard is defined as follows.
Definition: COVID-19 risk stage. We introduce a measure, recent_to_max (rtm), to describe the infection status of a country.
Figure BDA0003052827510000061
rdc represents the average number of confirmed infections over the last three days: a shorter window is too sensitive to daily fluctuations, while a longer window yields too little variation for countries with high early growth rates. A three-day window is reasonable and captures the latest growth trend. mdc represents the maximum daily confirmed count over the whole infection cycle; to reduce errors as far as possible, we likewise use the average of the maximum three days of confirmed cases over the entire COVID-19 infection cycle. case_n denotes the number of confirmed cases on a given day. The specific COVID-19 stages are determined as shown in Table 1. After several attempts, this classification standard proved the most suitable for our present study. All countries are labeled over their whole COVID-19 cycle as shown in FIG. 2, where the distribution of the labels can be seen to agree closely with the development of the COVID-19 cycle.
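The formula image for this measure is not reproduced in this text. A reading consistent with the definitions of rdc and mdc, stated here as an assumption, is the ratio below, where $N$ is the most recent day and $m_1, m_2, m_3$ index the three days with the largest confirmed counts (both the ratio form and the index notation are assumed, not taken from the source).

```latex
\mathrm{rtm} = \frac{\mathrm{rdc}}{\mathrm{mdc}}, \qquad
\mathrm{rdc} = \frac{1}{3} \sum_{n=N-2}^{N} \mathrm{case}_n, \qquad
\mathrm{mdc} = \frac{1}{3} \left( \mathrm{case}_{m_1} + \mathrm{case}_{m_2} + \mathrm{case}_{m_3} \right)
```

As a purely illustrative calculation, rdc = 12 and mdc = 90 give rtm ≈ 0.13, which lies in the range [0, 0.2) of Table 1.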
(2) Designing a decoder:
An LSTM can capture the quantitative characteristics and trends of its input. As shown in FIG. 1, the decoder consists of an LSTM followed by one fully connected layer, whose output is passed through an argmax function. The input
Figure BDA0003052827510000062
is the confirmed COVID-19 infection data for the previous 4 days and for the following day, and the output is the stage COVID-19 is currently in. The specific formula is as follows:
Figure BDA0003052827510000063
The last output of the LSTM,
Figure BDA0003052827510000064
is taken, and the final output is obtained:
Figure BDA0003052827510000065
Y_label is the risk stage assessment result we need.
The loss function is shown in the following equation. The first half minimizes the error between the true and predicted values. The second half, L_reg, is an L2 regularization term used to avoid overfitting, where λ is a hyperparameter.
Figure BDA0003052827510000071
(3) Pre-training the decoder to obtain a standard feature-space mapping method:
31. Data preprocessing;
From the 187 countries in the global COVID-19 confirmed-case dataset, we select those whose recent_to_max is less than or equal to 0.1 and whose total number of confirmed cases is greater than 3000; these countries are considered to be at the end of, or to have passed, a complete cycle. Countries with obvious data errors are removed in advance, and the data of the selected countries are labeled according to the rules above.
32. Formal pre-training;
We train on the confirmed COVID-19 data of one country at a time, and generally stop training when the trained decoder can correctly decode 100% of the data of the country it was trained on.
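The selection rule of sub-step 31 can be sketched in Python as follows. The sketch assumes that the measure is rtm = rdc / mdc with mdc taken as the mean of the three largest daily counts (the formula is only available as an image), and that the input is a hypothetical mapping from country name to its list of daily confirmed counts. The names `select_pretraining_countries`, `rtm_max` and `min_total` are illustrative; the thresholds 0.1 and 3000 come from the text.

```python
from statistics import mean


def recent_to_max(daily_cases):
    """rtm, assumed to be (mean of last 3 days) / (mean of 3 largest days)."""
    rdc = mean(daily_cases[-3:])
    mdc = mean(sorted(daily_cases)[-3:])
    return rdc / mdc if mdc > 0 else 0.0


def select_pretraining_countries(series_by_country, rtm_max=0.1, min_total=3000):
    """Keep countries that appear to have completed a full cycle."""
    selected = []
    for country, daily in series_by_country.items():
        if len(daily) < 3:
            continue  # too short to compute a three-day average
        if sum(daily) > min_total and recent_to_max(daily) <= rtm_max:
            selected.append(country)
    return selected


# Made-up series for three hypothetical countries.
data = {
    "A": [100, 2000, 3000, 2500, 100, 50, 30],       # large total, outbreak over
    "B": [10, 20, 30, 20, 10, 1, 1],                  # total too small
    "C": [100, 2000, 3000, 2500, 2400, 2300, 2200],   # still near its peak
}
print(select_pretraining_countries(data))
```

Removing countries with obvious data errors is left out of the sketch because the source does not describe how such errors are detected.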
(4) Classifying the COVID-19 data of the countries by degree of similarity using the decoder:
FIG. 3 shows the country similarities obtained with Austria as the pre-training country, and FIG. 4 shows the accuracy of COVID-19 risk stage assessment after instance-based transfer learning under different similarity thresholds; on this basis we choose 80% as the threshold. The specific classification process is as follows: each time, the epidemic data of one country are used as the benchmark to train the decoder, the decoder is used to decode all countries to obtain the benchmark's similarity to each country, and the data with similarity greater than 80% are grouped into one class. This is repeated until all data are classified. The classification results in the example are shown in Table 2.
Figure BDA0003052827510000072
TABLE 2
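The classification loop of step (4) can be sketched as below. Training a decoder on the benchmark country and decoding another country is abstracted into a caller-supplied function `similarity(benchmark, other)`, which is a hypothetical stand-in and not part of the source. The sketch also takes the first remaining country as the benchmark for reproducibility, whereas the claims describe picking one at random.

```python
def classify_countries(countries, similarity, threshold=0.8):
    """Group countries by similarity to a benchmark until none remain.

    `similarity(benchmark, other)` is assumed to return a value in [0, 1],
    e.g. the accuracy of a decoder trained on `benchmark` when decoding `other`.
    """
    remaining = list(countries)
    classes = []
    while remaining:
        benchmark = remaining[0]  # the claims pick this at random
        group = [benchmark] + [
            c for c in remaining[1:] if similarity(benchmark, c) > threshold
        ]
        classes.append(group)
        remaining = [c for c in remaining if c not in group]
    return classes


# Toy similarity table with made-up values; missing pairs count as 0.0.
toy = {("A", "B"): 0.9, ("A", "C"): 0.3, ("A", "D"): 0.85}
print(classify_countries(["A", "B", "C", "D"], lambda a, b: toy.get((a, b), 0.0)))
```

Each pass removes at least the benchmark itself, so the loop always terminates; a country that is similar to no benchmark ends up in a class of its own.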
(5) Quantitatively analyzing the characteristics of each category of data:
As can be seen from FIG. 5 and FIG. 6, the differences between categories of data are mainly reflected in the number of confirmed cases. In FIG. 5, countries of the same category have similar COVID-19 cycle variations and similar distributions of maximum values. In FIG. 6, the average number of confirmed cases in countries of the same category is approximately the same. Therefore, a characteristic trend should be found for each class of data, which can in turn help classify data of unknown category. Through statistical analysis of the data we obtained these features, as shown in Table 2.
(6) Matching the country to be evaluated to its corresponding country category according to the data characteristics:
Based on these characteristics, the category to which the country to be evaluated belongs is determined.
(7) Instance-based transfer learning assessment:
Data of the same category can be transferred directly, which solves the problem of insufficient data. The data of the same category are transferred directly to train a decoder, which is then used to decode the current COVID-19 risk stage of the country to be evaluated.
(8) Instance-based transfer learning assessment after normalization:
The instance-based assessment after normalization is shown in FIG. 7. The data of every country are normalized by the maximum value of that country's own data to the range [0, 1], that is, mapped into the same distribution space. These data are then used as source data to train the decoder. Because the maximum number of confirmed COVID-19 cases of the country to be evaluated is not yet known, that country is normalized according to the normalization rule of the country category matched by its features. The normalized data are input to the decoder to obtain the COVID-19 risk stage of the country.
The assessment results for the 9 countries after steps (6), (7) and (8) are shown in FIG. 8. "True values" denotes the true COVID-19 risk stages, "I." denotes the instance-based transfer learning assessment, and "N.I." denotes the instance-based transfer learning assessment after normalization; the 31.25% in brackets denotes the utilization of the source data, and so on. The results show that this assessment approach fits the development of the COVID-19 epidemic well and can be applied successfully in a country that lacks COVID-19 epidemic data of its own, which is of significant reference value for the planning of government work and offers guidance when facing an unknown epidemic.
The system for implementing the COVID-19 risk stage assessment method based on transfer learning comprises, connected in sequence, a COVID-19 risk stage assessment standard module, a decoder pre-training module, a national COVID-19 epidemic data classification module, a data analysis module, a country category matching module, a transfer learning assessment module, and a post-normalization transfer learning assessment module;
wherein the COVID-19 risk stage assessment standard module specifically comprises:
the specific assessment standard is defined as follows:
definition: COVID-19 risk stage; a measure, recent_to_max (rtm), is introduced to describe the infection status of a country;
Figure BDA0003052827510000091
rdc represents the average number of confirmed infections over the last three days: a shorter window is too sensitive to daily fluctuations, while a longer window yields too little variation for countries with high early growth rates; a three-day window is reasonable and captures the latest growth trend; mdc represents the maximum daily confirmed count over the whole infection cycle, and to reduce errors as far as possible, the average of the maximum three days of confirmed cases over the entire COVID-19 infection cycle is taken; case_n denotes the number of confirmed cases on a given day; the specific COVID-19 stages are determined as shown in Table 1;
Range of rtm    COVID-19 risk stage
[0, 0.2)        Low risk
[0.2, 0.5)      Medium risk
[0.5, 0.8)      High risk
[0.8, +∞)       Severe
TABLE 1
wherein the decoder specifically comprises:
an LSTM can capture the quantitative characteristics and trends of its input; the decoder consists of an LSTM followed by one fully connected layer, whose output is passed through an argmax function; the input
Figure BDA0003052827510000092
is the confirmed COVID-19 infection data for the previous 4 days and for the following day, and the output is the stage COVID-19 is currently in; the specific formula is as follows:
Figure BDA0003052827510000093
the last output of the LSTM,
Figure BDA0003052827510000094
is taken, and the final output is obtained;
Figure BDA0003052827510000101
Y_label is the required risk stage assessment result;
the loss function is shown in the following formula; the first half minimizes the error between the true and predicted values; the second half, L_reg, is an L2 regularization term used to avoid overfitting, and λ is a hyperparameter;
Figure BDA0003052827510000102
the decoder pre-training module specifically comprises a data preprocessing submodule and a formal pre-training submodule;
the data preprocessing submodule: from the 187 countries in the global COVID-19 confirmed-case dataset, countries whose recent_to_max is less than or equal to 0.1 and whose total number of confirmed cases is greater than 3000 are selected, these countries being considered to be at the end of, or to have passed, a complete cycle; countries with obvious data errors are removed in advance, and the data of the selected countries are labeled according to the rules above;
the formal pre-training submodule: training is carried out on the confirmed COVID-19 data of one country at a time, and training is stopped when the trained decoder can correctly decode 100% of the data of the country it was trained on;
the specific classification process of the national COVID-19 epidemic data classification module is as follows: each time, the epidemic data of one randomly chosen country are used as the benchmark to train the decoder, the decoder is used to decode all countries to obtain the benchmark's similarity to each country, and the data with similarity greater than a given threshold are grouped into one class; this is repeated until all data are classified;
wherein the data analysis module specifically comprises: the differences between categories of data are mainly reflected in the number of confirmed cases; countries of the same category have similar COVID-19 cycle variations and similar distributions of maximum values; the average number of confirmed cases in countries of the same category is approximately the same; therefore, a characteristic trend should be found for each class of data, which can in turn help classify data of unknown category; these features are obtained through statistical analysis of the data;
the country category matching module determines, based on these characteristics, the category to which the country to be evaluated belongs;
wherein the transfer learning assessment module comprises: data of the same category are transferred directly, which solves the problem of insufficient data; the data of the same category are transferred directly to train a decoder, which is then used to decode the current COVID-19 risk stage of the country to be evaluated;
the post-normalization transfer learning assessment module specifically comprises: for the instance-based assessment after normalization, the data of every country are normalized by the maximum value of that country's own data to the range [0, 1], that is, mapped into the same distribution space; these data are then used as source data to train the decoder; because the maximum number of confirmed COVID-19 cases of the country to be evaluated is not yet known, that country is normalized according to the normalization rule of the country category matched by its features; the normalized data are input to the decoder to obtain the COVID-19 risk stage of the country.

Claims (2)

1. A COVID-19 risk stage assessment method based on transfer learning, comprising the following steps:
(1) proposing COVID-19 risk stages, specifically comprising:
at present, countries have no unified standard for assessing the risk stage of COVID-19; existing assessment standards are generally based on the number of confirmed cases; however, quantity-based risk stage assessment has many problems, one being that the future development trend of the epidemic is unclear, so the risk stage cannot be assessed over a complete confirmed-case cycle; the assessment standard is based on the complete confirmed-case cycle and takes the future development trend of the epidemic into account;
the specific assessment standard is defined as follows:
definition: COVID-19 risk stage; a measure, recent_to_max (rtm), is introduced to describe the infection status of a country;
Figure FDA0003052827500000011
rdc represents the average number of confirmed infections over the last three days; mdc represents the maximum daily confirmed count over the whole infection cycle, and to reduce errors as far as possible, the average of the maximum three days of confirmed cases over the entire COVID-19 infection cycle is taken; case_n denotes the number of confirmed cases on a given day; the specific COVID-19 stages are determined as shown in Table 1;
Range of rtm    COVID-19 risk stage
[0, 0.2)        Low risk
[0.2, 0.5)      Medium risk
[0.5, 0.8)      High risk
[0.8, +∞)       Severe
TABLE 1
(2) designing a decoder, specifically comprising:
an LSTM can capture the quantitative characteristics and trends of its input; the decoder consists of an LSTM followed by one fully connected layer, whose output is passed through an argmax function; the input
Figure FDA0003052827500000012
is the confirmed COVID-19 infection data for the previous 4 days and for the following day, and the output is the stage COVID-19 is currently in; the specific formula is as follows:
Figure FDA0003052827500000021
the last output of the LSTM,
Figure FDA0003052827500000022
is taken, and the final output is obtained;
Figure FDA0003052827500000023
Y_label is the required risk stage assessment result;
the loss function is shown in the following formula; the first half minimizes the error between the true and predicted values; the second half, L_reg, is an L2 regularization term used to avoid overfitting, and λ is a hyperparameter;
Figure FDA0003052827500000024
(3) the method for pre-training the decoder to obtain the standard feature space mapping specifically comprises the following steps:
31. preprocessing data;
selecting countries having a receiver _ to _ max of 0.1 or less and a total number of diagnoses of more than 3000 from 187 countries of the global COVID-19 diagnostic dataset, which are considered to have been at the end of a complete cycle or have exceeded the cycle; removing countries with obvious data errors in advance, and marking corresponding labels on the selected country data according to the previous rules;
32. formal pre-training;
training uses the COVID-19 confirmed-case data of one country at a time, in sequence; training stops when the trained decoder decodes the country it was trained on with 100% accuracy;
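The country selection in step 31 can be sketched as a simple filter. This is a sketch under stated assumptions: each country's data is a list of daily new confirmed cases (so the total is their sum), the ratio is computed as the mean of the last three days over the mean of the three largest days, and the names `ratio_to_max` and `select_pretraining_countries` are hypothetical. Removing countries with obvious data errors is left out because the text does not say how they are detected.

```python
from typing import Dict, List


def ratio_to_max(cases: List[float]) -> float:
    """Assumed form of the ratio: mean of last 3 days / mean of 3 largest days."""
    mdc = sum(sorted(cases, reverse=True)[:3]) / 3
    return (sum(cases[-3:]) / 3) / mdc if mdc else 0.0


def select_pretraining_countries(
    data: Dict[str, List[float]],
    max_ratio: float = 0.1,   # threshold stated in the text
    min_total: float = 3000,  # threshold stated in the text
) -> List[str]:
    """Keep countries that have finished a cycle and have enough confirmed cases."""
    selected = []
    for country, cases in data.items():
        if len(cases) < 3:
            continue  # too short to compute the ratio
        if ratio_to_max(cases) <= max_ratio and sum(cases) > min_total:
            selected.append(country)
    return selected
```

A country still near its peak is rejected by the first condition, and a country with a small outbreak is rejected by the second.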
(4) classifying the countries' COVID-19 epidemic data by degree of similarity using the decoder;
the specific classification process is as follows: each time, the COVID-19 epidemic data of one randomly chosen country are used as the benchmark to train a decoder; the decoder then decodes all countries to obtain the similarity between the benchmark country's data and the data of each other country, and the data whose similarity exceeds a set threshold are grouped into one class; this is repeated until all data are classified;
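The grouping loop of step (4) can be sketched as follows. In the patent the similarity comes from decoding each country with a decoder trained on the benchmark country; here that step is replaced by a caller-supplied `similarity` function, which is a placeholder and an assumption, as is the choice to compare the benchmark only against countries that have not yet been assigned to a class.

```python
import random
from typing import Callable, Dict, List

Series = List[float]


def group_by_similarity(
    data: Dict[str, Series],
    similarity: Callable[[Series, Series], float],
    threshold: float,
    seed: int = 0,
) -> List[List[str]]:
    """Repeatedly pick a random benchmark country and collect its look-alikes."""
    rng = random.Random(seed)
    remaining = sorted(data)  # countries not yet assigned to a class
    classes: List[List[str]] = []
    while remaining:
        benchmark = rng.choice(remaining)
        members = [
            c for c in remaining
            if c == benchmark or similarity(data[benchmark], data[c]) > threshold
        ]
        classes.append(members)
        remaining = [c for c in remaining if c not in members]
    return classes
```

Because the benchmark always joins its own class, every pass removes at least one country and the loop terminates.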
(5) quantitatively analyzing the characteristics of each category of data;
the differences between categories of data are mainly reflected in the number of confirmed cases; countries in the same category have similar COVID-19 periodic variation and similar distributions of maxima; the average number of confirmed cases of countries in the same category is approximately the same; therefore, a characteristic trend should be found in each category of data in order to classify data of unknown category; these features are obtained through statistical analysis of the data;
(6) matching the country to be evaluated to its corresponding country category according to the data features;
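The text does not list the features used for matching, only that categories differ mainly in the number of confirmed cases. As one possible illustration, the sketch below assigns a country to the category whose average daily case count is closest to its own. The feature choice, the distance measure and the name `match_category` are all assumptions.

```python
from typing import Dict, List


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def match_category(
    target_cases: List[float],
    categories: Dict[str, List[List[float]]],
) -> str:
    """Return the category whose average daily case count is closest to the target's."""
    target_mean = mean(target_cases)

    def distance(name: str) -> float:
        # Category feature: mean over member countries of their own daily means.
        category_mean = mean([mean(series) for series in categories[name]])
        return abs(category_mean - target_mean)

    return min(categories, key=distance)
```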
(7) instance-based transfer learning assessment;
data of the same category can be transferred directly, which solves the problem of insufficient data; same-category data are transferred directly to train a decoder, and the decoder is then used to decode the current COVID-19 risk stage of the country to be evaluated;
(8) instance-based transfer learning assessment after normalization;
the data of every country are normalized by that country's maximum value so that they fall in [0,1], that is, they are mapped to the same distribution space; these data are then used as source data to train a decoder; because the maximum number of confirmed COVID-19 cases of the country to be evaluated is not known, its data are normalized according to the normalization rule of the country category to which it is matched by its features; the normalized data are input to the decoder to obtain the COVID-19 risk stage of the country.
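The normalization in step (8) can be sketched as follows. Scaling a source country by its own maximum follows the text directly. The rule applied to the country under evaluation is not spelled out, so `category_scale` (the mean of the member countries' maxima) and the clipping to 1.0 are hypothetical choices made only for illustration.

```python
from typing import List


def normalize_by_own_max(cases: List[float]) -> List[float]:
    """Scale a source country's series into [0, 1] using its own maximum."""
    peak = max(cases)
    return [c / peak for c in cases] if peak > 0 else [0.0 for _ in cases]


def category_scale(category_series: List[List[float]]) -> float:
    """Hypothetical category rule: the mean of the member countries' maxima."""
    return sum(max(s) for s in category_series) / len(category_series)


def normalize_target(cases: List[float], scale: float) -> List[float]:
    """Scale the country under evaluation by its category's value, clipped to 1.0."""
    return [min(c / scale, 1.0) for c in cases]
```

The clipping keeps the target series inside the range the decoder saw during training even when the target country exceeds the typical peak of its category.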
2. A system for implementing the COVID-19 risk stage assessment method based on transfer learning of claim 1, characterized in that: the system comprises a COVID-19 risk stage evaluation standard module, a decoder pre-training module, a national COVID-19 epidemic data classification module, a data analysis module, a country category matching module, a transfer learning evaluation module and a post-normalization transfer learning evaluation module, connected in sequence;
wherein the COVID-19 risk stage evaluation standard module specifically comprises:
the specific evaluation criteria are defined as follows:
definition: COVID-19 risk stage; a metric, current_to_max, is introduced to describe the infection status of a country;
Figure FDA0003052827500000031
rdc denotes the average number of confirmed cases over the last three days: too short a window is overly sensitive to daily fluctuations, while too long a window leaves little difference between countries' early-stage ratios; a three-day window is reasonable and captures the latest growth trend; mdc denotes the maximum daily number of confirmed cases over the whole infection period and, to reduce the effect of errors as far as possible, is taken as the average of the three highest daily confirmed-case counts in the whole COVID-19 infection period; case_n denotes the number of confirmed cases on a given day; the specific COVID-19 stages are determined as shown in Table 1;
Range of rtm    COVID-19 risk stage
[0, 0.2)        Low risk
[0.2, 0.5)      Medium risk
[0.5, 0.8)      High risk
[0.8, +∞)       Severe
TABLE 1
wherein the decoder specifically comprises:
the LSTM captures the quantitative characteristics and trends of the input; the decoder consists of an LSTM followed by a fully connected layer and an argmax function; the input
Figure FDA0003052827500000041
is the COVID-19 confirmed-case data for the previous 4 days together with the COVID-19 confirmed-case data for the following day, and the output is the current COVID-19 stage; the specific formula is as follows:
Figure FDA0003052827500000042
the last output of the LSTM,
Figure FDA0003052827500000043
is taken as the final output;
Figure FDA0003052827500000044
Y_label is the required risk stage assessment result;
the loss function is given by the following formula; the first term minimizes the error between the true and predicted values; the second term, L_reg, is an L2 regularization term used to avoid overfitting, and λ is a hyperparameter;
Figure FDA0003052827500000045
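The training samples fed to the decoder can be sketched as sliding windows over a country's daily counts. This sketch assumes a window of five consecutive days (four historical days plus the following day), that each window is labeled with the Table 1 stage computed on the window's last day, and that the peak level is taken over the whole series, which is only available for countries that have completed a cycle. The names `stage_index` and `make_samples` are hypothetical.

```python
from typing import List, Tuple


def stage_index(ratio: float) -> int:
    """Index into Table 1: 0 low, 1 medium, 2 high, 3 severe."""
    for index, upper in enumerate((0.2, 0.5, 0.8)):
        if ratio < upper:
            return index
    return 3


def make_samples(cases: List[float], window: int = 5) -> List[Tuple[List[float], int]]:
    """Slide a window over daily cases; label each window with the stage on its last day."""
    mdc = sum(sorted(cases, reverse=True)[:3]) / 3  # peak level over the whole period
    samples = []
    for end in range(window, len(cases) + 1):
        rdc = sum(cases[end - 3:end]) / 3           # mean of the three days ending the window
        ratio = rdc / mdc if mdc else 0.0
        samples.append((cases[end - window:end], stage_index(ratio)))
    return samples
```

For the series `[10, 20, 100, 90, 80, 30, 10, 5]` this yields four windows whose labels step down from severe to low risk as the outbreak recedes.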
the decoder pre-training module specifically comprises a data preprocessing submodule and a formal pre-training submodule;
data preprocessing submodule: selecting, from the 187 countries in the global COVID-19 confirmed-case dataset, countries whose current_to_max is 0.1 or less and whose total number of confirmed cases exceeds 3000, these countries being considered to be at the end of a complete cycle or to have passed it; removing in advance countries with obvious data errors, and labeling the selected countries' data according to the preceding rules;
formal pre-training submodule: training with the COVID-19 confirmed-case data of one country at a time, in sequence, the stop criterion being that the trained decoder decodes the country it was trained on with 100% accuracy;
the specific classification process of the national COVID-19 epidemic data classification module is as follows: each time, the epidemic data of one randomly chosen country are used as the benchmark to train a decoder; the decoder then decodes all countries to obtain the similarity between the benchmark country and each other country, and the data whose similarity exceeds a set threshold are grouped into one class; this is repeated until all data are classified;
wherein the data analysis module specifically comprises: the differences between categories of data are mainly reflected in the number of confirmed cases; countries in the same category have similar COVID-19 periodic variation and similar distributions of maxima; the average number of confirmed cases of countries in the same category is approximately the same; therefore, a characteristic trend should be found in each category of data, which in turn helps to classify data of unknown category; these features are obtained through statistical analysis of the data;
the country category matching module determines, according to these features, which category the country to be evaluated belongs to;
wherein the transfer learning evaluation module comprises: data of the same category are transferred directly, which solves the problem of insufficient data; same-category data are transferred directly to train a decoder, and the decoder is then used to decode the current COVID-19 risk stage of the country to be evaluated;
the post-normalization transfer learning evaluation module specifically comprises: instance-based assessment after normalization, in which the data of every country are normalized by that country's maximum value so that they fall in [0,1], that is, they are mapped to the same distribution space; these data are then used as source data to train a decoder; because the maximum number of confirmed COVID-19 cases of the country to be evaluated is not known, its data are normalized according to the normalization rule of the country category to which it is matched by its features; the normalized data are input to the decoder to obtain the COVID-19 risk stage of the country.
CN202110492146.2A 2021-05-06 2021-05-06 New crown risk stage assessment method and system based on transfer learning Pending CN113192640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110492146.2A CN113192640A (en) 2021-05-06 2021-05-06 New crown risk stage assessment method and system based on transfer learning

Publications (1)

Publication Number Publication Date
CN113192640A true CN113192640A (en) 2021-07-30

Family

ID=76983964

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110492146.2A Pending CN113192640A (en) 2021-05-06 2021-05-06 New crown risk stage assessment method and system based on transfer learning

Country Status (1)

Country Link
CN (1) CN113192640A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111768873A (en) * 2020-06-03 2020-10-13 中国地质大学(武汉) COVID-19 real-time risk prediction method
CN111798991A (en) * 2020-07-09 2020-10-20 重庆邮电大学 LSTM-based method for predicting population situation of new coronary pneumonia epidemic situation
CN112164471A (en) * 2020-09-17 2021-01-01 吉林大学 New crown epidemic situation comprehensive evaluation method based on classification regression model
CN112542250A (en) * 2020-11-04 2021-03-23 温州大学 Global new coronavirus transmission prediction method based on optimized SEIRD model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210730