CN113362959A - Sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control - Google Patents
Sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control
- Publication number
- CN113362959A (application number CN202110618120.8A)
- Authority
- CN
- China
- Prior art keywords
- data
- risk
- fusion
- establishing
- disease
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/80—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
- G16H40/20—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Abstract
The invention discloses a sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control, comprising the following steps: S1: raw data; S2: data management; S3: establishing a knowledge base; S4: decision-layer fusion; S5: establishing a risk prediction model. The invention has the following beneficial effects: first, it handles multidimensional input and output data well; second, it can address arbitrarily complex nonlinear problems and therefore has strong nonlinear mapping capability; third, information or data can be processed in parallel, improving data processing capacity and performance; fourth, it is more adaptive; fifth, it performs better at data fusion.
Description
Technical Field
The invention belongs to the technical field of data analysis, and in particular relates to a sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control.
Background
In this century, new (sudden) serious infectious diseases have emerged repeatedly, such as SARS in 2003, influenza A/H1N1 in 2009 and MERS in 2012, and the ongoing novel coronavirus pneumonia (COVID-19) epidemic has severely affected normal social life and people's lives and health. Early, rapid prediction of epidemic trends based on multi-source heterogeneous data is of great practical significance, especially prediction of transmission risk, infection rate, case fatality rate, time to peak (inflection point) and final infection scale. Algorithmic models and data analysis play an important role in supporting epidemic prediction, early warning and risk analysis. In regional epidemic prevention and control, however, data from a single source or a single system cannot give a comprehensive picture of the epidemic, whereas fusing measurements with different characteristics can turn multi-source information into a valuable interpretation of the environment. If fusion is performed at the data layer, with the raw data transmitted in full and processed centrally, the drawbacks are heavy communication load, high computational cost, long processing time and poor resistance to interference.
Disclosure of Invention
The invention aims to provide a sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control, in order to solve the problem that information sources differ in data form when the risk of sudden respiratory infectious diseases in a region is predicted through multi-channel monitoring and early warning and through the combination of infectious disease monitoring systems with the monitoring systems of other departments. Based on decision-layer data fusion, a respiratory infectious disease risk prediction model is proposed that handles the nonlinear problem of fusing multidimensional, multi-source heterogeneous data in regional infectious disease prevention and control, improves data processing capacity and performance, and is highly adaptive.
To achieve this purpose, the invention provides the following technical solution: a sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control, characterized by comprising the following steps:
S1: raw data: through big data centers established in the functional units, the dispersed multi-source heterogeneous data are extracted, integrated and cleaned, including hospital data, disease control center data, inspection institution data, basic community population data and emergency material data;
S2: data management: a comprehensive, dynamic and configurable data access mechanism is established to meet the requirements of data acquisition, data aggregation, task configuration, task scheduling, data encryption and breakpoint retransmission; a standardized data processing flow is established, forming a content-oriented conversion and processing specification covering data standardization, cleaning, association, comparison and identification, to support an organized data fusion database; a data organization mode based on multi-element integration and a fused database is established, and cloud database construction and storage management are promoted according to key element classification, where the key elements include service type, sensitivity level and privacy content, and the association and fusion of data resources from different sources are achieved through feature tags and normalized integration;
S3: establishing a knowledge base: a knowledge graph classification and a multi-channel, multi-dimensional data service mode are established, providing basic data services for users, including query, retrieval, comparison and sorting, and intelligent data services for professionals, including mining analysis and expert modeling;
S4: decision-layer fusion: rules and knowledge are extracted from the big data center knowledge bases of the units in the region, and semantic fusion of multi-source heterogeneous data is achieved at the knowledge layer, where semantic-layer information fusion can be divided into multi-view-based data fusion, similarity-based data fusion, probabilistic-dependency-based data fusion and transfer-learning-based data fusion;
S5: establishing a risk prediction model: infection with sudden respiratory infectious diseases can be divided overall into 7 categories of infection risk scenarios (families, communities and residential buildings, workplaces, public places, transportation, medical and health institutions, and social welfare institutions), which are subdivided into 53 categories of infection risk areas and 255 infection risk sites; based on the key factors in the risk scenarios, namely risk sites, risk groups and risk behaviors, a risk prediction model is established using the data fused at the decision layer, and the optimal parameters of the risk prediction model are selected through a deep neural network.
Preferably, the functional units in S1 include hospitals and disease control centers.
Preferably, the disease control center data in S1 includes data from the infectious disease and public health emergency network direct reporting system.
Compared with the prior art, the invention has the following beneficial effects:
first, it handles multidimensional input and output data well;
second, it can address arbitrarily complex nonlinear problems and therefore has strong nonlinear mapping capability;
third, information or data can be processed in parallel, improving data processing capacity and performance;
fourth, it is more adaptive;
fifth, it performs better at data fusion.
Drawings
FIG. 1 is a schematic structural diagram of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, but not all, of the embodiments of the present invention. The following detailed description of the embodiments presented in the figures is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person skilled in the art without inventive effort on the basis of the embodiments of the present invention fall within the scope of protection of the present invention.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the equipment or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; either directly or indirectly through intervening media, either internally or in any other relationship. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the present invention, unless otherwise expressly stated or limited, "above" or "below" a first feature means that the first and second features are in direct contact, or that the first and second features are not in direct contact but are in contact with each other via another feature therebetween. Also, the first feature being "on," "above" and "over" the second feature includes the first feature being directly on and obliquely above the second feature, or merely indicating that the first feature is at a higher level than the second feature. A first feature being "under," "below," and "beneath" a second feature includes the first feature being directly under and obliquely below the second feature, or simply meaning that the first feature is at a lesser elevation than the second feature.
In the description of the present invention, it should be noted that the terms "upper", "lower", "left", "right", "inner", "outer", and the like indicate orientations or positional relationships based on those shown in the drawings, and are only for convenience of description and simplification of description, but do not indicate or imply that the referred device or element must have a specific orientation, be constructed in a specific orientation, and be operated, and thus, should not be construed as limiting the present invention.
Referring to FIG. 1, the present invention provides a technical solution: a sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control, comprising the following steps:
S1: raw data: through big data centers established in the functional units, the dispersed multi-source heterogeneous data are extracted, integrated and cleaned, including hospital data, disease control center data, inspection institution data, basic community population data and emergency material data;
S2: data management: a comprehensive, dynamic and configurable data access mechanism is established to meet the requirements of data acquisition, data aggregation, task configuration, task scheduling, data encryption and breakpoint retransmission; a standardized data processing flow is established, forming a content-oriented conversion and processing specification covering data standardization, cleaning, association, comparison and identification, to support an organized data fusion database; a data organization mode based on multi-element integration and a fused database is established, and cloud database construction and storage management are promoted according to key element classification, where the key elements include service type, sensitivity level and privacy content, and the association and fusion of data resources from different sources are achieved through feature tags and normalized integration;
S3: establishing a knowledge base: a knowledge graph classification and a multi-channel, multi-dimensional data service mode are established, providing basic data services for users, including query, retrieval, comparison and sorting, and intelligent data services for professionals, including mining analysis and expert modeling;
S4: decision-layer fusion: rules and knowledge are extracted from the big data center knowledge bases of the units in the region, and semantic fusion of multi-source heterogeneous data is achieved at the knowledge layer, where semantic-layer information fusion can be divided into multi-view-based data fusion, similarity-based data fusion, probabilistic-dependency-based data fusion and transfer-learning-based data fusion;
S5: establishing a risk prediction model: infection with sudden respiratory infectious diseases can be divided overall into 7 categories of infection risk scenarios (families, communities and residential buildings, workplaces, public places, transportation, medical and health institutions, and social welfare institutions), which are subdivided into 53 categories of infection risk areas and 255 infection risk sites; based on the key factors in the risk scenarios, namely risk sites, risk groups and risk behaviors, a risk prediction model is established using the data fused at the decision layer, and the optimal parameters of the risk prediction model are selected through a deep neural network.
In this embodiment, the functional units in S1 include hospitals and disease control centers.
In this embodiment, the disease control center data in S1 includes data from the infectious disease and public health emergency network direct reporting system.
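Step S4 names similarity-based data fusion as one category of semantic-layer fusion but does not specify a similarity measure. As a minimal sketch only, assuming records from two sources share a free-text field, the following uses Python's standard `difflib.SequenceMatcher` as a stand-in measure. The field names, the 0.85 threshold and the helper `link_records` are hypothetical and are not taken from the patent.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def link_records(left, right, field, threshold=0.85):
    """Pair each left record with its most similar right record on `field`.

    Pairs scoring below `threshold` (an assumed cut-off) are left unlinked.
    """
    links = []
    for l_rec in left:
        best = max(right, key=lambda r: similarity(l_rec[field], r[field]),
                   default=None)
        if best is not None and similarity(l_rec[field], best[field]) >= threshold:
            # Merge the two views; left-hand values win on conflicting keys.
            links.append({**best, **l_rec})
    return links


hospital = [{"site": "Central Railway Station", "cases": 3}]
cdc = [
    {"site": "central railway station ", "alert": "high"},
    {"site": "North Market", "alert": "low"},
]
linked = link_records(hospital, cdc, "site")
```

In this illustration the hospital record is linked to the first CDC record despite the differences in case and spacing, and the fused record carries both `cases` and `alert`.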
In terms of the relationship between sources, information fusion can be divided into complementary, competitive and cooperative types. In the complementary type, the information sources are independent of one another and each perceives a different aspect of the target or scene, so fusing them yields global information about the target. In the competitive type, all information sources describe the same aspect of the same target or scene, and multi-source fusion is used for redundancy calibration and to increase trust. In the cooperative type, the information sources depend on one another and perceive the target from different angles, and multi-source fusion is used to obtain entirely new information.
In terms of the level of abstraction, information fusion is commonly divided into data-layer fusion, decision-layer fusion and feature-layer fusion.
Because data-layer fusion generally operates on data from equivalent information sources, its usual fusion mechanism is competitive. Data-layer fusion has the advantage of low information loss, because the field data is preserved as far as possible, but because the field data is transmitted in full and processed centrally, it has the drawbacks of heavy communication load, high computational cost, long processing time and poor resistance to interference.
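The three fusion types can be illustrated with a small sketch. The functions and the example quantities below (case counts, population, incidence per 100,000) are hypothetical illustrations chosen for this note, not operations defined by the patent.

```python
from statistics import mean


def complementary_fusion(*views):
    """Merge views that each describe a different aspect into one record."""
    merged = {}
    for view in views:
        merged.update(view)
    return merged


def competitive_fusion(readings):
    """Combine redundant readings of the same quantity.

    The spread between readings serves as a simple consistency check.
    """
    return {"value": mean(readings), "spread": max(readings) - min(readings)}


def cooperative_fusion(case_count, population):
    """Derive information that neither source holds alone:
    incidence per 100,000 people."""
    return case_count * 100_000 / population


profile = complementary_fusion({"cases": 12}, {"population": 40_000})
agreed = competitive_fusion([10, 12, 14])
incidence = cooperative_fusion(profile["cases"], profile["population"])
```

Here `profile` combines two independent views, `agreed` reconciles three reports of the same count, and `incidence` is a new quantity obtained only by combining sources.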
Decision-layer fusion, also called semantic-layer fusion, operates on rules or knowledge. It is a higher-level mode of fusion that is closer to human cognition and depends on human understanding of the meaning of data features and the relationships among them. Because decision-layer fusion is not constrained by differences in the data forms of the sources, its fusion mechanism is more flexible and can serve competitive, cooperative and complementary fusion requirements. Because what is transmitted and processed is knowledge of smaller scale, it has the advantages of light communication load, strong resistance to interference and low computational cost at the fusion center; however, each information source still incurs some computational cost and some information loss at the knowledge acquisition stage.
Feature-layer fusion operates on feature attributes extracted from the data. Its common fusion mechanisms are competitive, complementary and cooperative, and its advantages and disadvantages fall between those of data-layer fusion and decision-layer fusion.
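As a minimal sketch of why decision-layer fusion keeps communication light, suppose each source sends only a local risk estimate and a reliability weight to the fusion center, rather than its raw data. The reliability-weighted average used below is one simple combination rule assumed for illustration; the patent does not state which rule is used, and `fuse_decisions` is a hypothetical name.

```python
def fuse_decisions(decisions):
    """Combine local risk estimates by reliability-weighted averaging.

    `decisions` is a list of (risk_probability, weight) pairs, one per source.
    """
    total_weight = sum(weight for _, weight in decisions)
    if total_weight <= 0:
        raise ValueError("at least one decision must carry positive weight")
    return sum(p * weight for p, weight in decisions) / total_weight


# Each unit reports two numbers instead of its full data set.
local_decisions = [
    (0.8, 2.0),  # e.g. a hospital's estimate, weighted as more reliable
    (0.2, 2.0),  # e.g. a disease control center's estimate
    (0.5, 1.0),  # e.g. a community population source
]
regional_risk = fuse_decisions(local_decisions)
```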
A disease risk prediction model is a mathematical formula used to estimate the probability that a particular individual currently has a disease or will experience a given outcome in the future. Establishing a disease risk prediction model is a complex systems engineering task involving many stages: the research question, the dataset, the variables, the model and the reporting of results.
The modeling process comprises real-world data acquisition and aggregation, big data management, disease risk model construction and model utilization, specifically as follows:
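One common form of such a formula is the logistic model, which maps a weighted sum of risk factors to a probability between 0 and 1. The sketch below is an assumed simplification for illustration: the patent selects model parameters with a deep neural network, and the feature names and weights here are invented placeholders.

```python
import math


def risk_probability(features, weights, bias=0.0):
    """Logistic risk model: p = 1 / (1 + exp(-(bias + sum(w_i * x_i))))."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for the three key factors named in S5.
weights = {"risk_site": 1.2, "risk_group": 0.8, "risk_behavior": 1.5}

baseline = risk_probability(
    {"risk_site": 0.0, "risk_group": 0.0, "risk_behavior": 0.0}, weights)
exposed = risk_probability(
    {"risk_site": 1.0, "risk_group": 0.0, "risk_behavior": 1.0}, weights)
```

With a zero bias and no risk factors present, the model returns 0.5; adding positively weighted factors raises the estimate.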
Data collection and aggregation are based on platform data integration. With servers as the basic hardware platform, cluster, distributed storage, distributed computing and ETL (Extract-Transform-Load) technologies are adopted, and data collection standards and processing flows are formulated. Structured data is extracted and loaded into the warehouse, and unstructured data is converted into structured form using Natural Language Processing (NLP). The structured data includes not only patients' basic information, medical record information, disease course information, medical order information, inspection information, imaging information and nursing information, but also personal electronic medical files, information from the monitoring systems of each main infection risk scenario, information from the infectious disease and public health emergency network direct reporting system, nucleic acid test information, population information system data and emergency material information. This enables data storage and sharing and provides more refined and precise support for different needs.
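The extract-transform-load step can be sketched as mapping each source's own field names onto one shared schema and tagging every row with its origin. The source names, field names and the `FIELD_MAPS` table below are hypothetical; a real deployment would follow the collection standards the text refers to.

```python
# Per-source mapping from native field names to a shared schema (assumed).
FIELD_MAPS = {
    "hospital": {"pid": "person_id", "dx": "diagnosis", "visit_dt": "event_date"},
    "cdc": {"case_person": "person_id", "disease": "diagnosis",
            "report_date": "event_date"},
}


def transform(record, source):
    """Rename a source record's fields to the shared schema and tag its origin."""
    mapping = FIELD_MAPS[source]
    row = {target: record.get(native) for native, target in mapping.items()}
    row["source"] = source  # feature tag used later for association and fusion
    return row


def etl(batches):
    """Extract records per source, transform them, and load into one list."""
    warehouse = []
    for source, records in batches.items():
        warehouse.extend(transform(r, source) for r in records)
    return warehouse


rows = etl({
    "hospital": [{"pid": "p1", "dx": "influenza", "visit_dt": "2021-01-05"}],
    "cdc": [{"case_person": "p1", "disease": "influenza",
             "report_date": "2021-01-06"}],
})
```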
An infectious disease big data center is established in each relevant unit to clean, process and standardize the collected and aggregated data. This mainly covers data preparation, cleaning workflow control, cleaning quality control and cleaning process management. Using standard processes and a rule base, unified and configurable processing for data conversion, cleaning, comparison, association and fusion is built on a process engine, so that massive, heterogeneous, discrete data resources are processed into shareable data that is easy to analyze and use. By deploying a big data computing framework together with various algorithm libraries, functions such as big data storage and access, distributed computing task scheduling, deep search over multi-dimensional indexed data and full-text retrieval are implemented. A distributed parallel computing architecture is established and a server cluster deployed, providing horizontal scalability: computing and storage resources can be added or removed dynamically, and PB-scale offline and online computing is supported. The deployment includes the non-relational database HBase, the data warehouse Hive, the data transfer tool Sqoop, the machine learning library Mahout, the coordination service ZooKeeper and the management tool Ambari, or alternatively other big data computing frameworks such as MapReduce, Spark and Tez, along with the search engine Elasticsearch for full-text retrieval, structured retrieval and analysis.
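The idea of configurable, rule-base-driven cleaning can be sketched as an ordered list of small rule functions applied to every row, followed by de-duplication. This is an illustrative single-machine sketch under assumed rule names; it does not represent the process engine or the distributed frameworks named above.

```python
def strip_whitespace(row):
    """Rule: trim surrounding whitespace from every string value."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def drop_if_missing(field):
    """Rule factory: discard rows whose `field` is absent or empty."""
    def rule(row):
        return row if row.get(field) not in (None, "") else None
    return rule


def clean(rows, rules):
    """Apply each rule in order, then drop exact duplicates."""
    cleaned, seen = [], set()
    for row in rows:
        for rule in rules:
            row = rule(row)
            if row is None:  # a rule rejected this row
                break
        if row is None:
            continue
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            cleaned.append(row)
    return cleaned


raw = [
    {"person_id": " p1 ", "diagnosis": "influenza"},
    {"person_id": "p1", "diagnosis": "influenza"},   # duplicate after trimming
    {"person_id": "", "diagnosis": "influenza"},     # missing identifier
]
result = clean(raw, [strip_whitespace, drop_if_missing("person_id")])
```

Because the rules are plain entries in a list, changing the cleaning behavior means editing configuration rather than the `clean` loop itself.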
The knowledge graph is a structured semantic knowledge base that describes concepts in the physical world and their mutual relations in symbolic form. Entities are connected to one another through relations to form a net-like knowledge structure, which gives the knowledge base a fast retrieval entry point and basic reasoning ability. Entities are extracted from the knowledge base data along the relevant dimensions, and techniques such as knowledge graphs, word cloud analysis and text reasoning are used to improve the machine's text understanding, thereby realizing knowledge visualization and graph construction. Based on the knowledge graph, multi-channel and multi-dimensional data services are provided for different users: data services such as model management, intelligent discovery, model exploration, data exploration and data subscription for general users, and intelligent data services such as mining analysis and expert modeling for professionals.
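A knowledge graph of this kind is often stored as subject-relation-object triples. The sketch below is an assumption-laden illustration, not the patent's data model: the class name, the relation names (`is_a`, `has_symptom`, `transmitted_in`) and the sample entities are invented for the example. It shows a fast retrieval entry point (an index keyed on subject and relation) and one form of basic reasoning (following `is_a` links transitively).

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny triple store: entities linked by named relations."""

    def __init__(self):
        # Index keyed on (subject, relation) so lookups do not scan all triples.
        self._out = defaultdict(set)

    def add(self, subject, relation, obj):
        self._out[(subject, relation)].add(obj)

    def objects(self, subject, relation):
        """Retrieval: all objects linked to `subject` by `relation`."""
        return set(self._out.get((subject, relation), set()))

    def ancestors(self, entity, relation="is_a"):
        """Basic reasoning: follow `relation` transitively from `entity`."""
        found, stack = set(), [entity]
        while stack:
            node = stack.pop()
            for parent in self.objects(node, relation):
                if parent not in found:
                    found.add(parent)
                    stack.append(parent)
        return found


kg = KnowledgeGraph()
kg.add("influenza", "is_a", "respiratory infectious disease")
kg.add("respiratory infectious disease", "is_a", "infectious disease")
kg.add("influenza", "has_symptom", "fever")
kg.add("influenza", "transmitted_in", "public place")

print(kg.objects("influenza", "has_symptom"))
print(kg.ancestors("influenza"))
```

The second query returns a fact that was never stored directly (influenza is an infectious disease), which is the kind of inference a net-like structure makes cheap.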
Model management mainly consists of editing and processing entities and relations. Intelligent discovery infers the relations among physical models in reverse from the data sources configured in the system, using meta-information such as logs, and automatically generates a semantic-layer business view after several heterogeneous physical models have been normalized to the same entity. Model exploration supports keyword search for entities, relations and the like; search results can be dragged onto a canvas to explore the relations among entities, so that a user who understands the business model can also see the underlying physical model and how the physical tables are produced. Data exploration searches the business model view in a question-and-answer manner: label conditions are set on any node of a path and the corresponding answers are returned on the other nodes, so that the user can fully understand the business relations in the data. Data subscription serves the needs of external platforms for the various types of data held by the platform: based on the permissions granted to different users and the open data content of the data resource directory service, a data subscription and unsubscription process is provided to external users, and final data delivery is completed through the resource bus service.
Decision-layer fusion is performed on the basis of the structured semantic knowledge base to construct the sudden respiratory infectious disease risk model. Deep learning is used to build deep hierarchical features, learn data representations automatically and capture dependencies in the data effectively; machine learning algorithms based on deep neural networks mine the information in the data to predict clinical endpoint events, including disease diagnosis, mortality, length of hospital stay and unplanned readmission. Disease risk prediction comprises the following steps. Construction of the risk prediction population: for the selected risk outcome of a given disease, the relevant data resources are integrated and cleaned, and a cohort is constructed for data analysis. Risk factor identification: candidate features are constructed automatically and potential risk factors are screened through data analysis. Prediction model construction: a prediction model is built with machine learning methods, and cross-validation is used to evaluate the model and improve its predictive performance. Risk factor assessment: the contribution and importance of each risk or protective factor to the target outcome are evaluated.
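The model-building and cross-validation steps can be sketched in a few lines. This is an illustration under stated assumptions: the cohort is synthetic, the two features (`age`, `exposure`) are hypothetical, and a plain logistic regression trained by gradient descent stands in for the deep neural network the patent describes, purely to keep the example short and dependency-free.

```python
import math
import random


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Predicted probability of the outcome for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def k_fold_accuracy(X, y, k=5):
    """Train on k-1 folds, score on the held-out fold, repeat for every fold."""
    idx = list(range(len(X)))
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        w, b = train_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum((predict(w, b, X[i]) >= 0.5) == bool(y[i]) for i in fold)
        scores.append(correct / len(fold))
    return scores


# Synthetic cohort (assumption): outcome depends on two normalized features plus noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    age, exposure = rng.random(), rng.random()
    score = 0.3 * age + 0.7 * exposure + rng.gauss(0, 0.1)
    X.append([age, exposure])
    y.append(1 if score > 0.5 else 0)

scores = k_fold_accuracy(X, y, k=5)
print("fold accuracies:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Reporting the per-fold scores rather than a single number makes it easier to see whether performance is stable across subsets of the cohort.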
Analysis of disease risk and protective factors comprises the following. Model feature importance ranking: an attribution algorithm is used to obtain the Shapley value of each risk factor, giving a more objective and accurate ranking. Factor attribute identification: the odds ratio (OR) of each factor is calculated to determine whether it is a risk factor or a protective factor. Quantification of the factor-outcome relationship: a nomogram is used to quantify the degree to which high-risk factors influence the outcome. Disease association analysis finds other diseases and factors highly related to the disease in question, exploring both diseases related by co-occurrence and diseases related by causation.
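The two quantities named above can be computed directly for small cases. In the sketch below, the 2x2 counts, the factor names and the toy risk function are hypothetical values chosen for illustration. The odds ratio uses the standard formula for a 2x2 table, and the Shapley values are computed exactly by averaging marginal contributions over all orderings, which is only feasible for a handful of factors; practical attribution tools approximate this.

```python
from itertools import permutations


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """OR = (a * d) / (b * c) for a 2x2 exposure-by-outcome table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


def classify_factor(or_value):
    """OR > 1 suggests a risk factor, OR < 1 a protective factor."""
    if or_value > 1:
        return "risk factor"
    if or_value < 1:
        return "protective factor"
    return "no association"


def shapley_values(players, value):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical risk function: additive effects plus one interaction term.
BASE = {"close_contact": 0.30, "no_mask": 0.10, "vaccinated": -0.15}


def toy_risk(factors):
    risk = sum(BASE[f] for f in factors)
    if {"close_contact", "no_mask"} <= factors:
        risk += 0.10  # assumed extra risk when both factors are present
    return risk


or_value = odds_ratio(30, 70, 10, 90)
print(round(or_value, 3), classify_factor(or_value))
phi = shapley_values(list(BASE), toy_risk)
print({k: round(v, 3) for k, v in phi.items()})
```

In this toy setup the interaction term is split evenly between the two factors that produce it, and the Shapley values sum to the risk of the full factor set. A real analysis would also report a confidence interval for each OR before labeling a factor.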
Various applications are developed on top of the disease risk prediction model to raise the diagnosis and treatment level of medical staff, assist hospital managers in decision-making, accelerate the translation of research results into practice, and provide precise medical services for patients. These applications include clinical decision support, statistical analysis of large case sets for single diseases, comparison of treatment methods and outcomes, precise diagnosis and personalized treatment, reminders for adverse reactions and errors, health prediction and early warning, refined management decision support, verification of research results, medication analysis support, drug research and development and the like; they also support the prediction, early warning and risk analysis of epidemics.
In regional epidemic prevention and control, a comprehensive picture of the whole epidemic is difficult to obtain from the data source of a single unit. By fusing data sources with different measurement characteristics and different dimensions, multi-source information can be turned into valuable analysis and interpretation of respiratory infectious disease risk. Regional epidemic prevention and control requires additional multi-channel monitoring data sources, but the data formats of the various systems are not uniform. If fusion were performed at the data layer, the field data would have to be transmitted in full and processed centrally, which leads to heavy communication load, high computing cost, long processing time and poor resistance to interference. For region-wide risk prediction of sudden respiratory infectious diseases, decision-level fusion is therefore adopted: it is not constrained by differences in the form of the source data, its fusion mechanism is more flexible, and it can handle competitive, cooperative and complementary fusion requirements.
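Decision-level fusion means that each unit sends only its own risk decision, not its raw data, to the fusion step. A minimal sketch follows; the unit names, the trust weights and the choice of a weighted average are assumptions made for the example, since the patent does not fix a particular combination rule.

```python
def fuse_decisions(local_risks, weights):
    """Weighted average of per-unit risk probabilities.

    Only units that actually reported are used, and their weights are
    renormalized, so a missing source does not block the fused decision.
    """
    if not local_risks:
        raise ValueError("at least one local decision is required")
    total = sum(weights[unit] for unit in local_risks)
    return sum(weights[unit] * p for unit, p in local_risks.items()) / total


# Hypothetical trust weights for each reporting unit.
WEIGHTS = {"hospital": 0.5, "disease_control_center": 0.3, "community": 0.2}

# Each unit computes its risk estimate locally, from data in its own format.
all_units = {"hospital": 0.8, "disease_control_center": 0.6, "community": 0.2}
without_hospital = {"disease_control_center": 0.6, "community": 0.2}

print(round(fuse_decisions(all_units, WEIGHTS), 2))
print(round(fuse_decisions(without_hospital, WEIGHTS), 2))
```

Because only one number per unit crosses the network, the communication load is independent of how much field data each unit holds, which is the advantage over data-layer fusion noted above.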
An Artificial Neural Network (ANN) is a particular type of Machine Learning (ML) algorithm formed by a large number of interconnected nodes. Each node represents a specific output function, called the activation function. Each connection between two nodes carries a weighted value, called the weight, for the signal passing through that connection; the weights are equivalent to the memory of the artificial neural network. The output of the network depends on its connection pattern, its weight values and its activation functions. The deep neural network algorithm is a multi-layer feedforward network trained with the error back-propagation algorithm and is one of the most widely applied artificial neural network models. Deep neural networks can learn and store a large number of input-output mappings without the mathematical equations describing these mappings having to be given in advance. The learning rule uses the steepest descent method: the weights and thresholds of the network are adjusted through back-propagation so as to minimize the sum of squared errors. The deep neural network offers parallel processing, self-organization, self-learning, associative memory and fault tolerance, and in particular it can act as an expert system in the early prevention, diagnosis and prognosis evaluation of diseases.
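The learning rule described above, steepest descent on the sum of squared errors with gradients obtained by back-propagation, can be shown on a very small network. The sketch assumes one hidden layer of sigmoid units, a fixed learning rate and the XOR truth table as training data; these choices are for illustration only and say nothing about the architecture actually used for risk prediction. The bias terms `b1` and `b2` play the role of the thresholds.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Feedforward pass: input -> hidden activations -> single output."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    o = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, o


def sse(data, params):
    """Half the sum of squared errors over the data set."""
    return 0.5 * sum((t - forward(x, *params)[1]) ** 2 for x, t in data)


def train(data, hidden=3, lr=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    initial_loss = sse(data, (W1, b1, W2, b2))
    for _ in range(epochs):
        gW1 = [[0.0] * n_in for _ in range(hidden)]
        gb1, gW2, gb2 = [0.0] * hidden, [0.0] * hidden, 0.0
        for x, t in data:
            h, o = forward(x, W1, b1, W2, b2)
            delta_o = (o - t) * o * (1 - o)  # error signal at the output node
            for j in range(hidden):
                # Back-propagate the output error through weight W2[j].
                delta_h = delta_o * W2[j] * h[j] * (1 - h[j])
                gW2[j] += delta_o * h[j]
                gb1[j] += delta_h
                for i in range(n_in):
                    gW1[j][i] += delta_h * x[i]
            gb2 += delta_o
        # Steepest descent: move every weight and bias against its gradient.
        for j in range(hidden):
            W2[j] -= lr * gW2[j]
            b1[j] -= lr * gb1[j]
            for i in range(n_in):
                W1[j][i] -= lr * gW1[j][i]
        b2 -= lr * gb2
    return (W1, b1, W2, b2), initial_loss, sse(data, (W1, b1, W2, b2))


XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
params, initial_loss, final_loss = train(XOR)
print("loss before:", initial_loss, "after:", final_loss)
```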
The deep neural network is well suited to multi-source heterogeneous data sets for the following reasons: first, it has advantages in processing multi-dimensional input and output data; second, it can handle arbitrarily complex nonlinear problems and therefore has strong nonlinear mapping capability; third, it can process information or data in parallel, which improves data processing capacity and performance; fourth, it is more adaptable; fifth, it performs better in data fusion.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein, and any reference signs in the claims are not intended to be construed as limiting the claim concerned.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution. The specification is written in this way only for the sake of clarity; those skilled in the art should take the specification as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments that can be understood by those skilled in the art.
Claims (4)
1. A sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control, characterized by comprising the following steps:
S1: raw data extraction, wherein dispersed multi-source heterogeneous data, comprising hospital data, disease control center data, inspection institution data, grassroots community population data and emergency material data, are extracted, integrated and cleaned through big data centers established in the functional units;
S2: data management, namely establishing a comprehensive, dynamic and configurable data access mechanism to meet the requirements of data acquisition, data aggregation, task configuration, task scheduling, data encryption and breakpoint retransmission; establishing a standardized data processing flow, forming a data-content-oriented conversion and processing specification covering data standardization, cleaning, association, comparison and identification, and providing support for organizing the data fusion database; comprehensively establishing a data organization mode of multi-element integration and fusion database construction, and promoting cloud database construction and storage management according to key element classification, wherein the key element classification comprises service type, sensitivity level and privacy content, and the association and fusion of data resources from different sources are realized by means of feature tags and normalized integration;
S3: establishing a knowledge base, namely establishing knowledge graph classification and a multi-channel, multi-dimensional data service mode, providing users with basic data services including query, retrieval, comparison and sorting, and providing professionals with intelligent data services including mining analysis and expert modeling;
S4: decision-layer fusion, wherein rules and knowledge are extracted from the big data center knowledge base of each unit in the region, and semantic fusion of the multi-source heterogeneous data is realized at the knowledge layer, the semantic-layer information fusion being divisible into multi-view-based data fusion, similarity-based data fusion, probabilistic-dependency-based data fusion and transfer-learning-based data fusion;
S5: establishing the risk prediction model, wherein the infection settings of the sudden respiratory infectious disease are divided into 7 types of infection risk scenes, namely families, community residential buildings, workplaces, public places, transportation, medical and health institutions and social welfare institutions, which are further subdivided into 53 types of infection risk areas and 255 infection risk points; based on the key factors in the risk scenes, namely risk sites, risk groups and risk behaviors, the risk prediction model is established from the data produced by the decision-layer fusion, and the optimal parameters of the risk prediction model are selected through a deep neural network.
2. The sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control according to claim 1, characterized in that: the functional units in S1 include hospitals and disease control centers.
3. The sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control according to claim 1, characterized in that: the hospital data in S1 include hospital information system, electronic medical record, examination, pathology, and surgery and anesthesia data.
4. The sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control according to claim 1, characterized in that: the disease control center data in S1 include data from the direct network reporting system for infectious diseases and public health emergencies.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110618120.8A CN113362959A (en) | 2021-06-03 | 2021-06-03 | Sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110618120.8A CN113362959A (en) | 2021-06-03 | 2021-06-03 | Sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113362959A (en) | 2021-09-07 |
Family
ID=77531561
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110618120.8A Pending CN113362959A (en) | 2021-06-03 | 2021-06-03 | Sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113362959A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114067935A (en) * | 2021-11-03 | 2022-02-18 | 广西壮族自治区通信产业服务有限公司技术服务分公司 | Epidemic disease investigation method, system, electronic equipment and storage medium |
CN114203295A (en) * | 2021-11-23 | 2022-03-18 | 国家康复辅具研究中心 | Cerebral apoplexy risk prediction intervention method and system |
CN115798734A (en) * | 2023-01-09 | 2023-03-14 | 杭州杏林信息科技有限公司 | New emergent infectious disease prevention and control method and device based on big data and storage medium |
CN117238452A (en) * | 2023-10-08 | 2023-12-15 | 中世康恺科技有限公司 | Regional medical image cloud and inspection result mutual recognition sharing platform |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110277167A (en) * | 2019-05-31 | 2019-09-24 | 南京邮电大学 | The Chronic Non-Communicable Diseases Risk Forecast System of knowledge based map |
CN112735598A (en) * | 2021-01-21 | 2021-04-30 | 山东健康医疗大数据有限公司 | Method for analyzing and early warning new coronary epidemic and respiratory tract syndrome |
CN112786210A (en) * | 2021-01-15 | 2021-05-11 | 华南师范大学 | Epidemic propagation tracking method and system |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110277167A (en) * | 2019-05-31 | 2019-09-24 | 南京邮电大学 | The Chronic Non-Communicable Diseases Risk Forecast System of knowledge based map |
CN112786210A (en) * | 2021-01-15 | 2021-05-11 | 华南师范大学 | Epidemic propagation tracking method and system |
CN112735598A (en) * | 2021-01-21 | 2021-04-30 | 山东健康医疗大数据有限公司 | Method for analyzing and early warning new coronary epidemic and respiratory tract syndrome |
Non-Patent Citations (1)
Title |
---|
操玉杰 等: "大数据环境下面向决策全流程的应急信息融合研究", 《图书情报知识》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114067935A (en) * | 2021-11-03 | 2022-02-18 | 广西壮族自治区通信产业服务有限公司技术服务分公司 | Epidemic disease investigation method, system, electronic equipment and storage medium |
CN114067935B (en) * | 2021-11-03 | 2022-05-20 | 广西壮族自治区通信产业服务有限公司技术服务分公司 | Epidemic disease investigation method, system, electronic equipment and storage medium |
CN114203295A (en) * | 2021-11-23 | 2022-03-18 | 国家康复辅具研究中心 | Cerebral apoplexy risk prediction intervention method and system |
CN114203295B (en) * | 2021-11-23 | 2022-05-20 | 国家康复辅具研究中心 | Cerebral apoplexy risk prediction intervention method and system |
CN115798734A (en) * | 2023-01-09 | 2023-03-14 | 杭州杏林信息科技有限公司 | New emergent infectious disease prevention and control method and device based on big data and storage medium |
CN117238452A (en) * | 2023-10-08 | 2023-12-15 | 中世康恺科技有限公司 | Regional medical image cloud and inspection result mutual recognition sharing platform |
CN117238452B (en) * | 2023-10-08 | 2024-05-17 | 中世康恺科技有限公司 | Regional medical image cloud and inspection result mutual recognition sharing platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mardani et al. | A novel extended approach under hesitant fuzzy sets to design a framework for assessing the key challenges of digital health interventions adoption during the COVID-19 outbreak | |
CN113362959A (en) | Sudden respiratory infectious disease risk prediction model for regional epidemic prevention and control | |
Fu et al. | Data-driven multiple criteria decision making for diagnosis of thyroid cancer | |
Ju et al. | Emergency alternative evaluation and selection based on ANP, DEMATEL, and TL-TOPSIS | |
US20090299766A1 (en) | System and method for optimizing medical treatment planning and support in difficult situations subject to multiple constraints and uncertainties | |
Qahtan et al. | Review of healthcare industry 4.0 application-based blockchain in terms of security and privacy development attributes: Comprehensive taxonomy, open issues and challenges and recommended solution | |
CN112151130B (en) | Decision support system based on literature retrieval and construction method | |
Dooshima et al. | A predictive model for the risk of mental illness in Nigeria using data mining | |
Zeng et al. | Infectious disease informatics and biosurveillance | |
Abdullayeva | Internet of Things‐based healthcare system on patient demographic data in Health 4.0 | |
Liu et al. | Analysis and research on intelligent manufacturing medical product design and intelligent hospital system dynamics based on machine learning under big data | |
Abbas et al. | Fused weighted federated deep extreme machine learning based on intelligent lung cancer disease prediction model for healthcare 5.0 | |
Yuan | GIS research to address tensions in geography | |
Chen et al. | Factors affecting the use of blockchain technology in humanitarian supply chain: a novel fuzzy large-scale group-DEMATEL | |
Yothapakdee et al. | Improving the efficiency of machine learning models for predicting blood glucose levels and diabetes risk | |
CN113628744A (en) | Quantitative evaluation system and method for body health | |
Comito et al. | Exploiting social media to enhance clinical decision support | |
Zhang et al. | Three‐Way Group Decisions with Incomplete Spherical Fuzzy Information for Treating Parkinson’s Disease Using IoMT Devices | |
Ahmad et al. | Implementation of fusion and filtering techniques in IoT data processing: a case study of smart healthcare | |
Li et al. | Research on public health crisis early warning system based on context awareness | |
Zamora et al. | Characterizing chronic disease and polymedication prescription patterns from electronic health records | |
Thilagavathi et al. | Analysis of Artificial Intelligence in Medical Sectors | |
Zhao et al. | Logistic regression analysis of targeted poverty alleviation with big data in mobile network | |
Pati et al. | IFCnCov: An IoT‐based smart diagnostic architecture for COVID‐19 | |
Dutta et al. | Artificial Intelligence and Big Data Analytics in Healthcare for Predictive Analysis and Future Prospects |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210907 |