CN116504418B - Medical epidemic situation prevention and control history data based collection method
- Publication number
- CN116504418B CN202310786205.6A
- Authority
- CN
- China
- Prior art keywords
- information matrix
- data
- information
- matrix set
- matrixes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/80—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to the field of medical epidemic prevention and control, and in particular to a method for collecting medical epidemic prevention and control historical data, comprising the following steps: step S1, searching a first data storage module and a second data storage module separately according to keywords to obtain a first associated data set retrieved from the first data storage module and a second associated data set retrieved from the second data storage module; step S2, selecting a reference data set, integrating the first associated data set and the second associated data set separately to obtain information matrices, and screening the information matrices to obtain a first reference information matrix set and a second reference information matrix set; step S3, integrating the second reference information matrix set with the information matrices in the first reference information matrix set to obtain a standard reference information matrix set. The invention can collect and integrate historical epidemic data and evaluate its reference value.
Description
Technical Field
The invention relates to the field of medical epidemic prevention and control, and in particular to a method for collecting medical epidemic prevention and control historical data.
Background
With the frequent occurrence of global public health events, medical epidemic prevention and control has become a major challenge for all countries. To improve the effectiveness of epidemic prevention and control, scientists need to learn the patterns of disease transmission from historical epidemic data. However, existing methods for collecting, organizing, and storing medical epidemic prevention and control historical data suffer from limitations such as untimely data updates and a lack of completeness and accuracy, which severely restrict epidemic prevention and control work. A method for collecting such historical data quickly and efficiently is therefore urgently needed to make epidemic prevention and control work more scientific and accurate.
The prior art discloses a method for constructing an infectious disease prediction model. Based on a search-engine big data platform, the method collects data on Internet user groups searching for epidemic-related keywords, and also collects multiple types of climate variable data from a meteorological platform for infectious disease prediction. It uses a time series cross-validation (TSCV) algorithm for model validation and outputs variables such as the number of newly confirmed cases in a future period, thereby achieving accurate prediction of infectious disease epidemics. However, it cannot use historical epidemic data as reference data for effectively analyzing an epidemic that is about to occur or has already occurred.
Disclosure of Invention
Therefore, the invention provides a method for collecting medical epidemic prevention and control historical data, which addresses the problem that historical epidemic data cannot be effectively collected, integrated, and evaluated for reference value.
To achieve the above purpose, the invention provides a method for collecting medical epidemic prevention and control historical data, comprising the following steps:
step S1, searching, according to keywords, a first data storage module that stores ancient epidemic-related historical data and a second data storage module that stores near-modern epidemic prevention and control data, to obtain a first associated data set retrieved from the first data storage module and a second associated data set retrieved from the second data storage module;
step S2, selecting a reference data set according to the ratio of the data volume of the second associated data set to the data volume of the first associated data set; when the first associated data set and the second associated data set are both reference data sets, integrating the groups of data of the first associated data set and of the second associated data set by disease type to obtain a first information matrix set and a second information matrix set respectively; obtaining a first reference information matrix set according to the result of matching the first information matrix set against the second information matrix set; judging, according to the number of information matrices in the first reference information matrix set, whether to screen the information matrices in the second information matrix set; judging whether to retain an information matrix in the second information matrix set according to its data volume, the latest update time of a group of its data, and the average integrity of its groups of data; and taking the information matrices retained in the second information matrix set as a second reference information matrix set;
step S3, integrating the second reference information matrix set with the information matrices in the first reference information matrix set by disease type to obtain a standard reference information matrix set; evaluating the reference value of each information matrix in the standard reference information matrix set according to the historical time at which each type of disease occurred; setting a feedback order for the information matrices in the standard reference information matrix set according to their reference-value evaluation results; and labeling each information matrix in the standard reference information matrix set with its evaluation result for feedback to an external instruction source.
Further, in step S2,
if the ratio of the data volume of the second associated data set to the data volume of the first associated data set is equal to zero, the screening unit sets the first associated data set as the reference data set;
if the ratio of the data volume of the second associated data set to the data volume of the first associated data set is greater than or equal to a preset standard duty ratio, the screening unit sets the second associated data set as the reference data set;
the integration unit integrates the groups of data in the reference data set by disease type to obtain a plurality of information matrices, and the information matrices are set as the standard reference information matrix set.
Further, in step S2, if the ratio of the data volume of the second associated data set to the data volume of the first associated data set is less than the preset standard duty ratio and not equal to zero, the screening unit sets both the first associated data set and the second associated data set as reference data sets.
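The three ratio cases above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `ReferenceChoice` and `select_reference` are hypothetical, the default ratio of 1 follows the preferred value given later in the embodiment, and the handling of an empty first associated data set (where the ratio is undefined) is an assumption, since the text does not address it.

```python
from enum import Enum


class ReferenceChoice(Enum):
    FIRST_ONLY = "first"    # ancient data alone is the reference data set
    SECOND_ONLY = "second"  # near-modern data alone is the reference data set
    BOTH = "both"           # both associated data sets are reference data sets
    NONE = "none"           # both searches returned nothing


def select_reference(first_count: int, second_count: int,
                     standard_ratio: float = 1.0) -> ReferenceChoice:
    """Pick the reference data set from the two group counts ("data volumes")."""
    if first_count == 0 and second_count == 0:
        return ReferenceChoice.NONE
    if second_count == 0:
        # Ratio equals zero: only the first associated data set has results.
        return ReferenceChoice.FIRST_ONLY
    if first_count == 0:
        # Assumption: the ratio is undefined here; treat it as exceeding the threshold.
        return ReferenceChoice.SECOND_ONLY
    ratio = second_count / first_count
    if ratio >= standard_ratio:
        return ReferenceChoice.SECOND_ONLY
    return ReferenceChoice.BOTH
```

For example, `select_reference(5, 2)` selects both data sets because the ratio 0.4 is nonzero and below the threshold.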
Further, the integration unit integrates the groups of data of the first associated data set by disease type to obtain a plurality of information matrices, recorded as the first information matrix set, and integrates the groups of data of the second associated data set to obtain a plurality of information matrices, recorded as the second information matrix set; the matching unit matches each information matrix in the first information matrix set against each information matrix in the second information matrix set by disease type, marking the information matrices that can be matched as matchable information matrices and those that cannot as unmatchable information matrices; and the screening unit extracts the matchable information matrices in the first information matrix set as the first reference information matrix set.
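The integration and matching described above might look like the sketch below. It assumes, purely for illustration, that each group of data is a dictionary carrying a `disease_type` key and that an information matrix is the list of groups sharing one disease type; the function names are hypothetical and the patent does not prescribe a data layout.

```python
from collections import defaultdict


def integrate_by_disease(groups):
    """Group records by disease type; each list of records stands in for one information matrix."""
    matrices = defaultdict(list)
    for group in groups:
        matrices[group["disease_type"]].append(group)
    return dict(matrices)


def first_reference_set(first_matrices, second_matrices):
    """Keep the matrices of the first set whose disease type also appears in the second set."""
    return {
        disease: matrix
        for disease, matrix in first_matrices.items()
        if disease in second_matrices  # "matchable" information matrix
    }


# Hypothetical records for two placeholder disease types.
first = integrate_by_disease([
    {"disease_type": "disease_x", "source": "ancient"},
    {"disease_type": "disease_y", "source": "ancient"},
])
second = integrate_by_disease([
    {"disease_type": "disease_x", "source": "near-modern"},
])
matched = first_reference_set(first, second)  # only "disease_x" is matchable
```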
Further, if the number of information matrices in the first reference information matrix set is less than a preset reference matrix number, the screening unit retains all information matrices in the second information matrix set as the second reference information matrix set.
Further, if the number of information matrices in the first reference information matrix set is greater than the preset reference matrix number, the screening unit screens the second information matrix set as follows:
if the data volume of an information matrix A in the second information matrix set is greater than or equal to a preset reference data volume, or the time elapsed between the latest update time of some group of data in information matrix A and the retrieval day is less than a preset duration threshold, the screening unit retains information matrix A.
Further, if the data volume of information matrix A is less than the preset reference data volume and, for every group of data in information matrix A, the time elapsed between its latest update time and the retrieval day is greater than or equal to the preset duration threshold, the screening unit determines whether to retain information matrix A according to the average integrity of the groups of data in information matrix A, wherein:
if the average integrity of the groups of data in information matrix A is greater than or equal to a preset minimum integrity threshold, the screening unit retains information matrix A;
if the average integrity of the groups of data in information matrix A is less than the preset minimum integrity threshold, the screening unit screens out information matrix A.
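The retention rule above can be expressed as a short sketch. The class and field names (`MatrixSummary`, `Thresholds`) are hypothetical, durations are measured in days as an assumption, and the case where the number of matrices exactly equals the preset reference matrix number is not specified in the text; this sketch treats it like the "greater than" case.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MatrixSummary:
    disease_type: str
    group_count: int            # "data volume": number of groups of data
    latest_updates: list[date]  # latest update time of each group
    average_integrity: float


@dataclass
class Thresholds:
    reference_matrix_count: int
    reference_data_volume: int
    max_age_days: int
    min_integrity: float


def retain(matrix: MatrixSummary, retrieval_day: date, t: Thresholds) -> bool:
    """Apply the three checks in order: data volume, recency, average integrity."""
    if matrix.group_count >= t.reference_data_volume:
        return True
    if any((retrieval_day - u).days < t.max_age_days for u in matrix.latest_updates):
        return True
    return matrix.average_integrity >= t.min_integrity


def second_reference_set(second, first_reference_count, retrieval_day, t):
    """Return the matrices of the second set that form the second reference set."""
    if first_reference_count < t.reference_matrix_count:
        return list(second)  # few matches: keep everything
    # Assumption: equality is handled like "greater than".
    return [m for m in second if retain(m, retrieval_day, t)]
```

A matrix is dropped only when it is small, stale in every group, and incomplete on average, which mirrors the "interference item" reasoning in the description.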
Further, the average integrity of the groups of data in information matrix A is equal to the ratio of the sum of the integrity values of the groups of data in information matrix A to the data volume of information matrix A;
the integrity of a group of data a in information matrix A is equal to the ratio of the number of data types contained in data a to the total number of preset data types.
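The two ratios defined above can be computed as in the following sketch. The preset data types listed here are hypothetical examples, a data type counts as "contained" when its value is present and not `None`, and returning 0.0 for an empty matrix is an assumption made only to avoid division by zero.

```python
def group_integrity(group: dict, preset_types: list) -> float:
    """Share of the preset data types that this group of data actually contains."""
    present = sum(1 for name in preset_types if group.get(name) is not None)
    return present / len(preset_types)


def average_integrity(groups: list, preset_types: list) -> float:
    """Sum of group integrities divided by the number of groups (the data volume)."""
    if not groups:
        return 0.0  # assumption: an empty matrix is treated as having zero integrity
    return sum(group_integrity(g, preset_types) for g in groups) / len(groups)


# Hypothetical preset data types and two example groups of data.
PRESET_TYPES = ["onset_time", "location", "case_count", "control_measures"]
half = {"onset_time": "t0", "location": "place"}
full = {name: "value" for name in PRESET_TYPES}
score = average_integrity([half, full], PRESET_TYPES)  # (0.5 + 1.0) / 2
```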
Further, in step S1, if the first associated data set and the second associated data set are both empty sets, the retrieval unit extracts all data of the first data storage module and the second data storage module, and the integration unit integrates the groups of data of the first data storage module by disease type to obtain a plurality of information matrices, recorded as the first information matrix set, and integrates the groups of data of the second data storage module by disease type to obtain a plurality of information matrices, recorded as the second information matrix set;
the matching unit matches each information matrix in the first information matrix set against each information matrix in the second information matrix set by disease type, marking the information matrices that can be matched as matchable information matrices and those that cannot as unmatchable information matrices, and the screening unit extracts the matchable information matrices in the first information matrix set as a third information matrix set;
the interaction unit sets each information matrix in the third information matrix set as the supplementary reference information matrix of its corresponding information matrix in the second information matrix set, and sets the second information matrix set as the standard reference information matrix set.
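The fallback for two empty search results can be sketched as below, assuming each information matrix set is a dictionary keyed by disease type. The function name `link_supplementary` and the result layout are hypothetical; the patent only states that each matched matrix from the first set is attached to its counterpart in the second set.

```python
def link_supplementary(first_matrices: dict, second_matrices: dict) -> dict:
    """Build the standard reference set from the second set, attaching matched first-set matrices."""
    return {
        disease: {
            "matrix": matrix,
            # None when the first set has no matrix for this disease type.
            "supplementary": first_matrices.get(disease),
        }
        for disease, matrix in second_matrices.items()
    }


# Hypothetical placeholder contents for the two storage modules.
linked = link_supplementary(
    {"disease_x": ["ancient record"]},
    {"disease_x": ["near-modern record"], "disease_z": ["near-modern record"]},
)
```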
Further, in step S3,
if the time elapsed between the most recent occurrence of a disease and the retrieval day is less than or equal to a first preset duration, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having high reference value;
if the time elapsed between the most recent occurrence of a disease and the retrieval day is greater than the first preset duration and less than a second preset duration, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having medium reference value;
if the time elapsed between the most recent occurrence of a disease and the retrieval day is greater than or equal to the second preset duration, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having low reference value.
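The three-tier evaluation above, together with the feedback ordering mentioned in step S3, can be sketched as follows. Measuring durations in days and feeding back higher-rated matrices first are assumptions; the text only says the feedback order is set according to the evaluation result. The function names are hypothetical.

```python
from datetime import date


def evaluate(latest_occurrence: date, retrieval_day: date,
             first_days: int, second_days: int) -> str:
    """Rate a disease's information matrix by how long ago the disease last occurred."""
    elapsed = (retrieval_day - latest_occurrence).days
    if elapsed <= first_days:
        return "high"
    if elapsed < second_days:
        return "medium"
    return "low"


def feedback_order(latest_by_disease: dict, retrieval_day: date,
                   first_days: int, second_days: int) -> list:
    """Label every disease with its rating and sort so higher ratings come first (assumed order)."""
    rank = {"high": 0, "medium": 1, "low": 2}
    rated = [
        (disease, evaluate(latest, retrieval_day, first_days, second_days))
        for disease, latest in latest_by_disease.items()
    ]
    return sorted(rated, key=lambda pair: rank[pair[1]])
```

Because `sorted` is stable, diseases with the same rating keep their original relative order.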
Compared with the prior art, the invention has the following beneficial effects. Near-modern historical data are collected preferentially: because the objective environments of near-modern and ancient times differ greatly, if data on the same disease type are recorded in both periods, the near-modern data are preferred as the collected data, so that the data have higher reference value. The groups of data of the first associated data set and of the second associated data set are each integrated by disease type, so that the collected data are classified by disease type and data collection is made orderly. By matching the first information matrix set against the second information matrix set, the invention can pick out the epidemics that occurred in both ancient and near-modern times, reducing interference from invalid information; it can also extract the data with higher reference value from the near-modern epidemic prevention and control data, which reduces the time spent screening for valid information and improves data collection efficiency.
In particular, the invention takes into account that some epidemics occurred in ancient times but not in near-modern times, in which case the second associated data set is an empty set; when the first associated data set contains search results, it is used directly as the reference data set. When the data volume of the second associated data set is larger than that of the first associated data set, it can be determined that the data in the second associated data set are sufficient to serve as reference material on their own, without supplementary reference to the data in the first associated data set.
In particular, when the number of information matrices in the first reference information matrix set is small, the invention retains all information matrices in the second information matrix set as the second reference information matrix set. When the number of information matrices in the first reference information matrix set is large and, for an information matrix in the second information matrix set, the time elapsed between the latest update time of every group of its data and the retrieval day is greater than or equal to the preset duration threshold, it can be determined that the second information matrix set contains too many interference items; these can be excluded by removing the information matrices that have a small data volume and low integrity across their groups of data, thereby simplifying the search result.
In particular, when a group of data contains few data types, it can be determined that the group has low integrity, lacks necessary characteristic data, and has low reference value. When the average integrity of the groups of data in an information matrix is low, screening out that information matrix reduces the time needed to manually screen for valid information and thus improves data collection efficiency.
In particular, the invention evaluates the reference value of each information matrix according to the time between the occurrence of the disease and the retrieval day. When the time elapsed between the most recent occurrence of a disease and the retrieval day is short, it can be determined that the data are recent and highly traceable; owing to the development of statistical technology, more recent data are also more authentic and have higher reference value. Conversely, when that time is extremely long, the data have lower reference value, and the probability of the epidemic occurring is also lower.
Drawings
FIG. 1 is an architecture diagram of a collection system based on medical epidemic prevention and control historical data according to an embodiment of the invention;
FIG. 2 is a flowchart of a method for collecting medical epidemic prevention and control historical data according to an embodiment of the invention;
FIG. 3 is a schematic diagram of the information base of the collection system based on medical epidemic prevention and control historical data according to an embodiment of the invention;
FIG. 4 is a schematic diagram of an information matrix set according to an embodiment of the invention.
Detailed Description
In order that the objects and advantages of the invention will become more apparent, the invention will be further described with reference to the following examples; it should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, but do not indicate or imply that the apparatus or element must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "coupled," and "connected" are to be construed broadly. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical; and it may be direct, indirect through an intermediate medium, or a communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Referring to FIG. 1, which is an architecture diagram of a collection system based on medical epidemic prevention and control historical data according to an embodiment of the invention, the system includes:
an information base, comprising a first data storage module for storing ancient epidemic-related historical data and a second data storage module for storing near-modern epidemic prevention and control data;
a retrieval unit, connected to the information base and configured to search the first data storage module and the second data storage module separately according to keywords, to obtain a first associated data set retrieved from the first data storage module and a second associated data set retrieved from the second data storage module;
an integration unit, connected to the retrieval unit and configured to integrate the groups of data of the first associated data set by disease type to obtain a plurality of information matrices, recorded as a first information matrix set, and to integrate the groups of data of the second associated data set to obtain a plurality of information matrices, recorded as a second information matrix set; and further configured to integrate the second reference information matrix set with the information matrices in the first reference information matrix set by disease type to obtain a standard reference information matrix set;
a matching unit, connected to the integration unit and configured to match each information matrix in the first information matrix set against each information matrix in the second information matrix set by disease type, marking the information matrices that can be matched as matchable information matrices and those that cannot as unmatchable information matrices;
a screening unit, connected to the integration unit and the retrieval unit respectively and configured to select a reference data set according to the ratio of the data volume of the second associated data set to the data volume of the first associated data set; when the first associated data set and the second associated data set both serve as reference data sets, the screening unit extracts the matchable information matrices in the first information matrix set as the first reference information matrix set, judges whether to screen the information matrices in the second information matrix set according to the number of information matrices in the first reference information matrix set, and judges whether to retain an information matrix in the second information matrix set according to its data volume;
an evaluation unit, connected to the integration unit and the screening unit respectively and configured to evaluate the reference value of each information matrix according to the historical time at which each type of disease occurred;
an interaction unit, connected to the evaluation unit and configured to obtain the supplementary reference information matrix set corresponding to each information matrix in the second information matrix set and link it to that information matrix; and, when feeding back to the external instruction source, to set a feedback order for the information matrices according to their reference-value evaluation results and to label each information matrix with its evaluation result.
Referring to FIG. 2, which is a flowchart of a method for collecting medical epidemic prevention and control historical data according to an embodiment of the invention, the method includes:
step S1, searching, according to keywords, a first data storage module that stores ancient epidemic-related historical data and a second data storage module that stores near-modern epidemic prevention and control data, to obtain a first associated data set retrieved from the first data storage module and a second associated data set retrieved from the second data storage module;
step S2, selecting a reference data set according to the ratio of the data volume of the second associated data set to the data volume of the first associated data set; when the first associated data set and the second associated data set are both reference data sets, integrating the groups of data of the first associated data set and of the second associated data set by disease type to obtain a first information matrix set and a second information matrix set respectively; obtaining a first reference information matrix set according to the result of matching the first information matrix set against the second information matrix set; judging, according to the number of information matrices in the first reference information matrix set, whether to screen the information matrices in the second information matrix set; judging whether to retain an information matrix in the second information matrix set according to its data volume, the latest update time of a group of its data, and the average integrity of its groups of data; and taking the information matrices retained in the second information matrix set as a second reference information matrix set;
step S3, integrating the second reference information matrix set with the information matrices in the first reference information matrix set by disease type to obtain a standard reference information matrix set; evaluating the reference value of each information matrix in the standard reference information matrix set according to the historical time at which each type of disease occurred; setting a feedback order for the information matrices in the standard reference information matrix set according to their reference-value evaluation results; and labeling each information matrix in the standard reference information matrix set with its evaluation result for feedback to an external instruction source.
Specifically, in this embodiment, the period before AD 1840 is regarded as ancient, and the period after AD 1840 as near-modern.
Specifically, the invention collects near-modern historical data preferentially: because the objective environments of near-modern and ancient times differ greatly, if data on the same disease type are recorded in both periods, the near-modern data are preferred as the collected data, so that the data have higher reference value. The groups of data of the first associated data set and of the second associated data set are each integrated by disease type, so that the collected data are classified by disease type and data collection is made orderly. By matching the first information matrix set against the second information matrix set, the invention can pick out the epidemics that occurred in both ancient and near-modern times, reducing interference from invalid information; it can also extract the data with higher reference value from the near-modern epidemic prevention and control data, which reduces the time spent screening for valid information and improves data collection efficiency.
In step S2,
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is equal to zero, the screening unit sets the first associated data set as the reference data set;
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is greater than or equal to the preset standard duty ratio, the screening unit sets the second associated data set as the reference data set;
the integration unit integrates the groups of data in the reference data set according to disease type to obtain a plurality of information matrices, and sets these information matrices as the standard reference information matrix set.
Specifically, in this embodiment, the data volume denotes the number of groups of data, where the data covering one epidemic from its outbreak to its end constitute one group of data; the preset standard duty ratio is preferably equal to 1 in this embodiment.
Specifically, the invention takes into account that some epidemics occurred in ancient times but not in near-modern times, in which case the second associated data set is an empty set; when the first associated data set contains search results, it is used directly as the reference data set. With the preferred standard duty ratio of 1, when the data volume in the second associated data set is at least the data volume in the first associated data set, it can be determined that the data in the second associated data set are sufficient to serve as reference material on their own, without additionally referring to the data in the first associated data set.
If the ratio of the data volume in the second associated data set to the data volume in the first associated data set is smaller than the preset standard duty ratio and is not equal to zero, the screening unit sets both the first associated data set and the second associated data set as reference data sets.
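The three ratio cases of step S2 can be sketched as a small selection function. This is only an illustrative sketch, not the patented implementation: the function name, the return labels, and the handling of an empty first set with a non-empty second set (where the ratio is undefined) are assumptions that the text does not specify.

```python
from typing import Sequence, Tuple

STANDARD_DUTY_RATIO = 1.0  # preferred value of the preset standard duty ratio


def select_reference_sets(first: Sequence, second: Sequence,
                          standard_ratio: float = STANDARD_DUTY_RATIO) -> Tuple[str, ...]:
    """Return the labels of the data sets chosen as the reference data set.

    `first` holds the groups retrieved from the ancient-history store and
    `second` holds the groups retrieved from the near-modern store; the
    data volume of a set is its number of groups.
    """
    if not first and not second:
        # Both empty: the description handles this case separately in step S1.
        return ()
    if not second:
        # Ratio equals zero: only ancient records were found.
        return ("first",)
    if not first:
        # Ratio is undefined here; choosing the second set alone is an
        # assumption of this sketch, not something the text states.
        return ("second",)
    ratio = len(second) / len(first)
    if ratio >= standard_ratio:
        return ("second",)
    # 0 < ratio < standard ratio: both sets are used.
    return ("first", "second")
```

For example, three ancient groups and one near-modern group give a ratio of about 0.33, so both sets are kept for the matching stage.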
When the first associated data set and the second associated data set are both used as reference data sets, the integration unit integrates the groups of data of the first associated data set according to disease type to obtain a plurality of information matrices, recorded as the first information matrix set, and integrates the groups of data of the second associated data set according to disease type to obtain a plurality of information matrices, recorded as the second information matrix set. The matching unit matches each information matrix in the first information matrix set against the information matrices in the second information matrix set according to disease type, marking those for which a match is found as matchable information matrices and those for which no match is found as unmatchable information matrices. The screening unit extracts the matchable information matrices in the first information matrix set as the first reference information matrix set.
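A minimal sketch of the integration and matching just described, assuming that each group of data is a dictionary with a hypothetical `disease_type` field and that an information matrix is simply the list of groups sharing one disease type. The patent does not define the internal layout of an information matrix, so these structures are illustrative only.

```python
from collections import defaultdict
from typing import Dict, List

Group = Dict[str, str]              # one group of data; field names are hypothetical
MatrixSet = Dict[str, List[Group]]  # disease type -> information matrix


def integrate_by_disease(groups: List[Group]) -> MatrixSet:
    """Integrate groups of data into one information matrix per disease type."""
    matrices = defaultdict(list)
    for group in groups:
        matrices[group["disease_type"]].append(group)
    return dict(matrices)


def first_reference_set(first_set: MatrixSet, second_set: MatrixSet) -> MatrixSet:
    """Keep the matrices of the first set whose disease type also occurs in the second set."""
    return {disease: matrix for disease, matrix in first_set.items()
            if disease in second_set}
```

Matrices of the first set whose disease type has no counterpart in the second set are the unmatchable ones and are simply left out of the result.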
When the first reference information matrix set is acquired, the screening unit judges whether to screen each information matrix in the second information matrix set according to the number of the information matrices in the first reference information matrix set, wherein,
if the number of the information matrixes in the first reference information matrix set is smaller than the preset reference matrix number, the screening unit reserves all the information matrixes in the second information matrix set as the second reference information matrix set.
Specifically, the preset reference matrix number is preferably 10 in this embodiment.
If the number of information matrices in the first reference information matrix set is greater than the preset reference matrix number, the screening unit screens the second information matrix set as follows:
if the data volume of an information matrix A in the second information matrix set is greater than or equal to the preset reference data volume, or the interval between the latest update time of at least one group of data in the information matrix A and the retrieval date is shorter than the preset time length threshold, the screening unit retains the information matrix A.
Specifically, the preset time length threshold is preferably 250 years in this embodiment, and the preset reference data volume is preferably 2 groups.
If the data volume of an information matrix A in the second information matrix set is smaller than the preset reference data volume, and the interval between the latest update time and the retrieval date is greater than or equal to the preset time length threshold for every group of data in the information matrix A, the screening unit determines whether to retain the information matrix A according to the average integrity of the groups of data in the information matrix A, wherein,
if the average integrity of the groups of data in the information matrix A is greater than or equal to the preset integrity minimum threshold, the screening unit retains the information matrix A;
if the average integrity of the groups of data in the information matrix A is smaller than the preset integrity minimum threshold, the screening unit screens out the information matrix A;
the screening unit takes the information matrices retained in the second information matrix set as the second reference information matrix set.
Specifically, the preset integrity minimum threshold value is preferably 0.6 in this embodiment.
Specifically, when the number of information matrices in the first reference information matrix set is small, the invention retains all information matrices in the second information matrix set as the second reference information matrix set. When that number is large and, for an information matrix in the second information matrix set, the interval between the latest update time of every group of data and the retrieval date is greater than or equal to the preset time length threshold, it can be determined that the second information matrix set contains too many interference items; these are removed by excluding the information matrices that have a small data volume and low data integrity, which simplifies the search result.
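The screening rules for the second information matrix set can be sketched as follows, using the preferred values of this embodiment. The `InformationMatrix` container and its field names are hypothetical. The text covers the cases where the number of matrices in the first reference information matrix set is smaller than or greater than the preset reference matrix number; treating the equal case like the greater case is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List

# Preferred values stated in this embodiment.
REFERENCE_MATRIX_NUMBER = 10   # preset reference matrix number
REFERENCE_DATA_VOLUME = 2      # preset reference data volume, in groups
TIME_LENGTH_THRESHOLD = 250    # preset time length threshold, in years
INTEGRITY_MIN_THRESHOLD = 0.6  # preset integrity minimum threshold


@dataclass
class InformationMatrix:
    """Hypothetical container holding one entry per group of data."""
    disease_type: str
    years_since_update: List[float]  # latest update to retrieval date, per group
    integrities: List[float]         # integrity of each group, in [0, 1]


def retain(matrix: InformationMatrix) -> bool:
    """Decide whether one matrix of the second information matrix set is kept."""
    if len(matrix.years_since_update) >= REFERENCE_DATA_VOLUME:
        return True  # enough groups of data
    if any(y < TIME_LENGTH_THRESHOLD for y in matrix.years_since_update):
        return True  # at least one group was updated recently enough
    if not matrix.integrities:
        return False  # assumption: a matrix without groups carries no reference value
    average = sum(matrix.integrities) / len(matrix.integrities)
    return average >= INTEGRITY_MIN_THRESHOLD


def second_reference_set(second_set: List[InformationMatrix],
                         first_reference_count: int) -> List[InformationMatrix]:
    """Apply the screening only when the first reference set is large enough."""
    if first_reference_count < REFERENCE_MATRIX_NUMBER:
        return list(second_set)
    return [m for m in second_set if retain(m)]
```

Only a matrix that fails all three checks (few groups, no recent update, low average integrity) is screened out.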
The average integrity of each group of data in the information matrix A is equal to the ratio of the sum of the integrity of each group of data in the information matrix A to the data quantity in the information matrix A;
the integrity of a certain group of data a in the information matrix A is equal to the ratio of the number of data types contained in the data a to the total number of preset data types.
Specifically, a group of data contains multiple data items; the data types include disease type, transmission path, affected area, prevention measures, and the like; a group of data may contain all of the preset data types or may lack some of them.
Specifically, when a group of data contains few data types, the group can be judged to have low integrity: it lacks necessary characteristic data and has low reference value. When the average integrity of the groups of data in an information matrix is low, screening out that information matrix reduces the time spent manually screening for valid information and thereby improves data collection efficiency.
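The two integrity ratios can be illustrated with a short calculation. The sketch assumes that the preset data types are exactly the four named in the text, although the text leaves the list open with "and the like", and that a group of data is a dictionary in which a missing or empty field counts as an absent data type. Both are assumptions for illustration.

```python
from typing import Dict, List

# Assumed list of preset data types; the description leaves the list open-ended.
PRESET_DATA_TYPES = ("disease_type", "transmission_path",
                     "affected_area", "prevention_measures")


def group_integrity(group: Dict[str, str]) -> float:
    """Number of preset data types present in the group / total preset data types."""
    present = sum(1 for t in PRESET_DATA_TYPES if group.get(t) not in (None, ""))
    return present / len(PRESET_DATA_TYPES)


def average_integrity(matrix: List[Dict[str, str]]) -> float:
    """Sum of the group integrities / number of groups (the data volume)."""
    return sum(group_integrity(g) for g in matrix) / len(matrix)
```

A group that records only the disease type and the affected area has an integrity of 2/4 = 0.5; averaged with one complete group, the information matrix has an average integrity of 0.75, which is above the preferred minimum threshold of 0.6.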
In step S1, if the first associated data set and the second associated data set are both empty sets, the retrieval unit extracts all the data of the first data storage module and of the second data storage module respectively; the integration unit integrates the groups of data of the first data storage module according to disease type to obtain a plurality of information matrices, recorded as the first information matrix set, and integrates the groups of data of the second data storage module according to disease type to obtain a plurality of information matrices, recorded as the second information matrix set;
the matching unit matches each information matrix in the first information matrix set against the information matrices in the second information matrix set according to disease type, marks the information matrices for which a match is found as matchable information matrices, and marks the information matrices for which no match is found as unmatchable information matrices, and the screening unit extracts the matchable information matrices in the first information matrix set as a third information matrix set;
the interaction unit sets the information matrices in the third information matrix set as a supplementary reference information matrix set, each associated with its corresponding information matrix in the second information matrix set, and sets the second information matrix set as the standard reference information matrix set.
Specifically, the interaction unit attaches each information matrix of the supplementary reference information matrix set as an additional link to the corresponding information matrix in the standard reference information matrix set. When the external instruction source receives the feedback from the interaction unit, it can obtain the information matrices in the standard reference information matrix set directly and obtain the information matrices of the supplementary reference information matrix set by clicking the additional links.
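The fallback path for an empty search result might be sketched as below. The dictionary-based matrix sets, the function name, and the `supplementary_link` field are all hypothetical; in particular, the "additional link" is modelled here as a direct reference to the ancient matrix, which is only one possible reading of the text.

```python
from typing import Dict, List

MatrixSet = Dict[str, List[dict]]  # disease type -> information matrix


def fallback_feedback(first_set: MatrixSet, second_set: MatrixSet) -> List[dict]:
    """Build the feedback when both associated data sets are empty.

    The second (near-modern) set is the standard reference set; matrices of
    the first (ancient) set that match by disease type form the third set and
    are attached as supplementary references.
    """
    third_set = {d: m for d, m in first_set.items() if d in second_set}
    return [
        {"disease_type": disease,
         "matrix": matrix,
         "supplementary_link": third_set.get(disease)}  # None when unmatched
        for disease, matrix in second_set.items()
    ]
```

Each entry of the result carries the near-modern matrix itself plus, where one exists, the matching ancient matrix that the reader can follow on demand.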
In step S3, if the interval between the latest occurrence of a disease and the retrieval date is less than or equal to the first preset time length, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having high reference value;
if the interval between the latest occurrence of a disease and the retrieval date is longer than the first preset time length and shorter than the second preset time length, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having medium reference value;
and if the interval between the latest occurrence of a disease and the retrieval date is greater than or equal to the second preset time length, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having low reference value.
Specifically, the order of feedback to the external instruction source is set according to the reference value evaluation results of the information matrices in the standard reference information matrix set: information matrices with high reference value come first, those with medium reference value come next, and those with low reference value come last. When several information matrices have the same evaluation result, the information matrix whose corresponding disease has the shorter interval between its latest occurrence and the retrieval date is fed back earlier.
Specifically, the specific values of the first preset time length and the second preset time length are not limited in this embodiment; preferably, the first preset time length is 30 years and the second preset time length is 80 years.
Specifically, the invention evaluates the reference value of each information matrix according to how long before the retrieval date the disease occurred. When the interval between the latest occurrence of a disease and the retrieval date is short, the data can be judged to be recent and readily traceable; moreover, owing to advances in statistical techniques, more recent data are more authentic and therefore have higher reference value. Conversely, when that interval is very long, the data have lower reference value and the probability of that epidemic recurring is also lower.
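The evaluation and feedback ordering of step S3 can be sketched with the preferred time lengths of 30 and 80 years. The function names and the tuple layout are hypothetical, and the interval is assumed to be expressed in years.

```python
from typing import List, Tuple

FIRST_PRESET_YEARS = 30   # preferred first preset time length
SECOND_PRESET_YEARS = 80  # preferred second preset time length
RANK = {"high": 0, "medium": 1, "low": 2}


def evaluate(years_since_last_occurrence: float) -> str:
    """Map the interval since the latest occurrence to a reference value level."""
    if years_since_last_occurrence <= FIRST_PRESET_YEARS:
        return "high"
    if years_since_last_occurrence < SECOND_PRESET_YEARS:
        return "medium"
    return "low"


def feedback_order(matrices: List[Tuple[str, float]]) -> List[Tuple[str, float, str]]:
    """Sort (disease, years) pairs by level, then by the shorter interval."""
    labelled = [(disease, years, evaluate(years)) for disease, years in matrices]
    return sorted(labelled, key=lambda item: (RANK[item[2]], item[1]))
```

Because the levels are monotone in the interval, sorting by the interval alone would give the same order; the explicit level key is kept to mirror the two-stage rule in the text.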
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A method for collecting history data based on medical epidemic prevention and control, characterized by comprising the following steps:
step S1, searching, according to keywords, a first data storage module that stores historical data relating to ancient epidemics and a second data storage module that stores near-modern epidemic prevention and control data, to obtain a first associated data set retrieved from the first data storage module and a second associated data set retrieved from the second data storage module;
step S2, selecting a reference data set according to the ratio of the data volume in the second associated data set to the data volume in the first associated data set; when the first associated data set and the second associated data set are both reference data sets, integrating the groups of data of the first associated data set and of the second associated data set according to disease type to obtain a first information matrix set and a second information matrix set respectively; obtaining a first reference information matrix set according to the result of matching the first information matrix set against the second information matrix set; determining, according to the number of information matrices in the first reference information matrix set, whether to screen the information matrices in the second information matrix set; determining whether to retain an information matrix in the second information matrix set according to the data volume of that information matrix, the latest update time of a group of data of that information matrix, and the average integrity of its groups of data; and taking the information matrices retained in the second information matrix set as a second reference information matrix set;
and step S3, integrating the information matrices of the second reference information matrix set and the first reference information matrix set according to disease type to obtain a standard reference information matrix set; evaluating the reference value of each information matrix in the standard reference information matrix set according to the historical time at which each type of disease occurred; setting the feedback order of the information matrices in the standard reference information matrix set according to their reference value evaluation results; and marking each information matrix in the standard reference information matrix set with its reference value evaluation result for feedback to an external instruction source.
2. The method for collecting history data based on medical epidemic prevention and control according to claim 1, wherein in the step S2,
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is equal to zero, the screening unit sets the first associated data set as the reference data set;
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is greater than or equal to a preset standard duty ratio, the screening unit sets the second associated data set as the reference data set;
the integration unit integrates all groups of data in the reference data set according to the disease type to obtain a plurality of information matrixes, and each information matrix is set as the standard reference information matrix set.
3. The method for collecting history data based on medical epidemic prevention and control according to claim 1, wherein in the step S2, if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is smaller than a preset standard duty ratio and is not equal to zero, the screening unit sets both the first associated data set and the second associated data set as the reference data set.
4. The method for collecting history data based on medical epidemic prevention and control according to claim 3, wherein the integration unit integrates the groups of data of the first associated data set according to disease type to obtain a plurality of information matrices, recorded as the first information matrix set, and integrates the groups of data of the second associated data set according to disease type to obtain a plurality of information matrices, recorded as the second information matrix set; the matching unit matches each information matrix in the first information matrix set against the information matrices in the second information matrix set according to disease type, marks the information matrices for which a match is found as matchable information matrices, and marks the information matrices for which no match is found as unmatchable information matrices; and the screening unit extracts the matchable information matrices in the first information matrix set as the first reference information matrix set.
5. The method for collecting history data based on medical epidemic prevention and control according to claim 4, wherein the screening unit determines whether to screen the information matrices in the second information matrix set according to the number of information matrices in the first reference information matrix set, wherein,
if the number of information matrices in the first reference information matrix set is smaller than a preset reference matrix number, the screening unit retains all information matrices in the second information matrix set as the second reference information matrix set.
6. The method for collecting history data based on medical epidemic prevention and control according to claim 4, wherein, when the number of information matrices in the first reference information matrix set is greater than a preset reference matrix number,
if the data volume of an information matrix A in the second information matrix set is greater than or equal to a preset reference data volume, or the interval between the latest update time of at least one group of data of the information matrix A and the retrieval date is shorter than a preset time length threshold, the screening unit retains the information matrix A.
7. The method for collecting history data based on medical epidemic prevention and control according to claim 6, wherein if the data volume of the information matrix A is smaller than the preset reference data volume, and the interval between the latest update time and the retrieval date is greater than or equal to the preset time length threshold for every group of data of the information matrix A, the screening unit determines whether to retain the information matrix A according to the average integrity of the groups of data in the information matrix A, wherein,
if the average integrity of the groups of data in the information matrix A is greater than or equal to a preset integrity minimum threshold, the screening unit retains the information matrix A;
and if the average integrity of the groups of data in the information matrix A is smaller than the preset integrity minimum threshold, the screening unit screens out the information matrix A.
8. The method for collecting history data based on medical epidemic prevention and control according to claim 7, wherein the average integrity of each group of data in the information matrix a is equal to the ratio of the sum of the integrity of each group of data in the information matrix a to the data amount in the information matrix a;
the integrity of a certain group of data a in the information matrix A is equal to the ratio of the number of data types contained in the data a to the total number of preset data types.
9. The method for collecting history data based on medical epidemic prevention and control according to claim 1, wherein in the step S1, if the first associated data set and the second associated data set are both empty sets, the retrieval unit extracts all the data of the first data storage module and of the second data storage module respectively; the integration unit integrates the groups of data of the first data storage module according to disease type to obtain a plurality of information matrices, recorded as the first information matrix set, and integrates the groups of data of the second data storage module according to disease type to obtain a plurality of information matrices, recorded as the second information matrix set;
the matching unit matches each information matrix in the first information matrix set against the information matrices in the second information matrix set according to disease type, marks the information matrices for which a match is found as matchable information matrices, and marks the information matrices for which no match is found as unmatchable information matrices, and the screening unit extracts the matchable information matrices in the first information matrix set as a third information matrix set;
the interaction unit sets the information matrices in the third information matrix set as a supplementary reference information matrix set, each associated with its corresponding information matrix in the second information matrix set, and sets the second information matrix set as the standard reference information matrix set.
10. The method for collecting history data based on medical epidemic prevention and control according to any one of claims 1 to 9, wherein in the step S3,
if the interval between the latest occurrence of a disease and the retrieval date is less than or equal to a first preset time length, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having high reference value;
if the interval between the latest occurrence of a disease and the retrieval date is longer than the first preset time length and shorter than a second preset time length, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having medium reference value;
and if the interval between the latest occurrence of a disease and the retrieval date is greater than or equal to the second preset time length, the evaluation unit evaluates the information matrix corresponding to that disease in the standard reference information matrix set as having low reference value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310786205.6A CN116504418B (en) | 2023-06-30 | 2023-06-30 | Medical epidemic situation prevention and control history data based collection method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116504418A (en) | 2023-07-28 |
CN116504418B (en) | 2023-09-08 |
Family
ID=87323532
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310786205.6A Active CN116504418B (en) | 2023-06-30 | 2023-06-30 | Medical epidemic situation prevention and control history data based collection method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116504418B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10073890B1 (en) * | 2015-08-03 | 2018-09-11 | Marca Research & Development International, Llc | Systems and methods for patent reference comparison in a combined semantical-probabilistic algorithm |
CN114420252A (en) * | 2021-12-10 | 2022-04-29 | 重庆大学附属肿瘤医院 | Method, device and medium for determining intensity modulated radiotherapy plan evaluation parameter matrix |
CN115048571A (en) * | 2022-04-27 | 2022-09-13 | 赵涛 | Online education recommendation management system based on cloud platform |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102412380B1 (en) * | 2021-07-01 | 2022-06-23 | (주)뤼이드 | Method for, device for, and system for evaluating a learning ability of an user based on search information of the user |
Non-Patent Citations (1)
Title |
---|
Formative Evaluation and Complex Health Improvement Initiatives: A Learning System to Improve Theory, Implementation, Support, and Evaluation;V. C. Scott等;《American Journal of Evaluation》;第89-106页 * |
Also Published As
Publication number | Publication date |
---|---|
CN116504418A (en) | 2023-07-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||