CN116504418B - Medical epidemic situation prevention and control history data based collection method - Google Patents

Info

Publication number
CN116504418B
Authority
CN
China
Prior art keywords
information matrix
data
information
matrix set
matrixes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310786205.6A
Other languages
Chinese (zh)
Other versions
CN116504418A (en
Inventor
丁侃
肖永芝
张丽君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INSTITUTE OF CHINESE MEDICAL HISTORY AND LITERATURE CHINA ACADEMY OF CHINESE MEDICAL SCIENCES
Original Assignee
INSTITUTE OF CHINESE MEDICAL HISTORY AND LITERATURE CHINA ACADEMY OF CHINESE MEDICAL SCIENCES
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by INSTITUTE OF CHINESE MEDICAL HISTORY AND LITERATURE CHINA ACADEMY OF CHINESE MEDICAL SCIENCES
Priority to CN202310786205.6A
Publication of CN116504418A
Application granted
Publication of CN116504418B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/80 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the field of medical epidemic prevention and control, and in particular to a method for collecting historical data on medical epidemic prevention and control, comprising the following steps: step S1, searching a first data storage module and a second data storage module by keyword to obtain a first associated data set retrieved from the first data storage module and a second associated data set retrieved from the second data storage module; step S2, selecting a reference data set, integrating the first associated data set and the second associated data set respectively to obtain information matrices, and screening the information matrices respectively to obtain a first reference information matrix set and a second reference information matrix set; step S3, integrating the second reference information matrix set with each information matrix in the first reference information matrix set to obtain a standard reference information matrix set. The invention can collect and integrate historical epidemic data and evaluate its reference value.

Description

Medical epidemic situation prevention and control history data based collection method
Technical Field
The invention relates to the field of medical epidemic prevention and control, in particular to a method for collecting history data based on medical epidemic prevention and control.
Background
With the frequent occurrence of global public health events, medical epidemic prevention and control has become a major challenge for all countries, and scientists need to learn the patterns of disease transmission from historical epidemic data in order to improve prevention and control. However, existing methods for collecting, organizing and storing historical data on medical epidemic prevention and control have limitations such as untimely data updates and a lack of completeness and accuracy, which severely restrict epidemic prevention and control work. A method for collecting such historical data quickly and efficiently is therefore urgently needed to make epidemic prevention and control more scientific and accurate.
An existing invention discloses a method for constructing an infectious disease prediction model. Based on a search-engine big data platform, it collects data on Internet user groups searching for epidemic-related keywords and, at the same time, collects multiple types of climate variable data from a meteorological platform for infectious disease prediction. It validates the model with a time series cross-validation (TSCV) algorithm and outputs variables such as the number of newly confirmed cases in a future time period, achieving accurate prediction of infectious disease epidemics. However, it cannot use historical epidemic data as reference data for effectively analyzing epidemics that are about to occur or have already occurred.
Disclosure of Invention
Therefore, the invention provides a method for collecting historical data on medical epidemic prevention and control, which solves the problem that historical epidemic data cannot be effectively collected, integrated, and evaluated for reference value.
In order to achieve the above purpose, the invention provides a method for collecting history data based on medical epidemic prevention and control, comprising the following steps:
step S1, searching a first data storage module for storing historic epidemic situation related history data and a second data storage module for storing near modern epidemic situation prevention and control data according to keywords to obtain a first associated data set retrieved based on the first data storage module and a second associated data set retrieved based on the second data storage module;
step S2, selecting a reference data set according to the ratio of the data volume in the second associated data set to the data volume in the first associated data set; when the first associated data set and the second associated data set are both reference data sets, integrating each group of data of the first associated data set and of the second associated data set according to disease type to obtain a first information matrix set and a second information matrix set respectively; obtaining the first reference information matrix set according to the result of matching the first information matrix set against the second information matrix set; determining whether to screen the information matrices in the second information matrix set according to the number of information matrices in the first reference information matrix set; determining whether to retain a given information matrix in the second information matrix set according to its data volume, the latest update time of a group of its data, and the average integrity of its groups of data; and taking the information matrices retained in the second information matrix set as the second reference information matrix set;
step S3, integrating the second reference information matrix set with each information matrix in the first reference information matrix set according to disease type to obtain a standard reference information matrix set; evaluating the reference value of each information matrix in the standard reference information matrix set according to the historical time at which each type of disease occurred; setting a feedback order for the information matrices in the standard reference information matrix set according to the evaluation results; and marking each information matrix in the standard reference information matrix set with its evaluation result for feedback to an external instruction source.
Further, in the step S2,
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is equal to zero, the screening unit sets the first associated data set as the reference data set;
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is greater than or equal to a preset standard duty ratio, the screening unit sets the second associated data set as the reference data set;
the integration unit integrates all groups of data in the reference data set according to the disease type to obtain a plurality of information matrixes, and each information matrix is set as the standard reference information matrix set.
Further, in the step S2, if the ratio of the data amount in the second association data set to the data amount in the first association data set is smaller than a preset standard duty ratio and is not equal to zero, the filtering unit sets the first association data set and the second association data set as the reference data set.
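The three cases for selecting the reference data set can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `ReferenceChoice` and `select_reference_sets` are hypothetical, the source does not say what happens when only the first associated data set is empty (the sketch raises an error), and the default ratio of 1 follows the preferred value given in the embodiment.

```python
from enum import Enum


class ReferenceChoice(Enum):
    FIRST_ONLY = "first associated data set"
    SECOND_ONLY = "second associated data set"
    BOTH = "both associated data sets"


def select_reference_sets(first_volume: int, second_volume: int,
                          standard_ratio: float = 1.0) -> ReferenceChoice:
    """Choose the reference data set from the two data volumes (numbers of groups)."""
    if first_volume <= 0:
        # Assumption: the source does not define the ratio for an empty first set.
        raise ValueError("first associated data set must be non-empty")
    ratio = second_volume / first_volume
    if ratio == 0:
        # No near-modern results: fall back to the first (historic) set.
        return ReferenceChoice.FIRST_ONLY
    if ratio >= standard_ratio:
        # Enough near-modern results: use the second set alone.
        return ReferenceChoice.SECOND_ONLY
    # Some, but not enough, near-modern results: use both sets.
    return ReferenceChoice.BOTH
```

For example, 5 groups in the first set and 2 in the second give a ratio of 0.4, so both sets serve as reference data sets.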
Further, the integrating unit integrates each group of data of the first association data set according to the disease type to obtain a plurality of information matrixes and marks the information matrixes as the first information matrix set, integrates each group of data of the second association data set to obtain a plurality of information matrixes and marks the information matrixes as the second information matrix set, the matching unit matches each information matrix in the first information matrix set with each information matrix in the second information matrix set according to the disease type, marks the information matrixes capable of achieving matching as matchable information matrixes, marks the information matrixes incapable of matching as unmatcheable information matrixes, and the screening unit extracts the matchable information matrixes in the first information matrix set as the first reference information matrix set.
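The integration and matching described above might look like the following Python sketch. It assumes that each group of data is a dictionary carrying a hypothetical `disease_type` field and that an information matrix can be represented as the list of groups sharing one disease type; the source does not specify the internal layout of an information matrix, and the sample records are purely illustrative.

```python
from collections import defaultdict


def build_information_matrices(groups):
    """Group records by disease type; each list stands in for one information matrix."""
    matrices = defaultdict(list)
    for group in groups:
        matrices[group["disease_type"]].append(group)
    return dict(matrices)


def extract_first_reference_set(first_matrices, second_matrices):
    """Keep the matrices of the first set whose disease type also appears in the second set."""
    return {disease: matrix for disease, matrix in first_matrices.items()
            if disease in second_matrices}


# Illustrative records only; not data from the source.
first_set = build_information_matrices([
    {"disease_type": "plague", "source": "ancient record 1"},
    {"disease_type": "smallpox", "source": "ancient record 2"},
])
second_set = build_information_matrices([
    {"disease_type": "plague", "source": "modern record 1"},
    {"disease_type": "cholera", "source": "modern record 2"},
])
first_reference_set = extract_first_reference_set(first_set, second_set)
```

Here only the "plague" matrix is matchable, so it alone forms the first reference information matrix set.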
Further, the screening unit determines that,
if the number of information matrices in the first reference information matrix set is smaller than the preset reference matrix number, the screening unit retains all information matrices in the second information matrix set as the second reference information matrix set.
Further, if the number of information matrices in the first reference information matrix set is greater than the preset reference matrix number, the screening unit determines that,
if the data volume of an information matrix A in the second information matrix set is greater than or equal to the preset reference data volume, or the interval between the latest update time of some group of data in the information matrix A and the retrieval day is shorter than a preset duration threshold, the screening unit retains the information matrix A.
Further, if the data volume of the information matrix A is smaller than the preset reference data volume and, for every group of data in the information matrix A, the interval between its latest update time and the retrieval day is greater than or equal to the preset duration threshold, the screening unit determines whether to retain the information matrix A according to the average integrity of the groups of data in the information matrix A, wherein,
if the average integrity of the groups of data in the information matrix A is greater than or equal to a preset minimum integrity threshold, the screening unit retains the information matrix A;
if the average integrity of the groups of data in the information matrix A is smaller than the preset minimum integrity threshold, the screening unit screens out the information matrix A.
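The screening decision for an information matrix A can be sketched as a single Python predicate. The names `MatrixStats` and `keep_matrix`, the use of days as the time unit, and all parameter names are hypothetical. The source leaves the case where the number of first reference matrices equals the preset number unspecified; the sketch treats it like the "greater than" case, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MatrixStats:
    data_volume: int               # number of groups of data in matrix A
    min_days_since_update: int     # shortest update-to-retrieval interval among its groups
    average_integrity: float       # mean integrity of its groups, in [0, 1]


def keep_matrix(stats: MatrixStats, first_reference_count: int,
                reference_matrix_number: int, reference_data_volume: int,
                duration_threshold_days: int, min_integrity: float) -> bool:
    """Return True if matrix A stays in the second reference information matrix set."""
    if first_reference_count < reference_matrix_number:
        # Few matched historic matrices: retain every matrix of the second set.
        return True
    # Assumption: equality with the preset number is screened like "greater than".
    if stats.data_volume >= reference_data_volume:
        return True
    if stats.min_days_since_update < duration_threshold_days:
        # At least one group was updated recently enough.
        return True
    # Small and stale matrix: decide on average integrity.
    return stats.average_integrity >= min_integrity
```

Using the shortest interval among the groups captures both wordings in the text: "some group" is recent exactly when the minimum is below the threshold, and "every group" is stale exactly when the minimum is at or above it.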
Further, the average integrity of each group of data in the information matrix A is equal to the ratio of the sum of the integrity of each group of data in the information matrix A to the data volume in the information matrix A;
the integrity of a certain group of data a in the information matrix A is equal to the ratio of the number of data types contained in the data a to the total number of preset data types.
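As a worked illustration of the two ratios above, the following Python sketch computes the integrity of one group of data and the average integrity of an information matrix. The preset data type names are invented for the example, and treating a missing or empty field as "not contained" is an assumption; the source does not list the preset data types.

```python
def group_integrity(group: dict, preset_types: tuple) -> float:
    """Integrity of one group: present data types divided by the preset total."""
    present = sum(1 for name in preset_types if group.get(name) not in (None, ""))
    return present / len(preset_types)


def average_integrity(matrix: list, preset_types: tuple) -> float:
    """Average integrity: sum of group integrities divided by the number of groups."""
    if not matrix:
        raise ValueError("information matrix must contain at least one group")
    return sum(group_integrity(g, preset_types) for g in matrix) / len(matrix)


# Hypothetical preset data types and records, for illustration only.
PRESET_TYPES = ("symptoms", "region", "measures", "outcome")
matrix_a = [
    {"symptoms": "fever", "region": "coastal town"},
    {"symptoms": "fever", "region": "river port",
     "measures": "quarantine", "outcome": "subsided"},
]
result = average_integrity(matrix_a, PRESET_TYPES)
```

The first group contains 2 of 4 preset types (integrity 0.5), the second contains all 4 (integrity 1.0), so the average integrity is (0.5 + 1.0) / 2 = 0.75.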
Further, in the step S1, if the first association data set and the second association data set are both empty sets, the retrieving unit extracts all the data of the first data storage module and the second data storage module, and the integrating unit integrates each group of data of the first data storage module according to the disease type to obtain a plurality of information matrixes and record the information matrixes as the first information matrix set, and integrates each group of data of the second data storage module according to the disease type to obtain a plurality of information matrixes and record the information matrixes as the second information matrix set;
the matching unit matches each information matrix in the first information matrix set with each information matrix in the second information matrix set according to the disease type, marks the information matrix which can be matched as a matchable information matrix, marks the information matrix which cannot be matched as a non-matchable information matrix, and the screening unit extracts the matchable information matrix in the first information matrix set as a third information matrix set;
the interaction unit sets each information matrix in the third information matrix set as a complementary reference information matrix set of each information matrix corresponding to each information matrix in the second information matrix set, and sets the second information matrix set as the standard reference information matrix set.
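The fallback for the case where both associated data sets are empty can be sketched in Python as below. This is only an illustration under assumptions: records are dictionaries with a hypothetical `disease_type` field, an information matrix is modelled as a list of records, and linking a supplementary reference is modelled as a dictionary lookup (with `None` where no historic counterpart exists).

```python
def group_by_disease(groups):
    """Integrate records into information matrices keyed by disease type."""
    matrices = {}
    for group in groups:
        matrices.setdefault(group["disease_type"], []).append(group)
    return matrices


def fallback_standard_set(first_store, second_store):
    """Use the full contents of both storage modules when the keyword search finds nothing."""
    first_matrices = group_by_disease(first_store)
    second_matrices = group_by_disease(second_store)
    # Third information matrix set: matrices of the first set that match the second set.
    third_matrices = {d: m for d, m in first_matrices.items() if d in second_matrices}
    # Link each matched historic matrix to its near-modern counterpart.
    supplementary = {d: third_matrices.get(d) for d in second_matrices}
    # The second information matrix set becomes the standard reference set.
    return second_matrices, supplementary
```

A caller would receive the second information matrix set as the standard reference information matrix set, together with a mapping from each disease type to its supplementary historic matrix, if any.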
Further, in the step S3,
if the interval between the most recent occurrence of a disease and the retrieval day is less than or equal to the first preset duration, the evaluation unit evaluates the information matrix corresponding to the disease in the standard reference information matrix set as having high reference value;
if the interval between the most recent occurrence of a disease and the retrieval day is longer than the first preset duration and shorter than the second preset duration, the evaluation unit evaluates the information matrix corresponding to the disease in the standard reference information matrix set as having medium reference value;
if the interval between the most recent occurrence of a disease and the retrieval day is greater than or equal to the second preset duration, the evaluation unit evaluates the information matrix corresponding to the disease in the standard reference information matrix set as having low reference value.
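The three-level evaluation above amounts to comparing one interval against two thresholds. A minimal Python sketch follows; the function name, the use of days as the unit, and the labels "high", "medium" and "low" are assumptions, and the source gives no concrete values for the two preset durations.

```python
def evaluate_reference_value(days_since_last_occurrence: int,
                             first_duration_days: int,
                             second_duration_days: int) -> str:
    """Rate an information matrix by how recently its disease last occurred."""
    if first_duration_days >= second_duration_days:
        raise ValueError("first preset duration must be shorter than the second")
    if days_since_last_occurrence <= first_duration_days:
        return "high"
    if days_since_last_occurrence < second_duration_days:
        return "medium"
    return "low"
```

With hypothetical thresholds of 3,650 and 36,500 days, a disease last seen 1,000 days before the retrieval day would be rated "high", one last seen 10,000 days before would be rated "medium", and one last seen 40,000 days before would be rated "low".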
Compared with the prior art, the invention has the following beneficial effects. Near-modern historical materials are collected with priority: because the objective environments of near-modern and ancient times differ greatly, when data on the same disease type are recorded in both periods, the near-modern data are preferred as the collected data, giving the data higher reference value. Each group of data in the first associated data set and in the second associated data set is integrated according to disease type, so the collected data are classified by disease type and data collection is made orderly. By matching the first information matrix set against the second information matrix set, the invention can pick out epidemics that occurred in both ancient and near-modern times, reducing interference from invalid information; it can also extract the data with higher reference value from the near-modern epidemic prevention and control data, reducing the time spent screening for useful information and improving the efficiency of data collection.
In particular, the invention takes into account that some epidemics occurred in ancient times but not in near-modern times, in which case the second associated data set is empty; when the first associated data set contains search results, it is used directly as the reference data set. When the data volume in the second associated data set is larger than the data volume in the first associated data set, it can be determined that the second associated data set alone is sufficient as reference material, without additionally consulting the data in the first associated data set.
In particular, when the number of information matrices in the first reference information matrix set is small, the invention retains all information matrices in the second information matrix set as the second reference information matrix set. When the number of information matrices in the first reference information matrix set is large and, for each group of data of each information matrix in the second information matrix set, the interval between the latest update time and the retrieval day is greater than or equal to the preset duration threshold, it can be determined that the second information matrix set contains too many interference items; these can be excluded by removing information matrices with small data volume and low integrity of their groups of data, thereby simplifying the search results.
In particular, when a group of data contains few data types, the integrity of that group can be judged to be low: it lacks necessary characteristic data and has low reference value. When the average integrity of the groups of data in an information matrix is low, screening out that information matrix reduces the time needed to manually screen for useful information, improving the efficiency of data collection.
In particular, the invention evaluates the reference value of each information matrix according to how long before the retrieval day the disease occurred. When the interval between the most recent occurrence of a disease and the retrieval day is short, the data can be judged to be recent and highly traceable, and, owing to advances in statistical techniques, more recent data are more authentic and of higher reference value. Conversely, when that interval is extremely long, the data have lower reference value, and the probability of the epidemic occurring is also lower.
Drawings
FIG. 1 is a diagram of a collection system architecture based on medical epidemic prevention and control history data in an embodiment of the invention;
FIG. 2 is a flow chart of a method for collecting historic data based on medical epidemic prevention and control in an embodiment of the invention;
FIG. 3 is a schematic diagram of a collection system information base based on medical epidemic prevention and control history data according to an embodiment of the invention;
FIG. 4 is a schematic diagram of an information matrix set according to an embodiment of the invention.
Detailed Description
In order that the objects and advantages of the invention will become more apparent, the invention will be further described with reference to the following examples; it should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are merely for explaining the technical principles of the present invention, and are not intended to limit the scope of the present invention.
It should be noted that, in the description of the present invention, terms such as "upper," "lower," "left," "right," "inner," "outer," and the like indicate directions or positional relationships based on the directions or positional relationships shown in the drawings, which are merely for convenience of description, but do not indicate or imply that the apparatus or element must have a specific orientation, be constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
Furthermore, it should be noted that, in the description of the present invention, unless explicitly specified and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be either fixedly connected, detachably connected, or integrally connected, for example; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention can be understood by those skilled in the art according to the specific circumstances.
Referring to fig. 1, a structure diagram of a collection system based on medical epidemic prevention and control history data according to an embodiment of the present invention includes:
an information base comprising a first data storage module for storing historic epidemic related history data and a second data storage module for storing near modern epidemic prevention and control data;
the retrieval unit is connected with the information base and is used for respectively retrieving the first data storage module and the second data storage module according to the keywords so as to obtain a first associated data set retrieved based on the first data storage module and a second associated data set retrieved based on the second data storage module;
the integration unit is connected with the retrieval unit and is used for integrating all groups of data of the first association data set according to the disease type to obtain a plurality of information matrixes and marking the information matrixes as a first information matrix set, and integrating all groups of data of the second association data set to obtain a plurality of information matrixes and marking the information matrixes as a second information matrix set; the second reference information matrix set is integrated with each group of information matrix in the first reference information matrix set according to the disease type to obtain a standard reference information matrix set;
the matching unit is connected with the integration unit and is used for matching each information matrix in the first information matrix set with each information matrix in the second information matrix set according to the disease type, marking the information matrix capable of realizing matching as a matchable information matrix and marking the information matrix incapable of matching as a non-matchable information matrix;
the screening unit is connected with the integration unit and the retrieval unit respectively, and is used for selecting a reference data set according to the ratio of the data volume in the second associated data set to the data volume in the first associated data set; when the first associated data set and the second associated data set are both used as reference data sets, the screening unit extracts the matchable information matrices in the first information matrix set as the first reference information matrix set, determines whether to screen the information matrices in the second information matrix set according to the number of information matrices in the first reference information matrix set, and determines whether to retain an information matrix in the second information matrix set according to its data volume;
the evaluation unit is respectively connected with the integration unit and the screening unit and is used for evaluating the referential property of each information matrix according to the historical time of occurrence of each type of disease;
the interaction unit is connected with the evaluation unit and is used for acquiring a supplementary reference information matrix set corresponding to each information matrix in the second information matrix set, respectively linking the supplementary reference information matrix set with each information matrix in the second information matrix set, setting a feedback sequence of each information matrix according to the evaluation result of the referential property of each information matrix when the external instruction source is fed back, and marking the referential evaluation result of each information matrix to feed back to the external instruction source.
Referring to fig. 2, a flowchart of a method for collecting history data based on medical epidemic prevention and control according to an embodiment of the present invention includes:
step S1, searching a first data storage module for storing historic epidemic situation related history data and a second data storage module for storing near modern epidemic situation prevention and control data according to keywords to obtain a first associated data set retrieved based on the first data storage module and a second associated data set retrieved based on the second data storage module;
step S2, selecting a reference data set according to the ratio of the data volume in the second associated data set to the data volume in the first associated data set; when the first associated data set and the second associated data set are both reference data sets, integrating each group of data of the first associated data set and of the second associated data set according to disease type to obtain a first information matrix set and a second information matrix set respectively; obtaining the first reference information matrix set according to the result of matching the first information matrix set against the second information matrix set; determining whether to screen the information matrices in the second information matrix set according to the number of information matrices in the first reference information matrix set; determining whether to retain a given information matrix in the second information matrix set according to its data volume, the latest update time of a group of its data, and the average integrity of its groups of data; and taking the information matrices retained in the second information matrix set as the second reference information matrix set;
step S3, integrating the second reference information matrix set with each information matrix in the first reference information matrix set according to disease type to obtain a standard reference information matrix set; evaluating the reference value of each information matrix in the standard reference information matrix set according to the historical time at which each type of disease occurred; setting a feedback order for the information matrices in the standard reference information matrix set according to the evaluation results; and marking each information matrix in the standard reference information matrix set with its evaluation result for feedback to an external instruction source.
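The feedback ordering at the end of step S3 could be sketched in Python as a sort on the evaluation result. The source states only that the feedback order is set according to the evaluation results; returning the highest-rated matrices first, the rating labels, and the function name `feedback_sequence` are all assumptions made for illustration.

```python
# Assumed priority: higher reference value is fed back earlier.
RATING_ORDER = {"high": 0, "medium": 1, "low": 2}


def feedback_sequence(evaluations: dict) -> list:
    """Return (disease type, rating) pairs ordered for feedback, each marked with its rating."""
    # sorted() is stable, so matrices with equal ratings keep their original order.
    return sorted(evaluations.items(), key=lambda item: RATING_ORDER[item[1]])


# Illustrative ratings only; not results from the source.
ordered = feedback_sequence({"plague": "low", "cholera": "high", "influenza": "medium"})
```

With these illustrative ratings, the "cholera" matrix would be fed back first, followed by "influenza" and then "plague".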
Specifically, in this embodiment, the period before AD 1840 is treated as ancient, and the period after AD 1840 is treated as near-modern.
Specifically, the invention collects near-modern historical materials with priority: because the objective environments of near-modern and ancient times differ greatly, when data on the same disease type are recorded in both periods, the near-modern data are preferred as the collected data, giving the data higher reference value. Each group of data in the first associated data set and in the second associated data set is integrated according to disease type, so the collected data are classified by disease type and data collection is made orderly. By matching the first information matrix set against the second information matrix set, the invention can pick out epidemics that occurred in both ancient and near-modern times, reducing interference from invalid information; it can also extract the data with higher reference value from the near-modern epidemic prevention and control data, reducing the time spent screening for useful information and improving the efficiency of data collection.
In the step S2 of the above-mentioned process,
if the ratio of the data volume in the second association data set to the data volume of the first association data set is equal to zero, the screening unit sets the first association data set as a reference data set;
if the ratio of the data volume in the second association data set to the data volume in the first association data set is greater than or equal to the preset standard duty ratio, the screening unit sets the second association data set as a reference data set;
the integration unit integrates all groups of data in the reference data set according to the disease type to obtain a plurality of information matrixes, and each information matrix is set as a standard reference information matrix set.
Specifically, in this embodiment, the data volume is the number of groups of data, and the data covering one epidemic from its outbreak to its end constitute one group of data; the preset standard duty ratio is preferably equal to 1 in this embodiment.
Specifically, the invention takes into account that certain epidemics occurred in ancient times but not in near-modern times, in which case the second association data set is an empty set; when the first association data set nevertheless contains search results, it is used directly as the reference data set. When the data volume in the second association data set reaches or exceeds the data volume in the first association data set, it can be determined that the second association data set is sufficient to serve as reference material on its own, without the data in the first association data set as an auxiliary reference.
If the ratio of the data volume in the second association data set to the data volume in the first association data set is smaller than the preset standard duty ratio and is not equal to zero, the screening unit sets the first association data set and the second association data set as the reference data set.
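The three-way choice of the reference data set in step S2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `select_reference_sets`, the string labels `"first"` and `"second"`, and the handling of an empty first association data set are hypothetical and are not stated in the text. Data volume is taken as the number of groups of data, as in this embodiment.

```python
from typing import Sequence, Tuple


def select_reference_sets(
    first: Sequence[object],
    second: Sequence[object],
    standard_ratio: float = 1.0,  # preferred preset standard duty ratio
) -> Tuple[str, ...]:
    """Return the labels of the association data sets used as the reference data set."""
    if not first and not second:
        # Both sets empty: this case is covered by the fallback of step S1.
        raise ValueError("both association data sets are empty")
    if not first:
        # Assumption (not in the text): the ratio is unbounded, so it is
        # treated as reaching the preset standard duty ratio.
        return ("second",)
    ratio = len(second) / len(first)
    if ratio == 0:
        return ("first",)
    if ratio >= standard_ratio:
        return ("second",)
    # 0 < ratio < standard_ratio: both sets serve as the reference data set.
    return ("first", "second")
```

For example, three ancient groups and one near-modern group give a ratio of 1/3, so both sets are retained and the matching described next is applied.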
When the first association data set and the second association data set are both used as reference data sets, the integration unit integrates all groups of data of the first association data set according to disease type to obtain a plurality of information matrixes and records them as a first information matrix set, and integrates all groups of data of the second association data set according to disease type to obtain a plurality of information matrixes and records them as a second information matrix set. The matching unit matches each information matrix in the first information matrix set with each information matrix in the second information matrix set according to disease type, marks the information matrixes that can be matched as matchable information matrixes, and marks the information matrixes that cannot be matched as unmatchable information matrixes. The screening unit extracts the matchable information matrixes in the first information matrix set as a first reference information matrix set.
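The integration and matching just described can be illustrated with a short sketch. It assumes, purely for illustration, that each group of data is a dictionary with a `disease_type` key and that an information matrix is the list of groups sharing one disease type; the patent does not specify a data layout, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Group = Dict[str, object]          # one group of data (one epidemic record)
MatrixSet = Dict[str, List[Group]]  # disease type -> information matrix


def integrate_by_disease(groups: Iterable[Group]) -> MatrixSet:
    """Integrate groups of data into one information matrix per disease type."""
    matrices = defaultdict(list)
    for group in groups:
        matrices[str(group["disease_type"])].append(group)
    return dict(matrices)


def first_reference_set(first: MatrixSet, second: MatrixSet) -> MatrixSet:
    """Keep the matrices of the first set whose disease type also occurs in the second set."""
    return {disease: matrix for disease, matrix in first.items() if disease in second}
```

Keying the matrices by disease type makes the matchable/unmatchable distinction a simple membership test.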
When the first reference information matrix set is acquired, the screening unit judges whether to screen each information matrix in the second information matrix set according to the number of the information matrices in the first reference information matrix set, wherein,
if the number of the information matrixes in the first reference information matrix set is smaller than the preset reference matrix number, the screening unit reserves all the information matrixes in the second information matrix set as the second reference information matrix set.
Specifically, the present embodiment preferably presets the number of reference matrices to be 10.
If the number of information matrixes in the first reference information matrix set is greater than the preset reference matrix number, the screening unit determines that,
if the data volume of a certain information matrix A in the second information matrix set is greater than or equal to the preset reference data volume, or the time between the latest update time of a certain group of data of the information matrix A and the retrieval day is shorter than the preset time length threshold, the screening unit retains the information matrix A.
Specifically, the preset time length threshold is preferably 250 years in this embodiment, and the preset reference data volume is preferably 2 groups.
If the data volume of a certain information matrix A in the second information matrix set is smaller than the preset reference data volume, and the time between the latest update time of every group of data of the information matrix A and the retrieval day is greater than or equal to the preset time length threshold, the screening unit judges whether to retain the information matrix A according to the average integrity of the groups of data in the information matrix A, wherein,
if the average integrity of each group of data in the information matrix A is greater than or equal to a preset integrity minimum threshold value, the screening unit judges that the information matrix A is reserved;
if the average integrity of each group of data in the information matrix A is smaller than a preset integrity minimum threshold value, the screening unit judges that the information matrix A is screened out;
the screening unit takes the information matrixes retained in the second information matrix set as the second reference information matrix set.
Specifically, the preset integrity minimum threshold value is preferably 0.6 in this embodiment.
Specifically, when the number of information matrixes in the first reference information matrix set is small, the invention retains all the information matrixes in the second information matrix set as the second reference information matrix set. When the number of information matrixes in the first reference information matrix set is large, the second information matrix set is judged to contain too many interference items; these are excluded by screening out the information matrixes that have a small data volume, whose groups of data were all last updated at least the preset time length threshold before the retrieval day, and whose groups of data have a low average integrity, which simplifies the search result.
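The screening rules for the second information matrix set can be sketched as below. The sketch is hedged in several ways: the field name `last_updated_year`, the list of preset data types, the use of whole years, and the function name are all assumptions made for illustration. The text also does not say what happens when the number of information matrixes equals the preset reference matrix number; this sketch treats that case like the "greater than" case, which is an assumption.

```python
from typing import Dict, List, Sequence

Group = Dict[str, object]
MatrixSet = Dict[str, List[Group]]

# Hypothetical preset data types, taken from the examples named in this embodiment.
PRESET_TYPES = ("disease_type", "transmission_path", "affected_area", "preventive_measures")


def screen_second_set(
    second: MatrixSet,
    n_first_reference: int,
    retrieval_year: int,
    *,
    reference_matrix_count: int = 10,     # preferred preset reference matrix number
    reference_data_volume: int = 2,       # preferred preset reference data volume
    duration_threshold_years: int = 250,  # preferred preset time length threshold
    min_integrity: float = 0.6,           # preferred preset integrity minimum threshold
    preset_types: Sequence[str] = PRESET_TYPES,
) -> MatrixSet:
    """Return the second reference information matrix set."""
    if n_first_reference < reference_matrix_count:
        return dict(second)  # few matches: keep every matrix
    kept: MatrixSet = {}
    for disease, groups in second.items():
        if not groups:
            continue  # an empty matrix carries no data
        large = len(groups) >= reference_data_volume
        recent = any(
            retrieval_year - int(g["last_updated_year"]) < duration_threshold_years
            for g in groups
        )
        if large or recent:
            kept[disease] = groups
            continue
        # Small and old: decide by the average integrity of the groups.
        avg = sum(
            sum(1 for t in preset_types if g.get(t)) / len(preset_types) for g in groups
        ) / len(groups)
        if avg >= min_integrity:
            kept[disease] = groups
    return kept
```

The two retention conditions are checked first because they are cheap; the integrity calculation only runs for the matrixes that fail both.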
The average integrity of each group of data in the information matrix A is equal to the ratio of the sum of the integrity of each group of data in the information matrix A to the data quantity in the information matrix A;
the integrity of a certain group of data a in the information matrix A is equal to the ratio of the number of data types contained in the data a to the total number of preset data types.
Specifically, a group of data includes a plurality of data items; the data types include disease type, transmission path, affected area, preventive measures, and the like; and a group of data may contain all the preset data types or may lack some of them.
Specifically, when the number of data types of a certain group of data is small, the completeness of the group of data can be judged to be low, necessary characteristic data is lacked, the referential is low, and when the average completeness of each group of data of a certain information matrix is low, the time for manually screening effective information can be reduced by screening the information matrix, so that the efficiency of data collection is improved.
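The two integrity definitions above amount to a ratio and a mean, as the following sketch shows. The preset data types listed here are only the examples named in this embodiment, and treating an empty or missing field as "not contained" is an assumption; the function names are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical preset data types, based on the examples given in this embodiment.
PRESET_TYPES = ("disease_type", "transmission_path", "affected_area", "preventive_measures")


def integrity(group: Dict[str, object], preset_types: Sequence[str] = PRESET_TYPES) -> float:
    """Number of preset data types present in the group divided by the total number of preset types."""
    present = sum(1 for t in preset_types if group.get(t) not in (None, ""))
    return present / len(preset_types)


def average_integrity(
    matrix: List[Dict[str, object]], preset_types: Sequence[str] = PRESET_TYPES
) -> float:
    """Sum of the integrities of the groups divided by the data volume of the matrix."""
    if not matrix:
        raise ValueError("an information matrix must contain at least one group of data")
    return sum(integrity(g, preset_types) for g in matrix) / len(matrix)
```

With four preset data types, a group that records only the disease type and the transmission path has an integrity of 0.5, which is below the preferred minimum threshold of 0.6.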
In the step S1, if the first association data set and the second association data set are empty sets, the retrieval unit extracts all the data of the first data storage module and the second data storage module respectively, the integration unit integrates each group of data of the first data storage module according to the disease type to obtain a plurality of information matrixes and marks the information matrixes as the first information matrix set, and integrates each group of data of the second data storage module according to the disease type to obtain a plurality of information matrixes and marks the information matrixes as the second information matrix set;
the matching unit matches each information matrix in the first information matrix set with each information matrix in the second information matrix set according to the disease type, marks the information matrix which can be matched as a matchable information matrix, marks the information matrix which cannot be matched as a non-matchable information matrix, and the screening unit extracts the matchable information matrix in the first information matrix set as a third information matrix set;
the interaction unit sets each information matrix in the third information matrix set as the supplementary reference information matrix of the corresponding information matrix in the second information matrix set, and sets the second information matrix set as the standard reference information matrix set.
Specifically, the interaction unit takes each information matrix of the supplementary reference information matrix set as an additional link of each corresponding information matrix in the standard reference information matrix set, and when the external instruction source receives feedback of the interaction unit, each information matrix in the standard reference information matrix set can be directly obtained, and each information matrix of the supplementary reference information matrix set is obtained by clicking each additional link.
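The fallback for empty search results can be sketched as follows. The record layout (dictionaries with a `disease_type` key), the `supplementary` field standing in for the additional link, and the function names are assumptions for illustration only; the patent does not describe how the link is represented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional

Group = Dict[str, object]


def _group_by_disease(groups: Iterable[Group]) -> Dict[str, List[Group]]:
    """Integrate groups of data into one information matrix per disease type."""
    matrices = defaultdict(list)
    for group in groups:
        matrices[str(group["disease_type"])].append(group)
    return dict(matrices)


def build_fallback_feedback(
    first_store: Iterable[Group], second_store: Iterable[Group]
) -> Dict[str, Dict[str, Optional[List[Group]]]]:
    """Use the whole second store as the standard reference set and attach matching ancient matrices."""
    first = _group_by_disease(first_store)
    second = _group_by_disease(second_store)
    # Third information matrix set: matrices of the first set matched by disease type.
    third = {disease: matrix for disease, matrix in first.items() if disease in second}
    return {
        disease: {"matrix": matrix, "supplementary": third.get(disease)}
        for disease, matrix in second.items()
    }
```

A `supplementary` value of `None` means that no ancient record of that disease type exists, so no additional link would be offered.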
In the step S3, if the time between the latest occurrence of a certain disease and the retrieval day is less than or equal to the first preset duration, the evaluation unit evaluates the information matrix in the standard reference information matrix set corresponding to the disease as high referential;
if the time between the latest occurrence of a certain disease and the retrieval day is longer than the first preset duration and shorter than the second preset duration, the evaluation unit evaluates the information matrix in the standard reference information matrix set corresponding to the disease as medium referential;
and if the time between the latest occurrence of a certain disease and the retrieval day is greater than or equal to the second preset duration, the evaluation unit evaluates the information matrix in the standard reference information matrix set corresponding to the disease as low referential.
Specifically, the order of feedback to the external instruction source is set according to the referential evaluation results of the information matrixes in the standard reference information matrix set: information matrixes evaluated as high referential come first, those evaluated as medium referential come next, and those evaluated as low referential come last. When several information matrixes have the same referential evaluation result, the one whose disease occurred most recently relative to the retrieval day is placed first.
Specifically, the specific values of the first preset duration and the second preset duration are not limited in this embodiment, and the first preset duration is preferably 30 years, and the second preset duration is preferably 80 years in this embodiment.
Specifically, the invention evaluates the referential value of each information matrix according to how long before the retrieval day the disease last occurred. When the latest occurrence of a disease is close to the retrieval day, the data can be judged to be recent and highly traceable; owing to advances in statistical techniques, more recent data are also more reliable and therefore more referential. Conversely, when the latest occurrence of a disease lies far before the retrieval day, the data are less referential, and the probability that the epidemic will recur is also lower.
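The evaluation and ordering of step S3 can be illustrated with a brief sketch. It assumes that time is measured in whole years and that the latest occurrence of each disease is available as a year; the function names and the labels `"high"`, `"medium"` and `"low"` are hypothetical. The defaults of 30 and 80 years are the preferred values of this embodiment.

```python
from typing import Dict, List, Tuple


def evaluate_referential(
    years_since_last: int, first_preset: int = 30, second_preset: int = 80
) -> str:
    """Classify an information matrix by the time since the disease last occurred."""
    if years_since_last <= first_preset:
        return "high"
    if years_since_last < second_preset:
        return "medium"
    return "low"


def feedback_order(last_occurrence_year: Dict[str, int], retrieval_year: int) -> List[Tuple[str, str]]:
    """Order diseases by evaluation result, then by most recent occurrence."""
    rank = {"high": 0, "medium": 1, "low": 2}
    gaps = [(disease, retrieval_year - year) for disease, year in last_occurrence_year.items()]
    # Primary key: evaluation level; secondary key: shorter gap first.
    gaps.sort(key=lambda item: (rank[evaluate_referential(item[1])], item[1]))
    return [(disease, evaluate_referential(gap)) for disease, gap in gaps]
```

Because the gap itself is the tie-breaker, two diseases with the same evaluation result are fed back with the more recent one first, as the text requires.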
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will be within the scope of the present invention.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and variations of the present invention will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. The method for collecting the history data based on medical epidemic prevention and control is characterized by comprising the following steps:
step S1, searching a first data storage module for storing historic epidemic situation related history data and a second data storage module for storing near modern epidemic situation prevention and control data according to keywords to obtain a first associated data set retrieved based on the first data storage module and a second associated data set retrieved based on the second data storage module;
step S2, selecting a reference data set according to the ratio of the data volume in the second associated data set to the data volume in the first associated data set; when the first associated data set and the second associated data set both serve as the reference data set, integrating each group of data of the first associated data set and of the second associated data set according to disease type to obtain a first information matrix set and a second information matrix set respectively; obtaining a first reference information matrix set according to the result of matching the first information matrix set with the second information matrix set; judging, according to the number of information matrixes in the first reference information matrix set, whether to screen the information matrixes in the second information matrix set; judging whether to retain a certain information matrix in the second information matrix set according to the data volume of the information matrix, the latest update time of a certain group of data of the information matrix, and the average integrity of the groups of data of the information matrix; and taking the information matrixes retained in the second information matrix set as a second reference information matrix set;
and step S3, integrating the information matrixes in the second reference information matrix set and the first reference information matrix set according to disease type to obtain a standard reference information matrix set; evaluating the referential value of each information matrix in the standard reference information matrix set according to the historical time at which each type of disease occurred; setting the feedback order of the information matrixes in the standard reference information matrix set according to their referential evaluation results; and marking each information matrix with its referential evaluation result for feedback to an external instruction source.
2. The method for collecting history data based on medical epidemic prevention and control according to claim 1, wherein in the step S2,
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is equal to zero, the screening unit sets the first associated data set as the reference data set;
if the ratio of the data volume in the second associated data set to the data volume in the first associated data set is greater than or equal to a preset standard duty ratio, the screening unit sets the second associated data set as the reference data set;
the integration unit integrates all groups of data in the reference data set according to the disease type to obtain a plurality of information matrixes, and each information matrix is set as the standard reference information matrix set.
3. The method for collecting history data based on medical epidemic prevention and control according to claim 1, wherein in the step S2, if the ratio of the data amount in the second association data set to the data amount in the first association data set is smaller than a preset standard duty ratio and is not equal to zero, the filtering unit sets the first association data set and the second association data set as the reference data set.
4. The method for collecting history data based on medical epidemic prevention and control according to claim 3, characterized in that the integration unit integrates each group of data of the first association data set according to disease type to obtain a plurality of information matrixes and records them as the first information matrix set, and integrates each group of data of the second association data set according to disease type to obtain a plurality of information matrixes and records them as the second information matrix set; the matching unit matches each information matrix in the first information matrix set with each information matrix in the second information matrix set according to disease type, marks the information matrixes that can be matched as matchable information matrixes, and marks the information matrixes that cannot be matched as unmatchable information matrixes; and the screening unit extracts the matchable information matrixes in the first information matrix set as the first reference information matrix set.
5. The method for collecting history data based on medical epidemic prevention and control according to claim 4, characterized in that the screening unit judges whether to screen each information matrix in the second information matrix set according to the number of information matrixes in the first reference information matrix set, wherein,
and if the number of the information matrixes in the first reference information matrix set is smaller than the preset reference matrix number, the screening unit reserves all the information matrixes in the second information matrix set as the second reference information matrix set.
6. The method for collecting history data based on medical epidemic prevention and control according to claim 4, wherein if the number of information matrices in the first reference information matrix set is greater than a preset reference matrix number, the filtering unit determines that,
and if the data volume of a certain information matrix A in the second information matrix set is greater than or equal to the preset reference data volume, or the time between the latest update time of a certain group of data of the information matrix A and the retrieval day is shorter than a preset time length threshold, the screening unit retains the information matrix A.
7. The method for collecting history data based on medical epidemic prevention and control according to claim 6, wherein if the data volume of the information matrix A is smaller than the preset reference data volume and the time between the latest update time of every group of data of the information matrix A and the retrieval day is greater than or equal to the preset time length threshold, the screening unit determines whether to retain the information matrix A according to the average integrity of the groups of data in the information matrix A, wherein,
if the average integrity of each group of data in the information matrix A is greater than or equal to a preset integrity minimum threshold value, the screening unit judges that the information matrix A is reserved;
and if the average integrity of each group of data in the information matrix A is smaller than the preset integrity minimum threshold, the screening unit judges that the information matrix A is screened out.
8. The method for collecting history data based on medical epidemic prevention and control according to claim 7, wherein the average integrity of each group of data in the information matrix a is equal to the ratio of the sum of the integrity of each group of data in the information matrix a to the data amount in the information matrix a;
the integrity of a certain group of data a in the information matrix A is equal to the ratio of the number of data types contained in the data a to the total number of preset data types.
9. The method for collecting history data based on medical epidemic situation prevention and control according to claim 1, wherein in the step S1, if the first association data set and the second association data set are empty sets, the searching unit extracts all data of the first data storage module and the second data storage module respectively, the integrating unit integrates each group of data of the first data storage module according to disease type to obtain a plurality of information matrixes and record the information matrixes as the first information matrix set, and integrates each group of data of the second data storage module according to disease type to obtain a plurality of information matrixes and record the information matrixes as the second information matrix set;
the matching unit matches each information matrix in the first information matrix set with each information matrix in the second information matrix set according to the disease type, marks the information matrix which can be matched as a matchable information matrix, marks the information matrix which cannot be matched as a non-matchable information matrix, and the screening unit extracts the matchable information matrix in the first information matrix set as a third information matrix set;
the interaction unit sets each information matrix in the third information matrix set as a complementary reference information matrix set of each information matrix corresponding to each information matrix in the second information matrix set, and sets the second information matrix set as the standard reference information matrix set.
10. The method for collecting history data based on medical epidemic prevention and control according to any one of claims 1 to 9, wherein in step S3,
if the time between the latest occurrence of a certain disease and the retrieval day is less than or equal to the first preset duration, the evaluation unit evaluates the information matrix in the standard reference information matrix set corresponding to the disease as high referential;
if the time between the latest occurrence of a certain disease and the retrieval day is longer than the first preset duration and shorter than the second preset duration, the evaluation unit evaluates the information matrix in the standard reference information matrix set corresponding to the disease as medium referential;
and if the time between the latest occurrence of a certain disease and the retrieval day is greater than or equal to the second preset duration, the evaluation unit evaluates the information matrix in the standard reference information matrix set corresponding to the disease as low referential.
CN202310786205.6A 2023-06-30 2023-06-30 Medical epidemic situation prevention and control history data based collection method Active CN116504418B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310786205.6A CN116504418B (en) 2023-06-30 2023-06-30 Medical epidemic situation prevention and control history data based collection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310786205.6A CN116504418B (en) 2023-06-30 2023-06-30 Medical epidemic situation prevention and control history data based collection method

Publications (2)

Publication Number Publication Date
CN116504418A CN116504418A (en) 2023-07-28
CN116504418B true CN116504418B (en) 2023-09-08

Family

ID=87323532

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310786205.6A Active CN116504418B (en) 2023-06-30 2023-06-30 Medical epidemic situation prevention and control history data based collection method

Country Status (1)

Country Link
CN (1) CN116504418B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10073890B1 (en) * 2015-08-03 2018-09-11 Marca Research & Development International, Llc Systems and methods for patent reference comparison in a combined semantical-probabilistic algorithm
CN114420252A (en) * 2021-12-10 2022-04-29 重庆大学附属肿瘤医院 Method, device and medium for determining intensity modulated radiotherapy plan evaluation parameter matrix
CN115048571A (en) * 2022-04-27 2022-09-13 赵涛 Online education recommendation management system based on cloud platform

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102412380B1 (en) * 2021-07-01 2022-06-23 (주)뤼이드 Method for, device for, and system for evaluating a learning ability of an user based on search information of the user

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Formative Evaluation and Complex Health Improvement Initiatives: A Learning System to Improve Theory, Implementation, Support, and Evaluation;V. C. Scott等;《American Journal of Evaluation》;第89-106页 *

Also Published As

Publication number Publication date
CN116504418A (en) 2023-07-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant