CN111063444A - Excel VBA-based death cause monitoring data analysis method - Google Patents

Excel VBA-based death cause monitoring data analysis method Download PDF

Info

Publication number
CN111063444A
CN111063444A CN201911231607.XA CN201911231607A CN111063444A CN 111063444 A CN111063444 A CN 111063444A CN 201911231607 A CN201911231607 A CN 201911231607A CN 111063444 A CN111063444 A CN 111063444A
Authority
CN
China
Prior art keywords
data
death
analysis
excel
cause
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911231607.XA
Other languages
Chinese (zh)
Inventor
陈海
董昀球
钱云
王璐
郭亮亮
刘佳
杨志杰
申倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Center for Disease Control and Prevention
Original Assignee
Wuxi Center for Disease Control and Prevention
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Center for Disease Control and Prevention filed Critical Wuxi Center for Disease Control and Prevention
Priority to CN201911231607.XA priority Critical patent/CN111063444A/en
Publication of CN111063444A publication Critical patent/CN111063444A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases

Abstract

The invention discloses a death cause monitoring data analysis method based on Excel VBA, which comprises the following steps: constructing a data quality evaluation model; constructing a data analysis execution engine, converting all variables by adopting a uniform coding rule and a uniform format based on an Excel VBA, uniformly sorting and importing all data into the Excel, and embedding ICD codes of different disease types in a background; taking the data imported into Excel as an execution object, and carrying out data analysis and sorting in a transverse analysis or longitudinal analysis mode, wherein the transverse analysis reflects the death condition and key index condition of the crowd in a specific year, and the longitudinal analysis reflects the change trend of the death level or key index of the crowd in different year spans. The death cause monitoring data analysis method based on the Excel VBA manages and controls the death cause monitoring data, and realizes analysis of all people, all processes and all-in-one death cause data and calculation of key indexes.

Description

Excel VBA-based death cause monitoring data analysis method
Technical Field
The invention belongs to the field of big data processing, and particularly relates to a method for analyzing death cause monitoring data.
Background
The death cause monitoring is an important public health monitoring activity, is beneficial to knowing the death level and the death cause sequence of residents, mastering the health influence factors of the residents, and has great significance in providing reliable basis for government decision making and the like. As a technical work, the links involved in the cause of death monitoring are more, the data volume per year is huge, and from the perspective of monitoring data management, the links involved in data acquisition, reporting, sorting, cleaning, analysis, reporting and the like are multiple. With the application of the network direct reporting system of the cause of death registration, the data acquisition reporting quality is greatly improved, and how to perform subsequent data management, analysis and utilization becomes an urgent problem to be solved.
The management, analysis and utilization of the death cause monitoring data cannot be limited to the national and provincial level, and each basic level monitoring unit is also included; the method is not limited to simple statistical reports, and also comprises deep analysis of data. Due to the fact that data management, analysis and utilization levels of technicians of basic monitoring units are different, utilization of monitored data is limited. In addition, due to the fact that the dead cause monitoring data are updated in real time, and key information items are more and data volume is large, processing and analyzing workload of data is large, time consumption is long, mistakes and omissions are easy to occur, and working requirements cannot be met. Although many statistical software exist at present, such as SAS, R language, SPSS, Stata, etc., for the staff of the vast base monitoring unit, the high price and professional use method thereof are the biggest obstacles to applying the statistical software, and part of the software is inconvenient to use, needs to be programmed, and is not very visual and interactive.
Disclosure of Invention
The purpose of the invention is as follows: the invention aims to provide a death cause monitoring data analysis method based on excelVBA (ExcelVBA) aiming at the defects of the prior art, so that the death cause monitoring data is managed and quality controlled, and the analysis of the death cause data of all people, all processes and all-in-one and the calculation of key indexes are realized.
The technical scheme is as follows: the invention discloses a death cause monitoring data analysis method based on excelVBA, which comprises the following steps:
(1) constructing a data quality assessment model
Formulating a data quality definition rule, and screening and filtering all collected data according to the data source and the formulated data quality definition rule so as to construct a standardized and systematized data quality evaluation model;
(2) building data analytics execution engine
Converting all variables by adopting a uniform coding rule and a uniform format based on the Excel VBA, uniformly sorting and importing all data into the Excel, and embedding ICD codes of different disease types in the background; taking the data imported into Excel as an execution object, and carrying out data analysis and sorting in a transverse analysis or longitudinal analysis mode, wherein the transverse analysis reflects the death condition and key index condition of the crowd in a specific year, and the longitudinal analysis reflects the change trend of the death level or key index of the crowd in different year spans.
According to a further preferable technical scheme, the data quality evaluation model is constructed in the step (1), and the method specifically comprises the following steps:
a. and (3) accessing death cause monitoring data: collecting data by using an electronic information system, and collecting the death cause monitoring data of all regions, all people groups and all life cycles; embedding a missed report investigation module, comprehensively mastering the heading of special death cases, and simultaneously realizing a multi-department combined data sharing mechanism to ensure the comprehensiveness and reliability of the death cause data and accurately accessing the death cause data to an evaluation model;
b. population monitoring data access: through a population registration system, accurate access evaluation models of population data of different household registration types in all regions and all coverage are realized;
c. reporting quality control indexes: reflecting data integrity by the produced infant mortality index, reflecting data reliability by the diagnosis basis composition and diagnosis unit composition index, and formulating a data quality definition rule by report rate, timeliness rate, integrity rate, identity card number filling integrity rate, radical cause inference accuracy rate and cause chain filling accuracy rate;
d. cleaning abnormal data: respectively cleaning and sorting the database from the variable and the record level, and checking the variable, wherein the variable comprises a value, a deletion and a logic error; the record level check includes duplicate, incomplete, and out-of-condition records.
Preferably, when the data imported into Excel is analyzed transversely or longitudinally in step (2), the death level analysis comprises summarizing and analyzing the death number, the gross death rate and the standardized death rate, further counting the cis-position condition and the composition ratio condition of different disease causes, and comprehensively reflecting the death level.
Preferably, when the data imported into Excel is analyzed transversely or longitudinally in the step (2), the calculation of the key indexes comprises the steps of compiling a Jiang simplified life table by using death data and population data of the age groups, and calculating the expected life of all people; calculating the death cause eliminating expected life based on the age-expectancy algorithm by using the death numbers of different disease in different age groups; the death number of the 30-69-year-old population caused by 4 major chronic diseases, namely cardiovascular and cerebrovascular diseases, cancers, chronic respiratory diseases and diabetes, and corresponding population data are used for calculating the premature death rate of the major chronic diseases.
Preferably, after the data analysis is completed, the method further comprises a step of visually displaying the analysis result:
based on a graphic system carried by Excel software, visual, scene and real-time interactive processing are carried out on a data structure and a data analysis result by utilizing VBA coding, and one or more of a multi-parameter selectable columnar diagram, a bar diagram, a broken line diagram, an area diagram, a scatter diagram or a pie diagram is manufactured to visually display the analysis result.
Has the advantages that: (1) the Excel VBA-based method for analyzing the death cause monitoring data solves the problems of management and quality control, accurate analysis and calculation of the death cause monitoring data, and realizes analysis of the death cause data and calculation of key indexes of all people, all processes and all integration; the invention realizes the comprehensive and complete cause of death monitoring big data management and analysis system construction; maintaining, sorting and accurately accessing a population database; maintenance, arrangement and accurate access of the big data of the cause of death monitoring; the quality control of data is realized, and the data is recorded, fed back and revised in time; data management and analysis of different age groups, different sexes, different dates, different areas and different cause of death categories are realized; realizing the calculation of important indexes such as the life expectancy of all people and the premature death rate of major chronic diseases; realizing the cause of death related disease burden analysis; interactive visual display of the death cause monitoring data is realized;
visual Basic for Applications (VBA) is a macro language of Visual Basic, is a programming language developed by Microsoft to execute general automation (OLE) tasks in a desktop application program, can be mainly used for expanding the application program function of Windows, particularly Excel, Excel VBA does not depend on other programming environments, can be directly called through Excel, is good in compatibility with Excel, and is high in shareability and portability; the Excel VBA background programming is adopted, the operation and the presentation are realized through a simple interface, the use by non-professional statistical personnel is facilitated, and different analysis requirements of different users are fully considered; integrating data from different sources and different quality control and analysis methods, and constructing an integrated and full-flow data analysis execution engine to realize efficient analysis and evaluation of the death cause monitoring data; the data analysis relates to more comprehensive contents and more dimensions, and covers the calculation of key indexes and the analysis of disease burden of healthy China; the invention integrates the data quality control checking and cleaning functions into a module of an analysis system, and mainly processes repeated data and data with logic errors; in addition, the Excel VBA background programming is adopted, the full-flow interfacing data visualization function is realized, one-key drawing is supported, and the method is convenient for non-professional drawing personnel to use.
Drawings
FIG. 1 is a flow chart of the data quality assessment model construction of the present invention;
FIG. 2 is a flow chart of data analysis according to the present invention.
Detailed Description
The technical solution of the present invention is described in detail below with reference to the accompanying drawings, but the scope of the present invention is not limited to the embodiments.
Example (b): a death cause monitoring data analysis method based on Excel VBA comprises the following steps:
(1) constructing a data quality assessment model
Formulating a data quality definition rule, and screening and filtering all collected data according to the data source and the formulated data quality definition rule so as to construct a standardized and systematized data quality evaluation model; the scope of data management and evaluation mainly includes:
a. and (3) accessing death cause monitoring data: collecting data by using an electronic information system, collecting death cause monitoring data of all regions, all people groups and all life cycles, and tracing the death population information data in the whole process; embedding a missed report investigation module, comprehensively mastering the heading of special death cases, and simultaneously realizing a multi-department combined data sharing mechanism to ensure the comprehensiveness and reliability of the death cause data and accurately accessing the death cause data to an evaluation model;
b. population monitoring data access: through a population registration system, accurate access evaluation models of population data of different household registration types in all regions and all coverage are realized;
c. reporting quality control indexes: reflecting data integrity by the produced infant mortality index, reflecting data reliability by the diagnosis basis composition and diagnosis unit composition index, and formulating a data quality definition rule by report rate, timeliness rate, integrity rate, identity card number filling integrity rate, radical cause inference accuracy rate and cause chain filling accuracy rate;
d. cleaning abnormal data: respectively cleaning and sorting the database from the variable and the record level, and checking the variable, wherein the variable comprises a value, a deletion and a logic error; the record level check includes duplicate, incomplete, and out-of-condition records.
(2) A data analysis execution engine is constructed to realize efficient analysis of the death cause monitoring data
a. Encoding and format conversion of acquired data
All variables are converted by adopting a uniform coding rule and format, particularly ICD (interface control document) fundamental cause of death coding, and all data are uniformly sorted and imported into Excel; background embedding ICD codes of different disease types.
b. Determining transverse or longitudinal analysis
The transverse analysis mainly reflects the death condition and key index condition of the crowd in a specific year; longitudinal analysis mainly reflects the change trend of the death level or key indexes of the population among different year spans; the death level analysis comprises the steps of summarizing and analyzing the death number, the gross death rate and the standardized death rate, further counting the cis-position conditions and the composition ratio conditions of different disease causes, and comprehensively reflecting the death level. The calculation of key indexes comprises using death data of different age groups and population data to compile a Jiang simplified life table and calculating the expected life of all people; calculating the death cause eliminating expected life based on the age-expectancy algorithm by using the death numbers of different disease in different age groups; the death number of the 30-69-year-old population caused by 4 major chronic diseases, namely cardiovascular and cerebrovascular diseases, cancers, chronic respiratory diseases and diabetes, and corresponding population data are used for calculating the premature death rate of the major chronic diseases.
c. Flexible determination of analysis objects
The focus is on different crowds, such as the whole crowd, the male crowd and the female crowd;
the points of interest are at different ages, such as:
i, all age groups;
II, age 0, 1-4, 5-9, 10-14, … … 85 and over age;
III group of infants aged 0, children aged 1-14, young aged 15-44, middle aged 45-64, and elderly over 65;
IV, self-defining a specific age group;
the focus is in different areas, such as reporting units and areas, household areas, residential areas, cities and rural areas;
the method mainly comprises the following steps of (1) selecting a total cause of death analysis or a cause of death analysis of different diseases from the following points:
the disease types of I, II, 17 and III are the major diseases (83 types) as shown in Table 1.
Figure BDA0002303706300000061
Table 1(3) visual presentation of data results:
based on a graphic system carried by Excel software, visualization, scene and real-time interaction of a data structure and a data analysis result are realized by using VBA codes, one-key production of a multi-parameter selectable column diagram, a bar diagram, a broken line diagram, an area diagram, a scatter diagram, a pie diagram and the like is realized, and a user unfamiliar with Excel drawing can conveniently perform personalized management and use of data.
As noted above, while the present invention has been shown and described with reference to certain preferred embodiments, it is not to be construed as limited thereto. Various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (5)

1. A death cause monitoring data analysis method based on Excel VBA is characterized by comprising the following steps:
(1) constructing a data quality assessment model
Formulating a data quality definition rule, and screening and filtering all collected data according to the data source and the formulated data quality definition rule so as to construct a standardized and systematized data quality evaluation model;
(2) building data analytics execution engine
Converting all variables by adopting a uniform coding rule and a uniform format based on the Excel VBA, uniformly sorting and importing all data into the Excel, and embedding ICD codes of different disease types in the background; taking the data imported into Excel as an execution object, and carrying out data analysis and sorting in a transverse analysis or longitudinal analysis mode, wherein the transverse analysis reflects the death condition and key index condition of the crowd in a specific year, and the longitudinal analysis reflects the change trend of the death level or key index of the crowd in different year spans.
2. The Excel VBA based cause of death monitoring data analysis method of claim 1, wherein:
the method comprises the following steps of (1) constructing a data quality evaluation model, and specifically comprises the following steps:
a. and (3) accessing death cause monitoring data: collecting data by using an electronic information system, and collecting the death cause monitoring data of all regions, all people groups and all life cycles; embedding a missed report investigation module, comprehensively mastering the heading of special death cases, and simultaneously realizing a multi-department combined data sharing mechanism to ensure the comprehensiveness and reliability of the death cause data and accurately accessing the death cause data to an evaluation model;
b. population monitoring data access: through a population registration system, accurate access evaluation models of population data of different household registration types in all regions and all coverage are realized;
c. reporting quality control indexes: reflecting data integrity by the produced infant mortality index, reflecting data reliability by the diagnosis basis composition and diagnosis unit composition index, and formulating a data quality definition rule by report rate, timeliness rate, integrity rate, identity card number filling integrity rate, radical cause inference accuracy rate and cause chain filling accuracy rate;
d. cleaning abnormal data: respectively cleaning and sorting the database from the variable and the record level, and checking the variable, wherein the variable comprises a value, a deletion and a logic error; the record level check includes duplicate, incomplete, and out-of-condition records.
3. The Excel VBA based cause of death monitoring data analysis method of claim 1, wherein: and (3) when the data imported into the Excel is subjected to transverse analysis or longitudinal analysis in the step (2), analyzing the death level, including summarizing and analyzing the death number, the gross death rate and the normalized death rate, further counting the cis-position conditions and the composition ratio conditions of the death causes of different diseases, and comprehensively reflecting the death level.
4. The Excel VBA based cause of death monitoring data analysis method of claim 1, wherein: when the data imported into Excel is subjected to transverse analysis or longitudinal analysis in the step (2), calculating key indexes comprises the steps of compiling a Jiang's simplified life sheet by using death data of different age groups and population data, and calculating the expected life of all people; calculating the death cause eliminating expected life based on the age-expectancy algorithm by using the death numbers of different disease in different age groups; the death number of the 30-69-year-old population caused by 4 major chronic diseases, namely cardiovascular and cerebrovascular diseases, cancers, chronic respiratory diseases and diabetes, and corresponding population data are used for calculating the premature death rate of the major chronic diseases.
5. The Excel VBA-based death cause monitoring data analysis method according to claim 1, further comprising the step of visually displaying the analysis result after the data analysis is completed:
based on a graphic system carried by Excel software, visual, scene and real-time interactive processing are carried out on a data structure and a data analysis result by utilizing VBA coding, and one or more of a multi-parameter selectable columnar diagram, a bar diagram, a broken line diagram, an area diagram, a scatter diagram or a pie diagram is manufactured to visually display the analysis result.
CN201911231607.XA 2019-12-05 2019-12-05 Excel VBA-based death cause monitoring data analysis method Pending CN111063444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911231607.XA CN111063444A (en) 2019-12-05 2019-12-05 Excel VBA-based death cause monitoring data analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911231607.XA CN111063444A (en) 2019-12-05 2019-12-05 Excel VBA-based death cause monitoring data analysis method

Publications (1)

Publication Number Publication Date
CN111063444A true CN111063444A (en) 2020-04-24

Family

ID=70299820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911231607.XA Pending CN111063444A (en) 2019-12-05 2019-12-05 Excel VBA-based death cause monitoring data analysis method

Country Status (1)

Country Link
CN (1) CN111063444A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112509693A (en) * 2020-12-11 2021-03-16 北京目人生殖医学科技有限公司 Method, system, equipment and storage medium for statistical analysis of clinical data
CN112686982A (en) * 2020-12-31 2021-04-20 山东师范大学 VBA-based multi-dimensional data scatter diagram generation method and system
CN112765256A (en) * 2020-12-16 2021-05-07 山东师范大学 Parallel coordinate axis data visualization method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109346173A (en) * 2018-10-23 2019-02-15 上海市疾病预防控制中心 The system and method for realization health life expectancy in life expectancy operational analysis function based on hygiene medical treatment big data

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109346173A (en) * 2018-10-23 2019-02-15 上海市疾病预防控制中心 The system and method for realization health life expectancy in life expectancy operational analysis function based on hygiene medical treatment big data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘悦等: "空气污染人群健康影响监测信息系统数据质量控制的设计与实现" *
王骏等: "基于用户层级的死因监测数据分析解决方案" *
韦廑: "2017年贺州市居民死亡原因分析" *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112509693A (en) * 2020-12-11 2021-03-16 北京目人生殖医学科技有限公司 Method, system, equipment and storage medium for statistical analysis of clinical data
CN112765256A (en) * 2020-12-16 2021-05-07 山东师范大学 Parallel coordinate axis data visualization method and system
CN112686982A (en) * 2020-12-31 2021-04-20 山东师范大学 VBA-based multi-dimensional data scatter diagram generation method and system

Similar Documents

Publication Publication Date Title
CN107609835B (en) Power grid manpower configuration application system and method
CN111063444A (en) Excel VBA-based death cause monitoring data analysis method
Wynn et al. ProcessProfiler3D: A visualisation framework for log-based process performance comparison
CN111400366B (en) Interactive outpatient quantity prediction visual analysis method and system based on Catboost model
CN111324602A (en) Method for realizing financial big data oriented analysis visualization
CN109887611A (en) A kind of medical management system and mobile terminal
CN107679634A (en) A kind of method that power supply trouble based on data visualization reports analysis and prediction for repairment
Fair Generalized record linkage system–Statistics Canada’s record linkage software
CN110728422A (en) Building information model, method, device and settlement system for construction project
CN110379472A (en) A kind of clinical research project management system
CN112100200A (en) Method for automatically generating SQL (structured query language) statements based on dimension model
CN202711253U (en) Detecting and early-stage pre-warning system for infectious disease aggregation
CN112328803A (en) Construction method of company knowledge graph based on industrial chain data
CN112735571A (en) Medical health data uploading management platform
CN113871025A (en) Dermatological clinical special disease database construction method and system
CN112256681A (en) Air traffic control digital index application system and method
Battisto et al. A Standardized Case Study Framework and Methodology to Identify" Best Practices”
CN112259202A (en) Rehabilitation diagnosis and treatment system
CN111401776A (en) Public health service quality control method and system
Aziz et al. Using quantitative approaches to enhance construction performance through data captured from mobile devices
Fischer et al. Approaches and tools for user-driven provenance and data quality information in spatial data infrastructures
CN117370448B (en) Brand digital asset insight analysis method
Mansmann Extending the OLAP technology to handle non-conventional and complex data
Nicholls et al. TraitLab: A MatLab package for fitting and simulating binary tree-like data
Obeysekare et al. Developing a lean data management system for an emerging social enterprise

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20200424