CN117271492A - Data management method, device and equipment - Google Patents

Data management method, device and equipment

Publication number
CN117271492A
Authority
CN
China
Prior art keywords
data
diagnosis
treatment
target
historical
Prior art date
Legal status
Pending
Application number
CN202311256617.5A
Other languages
Chinese (zh)
Inventor
柏志文
胡磊
邵名柱
窦浩哲
Current Assignee
Lianren Healthcare Big Data Technology Co Ltd
Original Assignee
Lianren Healthcare Big Data Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Lianren Healthcare Big Data Technology Co Ltd
Priority to CN202311256617.5A
Publication of CN117271492A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Quality & Reliability (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a data governance method, device, and equipment. The method comprises: acquiring historical diagnosis and treatment data generated by a plurality of third-party platforms; copying the historical diagnosis and treatment data that meets a preset condition into a preset replication library, and performing data repair processing on the historical diagnosis and treatment data in the preset replication library to obtain historical repaired diagnosis and treatment data; dividing the historical repaired diagnosis and treatment data into a plurality of training samples based on patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set; training a preset data governance model based on the target training sample set to obtain a target data governance model; and performing data repair and data table association processing on diagnosis and treatment data to be processed based on the target data governance model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed. The technical solution yields high-quality diagnosis and treatment data and improves data governance efficiency.

Description

Data management method, device and equipment
Technical Field
Embodiments of the invention relate to the technical field of data processing, and in particular to a data governance method, device, and equipment.
Background
As data elements take on growing importance in national development and security strategy, data circulation has become the hub that links the generation of data elements to the release of data value. The urgent demand for data circulation, together with medical policy and the characteristics of the medical industry, gives data circulation technologies broad room for application. Because the medical industry is highly specialized and errors carry serious consequences, and in order to achieve data empowerment, higher requirements are placed on diagnosis and treatment data standards and on the quality of service data.
At present, because the diagnosis and treatment data management systems adopted by third-party platforms differ, the diagnosis and treatment data generated by different third-party platforms vary widely and follow different data standards. To complete data governance tasks and obtain high-quality diagnosis and treatment data, a dedicated data governance model must be designed for each diagnosis and treatment data management system.
However, under this data governance approach the acquired diagnosis and treatment data remain isolated from one another, and a dedicated data governance model must be designed for each diagnosis and treatment data management system, which results in the technical problem of low data governance efficiency.
Disclosure of Invention
The invention provides a data governance method, device, and equipment that can obtain high-quality diagnosis and treatment data and improve data governance efficiency.
According to a first aspect of the present invention there is provided a data governance method comprising:
acquiring historical diagnosis and treatment data generated by a plurality of third-party platforms; wherein the historical diagnosis and treatment data is provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprises patient identification information;
copying the historical diagnosis and treatment data that meets a preset condition into a preset replication library, and performing data repair processing on the historical diagnosis and treatment data in the preset replication library to obtain historical repaired diagnosis and treatment data;
dividing the historical repaired diagnosis and treatment data into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set; wherein the plurality of diagnosis and treatment process data tables in each training sample have an association relation corresponding to the preset diagnosis and treatment sequence;
training a preset data governance model based on the target training sample set to obtain a target data governance model;
performing data repair and data table association processing on diagnosis and treatment data to be processed based on the target data governance model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed; wherein the diagnosis and treatment data to be processed is current diagnosis and treatment data generated by a target third-party platform.
According to a second aspect of the present invention, there is provided a data governance device comprising:
a historical data acquisition module, configured to acquire historical diagnosis and treatment data generated by a plurality of third-party platforms; wherein the historical diagnosis and treatment data is provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprises patient identification information;
a historical data replication module, configured to copy the historical diagnosis and treatment data that meets a preset condition into a preset replication library, and to perform data repair processing on the historical diagnosis and treatment data in the preset replication library to obtain historical repaired diagnosis and treatment data;
a training sample set determining module, configured to divide the historical repaired diagnosis and treatment data into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set; wherein the plurality of diagnosis and treatment process data tables in each training sample have an association relation corresponding to the preset diagnosis and treatment sequence;
a governance model training module, configured to train a preset data governance model based on the target training sample set to obtain a target data governance model;
a diagnosis and treatment data governance module, configured to perform data repair and data table association processing on diagnosis and treatment data to be processed based on the target data governance model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed; wherein the diagnosis and treatment data to be processed is current diagnosis and treatment data generated by a target third-party platform.
According to a third aspect of the present invention, there is provided an electronic device comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores a computer program executable by the at least one processor, the computer program being executable by the at least one processor to enable the at least one processor to perform the data governance method of any of the embodiments of the present invention.
According to a fourth aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to execute a data governance method according to any of the embodiments of the present invention.
According to the technical solution, historical diagnosis and treatment data generated by a plurality of third-party platforms is acquired, wherein the historical diagnosis and treatment data is provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprises patient identification information. The historical diagnosis and treatment data that meets a preset condition is copied into a preset replication library and repaired to obtain historical repaired diagnosis and treatment data. The historical repaired diagnosis and treatment data is then divided into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set, wherein the plurality of diagnosis and treatment process data tables in each training sample have an association relation corresponding to the preset diagnosis and treatment sequence. A preset data governance model is then trained based on the target training sample set to obtain a target data governance model, and data repair and data table association processing are performed on diagnosis and treatment data to be processed based on the target data governance model to obtain the corresponding target high-quality diagnosis and treatment data, wherein the diagnosis and treatment data to be processed is current diagnosis and treatment data generated by a target third-party platform. The technical solution yields high-quality diagnosis and treatment data and improves data governance efficiency.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a data governance method according to a first embodiment of the present invention;
FIG. 2 is a flow chart of a data governance method according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of a data governance device according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device implementing a data governance method according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Embodiment 1
FIG. 1 is a flowchart of a data governance method according to a first embodiment of the present invention. This embodiment is applicable to performing data governance on diagnosis and treatment data provided by any diagnosis and treatment data management system to obtain high-quality diagnosis and treatment data. The method may be performed by a data governance device, which may be implemented in hardware and/or software and may be configured in a terminal and/or a server. As shown in FIG. 1, the method includes:
S110, historical diagnosis and treatment data generated by a plurality of third party platforms are obtained.
The third-party platform is a medical institution such as a hospital, a clinic, or a health center. The historical diagnosis and treatment data is all data content generated during past diagnosis and treatment at each third-party platform, and includes, but is not limited to, diagnosis and treatment process data tables, medical images, and data logs. The historical diagnosis and treatment data is provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprises patient identification information.
In this embodiment, the diagnosis and treatment data management system is a server that records and manages the diagnosis and treatment data of patients. Within a single third-party platform, different departments may use different diagnosis and treatment data management systems developed by different vendors. For example, a hospital includes an outpatient department, an examination department, an imaging department, a pathology department, a physical sign department, an inpatient department, a nursing department, a rehabilitation department, and the like. If the outpatient department uses a type-A diagnosis and treatment data management system supplied by vendor A and the examination department uses a type-B diagnosis and treatment data management system supplied by vendor B, the acquired historical diagnosis and treatment data is provided by a plurality of diagnosis and treatment data management systems.
In this embodiment, when a patient receives medical care, the diagnosis and treatment process within one department passes through multiple workflow steps, and different workflow steps correspond to different diagnosis and treatment process data tables. Each diagnosis and treatment data management system therefore provides a plurality of diagnosis and treatment process data tables. The patient identification information is unique identification information corresponding to each patient, such as a visit serial number, a registration code, or a unique identity identifier.
Specifically, to obtain a data governance model with high data governance performance, the preset data governance model must be trained on a large amount of historical diagnosis and treatment data, so that the model learns the data flow relations among the diagnosis and treatment process data tables provided by each diagnosis and treatment data management system and the data characteristics of each diagnosis and treatment process data table. For this reason, historical diagnosis and treatment data generated by a plurality of third-party platforms is acquired.
In particular, after the historical diagnosis and treatment data generated by the plurality of third-party platforms is acquired, the data is preprocessed. The preprocessing includes at least one of the following: classifying and grouping the historical diagnosis and treatment data according to the differences among the diagnosis and treatment data management systems; performing intelligent limited marking and business marking on the historical diagnosis and treatment data; and performing intelligent deduplication and normalization on value domains, code-equivalence sets, and term attribute features across diagnosis and treatment data management systems and across versions.
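The grouping, normalization, and deduplication steps of this preprocessing can be sketched as follows. This is a minimal rule-based illustration only, not the intelligent processing of the embodiment; the `system` key, the `GENDER_EQUIVALENTS` mapping, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical code-equivalence set: system-specific codes for one term mapped
# to a single canonical value. The entries are illustrative assumptions.
GENDER_EQUIVALENTS = {
    "m": "male", "1": "male", "male": "male",
    "f": "female", "2": "female", "female": "female",
}


def group_by_system(tables):
    """Group data tables by the (assumed) 'system' key naming their source system."""
    groups = defaultdict(list)
    for table in tables:
        groups[table["system"]].append(table)
    return dict(groups)


def normalize_value(raw, equivalents):
    """Map a raw code to its canonical value; unknown codes pass through unchanged."""
    return equivalents.get(str(raw).strip().lower(), raw)


def deduplicate(values):
    """Drop repeated values while preserving first-seen order."""
    return list(dict.fromkeys(values))
```

Normalizing before deduplicating matters here: codes such as "M" and "1" only collapse into one entry once both have been mapped to the same canonical value.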
S120, copying the historical diagnosis and treatment data that meets a preset condition into a preset replication library, and performing data repair processing on the historical diagnosis and treatment data in the preset replication library to obtain historical repaired diagnosis and treatment data.
The preset condition is a precondition set in advance. The preset replication library is a preconfigured data storage structure, and the historical repaired diagnosis and treatment data is the diagnosis and treatment data obtained after data repair is performed on the historical diagnosis and treatment data.
Specifically, the preset condition is that the data quality score corresponding to the historical diagnosis and treatment data, obtained by evaluating the data quality of the historical diagnosis and treatment data from a plurality of evaluation dimensions, is greater than a preset threshold.
The plurality of evaluation dimensions includes an integrity evaluation dimension, a consistency evaluation dimension, a timeliness evaluation dimension, a validity evaluation dimension, a uniqueness evaluation dimension, and an accuracy evaluation dimension. The integrity evaluation dimension evaluates whether any data table in the historical diagnosis and treatment data contains blank fields. The consistency evaluation dimension evaluates whether the content of any data table fails to match the title of that data table. The timeliness evaluation dimension evaluates whether data content that should have been generated at a preset time was not generated. The validity evaluation dimension evaluates whether the historical diagnosis and treatment data contains useless data content. The uniqueness evaluation dimension evaluates whether the same diagnosis and treatment data of one patient is repeated in the historical diagnosis and treatment data. The accuracy evaluation dimension evaluates whether the historical diagnosis and treatment data contains erroneous data.
The data quality score is a numerical measure of the quality of the historical diagnosis and treatment data, for example a score on a 100-point scale: the higher the score, the higher the quality of the historical diagnosis and treatment data, and the lower the score, the lower the quality. The preset threshold is a data quality score threshold set in advance.
Optionally, evaluating the data quality of the historical diagnosis and treatment data from a plurality of evaluation dimensions to obtain the corresponding data quality score specifically includes: determining a to-be-applied data quality score for the historical diagnosis and treatment data under each evaluation dimension; and determining the data quality score corresponding to the historical diagnosis and treatment data based on the weighted sum of the to-be-applied data quality scores.
In this embodiment, the to-be-applied data quality score of each diagnosis and treatment process data table can be determined separately in the integrity, consistency, timeliness, validity, uniqueness, and accuracy evaluation dimensions. Six to-be-applied data quality scores are thus obtained for each diagnosis and treatment process data table, and the data quality score of each table is determined from the weighted sum of these six scores.
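The weighted-sum scoring described above can be sketched as follows. The weights, the dimension keys, and the blank-cell rule used for the integrity dimension are illustrative assumptions; the embodiment does not specify them. The sketch also divides by the total weight so that the result stays on the same scale as the per-dimension scores, which is a design choice of this example rather than something stated in the text.

```python
# Hypothetical keys for the six evaluation dimensions named in the description.
DIMENSIONS = (
    "integrity", "consistency", "timeliness",
    "validity", "uniqueness", "accuracy",
)


def integrity_score(rows, fields):
    """Score from 0 to 100: share of non-blank cells among the expected fields."""
    total = len(rows) * len(fields)
    if total == 0:
        return 0.0
    filled = sum(
        1 for row in rows for field in fields
        if row.get(field) not in (None, "")
    )
    return 100.0 * filled / total


def data_quality_score(dimension_scores, weights):
    """Weighted sum of the six per-dimension scores, normalized by total weight."""
    missing = set(DIMENSIONS) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    weighted = sum(dimension_scores[d] * weights[d] for d in DIMENSIONS)
    return weighted / total_weight
```

If the weights already sum to 1, the normalization has no effect and the function reduces to a plain weighted sum.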
Further, once the data quality score of each diagnosis and treatment process data table has been obtained, the tables whose data quality score is greater than the preset threshold are retained for subsequent processing. For example, if the preset threshold is 60 points, the diagnosis and treatment process data tables scoring above 60 points are retained for subsequent processing. The reason is that historical diagnosis and treatment data of poor quality is difficult to repair and degrades the performance of the data governance model, so such data is filtered out.
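The threshold filter can be sketched as below. The function name and the mapping of table names to scores are hypothetical; the strict comparison follows the wording "greater than the preset threshold".

```python
def select_tables(scored_tables, threshold=60.0):
    """Keep only tables whose data quality score is strictly above the threshold.

    `scored_tables` is an assumed mapping of table name to data quality score.
    """
    return [name for name, score in scored_tables.items() if score > threshold]
```

Under this reading, a table scoring exactly 60 points is not retained when the threshold is 60 points.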
Specifically, the historical diagnosis and treatment data that meets the preset condition is copied into the preset replication library, which makes the data repair processing more convenient. The data repair processing includes at least one of the following: intelligently filling in missing data, detecting and correcting erroneous data, and normalizing non-standard data content. Repairing the historical diagnosis and treatment data in this way yields the historical repaired diagnosis and treatment data.
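A minimal sketch of copy-then-repair is given below. It uses fixed default values and a small list of date formats as a simple rule-based stand-in for the intelligent filling and correction of the embodiment; the `visit_date` field, the accepted formats, and the function names are assumptions.

```python
import copy
from datetime import datetime

# Assumed input date formats; the target format is ISO 8601 (YYYY-MM-DD).
_DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d")


def normalize_date(value):
    """Return the date as YYYY-MM-DD, or the value unchanged if no format matches."""
    for fmt in _DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return value


def repair_table(rows, defaults):
    """Repair a copy of the rows so that the source data is never modified."""
    repaired = copy.deepcopy(rows)  # stands in for the replication library
    for row in repaired:
        # Fill blank fields with the supplied default values.
        for field, default in defaults.items():
            if row.get(field) in (None, ""):
                row[field] = default
        # Normalize the (hypothetical) visit_date field.
        if isinstance(row.get("visit_date"), str):
            row["visit_date"] = normalize_date(row["visit_date"])
    return repaired
```

Working on a deep copy mirrors the role of the replication library: repair mistakes cannot corrupt the data held by the third-party platform.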
S130, dividing the historical repaired diagnosis and treatment data into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set.
The patient identification information is data content already present in each diagnosis and treatment process data table of the historical diagnosis and treatment data, so the patient identification information corresponding to each patient can be obtained directly. The training samples are ordered historical diagnosis and treatment sample groups obtained by organizing the unordered historical repaired diagnosis and treatment data. The target training sample set is the set of the plurality of training samples.
The preset diagnosis and treatment sequence is a predefined ordering of departments. For example, a complete diagnosis and treatment activity passes through stages such as appointment registration, consultation, examination, diagnosis, treatment, and rehabilitation, and the departments corresponding to these stages can be arranged in order. Exemplarily, the preset diagnosis and treatment sequence is: outpatient department, examination department, imaging department, pathology department, physical sign department, inpatient department, nursing department, rehabilitation department.
Specifically, determining the target training sample set includes the following steps:
S1301, dividing the historical repaired diagnosis and treatment data in the preset replication library into a plurality of historical diagnosis and treatment sample groups based on the patient identification information.
In this embodiment, for each diagnosis and treatment process data table of the historical repaired diagnosis and treatment data in the preset replication library, the data related to each item of patient identification information in the table is gathered together. The historical repaired diagnosis and treatment data is thereby divided into one historical diagnosis and treatment sample group per item of patient identification information, yielding a plurality of historical diagnosis and treatment sample groups.
S1302, performing sequential association processing on the diagnosis and treatment process data tables in each historical diagnosis and treatment sample group based on the preset diagnosis and treatment sequence to obtain a target historical diagnosis and treatment sample group corresponding to each historical diagnosis and treatment sample group.
In this embodiment, each historical diagnosis and treatment sample group contains diagnosis and treatment data from a plurality of departments. The diagnosis and treatment data in each group is therefore associated in the order given by the preset diagnosis and treatment sequence, so that the data of each group forms an ordered data flow representing the actual diagnosis and treatment process. This yields the target historical diagnosis and treatment sample group corresponding to each historical diagnosis and treatment sample group.
S1303, forming the target training sample set from the target historical diagnosis and treatment sample groups.
In this embodiment, once the plurality of target historical diagnosis and treatment sample groups has been obtained, these groups taken together are used as the target training sample set.
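Steps S1301 to S1303 can be sketched as follows: group the tables by patient identifier, order each group by the preset diagnosis and treatment sequence, and link consecutive tables. The dictionary keys (`patient_id`, `department`, `name`), the department labels, and the representation of an association as a pair of table names are all assumptions made for illustration.

```python
from collections import defaultdict

# Department order taken from the exemplary preset sequence; labels are assumed.
PRESET_SEQUENCE = [
    "outpatient", "examination", "imaging", "pathology",
    "physical_sign", "inpatient", "nursing", "rehabilitation",
]


def build_training_set(tables, sequence=PRESET_SEQUENCE):
    """Return one ordered, linked sample per patient."""
    rank = {dept: i for i, dept in enumerate(sequence)}

    # S1301: one group of tables per patient identifier.
    groups = defaultdict(list)
    for table in tables:
        groups[table["patient_id"]].append(table)

    training_set = []
    for patient_id, group in groups.items():
        # S1302: order by the preset sequence (unknown departments go last),
        # then associate each table with its successor.
        ordered = sorted(
            group, key=lambda t: rank.get(t["department"], len(sequence))
        )
        names = [t["name"] for t in ordered]
        links = list(zip(names, names[1:]))
        training_set.append(
            {"patient_id": patient_id, "tables": names, "links": links}
        )
    # S1303: the collection of all samples is the training set.
    return training_set
```

Because `sorted` is stable, tables from the same department keep their original relative order within a sample.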
S140, training the preset data governance model based on the target training sample set to obtain a target data governance model.
The preset data governance model is a predetermined generative artificial intelligence vertical model for the medical domain, and its model parameters are initial parameters.
In this embodiment, the target training sample set is input into the preset data governance model so that the model is trained on the target training samples. The trained target data governance model can thus recognize, across the whole diagnosis and treatment process, the data flow relations among the diagnosis and treatment process data tables provided by each diagnosis and treatment data management system and the data characteristics of each diagnosis and treatment process data table.
S150, performing data restoration and data table association processing on the diagnosis and treatment data to be processed based on the target data governance model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed.
The diagnosis and treatment data to be processed are current diagnosis and treatment data generated by the target third-party platform. The current diagnosis and treatment data can be understood as diagnosis and treatment data which need to be subjected to data management.
In this embodiment, after the target data management model is obtained, the target data management model may perform data restoration and data table association processing on the diagnosis and treatment data to be processed provided by any third party platform. After the diagnosis and treatment data to be processed are processed by the target data treatment model, the target data treatment model outputs target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed.
Specifically, determining the target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed includes: performing erroneous-format correction, missing-data supplementation, value-range standardization, and data table association processing on the diagnosis and treatment data to be processed based on the target data governance model, to obtain the target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed.
In this embodiment, the target data governance model has learned, from a large amount of historical diagnosis and treatment data, the characteristics of the diagnosis and treatment data provided by the various mainstream diagnosis and treatment data management systems and the data table association relationships among those systems. Here, a mainstream diagnosis and treatment data management system is a diagnosis and treatment data management system currently adopted by the third-party platforms. After the diagnosis and treatment data to be processed is input into the target data governance model, the model can identify which data has format errors, which data has missing values, and which data has non-standard values. On that basis, the model applies at least one of erroneous-format correction, missing-data supplementation, and value-range standardization to the problematic data. The repaired diagnosis and treatment process data tables are then associated with one another. For example, according to the learned data flow relationships, the target data governance model joins the diagnosis and treatment process data tables in series to form a closed-loop data stream that represents the patient's complete diagnosis and treatment process. The target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed is thereby obtained.
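To make the three repair categories concrete, the sketch below applies each of them to a single record using fixed rules. This is not the method of the embodiment, in which the repairs are carried out by the trained model; the rules are a deliberately simple stand-in, and the field names (`visit_date`, `department`, `gender`), the accepted date formats, and the gender mapping are all assumptions.

```python
from datetime import datetime

# Assumed value-range mapping and accepted date layouts (illustrative only).
GENDER_MAP = {"m": "male", "male": "male", "1": "male",
              "f": "female", "female": "female", "2": "female"}
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y%m%d")


def fix_date(value):
    """Erroneous-format correction: rewrite a recognizable date as YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return value  # unrecognized values are left untouched


def repair_record(record, defaults):
    """Apply the three repair categories to one record and return a new dict."""
    repaired = dict(record)

    # 1. Erroneous-format correction.
    if repaired.get("visit_date"):
        repaired["visit_date"] = fix_date(repaired["visit_date"])

    # 2. Missing-data supplementation from caller-supplied defaults.
    for field, default in defaults.items():
        if repaired.get(field) in (None, ""):
            repaired[field] = default

    # 3. Value-range standardization.
    if repaired.get("gender") is not None:
        key = str(repaired["gender"]).strip().lower()
        repaired["gender"] = GENDER_MAP.get(key, repaired["gender"])
    return repaired
```

A learned model would replace the fixed mappings and defaults with values inferred from the historical data; the structure of the three passes is the only point being illustrated.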
According to the technical scheme, historical diagnosis and treatment data generated by a plurality of third-party platforms are obtained, wherein the historical diagnosis and treatment data are provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprise patient identification information. And further, dividing the historical diagnosis and treatment data into a plurality of training samples based on the patient identification information and the preset diagnosis and treatment sequence to obtain a target training sample set, wherein a plurality of diagnosis and treatment process data tables in each training sample have an association relation corresponding to the preset diagnosis and treatment sequence. Still further, training the preset data treatment model based on the target training sample set to obtain a target data treatment model, so that data restoration and data table association processing are carried out on the diagnosis and treatment data to be processed based on the target data treatment model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed, wherein the diagnosis and treatment data to be processed is current diagnosis and treatment data generated by a target third party platform. According to the diagnosis and treatment data processing method and device, high-quality diagnosis and treatment data can be obtained, and data treatment efficiency is improved.
Embodiment Two
Fig. 2 is a flowchart of a data governance method according to a second embodiment of the present invention, and details of how a trained target data governance model is applied to a target third party platform are described based on the foregoing embodiments, where technical terms that are the same as or corresponding to the foregoing embodiments are not repeated herein.
As shown in fig. 2, the method includes:
S210, obtaining historical diagnosis and treatment data generated by a plurality of third-party platforms.
S220, copying the historical diagnosis and treatment data meeting the preset conditions into a preset copying library, and performing data restoration processing on the historical diagnosis and treatment data in the preset copying library to obtain historical restoration diagnosis and treatment data.
S230, dividing the historical repair diagnosis and treatment data into a plurality of training samples based on the patient identification information and the preset diagnosis and treatment sequence, and obtaining a target training sample set.
S240, training the preset data treatment model based on the target training sample set to obtain a target data treatment model.
S250, applying the target data governance model to a target third party platform, and extracting diagnosis and treatment data to be processed from all diagnosis and treatment data of the target third party platform based on a preset data extraction rule.
The target third-party platform is any medical institution, such as a hospital, clinic, or sanatorium. All diagnosis and treatment data of the target third-party platform is provided by a plurality of target diagnosis and treatment data management systems. The diagnosis and treatment data to be processed is the diagnosis and treatment data that is to be processed by the target data governance model.
The preset data extraction rule is a predefined set of specific measures applied when extracting the diagnosis and treatment data to be processed from all the diagnosis and treatment data.
Optionally, the data extraction rule includes at least one of: configuring the name of a diagnosis and treatment process data table to be extracted, which corresponds to a target diagnosis and treatment data management system; extracting a time interval of data to be processed from a target third party platform; and when the extraction of the data to be processed from the target third-party platform fails, the adopted data extraction retry strategy.
Specifically, the trained target data governance model can be applied to any target third-party platform to govern the diagnosis and treatment data that the platform generates. Because not all of the diagnosis and treatment data generated by the target third-party platform is useful, in practice the diagnosis and treatment data that actually requires data governance must be extracted from all the diagnosis and treatment data. Extraction is performed according to the preset data extraction rule. The first aspect of the rule configures the names of the diagnosis and treatment process data tables to be extracted for the target diagnosis and treatment data management system; that is, the table names determine which diagnosis and treatment process data tables are extracted. The second aspect is the time interval for extracting the data to be processed from the target third-party platform, that is, how often data is extracted. The third aspect is the retry strategy used when extraction from the target third-party platform fails: how long to wait before re-extracting, and after how many failed retries the task is terminated and an error is reported.
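The three aspects of the extraction rule can be pictured as a small configuration object plus a retry loop. The sketch below is an illustration under assumptions: the class `ExtractionRule`, its field names, and the `fetch` callable standing in for the platform's data source are hypothetical and are not defined in the specification.

```python
import time
from dataclasses import dataclass


@dataclass
class ExtractionRule:
    table_names: list           # aspect 1: which process data tables to extract
    interval_seconds: float     # aspect 2: how often extraction runs (used by a scheduler)
    retry_delay_seconds: float  # aspect 3: wait time before re-extracting after a failure
    max_retries: int            # aspect 3: retries allowed before giving up


def extract_with_retry(fetch, rule, sleep=time.sleep):
    """Extract every configured table, retrying failed extractions per the rule.

    fetch: hypothetical callable taking a table name and returning its rows.
    Raises RuntimeError once a table has failed max_retries retries.
    """
    extracted = {}
    for name in rule.table_names:
        for attempt in range(rule.max_retries + 1):
            try:
                extracted[name] = fetch(name)
                break
            except Exception as exc:
                if attempt == rule.max_retries:
                    # Retries exhausted: end the task and report the error.
                    raise RuntimeError(
                        f"extraction of {name!r} failed after {rule.max_retries} retries"
                    ) from exc
                sleep(rule.retry_delay_seconds)
    return extracted
```

The `sleep` parameter is injectable so the retry behavior can be exercised without real delays; a scheduler (not shown) would call `extract_with_retry` once per `interval_seconds`.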
S260, determining a synchronous replication scheme corresponding to the diagnosis and treatment data to be processed based on a target diagnosis and treatment data management system corresponding to the diagnosis and treatment data to be processed.
The synchronous replication scheme is the set of specific measures adopted when copying the diagnosis and treatment data to be processed to a preset governance replication library.
In this embodiment, a synchronous replication scheme corresponding to the target diagnosis and treatment data management system may be configured in advance in the target data governance model. The synchronous replication scheme includes, but is not limited to, the scripts required for replication, the data source configuration of the replication library, the data storage scheme, and the replication tasks. Once the target diagnosis and treatment data management system has been determined, the synchronous replication scheme for the diagnosis and treatment data to be processed that corresponds to that system can be determined.
S270, copying the diagnosis and treatment data to be processed to the preset governance replication library based on the synchronous replication scheme, and displaying the data replication progress in real time.
The preset governance replication library is a pre-configured data storage unit. The data replication progress indicates how far the replication has advanced; for example, it is the ratio of the amount of diagnosis and treatment data to be processed that the replication task has already copied to the total amount of diagnosis and treatment data to be copied, and can be expressed as a percentage.
In this embodiment, the diagnosis and treatment data to be processed is copied to the preset governance replication library according to the synchronous replication scheme. During the synchronous replication, a data processing log is recorded and the data replication progress is displayed in sync on a target display device.
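The progress figure described above reduces to a simple ratio. The following sketch copies rows in batches and reports the percentage after each batch; the function names, the list-based replication library, and the `report` callback standing in for the display device are assumptions for illustration.

```python
def replication_progress(copied_rows, total_rows):
    """Return the replication progress as a percentage (one decimal place)."""
    if total_rows <= 0:
        return 100.0  # nothing to copy counts as complete
    return round(100.0 * copied_rows / total_rows, 1)


def copy_in_batches(rows, target, batch_size, report):
    """Copy rows into target batch by batch, reporting progress after each batch.

    target: a list standing in for the preset governance replication library.
    report: hypothetical callback that receives the current percentage.
    """
    total = len(rows)
    copied = 0
    for start in range(0, total, batch_size):
        batch = rows[start:start + batch_size]
        target.extend(batch)
        copied += len(batch)
        report(replication_progress(copied, total))
```

In a real deployment `report` would also write to the data processing log mentioned in this embodiment.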
S280, performing data restoration and data table association processing on the diagnosis and treatment data to be processed based on the target data management model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed.
S290, performing data quality evaluation on the target high-quality diagnosis and treatment data from a plurality of evaluation dimensions to obtain a target data quality score corresponding to the target high-quality diagnosis and treatment data.
The target data quality score represents the quality level of the target high-quality diagnosis and treatment data.
Wherein the plurality of evaluation dimensions includes an integrity evaluation dimension, a consistency evaluation dimension, a timeliness evaluation dimension, a validity evaluation dimension, a uniqueness evaluation dimension, and an accuracy evaluation dimension.
In this embodiment, after the target high-quality diagnosis and treatment data is obtained, data quality evaluation is performed on the target high-quality diagnosis and treatment data from an integrity evaluation dimension, a consistency evaluation dimension, a timeliness evaluation dimension, a validity evaluation dimension, a uniqueness evaluation dimension and an accuracy evaluation dimension, so as to obtain a target data quality score corresponding to the target high-quality diagnosis and treatment data in each dimension.
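The specification names the six evaluation dimensions but does not say how each one is computed. As a hedged illustration, the sketch below shows one plausible way to score two of them, integrity and uniqueness, on a list of row dictionaries. The formulas, function names, and field names are assumptions; the remaining dimensions would need domain-specific rules that are not given here.

```python
def integrity_score(rows, required_fields):
    """Assumed integrity measure: share of required cells that are non-empty."""
    cells = [row.get(field) for row in rows for field in required_fields]
    if not cells:
        return 1.0
    filled = sum(1 for cell in cells if cell not in (None, ""))
    return filled / len(cells)


def uniqueness_score(rows, key_field):
    """Assumed uniqueness measure: share of distinct values in the key field."""
    if not rows:
        return 1.0
    keys = [row.get(key_field) for row in rows]
    return len(set(keys)) / len(keys)
```

Both functions return a value between 0 and 1, so scores from different dimensions can be reported side by side.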
According to the technical solution of this embodiment, after the target data governance model is obtained, it is applied to the target third-party platform, and the diagnosis and treatment data to be processed is extracted from all diagnosis and treatment data of the target third-party platform based on the preset data extraction rule. A synchronous replication scheme corresponding to the diagnosis and treatment data to be processed is then determined based on the corresponding target diagnosis and treatment data management system, so that the data to be processed is copied to the preset governance replication library based on that scheme and the data replication progress is displayed in real time. Because the data extraction rule extracts only the diagnosis and treatment data required for data governance, rather than subjecting all data of the target third-party platform to governance processing, data governance efficiency is improved. In this embodiment, after the target high-quality diagnosis and treatment data is obtained, its data quality is evaluated from a plurality of evaluation dimensions to obtain the corresponding target data quality score, thereby providing a quantitative evaluation of the target high-quality diagnosis and treatment data.
Embodiment Three
Fig. 3 is a schematic structural diagram of a data management device according to a third embodiment of the present invention. As shown in fig. 3, the apparatus includes: a historical data acquisition module 310, a historical data replication module 320, a training sample set determination module 330, a governance model training module 340, and a diagnosis and treatment data governance module 350.
The historical data acquisition module 310 is configured to obtain historical diagnosis and treatment data generated by a plurality of third-party platforms; the historical diagnosis and treatment data are provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprise patient identification information;
The historical data replication module 320 is configured to copy the historical diagnosis and treatment data meeting the preset conditions into a preset copying library, and perform data restoration processing on the historical diagnosis and treatment data in the preset copying library to obtain historical repair diagnosis and treatment data;
The training sample set determination module 330 is configured to divide the historical repair diagnosis and treatment data into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence, so as to obtain a target training sample set; wherein a plurality of diagnosis and treatment process data tables in each training sample have association relations corresponding to the preset diagnosis and treatment sequence;
The governance model training module 340 is configured to train the preset data governance model based on the target training sample set to obtain a target data governance model;
The diagnosis and treatment data governance module 350 is configured to perform data restoration and data table association processing on diagnosis and treatment data to be processed based on the target data governance model, so as to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed; the diagnosis and treatment data to be processed are current diagnosis and treatment data generated by a target third-party platform.
According to the technical scheme, historical diagnosis and treatment data generated by a plurality of third-party platforms are obtained, wherein the historical diagnosis and treatment data are provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprise patient identification information. And further, dividing the historical diagnosis and treatment data into a plurality of training samples based on the patient identification information and the preset diagnosis and treatment sequence to obtain a target training sample set, wherein a plurality of diagnosis and treatment process data tables in each training sample have an association relation corresponding to the preset diagnosis and treatment sequence. Still further, training the preset data treatment model based on the target training sample set to obtain a target data treatment model, so that data restoration and data table association processing are carried out on the diagnosis and treatment data to be processed based on the target data treatment model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed, wherein the diagnosis and treatment data to be processed is current diagnosis and treatment data generated by a target third party platform. According to the diagnosis and treatment data processing method and device, high-quality diagnosis and treatment data can be obtained, and data treatment efficiency is improved.
Optionally, the historical data replication module 320 includes: the data determining unit is used for evaluating the data quality of the historical diagnosis and treatment data from a plurality of evaluation dimensions to obtain a data quality score corresponding to the historical diagnosis and treatment data, and the historical diagnosis and treatment data with the data quality score being greater than a preset threshold value is the historical diagnosis and treatment data meeting the preset condition.
Optionally, the plurality of evaluation dimensions includes an integrity evaluation dimension, a consistency evaluation dimension, a timeliness evaluation dimension, a validity evaluation dimension, a uniqueness evaluation dimension, and an accuracy evaluation dimension; the data determining unit is configured to determine, for each evaluation dimension, a to-be-applied data quality score corresponding to the historical diagnosis and treatment data, and to determine the data quality score corresponding to the historical diagnosis and treatment data based on a weighted sum of the to-be-applied data quality scores.
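The weighted sum and the threshold test can be written out in a few lines. In the sketch below, the weights in `WEIGHTS` and the default threshold of 0.8 are placeholder values chosen for illustration; the specification does not give concrete weights or a concrete threshold, and the function names are likewise assumptions.

```python
# Placeholder weights for the six evaluation dimensions (assumed; they sum to 1).
WEIGHTS = {
    "integrity": 0.25,
    "consistency": 0.15,
    "timeliness": 0.15,
    "validity": 0.15,
    "uniqueness": 0.15,
    "accuracy": 0.15,
}


def overall_quality_score(dimension_scores, weights=WEIGHTS):
    """Weighted sum of the per-dimension (to-be-applied) data quality scores."""
    return sum(weights[dim] * dimension_scores[dim] for dim in weights)


def meets_preset_condition(dimension_scores, threshold=0.8):
    """True when the overall score is greater than the preset threshold."""
    return overall_quality_score(dimension_scores) > threshold
```

Historical diagnosis and treatment data for which `meets_preset_condition` returns `True` would be the data copied into the preset copying library.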
Optionally, the training sample set determining module 330 includes:
the sample group dividing unit is used for dividing the historical repair diagnosis and treatment data in the preset replication library into a plurality of historical diagnosis and treatment sample groups based on the patient identification information;
the data table association unit is used for sequentially associating the diagnosis and treatment process data tables in the history diagnosis and treatment sample groups based on a preset diagnosis and treatment sequence to obtain target history diagnosis and treatment sample groups corresponding to the history diagnosis and treatment sample groups;
the sample set determining unit is used for forming a target training sample set based on each target historical diagnosis and treatment sample group.
Optionally, the data management device further includes a to-be-processed data processing module, which includes:
the to-be-processed data extraction unit is used for applying the target data treatment model to a target third party platform and extracting to-be-processed diagnosis and treatment data from all diagnosis and treatment data of the target third party platform based on a preset data extraction rule; wherein the overall diagnosis and treatment data is provided by a plurality of target diagnosis and treatment data management systems;
the data replication scheme determining unit is used for determining a synchronous replication scheme corresponding to the diagnosis and treatment data to be processed based on a target diagnosis and treatment data management system corresponding to the diagnosis and treatment data to be processed;
the data synchronous copying unit is used for copying the diagnosis and treatment data to be processed to a preset treatment copying library based on a synchronous copying scheme and displaying the data copying progress in real time.
Optionally, the diagnosis and treatment data management module 350 is specifically configured to: and performing error format correction and repair processing, missing data supplement and repair processing, value range standardization and repair processing and data table association processing on the diagnosis and treatment data to be processed based on the target data management model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed.
Optionally, the data management device further includes a target data quality evaluation module, specifically configured to perform data quality evaluation on the target high-quality diagnosis and treatment data from multiple evaluation dimensions, so as to obtain a target data quality score corresponding to the target high-quality diagnosis and treatment data.
The data management device provided by the embodiment of the invention can execute the data management method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Embodiment Four
Fig. 4 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 4, the electronic device 10 includes at least one processor 11, and a memory, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, etc., communicatively connected to the at least one processor 11, in which the memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from the storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The processor 11 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the various methods and processes described above, such as the data governance method.
In some embodiments, the data governance method may be implemented as a computer program tangibly embodied on a computer readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more of the steps of the data governance method described above may be carried out. Alternatively, in other embodiments, processor 11 may be configured to perform the data governance method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of data management comprising:
acquiring historical diagnosis and treatment data generated by a plurality of third-party platforms; the historical diagnosis and treatment data are provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprise patient identification information;
copying the historical diagnosis and treatment data meeting the preset conditions into a preset copying library, and performing data restoration processing on the historical diagnosis and treatment data in the preset copying library to obtain historical restoration diagnosis and treatment data;
dividing the historical repair diagnosis and treatment data into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set; wherein a plurality of diagnosis and treatment process data tables in each training sample have association relations corresponding to the preset diagnosis and treatment sequence;
training a preset data management model based on the target training sample set to obtain a target data management model;
performing data restoration and data table association processing on the diagnosis and treatment data to be processed based on the target data management model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed; the diagnosis and treatment data to be processed are current diagnosis and treatment data generated by a target third-party platform.
2. The method according to claim 1, wherein the preset conditions are determined by:
performing data quality evaluation on the historical diagnosis and treatment data from a plurality of evaluation dimensions to obtain a data quality score corresponding to the historical diagnosis and treatment data, the preset conditions being met by historical diagnosis and treatment data whose data quality score is greater than a preset threshold value.
3. The method of claim 2, wherein the plurality of evaluation dimensions includes an integrity evaluation dimension, a consistency evaluation dimension, a timeliness evaluation dimension, a validity evaluation dimension, a uniqueness evaluation dimension, and an accuracy evaluation dimension; and performing data quality evaluation on the historical diagnosis and treatment data from the plurality of evaluation dimensions to obtain the data quality score corresponding to the historical diagnosis and treatment data includes:
determining a data quality score to be applied corresponding to the historical diagnosis and treatment data under each evaluation dimension;
and determining the data quality score corresponding to the historical diagnosis and treatment data based on a weighted sum of the data quality scores to be applied.
4. The method of claim 1, wherein dividing the historical repair diagnosis and treatment data into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set comprises:
dividing the historical repair diagnosis and treatment data in the preset copy library into a plurality of historical diagnosis and treatment sample groups based on the patient identification information;
performing sequential association processing on the diagnosis and treatment process data tables in each historical diagnosis and treatment sample group based on the preset diagnosis and treatment sequence to obtain a target historical diagnosis and treatment sample group corresponding to each historical diagnosis and treatment sample group;
and forming the target training sample set based on the target historical diagnosis and treatment sample groups.
5. The method according to claim 1, further comprising, before performing data restoration and data table association processing on the diagnosis and treatment data to be processed based on the target data management model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed:
applying the target data management model to the target third-party platform, and extracting the diagnosis and treatment data to be processed from all diagnosis and treatment data of the target third-party platform based on a preset data extraction rule; wherein all the diagnosis and treatment data are provided by a plurality of target diagnosis and treatment data management systems;
determining a synchronous replication scheme corresponding to the diagnosis and treatment data to be processed based on the target diagnosis and treatment data management system corresponding to the diagnosis and treatment data to be processed;
and copying, based on the synchronous replication scheme, the diagnosis and treatment data to be processed to a preset treatment replication library, and displaying the data replication progress in real time.
6. The method of claim 5, wherein the data extraction rules comprise at least one of:
configuring the name of a diagnosis and treatment process data table to be extracted corresponding to the target diagnosis and treatment data management system;
extracting a time interval of data to be processed from the target third party platform;
and when the extraction of the data to be processed from the target third-party platform fails, adopting a data extraction retry strategy.
7. The method according to claim 1, wherein performing data restoration and data table association processing on the diagnosis and treatment data to be processed based on the target data management model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed includes:
performing error format correction and repair processing, missing data supplement and repair processing, value range standardization and repair processing, and data table association processing on the diagnosis and treatment data to be processed based on the target data management model to obtain the target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed.
8. The method according to claim 7, further comprising, after performing data restoration and data table association processing on the diagnosis and treatment data to be processed based on the target data management model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed:
performing data quality evaluation on the target high-quality diagnosis and treatment data from a plurality of evaluation dimensions to obtain a target data quality score corresponding to the target high-quality diagnosis and treatment data.
9. A data management device, comprising:
a historical data acquisition module, used for acquiring historical diagnosis and treatment data generated by a plurality of third-party platforms; the historical diagnosis and treatment data are provided by a plurality of diagnosis and treatment data management systems, each diagnosis and treatment data management system provides a plurality of diagnosis and treatment process data tables, and the historical diagnosis and treatment data comprise patient identification information;
a historical data copying module, used for copying the historical diagnosis and treatment data meeting the preset conditions into a preset copying library, and performing data restoration processing on the historical diagnosis and treatment data in the preset copying library to obtain historical repair diagnosis and treatment data;
a training sample set determining module, used for dividing the historical repair diagnosis and treatment data into a plurality of training samples based on the patient identification information and a preset diagnosis and treatment sequence to obtain a target training sample set; wherein a plurality of diagnosis and treatment process data tables in each training sample have association relations corresponding to the preset diagnosis and treatment sequence;
a management model training module, used for training a preset data management model based on the target training sample set to obtain a target data management model;
a diagnosis and treatment data management module, used for performing data restoration and data table association processing on diagnosis and treatment data to be processed based on the target data management model to obtain target high-quality diagnosis and treatment data corresponding to the diagnosis and treatment data to be processed; the diagnosis and treatment data to be processed are current diagnosis and treatment data generated by a target third-party platform.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data management method of any of claims 1 to 8.
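For illustration, the sample-division step recited in claim 4 (grouping by patient identification information, then associating the process data tables in the preset diagnosis and treatment sequence) might be sketched as below. The table names in `PRESET_SEQUENCE`, the record layout with `patient_id` and `table` keys, and the function name are hypothetical; the source does not define them.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hypothetical preset diagnosis and treatment sequence; the table names are
# assumptions for this sketch and are not taken from the source.
PRESET_SEQUENCE = ["registration", "consultation", "examination",
                   "prescription", "payment"]


def build_training_samples(records: List[Dict[str, Any]],
                           sequence: List[str] = PRESET_SEQUENCE
                           ) -> List[Dict[str, Any]]:
    """Group records by patient, then order and link their tables by sequence."""
    order = {name: index for index, name in enumerate(sequence)}

    # Step 1: divide the repaired records into groups by patient identifier,
    # skipping tables that are not part of the preset sequence.
    groups: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for record in records:
        if record["table"] in order:
            groups[record["patient_id"]].append(record)

    # Step 2: within each group, sort tables by the preset sequence and record
    # an association between each table and the next one in that order.
    samples = []
    for patient_id, group in groups.items():
        ordered = sorted(group, key=lambda r: order[r["table"]])
        links = [(a["table"], b["table"]) for a, b in zip(ordered, ordered[1:])]
        samples.append({"patient_id": patient_id,
                        "tables": ordered,
                        "links": links})
    return samples
```

Each returned sample would correspond to one target historical diagnosis and treatment sample group, and the list as a whole to the target training sample set.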
CN202311256617.5A 2023-09-26 2023-09-26 Data management method, device and equipment Pending CN117271492A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311256617.5A CN117271492A (en) 2023-09-26 2023-09-26 Data management method, device and equipment

Publications (1)

Publication Number Publication Date
CN117271492A (en) 2023-12-22

Family

ID=89217378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311256617.5A Pending CN117271492A (en) 2023-09-26 2023-09-26 Data management method, device and equipment

Country Status (1)

Country Link
CN (1) CN117271492A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination