CN108198595A - Multi-source heterogeneous unstructured medical record data fusion method - Google Patents
Multi-source heterogeneous unstructured medical record data fusion method
- Publication number
- CN108198595A
- Authority
- CN
- China
- Prior art keywords
- data
- class
- medical record
- tables
- record data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
- G06F16/328—Management therefor
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a multi-source heterogeneous unstructured medical record data fusion method. The method is based on the following design. A table virtual class is established based on the data tables of a medical record data platform; the table virtual class includes a concrete method for generating SQL statements. Table classes whose attributes correspond one-to-one with the attributes of the data tables in the medical record data platform are established from the table virtual class. A data control virtual class is established that holds, as attributes, instances of the table classes corresponding to the data tables in the medical record data platform; the data control virtual class includes a virtual conversion method, which converts data into objects of the table classes. The concrete SQL statement generation method is called, the objects of the table classes are traversed by reflection, the attributes of the data are converted into SQL statements, and the data are stored into the data tables of the medical record data platform. The invention achieves medical record data fusion without affecting the structure or stability of the original system, improves operational safety and fusion efficiency, and effectively reduces the error rate.
Description
Technical field
The present invention relates to the technical field of electronic medical records, and in particular to a multi-source heterogeneous unstructured medical record data fusion method.
Background
With the development of medical care in China, the degree of medical informatization has improved, but the storage structures of the HIS (Hospital Information System) used by different hospitals are not consistent. For scientific research, the text data exported from the HIS databases of different hospitals have no structure at all. How to integrate the unstructured data from multiple data sources, remove private information, and form structured data is an important topic in Chinese electronic medical record research.
To date, China has no such shared medical data, and the integration of unstructured data from multiple data sources with different structures is still in its infancy.
As shown in Fig. 1, the medical record data come from the medical record text of multiple different hospitals. In studying the fusion of multi-source heterogeneous unstructured medical record data, the text medical record data show the following characteristics:
(1) The medical records are text, but the text formats are not uniform: there are Word files and txt files, all of them unstructured data.
(2) The volume of text is large, nearly 100 MB.
(3) The medical record files of the hospitals are not similar to one another in content or structure. The medical record data come from different places, the HIS used by each hospital differs, and the manual export operations from each HIS also differ. As a result, the formats of the medical record data are inconsistent.
(4) The data structures of hospitals in different regions are largely inconsistent, but the file formats within the same region are largely consistent.
Existing approaches to fusing such unstructured medical record data suffer from problems such as affecting the stability of the original system, complicated operation, low efficiency, a high error rate, and the need for secondary correction.
Summary of the invention
To solve the above problems, the present invention provides a multi-source heterogeneous unstructured medical record data fusion method. On the basis of the existing format of the data platform, new data are structured into a form that matches the data platform and are then added to it. This does not affect the operation or stability of the original data platform or system, and it avoids or reduces further machine or manual correction.
A multi-source heterogeneous unstructured medical record data fusion method includes the following steps:
(a) establishing a table virtual class based on the data tables of a medical record data platform, the table virtual class including a concrete method for generating SQL statements;
(b) establishing, from the table virtual class, table classes whose attributes correspond one-to-one with the attributes of the data tables in the medical record data platform;
(c) establishing a data control virtual class that holds, as attributes, instances of the table classes corresponding to the data tables in the medical record data platform, the data control virtual class including a virtual conversion method that converts data into objects of the table classes;
(d) calling the concrete SQL statement generation method, traversing the objects of the table classes by reflection, converting the attributes of the data into SQL statements, and storing the data into the data tables of the medical record data platform.
Preferably, the data tables of the medical record data platform include the patient's personal information, the patient's medical history information, the basic information of the case, and progress note information.
Preferably, the attributes of the data are Word or plain text.
A multi-source heterogeneous unstructured medical record data fusion method includes the following modules:
(a) a table virtual class module, including a concrete method for generating SQL statements;
(b) a table class module, which obtains, from the table virtual class module, table classes whose attributes correspond one-to-one with the attributes of the data tables in the medical record data platform;
(c) a data control virtual class module, including instances of the table classes corresponding to the data tables in the medical record data platform and a virtual conversion method, the virtual conversion method converting data into objects of the table classes;
the table class module calls the concrete SQL statement generation method, traverses the objects of the table classes by reflection, converts the attributes of the data into SQL statements, and stores the data into the data tables of the medical record data platform.
Preferably, the data tables of the medical record data platform include the patient's personal information, the patient's medical history information, the basic information of the case, and progress note information.
Preferably, the attributes of the data are Word or plain text.
The multi-source heterogeneous unstructured medical record data fusion method provided by the present invention structures the input data, so that data matching the attributes of the data platform are obtained and then entered into the data platform. Data fusion is thereby achieved without affecting the structure or stability of the original system, operational safety and fusion efficiency are improved, and the error rate is effectively reduced.
Brief description of the drawings
Fig. 1 shows the processing flow for multi-source heterogeneous medical record text data.
Fig. 2 is a flowchart of the multi-source heterogeneous unstructured medical record data fusion method provided by the present invention.
Detailed description
To help those skilled in the art better understand the technical solution of the present invention, the multi-source heterogeneous unstructured medical record data fusion method provided by the present invention is described in detail below with reference to the accompanying drawings.
Embodiment one
The format of all medical record text files is converted: files in formats such as Word are dumped into txt format.
Based on the content of the medical record data, an electronic medical record data platform covering all medical record content is designed. The system structure of the platform should reflect a reasonable structure for medical record data rather than depend on the data content of any one kind of medical record file, and the dependencies among the relational tables of the platform should be simple and clear. The designed platform should contain the patient's personal information, the patient's medical history information, the basic information of the case, progress note information, and so on.
A table virtual class A for the medical record data platform is implemented. In this embodiment, class A contains a concrete method B for generating SQL statements, which converts the object's own attributes into an SQL statement for the data platform. This embodiment uses C# (CSharp) reflection, so the statement is generated directly by traversing the attributes of the object.
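The idea of class A and method B can be sketched as follows. This is a minimal illustration in Python rather than the C# used in the embodiment: `dataclasses.fields` stands in for C# reflection, and the names `TableBase`, `to_insert_sql`, `table_name`, and the `Patient` columns are assumptions made for the example, not names from the patent. The sketch emits a parameterized statement instead of inlining values, which is a choice of this example.

```python
from dataclasses import dataclass, fields


@dataclass
class TableBase:
    """Plays the role of virtual class A (hypothetical name)."""

    # Assumed convention: each subclass names the table it maps to.
    table_name = "unnamed"

    def to_insert_sql(self):
        """Plays the role of method B: enumerate this object's attributes
        at run time and build a parameterized INSERT statement."""
        names = [f.name for f in fields(self)]
        columns = ", ".join(names)
        placeholders = ", ".join("?" for _ in names)
        sql = f"INSERT INTO {self.table_name} ({columns}) VALUES ({placeholders})"
        params = tuple(getattr(self, name) for name in names)
        return sql, params


@dataclass
class Patient(TableBase):
    """Illustrative subclass; the column names are made up."""

    table_name = "patient"
    patient_id: int
    name: str


sql, params = Patient(1, "P-001").to_insert_sql()
print(sql)
print(params)
```

Because the statement is derived from the attribute list at run time, a subclass only declares its attributes and never writes SQL by hand.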
For each table in the medical record data platform, a table class C is implemented based on virtual class A; the attributes of table class C correspond one-to-one with the attributes of the table in the platform. As a result, every entity object in the medical record data platform can automatically generate the SQL statement that inserts it into the platform.
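A small sketch of the one-to-one mapping between table classes and platform tables, again in Python with an in-memory SQLite database standing in for the data platform. The table names, columns, and sample rows are hypothetical; the patent does not specify a schema or a database engine.

```python
import sqlite3
from dataclasses import dataclass, fields


class TableBase:
    """Role of class A, with method B reduced to a few lines."""
    table_name = ""

    def to_insert_sql(self):
        names = [f.name for f in fields(self)]
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            self.table_name, ", ".join(names), ", ".join("?" * len(names)))
        return sql, tuple(getattr(self, n) for n in names)


@dataclass
class Patient(TableBase):           # one attribute per column of `patient`
    table_name = "patient"
    patient_id: int
    name: str
    gender: str


@dataclass
class ProgressNote(TableBase):      # one attribute per column of `progress_note`
    table_name = "progress_note"
    note_id: int
    patient_id: int
    note_text: str


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patient "
             "(patient_id INTEGER PRIMARY KEY, name TEXT, gender TEXT)")
conn.execute("CREATE TABLE progress_note (note_id INTEGER PRIMARY KEY, "
             "patient_id INTEGER REFERENCES patient(patient_id), note_text TEXT)")

# Every table-class object produces its own INSERT statement.
for obj in (Patient(1, "P-001", "F"), ProgressNote(10, 1, "Day 1: stable.")):
    conn.execute(*obj.to_insert_sql())

rows = conn.execute(
    "SELECT name, note_text FROM patient JOIN progress_note USING (patient_id)"
).fetchall()
print(rows)
```

Adding a table to the platform then amounts to adding one more class whose attributes mirror the new table's columns.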
A data control virtual class D is designed that holds, as attributes, instances of the table class C corresponding to each table in the medical record data platform. A virtual conversion method E is defined in data control virtual class D; this method converts text data into objects of the table classes C.
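The contract of class D and method E might look like the sketch below, where Python's `abc` module plays the part of a virtual (abstract) class. `DataControl`, `convert`, `patients`, and `EchoSource` are assumed names used only for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Patient:                      # stand-in for one table class C
    patient_id: int
    name: str


@dataclass
class DataControl(ABC):
    """Role of virtual class D: holds table-class instances as attributes."""
    patients: list = field(default_factory=list)

    @abstractmethod
    def convert(self, text):
        """Role of virtual method E: fill the attributes from raw text."""


@dataclass
class EchoSource(DataControl):
    """Trivial subclass, present only to show the contract being fulfilled."""

    def convert(self, text):
        self.patients.append(Patient(1, text.strip()))


# The virtual class itself cannot be instantiated; only implementations can.
try:
    DataControl()
    instantiable = True
except TypeError:
    instantiable = False

source = EchoSource()
source.convert("  P-001  ")
print(instantiable, source.patients)
```

The point of the abstract method is that code written against `DataControl` never needs to know which source-specific parser it is running.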
The control module of the system reads the input file. According to its parameters, it calls the concrete implementation of conversion method E of data control virtual class D to generate the objects of the table classes C, and then calls the concrete SQL generation method B of each table class C, thereby storing each piece of data into the data platform.
At this point the basic functionality is essentially complete, and this part of the system is not modified again.
When unstructured medical record data from a given region need to be integrated, virtual class D is implemented as a case data class F according to the characteristics of the data source; that is, the concrete conversion method E that generates each object is implemented. Running the system then merges the current unstructured medical record data into the tables of the medical record data platform.
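As an illustration of a case data class F, the sketch below parses one made-up export layout with a regular expression. The layout (`ID: ... | Name: ... | Note: ...`), the class name `RegionASource`, and the two table classes are all assumptions; real exports would need their own parsing rules.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Patient:                      # stand-in for a table class C
    patient_id: int
    name: str


@dataclass
class ProgressNote:                 # stand-in for a second table class C
    patient_id: int
    note_text: str


class DataControl(ABC):             # role of virtual class D
    def __init__(self):
        self.patients = []
        self.notes = []

    @abstractmethod
    def convert(self, text):        # role of virtual method E
        ...


class RegionASource(DataControl):   # role of case data class F
    """Parses a hypothetical layout: 'ID: 7 | Name: P-007 | Note: ...'."""
    LINE = re.compile(r"ID:\s*(\d+)\s*\|\s*Name:\s*(.+?)\s*\|\s*Note:\s*(.+)")

    def convert(self, text):
        for line in text.splitlines():
            m = self.LINE.match(line.strip())
            if not m:
                continue            # skip lines that do not fit the layout
            pid = int(m.group(1))
            self.patients.append(Patient(pid, m.group(2)))
            self.notes.append(ProgressNote(pid, m.group(3).strip()))


source = RegionASource()
source.convert("ID: 7 | Name: P-007 | Note: Admitted with fever.\n"
               "free text that does not match the layout")
print(source.patients, source.notes)
```

All knowledge of the regional layout lives inside `RegionASource.convert`; the table classes and the base class stay untouched.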
The control flow of the system is essentially as follows. The concrete conversion method E of case data class F is called with the data as input, and method E generates the objects of the table classes C corresponding to the tables in the medical record data platform. Following the primary key dependencies among the tables in the platform, the SQL generation method B of each table class C is called in turn, and the returned SQL statement is executed. Although the control module is written against data control virtual class D, what is actually executed, depending on the medical record data source and the parameters supplied, is the concrete conversion method E of the most recently added case data class F.
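The control flow described above can be sketched end to end under a few assumptions: SQLite stands in for the data platform, a hypothetical `load_order` attribute encodes the primary key dependency (parents first), and, for brevity, `convert` returns the generated objects as a list rather than storing them as attributes.

```python
import sqlite3
from dataclasses import dataclass, fields


class TableBase:
    table_name = ""
    load_order = 0                      # assumed: lower values are inserted first

    def to_insert_sql(self):
        names = [f.name for f in fields(self)]
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            self.table_name, ", ".join(names), ", ".join("?" * len(names)))
        return sql, tuple(getattr(self, n) for n in names)


@dataclass
class Patient(TableBase):
    table_name = "patient"
    load_order = 0
    patient_id: int
    name: str


@dataclass
class ProgressNote(TableBase):
    table_name = "progress_note"
    load_order = 1                      # depends on patient.patient_id
    note_id: int
    patient_id: int
    note_text: str


class DemoSource:
    """Stand-in for a case data class F; emits the child row first on purpose."""

    def convert(self, text):
        return [ProgressNote(1, 1, text), Patient(1, "P-001")]


def run(conn, source, text):
    """Convert, order by key dependency, generate SQL, execute."""
    objects = sorted(source.convert(text), key=lambda o: o.load_order)
    for obj in objects:
        conn.execute(*obj.to_insert_sql())
    conn.commit()
    return [type(o).__name__ for o in objects]


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE patient (patient_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE progress_note (note_id INTEGER PRIMARY KEY, "
             "patient_id INTEGER NOT NULL REFERENCES patient(patient_id), "
             "note_text TEXT)")
order = run(conn, DemoSource(), "Day 1: stable.")
print(order)
```

With foreign keys enforced, inserting the progress note before its patient would fail, which is why the objects are sorted by dependency before their statements are executed.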
As shown in Fig. 2:
Table 1, table 2, etc. are the tables designed for the medical record data platform;
Table class 1, table class 2, etc. are the table classes C corresponding to the tables of the medical record data platform; they are derived from the base virtual class A;
Object 1, object 2, etc. are objects created from table class 1, table class 2, etc.; by calling their own concrete SQL generation method B, they generate the statements that store their attribute data into the corresponding tables of the medical record data platform;
Data source 1, data source 2, etc. are unstructured medical record data with different structures from different regions;
The data control virtual class defines the virtual conversion method E, which turns the data in a data source into data objects; it is this virtual class that the control module calls;
Data class 1, data class 2, etc. are subclasses of data control virtual class D, added according to the medical record characteristics of data source 1, data source 2, etc. Each contains the objects of the table classes and the method for generating those objects.
The control module is the basic process of the system. It calls the virtual method E of the data control virtual class (in effect, the conversion method E of case data class F), takes the text medical record data as input, generates object 1, object 2, etc. corresponding to the data in the data source, and stores them into the data platform as structured data.
With this design, after a new text medical record data source n is obtained, the work consists of writing, according to the content of data source n, a new entity class n that implements virtual method E of data control virtual class D. Passing the data source file and the new entity class n to the control module as parameters integrates the data of the new source n, with no effect on data sources that have already been structured or on the normal operation of the existing system. This meets the requirements of efficiency and safety.
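The extension step for a new source n can be illustrated with a deliberately small sketch. The class names, the `control` function, and both text layouts are hypothetical, and rows are represented as plain `(table, dict)` pairs instead of table-class objects to keep the example short.

```python
from abc import ABC, abstractmethod


class DataControl(ABC):                 # role of virtual class D
    @abstractmethod
    def convert(self, text):            # role of virtual method E
        """Return a list of (table, row-dict) pairs parsed from raw text."""


class RegionASource(DataControl):       # an existing, already integrated source
    def convert(self, text):
        # Assumed layout: one patient name per line.
        return [("patient", {"name": line})
                for line in text.splitlines() if line]


def control(source_cls, text):
    """Control module: unchanged no matter how many sources exist."""
    return source_cls().convert(text)


# Integrating source n means adding one class; nothing above is edited.
class RegionNSource(DataControl):
    def convert(self, text):
        # Assumed layout: semicolon-separated names on a single line.
        return [("patient", {"name": n.strip()})
                for n in text.split(";") if n.strip()]


rows_a = control(RegionASource, "P-001\nP-002")
rows_n = control(RegionNSource, "P-101; P-102")
print(rows_a)
print(rows_n)
```

Since `control` receives the source class as a parameter, the already integrated `RegionASource` keeps producing the same rows after `RegionNSource` is added.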
The data of different regions are independent of one another. Unstructured data with new content and different structures from new data sources are accommodated by adding code, rather than by modifying the original system in a way that would affect existing content. Data fusion is thus achieved without affecting the structure or stability of the original system, operational safety and fusion efficiency are improved, and the error rate is effectively reduced.
It should be understood that the above embodiments are merely exemplary embodiments used to illustrate the principle of the present invention, and the present invention is not limited to them. Those of ordinary skill in the art can make various changes and improvements without departing from the spirit and essence of the present invention, and such changes and improvements are also regarded as falling within the protection scope of the present invention.
Claims (6)
1. A multi-source heterogeneous unstructured medical record data fusion method, characterized by including the following steps:
(a) establishing a table virtual class based on the data tables of a medical record data platform, the table virtual class including a concrete method for generating SQL statements;
(b) establishing, from the table virtual class, table classes whose attributes correspond one-to-one with the attributes of the data tables in the medical record data platform;
(c) establishing a data control virtual class that holds, as attributes, instances of the table classes corresponding to the data tables in the medical record data platform, the data control virtual class including a virtual conversion method that converts data into objects of the table classes;
(d) calling the concrete SQL statement generation method, traversing the objects of the table classes by reflection, converting the attributes of the data into SQL statements of the data platform, and storing the data into the data tables of the medical record data platform.
2. The multi-source heterogeneous unstructured medical record data fusion method according to claim 1, characterized in that the data tables of the medical record data platform include the patient's personal information, the patient's medical history information, the basic information of the case, and progress note information.
3. The multi-source heterogeneous unstructured medical record data fusion method according to claim 1 or 2, characterized in that the attributes of the data are Word or plain text.
4. A multi-source heterogeneous unstructured medical record data fusion method, characterized by including the following modules:
(a) a table virtual class module, including a concrete method for generating SQL statements;
(b) a table class module, which obtains, from the table virtual class module, table classes whose attributes correspond one-to-one with the attributes of the data tables in the medical record data platform;
(c) a data control virtual class module, including instances of the table classes corresponding to the data tables in the medical record data platform and a virtual conversion method, the virtual conversion method converting data into objects of the table classes;
the table class module calls the concrete SQL statement generation method, traverses the objects of the table classes by reflection, converts the attributes of the data into SQL statements, and stores the data into the data tables of the medical record data platform.
5. The multi-source heterogeneous unstructured medical record data fusion method according to claim 4, characterized in that the data tables of the medical record data platform include the patient's personal information, the patient's medical history information, the basic information of the case, and progress note information.
6. The multi-source heterogeneous unstructured medical record data fusion method according to claim 4 or 5, characterized in that the attributes of the data are Word or plain text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810047069.8A CN108198595B (en) | 2018-01-18 | 2018-01-18 | Multi-source heterogeneous unstructured medical record data fusion method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810047069.8A CN108198595B (en) | 2018-01-18 | 2018-01-18 | Multi-source heterogeneous unstructured medical record data fusion method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108198595A (en) | 2018-06-22 |
CN108198595B (en) | 2022-05-03 |
Family
ID=62590153
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810047069.8A Active CN108198595B (en) | 2018-01-18 | 2018-01-18 | Multi-source heterogeneous unstructured medical record data fusion method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108198595B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109448846A (en) * | 2018-09-07 | 2019-03-08 | 北京大学 | A kind of analysis method for calculating rare sick disease incidence based on medical insurance big data |
CN111177156A (en) * | 2019-12-31 | 2020-05-19 | 广东科学技术职业学院 | Big data storage method and system |
CN111177506A (en) * | 2019-12-31 | 2020-05-19 | 广东科学技术职业学院 | Classification storage method and system based on big data |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103226478A (en) * | 2013-05-22 | 2013-07-31 | 北京金和软件股份有限公司 | Method for automatically generating and using code |
US20160021181A1 (en) * | 2013-07-23 | 2016-01-21 | George Ianakiev | Data fusion and exchange hub - architecture, system and method |
CN107066499A (en) * | 2016-12-30 | 2017-08-18 | 江苏瑞中数据股份有限公司 | The data query method of multi-source data management and visualization system is stored towards isomery |
CN107193858A (en) * | 2017-03-28 | 2017-09-22 | 福州金瑞迪软件技术有限公司 | Towards the intelligent Service application platform and method of multi-source heterogeneous data fusion |
CN107402976A (en) * | 2017-07-03 | 2017-11-28 | 国网山东省电力公司经济技术研究院 | Power grid multi-source data fusion method and system based on multi-element heterogeneous model |
-
2018
- 2018-01-18 CN CN201810047069.8A patent/CN108198595B/en active Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103226478A (en) * | 2013-05-22 | 2013-07-31 | 北京金和软件股份有限公司 | Method for automatically generating and using code |
US20160021181A1 (en) * | 2013-07-23 | 2016-01-21 | George Ianakiev | Data fusion and exchange hub - architecture, system and method |
CN107066499A (en) * | 2016-12-30 | 2017-08-18 | 江苏瑞中数据股份有限公司 | The data query method of multi-source data management and visualization system is stored towards isomery |
CN107193858A (en) * | 2017-03-28 | 2017-09-22 | 福州金瑞迪软件技术有限公司 | Towards the intelligent Service application platform and method of multi-source heterogeneous data fusion |
CN107402976A (en) * | 2017-07-03 | 2017-11-28 | 国网山东省电力公司经济技术研究院 | Power grid multi-source data fusion method and system based on multi-element heterogeneous model |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109448846A (en) * | 2018-09-07 | 2019-03-08 | 北京大学 | A kind of analysis method for calculating rare sick disease incidence based on medical insurance big data |
CN111177156A (en) * | 2019-12-31 | 2020-05-19 | 广东科学技术职业学院 | Big data storage method and system |
CN111177506A (en) * | 2019-12-31 | 2020-05-19 | 广东科学技术职业学院 | Classification storage method and system based on big data |
CN111177156B (en) * | 2019-12-31 | 2023-10-03 | 广东科学技术职业学院 | Big data storage method and system |
Also Published As
Publication number | Publication date |
---|---|
CN108198595B (en) | 2022-05-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104123374B (en) | The method and device of aggregate query in distributed data base | |
CN108198595A (en) | A kind of multi-source heterogeneous unstructured medical record data fusion method | |
US20040122661A1 (en) | Method, system, and computer program product for storing, managing and using knowledge expressible as, and organized in accordance with, a natural language | |
EP1222569A1 (en) | Method and systems for making olap hierarchies summarisable | |
CN103049251B (en) | A kind of data base persistence layer device and database operation method | |
Slepicka et al. | KR2RML: An Alternative Interpretation of R2RML for Heterogenous Sources. | |
CN111061739B (en) | Method and device for warehousing massive medical data, electronic equipment and storage medium | |
WO2008016822A2 (en) | Primenet data management system | |
JP2010160591A (en) | Device, method and program for managing spatial data | |
CN112115276B (en) | Intelligent customer service method, device, equipment and storage medium based on knowledge graph | |
US20230050290A1 (en) | Horizontally-scalable data de-identification | |
Theodorakis et al. | Context in information bases | |
CN106021344A (en) | A multi-adaptive CIME power grid model sharing method | |
JP2017511928A (en) | Data processing method and system for establishing input recommendations | |
van den Hamer et al. | A data flow based architecture for CAD frameworks | |
Chen et al. | Constructing and maintaining scientific database views in the framework of the object-protocol model | |
CN110032574B (en) | SQL statement processing method and device | |
EP2590089B1 (en) | Rule type columns in database | |
CN114817512A (en) | Question-answer reasoning method and device | |
CN114637752A (en) | Connection query statement processing method, device, equipment and storage medium | |
Sukarsa et al. | Modification of ISONER Framework as Enterprise Service Bus to Build Consultation Robot Using External Engine | |
Bechhofer et al. | Delivering terminological services | |
Sachdeva et al. | AQBE–QBE style queries for archetyped data | |
JP2785138B2 (en) | Genetic methods in large-scale knowledge database systems | |
CN113672639B (en) | Multi-type database table structure comparison method, system, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||