CN109785927A - Clinical document structuring processing method based on internet integration medical platform - Google Patents
Clinical document structuring processing method based on internet integration medical platform
- Publication number
- CN109785927A (application number CN201910101984.5A)
- Authority
- CN
- China
- Prior art keywords
- clinical
- clinical document
- sample
- data
- structuring processing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The present invention discloses a clinical document structuring processing method based on an Internet-based integrated medical platform, and relates to the technical field of Internet medical platforms. Unstructured clinical documents are input to a clinical document structuring engine and processed by means of a clinical medicine corpus, rules, full-text retrieval and machine learning; the resulting structured data is output to a distributed storage engine, processed by artificial intelligence algorithms, and made available for analysis and display on the platform. The method structures the unstructured text data in clinical data and stores it in a distributed Hadoop cluster, realizing distributed data storage and distributed computing, and the program implementation in the software application is modified and adapted to the distributed characteristics.
Description
Technical field
The present invention relates to the technical field of Internet medical platforms, and more specifically to a clinical document structuring processing method based on an Internet-based integrated medical platform.
Background art
As an important resource, big data has penetrated every industry and sector to varying degrees. Its in-depth application not only supports the business activities of individual organizations but also helps drive the development of the national economy. "Internet+" is both the result and the hallmark of the deep integration of industrialization and informatization, and an important lever for further promoting information consumption. "Internet+" means "the Internet plus each traditional industry"; this is not simple addition, but the use of information and communication technology and Internet platforms to integrate the Internet deeply with traditional industries and create a new development ecosystem. In the future, the Internet will, like electricity, serve as a productivity tool that substantially improves efficiency in every industry. The aim is to advance the combination of the mobile Internet, cloud computing, big data and the Internet of Things with modern manufacturing, promote the healthy development of e-commerce, the industrial Internet and Internet finance, and guide Internet enterprises to expand into international markets. Traditional marketplaces plus the Internet gave rise to numerous e-commerce businesses; traditional department stores, traditional banks and traditional transportation have likewise each been combined with the Internet. "Internet+" is now being applied across the tertiary sector, forming new business models such as Internet healthcare, Internet finance, Internet transportation and Internet education.
The medical industry is an important part of national economic and social development. Under the new circumstances, the rapid development of medical informatization has benefited from emerging IT technologies such as big data, cloud computing and the Internet of Things, which has caused an explosion of medical data and promoted the formation of medical big data. However, hospital clinical data contains a large amount of unstructured text, such as patient examination and test reports (ultrasound, X-ray, CT, etc.) and pathology reports, which is difficult for an Internet-based integrated platform to analyze and process.
Summary of the invention
(1) Technical problem to be solved
The object of the present invention is to structure the unstructured text data in examination and test reports by providing a clinical document structuring processing method based on an Internet-based integrated medical platform.
(2) Technical solution
A clinical document structuring processing method based on an Internet-based integrated medical platform comprises the following steps:
S1: a clinical document structuring engine receives unstructured clinical documents as input and, by means of a clinical medicine corpus, rules, full-text retrieval and machine learning, converts the unstructured text data into structured sample and index data;
S2: the structured data produced by the clinical document structuring engine, namely key-value pairs of samples and indices, is stored in a distributed storage engine for analysis and display by the platform.
According to an embodiment of the present invention, the clinical document structuring engine comprises a Chinese natural language processing module, a clinical medicine corpus construction module and a sample index extraction module; the Chinese natural language processing module is connected to both the clinical medicine corpus construction module and the sample index extraction module.
According to an embodiment of the present invention, the Chinese natural language processing module uses Chinese natural language processing techniques to process the input clinical document at the word, sentence and paragraph levels. The processing steps are as follows:
(1) Short-sentence splitting: according to the narrative characteristics of clinical documents, and using the display rules of sentences, the text of the clinical document is split into short sentences, each describing one sample;
(2) Chinese word segmentation: using a Chinese word segmentation tool and based on a general medical dictionary and a clinical medical dictionary, each sample short sentence is segmented into meaningful words or phrases;
(3) Part-of-speech analysis: the part of speech of each word is analyzed;
(4) Syntactic analysis: a given sample short sentence is compared with the other short sentences in the clinical documents that describe the same sample, and the short-sentence syntax of each kind of sample description is summarized.
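Steps (1) and (2) above can be sketched as follows. This is a minimal illustration, not the method claimed in the patent: the text does not name the segmentation tool or the splitting rules, so the sketch assumes punctuation-based splitting and substitutes forward maximum matching for the unspecified segmentation tool. `MEDICAL_DICT`, `split_short_sentences` and `segment` are hypothetical names, and the dictionary is a toy stand-in for the general and clinical medical dictionaries.

```python
import re

# Hypothetical mini-dictionary standing in for the general and clinical medical dictionaries.
MEDICAL_DICT = {"甲状腺", "右叶", "结节", "边界", "清晰", "回声", "均匀"}


def split_short_sentences(text: str) -> list[str]:
    """Cut report text into short sentences at Chinese and Western clause punctuation.

    The ASCII period is deliberately not a delimiter, so decimals such as 1.2 stay intact.
    """
    parts = re.split(r"[，。；;,\n]+", text)
    return [part.strip() for part in parts if part.strip()]


def segment(sentence: str, dictionary: set[str], max_len: int = 4) -> list[str]:
    """Forward maximum matching: at each position take the longest dictionary word.

    A single character is emitted when no dictionary word starts at the position.
    """
    words, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


report = "甲状腺右叶结节，边界清晰，回声均匀。"
for short_sentence in split_short_sentences(report):
    print(segment(short_sentence, MEDICAL_DICT))
```

Dictionary-driven matching is chosen here only because it makes the role of the medical dictionaries visible; a production system would more likely use a trained segmenter with the dictionaries loaded as user vocabularies.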
According to an embodiment of the present invention, the clinical medicine corpus construction module builds a corpus of specialized clinical medical terms obtained by learning from and training on clinical documents. The module is constructed in the following steps:
(1) New word discovery: segmented words are combined using word frequency statistics and clustering methods to find new words;
(2) Synonym discovery: for synonyms present in clinical documents, synonyms are obtained from the document text by fuzzy matching and statistical analysis, and a synonym table is built;
(3) Sample extraction: sample names are extracted from the split short sentences according to rules;
(4) Template extraction: for a specific sample, the description template of that sample is extracted.
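The new word discovery and synonym discovery steps can be illustrated with a small sketch. It covers only part of what the text describes: adjacent tokens are merged and counted (word frequency statistics, without the clustering step), and synonym candidates are proposed by character-level similarity using the standard library's `difflib.SequenceMatcher` as a stand-in for the unspecified fuzzy matching. The function names, thresholds and sample terms are assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def discover_new_words(segmented_sentences, known_words, min_count=2):
    """Merge adjacent tokens and keep frequent combinations missing from the dictionary."""
    counts = Counter()
    for tokens in segmented_sentences:
        for left, right in zip(tokens, tokens[1:]):
            counts[left + right] += 1
    return sorted(
        word for word, count in counts.items()
        if count >= min_count and word not in known_words
    )


def find_synonym_candidates(terms, threshold=0.6):
    """Pair up terms whose character-level similarity ratio reaches the threshold."""
    pairs = []
    for i, first in enumerate(terms):
        for second in terms[i + 1:]:
            if SequenceMatcher(None, first, second).ratio() >= threshold:
                pairs.append((first, second))
    return pairs


segmented = [["低", "回声", "结节"], ["低", "回声", "区"], ["边界", "清晰"]]
print(discover_new_words(segmented, {"回声", "结节", "边界", "清晰"}))
print(find_synonym_candidates(["边界清晰", "边界清楚", "回声均匀"]))
```

Candidates produced this way would still need review before entering the synonym table, since surface similarity also pairs antonyms such as "clear" and "unclear" variants that share most characters.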
According to an embodiment of the present invention, for each sample the sample index extraction module uses full-text retrieval and fuzzy matching, combined with the clinical medicine corpus and the sample description template, to determine from the clinical document the index value corresponding to each index name. The sample and its indices form key-value pairs, which are output as the processing result of the structuring engine.
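A minimal sketch of how a description template might turn report text into sample and index key-value pairs is given below. The patent does not disclose the template format, so regular expressions stand in for the full-text retrieval and fuzzy matching it mentions; `NODULE_TEMPLATE`, `SYNONYMS` and `extract_indices` are hypothetical, as is the choice of a `(sample, index name)` tuple as the key.

```python
import re

# Hypothetical description template for the sample "结节" (nodule):
# each index name maps to a pattern whose first group captures the index value.
NODULE_TEMPLATE = {
    "大小": re.compile(r"大小(?:约)?(\d+(?:\.\d+)?[x×*]\d+(?:\.\d+)?\s*(?:mm|cm))"),
    "边界": re.compile(r"边界(清晰|清楚|不清|模糊)"),
    "回声": re.compile(r"(低|等|高|无)回声"),
}

# Hypothetical synonym table mapping variant wordings onto one canonical value.
SYNONYMS = {"清楚": "清晰", "模糊": "不清"}


def extract_indices(sample, report, template):
    """Return {(sample, index_name): index_value} for every index found in the report."""
    result = {}
    for index_name, pattern in template.items():
        match = pattern.search(report)
        if match:
            value = match.group(1)
            result[(sample, index_name)] = SYNONYMS.get(value, value)
    return result


report = "甲状腺右叶见低回声结节，大小约12×8mm，边界清楚。"
print(extract_indices("结节", report, NODULE_TEMPLATE))
```

Normalizing values through the synonym table before storage keeps later queries and model inputs consistent when different doctors word the same finding differently.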
According to an embodiment of the present invention, the clinical documents include electronic medical records, pathology reports, and examination and test reports.
According to an embodiment of the present invention, the examination and test reports include ultrasound, X-ray and CT reports.
According to an embodiment of the present invention, the distributed storage engine uses multiple standard server nodes to form a distributed storage cluster. Each Hadoop cluster comprises one master node and multiple slave nodes. The master node runs the NameNode and JobTracker functions and coordinates the slave nodes to ensure that the tasks submitted to the cluster are completed. The slave nodes run the TaskTracker and the HDFS used to store data, and execute the map and reduce functions of data computation.
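The map and reduce functions executed by the slave nodes can be illustrated in plain Python. The sketch below runs in a single process and calls no Hadoop API; it only shows the shape of a job that counts how often each structured (sample, index, value) record occurs. The record layout and the function names are assumptions.

```python
from collections import defaultdict


def map_phase(record):
    """Emit one (key, 1) pair per structured (sample, index, value) record."""
    sample, index_name, value = record
    yield (sample, index_name, value), 1


def shuffle(mapped_pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def run_job(records):
    mapped = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


records = [
    ("nodule", "boundary", "clear"),
    ("nodule", "boundary", "clear"),
    ("nodule", "boundary", "unclear"),
]
print(run_job(records))
```

On a real cluster the map calls would run on the nodes that hold the relevant HDFS blocks and the grouping would happen across the network, but the per-record and per-key logic keeps this form.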
(3) Beneficial effects
With the technical solution of the present invention, in the clinical document structuring processing method based on the Internet-based integrated medical platform, unstructured clinical documents are input to the clinical document structuring engine and processed by means of a clinical medicine corpus, rules, full-text retrieval and machine learning; the resulting structured data is output to the distributed storage engine, processed by artificial intelligence algorithms, and made available for analysis and display on the platform. The present invention structures the unstructured text data in clinical data and stores it in a distributed Hadoop cluster, realizing distributed data storage and distributed computing, and the program implementation in the software application is modified and adapted to the distributed characteristics.
Brief description of the drawings
In the present invention, identical reference numerals always denote identical features, in which:
Fig. 1 is the overall framework diagram of the Internet-based integrated medical platform.
Fig. 2 is a flow chart of the present invention.
Fig. 3 is the overall architecture diagram of the clinical document structuring engine.
Fig. 4 is the internal structure diagram of the clinical document structuring engine.
Fig. 5 is the structure diagram of the Chinese natural language processing module.
Fig. 6 is the structure diagram of the clinical medicine corpus construction module.
Fig. 7 is the network topology diagram of the Hadoop cluster.
Fig. 8 is the architecture diagram of the distributed storage engine.
Detailed description of the embodiments
The technical solution of the present invention is further described below with reference to the drawings and embodiments.
The Internet-based integrated medical platform combines medical big data with artificial intelligence technology to realize an integrated big data medical service platform based on "Internet + healthcare". It provides all individuals and institutions involved in healthcare activities with a new model of online healthcare services, including data sharing, business operations and collaborative services, and it improves information exchange, which promotes communication between doctors and patients and helps hospitals improve their own services and management. The overall framework of the platform is shown in Fig. 1. From bottom to top, the platform comprises five layers: the platform data foundation layer, the data analysis layer, the medical information resource layer, the deep data application layer and the user layer. The Internet-based integrated medical service platform comprises three main parts: the back-office management end, the doctor end and the patient end.
Back-office management end of the integrated medical service platform:
The back-office management end provides functions such as data exchange and integration with the hospital's HIS, PACS, LIS and RIS systems and backup of the medical data held in the medical information systems. It mainly consists of data extraction and integration, medical data backup storage, a special population database, queries against an anonymous public medical record database, and so on.
(1) Data extraction and integration: completes the extraction, transformation and loading of data from systems such as HIS, RIS, LIS and PACS. The clinical data generated by the different clinical information systems is integrated and aggregated in a unified way, so that patient identifiers and patient clinical information are unified across those systems and the clinical data can be stored uniformly.
A) The HIS data extraction module incrementally extracts clinical data such as registrations, visits, diagnoses, medical orders, admissions and fees from the HIS outpatient and emergency system and the HIS inpatient system;
B) The RIS data extraction module incrementally synchronizes examination data such as examination reports and body-part details from the RIS system;
C) The LIS data extraction module incrementally synchronizes laboratory test data such as test reports, test indices, bacteria and drug susceptibility from the LIS system;
D) The PACS module receives data that follows the DICOM 3.0 standard from imaging equipment such as DR and CT.
E) The ETL subsystem performs desensitization, cleaning and conversion of the clinical data.
Data desensitization: the patient's personal sensitive data, such as ID card number, medical card number and name, is specially processed to remove sensitive content.
Data cleaning: incomplete data is discarded; data with format errors, such as a date of birth, is repaired using other related data, and data that cannot be repaired is marked;
Data conversion: enumerated values stored as numbers or characters in the source system are converted into text with the corresponding meaning.
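The three ETL operations can be sketched on dictionary records as follows. Everything concrete here is an assumption: the field names, the `GENDER_CODES` enumeration, the choice of SHA-256 for pseudonymization and the date format are illustrative only. The sketch cleans before it desensitizes so that the completeness check can still see the identifier, and it only marks malformed dates rather than repairing them from related data.

```python
import hashlib
from datetime import datetime

# Hypothetical enumeration used by a source system for the gender field.
GENDER_CODES = {"1": "male", "2": "female"}


def clean(record, required=("id_card", "gender")):
    """Return None for incomplete records; mark records whose birth date is malformed."""
    if any(not record.get(field) for field in required):
        return None
    out = dict(record)
    try:
        datetime.strptime(out.get("birth_date", ""), "%Y-%m-%d")
        out["birth_date_valid"] = True
    except ValueError:
        out["birth_date_valid"] = False  # not repairable in this sketch, so only marked
    return out


def desensitize(record):
    """Replace the ID card number with a one-way pseudonym and drop the name."""
    out = dict(record)
    digest = hashlib.sha256(out.pop("id_card").encode("utf-8")).hexdigest()
    out["patient_key"] = digest[:16]
    out.pop("name", None)
    return out


def convert(record):
    """Translate enumerated codes into their text meaning."""
    out = dict(record)
    out["gender"] = GENDER_CODES.get(out["gender"], "unknown")
    return out


def etl(records):
    result = []
    for record in records:
        cleaned = clean(record)
        if cleaned is not None:
            result.append(convert(desensitize(cleaned)))
    return result
```

An unkeyed hash of an ID number can be reversed by enumeration, so a real deployment would need a keyed hash or a protected lookup table; the plain hash is used here only to keep the example self-contained.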
(2) Medical data backup storage: the medical data backup center is the foundation of clinical big data storage and provides the raw data source for clinical big data processing and analysis. The system uses a distributed Hadoop cloud storage architecture to provide different medical institutions with linearly scalable distributed storage, realizing the storage, archiving, management and sharing of data as well as information exchange and sharing among all types of medical institutions, and thereby achieving diversified storage and access on the cloud computing platform. The medical data backup center consists of modules such as bulk medical data migration, incremental medical data import and medical data viewing.
A) Bulk medical data migration: using the Hadoop distributed architecture, the medical data of the medical information systems is backed up in bulk by time period.
B) Incremental medical data import: the incremental medical data in the medical information systems is imported incrementally in time-series order.
C) Medical data viewing: the imported medical data can be queried by disease type, patient gender, age group, department, examination report type, examination time, and so on.
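Incremental import in time-series order is commonly implemented with a watermark, the timestamp of the newest row already imported. The sketch below assumes each source row carries an `updated_at` timestamp; that field, the in-memory `target` list and the function name are hypothetical, since the text does not describe the mechanism.

```python
from datetime import datetime


def incremental_import(source_rows, target, watermark):
    """Append rows updated after the watermark, oldest first, and return the new watermark."""
    new_rows = sorted(
        (row for row in source_rows if row["updated_at"] > watermark),
        key=lambda row: row["updated_at"],
    )
    target.extend(new_rows)
    return new_rows[-1]["updated_at"] if new_rows else watermark


source = [
    {"id": 1, "updated_at": datetime(2019, 1, 1)},
    {"id": 2, "updated_at": datetime(2019, 1, 3)},
    {"id": 3, "updated_at": datetime(2019, 1, 2)},
]
backup = []
mark = incremental_import(source, backup, datetime(2019, 1, 1))
print([row["id"] for row in backup], mark)
```

Running the import again with the returned watermark copies nothing, which is what makes repeated scheduled runs safe.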
(3) Special population database: based on the patient clinical data in the medical information systems, special population databases are established for diseases such as hyperthyroidism, diabetes, thyroid nodules, breast tumors and thyroid tumors. They can be queried by disease type, patient gender, age, examining doctor, examination index, examination time, and so on.
(4) Anonymous public medical record database: the medical cases of patients with conditions such as diabetes and thyroid disease, together with the doctors' diagnosis and treatment suggestions, can be viewed, providing a reference for patients with similar conditions who seek medical care. To protect patient privacy, the database is built in anonymized form. It can be queried by disease type, description of the condition, the doctor's detailed suggestions, examining doctor, question and reply time, and so on.
(5) Model library: so that models built with artificial intelligence algorithms can be used for classification and prediction in disease diagnosis, the model library mainly manages the artificial intelligence models, including functions such as model import, model training and model viewing.
(6) System management: the integrated platform mainly serves different roles such as the information center, administrators of medical institutions, doctors and patients. These users need to be managed systematically; through user management and role permission management, the various operations and data access rights are strictly authorized and controlled.
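Role permission management of the kind described in item (6) is often modeled as a mapping from roles to permitted operations with a deny-by-default check. The sketch below is one such model; the four roles come from the text, while every permission name is invented for illustration.

```python
# Roles are taken from the text; the permission names are hypothetical.
ROLE_PERMISSIONS = {
    "information_center": {"manage_users", "view_backup", "import_model"},
    "administrator": {"manage_users", "view_statistics"},
    "doctor": {"view_consultation", "view_evaluation", "run_prediction"},
    "patient": {"submit_consultation", "submit_evaluation", "view_own_record"},
}


def is_allowed(role: str, permission: str) -> bool:
    """Deny by default: unknown roles and unlisted permissions are rejected."""
    return permission in ROLE_PERMISSIONS.get(role, set())


print(is_allowed("doctor", "run_prediction"))
print(is_allowed("patient", "manage_users"))
```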
Doctor end of the integrated medical service platform:
The doctor end of the integrated medical service platform mainly provides the medical staff and researchers of medical institutions with a platform for medical research and diagnostic decision support, establishes a doctor-patient communication channel, and lets doctors view patients' consultations and their evaluations of diagnoses. It mainly consists of special population analysis, remote auxiliary decision-making, patient consultation viewing and patient evaluation viewing.
(1) Special population analysis: the clinical data of special populations in the hospital information system, such as patients with hyperthyroidism and diabetes, is analyzed and mined to reveal patterns of occurrence and underlying mechanisms, providing system support for the disease research of clinicians and researchers.
A) Hyperthyroidism clinical data analysis and mining: the hyperthyroidism clinical data includes clinical data tables such as the patient basic information table, patient visit record table, patient medication table, patient index test table and patient diagnosis table, with about 2,000,000 records in total. Data mining analysis of the hyperthyroidism clinical data is mainly organized around topics such as patient basic information, test index data, medication orders, complications and recurrence.
B) Diabetes clinical data analysis and mining: the diabetes clinical data includes clinical data tables such as the patient basic information table, patient visit record table, patient medication table, patient index test table and patient diagnosis table, with about 1,000,000 records in total. Data mining analysis of the diabetes clinical data is mainly organized around topics such as patient basic information, test index data, medication orders and diagnoses.
(2) Remote auxiliary decision-making: several diseases, namely thyroid disease in endocrinology, coronary heart disease in cardiovascular medicine and tumors in oncology, are selected as the research objects for data collection and analysis. Relying on the clinical data collected and integrated by the integrated platform, a medical diagnosis decision support system is implemented for thyroid nodules, coronary heart disease, breast tumors and the like, providing system support for clinicians' clinical diagnosis and researchers' disease research. It mainly consists of four parts: a prediction module based on index parameters, a prediction module based on examination report text, a model training module and a text structuring module.
A) Prediction module based on index parameters: using information such as the patient's outpatient serial number or medical insurance card number, the test indices and examination report text for the patient's relevant disease can be queried. Structured index data can be input directly; unstructured examination reports are structured by the structuring submodule to obtain a data format that the model can recognize.
B) Prediction module based on examination report text: using information such as the patient's outpatient serial number or medical insurance card number, the patient's examination report text can be queried. Unstructured examination report text is used directly for prediction by a deep learning algorithm.
C) Model training module: a basic module of the system that is invisible to users. Data such as the clinical examination reports and test indices related to thyroid nodules, coronary heart disease and breast disease, drawn from the multiple databases of the medical information systems, is merged and integrated into a unified data table for model training.
D) Structuring module: structures ultrasound report text data by extracting the ultrasound features of the various samples, including indices such as tumor size, boundary, echo distribution and echo intensity together with the value of each index, and forms a description template for each sample. Based on this template, the ultrasound text content is structured.
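Module A speaks of obtaining "a data format that the model can recognize" from structured data. One common reading, assumed here rather than stated in the text, is a fixed-length numeric feature vector. The sketch one-hot encodes categorical index values; `FEATURE_SPACE` and the English index names are hypothetical.

```python
# Hypothetical, fixed ordering of (index name, value) pairs that the model was trained on.
FEATURE_SPACE = [
    ("boundary", "clear"),
    ("boundary", "unclear"),
    ("echo", "low"),
    ("echo", "high"),
]


def to_feature_vector(indices: dict) -> list[int]:
    """One-hot encode structured index values in the order given by FEATURE_SPACE."""
    return [1 if indices.get(name) == value else 0 for name, value in FEATURE_SPACE]


print(to_feature_vector({"boundary": "clear", "echo": "low"}))  # [1, 0, 1, 0]
```

Keeping the feature order in one shared constant ensures that training and prediction encode reports identically; missing indices simply yield zeros.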
(3) Patient consultation viewing: lets doctors view the content of patients' consultations about the diagnosis of their condition, filtered by disease type, description of the condition, viewing time, and so on.
(4) Patient evaluation viewing: lets doctors view the content of patient evaluations, filtered by doctor name, patient name, evaluation content, evaluation time, and so on.
Patient end of the integrated medical service platform:
The patient end of the integrated medical service platform is the interface through which patients log in to the platform. It mainly provides patients with online remote medical services, including online consultation, post-visit evaluation and access to the patient's electronic health record. Through online consultation, patients can obtain a preliminary diagnosis of their condition produced by artificial intelligence technology; they can evaluate online a doctor's medical skill, service attitude and so on; and through the electronic health record they can view data such as diagnoses, laboratory tests, examinations and images, medical orders, medical history, pathology and fees. It mainly comprises three modules: patient consultation service, post-visit evaluation service and the patient electronic health record.
(1) Patient consultation service: provides patients with an online consultation service. The patient supplies data such as an initial description of symptoms, imaging examination text reports and test index values. The system uses an OCR tool and structuring techniques to obtain a data format that the model can recognize, analyzes the test index features, and uses a model built with artificial intelligence algorithms to classify and predict the diagnosis of the unknown sample. The predicted result is finally presented to the patient, including the type of thyroid nodule, the type of thyroid surgery, whether a thyroid tumor is benign or malignant, the type of breast tumor and so on, with the aim of guiding the patient's medical care and health management. The system integrates artificial intelligence algorithms including convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory recurrent neural networks (LSTM), random forests, support vector machines, neural networks, decision trees and K-means, and has built auxiliary diagnosis prediction models for patients with thyroid nodules and breast tumors.
(2) Post-visit evaluation service: lets patients evaluate the doctor's diagnosis and treatment afterwards. Online patient evaluation of doctors is an effective doctor-patient communication channel that helps medical institutions and doctors improve service quality and gradually ease doctor-patient conflicts. Doctors can improve according to patients' evaluations and needs, and medical institutions can give appropriate rewards and penalties according to patients' overall evaluation of a doctor. In practice, however, a single doctor may receive hundreds or thousands of reviews, and a medical institution has hundreds or even thousands of doctors, so a massive amount of patient evaluation text is generated, and processing and analyzing it manually takes a great deal of effort. The system implements an artificial-intelligence-based medical evaluation system: the machine performs sentiment analysis on patient evaluations, automatically identifies positive and negative evaluations, and computes their proportions. Doctors can quickly filter out negative evaluations and make improvements based on their content; medical institutions can compute overall evaluation statistics from the mass of evaluations by department, doctor, and so on.
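The identify-then-count flow of the evaluation system can be illustrated with a toy classifier. The text does not say which sentiment model is used, so the sketch substitutes a small word-list scorer for it and works on English text for readability; the word lists, labels and function names are all assumptions.

```python
import re
from collections import Counter

# Toy lexicons standing in for a trained sentiment model.
POSITIVE_WORDS = {"patient", "helpful", "professional", "satisfied"}
NEGATIVE_WORDS = {"rude", "slow", "careless", "dismissive"}


def classify(review: str) -> str:
    """Label a review by the balance of positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z]+", review.lower())
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(t in NEGATIVE_WORDS for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(reviews):
    """Return the share of each label among the reviews."""
    counts = Counter(classify(review) for review in reviews)
    total = len(reviews) or 1  # avoid division by zero for an empty list
    return {label: counts[label] / total for label in ("positive", "negative", "neutral")}


reviews = [
    "Very professional and helpful",
    "Rude and slow",
    "Saw the doctor on Monday",
    "Satisfied with the visit",
]
print(summarize(reviews))
negative_only = [review for review in reviews if classify(review) == "negative"]
```

The same `classify` output serves both uses named in the text: filtering negative reviews for a doctor, and aggregating proportions per department or doctor.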
(3) Patient electronic health record:
Relying on the data integration module, the clinical data generated by different clinical information systems such as HIS, PACS, LIS and RIS is integrated and aggregated in a unified way to establish a unified view of the personal electronic health record, containing data such as the patient's basic information, diagnoses, laboratory tests, examinations and images, medical orders, medical history, pathology and fees. Patients can conveniently consult it at any time, and it supports the use of clinical big data in diagnosis, treatment and research.
A) Basic patient information dimension: mainly displays basic information such as the patient's name,
gender, date of birth, ID number, and contact details;
B) Diagnosis dimension: shows the patient's past diagnosis records, etc.;
C) Laboratory test dimension: shows the patient's past test records in tabular form;
D) Examination and image dimension: shows the patient's ultrasound examination records, images, etc.;
E) Doctor's order dimension: records the doctor's orders of all kinds for the patient;
F) Medical history dimension: the patient's electronic medical records;
G) Pathology dimension: records the patient's pathology;
H) Nursing dimension: shows the patient's nursing records, such as pulse, body temperature, blood
pressure, and respiration, in graphical form;
I) Physical examination dimension: displays the patient's physical examination records;
J) Expense dimension: displays statistics and details of the patient's expenses of all kinds.
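One way to picture the unified view is as a grouping of records by patient and by dimension. The sketch below assumes a simplified record layout (`patient_id`, `source`, `dimension`, `data`) that is not specified in the source; the dimension names mirror the list above, and `build_unified_view` is a hypothetical helper.

```python
from collections import defaultdict

# Dimension names mirror the list above; the record layout is a hypothetical simplification.
DIMENSIONS = {
    "basic_information", "diagnosis", "laboratory_test", "examination_image",
    "doctor_order", "medical_history", "pathology", "nursing",
    "physical_examination", "expense",
}


def build_unified_view(records):
    """Group records coming from different source systems by patient and dimension."""
    view = defaultdict(lambda: defaultdict(list))
    for rec in records:
        if rec["dimension"] not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {rec['dimension']}")
        view[rec["patient_id"]][rec["dimension"]].append(
            {"source": rec["source"], "data": rec["data"]}
        )
    # Convert nested defaultdicts to plain dicts so missing keys are not created on access.
    return {pid: dict(dims) for pid, dims in view.items()}
```

Keeping the originating system in each entry makes it possible to trace a displayed item back to HIS, PACS, LIS, or RIS.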
As the volume of hospital data keeps growing, hospitals have also entered the big data era, and fully
mining the data that hospitals generate will undoubtedly yield much valuable information. In the early
stage of hospital informatization, each clinical information system stored its data locally and
separately, so data security was low and data reliability could not be effectively guaranteed. Backing
up the data of the hospital medical information systems on a Hadoop architecture lays the foundation for
further mining the knowledge behind the medical data, supporting hospital decision-making, and improving
the working efficiency of the hospital.
With reference to the flow chart in Fig. 2, the clinical document structuring processing method based on
an internet integration medical platform includes the following steps:
S1: the clinical document structuring processing engine receives unstructured clinical documents as
input and converts the unstructured text data into structured sample and index data by means such as a
clinical medicine corpus, rules, full-text search, and machine learning;
S2: after the clinical document has been processed by the structuring processing engine, the resulting
structured data, i.e., key-value pairs of samples and indexes, are stored in a distributed Hadoop
cluster for analysis and display by the platform.
The overall architecture of the clinical document structuring processing engine is shown in Fig. 3.
After receiving the body text of a clinical document as input, the engine performs two tasks: first, it
generates the clinical medicine corpus, which includes the clinical medicine dictionary, the synonym
table, the description templates of clinical documents, and so on; second, it extracts sample-index
key-value pairs from the clinical document as output.
As shown in Fig. 4, the clinical document structuring processing engine mainly consists of three
modules: the Chinese natural language processing module, the clinical medicine corpus construction
module, and the sample index extraction module. The function of each module is as follows:
Chinese natural language processing module
The clinical document structuring processing engine uses Chinese natural language processing techniques
to process the input clinical document at the word, sentence, and paragraph levels. Its functional
structure is shown in Fig. 5. The processing steps of the module are as follows:
(1) Short-sentence splitting: according to the narrative characteristics of clinical documents, and
using the display rules of sentences, the text of the clinical document is split into short sentences,
each describing a sample. Each resulting short sentence essentially corresponds to one sample; in other
words, each short sentence can essentially be regarded as the description of a sample.
(2) Chinese word segmentation: a Chinese word segmentation tool, based on a general medicine dictionary
and the clinical medicine dictionary, segments the sample short sentences into meaningful words or
phrases.
(3) Part-of-speech analysis: the part of speech of each word is analyzed to support subsequent
syntactic analysis and semantic understanding.
(4) Syntactic analysis: a given sample short sentence is compared with the other short sentences in
clinical documents that describe the same sample, and the short-sentence syntax of each kind of sample
description is summarized and induced.
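Steps (1) and (2) above can be sketched in a few lines. The source does not name the splitting rules or the segmentation tool, so this example assumes punctuation-based splitting and uses forward maximum matching as a stand-in segmenter; the mini-dictionary and the sample sentence are made up for illustration, and part-of-speech and syntactic analysis are not covered.

```python
import re

# Hypothetical mini-dictionary standing in for the general and clinical medicine dictionaries.
DICTIONARY = {"甲状腺", "左叶", "右叶", "大小", "正常", "可见", "结节", "边界", "清晰"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def split_short_sentences(text):
    """Split on common clause punctuation (an assumed stand-in for the splitting rules)."""
    parts = re.split(r"[，。；;,\n]+", text)
    return [part.strip() for part in parts if part.strip()]


def segment(sentence):
    """Forward maximum matching: take the longest dictionary word at each position,
    falling back to a single character when nothing matches."""
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(MAX_WORD_LEN, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in DICTIONARY:
                tokens.append(piece)
                i += size
                break
    return tokens


if __name__ == "__main__":
    for short in split_short_sentences("甲状腺左叶大小正常，右叶可见结节，边界清晰。"):
        print(short, segment(short))
```

A dictionary-driven segmenter like this shows why a specialized clinical dictionary matters: a term missing from the dictionary is broken into single characters.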
Clinical medicine corpus construction module
The clinical medicine corpus is a specialized clinical medicine corpus obtained by learning and
training on clinical documents. This corpus is more specialized and gives higher accuracy in Chinese
word segmentation, which benefits the Chinese natural language processing of clinical documents. The
structure of the clinical medicine corpus construction module is shown in Fig. 6. The construction
steps of the module are as follows:
(1) New word discovery: using methods such as word frequency statistics and clustering, segmented
tokens are combined to discover new words.
(2) Synonym discovery: clinical documents contain synonyms; for example, when describing the
circumferential diameter of a lump, some texts use "Zhou Jing" and others use "diameter". Based on the
text of clinical documents, synonyms are obtained by means such as fuzzy matching and statistical
analysis, and a synonym table is established.
(3) Sample extraction: sample names are extracted from the split short sentences according to rules.
(4) Template extraction: for a specific sample, the description template of that sample is extracted.
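Steps (1) and (2) above can be illustrated with the standard library. This is a rough sketch under stated assumptions: new words are proposed from adjacent-token frequency only (no clustering), synonym candidates come from character overlap via `difflib.SequenceMatcher`, and the function names, thresholds, and example terms are all hypothetical rather than taken from the source.

```python
from collections import Counter
from difflib import SequenceMatcher


def discover_new_words(segmented_sentences, min_count=2):
    """Join adjacent tokens and keep combinations seen at least min_count times."""
    pair_counts = Counter()
    for tokens in segmented_sentences:
        for left, right in zip(tokens, tokens[1:]):
            pair_counts[left + right] += 1
    return {word for word, count in pair_counts.items() if count >= min_count}


def synonym_candidates(terms, threshold=0.5):
    """Propose term pairs whose character similarity reaches the threshold.
    Candidates still need statistical or manual confirmation before entering the synonym table."""
    unique_terms = sorted(set(terms))
    candidates = []
    for i, first in enumerate(unique_terms):
        for second in unique_terms[i + 1:]:
            if SequenceMatcher(None, first, second).ratio() >= threshold:
                candidates.append((first, second))
    return candidates
```

Character overlap alone produces false positives, which is presumably why the source pairs fuzzy matching with statistical analysis.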
Sample index extraction module
For each sample, the index value corresponding to each index name is determined from the clinical
document by full-text search and fuzzy matching, in combination with the clinical medicine corpus and
the sample description templates. Samples and indexes form key-value pairs, which are output as the
processing result of the structuring processing engine.
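The extraction step can be pictured as matching description templates against each short sentence. The sketch below assumes that a template can be approximated by a regular expression with a named `value` group; the two templates, the key format `sample.index`, and the function name `extract_indexes` are hypothetical, and fuzzy matching is not covered.

```python
import re

# Hypothetical description templates: one regular expression per index name.
TEMPLATES = {
    "size": re.compile(r"大小约?(?P<value>\d+(?:\.\d+)?(?:[x×*]\d+(?:\.\d+)?)*\s*(?:mm|cm))"),
    "boundary": re.compile(r"边界(?P<value>清晰|不清|欠清)"),
}


def extract_indexes(sample_name, short_sentence):
    """Search the short sentence with every template and return sample-index key-value pairs."""
    pairs = {}
    for index_name, pattern in TEMPLATES.items():
        match = pattern.search(short_sentence)
        if match:
            pairs[f"{sample_name}.{index_name}"] = match.group("value")
    return pairs


if __name__ == "__main__":
    print(extract_indexes("结节", "结节大小约12×8mm边界清晰"))
```

Indexes that a sentence does not mention are simply absent from the result, so downstream storage only receives the key-value pairs that were actually found.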
With the backup center based on the Hadoop architecture, users can import and store data at high speed
on the cluster and run data queries without understanding the details of the distributed bottom layer.
Text documents are stored on multiple nodes of the cluster as multiple replica files, which increases
the reliability of data storage. The Hadoop distributed file system (HDFS) is used as the underlying
storage facility for the medical data. The HDFS underlying storage in the implementation of the present
invention uses the master/slave architecture of HDFS 2.0. The Hadoop system runs on a Linux cluster. In
the cluster, one computer acts as the master node and manages the other computers, while the other
computers act as slave nodes responsible for data storage. The Hadoop of this system uses fully
distributed mode; the network topology of the Hadoop cluster is shown in Fig. 7.
After the clinical document has been processed by the structuring processing engine, the resulting
structured data, i.e., the sample-index key-value pairs, are stored in the distributed Hadoop cluster
for analysis and display by the platform. To suit the distributed nature of big data, the storage
engine takes HDFS and MapReduce as its core to implement distributed data storage and distributed
computing, respectively, and the programming of the software applications is transformed and adapted to
this distributed nature so that the advantages of the distributed computing architecture are truly
brought into play.
The idea of the Hadoop framework is to "move the computing resources closer to the data". Fig. 8 shows
the architecture of the distributed storage engine.
The distributed storage engine based on the Hadoop framework uses multiple standard server nodes to
form a distributed storage cluster. Each Hadoop cluster includes one master node and multiple slave
nodes. The master node runs the NameNode and JobTracker functions and is responsible for coordinating
the slave nodes to ensure that the tasks submitted to the cluster are completed. The slave nodes run
the TaskTracker and the HDFS used to store data, and execute the map and reduce functions of the data
computation. The actual data are stored on the data nodes, and computation takes place on the nodes
where the data reside, which helps Hadoop deliver higher performance than storing the data across the
network. The combination of standard server platforms and the Hadoop infrastructure can provide a
cost-effective, high-performance platform for data-parallel applications.
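The map and reduce functions mentioned above follow a simple pattern that can be simulated in a single process. The sketch below is plain Python, not the Hadoop API: it counts how often each sample-index key occurs, and the record layout `(sample, index_name, value)` and all function names are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Emit one (key, 1) pair per structured record."""
    sample, index_name, _value = record
    yield (f"{sample}.{index_name}", 1)


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values of one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence within one process."""
    mapped = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

On a real cluster the map calls would run on the nodes holding the data blocks, and only the grouped intermediate pairs would travel over the network.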
In conclusion using technical solution of the present invention, the clinical document knot based on internet integration medical platform
Structure processing method, unstructured clinical document be input to clinical document structuring processing engine, by clinical medicine corpus,
The processing of the means such as rule, full-text search and machine learning, obtains structural data and is output to distributed storage engine, by artificial
Intelligent algorithm is handled, and for Platform Analysis, is shown;The present invention ties text data non-structured in clinical data
Structureization processing, stores into distributed Hadoop cluster, realizes Distributed Storage mode and distributed computing processing, and will
Programming in software application, which is realized, to be transformed and is adapted to for distributed nature.
Claims (8)
1. A clinical document structuring processing method based on an internet integration medical platform,
characterized by comprising the following steps:
S1: the clinical document structuring processing engine receives unstructured clinical documents as
input and converts the unstructured text data into structured sample and index data by means such as a
clinical medicine corpus, rules, full-text search, and machine learning;
S2: after the clinical document has been processed by the structuring processing engine, the resulting
structured data, i.e., key-value pairs of samples and indexes, are stored in a distributed storage
engine for analysis and display by the platform.
2. The clinical document structuring processing method based on an internet integration medical
platform according to claim 1, characterized in that the clinical document structuring processing
engine comprises a Chinese natural language processing module, a clinical medicine corpus construction
module, and a sample index extraction module, and the Chinese natural language processing module is
connected to the clinical medicine corpus construction module and to the sample index extraction
module, respectively.
3. The clinical document structuring processing method based on an internet integration medical
platform according to claim 2, characterized in that the Chinese natural language processing module
uses Chinese natural language processing techniques to process the input clinical document at the
word, sentence, and paragraph levels, with the following processing steps:
(1) short-sentence splitting: according to the narrative characteristics of clinical documents, and
using the display rules of sentences, the text of the clinical document is split into short sentences,
each describing a sample;
(2) Chinese word segmentation: a Chinese word segmentation tool, based on a general medicine dictionary
and the clinical medicine dictionary, segments the sample short sentences into meaningful words or
phrases;
(3) part-of-speech analysis: the part of speech of each word is analyzed;
(4) syntactic analysis: a given sample short sentence is compared with the other short sentences in
clinical documents that describe the same sample, and the short-sentence syntax of each kind of sample
description is summarized and induced.
4. The clinical document structuring processing method based on an internet integration medical
platform according to claim 2, characterized in that the clinical medicine corpus construction module
builds a specialized clinical medicine corpus obtained by learning and training on clinical documents,
with the following construction steps:
(1) new word discovery: using word frequency statistics and clustering methods, segmented tokens are
combined to discover new words;
(2) synonym discovery: for synonyms present in clinical documents, synonyms are obtained from the text
of the clinical documents by fuzzy matching and statistical analysis, and a synonym table is
established;
(3) sample extraction: sample names are extracted from the split short sentences according to rules;
(4) template extraction: for a specific sample, the description template of that sample is extracted.
5. The clinical document structuring processing method based on an internet integration medical
platform according to claim 2, characterized in that, for each sample, the sample index extraction
module determines the index value corresponding to each index name from the clinical document by
full-text search and fuzzy matching, in combination with the clinical medicine corpus and the sample
description templates; the samples and indexes form key-value pairs, which are output as the processing
result of the structuring processing engine.
6. The clinical document structuring processing method based on an internet integration medical
platform according to claim 1, characterized in that the clinical documents include electronic medical
records, pathology reports, and examination and test reports.
7. The clinical document structuring processing method based on an internet integration medical
platform according to claim 6, characterized in that the examination and test reports include
ultrasound, X-ray, and CT reports.
8. The clinical document structuring processing method based on an internet integration medical
platform according to claim 1, characterized in that the distributed storage engine uses multiple
standard server nodes to form a distributed storage cluster; each Hadoop cluster includes one master
node and multiple slave nodes; the master node runs the NameNode and JobTracker functions and is
responsible for coordinating the slave nodes to ensure that the tasks submitted to the cluster are
completed; the slave nodes run the TaskTracker and the HDFS used to store data, and execute the map and
reduce functions of the data computation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910101984.5A CN109785927A (en) | 2019-02-01 | 2019-02-01 | Clinical document structuring processing method based on internet integration medical platform |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109785927A (en) | 2019-05-21 |
Family
ID=66504122
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910101984.5A | Clinical document structuring processing method based on internet integration medical platform (CN109785927A, Pending) | 2019-02-01 | 2019-02-01 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109785927A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399450A (en) * | 2019-06-20 | 2019-11-01 | 东华大学 | A kind of Thyroid ultrasound report structure scan method based on semantic tree |
CN110413963A (en) * | 2019-07-03 | 2019-11-05 | 东华大学 | Breast ultrasonography report structure method based on domain body |
CN110853745A (en) * | 2019-09-23 | 2020-02-28 | 陈翔 | Skin disease patient standardization system |
CN111476030A (en) * | 2020-05-08 | 2020-07-31 | 中国科学院计算机网络信息中心 | Prospective factor screening method based on deep learning |
CN112687364A (en) * | 2020-12-24 | 2021-04-20 | 宁波金唐软件有限公司 | Hbase-based medical data management method and system |
CN112948471A (en) * | 2019-11-26 | 2021-06-11 | 广州知汇云科技有限公司 | Clinical medical text post-structured processing platform and method |
CN113380414A (en) * | 2021-05-20 | 2021-09-10 | 心医国际数字医疗系统(大连)有限公司 | Data acquisition method and system based on big data |
CN113380380A (en) * | 2021-06-23 | 2021-09-10 | 上海电子信息职业技术学院 | Intelligent reading device for medical reports |
CN113539414A (en) * | 2021-07-30 | 2021-10-22 | 中电药明数据科技(成都)有限公司 | Method and system for predicting rationality of antibiotic medication |
CN114678132A (en) * | 2022-02-22 | 2022-06-28 | 北京颐圣智能科技有限公司 | Self-learning medical wind control system and method based on clinical behavior feedback |
CN115617840A (en) * | 2022-12-19 | 2023-01-17 | 江西曼荼罗软件有限公司 | Medical data retrieval platform construction method, system, computer and storage medium |
CN115757430A (en) * | 2022-12-01 | 2023-03-07 | 武汉博科国泰信息技术有限公司 | Data structured processing method and system for medical data |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150066537A1 (en) * | 2013-09-05 | 2015-03-05 | A-Life Medical, LLC. | Automated clinical indicator recognition with natural language processing |
CN108538395A (en) * | 2018-04-02 | 2018-09-14 | 上海市儿童医院 | A kind of construction method of general medical disease that calls for specialized treatment data system |
CN109243616A (en) * | 2018-06-29 | 2019-01-18 | 东华大学 | Breast electronic medical record combined relation extraction and structuring system based on deep learning |
Non-Patent Citations (2)
Title |
---|
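Shi Zhiwei (施志威): "Design and Implementation of a Clinical Document Structuring System Based on Cloud Services", China Master's Theses Full-text Database, Information Science and Technology * |
Ruan Tong (阮彤) et al.: "Process and Methods of Clinical Medical Big Data Mining Based on Electronic Medical Records", Big Data * |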
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110399450A (en) * | 2019-06-20 | 2019-11-01 | 东华大学 | A kind of Thyroid ultrasound report structure scan method based on semantic tree |
CN110399450B (en) * | 2019-06-20 | 2023-06-23 | 东华大学 | Thyroid ultrasound report structured scanning method based on semantic tree |
CN110413963A (en) * | 2019-07-03 | 2019-11-05 | 东华大学 | Breast ultrasonography report structure method based on domain body |
CN110413963B (en) * | 2019-07-03 | 2022-11-25 | 东华大学 | Breast ultrasonic examination report structuring method based on domain ontology |
CN110853745A (en) * | 2019-09-23 | 2020-02-28 | 陈翔 | Skin disease patient standardization system |
CN112948471A (en) * | 2019-11-26 | 2021-06-11 | 广州知汇云科技有限公司 | Clinical medical text post-structured processing platform and method |
CN111476030B (en) * | 2020-05-08 | 2022-03-15 | 中国科学院计算机网络信息中心 | Prospective factor screening method based on deep learning |
CN111476030A (en) * | 2020-05-08 | 2020-07-31 | 中国科学院计算机网络信息中心 | Prospective factor screening method based on deep learning |
CN112687364A (en) * | 2020-12-24 | 2021-04-20 | 宁波金唐软件有限公司 | Hbase-based medical data management method and system |
CN112687364B (en) * | 2020-12-24 | 2023-08-01 | 宁波金唐软件有限公司 | Medical data management method and system based on Hbase |
CN113380414A (en) * | 2021-05-20 | 2021-09-10 | 心医国际数字医疗系统(大连)有限公司 | Data acquisition method and system based on big data |
CN113380414B (en) * | 2021-05-20 | 2023-11-10 | 心医国际数字医疗系统(大连)有限公司 | Data acquisition method and system based on big data |
CN113380380A (en) * | 2021-06-23 | 2021-09-10 | 上海电子信息职业技术学院 | Intelligent reading device for medical reports |
CN113539414A (en) * | 2021-07-30 | 2021-10-22 | 中电药明数据科技(成都)有限公司 | Method and system for predicting rationality of antibiotic medication |
CN114678132A (en) * | 2022-02-22 | 2022-06-28 | 北京颐圣智能科技有限公司 | Self-learning medical wind control system and method based on clinical behavior feedback |
CN115757430A (en) * | 2022-12-01 | 2023-03-07 | 武汉博科国泰信息技术有限公司 | Data structured processing method and system for medical data |
CN115617840B (en) * | 2022-12-19 | 2023-03-10 | 江西曼荼罗软件有限公司 | Medical data retrieval platform construction method, system, computer and storage medium |
CN115617840A (en) * | 2022-12-19 | 2023-01-17 | 江西曼荼罗软件有限公司 | Medical data retrieval platform construction method, system, computer and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190521 |