CN109637605A - Electronic health record structural method and computer readable storage medium - Google Patents

Electronic health record structuring method and computer-readable storage medium

Info

Publication number: CN109637605A
Authority: CN (China)
Prior art keywords: attribute, knowledge base, score, health record, medical knowledge
Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: CN201811513668.0A
Other languages: Chinese (zh)
Other versions: CN109637605B (en)
Inventors: 文再文, 陈青筱, 谢屿, 张嘉琦, 刘普凡, 刘德斌
Current Assignee: Peking University; Peking University School of Stomatology (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: Peking University; Peking University School of Stomatology
Application filed by: Peking University; Peking University School of Stomatology
Priority: CN201811513668.0A (patent CN109637605B)
Publication: CN109637605A
Granted publication: CN109637605B

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/284: Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Epidemiology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present invention provides an electronic health record structuring method and a computer-readable storage medium. The method comprises: loading a first medical knowledge base; splitting a first electronic health record into sentences at special characters to obtain multiple text sentences; using a matching and scoring algorithm to match each of the text sentences against the attributes in the first medical knowledge base; and saving the matching result. The invention solves the problem in the related art that electronic health records cannot be completely structured, and achieves complete structuring of electronic health records.

Description

Electronic health record structuring method and computer-readable storage medium
Technical field
The present invention relates to the medical field, and in particular to an electronic health record structuring method and a computer-readable storage medium.
Background
With the digitization, networking and growing intelligence of medical systems, patients' medical data are stored in electronic health records, which contain comprehensive information such as chief complaint, medical history, examination, diagnosis, treatment plan and disposition. Against the background of big data, these raw data open new possibilities for medical diagnostic decision-making, prompting people to mine information from medical record data, extract rules and design intelligent systems so as to further raise the level and quality of medical care.
However, electronic health record databases usually store the original text entered by doctors. Although the text is written according to specified templates, it retains some of the freedom and flexibility of natural language. Such data are therefore not fully structured but only semi-structured, and are not suitable for deeper scientific research tasks or intelligent medical projects. This creates the need to structure the original text data.
Because of the diversity of natural language expression and the specialized nature of medical terminology, structuring electronic health record text is difficult, and related research in China is still insufficient. Existing domestic methods for structuring electronic health records mainly use semantic polarity to make an affirmative or negative judgment on disease information. This approach can handle disease information expressed in two-valued logic, but it cannot extract information such as numerical values or degrees of disease. In addition, current research offers no solution for extracting the body part where a patient's disease-related information occurs. This incomplete information extraction limits work such as medical research and the development of intelligent diagnostic decision-making systems.
The object of the present invention is to extract complete information from electronic health records for different types of disease information and treatment information, and thereby achieve complete structuring of electronic health record text.
Summary of the invention
The present invention provides an electronic health record structuring method and a computer-readable storage medium, to at least solve the problem in the related art that electronic health records cannot be completely structured.
In a first aspect, an embodiment of the invention provides an electronic health record structuring method, comprising: loading a first medical knowledge base; splitting a first electronic health record into sentences at special characters to obtain multiple text sentences; using a matching and scoring algorithm to match each of the text sentences against the attributes in the first medical knowledge base; and saving the matching result.
In a second aspect, an embodiment of the invention provides a computer-readable storage medium storing computer program instructions which, when executed by a processor, implement the method of the first aspect.
With the electronic health record structuring method and computer-readable storage medium provided by the embodiments of the invention, a first medical knowledge base is loaded; a first electronic health record is split into sentences at special characters to obtain multiple text sentences; a matching and scoring algorithm matches each text sentence against the attributes in the first medical knowledge base; and the matching result is saved. This solves the problem in the related art that electronic health records cannot be completely structured, and achieves complete structuring of electronic health records.
Brief description of the drawings
The drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their descriptions explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of an electronic health record structuring method according to an embodiment of the invention;
Fig. 2 is a hardware structure diagram of an electronic health record structuring device according to an embodiment of the invention;
Fig. 3 is a flowchart of an electronic health record structuring method according to a preferred embodiment of the invention;
Fig. 4 is a schematic diagram of an example structure of the first medical knowledge base for the field of oral prosthodontics according to a preferred embodiment of the invention;
Fig. 5 is a schematic diagram of an example electronic health record according to a preferred embodiment of the invention;
Fig. 6 is a schematic diagram of an electronic health record structuring match result according to a preferred embodiment of the invention;
Fig. 7 is a statistical chart of the matching frequency of attributes in the electronic health record structuring match result according to a preferred embodiment of the invention.
Detailed description of the embodiments
The features and exemplary embodiments of various aspects of the invention are described in detail below. To make the objects, technical solutions and advantages of the invention clearer, the invention is further described in detail with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. To those skilled in the art, the invention can be practiced without some of these specific details. The following description of the embodiments is provided only to give a better understanding of the invention by showing examples of it.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article or device. Without further limitation, an element defined by the phrase "including ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.
This embodiment provides an electronic health record structuring method. Fig. 1 is a flowchart of the electronic health record structuring method according to an embodiment of the invention. As shown in Fig. 1, the process includes the following steps:
Step S101: load the first medical knowledge base;
Step S102: split the first electronic health record into sentences at special characters to obtain multiple text sentences;
Step S103: use a matching and scoring algorithm to match each of the text sentences against the attributes in the first medical knowledge base;
Step S104: save the matching result.
Through the above steps, the matching and scoring algorithm matches text sentences to the attributes in the first medical knowledge base. The matched keywords can cover not only disease information expressed in two-valued logic but also information such as numerical values and degrees of disease. This solves the problem in the related art that electronic health records cannot be completely structured, and achieves complete structuring of electronic health records.
Optionally, the first medical knowledge base comprises multiple sections, each section comprising one or more attributes and one or more keywords corresponding to each attribute. Each attribute includes at least an attribute name, an attribute value and a position, and each keyword also carries a score. For example, the basic unit of the first medical knowledge base is an attribute, composed of three parts: attribute name, attribute value and position. The attribute name may be a disease symptom, a physical sign, a treatment measure, etc.; the corresponding attribute value may be the presence or absence and severity of the symptom, the specific manifestation of the physical sign, or the specific method of the treatment measure; the position may be the body part to which the attribute applies. A group of attributes belongs to one section (such as examination or treatment plan), and the sections together constitute the whole knowledge base.
Medical diagnosis and treatment are inherently complex. To describe medical knowledge in detail and to retain as much of the original record's information as possible during structuring, the first medical knowledge base in this embodiment may be improved in the following respects: a) the range of attribute value types is expanded; b) a "position" is added to each attribute to describe the body part to which it applies; c) a description of time-sequence information is added; d) attributes are classified according to medical knowledge, forming a hierarchical representation of that knowledge.
These improvements are described as follows:
a) The attribute value types of the first medical knowledge base include real, Boolean and discrete categorical types, and the ways of assigning a value include judgment, single choice, number, multiple choice and combinations of these. This variety of forms makes it possible to express the value of every attribute that occurs in medicine.
b) Most attributes in the first medical knowledge base apply to a specific body part, such as the site where a disease occurs or where a medical measure is carried out, so this embodiment adds a body-part description to each attribute. Adding the "position" description in turn requires the structuring method to extract "position" information, which is explained further in this embodiment.
c) Medical care is a process rather than a simple static combination of medical measures. In particular, in the treatment plan and disposition measures drawn up for a patient's condition, different medical measures follow one another in order. To retain these sequential dependencies, a description of time-sequence information is added to the first medical knowledge base. For example, two members, step and substep, can be added to each attribute that needs to express a time sequence, describing the order in which the attribute occurs during treatment and thereby serializing the attributes.
d) On medical grounds, the first medical knowledge base in this embodiment is divided into eight sections: chief complaint, follow-up visit, history of present illness, past medical history, examination, diagnosis, treatment plan and disposition. Within each section, attributes are designed and classified for the specific medical field to be described. For example, in the field of oral prosthodontics, the examination section contains the examination results for the teeth and for the oral cavity; the oral cavity results are divided into two subsections according to whether they relate to a tooth position; and each of these parts contains several attributes that describe in detail the disease information found in the various examinations.
The first medical knowledge base described above makes it possible to express the original record text in structured form as intended.
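The knowledge base layout described above can be sketched as a few small data classes. This is only an illustration under stated assumptions: the class names, field names, the `value_type` labels and the "tooth mobility" example attribute are hypothetical and do not come from the patent, which shows its actual format only in figures and tables.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Keyword:
    text: str
    score: float  # e.g. positive for technical terms, negative for negation words


@dataclass
class Attribute:
    name: str
    value_type: str                       # assumed labels: "single", "multi", "judge", "number"
    options: Dict[str, List[Keyword]]     # attribute value option -> keywords for that option
    name_keywords: List[Keyword] = field(default_factory=list)
    position: Optional[str] = None        # body part the attribute applies to
    step: Optional[int] = None            # order within a treatment process, if any
    substep: Optional[int] = None


@dataclass
class Section:
    name: str                             # e.g. "examination", "treatment plan"
    attributes: List[Attribute] = field(default_factory=list)


# A hypothetical one-attribute knowledge base, written in English for readability.
knowledge_base = [
    Section("examination", [
        Attribute(
            name="tooth mobility",
            value_type="single",
            options={
                "none": [Keyword("not loose", 1.0)],
                "present": [Keyword("loose", 1.0)],
            },
            name_keywords=[Keyword("loose", 1.0)],
        )
    ])
]
```

Keeping `step` and `substep` optional reflects the description that only attributes needing a time sequence carry those two members.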
Optionally, the special characters include at least one of: Chinese and English commas, full stops, newline characters and tab characters.
Optionally, before loading the first medical knowledge base, the method further comprises: loading a second medical knowledge base; extracting keywords and their scores according to the second medical knowledge base and a second electronic health record; and constructing the first medical knowledge base from the second medical knowledge base and the extracted keywords and scores. In each embodiment, the structure of the first medical knowledge base needs a corresponding specification, which the second medical knowledge base provides; the second medical knowledge base serves as the specification template for the first medical knowledge base in this embodiment. Like the first medical knowledge base, the second medical knowledge base comprises multiple sections, each containing one or more attributes, and each attribute includes at least an attribute name, an attribute value and a position. Unlike the first medical knowledge base, the second medical knowledge base contains neither the keywords corresponding to each attribute nor their scores. These keywords and scores are extracted from the second electronic health record. The first medical knowledge base is built by adding one or more keywords and their scores to each attribute in the second medical knowledge base.
Optionally, extracting keywords and keyword scores according to the second medical knowledge base and the second electronic health record comprises: segmenting the text sentences in the second electronic health record into words according to the attribute names and attribute values to obtain multiple keywords, and also taking the near-synonyms and synonyms of those keywords as keywords; and assigning each keyword a score according to its importance (whether it is a common word), its negativity (whether it is a negation word) and its weight in logical relations (AND, OR, NOT).
Optionally, using the matching and scoring algorithm to match each text sentence against the attributes in the first medical knowledge base comprises: matching each text sentence against the keywords and scores of all attributes to obtain the sentence's total score for each attribute; for each attribute whose total score exceeds a preset threshold, matching the sentence against the keywords and scores of that attribute's values and positions to obtain an attribute value score and a position score; and taking the attribute value and position with the highest scores, together with the corresponding attribute, as the matching result for the sentence. This matching and scoring algorithm realizes the matching of text sentences to attributes.
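The two-stage flow just described can be sketched in a few lines. This is a simplified illustration, not the patent's implementation: it assumes keywords are matched by plain substring search, uses a threshold of 1, omits position matching, and the `attributes` data and function names are hypothetical.

```python
THRESHOLD = 1.0  # assumed preset threshold for a successful match


def keyword_score(sentence, keywords):
    """Sum the scores of all keywords that occur in the sentence."""
    return sum(score for word, score in keywords.items() if word in sentence)


def match_sentence(sentence, attributes):
    """Return (attribute, value) pairs whose name and value both reach the threshold."""
    results = []
    for attr in attributes:
        # Stage 1: score the attribute name; skip attributes below the threshold.
        if keyword_score(sentence, attr["name_keywords"]) < THRESHOLD:
            continue
        # Stage 2: score every value option and keep the highest-scoring one.
        option_scores = {
            option: keyword_score(sentence, kws)
            for option, kws in attr["options"].items()
        }
        best = max(option_scores, key=option_scores.get)
        if option_scores[best] >= THRESHOLD:
            results.append((attr["name"], best))
    return results


# Hypothetical attributes, written in English for readability.
attributes = [
    {
        "name": "caries",
        "name_keywords": {"caries": 1.0},
        "options": {"shallow": {"shallow": 1.0}, "deep": {"deep": 1.0}},
    },
    {
        "name": "tenderness",
        "name_keywords": {"tender": 1.0},
        "options": {"present": {"tender": 1.0}},
    },
]

print(match_sentence("deep caries on the occlusal surface", attributes))
```

Because every attribute that passes both stages is appended, a sentence that matches several attributes yields several results.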
Optionally, the matching result includes: the text sentence and its corresponding attribute, attribute value, position and section, and the position of the text sentence in the first electronic health record. When saving the matching result, the result for each text sentence can be saved as one row of data, and the results for all text sentences can be arranged in order of time sequence and section and saved in .csv format for later querying and processing.
Optionally, the method further comprises: extracting and saving the text sentences that were not correctly matched by any attribute (including sentences that matched an attribute but not an attribute value). This makes it possible to track how well the text sentences are matched. For each section, a text sentence that matched no attribute can be saved as: text sentence, text start position, text end position, patient folder number, medical record number. A text sentence that matched no attribute value can be saved as: text sentence, matched attribute, text start position, text end position, patient folder number, medical record number. The preferred storage format is .xls.
After the text sentences not correctly matched by any attribute have been extracted, they can also be segmented into words, sorted and manually screened to find deficiencies in the keywords or attribute classification of the second medical knowledge base. By adding or deleting keywords and adjusting scores, the second medical knowledge base can be iteratively optimized, further improving its matching rate and accuracy on the text sentences of electronic health records.
From the above description of the embodiments, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the invention, or the part of it that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, etc.) to execute the methods described in the embodiments of the invention.
The electronic health record structuring method of the embodiment described with reference to Fig. 1 can be implemented by an electronic health record structuring device. Fig. 2 shows a hardware structure diagram of the electronic health record structuring device provided by an embodiment of the invention.
The electronic health record structuring device may include a processor 21 and a memory 22 storing computer program instructions.
Specifically, the processor 21 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits that implement the embodiments of the invention.
The memory 22 may include mass storage for data or instructions. By way of example and not limitation, the memory 22 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 22 may include removable or non-removable (fixed) media. Where appropriate, the memory 22 may be internal or external to the data processing device. In particular embodiments, the memory 22 is non-volatile solid-state memory. In particular embodiments, the memory 22 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), flash memory, or a combination of two or more of these.
The processor 21 reads and executes the computer program instructions stored in the memory 22 to implement any of the electronic health record structuring methods of the above embodiments.
In one example, the electronic health record structuring device may also include a communication interface 23 and a bus 20. As shown in Fig. 2, the processor 21, the memory 22 and the communication interface 23 are connected by the bus 20 and communicate with one another through it.
The communication interface 23 is mainly used for communication between the modules, devices, units and/or equipment in the embodiments of the invention.
The bus 20 includes hardware, software or both, and couples the components of the electronic health record structuring device to one another. By way of example and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCI-X) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), another suitable bus, or a combination of two or more of these. Where appropriate, the bus 20 may include one or more buses. Although the embodiments of the invention describe and illustrate specific buses, the invention contemplates any suitable bus or interconnect.
The electronic health record structuring device can execute the electronic health record structuring method of the embodiments of the invention on the data it obtains, thereby implementing the electronic health record structuring method described with reference to Fig. 1.
In addition, in combination with the electronic health record structuring method of the above embodiments, an embodiment of the invention may provide a computer-readable storage medium. Computer program instructions are stored on the computer-readable storage medium; when executed by a processor, the instructions implement any of the electronic health record structuring methods of the above embodiments.
To make the description of the embodiments clearer, a preferred embodiment is described and illustrated below.
This preferred embodiment provides an electronic health record structuring method. Fig. 3 is a flowchart of the electronic health record structuring method according to the preferred embodiment of the invention. As shown in Fig. 3, the process includes the following steps:
Step 1: construct the first medical knowledge base.
In the preferred embodiment, the first medical knowledge base is constructed from the second medical knowledge base as follows:
1. Define the format of the second medical knowledge base as in Table 1. Table 2 gives a detailed description of the "Requirement" column in Table 1, and Fig. 4 shows an example structure of the first medical knowledge base for the field of oral prosthodontics.
Table 1: A format of the second medical knowledge base
Table 2: Detailed description of the "Requirement" column in Table 1
Requirement             Description
Single choice           Default "unknown"; when the attribute-name score >= 1, select the highest-scoring option
Single choice *         Select the highest-scoring option
Multiple choice         Select all options with score >= 1
Judgment                Default "None"; when the attribute-name score >= 1 and no negation word appears, select "Yes"
Number                  Select the highest-scoring option (unit) and find the number before the unit in the sentence
Time                    Select the highest-scoring option (unit) and find the time word before the unit in the sentence
Single choice/number    When the attribute-name score >= 1, select the highest-scoring option, or find the number
2. Manually segment all phrases in Fig. 5 into words and, by sampling and screening medical records, collect the frequently occurring keywords (including synonyms, near-synonyms, abbreviations, misspellings, etc.);
3. Append the keywords to each matching object of the second medical knowledge base to form the first medical knowledge base, and assign scores according to each keyword's importance and part of speech (for example, technical terms receive positive scores, negation words receive negative scores, and common words receive 0). AND, OR and NOT relations are also realized through scores: for example, since a score greater than or equal to 1 counts as a successful match, if two keywords are required to appear together, each can be given a score of 0.5. See Table 3.
Table 3: A format of the first medical knowledge base
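The way keyword scores encode logical relations can be illustrated with a short sketch. The 0.5 + 0.5 rule for AND and the threshold of 1 come from the description above; the example keywords, the use of substring matching and the value -1.0 for a negation word are assumptions made only for illustration.

```python
def total_score(sentence, keyword_scores):
    """Add up the scores of the keywords that appear in the sentence."""
    return sum(score for word, score in keyword_scores.items() if word in sentence)


def is_match(sentence, keyword_scores, threshold=1.0):
    """A match succeeds when the accumulated score reaches the threshold."""
    return total_score(sentence, keyword_scores) >= threshold


# AND: both keywords must appear, so each contributes half of the threshold.
and_rule = {"crown": 0.5, "fracture": 0.5}
# OR: either keyword alone is enough.
or_rule = {"ache": 1.0, "pain": 1.0}
# NOT: a negation word (assumed score -1.0) cancels the positive keyword.
not_rule = {"swelling": 1.0, "no ": -1.0}

print(is_match("crown fracture", and_rule))   # both keywords present
print(is_match("crown intact", and_rule))     # only one keyword present
print(is_match("no swelling", not_rule))      # negated finding
```

Encoding the logic in the scores keeps the matcher itself a single summation, so new rules can be added by editing the knowledge base rather than the code.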
Step 2: split the first electronic health record into sentences.
In most cases, one short clause in an electronic health record (delimited by commas) corresponds to one "attribute-attribute value" pair. The record is therefore split at punctuation marks.
1. Split the entire record text into sentences at Chinese and English commas, full stops, newline characters and tab characters.
2. Handle the special cases of splitting (such as decimal points and serial numbers).
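The splitting step can be sketched with a single regular expression. The patent does not say how decimal points and serial numbers are handled, so the rule used here (an English period does not split when a digit precedes it) is an assumption, as are the function name and the sample record.

```python
import re

# Split on Chinese/English commas, the Chinese full stop, newlines and tabs,
# and on an English period unless a digit precedes it (assumed rule that
# keeps decimals such as "3.5" and leading serial numbers such as "1." intact).
_SPLIT = re.compile(r"[,，。\n\t]|(?<!\d)\.")


def split_sentences(record):
    """Return the non-empty, stripped text sentences of a record."""
    return [part.strip() for part in _SPLIT.split(record) if part.strip()]


record = "Tooth 11 missing，probing depth 3.5 mm。1.Full crown restoration\tno pain."
print(split_sentences(record))
```

Leaving the serial number attached to its sentence is convenient if a later stage reads the step order from the start of each sentence.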
Step 3: define the structured format.
1. The basic target format of structuring is: text sentence, attribute, attribute value, position, section, and the position of the text in the electronic health record. For attributes that need a time sequence, the target format is: text sentence, attribute, attribute value, position, step, substep, section, and the position of the text in the electronic health record.
2. Each such record forms one row; the whole patient file is arranged sentence by sentence and saved in .csv format.
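Writing one row per matched sentence could look like the following sketch, which uses Python's standard `csv` module. The column names and sample rows are hypothetical; only the list of fields follows the target format described above, and rows without a time sequence simply leave `step` and `substep` empty.

```python
import csv
import io

# Assumed column names for the target format, including the optional
# step/substep fields used by attributes that carry a time sequence.
FIELDS = ["sentence", "attribute", "value", "position",
          "step", "substep", "section", "text_position"]


def write_results(rows, stream):
    """Write one CSV row per matched sentence; missing fields stay empty."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical matching results, written in English for readability.
rows = [
    {"sentence": "deep caries", "attribute": "caries", "value": "deep",
     "position": "16", "section": "examination", "text_position": 42},
    {"sentence": "1.full crown restoration", "attribute": "restoration",
     "value": "full crown", "position": "16", "step": 1, "substep": 1,
     "section": "treatment plan", "text_position": 97},
]

buffer = io.StringIO()
write_results(rows, buffer)
print(buffer.getvalue())
```

In practice the stream would be a file opened with `newline=""`, as the `csv` module documentation recommends.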
Step 4: text sentence is matched with the first medical knowledge base.
1, to each text sentence, all properties are traversed.To each attribute, the matching score and attribute of the title that sets a property The matching score initial value for being worth each option is 0.
2, the attribute-name of attribute, attribute value, position are matched.Specific matching process is as described below:
A) attribute-name matches
It is matched according to the corresponding crucial phrase of attribute-name with text sentence, (positivity is crucial for running summary of the points scored if successful match Word adds, and negativity keyword subtracts), obtain the total score of all Keywords matchings of attribute-name.If score is more than certain threshold value, then it is assumed that The Property Name successful match of text sentence and this attribute, and carry out attribute value matching.
B) option type attribute value matches
To each option of attribute value, corresponding crucial phrase is matched with text sentence, the running summary of the points scored if success obtains To the total score of all Keywords matchings of the option.For single choice type attribute, take the highest option of running summary of the points scored as the attribute Attribute value;For multiselect type attribute, taking running summary of the points scored is more than attribute value of the total Options of certain threshold value as the attribute;It is right In judgement type attribute, the attribute value successful match is thought if option running summary of the points scored is more than certain threshold value.
C) Numeric attribute-value matching
Each character of the text sentence is examined in turn to find the contiguous character strings that represent numerical values; these are converted to a numeric type and used as the value of the attribute.
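One possible character-by-character scan is sketched below. It assumes that a numerical value is written with ASCII digits and at most one decimal point; signs, units, and Chinese numerals are not handled, and the function name is hypothetical.

```python
def extract_numbers(sentence):
    """Scan character by character and collect runs that represent numbers."""
    values, current = [], ""
    for ch in sentence + " ":  # the trailing space flushes the last run
        if ch.isascii() and ch.isdigit():
            current += ch
        elif ch == "." and current and "." not in current:
            # A single decimal point is allowed inside a run of digits.
            current += ch
        else:
            if current:
                # Drop a trailing full stop that ended the sentence.
                values.append(float(current.rstrip(".")))
                current = ""
    return values


numbers = extract_numbers("probing depth 3.5mm, mobility 2.")
```

When a sentence contains several numbers, which of them becomes the attribute value would depend on the attribute definition; that choice is outside this sketch.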
D) Position matching
If the attribute is related to tooth position, a regular expression is used to match the tooth position in the text sentence (three consecutive '/' characters serve as the identifying feature), and the match is taken as the position value of the attribute. If the position has multiple options, each position option is matched in the same way as in option-type attribute-value matching, and the option whose accumulated score satisfies the requirement for that type of position value is selected as the position value.
3. According to the score-matching criteria that correspond to the different requirements shown in Table 3, determine whether the "attribute-attribute value" pair meets the requirements. If it does, save the text sentence together with the corresponding "attribute-attribute value" pair in the format given in Step 3; if not, move on to matching the next attribute.
4. If several attributes match successfully, the result of every successful match is saved.
5. Extract the time-sequence information from the text.
Because the operations in the treatment-plan part are carried out in sequence, this order needs to be reflected in the structured result. For each text sentence in the treatment-plan part, the serial number indicating the step at the beginning of the sentence is taken as the operation order of the attribute corresponding to that sentence.
Because a single step may also offer several alternative schemes, this likewise needs to be reflected in the structured result. For each such text sentence, it is checked whether the sentence contains a word expressing an "or" relationship; if so, the sentence is split at that word and attribute matching is carried out on each part separately.
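The step number and the "or" split described above could be extracted as in the following sketch. The markers are assumptions: a leading integer followed by '.', ',', ')' or '、' is treated as the step number, and the words "or", "或者" and "或" are treated as expressing alternatives. The function name is hypothetical.

```python
import re

# Assumed step marker: leading digits followed by a separator character.
STEP_PATTERN = re.compile(r"^\s*(\d+)\s*[.,)、]\s*")
# Assumed alternative markers: English "or", Chinese "或者" / "或".
OR_PATTERN = re.compile(r"\s*(?:\bor\b|或者|或)\s*")


def split_plan_sentence(sentence):
    """Return (step, alternatives) for one treatment-plan sentence."""
    match = STEP_PATTERN.match(sentence)
    step = int(match.group(1)) if match else None
    body = sentence[match.end():] if match else sentence
    # Each alternative is later matched against the attributes separately.
    alternatives = [part for part in OR_PATTERN.split(body) if part]
    return step, alternatives


step, alternatives = split_plan_sentence("2. fixed bridge or removable denture")
```

Numbering the alternatives in order would be one way to fill the sub-step field of the target format, although the text does not state how sub-step values are assigned.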
6. Matching based on the matching-score algorithm allows the information in a text sentence to be extracted more fully. In most cases, one text sentence corresponds to one attribute; when one text sentence corresponds to several attributes, the logic of the algorithm allows all of these attributes to be matched. Because the medical knowledge base of the present invention describes values of multiple types, such as Boolean, real-number, and categorical types, and includes semantically positive and negative words in the keyword groups, the matching algorithm can not only correctly identify whether the semantics of the disease information are positive or negative, but can also extract specific numerical information about the disease (indicating, for example, its severity or a measured value), which other current structuring methods cannot achieve.
Matching text sentences with keyword groups makes it possible to identify multiple types of information in a text sentence, including positive or negative semantics, the different options of an attribute value, and numerical values, which greatly broadens the applicability of the method.
Step 5: Save the text sentences that were not fully matched during the matching process.
1. The format of the file of incompletely matched sentences is: text sentence, matched attribute*, text start position, text end position, patient folder number, patient file number.
2. Each such record forms one row; the sentences that were not successfully matched in all patient files are arranged sentence by sentence and saved in .xls format.
3. For each text sentence in each patient file, check its matching status. If the text sentence does not satisfy the conditions for a successful match, save it to the .xls table of the corresponding part.
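The selection of incompletely matched sentences can be sketched as follows. The data layout is an assumption: each sentence is a (text, start, end) tuple, and `matches` maps a sentence index to a list of records with a hypothetical `complete` flag. Writing the rows to an .xls file would require a spreadsheet library and is not shown.

```python
def collect_unmatched(sentences, matches, folder_no, file_no):
    """Build one row per sentence that has no fully successful match.

    sentences: list of (text, start, end) tuples (assumed layout).
    matches:   dict mapping sentence index -> list of
               {"attribute": str, "complete": bool} records (assumed layout).
    """
    rows = []
    for index, (text, start, end) in enumerate(sentences):
        found = matches.get(index, [])
        if any(m["complete"] for m in found):
            continue  # at least one attribute matched fully; nothing to log
        # Record attributes whose name matched but whose value check failed.
        partial = ";".join(m["attribute"] for m in found)
        rows.append([text, partial, start, end, folder_no, file_no])
    return rows


# Hypothetical input: the second sentence matched an attribute name only.
example_rows = collect_unmatched(
    [("tooth 16 missing", 0, 16), ("patient feels unwell", 17, 37)],
    {0: [{"attribute": "missing tooth", "complete": True}],
     1: [{"attribute": "chief complaint", "complete": False}]},
    folder_no=3,
    file_no=12,
)
```

Each returned row follows the field order given in item 1 of this step.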
Analysis of the structured results
In this embodiment, the electronic health record structuring method described above was implemented in Python as a tool for structuring medical record text, and more than 3,000 electronic health record texts were structured with it. The results are presented and analyzed statistically below.
The medical record texts processed in this embodiment are records of dentition defects from the Department of Prosthodontics. The medical knowledge base used was compiled from knowledge of the prosthodontics field; part of the knowledge base is shown in Figure 4. An example of medical record text is shown in Figure 5, and the structured result is shown in Figure 6.
Judging from the structured results, the method achieves the following beneficial effects:
1. The position information appearing in the medical record text can be identified accurately, including positions expressed in words, such as "upper jaw" and "lower jaw", and tooth position information.
2. The attributes and corresponding attribute values in the medical record text can be annotated effectively, and attribute values of different types can all be identified effectively.
3. The order of the different treatment measures in the text can be extracted effectively.
Compared with several existing medical record structuring methods, the method provided by the embodiment of the present invention constructs a more comprehensive first medical knowledge base, which fits the medical record text more closely and allows the information in the text to be extracted more fully. Existing methods, such as structuring methods based on positive or negative semantics, can often only give an affirmative or negative judgment on the medical terms described in the knowledge base according to the text, and cannot assign more comprehensive information (such as the site or degree of the disease) to the attribute.
The first medical knowledge base used in this embodiment comprises 12 parts with 389 attributes in total. Attribute values are of several types, such as multiple choice, single choice, judgment, and numerical value; position values are of types such as single choice and tooth position. Figure 7 shows the frequency statistics of some of the attributes in the structured results of this example; differences among the values of an attribute are not reflected in these data. As Figure 7 shows, the frequencies of different attributes differ greatly across the more than 3,000 medical records. Besides reflecting some of the common conditions in the records, this also offers a statistical means of understanding disease.
A certain number of medical records were selected at random and manually annotated against the first medical knowledge base. With these annotations as the standard, the structured results produced by the method were evaluated. The evaluation shows that the electronic health record structuring method provided by the embodiment of the present invention can complete the structuring tasks required by the first medical knowledge base.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. An electronic health record structuring method, characterized by comprising:
loading a first medical knowledge base;
splitting a first electronic health record into sentences according to special characters to obtain a plurality of text sentences;
using a matching-score algorithm to match each of the plurality of text sentences to an attribute in the first medical knowledge base; and
saving the matching result.
2. The method according to claim 1, characterized in that the first medical knowledge base comprises a plurality of parts, each part comprising one or more attributes and one or more keywords corresponding to the attributes; each attribute comprises at least an attribute name, an attribute value, and a position; and each keyword further comprises a score of the keyword.
3. The method according to claim 2, characterized in that the type of the attribute value comprises at least one of: real-number type, Boolean type, and discrete categorical type; and the value-taking mode of the attribute value comprises at least one of: judgment, single choice, number, and multiple choice.
4. The method according to claim 1, characterized in that the special characters comprise at least one of: Chinese and English commas, full stops, line breaks, and tab characters.
5. The method according to claim 1, characterized in that, before loading the first medical knowledge base, the method further comprises:
loading a second medical knowledge base, wherein the second medical knowledge base comprises a plurality of parts, each part comprising one or more attributes, and each attribute comprises at least an attribute name, an attribute value, and a position;
extracting keywords and their scores according to the second medical knowledge base and a second electronic health record; and
constructing the first medical knowledge base according to the second medical knowledge base and the extracted keywords and their scores.
6. The method according to claim 5, characterized in that extracting keywords and keyword scores according to the second medical knowledge base and the second electronic health record comprises:
segmenting the second electronic health record into words according to the attribute names and attribute values to obtain a plurality of keywords, and also taking near-synonyms and synonyms of the keywords as keywords; and
assigning different scores to the keywords according to the weights of their importance, negativity, and logical relationships.
7. The method according to claim 1, characterized in that using the matching-score algorithm to match each of the plurality of text sentences to an attribute in the first medical knowledge base comprises:
matching each text sentence against the keywords and scores of all attributes to obtain the total score of each text sentence for every attribute;
for a text sentence whose total score for an attribute is higher than a preset threshold, matching the text sentence against the keywords and scores within that attribute to obtain the attribute-value score and position score of the attribute value and position corresponding to the attribute in the text sentence; and
taking the attribute value and position with the highest attribute-value score and position score, together with the corresponding attribute, as the matching result of the text sentence.
8. The method according to claim 7, characterized in that the matching result comprises: the text sentence; the attribute, attribute value, position, and part corresponding to the text sentence; and the position of the text sentence in the first electronic health record.
9. The method according to any one of claims 1 to 8, characterized in that the method further comprises:
extracting and saving the text sentences that are not correctly matched to any attribute.
10. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 9.
CN201811513668.0A 2018-12-11 2018-12-11 Electronic medical record structuring method and computer-readable storage medium Active CN109637605B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811513668.0A CN109637605B (en) 2018-12-11 2018-12-11 Electronic medical record structuring method and computer-readable storage medium

Publications (2)

Publication Number Publication Date
CN109637605A true CN109637605A (en) 2019-04-16
CN109637605B CN109637605B (en) 2022-05-10

Family

ID=66072953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811513668.0A Active CN109637605B (en) 2018-12-11 2018-12-11 Electronic medical record structuring method and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN109637605B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001101184A (en) * 1999-10-01 2001-04-13 Nippon Telegr & Teleph Corp <Ntt> Method and device for generating structurized document and storage medium with structurized document generation program stored therein
CN1614587A (en) * 2003-11-07 2005-05-11 杨立伟 Method for digesting Chinese document automatically
CN102298588A (en) * 2010-06-25 2011-12-28 株式会社理光 Method and device for extracting object from non-structured document
CN103020453A (en) * 2012-12-15 2013-04-03 中国科学院深圳先进技术研究院 Generation method of structured electronic medical record based on ontology technology
CN106095913A (en) * 2016-06-08 2016-11-09 广州同构医疗科技有限公司 A kind of electronic health record text structure method
CN106897568A (en) * 2017-02-28 2017-06-27 北京大数医达科技有限公司 The treating method and apparatus of case history structuring
CN107085655A (en) * 2017-04-07 2017-08-22 江西中医药大学 The traditional Chinese medical science data processing method and system of constrained concept lattice based on attribute
CN107578798A (en) * 2017-10-26 2018-01-12 北京康夫子科技有限公司 The processing method and system of electronic health record
CN107908768A (en) * 2017-09-30 2018-04-13 北京颐圣智能科技有限公司 Method, apparatus, computer equipment and the storage medium of electronic health record processing
CN108009157A (en) * 2017-12-27 2018-05-08 北京嘉和美康信息技术有限公司 A kind of sentence classifying method and device
CN108182972A (en) * 2017-12-15 2018-06-19 上海长江科技发展有限公司 The intelligent coding method and system of Chinese medical diagnosis on disease based on participle network
CN108711443A (en) * 2018-05-07 2018-10-26 成都智信电子技术有限公司 The text data analysis method and device of electronic health record

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
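J. Narayanan et al.: "Structured clinical documentation in the electronic medical record to improve quality and to support practice-based research in epilepsy", Epilepsia *
Zhou Jun (周钧): "Research on an ontology-based clinical medical case knowledge base", China Master's Theses Full-text Database, Information Science and Technology *
Zhong Wenming (钟文明) et al.: "Research on the construction of semi-structured electronic medical records based on clinical pathways", Chinese Medical Record *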

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110277149A (en) * 2019-06-28 2019-09-24 北京百度网讯科技有限公司 Processing method, device and the equipment of electronic health record
CN110704632A (en) * 2019-08-26 2020-01-17 南京医渡云医学技术有限公司 Method and device for processing clinical data, readable medium and electronic equipment
TWI750513B (en) * 2019-10-05 2021-12-21 業務人資訊有限公司 Insurance claim and underwriting assistance system and implementation method thereof
CN111192646A (en) * 2019-12-30 2020-05-22 北京爱医生智慧医疗科技有限公司 Method and device for extracting physical sign information in electronic medical record
CN112101034A (en) * 2020-09-09 2020-12-18 沈阳东软智能医疗科技研究院有限公司 Method and device for distinguishing attribute of medical entity and related product
CN112101034B (en) * 2020-09-09 2024-02-27 沈阳东软智能医疗科技研究院有限公司 Method and device for judging attribute of medical entity and related product
CN112883712A (en) * 2021-02-05 2021-06-01 中国人民解放军南部战区总医院 Intelligent input method and device for electronic medical record
CN112883712B (en) * 2021-02-05 2023-05-02 中国人民解放军南部战区总医院 Intelligent input method and device for electronic medical record
CN113988082A (en) * 2021-10-28 2022-01-28 泰康保险集团股份有限公司 Text processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN109637605B (en) 2022-05-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant