CN111209742A - Method and device for determining diagnosis basis data, readable medium and electronic equipment - Google Patents

Info

Publication number
CN111209742A
Authority
CN
China
Prior art keywords
diagnosis
data
diagnostic
determining
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911360739.2A
Other languages
Chinese (zh)
Inventor
赖昆
邢俊珠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yidu Cloud Beijing Technology Co Ltd
Original Assignee
Nanjing Yiyi Yunda Data Technology Co Ltd
Nanjing Yirui Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Yiyi Yunda Data Technology Co Ltd and Nanjing Yirui Technology Co Ltd
Priority to CN201911360739.2A
Publication of CN111209742A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method and a device for determining diagnosis basis data, a computer readable storage medium and electronic equipment, wherein the method comprises the following steps: determining diagnostic data in the medical record; determining a diagnosis element word set according to the diagnosis data; and determining first diagnosis basis data from the diagnosis data according to the diagnosis element word set. By the technical scheme of the invention, when the medical record is subjected to quality control according to the determined diagnosis basis data, the quality control efficiency and quality can be improved.

Description

Method and device for determining diagnosis basis data, readable medium and electronic equipment
Technical Field
The invention relates to the field of medical technology and artificial intelligence, in particular to a method and a device for determining diagnosis basis data, a readable medium and electronic equipment.
Background
The first page of a medical record usually records the patient's diagnosis result, which indicates the patient's disease and/or treatment mode. The medical record is an important data source for judging whether that diagnosis result is correct, and the correctness of the diagnosis result is key to the data quality of the medical record. If a wrong diagnosis result is stored without correction, the patient will obtain a wrong result when consulting the medical record later, and the formulation of treatment plans will be affected. Moreover, once wrong diagnosis results are uploaded to national medical regulatory agencies, they reduce the accuracy of regional medical statistics and hinder the making of medical policy. Wrong diagnosis results accumulated in a hospital also affect the construction of the hospital's big data system and have long-term negative effects. Quality control inspection of medical records is therefore needed to ensure the accuracy of diagnosis results at the source and to increase their value in use.
To improve the efficiency and quality of medical record quality control, it is usually necessary to determine the diagnosis basis data in the medical record. For example, if the first page of a medical record states that a certain operation was performed on the patient, the evidence in the medical record that a doctor performed the operation is the diagnosis basis data of that operation; if the first page states that the patient has a certain disease, the records in the medical record that support the disease are the diagnosis basis data of that disease. However, there is currently no technical means for determining the diagnosis basis data in a medical record.
Disclosure of Invention
The invention provides a method and a device for determining diagnosis basis data, a computer readable storage medium and electronic equipment, which can improve the quality control efficiency when the quality control is performed on medical records according to the determined diagnosis basis data.
In a first aspect, the present invention provides a method for determining diagnosis basis data, comprising:
determining diagnostic data in the medical record;
determining a diagnosis element word set according to the diagnosis data;
and determining first diagnosis basis data from the diagnosis data according to the diagnosis element word set.
Preferably, the determining diagnostic data in the medical record comprises:
determining a diagnosis text in the medical record according to a preset diagnosis data position configuration file;
and segmenting the diagnostic text, and determining the segmented diagnostic text as diagnostic data.
Preferably, the method further comprises the following steps:
sorting the first diagnosis basis data according to the priority information in the preset diagnosis data position configuration file;
and determining the sorted first diagnosis basis data as second diagnosis basis data.
Preferably, the determining a diagnosis element word set according to the diagnosis data comprises:
determining a diagnosis name in the diagnosis data;
performing word segmentation on the diagnosis name to determine a diagnosis element word set.
Preferably, the determining a diagnosis name in the diagnosis data comprises:
when a diagnosis code exists in the diagnosis data, determining the name corresponding to the diagnosis code in the international disease classification table as the diagnosis name;
when no diagnosis code exists in the diagnosis data, determining the diagnosis name recorded in the diagnosis data.
Preferably, the segmenting the diagnosis name to determine a diagnosis element word set includes:
segmenting the diagnosis name with a preset word segmenter to determine a first word set;
filtering stop words out of the first word set to determine the diagnosis element word set.
Preferably, the segmenting the diagnosis name to determine a diagnosis element word set includes:
acquiring a diagnosis element configuration table corresponding to the international disease classification table;
adding the diagnosis element configuration table to a preset first custom lexicon to determine a second custom lexicon;
segmenting the diagnosis name with the word segmenter corresponding to the second custom lexicon to determine a second word set;
filtering words in the second set of words that are not in the diagnostic element configuration table to determine a set of diagnostic element words.
Preferably, the determining first diagnosis basis data from the diagnosis data according to the diagnosis element word set includes:
judging whether the diagnosis data, and/or the synonym diagnosis data corresponding to the diagnosis data, together with the diagnosis element word set meet a preset condition;
and when the preset condition is met, determining the diagnosis data as first diagnosis basis data.
Preferably, the first diagnosis basis data includes position information indicating where each diagnosis element word of the diagnosis element word set is located in the first diagnosis basis data.
In a second aspect, the present invention provides a device for determining diagnosis basis data, comprising:
a first data determination module for determining diagnostic data in a medical record;
the set determining module is used for determining a diagnosis element word set according to the diagnosis data;
and the second data determination module is used for determining first diagnosis basis data from the diagnosis data according to the diagnosis element word set.
In a third aspect, the invention provides a computer-readable storage medium comprising executable instructions which, when executed by a processor of an electronic device, cause the processor to perform the method according to any one of the first aspect.
In a fourth aspect, the present invention provides an electronic device, comprising a processor and a memory storing execution instructions, wherein when the processor executes the execution instructions stored in the memory, the processor performs the method according to any one of the first aspect.
The invention provides a method and a device for determining diagnosis basis data, a computer readable storage medium and electronic equipment. The method determines the diagnosis data in a medical record and does not consider the other data in the medical record, which reduces the influence of that other data and ensures the accuracy of the diagnosis data. Because of the physiological and pathological complexity of a patient, the diagnosis data in a medical record contains a large amount of data with little or no medical value. A diagnosis element word set with relatively high medical value is therefore determined from the diagnosis data, and first diagnosis basis data with relatively high medical value is then determined from the diagnosis data according to that set. The first diagnosis basis data is the basis for judging the patient's disease and/or treatment mode and is screened from the medical record for its relatively high medical value, so performing quality control on the medical record according to the first diagnosis basis data can improve both the quality and the efficiency of quality control.
Further effects of the above preferred modes will be described below in conjunction with specific embodiments.
Drawings
To illustrate the embodiments of the present invention or the prior art solutions more clearly, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart illustrating a method for determining diagnosis basis data according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating another method for determining diagnosis basis data according to an embodiment of the present invention;
FIG. 3 is a flow chart illustrating a method for determining diagnosis basis data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a device for determining diagnosis basis data according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be described in detail and completely with reference to the following embodiments and accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As described above, the prior art provides no method for determining the diagnosis basis data in a medical record. The present invention uses computer technology to obtain more accurate diagnosis basis data from the diagnosis data in the medical record.
As shown in fig. 1, an embodiment of the present invention provides a method for determining diagnosis basis data, including the following steps:
Step 101, determining diagnostic data in a medical record.
The diagnostic data typically comes from a medical record, which is a document in the patient's medical record file and is usually medical data already stored in a hospital big data system or with a national medical regulatory agency. A medical record is usually the carrier that records a patient's basic condition and physiological and pathological conditions, and the diagnostic data specifically refers to the data that reflects those physiological and pathological conditions. Because these conditions are complex, the data sources needed to judge the patient's disease and treatment are usually considered from all aspects, so the diagnostic data is usually comprehensive and complete, which supports its accuracy.
It should be noted that the medical record is the written record made by medical staff of the patient's disease course and treatment; it is the basis on which doctors diagnose and treat diseases and is valuable material for medical research. The medical records usually include basic patient information, admission records, operation records, examination records, discharge records, medical orders and other data reflecting the patient's basic condition and physiological and pathological conditions.
Step 102, determining a diagnosis element word set according to the diagnosis data.
A patient's physiological and pathological conditions differ from one period to another, so the patient's diseases and/or treatment modes differ as well, and a patient may suffer from several diseases. All of this makes the diagnosis data in patients' medical records different and diverse, and consequently the diagnosis element word set differs from one medical record to another. A corresponding diagnosis element word set therefore needs to be determined for each medical record.
Step 103, determining first diagnosis basis data from the diagnosis data according to the diagnosis element word set.
The diagnosis data takes the patient's physiological and pathological conditions into account comprehensively. Given the complexity of those conditions and the comprehensiveness of the diagnosis data, some of the diagnosis data in a medical record has little or no medical value for judging the patient's disease and/or treatment mode. Such data lowers diagnostic efficiency or raises the probability of misdiagnosis, which in turn lowers the efficiency and raises the difficulty of medical record quality control. The diagnosis element word set is used to filter out the diagnosis data with little or no medical value and to determine first diagnosis basis data with high medical value, so that quality control performed on the medical record according to this data is more efficient and less difficult. The first diagnosis basis data specifically refers to the data on which a doctor relies to determine a patient's disease and treatment mode.
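The filtering in step 103 can be illustrated with a minimal Python sketch. The preset condition is not specified at this point in the text, so the sketch assumes one simple possibility: a clause qualifies when it contains every diagnosis element word. The function name `find_basis_clauses`, the sample clauses and the case-insensitive substring matching are all hypothetical. The recorded character offsets illustrate the kind of position information the first diagnosis basis data may carry.

```python
def find_basis_clauses(clauses, element_words):
    """Keep clauses that contain every element word (assumed condition).

    Each kept clause is returned with the character offset of the first
    occurrence of each element word inside that clause.
    """
    results = []
    for clause in clauses:
        lowered = clause.lower()
        positions = {w: lowered.find(w.lower()) for w in element_words}
        # str.find returns -1 when the word is absent from the clause.
        if all(pos >= 0 for pos in positions.values()):
            results.append({"clause": clause, "positions": positions})
    return results


# Hypothetical clauses, invented for illustration only.
clauses = [
    "Resection of the left liver lobe was performed",
    "Blood pressure was stable",
]
basis = find_basis_clauses(clauses, ["liver", "resection"])
```

Only the first clause is kept here, because the second contains neither element word. A real condition could also consult a synonym library, which this sketch leaves out.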
It should be noted that, after being sorted by the medical record administrator, medical records are filed in the medical record room and become a medical record file. The patient's disease and/or treatment mode is recorded on the first page of the medical record file, and the medical records in the file are the source from which that disease and/or treatment mode is determined. The accuracy of the disease and/or treatment mode recorded on the first page can be judged by performing quality control on the first diagnosis basis data in the medical record. Here, quality control of the medical record specifically means comprehensively checking and analyzing the diagnosis basis data to judge whether the patient actually has a certain disease and/or whether the recorded treatment mode is accurate.
The beneficial effects of this technical scheme are as follows. Considering only the diagnosis data in the medical record reduces the influence of the other data in the medical record and ensures the accuracy and integrity of the diagnosis data. Because the diagnosis data is complete and the patient's physiological and pathological conditions are complex, the diagnosis data often contains a large amount of data with little or no medical value; using the diagnosis data directly as diagnosis basis data would reduce the accuracy of the diagnosis basis data and therefore the efficiency and quality of medical record quality control. Instead, a diagnosis element word set with relatively high medical value is determined from the diagnosis data, the diagnosis element words in this set being indispensable data in the diagnosis data. First diagnosis basis data with relatively high medical value is then determined from the diagnosis data according to the diagnosis element word set. The first diagnosis basis data is the basis for judging the patient's disease and/or treatment mode and is screened from the medical record for its relatively high medical value, so performing quality control on the medical record with the first diagnosis basis data can improve both the quality and the efficiency of medical record quality control.
Fig. 1 shows only a basic embodiment of the method of the present invention, and based on this, certain optimization and expansion can be performed, and other preferred embodiments of the method can also be obtained.
FIG. 2 illustrates another embodiment of the method for determining diagnosis basis data according to the present invention. This embodiment discloses and expands the embodiment shown in FIG. 1 in more detail. For ease of explanation, it is described in conjunction with the specific scenario below. It should be understood that the method described in this embodiment is also applicable to other relevant scenarios.
The specific scenario of this embodiment is as follows. The first page of the medical record file records that the patient underwent a hepatectomy on a certain date. The medical record file comprises a plurality of medical records that record the patient's physiological and pathological conditions, including basic information, admission records, operation records, first postoperative course records, invasive diagnosis and treatment operation records (records of the various diagnostic and therapeutic operations performed during clinical diagnosis and treatment, specifically including interventional therapy, common clinical diagnosis and treatment techniques and the like), examination data, medical orders, discharge records and the like. A diagnostic data location configuration file, an international disease classification table and a synonym library are preset.
In practice, the first page of the medical record usually records several diseases and/or several treatment modes of the patient; for ease of description, only hepatectomy is taken as an example. The diagnostic data location configuration files of different medical records are similar, and their contents can be modified according to the actual scenario. The contents of the diagnostic data location configuration file are shown in Table 1:
[Table 1 appears as an image in the original publication and is not reproduced here. It lists category names and their priorities.]
TABLE 1
In Table 1, the value 1 indicates that the surgical record has the highest priority; the other values are interpreted in the same way and are not described again here.
The method of the embodiment comprises the following steps:
Step 201, determining a diagnosis text in the medical record according to a preset diagnostic data location configuration file, splitting the diagnosis text into clauses, and determining the clause-split diagnosis text as diagnostic data.
A medical record comprises a plurality of fields and records, and the location of a piece of data is determined by the position of its record and the position of its field. The diagnostic data location configuration file indicates the field positions and record positions of the diagnostic data in the medical record, that is, the row and column of the diagnostic data. The position of a field refers to the position of the field name, and the position of a record refers to the position of the field content corresponding to that field name. Specifically, the configuration file comprises fields, for example the category name column and the priority column in Table 1, and each field comprises a plurality of data items, for example "surgical record" and "1" in Table 1. Because the categories of data differ little from one medical record to another, the configuration file is applicable to all medical records. Each medical record may include data corresponding to one or more categories, where the categories include, but are not limited to, the surgical record, first postoperative course record, invasive diagnosis and treatment operation record, examination data, medical orders and discharge record in Table 1; the categories may also include the basic information and admission record of the medical record in this scenario. The configuration file may of course be adapted to actual scenario requirements.
According to the diagnostic data location configuration file, several categories in the medical record can be determined, and the complete records corresponding to those categories are determined as the diagnostic data; in other words, the field position and record position of the diagnostic data in the medical record are determined. For example, the field position is the position of the data item containing "surgical record", the record position is the row data corresponding to the surgical record, and the diagnostic data is the complete record corresponding to the surgical record in the medical record. It should be noted that the data items under the category names in the configuration file together determine the diagnostic data in the medical record comprehensively. The diagnostic data therefore usually includes the data in the medical record that comprehensively reflects the patient's physiology and pathology, and is usually comprehensive, complete and accurate, with relatively high medical value.
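The lookup described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the configuration file is modeled as a list of category and priority pairs, the medical record as a dictionary keyed by category name, and the names `LOCATION_CONFIG` and `select_diagnosis_texts` are hypothetical. Only the priority of the surgical record (1) is given in the text; the other priorities and all record contents are placeholders.

```python
# Hypothetical stand-in for the diagnostic data location configuration file.
# Only "surgical record" -> 1 comes from the text; the rest are placeholders.
LOCATION_CONFIG = [
    {"category": "surgical record", "priority": 1},
    {"category": "first postoperative course record", "priority": 2},
    {"category": "discharge record", "priority": 3},
]


def select_diagnosis_texts(medical_record, config):
    """Return (category, priority, text) for each configured category
    that is present and non-empty in the medical record."""
    selected = []
    for entry in config:
        text = medical_record.get(entry["category"])
        if text:
            selected.append((entry["category"], entry["priority"], text))
    return selected


# Hypothetical medical record; the texts are placeholders, not real data.
record = {
    "basic information": "placeholder basic information",
    "surgical record": "placeholder surgical record text",
    "discharge record": "placeholder discharge record text",
}
texts = select_diagnosis_texts(record, LOCATION_CONFIG)
```

Categories that are not listed in the configuration, such as the basic information here, are ignored, and keeping the priority next to each text makes a later sort by priority straightforward.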
It should be noted that the diagnostic data is usually a compound sentence with a complete meaning, composed of several clauses. In natural language processing, a compound sentence usually needs to be split into clauses so that the data can be checked efficiently and accurately. A clause is a part split from the compound sentence that is equivalent to a simple sentence. Clauses are usually separated by delimiters, including but not limited to commas and semicolons, which mark the beginning or end of a clause. Splitting the diagnostic data into clauses reduces the correlation between clauses without changing their semantics, which preserves the authenticity and accuracy of the diagnostic data.
For example, the diagnostic data includes the complete row data corresponding to the surgical record, first postoperative course record, invasive diagnosis and treatment operation record, examination data, medical orders, discharge record and the like. Because the amount of data in a medical record is large, the specific content of the diagnostic data is not described here.
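The clause splitting of step 201 can be sketched with a regular expression. This is only an illustration: the separator set is limited to the commas and semicolons mentioned above (half-width and full-width forms, the latter assumed because the source records are Chinese), and the function name `split_clauses` and the sample sentence are hypothetical.

```python
import re

# Separators named in the text: commas and semicolons.
# The full-width forms are an assumption for Chinese-language records.
CLAUSE_SEPARATORS = re.compile(r"[,;，；]")


def split_clauses(text):
    """Split a compound sentence into clauses without altering their wording."""
    parts = CLAUSE_SEPARATORS.split(text)
    # Drop empty fragments and surrounding whitespace only.
    return [part.strip() for part in parts if part.strip()]


# Hypothetical diagnosis text, invented for illustration.
clauses = split_clauses(
    "Left hemihepatectomy performed; bleeding was minimal, no transfusion needed"
)
```

Each clause keeps its original words, so the splitting changes only the granularity of the data and not its content.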
Step 202, when a diagnosis code exists in the diagnosis data, determining the name of the diagnosis code corresponding to the international disease classification table as a diagnosis name; determining a diagnostic name in the diagnostic data when no diagnostic code is present in the diagnostic data.
Specifically, the International Classification of Diseases (ICD) is a system that classifies diseases by certain characteristics according to rules and expresses them as codes; that is, it is the standard for expressing operation names and disease names. The ICD table mainly includes codes and the operation names or disease names corresponding to the codes. Operation names are determined according to the ninth revision of the ICD, usually ICD-9-CM-3, and disease names are determined according to the tenth revision. In the scenario of this embodiment, hepatectomy is a general term for operations that remove a certain amount of liver parenchyma for therapeutic purposes, including liver lobectomy, liver segmental resection, and irregular liver resection with a small resection range. The ICD-9-CM-3 codes for hepatectomy include 50.2, 50.22, 50.3 and 50.4, where 50.2 represents local excision or destruction of liver tissue or liver lesions, 50.22 represents partial hepatectomy (including wedge resection of the liver), 50.3 represents liver lobectomy, and 50.4 represents total hepatectomy. The specific entries under these codes in ICD-9-CM-3 are shown in Table 2:
Code         Name
50.22 003    Liver II section resection
50.22 004    Liver III section resection
……           ……
50.22 009    Liver VIII section resection
50.22 011    Partial resection of liver
50.22 013    Wedge resection of the liver
50.3 001     Liver lobe resection
50.3 002     Right hemihepatectomy
50.3 003     Left hemihepatectomy
50.3 004     Total hepatectomy
50.4 001     Total hepatectomy
TABLE 2
When a patient is ill, a doctor determines the patient's disease and/or treatment mode from the patient's diagnosis data, that is, determines the diagnosis name of the disease and/or of the operation. The diagnosis name indicates the patient's diagnosis result and summarizes the patient's disease and/or treatment mode. For example, hepatectomy is a diagnosis name indicating that the patient underwent a hepatectomy for some reason, including but not limited to malignant liver tumor, benign liver tumor, intrahepatic bile duct stones, liver trauma, liver abscess and echinococcosis. When the diagnosis data contains a diagnosis code but no diagnosis name, the diagnosis code has usually been written by a doctor according to the coding rules of the international disease classification, so the name corresponding to the diagnosis code in the international disease classification table is directly determined as the diagnosis name, which ensures the accuracy of the diagnosis name. When the diagnosis data contains both a diagnosis name and a diagnosis code, the name corresponding to the diagnosis code in the international disease classification table is determined as the diagnosis name instead of the name written in the diagnosis data, because the names written by doctors vary. For example, when the diagnosis name written in the diagnosis data is hepatectomy and the diagnosis code is 50.3003 or ICD-9-CM-3-50.3003, left hemihepatectomy is used as the diagnosis name according to Table 2. When the diagnosis data contains a diagnosis name but no diagnosis code, the diagnosis name is extracted directly to preserve the authenticity and validity of the data.
When the diagnosis data contains neither a diagnosis name nor a diagnosis code, the diagnosis data has little or no medical value, and the process of determining the first diagnosis basis data can end. Determining the diagnosis code and the diagnosis name only requires natural language recognition of the diagnosis data, which can be implemented with existing recognition techniques.
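The decision rule of step 202 can be sketched in Python. This is a minimal illustration, not a definitive implementation: `ICD_TABLE` holds only three entries copied from Table 2, the names `normalize_code` and `resolve_diagnosis_name` are hypothetical, and the recognition of codes and names in free text is assumed to have happened already. Falling back to the written name when a code is missing from the table is an assumption, since the text does not cover that case.

```python
# Three entries from Table 2, keyed by code with the space removed.
ICD_TABLE = {
    "50.3001": "Liver lobe resection",
    "50.3002": "Right hemihepatectomy",
    "50.3003": "Left hemihepatectomy",
}

CODE_PREFIX = "ICD-9-CM-3-"


def normalize_code(code):
    """Strip the optional prefix and inner spaces, e.g. '50.3 003' -> '50.3003'."""
    code = code.strip()
    if code.upper().startswith(CODE_PREFIX):
        code = code[len(CODE_PREFIX):]
    return code.replace(" ", "")


def resolve_diagnosis_name(code, written_name, icd_table=ICD_TABLE):
    """Prefer the ICD table name when a code exists; otherwise use the
    written name. Returns None when neither is available."""
    if code:
        name = icd_table.get(normalize_code(code))
        if name is not None:
            return name
        # Assumption: an unknown code falls through to the written name.
    return written_name or None


name = resolve_diagnosis_name("ICD-9-CM-3-50.3003", "hepatectomy")
```

A return value of `None` corresponds to the case in which the diagnosis data contains neither a code nor a name and the process ends.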
Step 203, performing word segmentation on the diagnosis name according to a preset word segmentation device to determine a first word set, and filtering stop words in the first word set to determine a diagnosis element word set.
In most cases, the diagnosis name of a disease and/or operation written by a doctor is not the exact name in the international disease classification table; instead, the doctor keeps the necessary words of that name and adds or deletes unnecessary ones, so the same diagnosis can be written in many ways. Taking hepatectomy as an example, it can be written as liver resection, hepatectomy, liver lobectomy, liver segmental resection, partial hepatectomy and so on. The necessary words of these diagnosis names are "liver" and "resection": the names share the same semantics and necessary words and differ only in how they are written. Segmenting the diagnosis name into words breaks the coupling between its words and reduces the influence of word order and of the correlation between words, and thus reduces the influence of the diversity of diagnosis names. It should be noted that the word segmenter does not change the words that make up the diagnosis name. An existing word segmenter, which has a built-in lexicon, can be used to segment the diagnosis name.
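To illustrate the point that differently written diagnosis names share the same necessary words, the sketch below intersects the word sets of three names taken from Table 2. A simple letter-based tokenizer stands in for the preset word segmenter, and the helper name `word_set` is hypothetical. The intersection is only a demonstration and is not the procedure the method uses to build the diagnosis element word set.

```python
import re


def word_set(name):
    """Lower-case the name and split it into alphabetic words."""
    return set(re.findall(r"[a-z]+", name.lower()))


# Three entries from Table 2 that describe liver resections.
names = [
    "Partial resection of liver",
    "Wedge resection of the liver",
    "Liver lobe resection",
]

# Words present in every name, regardless of word order.
common_words = set.intersection(*(word_set(n) for n in names))
```

Because the comparison works on sets, word order no longer matters, which is the effect the segmentation step aims for.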
Considering that the data depended on by different diseases and treatment modes have larger difference, namely the difference between the diagnosis data of different medical records is larger, meanwhile, the difference between the data for judging the symptoms and the treatment modes of the patients in the diagnosis data is larger, but doctors can record the diagnosed symptoms and the treatment modes of the patients in medical records, and the diagnostic designation indicates the disease and/or treatment modality of the patient, where the diagnostic designation is generally determined from analysis of diagnostic data, the diagnosis data of different medical records comprise diagnosis names, obviously, the diagnosis names are indispensable data with high medical value in the diagnosis data, and compared with other diagnosis data, the diagnosis names do not need to consider the complex physiological and pathological conditions of the patient, and the disease and/or treatment mode of the patient can be summarized more accurately and simply.
Considering that the words in the diagnosis names are highly correlated, and data matching is not utilized, it is generally required to determine a diagnosis element word set of the diagnosis names to improve matching between the diagnosis names and the data, the diagnosis element word set has a high medical value, specifically, the diagnosis element word set specifically refers to a minimum word group word necessary in the diagnosis names of the diseases and/or the diagnosis names of the operations, for example, the diagnosis element word set of hepatectomy includes two diagnosis element words of liver and resection.
The diagnosis name may include unnecessary stop words which do not affect the semantic meaning of the diagnosis name, and the stop words specifically refer to generalized words such as "of", "sick", and the like. Considering that the participle of the diagnosis name cannot change the word composition of the diagnosis name by the participle device, unnecessary stop words also exist in the word set, the stop words can increase the limit of the words in the first word set, and simultaneously considering that the writing habits of each doctor are different, some doctor habits and stop words are added, some doctors cannot add the stop words, the stop words can increase the matching difficulty between the words and the data in the first word set, and the matching accuracy can be reduced.
For example, the stop words for hepatectomy include "of", "surgery", "organ", "lobe", "segment", and so on. Segmenting hepatectomy yields the first word set, in which "surgery" is a stop word; it is deleted from "resection surgery", yielding the diagnosis element word set, which includes "liver" and "resection".
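Step 203 can be illustrated with a short sketch. It assumes English diagnosis names that can be split on whitespace, which stands in for the preset word segmenter; the stop word list and the function names are hypothetical and chosen only for this example.

```python
from typing import List, Set

# Hypothetical stop word list for the example; a real list would be curated.
STOP_WORDS: Set[str] = {"of", "the", "lobe", "segment", "surgery"}


def segment(diagnosis_name: str) -> List[str]:
    """Stand-in for the preset word segmenter: lowercase and split on spaces."""
    return diagnosis_name.lower().split()


def diagnosis_element_words(diagnosis_name: str,
                            stop_words: Set[str] = STOP_WORDS) -> Set[str]:
    """Segment the name into the first word set, then drop stop words."""
    first_word_set = segment(diagnosis_name)
    return {word for word in first_word_set if word not in stop_words}


# Differently written names reduce to the same element word set.
for name in ("Resection of the liver lobe", "liver segment resection surgery"):
    print(name, "->", sorted(diagnosis_element_words(name)))
```

Because the result is a set, word order in the written name no longer matters, which is the effect the segmentation step is meant to achieve.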
Step 204, judging whether the diagnosis data and/or the synonym diagnosis data corresponding to the diagnosis data and the diagnosis element word set meet preset conditions.
Because doctors' writing habits differ, the diagnosis data usually contain synonyms, that is, a word may be replaced by other words with a similar or identical meaning. For example, the synonyms of "resection" in hepatectomy include total resection, eradication, and radical surgery. Here, the synonym diagnosis data refers to the diagnosis data together with the synonym groups of the words in the diagnosis data, where a synonym group is a set of words with similar or identical meanings. The synonym groups are determined from a preset synonym library, which stores most of the synonym groups commonly used for operation names and disease names.
Here, the preset condition is that a sentence in the diagnosis data and/or a sentence in the corresponding synonym diagnosis data completely or partially matches the diagnosis element words in the diagnosis element word set. Complete matching has relatively high precision and relatively low recall, whereas partial matching has relatively low precision and relatively high recall. The matching strategy can be changed flexibly to suit the application scenario.
For example, suppose a sentence in the diagnosis data reads "the patient has a benign tumor of the liver and thus needs to undergo radical hepatectomy", in which "radical surgery" has synonyms. The synonym diagnosis data of the sentence then include the sentence itself and the synonym group "resection, total resection, and eradication", where radical surgery, resection, total resection, and eradication are synonyms of one another.
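The matching condition of step 204 can be sketched as below. The synonym group, the function names, and the use of plain substring search are all assumptions made for illustration; substring search is crude (for instance, "liver" also occurs inside "deliver"), and a real system would match on segmented words.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical synonym library with a single synonym group.
SYNONYM_GROUPS: List[Set[str]] = [
    {"resection", "total resection", "eradication", "radical surgery"},
]


def expand_with_synonyms(sentence: str,
                         groups: Iterable[Set[str]]) -> Tuple[str, Set[str]]:
    """Build synonym diagnosis data: the sentence plus every synonym group
    that has at least one member occurring in the sentence."""
    text = sentence.lower()
    extra: Set[str] = set()
    for group in groups:
        if any(term in text for term in group):
            extra |= group
    return text, extra


def satisfies_condition(sentence: str, element_words: Set[str],
                        groups: Iterable[Set[str]], mode: str = "full") -> bool:
    """Full mode requires every element word; partial mode requires one."""
    if not element_words:
        return False
    text, extra = expand_with_synonyms(sentence, groups)
    hits = {w for w in element_words if w in text or w in extra}
    return hits == set(element_words) if mode == "full" else bool(hits)


sentence = "The patient has a benign tumor of the liver and needs radical surgery"
print(satisfies_condition(sentence, {"liver", "resection"}, SYNONYM_GROUPS, "full"))
```

Switching `mode` between `"full"` and `"partial"` is one simple way to trade precision against recall, as described above.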
Step 205, when the diagnosis data and/or the synonym diagnosis data corresponding to the diagnosis data and the diagnosis element word set satisfy a preset condition, determining the diagnosis data as first diagnosis basis data, where the first diagnosis basis data includes position information of diagnosis element words in the diagnosis element word set in the first diagnosis basis data.
When a sentence in the diagnosis data, or a sentence in the corresponding synonym diagnosis data, completely or partially matches the diagnosis element words in the diagnosis element word set, the diagnosis data is determined as first diagnosis basis data. For example, suppose a medical order in the diagnosis data reads "the patient's liver was injured in a traffic accident, and therefore liver resection surgery is required". The medical order contains the diagnosis element words "liver" and "resection", so it can be determined as first diagnosis basis data. Likewise, for the sentence "the patient has a benign tumor of the liver and thus needs to undergo radical hepatectomy", the synonym diagnosis data comprise the sentence itself and "resection, total resection, and eradication"; these contain the diagnosis element words "liver" and "resection", so the sentence can be determined as first diagnosis basis data. Alternatively, when a sentence in the diagnosis data cannot be completely or partially matched with the diagnosis element words in the diagnosis element word set, but the synonym groups corresponding to the sentence can be, the sentence is determined as first diagnosis basis data.
Note that the first diagnosis basis data comprise a plurality of sentences. Each sentence contains all or some of the diagnosis element words, together with position information indicating where each diagnosis element word is located in the sentence, for example, between the i-th and the n-th characters, counted from left to right. This makes the first diagnosis basis data easy to review and analyze. The positions of the diagnosis element words can also be highlighted, for example by displaying them in a different color, which makes review and analysis even more convenient.
Note that analyzing the information in a sentence of the diagnosis data reveals the patient's condition and/or treatment mode, from which the diagnosis name can be determined. For example, the sentence "the patient's liver was injured in a car accident, and therefore liver resection surgery is required" contains some of the words of the diagnosis name. Sentences in which diagnosis element words appear are therefore taken as the first diagnosis basis data, and the first diagnosis basis data obtained in this way are highly accurate.
In this embodiment, the preset diagnosis data position configuration file is used to determine the diagnosis data from the medical record comprehensively and accurately. The diagnosis name in the diagnosis data is then determined, followed by the diagnosis element word set corresponding to the diagnosis name. Finally, the diagnosis data and/or the corresponding synonym diagnosis data are matched against the diagnosis element word set, so that the first diagnosis basis data are determined from the diagnosis data more accurately.
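The position information can be illustrated as character offsets. The sketch below is an assumption about one possible representation (half-open `(start, end)` offsets found with Python's `re` module); the function names and the bracket-style highlighting are hypothetical and stand in for a colored display.

```python
import re
from typing import Dict, Iterable, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def locate_element_words(sentence: str,
                         element_words: Iterable[str]) -> Dict[str, List[Span]]:
    """Record where each diagnosis element word occurs in the sentence."""
    positions: Dict[str, List[Span]] = {}
    for word in element_words:
        spans = [(m.start(), m.end())
                 for m in re.finditer(re.escape(word), sentence, re.IGNORECASE)]
        if spans:
            positions[word] = spans
    return positions


def highlight(sentence: str, positions: Dict[str, List[Span]],
              left: str = "[", right: str = "]") -> str:
    """Wrap every located word in markers (a stand-in for colored display)."""
    spans = sorted(span for found in positions.values() for span in found)
    parts, cursor = [], 0
    for start, end in spans:
        if start < cursor:  # skip overlapping matches
            continue
        parts.append(sentence[cursor:start])
        parts.append(left + sentence[start:end] + right)
        cursor = end
    parts.append(sentence[cursor:])
    return "".join(parts)


sentence = "Liver resection is required."
found = locate_element_words(sentence, ["liver", "resection"])
print(found)
print(highlight(sentence, found))
```

Storing offsets rather than pre-formatted text keeps the basis data independent of how a reviewer's interface chooses to display it.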
In addition, preferably, the method in this embodiment may further include:
step 206, according to the priority information in the preset diagnosis data position configuration file, sorting the first diagnosis basis data, and determining the sorted first diagnosis basis data as second diagnosis basis data.
Taking Table 1 as an example, the preset diagnosis data position configuration file includes data items in a category name field and data items in a priority field. The priority information indicates the importance of each data item in the category name field. The first diagnosis basis data correspond to the data items in the category name field; for example, data in the first diagnosis basis data that come from the surgical record carry a "surgical record" label. On this basis, the sentences in the first diagnosis basis data are sorted according to their correspondence with the data items in the category name field, which determines the importance of each item of first diagnosis basis data. The first diagnosis basis data can be sorted in descending order of priority, so that the more important diagnosis basis data come first, and the sorted first diagnosis basis data are determined as the second diagnosis basis data. When quality control is performed on a medical record, the top-ranked data in the second diagnosis basis data can be checked first, which improves the accuracy and efficiency of quality control and saves labor and time.
According to the above technical solution, in addition to the beneficial effects of the embodiment shown in Fig. 1, this embodiment has the following beneficial effect: when quality control is performed on the medical record according to the second diagnosis basis data, the higher-priority data in the second diagnosis basis data can be checked first, which further improves the efficiency and quality of quality control.
Fig. 3 shows another embodiment of the method for determining diagnosis basis data according to the present invention. This embodiment determines the diagnosis element word set in a different way. For ease of explanation, it is described with reference to the specific scenario above, to which a first user-defined word bank and a diagnosis element configuration table corresponding to the international disease classification table are added.
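The sorting in step 206 can be sketched as follows. The category names, the priority values, and the convention that a smaller number means a higher priority are assumptions made for the example; they are not taken from Table 1.

```python
from typing import Dict, List

# Hypothetical priority field of the position configuration file.
# Assumption: a smaller number means a higher priority.
PRIORITY: Dict[str, int] = {
    "surgical record": 1,
    "discharge summary": 2,
    "progress note": 3,
}


def rank_basis_data(first_basis_data: List[Dict[str, str]],
                    priority: Dict[str, int]) -> List[Dict[str, str]]:
    """Sort sentences by the priority of their category label.

    Categories missing from the configuration sort last. sorted() is
    stable, so sentences of equal priority keep their original order.
    """
    return sorted(first_basis_data,
                  key=lambda item: priority.get(item["category"], float("inf")))


first_basis_data = [
    {"category": "progress note", "sentence": "Liver lesion noted."},
    {"category": "surgical record", "sentence": "Liver resection performed."},
]
second_basis_data = rank_basis_data(first_basis_data, PRIORITY)
print([item["category"] for item in second_basis_data])
```

The sorted list plays the role of the second diagnosis basis data, with the entries a reviewer should check first at the front.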
In this embodiment, the method includes the steps of:
step 301, according to a preset diagnostic data position configuration file, determining a diagnostic text in the medical record, performing clause segmentation on the diagnostic text, and determining the diagnostic text after the clause segmentation as diagnostic data.
Step 302, when a diagnosis code exists in the diagnosis data, determining the name of the diagnosis code corresponding to the international disease classification table as a diagnosis name; determining a diagnostic name in the diagnostic data when no diagnostic code is present in the diagnostic data.
Step 303, obtaining a diagnosis element configuration table corresponding to the international disease classification table, and adding the diagnosis element configuration table into a preset first user-defined word bank to determine a second user-defined word bank.
As noted above, the operation names and disease names in the international disease classification table are standardized, but doctors, following professional habit, add or delete some unnecessary words. To reduce the influence of these unnecessary words, a diagnosis element configuration table of the international disease classification table is determined; it consists of the words that are essential in the operation names and disease names in the international disease classification table.
A word-bank-based segmentation method is adopted because a word bank is easy to inspect and the final segmentation result can be adjusted simply by adding or deleting entries. The preset first user-defined word bank contains words that can form the diagnosis names in the diagnosis data. These diagnosis names are not necessarily operation names or disease names from the international disease classification table, so segmentation based on the first user-defined word bank usually produces unnecessary words, which lowers segmentation quality. Adding the diagnosis element configuration table to the first user-defined word bank forms the second user-defined word bank. The second user-defined word bank is general-purpose, that is, it can be used to segment multiple medical records, and it contains relatively few unnecessary words, which improves segmentation quality.
Step 304, performing word segmentation on the diagnosis name according to a word segmentation device corresponding to the second custom thesaurus to determine a second word set, and filtering words in the second word set which are not in the diagnosis element configuration table to determine a diagnosis element word set.
A diagnosis name has no natural word delimiters or punctuation, so automatic word segmentation usually works by inserting spaces or other boundary marks into the diagnosis name. Because a diagnosis name usually adds or deletes unnecessary words relative to the operation name or disease name in the international disease classification table, and may also change the word order, the diagnosis name is segmented by the word segmenter corresponding to the second user-defined word bank. This reduces the correlation among the constituent words of the diagnosis name and the influence of word order, and yields the second word set.
Because diagnosis names are so diverse, the correspondence between the diagnosis element configuration table and a diagnosis name is hard to determine, and the diagnosis element word set cannot be read directly from the configuration table. Moreover, the word segmenter only segments; it does not change the words that make up the diagnosis name. The resulting second word set therefore usually contains words that are not in the diagnosis element configuration table. These words are unnecessary: they make matching against the data harder and reduce recall. Filtering out the words in the second word set that do not belong to the diagnosis element configuration table determines the diagnosis element word set more accurately, makes matching easier, and increases recall.
For example, segmenting liver resection surgery yields a second word set containing the three words "liver", "resection", and "surgery". Because "surgery" is not in the diagnosis element configuration table, it is deleted, leaving a diagnosis element word set with the two words "liver" and "resection".
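Steps 303 and 304 can be sketched with a word-bank-driven segmenter. The patent does not name a segmentation algorithm; forward maximum matching is used here purely as an illustrative choice. The unspaced string `liverresectionsurgery` stands in for a diagnosis name without natural word boundaries, and the word bank contents are hypothetical.

```python
from typing import List, Set

# Hypothetical contents for the example.
ELEMENT_CONFIG_TABLE: Set[str] = {"liver", "resection"}
FIRST_WORD_BANK: Set[str] = {"surgery", "partial"}
# Step 303: add the configuration table to the first word bank.
SECOND_WORD_BANK: Set[str] = FIRST_WORD_BANK | ELEMENT_CONFIG_TABLE


def forward_max_match(text: str, word_bank: Set[str]) -> List[str]:
    """Greedy segmentation: at each position take the longest word bank
    entry that matches, falling back to a single character."""
    max_len = max((len(word) for word in word_bank), default=1)
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in word_bank:
                tokens.append(piece)
                i += size
                break
    return tokens


def element_words(name: str) -> List[str]:
    """Step 304: segment, then keep only words in the configuration table."""
    second_word_set = forward_max_match(name, SECOND_WORD_BANK)
    return [w for w in second_word_set if w in ELEMENT_CONFIG_TABLE]


print(forward_max_match("liverresectionsurgery", SECOND_WORD_BANK))
print(element_words("liverresectionsurgery"))
```

Unlike stop word filtering, this variant keeps a word only if it appears in the configuration table, so unexpected words are dropped without having to be listed in advance.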
Step 305, judging whether the diagnosis data and/or the synonym diagnosis data corresponding to the diagnosis data and the diagnosis element word set meet preset conditions.
Step 306, when the diagnosis data and/or the synonym diagnosis data corresponding to the diagnosis data and the diagnosis element word set satisfy a preset condition, determining the diagnosis data as first diagnosis basis data, where the first diagnosis basis data includes position information of diagnosis element words in the diagnosis element word set in the first diagnosis basis data.
Step 307, sorting the first diagnosis basis data according to the priority information in the preset diagnosis data position configuration file, and determining the sorted first diagnosis basis data as second diagnosis basis data.
According to the above technical solution, a highly accurate diagnosis element word set is determined using the first user-defined word bank together with the diagnosis element configuration table corresponding to the international disease classification table, which further improves the accuracy and medical value of the diagnosis basis data.
Referring to Fig. 4, based on the same concept as the method embodiments of the present invention, an embodiment of the present invention further provides a device for determining diagnosis basis data, including:
a first data determination module 401 for determining diagnostic data in a medical record;
a set determining module 402, configured to determine a diagnosis element word set according to the diagnosis data;
a second data determining module 403, configured to determine first diagnosis basis data from the diagnosis data according to the diagnosis element word set.
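The way the three modules chain together can be sketched as follows. This is a simplified illustration, not the device itself: the record layout, the stop word list, sentence splitting on periods, and complete substring matching are all assumptions made for the example.

```python
from typing import Dict, List, Set


def determine_diagnosis_data(record: Dict[str, str]) -> Dict[str, object]:
    """Role of module 401: extract diagnosis data and split it into sentences."""
    sentences = [s.strip() for s in record["text"].split(".") if s.strip()]
    return {"diagnosis_name": record["diagnosis_name"], "sentences": sentences}


def determine_element_words(data: Dict[str, object]) -> Set[str]:
    """Role of module 402: derive the diagnosis element word set."""
    stop_words = {"of", "the", "surgery"}  # hypothetical stop words
    name = str(data["diagnosis_name"]).lower()
    return {w for w in name.split() if w not in stop_words}


def determine_basis_data(data: Dict[str, object], words: Set[str]) -> List[str]:
    """Role of module 403: keep sentences containing every element word."""
    return [s for s in data["sentences"]  # type: ignore[union-attr]
            if words and all(w in s.lower() for w in words)]


def run_device(record: Dict[str, str]) -> List[str]:
    data = determine_diagnosis_data(record)
    words = determine_element_words(data)
    return determine_basis_data(data, words)


record = {
    "diagnosis_name": "liver resection surgery",
    "text": "The patient was injured in a car accident. "
            "Liver resection is required. Follow-up in two weeks.",
}
print(run_device(record))
```

Keeping each module as a separate function mirrors the module boundaries in Fig. 4 and lets any one stage be replaced, for example by the word-bank-based variant of the set determining step.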
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. At the hardware level, the electronic device includes a processor 501 and a memory 502 storing execution instructions, and optionally further includes an internal bus 503 and a network interface 504. The memory 502 may include an internal memory 5021, such as a random-access memory (RAM), and may further include a non-volatile memory 5022, such as at least one disk memory. The processor 501, the network interface 504, and the memory 502 may be connected to each other by the internal bus 503, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The internal bus 503 may be divided into an address bus, a data bus, a control bus, and so on; for convenience of illustration it is shown as a single double-headed arrow in Fig. 5, but this does not mean that there is only one bus or one type of bus. The electronic device may of course also include hardware required for other services. When the processor 501 executes the execution instructions stored in the memory 502, the processor 501 performs the method of any embodiment of the present invention, and at least the method shown in Fig. 1, Fig. 2, or Fig. 3.
In a possible implementation, the processor reads the corresponding execution instructions from the non-volatile memory into the internal memory and then runs them; the execution instructions may also be obtained from other devices. This forms, at the logical level, the device for determining diagnosis basis data. By executing the execution instructions stored in the memory, the processor implements the method for determining diagnosis basis data provided in any embodiment of the present invention.
The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by instructions in the form of hardware integrated logic circuits or software in a processor. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components. The various methods, steps and logic blocks disclosed in the embodiments of the present invention may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
Embodiments of the present invention further provide a computer-readable storage medium, which includes an execution instruction, and when a processor of an electronic device executes the execution instruction, the processor executes a method provided in any one of the embodiments of the present invention. The electronic device may specifically be the electronic device shown in fig. 5; the execution instruction is a computer program corresponding to the determination device of the diagnosis basis data.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
The embodiments of the present invention are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or device that comprises the element.
The above description is only an example of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims (12)

1. A method for determining diagnostic dependency data, comprising:
determining diagnostic data in the medical record;
determining a diagnosis element word set according to the diagnosis data;
and determining first diagnosis basis data from the diagnosis data according to the diagnosis element word set.
2. The method of claim 1, wherein determining diagnostic data in the medical record comprises:
determining a diagnosis text in the medical record according to a preset diagnosis data position configuration file;
and segmenting the diagnostic text, and determining the segmented diagnostic text as diagnostic data.
3. The method of claim 2, further comprising:
sorting the first diagnosis basis data according to the priority information in the preset diagnosis data position configuration file;
and determining the sorted first diagnosis basis data as second diagnosis basis data.
4. The method of claim 1, wherein determining a set of diagnostic factor words from the diagnostic data comprises:
determining a diagnostic name in the diagnostic data;
performing word segmentation on the diagnosis name to determine a diagnosis element word set.
5. The method of claim 4, wherein said determining a diagnostic name in said diagnostic data comprises:
when a diagnosis code exists in the diagnosis data, determining the name of the diagnosis code corresponding to the international disease classification table as a diagnosis name;
determining a diagnostic name in the diagnostic data when no diagnostic code is present in the diagnostic data.
6. The method of claim 4, wherein the tokenizing the diagnostic name to determine a set of diagnostic factor words comprises:
segmenting words of the diagnosis name according to a preset word segmentation device to determine a first word set;
filtering stop words in the first set of words to determine a set of diagnostic factor words.
7. The method of claim 4, wherein the tokenizing the diagnostic name to determine a set of diagnostic factor words comprises:
acquiring a diagnosis element configuration table corresponding to the international disease classification table;
adding the diagnosis element configuration table into a preset first user-defined word bank to determine a second user-defined word bank;
segmenting words of the diagnosis names according to word segmenters corresponding to the second custom word bank to determine a second word set;
filtering words in the second set of words that are not in the diagnostic element configuration table to determine a set of diagnostic element words.
8. The method of claim 1, wherein determining first diagnosis basis data from the diagnostic data based on the set of diagnostic factor words comprises:
judging whether the diagnostic data and/or the synonym diagnostic data corresponding to the diagnostic data and the diagnostic element word set meet preset conditions or not;
and when the diagnosis data and/or the synonym diagnosis data corresponding to the diagnosis data and the diagnosis element word set meet a preset condition, determining the diagnosis data as first diagnosis basis data.
9. The method of claim 1, wherein the first diagnostic basis data includes location information that a diagnostic factor word of the set of diagnostic factor words is located in the first diagnostic basis data.
10. A diagnostic dependency data determination apparatus, comprising:
a first data determination module for determining diagnostic data in a medical record;
the set determining module is used for determining a diagnosis element word set according to the diagnosis data;
and the second data determination module is used for determining first diagnosis basis data from the diagnosis data according to the diagnosis element word set.
11. A computer-readable storage medium comprising executable instructions that, when executed by a processor of an electronic device, cause the processor to perform the method of any of claims 1-9.
12. An electronic device comprising a processor and a memory storing execution instructions, the processor performing the method of any of claims 1-9 when the processor executes the execution instructions stored by the memory.
CN201911360739.2A 2019-12-25 2019-12-25 Method and device for determining diagnosis basis data, readable medium and electronic equipment Pending CN111209742A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911360739.2A CN111209742A (en) 2019-12-25 2019-12-25 Method and device for determining diagnosis basis data, readable medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN111209742A true CN111209742A (en) 2020-05-29

Family

ID=70784249

Country Status (1)

Country Link
CN (1) CN111209742A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182972A (en) * 2017-12-15 2018-06-19 上海长江科技发展有限公司 The intelligent coding method and system of Chinese medical diagnosis on disease based on participle network
CN109524072A (en) * 2018-05-28 2019-03-26 平安医疗健康管理股份有限公司 Electronic health record generation method, device, computer equipment and storage medium
CN110471941A (en) * 2019-08-12 2019-11-19 贵州医渡云技术有限公司 It is automatically positioned the method, apparatus and electronic equipment of judgment basis

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112184084A (en) * 2020-11-05 2021-01-05 北京嘉和海森健康科技有限公司 Medical record learning quality assessment method and device
CN112184084B (en) * 2020-11-05 2023-08-08 北京嘉和海森健康科技有限公司 Medical record learning quality assessment method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230328

Address after: 100089 801, 8th floor, building 9, No.35 Huayuan North Road, Haidian District, Beijing

Applicant after: YIDU CLOUD Ltd.

Address before: Room 1502, 15th floor, No.211, pubin Road, Jiangbei new district, Nanjing, Jiangsu 210000

Applicant before: Nanjing Yirui Technology Co.,Ltd.

Applicant before: Nanjing Yiyi Yunda Data Technology Co.,Ltd.