WO2021114620A1 - Medical-record quality control method, apparatus, computer device, and storage medium

Info

Publication number: WO2021114620A1
Application number: PCT/CN2020/099180
Authority: WO
Inventors: 朱昭苇; 孙行智; 胡岗
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-06-16
Filing date: 2020-06-30
Publication date: 2021-06-17
Also published as: CN111710383A

Abstract

Provided are a medical-record quality control method, apparatus, computer device, and storage medium, relating to artificial intelligence. The method comprises: extracting chief complaint information and a corresponding symptom relationship attribute pair in a medical record to be examined (S202); inputting the chief complaint information and symptom relationship attribute pair to a trained first natural language processing model to obtain a set of diseases matching the chief complaint information (S204); matching the set of diseases with diagnostic information in the medical record to be examined, and according to the result of matching, determining whether the diagnosis information of the medical record to be examined is a misdiagnosis (S206). In addition, the invention also relates to blockchain technology, and the medical record to be examined can be stored on the blockchain. Using this method, it can be determined whether the chief complaint information and the diagnostic information are consistent, thus achieving diagnosis quality control.

Description

Medical record quality control method, device, computer equipment and storage medium

Cross-references to related applications

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on June 16, 2020, with application number 2020105485409 and entitled "Medical Record Quality Control Methods, Devices, Computer Equipment and Storage Media", the entire content of which is incorporated in this application by reference.

Technical field

This application relates to the field of artificial intelligence, and in particular to a method, device, computer equipment and storage medium for quality control of medical records based on natural language processing.

Background art

Medical records document patient visits and are the basic data source for subsequent medical research. To strengthen the quality management of hospital medical records, improve the hospital's internal quality management system, and subsequently assess doctors' professional level so as to improve their abilities, quality control of medical records is one of the key concerns of the quality control system.

However, the inventor realizes that current quality control of medical records mostly focuses on basic aspects of medical record writing, such as whether the medical record is written correctly and whether the case entries are internally consistent, and lacks any judgment of whether the main complaint and the diagnosis are consistent.

Summary of the invention

According to various embodiments disclosed in the present application, a medical record quality control method, device, computer equipment, and storage medium are provided.

A method for quality control of medical records, the method comprising:

Extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined;

Input the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain a set of diseases matching the main complaint information; and

The disease set is matched with the diagnosis information in the medical record to be examined, and it is determined whether the diagnosis information of the medical record to be examined is misdiagnosed according to the matching result.

A medical record quality control device, the device comprising:

The extraction module is used to extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined;

A processing module, configured to input the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain a set of diseases matching the main complaint information; and

The determining module is configured to match the disease set with the diagnostic information in the medical record to be checked, and determine whether the diagnostic information in the medical record to be checked is misdiagnosed according to the matching result.

A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the steps of the medical record quality control method described above.

One or more computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the medical record quality control method described above.

The above medical record quality control method, device, computer equipment and storage medium use the trained natural language processing model to perform natural language processing on the main complaint information extracted from the medical record to be examined and on the corresponding symptom relationship attribute pair, so as to obtain a disease set matching the main complaint information. Then, the disease set matching the main complaint information is matched with the diagnosis information in the medical record to be examined to determine whether that diagnosis information is a misdiagnosis. This method uses the extracted main complaint information and symptom relationship attribute pairs to determine the disease set corresponding to the main complaint information, and then matches the diseases in the disease set with the diagnosis information, thereby judging whether the main complaint information and the diagnosis information are consistent.

The details of one or more embodiments of the present application are set forth in the following drawings and description. Other features and advantages of this application will become apparent from the description, drawings and claims.

Description of the drawings

In order to more clearly describe the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings needed in the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. A person of ordinary skill in the art can obtain other drawings based on these drawings without creative work.

Fig. 1 is an application scenario diagram of a medical record quality control method according to one or more embodiments;

Fig. 2 is a schematic flowchart of a method for quality control of medical records according to one or more embodiments;

Fig. 3 is a schematic flowchart of the step of extracting the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined according to one or more embodiments;

Fig. 4 is a schematic flowchart of the step of inputting the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain the disease set matching the main complaint information according to one or more embodiments;

Fig. 5 is a schematic diagram of a work flow of a medical record quality control method according to one or more embodiments;

Fig. 6 is a structural block diagram of a medical record quality control device according to one or more embodiments;

Fig. 7 is a block diagram of a computer device according to one or more embodiments.

Detailed description

In order to make the technical solutions and advantages of the present application clearer, the following further describes the present application in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the present application, and are not used to limit the present application.

The medical record quality control method provided in this application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 through a network. After the terminal 102 sends the medical record to be checked to the server 104, the server 104 extracts the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be checked; the server 104 inputs the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain the disease set matching the main complaint information; the server 104 matches the disease set with the diagnosis information in the medical record to be checked, and determines whether the diagnosis information of the medical record to be checked is misdiagnosed according to the matching result. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server 104 may be implemented by an independent server or a server cluster composed of multiple servers.

In one of the embodiments, as shown in FIG. 2, a method for quality control of medical records is provided. Taking the method applied to the server in FIG. 1 as an example for description, the method includes the following steps:

Step S202: Extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined.

Here, the medical record to be checked is an electronic medical record that has been entered into the terminal and requires quality control. The main complaint information is the patient's own description of symptoms as recorded in the medical record. A symptom relationship attribute pair is a pair that relates a symptom entity to an attribute such as symptom location or symptom duration, taking the form {symptom entity: symptom location} or {symptom entity: symptom duration}. For example, if the symptom entities are cough and convulsion, the symptom relationship attribute pairs can be {convulsion: right lower limb}, {cough: two days}, etc.

Specifically, the server obtains the medical record to be checked, which may be obtained by the user entering the main complaint information and diagnosis information in real time through the terminal, or may be pre-configured and stored in the server. After the server obtains the medical record to be examined, the natural language processing model and regular expression are used to extract the symptom relationship attribute pair from the main complaint information of the medical record to be examined. It should be emphasized that, in order to further ensure the privacy and security of the medical record information to be examined, the medical record to be examined may also be stored in a node of a blockchain.

Step S204: Input the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain a disease set matching the main complaint information.

Among them, natural language processing is an important direction in the field of computer science and artificial intelligence. It studies various theories and methods that can realize effective communication between humans and computers in natural language. Natural language processing is a science that integrates linguistics, computer science, and mathematics. Therefore, research in this field will involve natural language, that is, the language people use daily, so it is closely related to linguistic research, but there are important differences. The natural language processing model is a neural network model used for natural language processing. The disease collection refers to a collection that includes multiple diseases.

Specifically, after the server extracts the main complaint information and the symptom relationship attribute pairs, it inputs them into the pre-trained first natural language processing model. The first natural language processing model performs natural language processing on the main complaint information and the symptom relationship attribute pairs, and determines the diseases that match the main complaint information to obtain the disease set.

Step S206: Match the disease set with the diagnostic information in the medical record to be checked, and determine whether the diagnostic information in the medical record to be checked is misdiagnosed according to the matching result.

Among them, the diagnosis information is the information entered into the medical record to be examined after the medical staff diagnoses the patient.

In one of the embodiments, step S206 includes: when the diagnosis information does not match any disease in the disease set, determining that the diagnosis information of the medical record to be checked is a misdiagnosis; when the diagnosis information matches any disease in the disease set, determining that the diagnosis information of the medical record to be checked is not a misdiagnosis.

Specifically, the server obtains the diagnosis information from the medical record to be examined, and matches the diagnosis information with each disease in the disease set one by one. When the diagnosis information matches any one of the diseases in the disease set, the diagnosis of the medical staff is consistent with the main complaint information, and it is determined that there is no misdiagnosis. When the diagnosis information matches none of the diseases in the disease set, the diagnosis of the medical staff is inconsistent with the main complaint information, and a misdiagnosis is determined.
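As a non-limiting illustration, the one-by-one matching of step S206 may be sketched in Python as follows. The function name `is_misdiagnosis` and the matching rule (exact string equality after trimming and lower-casing) are assumptions made for this sketch; the application does not specify how a match between the diagnosis information and a disease is decided.

```python
from typing import Iterable


def is_misdiagnosis(diagnosis: str, disease_set: Iterable[str]) -> bool:
    """Return True when the diagnosis matches none of the candidate diseases.

    Assumption: a "match" is exact equality after trimming whitespace and
    lower-casing. A real system might use synonym tables or disease codes.
    """
    normalized = diagnosis.strip().lower()
    for disease in disease_set:
        if disease.strip().lower() == normalized:
            # The diagnosis is consistent with the main complaint.
            return False
    # No disease in the set matched the recorded diagnosis.
    return True


candidates = ["bronchitis", "pneumonia", "common cold"]
print(is_misdiagnosis("Pneumonia", candidates))
print(is_misdiagnosis("gastritis", candidates))
```

Under this rule, an empty disease set always yields a misdiagnosis result, so a caller may wish to treat that case separately.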

The above medical record quality control method uses a trained natural language processing model to perform natural language processing on the main complaint information extracted from the medical record to be examined and the corresponding symptom relationship attribute pair to obtain a set of diseases matching the main complaint information. Then, the disease set matched with the main complaint information is matched with the diagnosis information in the medical record to be checked to determine whether the diagnosis information in the medical record to be checked is misdiagnosed. This method uses the extracted chief complaint information and symptom relationship attributes to determine the disease set corresponding to the chief complaint information, and then matches the diseases in the disease set with the diagnosis information, thereby realizing the judgment of whether the chief complaint information and the diagnosis information are consistent.

In one of the embodiments, as shown in FIG. 3, step S202 includes:

In step S302, the main complaint information of the medical record to be examined is extracted.

Specifically, after the server obtains the medical record to be examined, it first extracts the main complaint information from the medical record to be examined. Since the content of the medical record generally has a fixed format, the server can directly extract the main complaint information from the medical record according to the format of the medical record.
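As a non-limiting illustration, extraction from a fixed-format record may be sketched in Python as follows. The record layout (one "Label: value" field per line) and the labels "Chief complaint" and "Diagnosis" are hypothetical; the application only states that the medical record generally has a fixed format.

```python
import re
from typing import Dict

# Assumption: each field occupies one line of the form "Label: value".
FIELD = re.compile(r"^(?P<name>[A-Za-z ]+):\s*(?P<value>.*)$", re.MULTILINE)


def parse_record(text: str) -> Dict[str, str]:
    """Split a fixed-format record into a {label: value} mapping."""
    fields = {}
    for match in FIELD.finditer(text):
        fields[match.group("name").strip().lower()] = match.group("value").strip()
    return fields


record = (
    "Chief complaint: cough for 2 days\n"
    "Diagnosis: bronchitis\n"
)
fields = parse_record(record)
print(fields["chief complaint"])
print(fields["diagnosis"])
```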

Step S304: Input the main complaint information into the trained second natural language processing model, and use the second natural language processing model to extract symptom entities from the main complaint information.

Among them, the second natural language processing model is a natural language processing model for extracting symptom entities from the main complaint information, and the second natural language processing model in this embodiment is preferably a named entity recognition model NER. The named entity recognition model is a model used for information extraction, which aims to locate and classify named entities in the text into predefined categories.

Specifically, after the server extracts the main complaint information, it inputs the main complaint information into the named entity recognition model NER. The named entity recognition model NER is used to locate and classify the main complaint information to obtain the symptom entities in the main complaint information.

Step S306: Query the symptom duration and symptom location of the symptom entity from the main complaint information to obtain a symptom relationship attribute pair.

Specifically, after the symptom entity is extracted from the main complaint information, the regular expression is used to query the symptom duration and symptom location corresponding to the symptom entity from the main complaint information. Combine the obtained symptom entity with symptom duration and symptom location to obtain a symptom relationship attribute pair.

In one of the embodiments, step S306 includes: matching the nearest punctuation marks on the left and right sides of the symptom entity in the main complaint information to determine the sentence segment where the symptom entity is located; matching the characters in the sentence segment against the symptom location characters and symptom time characters in a preset dictionary; when the sentence segment contains characters that successfully match the symptom location characters or symptom time characters in the preset dictionary, extracting the successfully matched characters from the sentence segment; and combining the symptom entity with the extracted characters to obtain the symptom relationship attribute pair.

Here, the regular expression in this embodiment includes a regular expression punctuation template and a regular expression location and time template. The regular expression punctuation template is a logic program that matches punctuation marks, and the regular expression location and time template is a logic program for detecting the symptom location and the symptom duration.

Specifically, when the server queries the symptom duration and symptom location of the symptom entity from the main complaint information, it first calls the regular expression punctuation template. The logic program recorded in the regular expression punctuation template matches the punctuation marks closest to the left and right sides of the symptom entity to determine the sentence segment where the symptom entity is located. For example, the source string is "Patient complained of twitching sensation in the right lower extremity, and started coughing 2 days ago". When the named entity recognition model NER detects the symptom entity "twitch", the punctuation marks closest to the left and right sides of "twitch" are queried through the regular expression punctuation template. Here, the punctuation mark on the right side of the symptom entity "twitch" is ",", and no punctuation mark is detected on the left side. The start of the string is therefore taken as the beginning of the sentence segment where the symptom entity "twitch" is located, and the punctuation mark "," marks the end of that sentence segment. The sentence segment where the symptom entity "twitch" is located is thus "Patient complained of twitching sensation in the right lower extremity".

After the server determines the sentence segment where the symptom entity is located, it calls the regular expression location and time template, and determines the symptom duration or symptom location corresponding to the symptom entity through that template. That is, the server obtains a preset dictionary that was constructed offline in advance. For example, the preset dictionary can take the form {upper right limb, lower right limb, /d day, /d month}, where d represents any number. Then, the server matches the entries of the preset dictionary, which represent symptom locations and symptom durations, against the characters in the sentence segment one by one, and judges whether each dictionary entry appears in the sentence segment. If so, the matched characters are taken from the sentence segment as an attribute of the symptom entity, and a symptom relationship attribute pair is established with the symptom entity. For example, when the symptom location "right lower limb" is detected in the main complaint information of the medical record to be examined by matching against the preset dictionary, it is used as an attribute of the symptom entity "twitch" and combined with it to form the symptom relationship attribute pair {twitch: right lower limb}.
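As a non-limiting illustration, the two templates of step S306 may be sketched in Python as follows. The punctuation set, the two attribute patterns and the function names are assumptions made for this sketch; the preset dictionary of the application is only described by example, and the symptom entity is passed in directly here rather than produced by a named entity recognition model.

```python
import re
from typing import List, Optional, Tuple

# Assumed punctuation template: common Western and Chinese sentence marks.
PUNCTUATION = re.compile(r"[,.;!?，。；！？]")

# Assumed stand-in for the preset dictionary of locations and durations.
ATTRIBUTE_PATTERNS = [
    re.compile(r"(?:right|left) (?:upper|lower) (?:limb|extremity)"),
    re.compile(r"\d+ (?:day|month)s?"),
]


def sentence_segment(text: str, entity: str) -> Optional[str]:
    """Return the span between the punctuation marks nearest to the entity."""
    start = text.find(entity)
    if start < 0:
        return None
    end = start + len(entity)
    left = 0  # no punctuation on the left: the segment starts at the string start
    for mark in PUNCTUATION.finditer(text, 0, start):
        left = mark.end()  # keep the last mark found before the entity
    right_mark = PUNCTUATION.search(text, end)
    right = right_mark.start() if right_mark else len(text)
    return text[left:right].strip()


def attribute_pairs(text: str, entity: str) -> List[Tuple[str, str]]:
    """Pair the entity with every dictionary match found in its segment."""
    segment = sentence_segment(text, entity)
    if segment is None:
        return []
    pairs = []
    for pattern in ATTRIBUTE_PATTERNS:
        for match in pattern.finditer(segment):
            pairs.append((entity, match.group(0)))
    return pairs


complaint = ("Patient complained of twitching sensation in the right lower "
             "extremity, and started coughing 2 days ago")
print(attribute_pairs(complaint, "twitching"))
print(attribute_pairs(complaint, "coughing"))
```

Restricting the dictionary search to the entity's own segment is what keeps "2 days" from being attached to the twitching symptom in this example.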

Step S308: Perform text conversion on the symptom relationship attribute pair to obtain a symptom relationship attribute pair in text form.

Here, the text form is plain text that does not include any structured form. For example, a total of two symptom relationship attribute pairs are extracted from the main complaint above: {cough: 2 days} and {twitch: right lower limb}. The converted text form is "cough two days, right lower limb twitching". In addition, Arabic numerals need to be converted into Chinese character descriptions during this text conversion process.

Specifically, after the server extracts the symptom relationship attribute pair, in order to facilitate subsequent processing, the symptom relationship attribute pair that originally has a structure is converted into a symptom relationship attribute pair in text form.
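As a non-limiting illustration, the text conversion of step S308 may be sketched in Python as follows. The application converts Arabic numerals into Chinese characters; this sketch substitutes English number words for readability, handles only the numbers zero to nine, and uses a hypothetical `kind` tag ("duration" or "location") to decide the word order. All names are assumptions made for the sketch.

```python
import re
from typing import Iterable, Tuple

NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                "five", "six", "seven", "eight", "nine"]


def spell_small_numbers(text: str) -> str:
    """Replace the numbers 0-9 with words; larger numbers are left unchanged."""
    def replace(match):
        value = int(match.group(0))
        return NUMBER_WORDS[value] if value < len(NUMBER_WORDS) else match.group(0)
    return re.sub(r"\d+", replace, text)


def pairs_to_text(pairs: Iterable[Tuple[str, str, str]]) -> str:
    """Flatten (entity, attribute, kind) triples into unstructured text."""
    phrases = []
    for entity, attribute, kind in pairs:
        attribute = spell_small_numbers(attribute)
        if kind == "duration":
            phrases.append(f"{entity} {attribute}")   # e.g. "cough two days"
        else:
            phrases.append(f"{attribute} {entity}")   # e.g. "right lower limb twitching"
    return ", ".join(phrases)


pairs = [("cough", "2 days", "duration"),
         ("twitching", "right lower limb", "location")]
print(pairs_to_text(pairs))
```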

In this embodiment, the natural language processing model and regular expression technology are used together to extract all symptoms and related attributes from the main complaint information in the medical record to be examined. Compared with extraction using the natural language processing model alone, this achieves higher accuracy and helps ensure that the most comprehensive symptom information is extracted from the main complaint information.

In one of the embodiments, the first natural language processing model includes a first natural language text classification model and a second natural language text classification model. As shown in Fig. 4, step S204 includes:

Step S402: Input the main complaint information into the embedding layer of the first natural language text classification model to perform vector conversion to obtain the word vector of the main complaint information.

Specifically, the server inputs the main complaint information into the embedding layer (Embedding) of the first natural language text classification model, which performs vector conversion on the main complaint information and outputs the word vector of the main complaint information. The first natural language text classification model in this embodiment is preferably the TextCNN model. The TextCNN model applies a convolutional neural network (CNN) to text classification and extracts key information in sentences by using multiple convolution kernels of different sizes. The TextCNN model includes an embedding layer (Embedding), a convolution layer (Convolution), a pooling layer (MaxPooling) and a fully connected layer (FullConnection and Softmax). That is, the server first inputs the main complaint information into the embedding layer of the TextCNN model to obtain the word vector of the main complaint information.

In step S404, the symptom relationship attribute pair is input into the embedding layer of the second natural language text classification model for word vector conversion to obtain the word vector of the symptom relationship attribute pair.

Specifically, the server inputs the symptom relationship attribute pair in text form into the embedding layer of the second natural language text classification model. The embedding layer of the second natural language text classification model performs word vector conversion on the symptom relationship attribute pair to obtain the word vector of the symptom relationship attribute pair. In this embodiment, the second natural language text classification model is preferably the FastText model. The FastText model is an engineering model based on the word2vec theoretical framework, which can quickly complete text word vector conversion while also incorporating text n-gram information.

It should be understood that only word vector conversion is needed here. Therefore, when the main complaint information and the symptom relationship attribute pair in text form are respectively input into the TextCNN model and the FastText model, what is obtained is not the final output of the two models but the output of their embedding layers. That is, the output of the embedding layer of the TextCNN model is obtained as the word vector of the main complaint information, and the output of the embedding layer of the FastText model is obtained as the word vector of the symptom relationship attribute pair.

In step S406, the word vector of the main complaint information and the word vector of the symptom relationship attribute pair are spliced according to the vertical axis direction to obtain a spliced vector.

Specifically, the word vector of the main complaint information and the word vector of the symptom relationship attribute pair are spliced in the direction of the vertical axis to obtain the spliced vector. If there are multiple symptom relationship attribute pairs at the same time, the word vectors of those symptom relationship attribute pairs are first spliced on the vertical axis to obtain the spliced word vector of the symptom relationship attribute pairs. Then, the word vector corresponding to the main complaint information and the spliced word vector of the symptom relationship attribute pairs are spliced on the vertical axis, and the size of the spliced vector finally obtained is 1*N. For example, suppose two symptom relationship attribute pairs are extracted from one piece of main complaint information. The spliced vector is then: the word vector of the main complaint information, followed by the word vector of one symptom relationship attribute pair, followed by the word vector of the other symptom relationship attribute pair. The order of the word vectors of the symptom relationship attribute pairs is determined by the order of the model output. Because model training adopts the mini-batch method and the batches are randomly selected, the order of these word vectors is random.
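As a non-limiting illustration, the splicing of step S406 may be sketched in Python as follows, using plain lists so that the example has no dependencies. The function name `splice_vectors` and the toy vector values are assumptions; in practice the inputs would be the embedding-layer outputs described above.

```python
from typing import List, Sequence


def splice_vectors(complaint_vector: Sequence[float],
                   pair_vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate the complaint vector and every pair vector into one 1 x N row."""
    spliced = list(complaint_vector)
    for vector in pair_vectors:
        spliced.extend(vector)  # the pair vectors follow the complaint vector
    return [spliced]            # a single row, i.e. size 1 x N


complaint = [0.1, 0.2, 0.3]
pairs = [[1.0, 2.0], [3.0, 4.0]]
row = splice_vectors(complaint, pairs)
print(len(row), len(row[0]))
```

Here N is the sum of the lengths of all input vectors, so a record with more symptom relationship attribute pairs yields a longer row unless the pair vectors are padded or truncated, which the application does not discuss.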

Step S408, input the splicing vector to the network layer after the embedding layer of the first natural language text classification model, and output the disease set matching the main complaint information.

Specifically, after the server obtains the splicing vector, the splicing vector is input to the network layers after the embedding layer of the first natural language text classification model. Taking a TextCNN model that includes an embedding layer (Embedding), a convolution layer (Convolution), a pooling layer (MaxPooling) and a fully connected layer (FullConnection and Softmax) as an example, the splicing vector is input directly to the convolution layer (Convolution) of the TextCNN model. Then, the disease set output by the fully connected layer (FullConnection and Softmax) of the TextCNN model is obtained. The number of diseases in the disease set can be configured according to the actual situation. For example, if the disease set is configured to contain 20 diseases, the fully connected layer (FullConnection and Softmax) outputs the 20 diseases with the highest probabilities, yielding a disease set of 20 diseases.
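As a non-limiting illustration, selecting the configured number of most probable diseases from the output of the final layer may be sketched in Python as follows. The function names, the toy scores and the disease labels are assumptions; the sketch covers only the softmax and top-k selection, not the convolution and pooling layers that precede it.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to one."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k_diseases(logits: Sequence[float],
                   labels: Sequence[str],
                   k: int = 20) -> List[str]:
    """Return the k labels with the highest probability, most probable first."""
    probabilities = softmax(logits)
    ranked = sorted(zip(labels, probabilities),
                    key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


scores = [2.0, 0.5, 1.0]
names = ["bronchitis", "gastritis", "pneumonia"]
print(top_k_diseases(scores, names, k=2))
```

When fewer than k diseases are available, the slice simply returns all of them.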

In this embodiment, the first natural language text classification model and the second natural language text classification model are trained on the MIMIC data set in a supervised, end-to-end manner. Because a data-driven model is used to perform quality control of medical record diagnoses, more disease types can be covered and the medical record quality control becomes more widely applicable.

In one of the embodiments, step S402 includes: convolving the main complaint information with each convolution kernel in the embedding layer of the first natural language text classification model to obtain the convolution vector of each convolution kernel; and performing weighted average processing on the convolution vectors to obtain the word vector of the main complaint information.

Specifically, each convolution kernel in the embedding layer of the TextCNN model performs a weighted average on the vector obtained by convolving the main complaint information, thereby obtaining the word vector of the main complaint information. Among them, the weight coefficient has been fixed when training the TextCNN model.
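As a non-limiting illustration, the weighted average over the per-kernel convolution vectors may be sketched in Python as follows. The function name, the toy vectors and the normalisation by the sum of the weights are assumptions; the application states only that the weight coefficients are fixed during training.

```python
from typing import List, Sequence


def weighted_average(vectors: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Combine one vector per convolution kernel into a single vector.

    Assumption: all vectors have the same length and the weights are
    normalised by their sum.
    """
    if len(vectors) != len(weights):
        raise ValueError("one weight is required per vector")
    total = sum(weights)
    dimension = len(vectors[0])
    return [
        sum(weight * vector[i] for vector, weight in zip(vectors, weights)) / total
        for i in range(dimension)
    ]


kernel_outputs = [[1.0, 2.0], [3.0, 4.0]]
print(weighted_average(kernel_outputs, [1.0, 3.0]))  # learned, unequal weights
print(weighted_average(kernel_outputs, [1.0, 1.0]))  # equal weights: the plain mean
```

With equal weights the result reduces to the plain mean, which is the traditional approach that this embodiment is compared against.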

In this embodiment, compared with the traditional method of directly taking the mean value of the vector, the weight of the vector convolved by different convolution kernels in different embedding layers is fully considered, and the accuracy is improved.

In one of the embodiments, as shown in FIG. 5, a working flow chart of medical record quality control is provided, and the medical record quality control method is explained with reference to FIG. 5.

Specifically, the medical record to be examined, which includes the main complaint information and the diagnosis information, is first obtained. The server inputs the main complaint information into the embedding layer of the TextCNN model to obtain the word vector of the main complaint information. At the same time, the server extracts the symptom relationship attribute pair from the main complaint information, inputs the symptom relationship attribute pair into the embedding layer of the FastText model, and obtains the word vector of the symptom relationship attribute pair. Then, the word vector of the main complaint information and the word vector of the symptom relationship attribute pair are spliced on the vertical axis to obtain the splicing vector. Finally, the splicing vector is input to the network layers after the embedding layer of the TextCNN model for processing, and the disease set including the top 20 diseases is obtained. The diseases in the disease set are matched with the diagnosis information to determine whether the diagnosis is a misdiagnosis.

It should be understood that although the steps in the flowcharts of FIGS. 2-4 are displayed in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless specifically stated herein, the execution of these steps is not strictly limited in order, and these steps can be executed in other orders. Moreover, at least part of the steps in FIGS. 2-4 may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but can be executed at different times. The execution order of these sub-steps or stages is not necessarily sequential; they may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages of other steps.

In one of the embodiments, as shown in FIG. 6, a medical record quality control device is provided, which includes: an extraction module 602, a processing module 604, and a determination module 606, wherein:

The extraction module 602 is used to extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined.

The processing module 604 is used to input the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain a disease set that matches the main complaint information.

The determining module 606 is configured to match the disease set with the diagnosis information in the medical record to be checked, and determine whether the diagnosis information of the medical record to be checked is misdiagnosed according to the matching result.

In one of the embodiments, the extraction module 602 is also used to extract the main complaint information of the medical record to be examined; input the main complaint information into the trained second natural language processing model, and use the second natural language processing model to extract symptom entities from the main complaint information; query the symptom duration and symptom location of the symptom entity from the main complaint information to obtain the symptom relationship attribute pair; and perform text conversion on the symptom relationship attribute pair to obtain the symptom relationship attribute pair in text form.

In one of the embodiments, the extraction module 602 is also used to match the nearest punctuation marks on the left and right sides of the symptom entity in the main complaint information to determine the sentence segment where the symptom entity is located; match each character in the sentence segment one by one with the symptom part characters and symptom time characters in the preset dictionary; when there is a character that successfully matches a symptom part character or a symptom time character in the preset dictionary, extract the successfully matched character from the sentence segment; and combine the symptom entity and the extracted characters to obtain the symptom relationship attribute pair.
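
The segment lookup and dictionary matching described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the names PUNCTUATION, BODY_PART_TERMS, TIME_TERMS, sentence_segment and attribute_pair are hypothetical, the dictionary contents are invented, and because the sample text is English, whole words stand in for the characters that the application matches one by one.

```python
import re

# Punctuation marks that delimit a sentence segment (assumed set).
PUNCTUATION = ",.;!?，。；！？"

# Hypothetical preset dictionary entries.
BODY_PART_TERMS = {"abdomen", "head", "chest"}
TIME_TERMS = {"day", "days", "week", "weeks", "month", "months"}


def sentence_segment(text, entity):
    """Return the text between the nearest punctuation marks around entity."""
    start = text.find(entity)
    if start < 0:
        return None
    end = start + len(entity)
    # Nearest punctuation on the left; -1 means the segment starts the text.
    left = max(text.rfind(mark, 0, start) for mark in PUNCTUATION)
    # Nearest punctuation on the right; fall back to the end of the text.
    hits = [i for i in (text.find(mark, end) for mark in PUNCTUATION) if i >= 0]
    right = min(hits) if hits else len(text)
    return text[left + 1:right].strip()


def attribute_pair(text, entity):
    """Combine a symptom entity with location and duration terms near it."""
    segment = sentence_segment(text, entity)
    if segment is None:
        return None
    tokens = re.findall(r"\w+", segment.lower())
    location = [t for t in tokens if t in BODY_PART_TERMS]
    # Digits are treated as time terms so that "3 days" is kept together.
    duration = [t for t in tokens if t in TIME_TERMS or t.isdigit()]
    return {"symptom": entity, "location": location, "duration": duration}


text = "Fever for 2 days, pain in the abdomen for 3 days, no vomiting."
print(attribute_pair(text, "pain"))
```

Restricting the search to the segment between the two nearest punctuation marks is what keeps the "2 days" of the fever from being attached to the pain.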

In one of the embodiments, the processing module 604 is further configured to input the main complaint information into the embedding layer of the first natural language text classification model for vector conversion to obtain the word vector of the main complaint information; input the symptom relationship attribute pair into the embedding layer of the second natural language text classification model for word vector conversion to obtain the word vector of the symptom relationship attribute pair; splice the word vector of the main complaint information and the word vector of the symptom relationship attribute pair along the vertical axis to obtain the splicing vector; and input the splicing vector into the network layer after the embedding layer of the first natural language text classification model to output a set of diseases matching the main complaint information.
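
The splicing step can be illustrated with a small sketch. It assumes that each embedding layer yields a matrix with one row per token and one column per embedding dimension, and that "vertical axis" means stacking the rows of the second matrix under those of the first; the application does not state the tensor layout, so both points are assumptions. The function name splice_vertical and the sample values are hypothetical.

```python
def splice_vertical(first, second):
    """Stack two word-vector matrices row-wise (along the vertical axis).

    Each matrix is a list of rows; every row must have the same width.
    """
    rows = [list(row) for row in first] + [list(row) for row in second]
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError("all word vectors must share one embedding width")
    return rows


# Hypothetical word vectors: 3 tokens from the main complaint information
# and 2 tokens from the symptom relationship attribute pair, width 4.
complaint_vectors = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]
pair_vectors = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]

spliced = splice_vertical(complaint_vectors, pair_vectors)
print(len(spliced), len(spliced[0]))
```

Row-wise stacking only works when both embedding layers produce vectors of the same width, which is why the sketch checks the widths before returning.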

In one of the embodiments, the processing module 604 is also used to convolve the main complaint information with each convolution kernel in the embedding layer of the first natural language text classification model to obtain the convolution vector of each convolution kernel; and perform weighted average processing on the convolution vectors to obtain the word vector of the main complaint information.
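
A minimal numerical sketch of this convolve-then-average step follows. It assumes a one-dimensional input sequence, kernels of equal length (so that all convolution vectors have the same length), and externally supplied weights; none of these details is given in the application. The names conv1d_valid and weighted_average and all sample values are hypothetical.

```python
def conv1d_valid(signal, kernel):
    """Slide kernel over signal without padding.

    As in most deep learning libraries, the kernel is not flipped, so this
    is strictly a cross-correlation.
    """
    steps = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(steps)
    ]


def weighted_average(vectors, weights):
    """Element-wise weighted average of equally long vectors."""
    total = sum(weights)
    length = len(vectors[0])
    return [
        sum(w * v[i] for w, v in zip(weights, vectors)) / total
        for i in range(length)
    ]


# Hypothetical encoded main complaint information and two kernels.
signal = [1.0, 2.0, 3.0, 4.0]
kernels = [[1.0, 0.0], [0.0, 1.0]]
weights = [1.0, 3.0]  # assumed kernel weights

conv_vectors = [conv1d_valid(signal, k) for k in kernels]
word_vector = weighted_average(conv_vectors, weights)
print(word_vector)
```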

In one of the embodiments, the determining module 606 is further configured to determine that the diagnosis information of the medical record to be examined is a misdiagnosis when the diagnosis information does not match any disease in the disease set; and to determine that the diagnosis information of the medical record to be examined is not a misdiagnosis when the diagnosis information matches any disease in the disease set.
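
The decision rule above can be written down in a few lines. In this sketch, "matching" is taken to mean equality of disease names after trimming whitespace and lowercasing, which is an assumption; the application does not define how a diagnosis is compared with a disease in the set. The function name is_misdiagnosis and the sample names are hypothetical.

```python
def is_misdiagnosis(diagnosis, disease_set):
    """Return True when the diagnosis matches no disease in the set."""
    normalized = diagnosis.strip().lower()
    return all(normalized != disease.strip().lower() for disease in disease_set)


# Hypothetical disease set produced for one main complaint.
disease_set = ["gastritis", "appendicitis", "enteritis"]

print(is_misdiagnosis("migraine", disease_set))
print(is_misdiagnosis(" Gastritis ", disease_set))
```

Under this rule a single match anywhere in the set is enough to accept the diagnosis, so widening the set (for example from TOP5 to TOP20) makes the check more lenient.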

For the specific definition of the medical record quality control device, please refer to the above definition of the medical record quality control method, which will not be repeated here. Each module in the above medical record quality control device can be implemented in whole or in part by software, hardware, or a combination thereof. The above-mentioned modules may be embedded in, or independent of, the processor in the computer equipment in the form of hardware, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.

In one of the embodiments, a computer device is provided. The computer device may be a server, and its internal structure diagram may be as shown in FIG. 7. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device is used to provide calculation and control capabilities. The memory of the computer device includes a non-volatile or volatile storage medium and internal memory. The non-volatile or volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium. The database of the computer device is used to store data such as the medical records to be examined and the models. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions are executed by the processor to realize a medical record quality control method.

Those skilled in the art can understand that the structure shown in FIG. 7 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer parts than shown in the figure, or combine some parts, or have a different arrangement of parts.

A computer device, including a memory and one or more processors. The memory stores computer-readable instructions. When the computer-readable instructions are executed by the one or more processors, the one or more processors perform the following steps: extracting the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined;

Inputting the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain a set of diseases matching the main complaint information; and

Matching the disease set with the diagnosis information in the medical record to be examined, and determining, according to the matching result, whether the diagnosis information of the medical record to be examined is a misdiagnosis.

In one of the embodiments, the processor further implements the following steps when executing the computer-readable instructions:

Extract the main complaint information of the medical records to be examined;

Input the main complaint information into the trained second natural language processing model, and use the second natural language processing model to extract symptom entities from the main complaint information;

Query the symptom entity's symptom duration and symptom location from the main complaint information to obtain symptom relationship attribute pairs; and

The symptom relationship attribute pair is converted into text to obtain the symptom relationship attribute pair in text form.

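In one of the embodiments, the processor further implements the following steps when executing the computer-readable instructions:

Match the nearest punctuation marks on the left and right sides of the symptom entity in the main complaint information to determine the sentence segment where the symptom entity is located;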

Match each character in the sentence segment with the symptom part character and symptom time character in the preset dictionary one by one;

When there are characters that successfully match the symptom part characters and symptom time characters in the preset dictionary, extract the successfully matched characters from the sentence segment; and

Combine the symptom entity and the extracted characters to obtain the symptom relationship attribute pair.

In one of the embodiments, the processor further implements the following steps when executing the computer-readable instructions:

Input the main complaint information into the embedding layer of the first natural language text classification model for vector conversion to obtain the word vector of the main complaint information;

Input the symptom relationship attribute pair into the embedding layer of the second natural language text classification model for word vector conversion to obtain the word vector of the symptom relationship attribute pair;

Splice the word vector of the main complaint information and the word vector of the symptom relationship attribute pair along the vertical axis to obtain the splicing vector; and

The splicing vector is input to the network layer after the embedding layer of the first natural language text classification model, and the disease set matching the main complaint information is output.

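In one of the embodiments, the processor further implements the following steps when executing the computer-readable instructions:

Convolve the main complaint information with each convolution kernel in the embedding layer of the first natural language text classification model to obtain the convolution vector of each convolution kernel; and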

Perform weighted average processing on each convolution vector to obtain the word vector of the main complaint information.

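In one of the embodiments, the processor further implements the following steps when executing the computer-readable instructions:

When the diagnosis information does not match any disease in the disease set, it is determined that the diagnosis information of the medical record to be examined is a misdiagnosis; and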

When the diagnosis information matches any disease in the disease set, it is determined that the diagnosis information of the medical record to be examined is not misdiagnosed.

Wherein, the computer-readable storage medium may be non-volatile or volatile.

In one of the embodiments, when the computer-readable instructions are executed by the processor, the following steps are further implemented:

Extract the main complaint information of the medical records to be examined;

In one of the embodiments, when the computer-readable instructions are executed by the processor, the following steps are further implemented: input the main complaint information into the embedding layer of the first natural language text classification model to perform vector conversion to obtain the word vector of the main complaint information;

A person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned method embodiments can be implemented by instructing relevant hardware through computer-readable instructions. The computer-readable instructions can be stored in a computer-readable storage medium, and when executed, they may include the processes of the above-mentioned method embodiments. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features in the above embodiments are described. However, as long as there is no contradiction in the combination of these technical features, the combination should be considered to be within the scope described in this specification.

The above-mentioned embodiments only express several implementation manners of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the scope of protection of this application shall be subject to the appended claims.

Claims

A method for quality control of medical records, the method comprising:

Extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined;

Input the main complaint information and the symptom relationship attribute pair into a trained first natural language processing model to obtain a set of diseases matching the main complaint information; and

The disease set is matched with the diagnosis information in the medical record to be examined, and it is determined whether the diagnosis information of the medical record to be examined is misdiagnosed according to the matching result.
The method according to claim 1, wherein the extracting the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined comprises:

Extract the main complaint information of the medical record to be examined;

Input the main complaint information into a trained second natural language processing model, and use the second natural language processing model to extract symptom entities from the main complaint information;

Query the symptom duration and symptom location of the symptom entity from the main complaint information to obtain symptom relationship attribute pairs; and

The symptom relationship attribute pair is text-converted to obtain the symptom relationship attribute pair in text form.
The method according to claim 2, wherein the querying the symptom duration and symptom location of the symptom entity from the main complaint information to obtain a symptom relationship attribute pair comprises:

Matching the nearest punctuation marks on the left and right sides of the symptom entity in the main complaint information to determine the sentence segment where the symptom entity is located;

Match each character in the sentence segment with the symptom part character and symptom time character in the preset dictionary one by one;

When there is a character that successfully matches the symptom part character and the symptom time character in the preset dictionary, extract the successfully matched character from the sentence segment; and

The symptom entity and the extracted characters are combined to obtain a symptom relationship attribute pair.
The method according to claim 1, wherein the first natural language processing model comprises a first natural language text classification model and a second natural language text classification model;

The main complaint information and the symptom relationship attribute pair are input into the trained first natural language processing model to obtain a set of diseases matching the main complaint information, including:

Input the main complaint information into the embedding layer of the first natural language text classification model to perform vector conversion to obtain the word vector of the main complaint information;

Input the symptom relationship attribute pair into the embedding layer of the second natural language text classification model for word vector conversion to obtain the word vector of the symptom relationship attribute pair;

Splicing the word vector of the main complaint information and the word vector of the symptom relationship attribute pair according to the vertical axis direction to obtain a splicing vector; and

The splicing vector is input to a network layer after the embedding layer of the first natural language text classification model, and a disease set matching the main complaint information is output.
The method according to claim 4, wherein the inputting the main complaint information into the embedding layer of the trained first natural language text classification model for vector conversion to obtain the word vector of the main complaint information comprises:

Each convolution kernel in the embedding layer of the first natural language text classification model convolves the main complaint information to obtain the convolution vector of each convolution kernel; and

Perform weighted average processing on each of the convolution vectors to obtain the word vector of the main complaint information.
The method according to claim 1, wherein the matching the disease set with the diagnosis information in the medical record to be examined, and determining whether the diagnosis information of the medical record to be examined is misdiagnosed according to the matching result, comprises:

When the diagnosis information does not match the diseases in the disease set, it is determined that the diagnosis information of the medical record to be checked is a misdiagnosis; and

When the diagnosis information matches any disease in the disease set, it is determined that the diagnosis information of the medical record to be examined is not misdiagnosed.
The method according to any one of claims 4 or 5, wherein the first natural language text classification model includes a TextCNN model; and the second natural language text classification model includes a FastText model.
A medical record quality control device, the device comprising:

The extraction module is used to extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined;

A processing module, configured to input the main complaint information and the symptom relationship attribute pair into the trained first natural language processing model to obtain a set of diseases matching the main complaint information; and

The determining module is configured to match the disease set with the diagnostic information in the medical record to be checked, and determine whether the diagnostic information in the medical record to be checked is misdiagnosed according to the matching result.
A computer device comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:

Extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined;

Input the main complaint information and the symptom relationship attribute pair into a trained first natural language processing model to obtain a set of diseases matching the main complaint information; and

The disease set is matched with the diagnosis information in the medical record to be examined, and it is determined whether the diagnosis information of the medical record to be examined is misdiagnosed according to the matching result.
The computer device according to claim 9, wherein the processor further executes the following steps when executing the computer readable instruction:

Extract the main complaint information of the medical record to be examined;

Input the main complaint information into a trained second natural language processing model, and use the second natural language processing model to extract symptom entities from the main complaint information;

Query the symptom duration and symptom location of the symptom entity from the main complaint information to obtain symptom relationship attribute pairs; and

The symptom relationship attribute pair is text-converted to obtain the symptom relationship attribute pair in text form.
The computer device according to claim 10, wherein the processor further executes the following steps when executing the computer-readable instructions:

Matching the nearest punctuation marks on the left and right sides of the symptom entity in the main complaint information to determine the sentence segment where the symptom entity is located;

Match each character in the sentence segment with the symptom part character and symptom time character in the preset dictionary one by one;

When there is a character that successfully matches the symptom part character and the symptom time character in the preset dictionary, extract the successfully matched character from the sentence segment; and

The symptom entity and the extracted characters are combined to obtain a symptom relationship attribute pair.
The computer device according to claim 9, wherein the processor further executes the following steps when executing the computer readable instruction:

The main complaint information and the symptom relationship attribute pair are input into the trained first natural language processing model to obtain a set of diseases matching the main complaint information, including:

Input the main complaint information into the embedding layer of the first natural language text classification model to perform vector conversion to obtain the word vector of the main complaint information;

Input the symptom relationship attribute pair into the embedding layer of the second natural language text classification model for word vector conversion to obtain the word vector of the symptom relationship attribute pair;

Splicing the word vector of the main complaint information and the word vector of the symptom relationship attribute pair according to the vertical axis direction to obtain a splicing vector; and

The splicing vector is input to a network layer after the embedding layer of the first natural language text classification model, and a disease set matching the main complaint information is output.
The computer device according to claim 12, wherein the processor further executes the following steps when executing the computer readable instruction:

Each convolution kernel in the embedding layer of the first natural language text classification model convolves the main complaint information to obtain the convolution vector of each convolution kernel; and

Perform weighted average processing on each of the convolution vectors to obtain the word vector of the main complaint information.
The computer device according to claim 9, wherein the processor further executes the following steps when executing the computer readable instruction:

When the diagnosis information does not match the diseases in the disease set, it is determined that the diagnosis information of the medical record to be checked is a misdiagnosis; and

When the diagnosis information matches any disease in the disease set, it is determined that the diagnosis information of the medical record to be examined is not misdiagnosed.
One or more computer-readable storage media storing computer-readable instructions, which when executed by one or more processors, cause the one or more processors to perform the following steps:

Extract the main complaint information and the corresponding symptom relationship attribute pair in the medical record to be examined;

Input the main complaint information and the symptom relationship attribute pair into a trained first natural language processing model to obtain a set of diseases matching the main complaint information; and

The disease set is matched with the diagnosis information in the medical record to be examined, and it is determined whether the diagnosis information of the medical record to be examined is misdiagnosed according to the matching result.
The storage medium according to claim 15, wherein the following steps are further performed when the computer-readable instructions are executed by the processor:

Extract the main complaint information of the medical record to be examined;

Input the main complaint information into a trained second natural language processing model, and use the second natural language processing model to extract symptom entities from the main complaint information;

Query the symptom duration and symptom location of the symptom entity from the main complaint information to obtain symptom relationship attribute pairs; and

The symptom relationship attribute pair is text-converted to obtain the symptom relationship attribute pair in text form.
The storage medium according to claim 16, wherein the following steps are further performed when the computer-readable instructions are executed by the processor:

Matching the nearest punctuation marks on the left and right sides of the symptom entity in the main complaint information to determine the sentence segment where the symptom entity is located;

Match each character in the sentence segment with the symptom part character and symptom time character in the preset dictionary one by one;

When there is a character that successfully matches the symptom part character and the symptom time character in the preset dictionary, extract the successfully matched character from the sentence segment; and

The symptom entity and the extracted characters are combined to obtain a symptom relationship attribute pair.
The storage medium according to claim 15, wherein the following steps are further performed when the computer-readable instructions are executed by the processor:

The main complaint information and the symptom relationship attribute pair are input into the trained first natural language processing model to obtain a set of diseases matching the main complaint information, including:

Input the main complaint information into the embedding layer of the first natural language text classification model to perform vector conversion to obtain the word vector of the main complaint information;

Input the symptom relationship attribute pair into the embedding layer of the second natural language text classification model for word vector conversion to obtain the word vector of the symptom relationship attribute pair;

Splicing the word vector of the main complaint information and the word vector of the symptom relationship attribute pair according to the vertical axis direction to obtain a splicing vector; and

The splicing vector is input to a network layer after the embedding layer of the first natural language text classification model, and a disease set matching the main complaint information is output.
The storage medium according to claim 18, wherein the following steps are further performed when the computer-readable instructions are executed by the processor:

Each convolution kernel in the embedding layer of the first natural language text classification model convolves the main complaint information to obtain the convolution vector of each convolution kernel; and

Perform weighted average processing on each of the convolution vectors to obtain the word vector of the main complaint information.
The storage medium according to claim 16, wherein the following steps are further performed when the computer-readable instructions are executed by the processor:

When the diagnosis information does not match the diseases in the disease set, it is determined that the diagnosis information of the medical record to be checked is a misdiagnosis; and

When the diagnosis information matches any disease in the disease set, it is determined that the diagnosis information of the medical record to be examined is not misdiagnosed.