WO2021180242A1 - Method and apparatus for detecting anomaly in diagnostic data, and computer device and storage medium - Google Patents

Publication number
WO2021180242A1
Authority
WO
WIPO (PCT)
Prior art keywords
medical
diagnostic data
candidate information
disease
data
Prior art date
Application number
PCT/CN2021/083622
Other languages
French (fr)
Chinese (zh)
Inventor
唐蕊
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021180242A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/60 ICT specially adapted for the handling or processing of medical references relating to pathologies
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

A method and apparatus for detecting an anomaly in diagnostic data, a computer device, and a storage medium, belonging to the field of intelligent healthcare. The method comprises: matching diagnostic data of a target patient against two sets of rules, namely preset medical rules and medical mining rules, to obtain two types of candidate information, and fusing the two to obtain multi-dimensional third candidate information that combines the medical rules and the medical mining rules; recognizing the diagnostic data with a disease recognition model to obtain fourth candidate information, which improves the flexibility and speed of recognizing the diagnostic data; and determining suspected disease information of the target patient by combining the fourth candidate information with the third candidate information, so that whether the diagnostic data is anomalous can be determined from the suspected disease information, thereby confirming a misdiagnosis quickly and effectively.

Description

Diagnostic data anomaly detection method, apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 202011161090.4, filed with the Chinese Patent Office on October 27, 2020 and entitled "Diagnostic data anomaly detection method, apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of digital healthcare, and in particular to a diagnostic data anomaly detection method, apparatus, computer device, and storage medium.
Background
Misdiagnosis occurs when a doctor, for any of various reasons, gives a patient a wrong diagnosis. Misdiagnosis is very common: according to survey data, the misdiagnosis rate for diseases is typically around 30%. Misdiagnosis can have serious consequences, such as an incorrect treatment plan or delayed treatment. Timely detection of misdiagnosis is therefore essential.
The inventor has realized that existing misdiagnosis detection methods rely on medical rules that doctors write for each disease based on medical knowledge. A patient's diagnosis is checked by judging whether the diagnosed disease satisfies the medical rules corresponding to that disease; if the rules are not satisfied, the diagnosis is considered a misdiagnosis. However, because the medical rules are compiled by doctors from medical knowledge, existing methods are labor-intensive, time-consuming, inflexible, and low in detection accuracy.
Summary of the invention
To address the poor flexibility and low detection accuracy of existing misdiagnosis detection methods, a diagnostic data anomaly detection method, apparatus, computer device, and storage medium are provided that aim to improve the flexibility and accuracy of misdiagnosis detection.
To achieve the above objective, a first aspect of this application provides a diagnostic data anomaly detection method, comprising:
obtaining diagnostic data of a target patient;
matching the diagnostic data against preset medical rules to obtain first candidate information;
matching the diagnostic data against medical mining rules to obtain second candidate information;
fusing the first candidate information and the second candidate information to generate third candidate information;
recognizing the diagnostic data with a disease recognition model to obtain fourth candidate information;
fusing the third candidate information and the fourth candidate information to obtain suspected disease information, and determining, according to the suspected disease information, whether the diagnostic data is anomalous.
To achieve the above objective, a second aspect of this application provides a diagnostic data anomaly detection apparatus, comprising:
an acquiring unit, configured to obtain diagnostic data of a target patient;
a first matching unit, configured to match the diagnostic data against preset medical rules to obtain first candidate information;
a second matching unit, configured to match the diagnostic data against medical mining rules to obtain second candidate information;
a fusion unit, configured to fuse the first candidate information and the second candidate information to generate third candidate information;
a recognition unit, configured to recognize the diagnostic data with a disease recognition model to obtain fourth candidate information;
a processing unit, configured to fuse the third candidate information and the fourth candidate information to obtain suspected disease information, and to determine, according to the suspected disease information, whether the diagnostic data is anomalous.
To achieve the above objective, a third aspect of this application provides a diagnostic data anomaly detection device, comprising a memory and at least one processor, the memory storing instructions; the at least one processor invokes the instructions in the memory so that the diagnostic data anomaly detection device performs the following steps of the diagnostic data anomaly detection method:
obtaining diagnostic data of a target patient;
matching the diagnostic data against preset medical rules to obtain first candidate information;
matching the diagnostic data against medical mining rules to obtain second candidate information;
fusing the first candidate information and the second candidate information to generate third candidate information;
recognizing the diagnostic data with a disease recognition model to obtain fourth candidate information;
fusing the third candidate information and the fourth candidate information to obtain suspected disease information, and determining, according to the suspected disease information, whether the diagnostic data is anomalous.
To achieve the above objective, a fourth aspect of this application provides a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to perform the following steps of the diagnostic data anomaly detection method:
obtaining diagnostic data of a target patient;
matching the diagnostic data against preset medical rules to obtain first candidate information;
matching the diagnostic data against medical mining rules to obtain second candidate information;
fusing the first candidate information and the second candidate information to generate third candidate information;
recognizing the diagnostic data with a disease recognition model to obtain fourth candidate information;
fusing the third candidate information and the fourth candidate information to obtain suspected disease information, and determining, according to the suspected disease information, whether the diagnostic data is anomalous.
The diagnostic data anomaly detection method, apparatus, computer device, and storage medium provided by this application match the diagnostic data of a target patient against two sets of rules, preset medical rules and medical mining rules, to obtain two types of candidate information, and fuse the two to obtain multi-dimensional third candidate information that combines the medical rules and the medical mining rules. A disease recognition model recognizes the diagnostic data to obtain fourth candidate information, which makes recognition of the diagnostic data more flexible and fast. The suspected disease information of the target patient is determined by combining the fourth candidate information with the third candidate information, so that whether the diagnostic data is anomalous can be judged from the suspected disease information, and a misdiagnosis can be confirmed quickly and effectively.
Description of the drawings
FIG. 1 is a flowchart of an embodiment of the diagnostic data anomaly detection method of this application;
FIG. 2 is a flowchart of an embodiment of obtaining the diagnostic data of a target patient in this application;
FIG. 3 is a flowchart of an embodiment of obtaining the first candidate information in this application;
FIG. 4 is a flowchart of an embodiment of generating the medical mining rules from historical sample data in this application;
FIG. 5 is a flowchart of an embodiment of obtaining the second candidate information in this application;
FIG. 6 is a flowchart of an embodiment of fusing the third candidate information and the fourth candidate information to obtain the suspected disease information;
FIG. 7 is a block diagram of an embodiment of the diagnostic data anomaly detection apparatus of this application;
FIG. 8 is a hardware architecture diagram of an embodiment of the computer device of this application.
Detailed description
To make the objectives, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain this application and do not limit it. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, fall within the scope of protection of this application.
It should be noted that, provided they do not conflict, the embodiments in this application and the features of those embodiments may be combined with one another.
The diagnostic data anomaly detection method, apparatus, computer device, and storage medium provided by this application are suited to the field of intelligent healthcare services. This application matches the diagnostic data of a target patient against two sets of rules, preset medical rules and medical mining rules, to obtain two types of candidate information, and fuses the two to obtain multi-dimensional third candidate information that combines the medical rules and the medical mining rules. A disease recognition model recognizes the diagnostic data to obtain fourth candidate information, which makes recognition of the diagnostic data more flexible and fast. The suspected disease information of the target patient is determined by combining the fourth candidate information with the third candidate information, and it is then judged whether the suspected disease information contains the disease type recorded in the diagnostic data, that is, whether the diagnostic data is anomalous. If the disease type is contained, the diagnostic data is normal; if not, the diagnostic data is anomalous and a misdiagnosis exists. In this way a misdiagnosis can be confirmed quickly and effectively.
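The final decision described above can be sketched as follows. This is a minimal illustration rather than the application's implementation; the function name `is_anomalous` and the representation of the suspected disease information as a collection of disease type identifiers are assumptions.

```python
def is_anomalous(diagnosed_disease, suspected_diseases):
    """Return True when the diagnosed disease type is absent from the
    suspected disease information, i.e. a possible misdiagnosis.

    Both arguments use hypothetical disease type identifiers.
    """
    return diagnosed_disease not in set(suspected_diseases)


# The diagnosis "disease 2" does not appear among the suspected diseases,
# so the diagnostic data would be flagged as anomalous.
flag = is_anomalous("disease 2", ["disease 1", "disease 4"])
```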
Embodiment 1
Referring to FIG. 1, the diagnostic data anomaly detection method of this embodiment includes the following steps:
S1. Obtain the diagnostic data of the target patient.
Further, referring to FIG. 2, step S1 may include the following steps:
S11. Receive medical data of the target patient sent by a user terminal, the medical data including basic information of the target patient, a target disease type, and multiple medical entities.
The basic information of the target patient may include an identifier of the target patient (for example, an ID card number or medical insurance card number), age, sex, chief complaint, history of present illness, family history, and similar information. The target disease type may be a disease classification code. A medical entity may be the code of an examination item, for example a blood test item (such as blood pressure, hemoglobin, or platelets) or a urine test item (such as protein, ketone bodies, or glucose).
S12. Extract the medical entities from the medical data to generate the diagnostic data.
In this embodiment, the diagnostic data consists of the IDs of the medical entities, for example [medical entity X1, medical entity X2, medical entity X3, ...], where the X in each medical entity denotes its ID.
S2. Match the diagnostic data against the preset medical rules to obtain first candidate information.
The preset medical rules are rules set in advance on the basis of medical knowledge (compiled by doctors from that knowledge). They comprise multiple medical rules; each medical rule includes at least one medical entity and corresponds to one disease type.
By way of example and not limitation, a medical rule is generally presented as follows. Disease 1 → (medical entity 1, medical entity 3) means that if medical entity 1 and medical entity 3 both appear in the diagnostic data, disease 1 is added to the first candidate information of the target patient. Disease 2 → (medical entity 1, medical entity 5, medical entity 10) means that if medical entity 1, medical entity 5, and medical entity 10 all appear in the diagnostic data, disease 2 is added to the first candidate information of the target patient.
Further, referring to FIG. 3, step S2 may include the following steps:
S21. Match the medical entities in the diagnostic data against each medical rule in the preset medical rules to obtain the matching degree of each disease type that matches the diagnostic data.
In this embodiment, when matching medical rules, all medical entities in the diagnostic data are compared with all entities in each medical rule. If every entity in a medical rule is found in the diagnostic data (whether the rule's entities cover all or only some of the diagnostic data's entities), the diagnostic data is confirmed to match the disease type corresponding to that rule. If only some of the rule's entities are found in the diagnostic data, the diagnostic data does not match the disease type corresponding to that rule.
For example, matching the diagnostic data of the target patient (which contains multiple medical entities) against the preset medical rules yields the following result. Suppose there are 5 rules in total, corresponding to 3 disease types, as shown in Table 1:
Table 1
Medical rule | Disease type | Rule matching result | Matching degree
Rule 1 | Disease 1 | Match | 1
Rule 2 | Disease 1 | No match | 0
Rule 3 | Disease 2 | No match | 0
Rule 4 | Disease 2 | No match | 0
Rule 5 | Disease 3 | Match | 1
Among the preset medical rules, the matching degree of the disease type corresponding to each matched rule is set to 1. If a disease type has multiple medical rules, the diagnostic data's matching degree for that disease is set to 1 as soon as any one of them matches. In Table 1, for example, rule 1 and rule 2 both correspond to disease 1; although the diagnostic data matches only rule 1, its matching degree for disease 1 is still set to 1.
S22. Extract the matching degrees of all disease types that match the diagnostic data to generate the first candidate information.
With reference to Table 1, extracting the disease types whose matching degree is 1 gives the first candidate information: {disease 1: 1, disease 3: 1}.
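Steps S21 and S22 can be sketched as a subset test between each rule and the diagnostic data. This is a minimal sketch under stated assumptions: the function name `match_preset_rules`, the rule representation (a list of pairs of disease type and required entity IDs), and the example IDs are all hypothetical.

```python
def match_preset_rules(diagnostic_entities, preset_rules):
    """Return the first candidate information as {disease_type: 1}.

    preset_rules is a list of (disease_type, required_entities) pairs.
    A rule matches only if all of its entities occur in the diagnostic data;
    one matching rule is enough to set a disease type's matching degree to 1.
    """
    observed = set(diagnostic_entities)
    first_candidates = {}
    for disease, required in preset_rules:
        if set(required) <= observed:  # every rule entity is present
            first_candidates[disease] = 1
    return first_candidates


# Hypothetical rules mirroring the shape of Table 1 (5 rules, 3 disease types).
rules = [
    ("disease 1", {1, 3}),
    ("disease 1", {2, 4}),
    ("disease 2", {1, 5, 10}),
    ("disease 2", {6}),
    ("disease 3", {9}),
]
first = match_preset_rules([1, 3, 9], rules)
```

Disease types with no matching rule are simply omitted from the result, which corresponds to a matching degree of 0.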
In one embodiment, before step S3 is performed, the method may further include: generating the medical mining rules from historical sample data.
It should be noted that the historical sample data includes multiple historical medical records, each of which includes the disease type of a historical patient and multiple medical entities. The medical mining rules comprise multiple mining rules; each mining rule includes at least one medical entity and corresponds to one disease type.
Further, referring to FIG. 4, generating the medical mining rules from historical sample data includes the following steps:
A1. Classify the historical medical records in the historical sample data by disease type to generate disease type sets.
Specifically, a weight value is calculated for each medical entity with respect to its corresponding disease type, and the medical entities whose weight value is greater than or equal to a weight threshold are extracted record by record from the historical sample data.
The historical medical data is classified by disease type, and a weight value is calculated for each medical entity in the historical medical data of each disease type. Medical entities whose weight value is below the weight threshold (a value in the range [0, 1]) are filtered out. Filtering the medical entities of each disease type reduces noise: it removes entities that occur frequently but have low discriminative power and low importance, which improves data quality.
The weight value is calculated as: weight[medical entity i, disease j] = (number of times medical entity i appears in disease j) / (number of diseases in which medical entity i appears), where i denotes the ID of the medical entity and j denotes the ID of the disease type. If weight[medical entity i, disease j] < threshold, the medical entity is removed.
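The weighting and filtering of step A1 can be sketched as below. The sketch applies the formula literally, using raw occurrence counts; the source does not say whether counts are normalized before comparison with a threshold in [0, 1], so the threshold in the usage example is a hypothetical value. The function names and the record layout (pairs of disease ID and entity ID list) are also assumptions.

```python
from collections import Counter, defaultdict


def entity_weights(records):
    """Compute weight[(entity, disease)] from (disease_id, [entity_id, ...]) records.

    weight = (records of the disease containing the entity)
             / (number of distinct diseases in which the entity appears)
    """
    occurrences = defaultdict(Counter)  # disease -> entity -> count
    diseases_of = defaultdict(set)      # entity -> diseases it appears in
    for disease, entities in records:
        for entity in set(entities):    # count an entity once per record
            occurrences[disease][entity] += 1
            diseases_of[entity].add(disease)
    return {
        (entity, disease): count / len(diseases_of[entity])
        for disease, counts in occurrences.items()
        for entity, count in counts.items()
    }


def filter_records(records, threshold):
    """Drop, record by record, the entities whose weight is below the threshold."""
    weights = entity_weights(records)
    return [
        (disease, [e for e in entities if weights[(e, disease)] >= threshold])
        for disease, entities in records
    ]


history = [("d1", ["e1", "e2"]), ("d1", ["e1", "e3"]), ("d2", ["e2"])]
filtered = filter_records(history, threshold=1.0)
```

In this toy data, entity e2 appears in two diseases and is therefore down-weighted and removed, which is the low-discrimination case the filtering targets.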
A2. Apply a frequent itemset mining algorithm to the historical medical records in each disease type set to generate the mining rules corresponding to each disease type.
Using two preset thresholds, minimum support (min_support) and minimum confidence (min_confidence), the frequent itemset mining algorithm screens the filtered historical sample records of each disease type to obtain the mining rules for that disease type. For example, for disease 1, three data-driven rules are obtained once both thresholds are satisfied: {[medical entity 1, medical entity 3, medical entity 9], [medical entity 1, medical entity 3, medical entity 5, medical entity 7], [medical entity 1, medical entity 5, medical entity 7, medical entity 9, medical entity 10]}. The mining rules of each disease type are sorted by support in descending order.
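As an illustration of support-based screening, the sketch below mines frequent itemsets from the records of a single disease type. It is a simple level-wise search of the kind commonly associated with the Apriori algorithm, chosen for brevity, and is not an FP-growth implementation; it applies only the `min_support` threshold and omits `min_confidence`. The function name and data layout are assumptions.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support >= min_support.

    transactions: the entity sets of the records belonging to one disease type.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        # Fraction of records that contain every entity of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Frequent 1-itemsets.
    current = {}
    for item in {item for t in transactions for item in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    result = {}
    k = 1
    while current:
        result.update(current)
        k += 1
        # Join: unions of frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        next_level = {}
        for candidate in candidates:
            # Prune: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(candidate, k - 1)):
                s = support(candidate)
                if s >= min_support:
                    next_level[candidate] = s
        current = next_level
    return result


records_d1 = [{1, 3, 9}, {1, 3, 9}, {1, 3, 5}, {1, 9}, {2}]
itemsets = frequent_itemsets(records_d1, min_support=0.6)
```

Each resulting itemset, together with its support, could serve as a candidate mining rule for the disease type whose records were mined.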
The frequent itemset mining algorithm (Frequent-Pattern Growth, FP-growth) is an iterative method known as level-wise search, in which k-itemsets are used to explore (k+1)-itemsets. First, the set of frequent 1-itemsets is found and denoted L1; L1 is used to find the set L2 of frequent 2-itemsets, L2 is used to find L3, and so on, until no frequent k-itemsets can be found. Finding each Lk requires one scan of the database.
S3. Match the diagnostic data against the medical mining rules to obtain second candidate information.
Further, referring to FIG. 5, step S3 may include the following steps:
S31. Match the medical entities in the diagnostic data against each mining rule in the medical mining rules to obtain the matching degree of each disease type that matches the diagnostic data.
The diagnostic data is matched against all of the mining rules. For a given disease type, if a rule of that disease type is matched, the support of that rule is taken as the diagnostic data's matching degree for the disease; if no rule of that disease type is matched, the diagnostic data's matching degree for that disease type is set to 0. Proceeding in this way yields a list of the diagnostic data's likelihoods for all disease types, and the disease types in the list are sorted by matching degree in descending order. Note that if the data matches multiple rules of one disease, the matched rule with the highest matching degree is used.
For example, for disease 1 with the support threshold set to 0.7, there are 3 mining rules, listed in Table 2 with their supports:
Table 2
No. | Mining rule | Support
1 | Medical entity 1, medical entity 3, medical entity 9 | 0.80
2 | Medical entity 1, medical entity 3, medical entity 5, medical entity 7 | 0.75
3 | Medical entity 1, medical entity 5, medical entity 7, medical entity 9, medical entity 10 | 0.70
When the diagnostic data contains 4 medical entities, [medical entity 1, medical entity 2, medical entity 3, medical entity 9], it matches rule 1 in Table 2 for disease 1 (the data contains all three medical entities of rule 1), so the likelihood that the diagnostic data corresponds to disease 1 is 0.80.
S32. Extract the matching degrees of all disease types that match the diagnostic data to generate the second candidate information.
Taking 10 disease types as an example, the diagnostic data is matched against the mining rules of every disease type, as shown in Table 3:
Table 3
Disease type | Rule matching result | Support
Disease 1 | Matched a rule | 0.80
Disease 2 | No rule matched | 0
Disease 3 | No rule matched | 0
Disease 4 | Matched a rule | 0.85
Disease 5 | No rule matched | 0
Disease 6 | No rule matched | 0
Disease 7 | Matched a rule | 0.70
Disease 8 | No rule matched | 0
Disease 9 | No rule matched | 0
Disease 10 | No rule matched | 0
Three disease types are matched. Extracting the support (matching degree) of each matched disease type gives the second candidate information: {disease 1: 0.80, disease 4: 0.85, disease 7: 0.70}.
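Steps S31 and S32 can be sketched as follows: a mining rule matches when all of its entities occur in the diagnostic data, and the highest support among a disease type's matched rules becomes that disease type's matching degree. The function name `match_mined_rules`, the rule layout, and the rule for disease 2 are hypothetical; the rules for disease 1 follow Table 2.

```python
def match_mined_rules(diagnostic_entities, mined_rules):
    """Return the second candidate information as {disease_type: matching_degree}.

    mined_rules maps a disease type to a list of (rule_entities, support) pairs.
    Disease types with no matched rule are omitted (matching degree 0).
    """
    observed = set(diagnostic_entities)
    second_candidates = {}
    for disease, rules in mined_rules.items():
        supports = [s for entities, s in rules if set(entities) <= observed]
        if supports:
            # Several rules of one disease may match; keep the highest support.
            second_candidates[disease] = max(supports)
    return second_candidates


mined = {
    "disease 1": [({1, 3, 9}, 0.80), ({1, 3, 5, 7}, 0.75), ({1, 5, 7, 9, 10}, 0.70)],
    "disease 2": [({4, 6}, 0.90)],  # hypothetical rule that will not match
}
second = match_mined_rules([1, 2, 3, 9], mined)
```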
It should be emphasized that, to further ensure the privacy and security of the preset medical rules and medical mining rules, these rules may also be stored in a node of a blockchain. The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
S4. Fuse the first candidate information and the second candidate information to generate third candidate information.
Further, step S4 may include: based on the disease types in the first candidate information and the disease types in the second candidate information, calculating, for each disease type, the average of its matching degree in the first candidate information and its matching degree in the second candidate information, and generating the third candidate information, which contains the matching average for each disease type.
In this embodiment, the matching degrees in the first candidate information and the second candidate information are fused by weighting to obtain the third candidate information, for example the third candidate information corresponding to the diagnostic data $L_{\text{rule}}$ = {disease 4: 0.95, disease 2: 0.90, disease 1: 0.5, ...}.
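A minimal sketch of the averaging in step S4 is given below. The text does not say how a disease type that appears in only one of the two candidate lists is handled; this sketch assumes its single matching degree is carried over unchanged. The function name and the sample values are hypothetical.

```python
def fuse_by_average(first, second):
    """Average the matching degrees of each disease type across both candidate lists."""
    fused = {}
    for disease in first.keys() | second.keys():
        # Assumption: a disease present in only one list keeps its single score.
        scores = [c[disease] for c in (first, second) if disease in c]
        fused[disease] = sum(scores) / len(scores)
    return fused


first_candidates = {"disease 1": 0.6, "disease 4": 0.9}
second_candidates = {"disease 1": 0.8, "disease 7": 0.7}
third_candidates = fuse_by_average(first_candidates, second_candidates)
```

Treating a missing score as 0 instead would penalize diseases supported by only one rule set; which behavior is intended is not stated in the text.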
S5. Use a disease recognition model to recognize the diagnostic data and obtain fourth candidate information.
The fourth candidate information includes a matching value for each disease type.
In this embodiment, the disease recognition model is a BERT (Bidirectional Encoder Representations from Transformers) model whose input is the diagnostic data and whose output is the fourth candidate information.
BERT is a language model from the field of natural language processing that can process natural language directly, without conversion. A patient's visit data, however, contains not only unstructured data (free text) but also a large amount of structured data, and the standard BERT model can process only free text. To let the model process structured and unstructured data together, this embodiment modifies BERT. The unstructured data and the structured data are concatenated and fed into the model: each word (w) of the unstructured data is mapped to one token, and each code (c) of the structured data is mapped to one token. The original segment embedding layer is removed, and the original position embedding layer is changed. Unstructured data (text) is ordered, so each of its tokens receives a position embedding that reflects its position. Structured data has no order, so the position embeddings of all structured-data tokens are set to the same embedding.
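The position handling described above can be sketched as follows. This is an illustration under stated assumptions, not the patent's code: text tokens receive positions 0 to n-1, and every structured code shares one position index, chosen here as n. The text only says the structured tokens share the same embedding, so the specific index, the function name, and the sample tokens are hypothetical. Special tokens such as [CLS] are omitted for brevity.

```python
def build_position_ids(text_tokens, structured_codes):
    """Concatenate text tokens and structured codes and assign position indices.

    Text tokens are ordered, so they get increasing positions 0..n-1.
    Structured codes are unordered, so they all share one position (assumed: n).
    """
    n = len(text_tokens)
    tokens = list(text_tokens) + list(structured_codes)
    position_ids = list(range(n)) + [n] * len(structured_codes)
    return tokens, position_ids


tokens, position_ids = build_position_ids(
    ["fever", "cough", "three", "days"],  # unstructured free text, one token per word
    ["X12", "X7"],                        # structured codes, one token per code
)
```

Because the structured tokens share a position, permuting them leaves the position indices unchanged, which is the order-invariance the modification aims for. No segment IDs are produced, in line with the removed segment embedding layer.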
Starting from a BERT model pre-trained on Chinese, the structured-data codes are added to the vocabulary to extend it. The model is then pre-trained further to update the embedding representations of the structured data and the model parameters. On top of this pre-trained model, the downstream task of suspected-disease judgment is used to fine-tune the model, where FC is a fully connected layer and the output is a list of suspected diseases, for example the fourth candidate information $L_{\text{deep}}$ = {disease 2: 0.98, disease 4: 0.80, disease 1: 0.2, ...}.
S6. Fuse the third candidate information and the fourth candidate information to obtain suspected disease information, and determine from the suspected disease information whether the diagnostic data is abnormal.
Further, referring to FIG. 6, step S6 may include the following steps:
S61. Extract the disease types whose suspicion value meets a preset condition, and generate the suspected disease information.
S62. Match the target disease type in the diagnostic data against the disease types in the suspected disease information. A match indicates that the diagnostic data is normal; no match indicates that the diagnostic data is abnormal.
The third candidate information $L_{\text{rule}}$ and the fourth candidate information $L_{\text{deep}}$ are fused by weighting to give the suspected disease information $L = w_{\text{rule}} \times L_{\text{rule}} + w_{\text{deep}} \times L_{\text{deep}}$, where $w_{\text{rule}}$ and $w_{\text{deep}}$ are preset coefficients with $w_{\text{rule}} + w_{\text{deep}} = 1$. For example, if a disease has the value 0.8 in $L_{\text{rule}}$ and 0.5 in $L_{\text{deep}}$, its value in $L$ is $w_{\text{rule}} \times 0.8 + w_{\text{deep}} \times 0.5$. The K disease types (K a positive integer) with the highest values form the suspected disease information of the target patient. If the target patient's target disease type (the actually recorded disease type) is not in the suspected disease information, the target patient's diagnostic data indicates a misdiagnosis.
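The weighted fusion and the top-K check can be sketched as below, using the example values given for the third and fourth candidate information. The weight of 0.5, K = 2, the treatment of a missing disease as 0, and the function names are assumptions made for illustration; the text fixes only the formula and the constraint that the two weights sum to 1.

```python
def fuse_weighted(l_rule, l_deep, w_rule=0.5):
    """Compute L = w_rule * L_rule + w_deep * L_deep per disease, with w_deep = 1 - w_rule."""
    w_deep = 1.0 - w_rule
    diseases = l_rule.keys() | l_deep.keys()
    # Assumption: a disease missing from one list contributes 0 from that list.
    return {d: w_rule * l_rule.get(d, 0.0) + w_deep * l_deep.get(d, 0.0) for d in diseases}


def check_diagnosis(target_disease, l_rule, l_deep, w_rule=0.5, k=2):
    """Return (is_abnormal, top_k): abnormal when the target is not among the top-K diseases."""
    fused = fuse_weighted(l_rule, l_deep, w_rule)
    top_k = sorted(fused, key=fused.get, reverse=True)[:k]
    return target_disease not in top_k, top_k


l_rule = {"disease 4": 0.95, "disease 2": 0.90, "disease 1": 0.5}
l_deep = {"disease 2": 0.98, "disease 4": 0.80, "disease 1": 0.2}

abnormal, suspected = check_diagnosis("disease 1", l_rule, l_deep)
```

With equal weights, disease 2 and disease 4 rank highest, so a recorded diagnosis of disease 1 would be flagged as abnormal, while a recorded diagnosis of disease 2 would not.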
In this embodiment, the diagnostic data anomaly detection method matches the target patient's diagnostic data against two sets of rules, the preset medical rules and the medical mining rules, to obtain two sets of candidate information, and fuses them into multi-dimensional third candidate information that combines both rule sets. A disease recognition model recognizes the diagnostic data to obtain fourth candidate information, which makes recognition more flexible and fast. Combining the fourth and third candidate information yields the target patient's suspected disease information, from which it is determined whether the diagnostic data is abnormal, so that misdiagnoses can be confirmed quickly and effectively.
Embodiment 2
Referring to FIG. 7, a diagnostic data anomaly detection apparatus 1 of this embodiment includes: an acquisition unit 11, a first matching unit 12, a second matching unit 13, a fusion unit 14, a recognition unit 15, and a processing unit 16.
The acquisition unit 11 is configured to acquire diagnostic data of a target patient.
Further, the acquisition unit 11 is configured to receive medical data of the target patient sent by a user terminal, the medical data including: basic information of the target patient, a target disease type, and multiple medical entities.
The basic information of the target patient may include: a number identifying the target patient (for example, an ID card or medical insurance card number), age, gender, chief complaint, history of present illness, family history, and similar information. The target disease type may be a disease classification number. A medical entity may be the number of a test item, for example a blood test item (such as blood pressure, hemoglobin, or platelet indicators) or a urine test item (such as protein, ketone bodies, or glucose).
The acquisition unit 11 is further configured to extract the medical entities from the medical data and generate the diagnostic data.
In this embodiment, the diagnostic data consists of the numbers (IDs) of the medical entities, for example [medical entity X1, medical entity X2, medical entity X3, ...], where X denotes the ID of a medical entity.
The first matching unit 12 is configured to match the diagnostic data against preset medical rules and obtain first candidate information.
The preset medical rules are rules set in advance on the basis of medical knowledge (compiled by doctors from medical knowledge). They comprise multiple medical rules; each medical rule includes at least one medical entity and corresponds to one disease type.
Further, the first matching unit 12 is configured to match the multiple medical entities in the diagnostic data against each medical rule in the preset medical rules to obtain the matching degree of each disease type that matches the diagnostic data. The first matching unit 12 is further configured to extract the matching degrees of all disease types that match the diagnostic data and generate the first candidate information.
The second matching unit 13 is configured to match the diagnostic data against medical mining rules and obtain second candidate information.
Further, the second matching unit 13 is configured to match the multiple medical entities in the diagnostic data against each mining rule in the medical mining rules to obtain the matching degree of each disease type that matches the diagnostic data. The second matching unit 13 is further configured to extract the matching degrees of all disease types that match the diagnostic data and generate the second candidate information.
In this embodiment, the medical mining rules are derived from historical sample data. The specific process is as follows (see FIG. 4):
A1. Classify the historical medical data in the historical sample data by disease type to generate disease type sets.
A2. Apply a frequent itemset mining algorithm to the historical medical data in each disease type set to generate the mining rules corresponding to that disease type.
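Steps A1 and A2 can be sketched as follows. The text does not name a specific frequent itemset algorithm, so this sketch uses plain brute-force enumeration of small itemsets rather than Apriori or FP-growth. It also assumes that support is measured as the fraction of cases within one disease type that contain the itemset. The function name, the thresholds, and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def mine_rules(records, min_support=0.6, max_size=2):
    """records: iterable of (disease_type, [medical entity IDs]).

    Returns a list of (entity_set, disease_type, support) mining rules.
    """
    # A1: group the historical medical data by disease type.
    by_disease = defaultdict(list)
    for disease, entities in records:
        by_disease[disease].append(frozenset(entities))

    rules = []
    # A2: within each disease type set, keep itemsets that occur often enough.
    for disease, cases in by_disease.items():
        counts = Counter()
        for entities in cases:
            for size in range(1, max_size + 1):
                for itemset in combinations(sorted(entities), size):
                    counts[itemset] += 1
        for itemset, count in counts.items():
            support = count / len(cases)  # assumed definition of support
            if support >= min_support:
                rules.append((frozenset(itemset), disease, support))
    return rules


records = [
    ("disease 1", ["X1", "X2"]),
    ("disease 1", ["X1", "X2", "X3"]),
    ("disease 1", ["X1"]),
    ("disease 4", ["X3"]),
]
mining_rules = mine_rules(records)
```

Brute-force enumeration grows quickly with the number of entities per record, so a real system would more plausibly use a dedicated frequent itemset algorithm; the sketch only shows the shape of the resulting rules.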
The fusion unit 14 is configured to fuse the first candidate information and the second candidate information to generate third candidate information.
Further, the fusion unit 14 may, based on the disease types in the first candidate information and the disease types in the second candidate information, calculate for each disease type the average of its matching degree in the first candidate information and its matching degree in the second candidate information, and generate the third candidate information, which contains the matching average for each disease type.
The recognition unit 15 is configured to use a disease recognition model to recognize the diagnostic data and obtain fourth candidate information.
The fourth candidate information includes a matching value for each disease type.
In this embodiment, the disease recognition model is a BERT (Bidirectional Encoder Representations from Transformers) model whose input is the diagnostic data and whose output is the fourth candidate information.
The processing unit 16 is configured to fuse the third candidate information and the fourth candidate information to obtain suspected disease information, and to determine from the suspected disease information whether the diagnostic data is abnormal.
Further, the processing unit 16 extracts the disease types whose suspicion value meets a preset condition and generates the suspected disease information, then matches the target disease type in the diagnostic data against the disease types in the suspected disease information. A match indicates that the diagnostic data is normal; no match indicates that the diagnostic data is abnormal.
In this embodiment, the diagnostic data anomaly detection apparatus 1 matches the target patient's diagnostic data against the preset medical rules through the first matching unit 12 and against the medical mining rules through the second matching unit 13, obtaining two sets of candidate information. The fusion unit 14 fuses them into multi-dimensional third candidate information that combines both rule sets. The disease recognition model in the recognition unit 15 recognizes the diagnostic data to obtain fourth candidate information, which makes recognition more flexible and fast. The processing unit 16 combines the fourth and third candidate information to determine the target patient's suspected disease information, from which it is determined whether the diagnostic data is abnormal, so that misdiagnoses can be confirmed quickly and effectively.
Embodiment 3
To achieve the above objective, this application further provides a computer device 2. The computer device 2 may comprise multiple computer devices 2, and the components of the diagnostic data anomaly detection apparatus 1 of Embodiment 2 may be distributed across them. A computer device 2 may be a smartphone, tablet, laptop, desktop computer, rack server, blade server, tower server, or cabinet server (including a standalone server or a server cluster composed of multiple servers) that executes programs. The computer device 2 of this embodiment includes at least, but is not limited to: a memory 21, a processor 23, a network interface 22, and the diagnostic data anomaly detection apparatus 1, which can communicate with one another over a system bus (see FIG. 8). Note that FIG. 8 shows only a computer device 2 with these components; it is not required that all illustrated components be implemented, and more or fewer components may be implemented instead.
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium. The readable storage medium includes flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as its hard disk or main memory. In other embodiments, the memory 21 may be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the computer device 2. The memory 21 may also include both an internal storage unit of the computer device 2 and an external storage device. In this embodiment, the memory 21 is typically used to store the operating system and the application software installed on the computer device 2, such as the program code of the diagnostic data anomaly detection method of Embodiment 1. The memory 21 may also be used to temporarily store data that has been output or is to be output.
In some embodiments, the processor 23 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 23 is typically used to control the overall operation of the computer device 2, for example to perform control and processing related to data exchange or communication with the computer device 2. In this embodiment, the processor 23 is used to run the program code stored in the memory 21 or to process data, for example to run the diagnostic data anomaly detection apparatus 1.
The network interface 22 may include a wireless network interface or a wired network interface, and is typically used to establish a communication connection between the computer device 2 and other computer devices 2. For example, the network interface 22 connects the computer device 2 to an external terminal over a network and establishes a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi.
Note that FIG. 8 shows only a computer device 2 with components 21 to 23; it is not required that all illustrated components be implemented, and more or fewer components may be implemented instead.
In this embodiment, the diagnostic data anomaly detection apparatus 1 stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 23 in this embodiment) to carry out this application.
Embodiment 4
To achieve the above objective, this application further provides a computer-readable storage medium, which may be non-volatile or volatile. It includes multiple storage media, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, a server, an app store, and the like, on which a computer program is stored that implements the corresponding functions when executed by the processor 23. The computer-readable storage medium of this embodiment is used to store the diagnostic data anomaly detection apparatus 1, which, when executed by the processor 23, implements the diagnostic data anomaly detection method of Embodiment 1.
The numbering of the above embodiments of this application is for description only and does not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform. They can also be implemented in hardware, but in many cases the former is the better implementation.
The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims (20)

  1. A diagnostic data anomaly detection method, comprising:
    acquiring diagnostic data of a target patient;
    matching the diagnostic data against preset medical rules to obtain first candidate information;
    matching the diagnostic data against medical mining rules to obtain second candidate information;
    fusing the first candidate information and the second candidate information to generate third candidate information;
    recognizing the diagnostic data using a disease recognition model to obtain fourth candidate information;
    fusing the third candidate information and the fourth candidate information to obtain suspected disease information, and determining from the suspected disease information whether the diagnostic data is abnormal.
  2. The diagnostic data anomaly detection method according to claim 1, wherein acquiring the diagnostic data of the target patient comprises:
    receiving medical data of the target patient sent by a user terminal, the medical data comprising: basic information of the target patient, a target disease type, and multiple medical entities;
    extracting the medical entities from the medical data to generate the diagnostic data.
  3. The diagnostic data anomaly detection method according to claim 2, wherein the preset medical rules are rules set in advance on the basis of medical knowledge and comprise multiple medical rules, each medical rule comprising at least one of the medical entities and corresponding to one disease type;
    matching the diagnostic data against the preset medical rules to obtain the first candidate information comprises:
    matching the multiple medical entities in the diagnostic data against each medical rule in the preset medical rules to obtain the matching degree of each disease type that matches the diagnostic data;
    extracting the matching degrees of all disease types that match the diagnostic data to generate the first candidate information.
  4. The diagnostic data anomaly detection method according to claim 3, wherein, before matching the diagnostic data against the medical mining rules to obtain the second candidate information, the method further comprises:
    generating the medical mining rules from historical sample data:
    the historical sample data comprises multiple pieces of historical medical data, each piece of historical medical data comprising the disease type of a historical patient and multiple of the medical entities;
    the medical mining rules comprise multiple mining rules, each mining rule comprising at least one of the medical entities and corresponding to one disease type;
    classifying the historical medical data in the historical sample data by disease type to generate disease type sets;
    applying a frequent itemset mining algorithm to the historical medical data in the disease type sets to generate the mining rules corresponding to each disease type.
  5. The diagnostic data anomaly detection method according to claim 4, wherein matching the diagnostic data against the medical mining rules to obtain the second candidate information comprises:
    matching the multiple medical entities in the diagnostic data against each mining rule in the medical mining rules to obtain the matching degree of each disease type that matches the diagnostic data;
    extracting the matching degrees of all disease types that match the diagnostic data to generate the second candidate information.
  6. The diagnostic data anomaly detection method according to claim 5, wherein fusing the first candidate information and the second candidate information to generate the third candidate information comprises:
    based on the disease types in the first candidate information and the disease types in the second candidate information, calculating, for each disease type, the average of its matching degree in the first candidate information and its matching degree in the second candidate information, and generating the third candidate information, which contains the matching average for each disease type.
  7. The diagnostic data anomaly detection method according to claim 5, wherein the fourth candidate information comprises a matching value for each disease type;
    fusing the third candidate information and the fourth candidate information to obtain the suspected disease information, and determining from the suspected disease information whether the diagnostic data is abnormal, comprises:
    calculating, for each disease type, a suspicion value from its matching average in the third candidate information and its matching value in the fourth candidate information;
    extracting the disease types whose suspicion value meets a preset condition to generate the suspected disease information;
    matching the target disease type in the diagnostic data against the disease types in the suspected disease information, wherein a match indicates that the diagnostic data is normal and no match indicates that the diagnostic data is abnormal.
  8. A diagnostic data anomaly detection apparatus, comprising:
    an acquisition unit, configured to acquire diagnostic data of a target patient;
    a first matching unit, configured to match the diagnostic data against preset medical rules to obtain first candidate information;
    a second matching unit, configured to match the diagnostic data against medical mining rules to obtain second candidate information;
    a fusion unit, configured to fuse the first candidate information and the second candidate information to generate third candidate information;
    a recognition unit, configured to recognize the diagnostic data using a disease recognition model to obtain fourth candidate information;
    a processing unit, configured to fuse the third candidate information and the fourth candidate information to obtain suspected disease information, and to determine from the suspected disease information whether the diagnostic data is abnormal.
  9. A diagnostic data anomaly detection device, wherein the diagnostic data anomaly detection device comprises a memory and at least one processor;
    the at least one processor invokes the instructions in the memory, so that the diagnostic data anomaly detection device performs the following steps of a diagnostic data anomaly detection method:
    acquiring diagnostic data of a target patient;
    matching the diagnostic data against preset medical rules to obtain first candidate information;
    matching the diagnostic data against medical mining rules to obtain second candidate information;
    fusing the first candidate information and the second candidate information to generate third candidate information;
    identifying the diagnostic data using a disease identification model to obtain fourth candidate information; and
    fusing the third candidate information and the fourth candidate information to obtain suspected disease information, and determining, according to the suspected disease information, whether the diagnostic data is abnormal.
  10. The diagnostic data anomaly detection device according to claim 9, wherein, when the processor performs the step of acquiring the diagnostic data of the target patient, the step comprises:
    receiving medical data of the target patient sent by a user terminal, the medical data comprising basic information of the target patient, a target disease type, and a plurality of medical entities; and
    extracting the medical entities from the medical data to generate the diagnostic data.
  11. The diagnostic data anomaly detection device according to claim 10, wherein the preset medical rules are rules set in advance on the basis of medical knowledge and comprise a plurality of medical rules, each medical rule comprises at least one of the medical entities, and each medical rule corresponds to one disease type;
    when the processor performs the step of matching the diagnostic data against the preset medical rules to obtain the first candidate information, the step comprises:
    matching the plurality of medical entities in the diagnostic data against each medical rule in the preset medical rules, to obtain a matching degree for each disease type that matches the diagnostic data; and
    extracting the matching degrees of all disease types that match the diagnostic data, to generate the first candidate information.
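The rule matching described in claim 11 can be sketched as below. The claim does not define how the matching degree is computed, so this sketch assumes it is the fraction of a rule's medical entities that appear in the diagnostic data. The `Rule` type, the function name `match_rules`, and the sample rules are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical rule representation: (disease type, medical entities in the rule).
Rule = Tuple[str, FrozenSet[str]]


def match_rules(entities: Set[str], rules: List[Rule]) -> Dict[str, float]:
    """Return {disease type: matching degree} for every rule that overlaps the data.

    Assumption: matching degree = share of the rule's entities found in the
    diagnostic data. If several rules map to one disease, the best score is kept.
    """
    candidates: Dict[str, float] = {}
    for disease, rule_entities in rules:
        if not rule_entities:
            continue  # an empty rule cannot match anything
        degree = len(entities & rule_entities) / len(rule_entities)
        if degree > 0:
            candidates[disease] = max(degree, candidates.get(disease, 0.0))
    return candidates


preset_rules = [
    ("influenza", frozenset({"fever", "cough", "muscle ache"})),
    ("gastritis", frozenset({"stomach pain", "nausea"})),
]
first_candidates = match_rules({"fever", "cough", "headache"}, preset_rules)
```

Here two of the three influenza entities are present, so influenza is a candidate, while gastritis shares no entity with the data and is left out.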
  12. The diagnostic data anomaly detection device according to claim 11, wherein, before the processor performs the step of matching the diagnostic data against the medical mining rules to obtain the second candidate information, the following is further performed:
    generating the medical mining rules from historical sample data, wherein:
    the historical sample data comprises a plurality of pieces of historical medical data, and each piece of historical medical data comprises the disease type of a historical patient and a plurality of the medical entities;
    the medical mining rules comprise a plurality of mining rules, each mining rule comprises at least one of the medical entities, and each mining rule corresponds to one disease type;
    classifying the historical medical data in the historical sample data by disease type to generate disease type sets; and
    screening the historical medical data in each disease type set using a frequent itemset mining algorithm to generate the mining rules corresponding to the disease type.
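The two steps of claim 12 (classify by disease type, then mine frequent entity sets) can be sketched as follows. The claim does not name a specific frequent itemset algorithm, so this sketch uses brute-force enumeration of small entity combinations rather than Apriori or FP-growth. The `min_support` and `max_size` parameters, the `Record` type, and the sample history are assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical record: (disease type of the historical patient, medical entities).
Record = Tuple[str, Set[str]]


def mine_rules(history: List[Record], min_support: float = 0.6,
               max_size: int = 3) -> Dict[str, List[FrozenSet[str]]]:
    # Step 1: classify the historical records by disease type.
    by_disease: Dict[str, List[Set[str]]] = defaultdict(list)
    for disease, entities in history:
        by_disease[disease].append(entities)

    # Step 2: within each disease type set, keep the entity sets whose support
    # (share of records containing them) reaches min_support.
    rules: Dict[str, List[FrozenSet[str]]] = {}
    for disease, records in by_disease.items():
        counts: Counter = Counter()
        for entities in records:
            for size in range(1, max_size + 1):
                for combo in combinations(sorted(entities), size):
                    counts[frozenset(combo)] += 1
        rules[disease] = [itemset for itemset, count in counts.items()
                          if count / len(records) >= min_support]
    return rules


history = [
    ("influenza", {"fever", "cough"}),
    ("influenza", {"fever", "cough", "headache"}),
    ("influenza", {"fever", "sore throat"}),
    ("gastritis", {"stomach pain", "nausea"}),
]
mined = mine_rules(history)
```

Brute-force enumeration is only practical for small entity sets. A production system would more likely use a dedicated algorithm with support-based pruning.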
  13. The diagnostic data anomaly detection device according to claim 12, wherein, when the processor performs the step of matching the diagnostic data against the medical mining rules to obtain the second candidate information, the step comprises:
    matching the plurality of medical entities in the diagnostic data against each mining rule in the medical mining rules, to obtain a matching degree for each disease type that matches the diagnostic data; and
    extracting the matching degrees of all disease types that match the diagnostic data, to generate the second candidate information.
  14. The diagnostic data anomaly detection device according to claim 13, wherein, when the processor performs the step of fusing the first candidate information and the second candidate information to generate the third candidate information, the step comprises:
    according to the disease types in the first candidate information and the disease types in the second candidate information, calculating, for each same disease type, a matching average of its matching degree in the first candidate information and its matching degree in the second candidate information, and generating the third candidate information comprising the matching average of each disease type.
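The averaging fusion of claim 14 can be sketched as below. The claim does not say what happens when a disease type appears in only one of the two candidate sets, so this sketch assumes the missing matching degree counts as 0.0. The function name and sample values are hypothetical.

```python
from typing import Dict


def fuse_by_average(first: Dict[str, float],
                    second: Dict[str, float]) -> Dict[str, float]:
    """Average the matching degrees of each disease type across both sets.

    Assumption: a disease type absent from one set contributes 0.0 there.
    """
    diseases = first.keys() | second.keys()
    return {d: (first.get(d, 0.0) + second.get(d, 0.0)) / 2 for d in diseases}


third_candidates = fuse_by_average(
    {"influenza": 0.75, "gastritis": 0.5},  # first candidate information
    {"influenza": 0.25},                    # second candidate information
)
```

An alternative reading would keep only disease types present in both sets. Which is intended depends on the description, not on the claim text alone.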
  15. The diagnostic data anomaly detection device according to claim 13, wherein the fourth candidate information comprises matching values of disease types;
    when the processor performs the step of fusing the third candidate information and the fourth candidate information to obtain the suspected disease information and determining, according to the suspected disease information, whether the diagnostic data is abnormal, the step comprises:
    calculating, for each same disease type, a suspicion value from its matching average in the third candidate information and its matching value in the fourth candidate information;
    extracting the disease types whose suspicion value meets a preset condition, to generate the suspected disease information; and
    matching the target disease type in the diagnostic data against the disease types in the suspected disease information, wherein a match indicates that the diagnostic data is normal and a mismatch indicates that the diagnostic data is abnormal.
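The final decision of claim 15 can be sketched as follows. The claim specifies neither the suspicion value formula nor the preset condition, so this sketch assumes a weighted sum compared against a threshold. `model_weight`, `threshold`, and both function names are hypothetical.

```python
from typing import Dict, Set


def suspected_diseases(third: Dict[str, float], fourth: Dict[str, float],
                       model_weight: float = 0.5,
                       threshold: float = 0.5) -> Set[str]:
    """Return the disease types whose suspicion value reaches the threshold.

    Assumption: suspicion value = weighted sum of the rule-based matching
    average (third) and the model's matching value (fourth).
    """
    suspected: Set[str] = set()
    for disease in third.keys() | fourth.keys():
        value = ((1 - model_weight) * third.get(disease, 0.0)
                 + model_weight * fourth.get(disease, 0.0))
        if value >= threshold:
            suspected.add(disease)
    return suspected


def is_abnormal(target_disease: str, suspected: Set[str]) -> bool:
    """The data is abnormal when the recorded disease is not among the suspects."""
    return target_disease not in suspected


suspects = suspected_diseases(
    {"influenza": 0.5, "gastritis": 0.25},   # third candidate information
    {"influenza": 0.75, "gastritis": 0.25},  # fourth candidate information
)
flag_for_gastritis = is_abnormal("gastritis", suspects)
```

With these sample values only influenza reaches the threshold, so a record whose target disease type is gastritis would be flagged as abnormal.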
  16. A computer-readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the following steps of a diagnostic data anomaly detection method:
    acquiring diagnostic data of a target patient;
    matching the diagnostic data against preset medical rules to obtain first candidate information;
    matching the diagnostic data against medical mining rules to obtain second candidate information;
    fusing the first candidate information and the second candidate information to generate third candidate information;
    identifying the diagnostic data using a disease identification model to obtain fourth candidate information; and
    fusing the third candidate information and the fourth candidate information to obtain suspected disease information, and determining, according to the suspected disease information, whether the diagnostic data is abnormal.
  17. The computer-readable storage medium according to claim 16, wherein, when the instructions for diagnostic data anomaly detection are executed by the processor to perform the step of acquiring the diagnostic data of the target patient, the step comprises:
    receiving medical data of the target patient sent by a user terminal, the medical data comprising basic information of the target patient, a target disease type, and a plurality of medical entities; and
    extracting the medical entities from the medical data to generate the diagnostic data.
  18. The computer-readable storage medium according to claim 17, wherein the preset medical rules are rules set in advance on the basis of medical knowledge and comprise a plurality of medical rules, each medical rule comprises at least one of the medical entities, and each medical rule corresponds to one disease type;
    when the instructions for diagnostic data anomaly detection are executed by the processor to perform the step of matching the diagnostic data against the preset medical rules to obtain the first candidate information, the step comprises:
    matching the plurality of medical entities in the diagnostic data against each medical rule in the preset medical rules, to obtain a matching degree for each disease type that matches the diagnostic data; and
    extracting the matching degrees of all disease types that match the diagnostic data, to generate the first candidate information.
  19. The computer-readable storage medium according to claim 18, wherein, before the instructions for diagnostic data anomaly detection are executed by the processor to perform the step of matching the diagnostic data against the medical mining rules to obtain the second candidate information, the following is further performed:
    generating the medical mining rules from historical sample data, wherein:
    the historical sample data comprises a plurality of pieces of historical medical data, and each piece of historical medical data comprises the disease type of a historical patient and a plurality of the medical entities;
    the medical mining rules comprise a plurality of mining rules, each mining rule comprises at least one of the medical entities, and each mining rule corresponds to one disease type;
    classifying the historical medical data in the historical sample data by disease type to generate disease type sets; and
    screening the historical medical data in each disease type set using a frequent itemset mining algorithm to generate the mining rules corresponding to the disease type.
  20. The computer-readable storage medium according to claim 19, wherein, when the instructions for diagnostic data anomaly detection are executed by the processor to perform the step of matching the diagnostic data against the medical mining rules to obtain the second candidate information, the step comprises:
    matching the plurality of medical entities in the diagnostic data against each mining rule in the medical mining rules, to obtain a matching degree for each disease type that matches the diagnostic data; and
    extracting the matching degrees of all disease types that match the diagnostic data, to generate the second candidate information.
PCT/CN2021/083622 2020-10-27 2021-03-29 Method and apparatus for detecting anomaly in diagnostic data, and computer device and storage medium WO2021180242A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011161090.4 2020-10-27
CN202011161090.4A CN112365987B (en) 2020-10-27 2020-10-27 Diagnostic data abnormality detection method, diagnostic data abnormality detection device, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021180242A1 (en) 2021-09-16

Family

ID=74510908

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/083622 WO2021180242A1 (en) 2020-10-27 2021-03-29 Method and apparatus for detecting anomaly in diagnostic data, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN112365987B (en)
WO (1) WO2021180242A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850499A (en) * 2021-09-23 2021-12-28 平安银行股份有限公司 Data processing method and device, electronic equipment and storage medium
CN114334167A (en) * 2021-12-31 2022-04-12 医渡云(北京)技术有限公司 Medical data mining method and device, storage medium and electronic equipment
CN114783581A (en) * 2022-06-22 2022-07-22 北京惠每云科技有限公司 Reporting method and reporting device for single disease type data
CN116798636A (en) * 2022-03-14 2023-09-22 数坤(北京)网络科技股份有限公司 Medical diagnostic method and related apparatus

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112365987B (en) * 2020-10-27 2023-06-06 平安科技(深圳)有限公司 Diagnostic data abnormality detection method, diagnostic data abnormality detection device, computer device, and storage medium
CN112885479B (en) * 2021-02-23 2023-05-16 武汉大学 Method and device for realizing data item comparison verification in medical data
CN113096752B (en) * 2021-03-01 2023-09-29 北京联袂义齿技术有限公司 Stomatology data arrangement analysis system
CN113051373B (en) * 2021-04-19 2024-02-13 讯飞医疗科技股份有限公司 Text analysis method, text analysis device, electronic equipment and storage medium
CN113096799B (en) * 2021-04-25 2024-04-02 北京百度网讯科技有限公司 Quality control method and device
CN113241135B (en) * 2021-04-30 2023-05-05 山东大学 Disease risk prediction method and system based on multi-modal fusion
CN113823414B (en) * 2021-08-23 2024-04-05 杭州火树科技有限公司 Main diagnosis and main operation matching detection method, device, computing equipment and storage medium
CN114400091B (en) * 2022-01-22 2022-11-08 深圳市携康网络科技有限公司 Medical prevention fusion system based on informatization
CN114496131B (en) * 2022-01-22 2022-10-04 深圳市携康网络科技有限公司 Family doctor informatization system
CN114822865B (en) * 2022-06-27 2022-11-11 天津幸福生命科技有限公司 Diagnostic data identification method and device, electronic equipment and storage medium
CN116682551B (en) * 2023-07-27 2023-12-22 腾讯科技(深圳)有限公司 Disease prediction method, disease prediction model training method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090259494A1 (en) * 2005-05-20 2009-10-15 Carlos Feder Computer-implemented medical analytics method and system employing a modified mini-max procedure
CN109636623A (en) * 2018-10-19 2019-04-16 平安医疗健康管理股份有限公司 Medical data method for detecting abnormality, device, equipment and storage medium
CN109659035A (en) * 2018-12-13 2019-04-19 平安医疗健康管理股份有限公司 Medical data exception recognition methods, equipment and storage medium based on machine learning
CN112365987A (en) * 2020-10-27 2021-02-12 平安科技(深圳)有限公司 Diagnostic data anomaly detection method and device, computer equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8412541B2 (en) * 2003-08-14 2013-04-02 Edda Technology, Inc. Method and system for intelligent qualitative and quantitative analysis for medical diagnosis
CN110379520A (en) * 2019-06-18 2019-10-25 北京百度网讯科技有限公司 The method for digging and device of medical knowledge map, computer equipment and readable medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113850499A (en) * 2021-09-23 2021-12-28 平安银行股份有限公司 Data processing method and device, electronic equipment and storage medium
CN113850499B (en) * 2021-09-23 2024-04-09 平安银行股份有限公司 Data processing method and device, electronic equipment and storage medium
CN114334167A (en) * 2021-12-31 2022-04-12 医渡云(北京)技术有限公司 Medical data mining method and device, storage medium and electronic equipment
CN116798636A (en) * 2022-03-14 2023-09-22 数坤(北京)网络科技股份有限公司 Medical diagnostic method and related apparatus
CN116798636B (en) * 2022-03-14 2024-03-26 数坤(北京)网络科技股份有限公司 Medical diagnostic method and related apparatus
CN114783581A (en) * 2022-06-22 2022-07-22 北京惠每云科技有限公司 Reporting method and reporting device for single disease type data

Also Published As

Publication number Publication date
CN112365987A (en) 2021-02-12
CN112365987B (en) 2023-06-06


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21767403

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21767403

Country of ref document: EP

Kind code of ref document: A1