CN113792115B - Entity correlation determination method, device, electronic equipment and storage medium - Google Patents

Publication number
CN113792115B
Authority
CN
China
Prior art keywords
entity
matched
determining
correlation
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110944911.XA
Other languages
Chinese (zh)
Other versions
CN113792115A (en
Inventor
张正东
陈俊
代小亚
王磊
黄海峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110944911.XA
Publication of CN113792115A
Application granted
Publication of CN113792115B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Abstract

The disclosure provides a method, a device, electronic equipment and a storage medium for determining entity correlation, and relates to the technical field of computers, in particular to artificial intelligence fields such as natural language processing and knowledge graphs. The specific implementation scheme is as follows: acquiring an entity pair, wherein the entity pair comprises an entity to be matched and a reference entity; determining a reference type corresponding to the reference entity according to the priority order of a plurality of candidate types, wherein the reference type is one of the candidate types; and determining the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type. In this way, the correlation between the entity to be matched and the reference entity can be effectively determined, the influence of human subjective factors on the accuracy of the correlation determination is reduced, and both the efficiency and the effect of correlation determination between entities are improved.

Description

Entity correlation determination method, device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of computers, in particular to the technical field of artificial intelligence such as natural language processing and knowledge graph, and specifically relates to a method and a device for determining entity correlation, electronic equipment and a storage medium.
Background
Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (e.g., learning, reasoning, thinking and planning), and it includes both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, voice recognition, natural language processing, machine learning, deep learning, big data processing, knowledge graph technology, and the like.
In the medical service scenario, the supervision of medical insurance is an important part of medical insurance management. It is of great significance for promoting the reasonable use of the medical insurance fund, improving the efficiency with which the fund is used, saving unnecessary expenses for medical insurance departments, and providing better medical services for insured persons. The medical scenario involves various entities, such as diagnostic items, medical consumption items and operation items.
Disclosure of Invention
Provided are an entity relevance determining method, an entity relevance determining device, an electronic device, a storage medium and a computer program product.
According to a first aspect, there is provided a method of determining entity relevance, comprising: acquiring an entity pair, wherein the entity pair comprises: an entity to be matched and a reference entity; determining a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types; and determining the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type.
According to a second aspect, there is provided an entity-relevance determining apparatus, comprising: the first acquisition module is used for acquiring an entity pair, wherein the entity pair comprises: an entity to be matched and a reference entity; the determining module is used for determining a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types; and the matching module is used for determining the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type.
According to a third aspect, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the entity correlation determination method according to the embodiments of the present disclosure.
According to a fourth aspect, a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the entity correlation determination method proposed by the embodiments of the present disclosure is presented.
According to a fifth aspect, a computer program product is presented, comprising a computer program, which, when being executed by a processor, implements the steps of the entity-relevance determining method presented by embodiments of the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a label mapping process provided in accordance with an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of entity relationships provided in accordance with an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a network architecture of a tag identification model provided in accordance with an embodiment of the present disclosure;
FIG. 7 is a schematic diagram according to a fourth embodiment of the present disclosure;
FIG. 8 is a schematic diagram of the construction of a historical entity knowledge base provided in accordance with an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of an entity correlation determination system provided in accordance with an implementation of the present disclosure;
FIG. 10 is a schematic diagram according to a fifth embodiment of the present disclosure;
FIG. 11 is a schematic diagram according to a sixth embodiment of the present disclosure;
FIG. 12 is a block diagram of an electronic device for implementing the entity correlation determination method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram according to a first embodiment of the present disclosure.
It should be noted that, the execution body of the entity correlation determination method in this embodiment is an entity correlation determination apparatus, and the apparatus may be implemented in a software and/or hardware manner, and the apparatus may be configured in an electronic device, where the electronic device may include, but is not limited to, a terminal, a server, and so on.
The present disclosure relates to the technical field of computers, in particular to artificial intelligence fields such as natural language processing and knowledge graphs. It can effectively determine the correlation between entities, reduce the influence of human subjective factors on the accuracy of correlation determination, and improve both the efficiency and the effect of correlation determination between entities.
Artificial intelligence, abbreviated AI, is a new technical science that studies and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence.
Natural language processing (Natural Language Processing, NLP), which is an important direction in the fields of computer science and artificial intelligence, is used for researching various theories and methods capable of realizing effective communication between people and computers by natural language, and is mainly applied to the aspects of machine translation, public opinion monitoring, automatic abstract, text classification, voice recognition, text semantic comparison and the like.
A knowledge graph is a modern theory that combines the theories and methods of disciplines such as applied mathematics, graphics, information visualization and information science with methods such as bibliometric citation analysis and co-occurrence analysis, and uses visual graphs to display the core structure, development history, frontier fields and overall knowledge architecture of a discipline, thereby achieving multi-disciplinary fusion.
As shown in fig. 1, the entity correlation determination method includes:
S101: Acquiring an entity pair, wherein the entity pair comprises an entity to be matched and a reference entity.
In an embodiment of the present disclosure, one or more entity pairs are first acquired.
An entity refers to an object or thing that exists objectively and can be distinguished from other objects or things. It may be a concrete concept or an abstract concept, and may be expressed by an entity word, without limitation.
In a specific example, the entity correlation determination method may be applied to a scenario of medical insurance payment supervision, and the entity may be an entity related to the medical field, for example: diagnostic items, operational items, medical consumption items, and any other possible medical entity, are not limited in this regard.
A combination of a plurality of entities (entity words) may be referred to as an entity pair. An entity pair of an embodiment of the present disclosure may include two entities, namely an entity to be matched and a reference entity. An entity pair may be expressed in the form <entity_1, entity_2>, where entity_1 represents the entity to be matched and entity_2 represents the reference entity.
The entity that serves as the reference in the entity pair may be called the reference entity, and the entity whose correlation with the reference entity is to be matched may be called the entity to be matched.
For example, in the above medical insurance payment supervision scenario, an entity pair may consist of a diagnostic item and a medical consumption item, or of an operation item and a medical consumption item, and may be expressed as <diagnostic item, medical consumption item> or <operation item, medical consumption item>. The diagnostic item or operation item corresponds to the entity to be matched, and the medical consumption item corresponds to the reference entity. That is, the embodiments of the present disclosure may determine the correlation (or association) between a diagnostic item or operation item and a medical consumption item.
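As an illustration only, the <entity_1, entity_2> form can be represented by a small record type. The following is a minimal sketch; the class name `EntityPair` and its field names are assumptions made for this example and do not come from the disclosure.

```python
from typing import NamedTuple


class EntityPair(NamedTuple):
    """Hypothetical container for an <entity_1, entity_2> pair."""
    to_match: str   # entity_1: the entity to be matched (e.g. a diagnostic item)
    reference: str  # entity_2: the reference entity (e.g. a medical consumption item)


# One example pair taken from the scenario described in this document.
pair = EntityPair(
    to_match="knee arthropathy",
    reference="craniocerebral magnetic resonance imaging",
)

entity_1, entity_2 = pair  # a NamedTuple unpacks in field order
```

Keeping the two roles as named fields avoids confusing which side is the reference entity when the pair is passed between processing steps.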
S102: Determining a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types.
The type to which the reference entity (e.g., medical consumption item) belongs may be referred to as a reference type, different reference entities may have different reference types, or the same reference entity may have multiple reference types, which is not limited.
In determining the reference type, the reference type corresponding to the reference entity may be determined according to a plurality of description information corresponding to a plurality of candidate types, respectively.
The plurality of candidate types include, for example, a general type, a tag type, a target type, and any other possible type, without limitation. The reference type may be any one of the plurality of candidate types, for example the general type, the tag type, or the target type.
A reference entity of the general type may be a general item in the medical field that has no specific applicable disease, symptom or operation, for example a routine item performed on admission to hospital, such as registration, a blood routine test, a urine routine test, or any other possible general item; it may also be a general item of an individual medical department, without limitation.
The tag type is related to the semantic features of the reference entity. That is, a corresponding tag can be determined according to the semantics of the reference entity, and the tag can be used as the reference type of the reference entity.
A reference entity that is of neither the general type nor the tag type may be said to have the target type. The target type is related to, for example, historical medical record data, authoritative medical books, pharmacopoeias and medical record data, without limitation.
Information describing the features of a candidate type may be referred to as description information.
In some embodiments, the description information may be a priority order corresponding to the corresponding candidate type, the priority orders corresponding to the different candidate types are different, and the priority order may be flexibly set according to an actual application scenario, which is not limited.
For example, where the description information is the priority order of the corresponding candidate type, the plurality of candidate types may be sorted from high to low priority, for example as general type, tag type, target type. When determining the reference type corresponding to the reference entity, it may then be determined in turn whether the reference type is the general type, the tag type or the target type.
That is, it is first determined whether the reference type is the general type; if it is not, it is further determined whether the reference type is the tag type; and if the reference type is not the tag type either, the reference type of the reference entity is determined to be the target type.
Whether the reference type is the general type may be determined according to a pre-constructed general knowledge base: if the reference entity exists in the general knowledge base, the reference type is determined to be the general type, as described in the following embodiments. Whether the reference type is the tag type may be determined by first determining the tag of the reference entity according to its semantic features and then checking whether that tag exists in a pre-constructed tag knowledge base; if the corresponding tag exists, the reference type is determined to be the tag type.
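The priority-ordered check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the knowledge bases are modeled as plain Python sets, `tag_of` is a hypothetical callable standing in for the semantic tagging step, and all sample data is invented.

```python
from enum import Enum


class ReferenceType(Enum):
    GENERAL = "general"
    TAG = "tag"
    TARGET = "target"


def determine_reference_type(reference_entity, general_kb, tag_kb, tag_of):
    """Check candidate types in priority order: general, then tag, then target."""
    # 1. General type: the reference entity itself is in the general knowledge base.
    if reference_entity in general_kb:
        return ReferenceType.GENERAL
    # 2. Tag type: the tag derived from the entity is in the tag knowledge base.
    tag = tag_of(reference_entity)  # hypothetical tagging step; may return None
    if tag is not None and tag in tag_kb:
        return ReferenceType.TAG
    # 3. Neither of the above: fall through to the target type.
    return ReferenceType.TARGET


# Invented toy data, for illustration only.
general_kb = {"registration", "blood routine test", "urine routine test"}
tag_kb = {"knee"}
toy_tags = {"knee joint X-ray": "knee"}

for entity in ("registration", "knee joint X-ray", "some other item"):
    print(entity, "->", determine_reference_type(entity, general_kb, tag_kb, toy_tags.get))
```

Because the general-type check is a simple membership test, it is tried first and the more expensive tagging step only runs when that test fails.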
In other embodiments, the description information may also be a classification category corresponding to the corresponding candidate type. Different candidate types correspond to different classification categories, and the classification categories may be set flexibly according to the actual application scenario, without limitation.
In practical applications, a neural network classification model for identifying the category to which a candidate type belongs can be trained. The classification category corresponding to the reference entity is determined by the classification model, and the reference type corresponding to the reference entity is then determined according to the classification category.
For example, the classification model may classify a reference entity as being of the general type or of the tag type, and a reference entity that is classified as neither may be determined to be of the target type.
It should be understood that the foregoing embodiments are merely illustrative of determining whether the reference type belongs to a general type or a tag type, and the reference type may be determined in any other possible manner in practical applications, which is not limited thereto.
S103: Determining the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type.
After the reference type of the reference entity is determined, further, a correlation matching method corresponding to the reference type is adopted to determine the correlation between the entity to be matched and the reference entity.
That is, different reference types may correspond to different correlation matching methods, and in the operation of determining the correlation between the entity to be matched and the reference entity, the correlation matching method corresponding to the reference type may be employed to determine the correlation.
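One way to picture "a matching method per reference type" is a dispatch table that maps each type to a matcher function. The sketch below is only an illustration: the function names, the string keys and both stand-in matchers are hypothetical, and the real matching methods are those described in the embodiments.

```python
from typing import Callable, Dict

# A matcher takes (entity_to_match, reference_entity) and reports whether they are correlated.
Matcher = Callable[[str, str], bool]


def make_relevance_dispatcher(matchers: Dict[str, Matcher]) -> Callable[[str, str, str], bool]:
    """Build a function that applies the matcher registered for a reference type."""

    def determine_relevance(reference_type: str, to_match: str, reference: str) -> bool:
        if reference_type not in matchers:
            raise ValueError(f"no matching method registered for type {reference_type!r}")
        return matchers[reference_type](to_match, reference)

    return determine_relevance


# Hypothetical stand-in matchers, for illustration only.
def match_general(to_match: str, reference: str) -> bool:
    # Stand-in: a general-type reference entity is treated as correlated.
    return True


TOY_TAG_PAIRS = {("knee arthropathy", "knee joint X-ray")}  # invented data


def match_by_toy_table(to_match: str, reference: str) -> bool:
    # Stand-in for a tag-based method: look the pair up in a hand-written table.
    return (to_match, reference) in TOY_TAG_PAIRS


determine_relevance = make_relevance_dispatcher(
    {"general": match_general, "tag": match_by_toy_table}
)
print(determine_relevance("tag", "knee arthropathy", "knee joint X-ray"))
```

A table of this kind keeps the type determination step separate from the matching step, so a new candidate type only needs a new entry rather than a change to the calling code.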
The correlation may also be called relevance, and describes the degree of association between the entity to be matched and the reference entity.
For example, consider the entity pair <lumbar disc herniation, lumbar X-ray computed tomography (CT) plain scan>, where lumbar disc herniation is a diagnostic item and corresponds to the entity to be matched, and the lumbar X-ray computed tomography (CT) plain scan is a medical consumption item and corresponds to the reference entity. In the medical field, lumbar disc herniation is commonly diagnosed using a lumbar X-ray CT plain scan, so there is a reasonable correlation between the two, and in this example it can be determined that the entity to be matched and the reference entity are correlated.
For another example, in the entity pair <knee arthropathy, craniocerebral magnetic resonance imaging>, knee arthropathy is a diagnostic item and corresponds to the entity to be matched, and craniocerebral magnetic resonance imaging is a medical consumption item and corresponds to the reference entity. There is no reasonable correlation between knee arthropathy and craniocerebral magnetic resonance imaging, so it may be determined that the entity to be matched and the reference entity are not correlated.
Further, subsequent processing operations may be performed according to the correlation between the entity to be matched and the reference entity in the entity pair, for example: judging whether unreasonable medical behaviors exist in the medical insurance payment supervision process, and further realizing effective accounting and supervision of medical insurance payment.
In the embodiments of the present disclosure, an entity pair comprising an entity to be matched and a reference entity is acquired; a reference type corresponding to the reference entity is determined according to the priority order of a plurality of candidate types, the reference type being one of the candidate types; and the correlation between the entity to be matched and the reference entity is determined by adopting a correlation matching method corresponding to the reference type. In this way, the correlation between the entity to be matched and the reference entity can be effectively determined, the influence of human subjective factors on the accuracy of the correlation determination is reduced, and both the efficiency and the effect of correlation determination between entities are improved.
Note that, in this embodiment, the medical insurance payment supervision is not performed for a specific user, and cannot reflect personal information of a specific user.
In this embodiment, the related medical data information may be obtained in various public and legal manners, for example, may be obtained from a public data set, or may be obtained from a user through authorization of the user, and the processing procedure accords with related laws and regulations.
Fig. 2 is a schematic diagram according to a second embodiment of the present disclosure.
As shown in fig. 2, the entity correlation determination method includes:
S201: Acquiring a plurality of candidate entities, wherein each of the plurality of candidate entities has a corresponding entity usage proportion value.
Wherein an entity under any scenario may be referred to as a candidate entity, for example: in the above scenario of medical insurance payment supervision, any item in the medical process (including any diagnostic item, operational item, and medical consumption item) may be referred to as a candidate entity.
The candidate entity may have a corresponding entity usage proportion value, where the entity usage proportion value is used to describe a usage coverage of the candidate entity in the corresponding scenario, or is used to describe a frequency of occurrence of the candidate entity in the corresponding scenario, or may also describe any other possible usage proportion, which is not limited.
In practical applications, patient data may be mined, and the proportion of patients covered by a candidate entity may be calculated and used as the entity usage proportion value; the entity usage proportion value may also be calculated in any other possible manner, without limitation.
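A patient-coverage proportion of this kind could be computed along the following lines. This is a minimal sketch under assumptions: each patient record is modeled as a list of item names, the function name is hypothetical, and the sample records are invented.

```python
from collections import Counter


def entity_usage_proportions(patient_records):
    """Return, for each entity, the fraction of patients whose record contains it."""
    total = len(patient_records)
    if total == 0:
        return {}
    counts = Counter()
    for items in patient_records:
        counts.update(set(items))  # count each entity at most once per patient
    return {entity: n / total for entity, n in counts.items()}


# Invented toy records: one list of items per patient.
records = [
    ["registration", "blood routine test"],
    ["registration", "knee joint X-ray"],
    ["registration", "blood routine test", "blood routine test"],
    ["registration"],
]
proportions = entity_usage_proportions(records)
print(proportions)
```

Converting each record to a set first means that an item repeated within one patient's record does not inflate that item's coverage.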
S202: Acquiring the candidate entities whose entity usage proportion values are greater than or equal to a proportion threshold, and constructing a general knowledge base from these candidate entities.
After the plurality of candidate entities are obtained, the candidate entities whose entity usage proportion values are greater than or equal to the proportion threshold are acquired, and the general knowledge base is constructed from them.
That is, entities whose usage proportion value is greater than or equal to the proportion threshold are selected from the plurality of candidate entities to construct a knowledge base, which may be referred to as a general knowledge base.
The proportion threshold may be determined according to the actual application scenario. For example, the proportion threshold in the embodiments of the present disclosure may be 90%, that is, candidate entities whose entity usage proportion value is greater than or equal to 90% are added to the general knowledge base. Creating the general knowledge base in this way facilitates the subsequent determination of the reference type of a reference entity from it.
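The threshold selection can be sketched as a simple filter. The snippet below is illustrative only: the function name is hypothetical, the knowledge base is modeled as a Python set, and the proportion values are invented; only the 90% example threshold comes from the text.

```python
def build_general_knowledge_base(usage_proportions, threshold=0.9):
    """Keep the candidate entities whose usage proportion is >= the threshold."""
    return {entity for entity, p in usage_proportions.items() if p >= threshold}


# Invented proportion values, for illustration only.
usage_proportions = {
    "registration": 0.98,
    "blood routine test": 0.90,
    "craniocerebral magnetic resonance imaging": 0.07,
}
general_kb = build_general_knowledge_base(usage_proportions)
print(sorted(general_kb))
```

Note that the comparison is inclusive, so an entity sitting exactly at the threshold is kept.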
S203: acquiring an entity pair, wherein the entity pair comprises: an entity to be matched and a reference entity.
S204: Determining a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types.
The descriptions of S203 to S204 may be specifically referred to the above embodiments, and are not repeated herein.
S205: If a candidate entity corresponding to the reference entity can be retrieved from the general knowledge base, determining that the reference type is the general type.
Further, a candidate entity corresponding to the reference entity may be searched for in the general knowledge base, and if a corresponding candidate entity exists in the general knowledge base, the reference type is determined to be the general type.
For example, for a reference entity such as a lumbar X-ray computed tomography (CT) plain scan, the general knowledge base is searched to check whether the lumbar X-ray CT plain scan is present; if it is present, the lumbar X-ray CT plain scan is of the general type. Searching the general knowledge base thus makes it possible to determine quickly whether a corresponding candidate entity exists, which improves the speed of correlation matching.
S206: If the reference type is the general type, determining that the entity to be matched and the reference entity are correlated.
Further, in the embodiments of the present disclosure, when the correlation between the entity to be matched and the reference entity is determined by adopting the correlation matching method corresponding to the reference type, it may be determined whether the reference type is the general type, and if so, it is determined that the entity to be matched and the reference entity are correlated. Because the correlation can be determined directly from the general type, computational complexity and the computing resources occupied are reduced, the efficiency of correlation matching is improved, and the processing speed of medical supervision can be further improved.
In the embodiments of the present disclosure, an entity pair comprising an entity to be matched and a reference entity is acquired; a reference type corresponding to the reference entity is determined according to the priority order of a plurality of candidate types, the reference type being one of the candidate types; and the correlation between the entity to be matched and the reference entity is determined by adopting a correlation matching method corresponding to the reference type. In this way, the correlation between the entity to be matched and the reference entity can be effectively determined, the influence of human subjective factors on the accuracy of the correlation determination is reduced, and both the efficiency and the effect of correlation determination between entities are improved. In addition, searching the general knowledge base makes it possible to determine quickly whether a corresponding candidate entity exists, which improves the speed of correlation matching. Furthermore, because the correlation can be determined directly from the general type, computational complexity and the computing resources occupied are reduced, the efficiency of correlation matching is improved, and the processing speed of medical supervision can be further improved.
Fig. 3 is a schematic diagram according to a third embodiment of the present disclosure.
As shown in fig. 3, the entity correlation determination method includes:
S301: Acquiring an entity pair, wherein the entity pair comprises an entity to be matched and a reference entity.
S302: Determining a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types.
The descriptions of S301 to S302 may be specifically referred to the above embodiments, and are not repeated herein.
S303: if the reference type is the tag type, determining tag information to be matched of the entity to be matched, and determining the reference tag information of the reference entity.
In the embodiment of the disclosure, in an operation of determining a correlation between an entity to be matched and a reference entity by adopting a correlation matching method corresponding to a reference type, if the reference type is a tag type, determining tag information to be matched of the entity to be matched, and determining reference tag information of the reference entity.
That is, the entity of the embodiments of the present disclosure may have corresponding tag information, including, for example, a body part, a body system, a disease type, and the like, without limitation.
The tag information corresponding to the entity to be matched may be referred to as tag information to be matched, and the tag information corresponding to the reference entity may be referred to as reference tag information.
In some embodiments, the entity to be matched may be subjected to label mapping processing to obtain label information to be matched.
The method may, for example, use a machine learning-based tag mapping algorithm to perform tag mapping processing on the entity to be matched to obtain the tag information to be matched, or use any other possible method to perform tag mapping processing on the entity to be matched, which is not limited.
Similarly, in the embodiment of the present disclosure, tag mapping processing may be performed on a reference entity to obtain reference tag information, where a processing manner is similar to the tag mapping processing manner of the entity to be matched, and is not described herein again.
For example, fig. 4 is a schematic diagram of a tag mapping process provided according to an embodiment of the present disclosure. As shown in fig. 4, the medical entity word "left knee joint injury" may be a reference entity or an entity to be matched. The medical entity word is processed by a tag mapping algorithm to obtain the corresponding tag information, for example, body part: knee, bone, knee joint; body system: locomotor system; disease type: injury disease. In this way, the tag information to be matched of the entity to be matched and the reference tag information of the reference entity can be obtained, and the correlation between the entity to be matched and the reference entity can be determined according to the tag information.
In some embodiments, in the process of determining the tag information to be matched or the reference tag information, semantic analysis may also be performed on the entity to be determined, so as to obtain semantic features of the entity. The entity to be determined may be an entity to be matched, or may also be a reference entity, that is, entity semantic features of the entity to be matched and/or the reference entity may be determined.
Any possible manner or algorithm may be used to perform semantic parsing on the entity to be matched or the reference entity, for example, identifying the entity semantic features from the entity to be determined by using a deep learning network model, which is not limited.
Further, the target tag information is determined according to the entity semantic features in combination with a setting mode, where the target tag information is the tag information to be matched or the reference tag information. That is, the setting mode and the entity semantic features can be used together to determine the tag information to be matched or the reference tag information. Therefore, semantic information of the entity can be taken into account in the process of determining the entity correlation, which improves the robustness and transferability of the medical insurance system in actual supervision.
In some embodiments, candidate semantic features matched with the entity semantic features can be determined from a tag knowledge base, and candidate tag information to which the candidate semantic features belong is used as target tag information.
The tag knowledge base may be constructed in advance, for example, by manual labeling, to establish a correspondence between entity words or entity semantic features and tags. Alternatively, a mapping relationship between keywords of entity semantic features and tags may be established, for example, a mapping between keywords such as "upper respiratory tract" and "respiratory system" and the corresponding tags.
The candidate semantic features are features constructed in the tag knowledge base, and have corresponding candidate tag information, that is, the corresponding relationship between the candidate semantic features and the candidate tag information, or the mapping relationship is recorded in the tag knowledge base.
In the operation of determining the target tag information, whether corresponding candidate semantic features exist or not can be queried in a tag knowledge base according to entity semantic features, and when the candidate semantic features exist, the candidate tag information to which the candidate semantic features belong is used as the target tag information.
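The lookup described above can be sketched as follows. This is a minimal illustration in Python, assuming the tag knowledge base is a plain in-memory mapping from feature keywords to tag information; the names `TAG_KNOWLEDGE_BASE` and `lookup_target_tags` and the sample entries are hypothetical and do not come from the disclosure.

```python
from typing import Dict, List, Optional

# Hypothetical tag knowledge base: keyword of an entity semantic feature -> tag information.
# The entries are illustrative assumptions, not data from the disclosure.
TAG_KNOWLEDGE_BASE: Dict[str, Dict[str, str]] = {
    "upper respiratory tract": {"body system": "respiratory system"},
    "knee joint": {"body part": "knee", "body system": "locomotor system"},
}


def lookup_target_tags(entity_semantic_features: List[str]) -> Optional[Dict[str, str]]:
    """Merge the candidate tag information of every matching feature; None if nothing matches."""
    target: Dict[str, str] = {}
    for feature in entity_semantic_features:
        candidate = TAG_KNOWLEDGE_BASE.get(feature.lower())
        if candidate is not None:
            target.update(candidate)
    return target or None
```

Returning `None` when no candidate semantic feature exists lets a caller fall back to one of the other ways of determining the target tag information.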
In other embodiments, the associated entity related to the entity to be determined may also be determined according to the semantic features of the entity.
For example, in the above scenario of medical insurance payment supervision, an entity that has a same-department relationship, a same-condition relationship, or any other identical or similar relationship with the entity to be determined may be referred to as an associated entity, and the associated entity may have corresponding associated tag information.
In the operation of determining the target tag information, the associated entity related to the entity to be determined can be determined according to the semantic features of the entity, and the associated tag information of the associated entity is used as the target tag information.
In practical applications, a tag propagation model may be employed to determine the associated entity. The core idea of tag propagation is that similar data have the same tag, and the method mainly comprises two steps: building a similarity matrix and propagating tags. Fig. 5 is a schematic diagram of entity relationships provided according to an embodiment of the present disclosure. As shown in fig. 5, starting from known associated entities (both tagged and untagged), entity words are represented as nodes, and a connecting line between two nodes represents a relationship between the two entities, such as a context relationship, a same-department relationship, or a same-condition relationship, so that the entities are associated in the form of a graph structure. Tags are then propagated along the edges between nodes: the larger the weight of an edge, the more similar the two nodes are, and the more easily a tag passes between them. For example, for nodes i and j, the edge weight between the two may be defined as:
$$w_{ij} = \exp\left(-\frac{\|x_i - x_j\|^2}{\alpha^2}\right)$$
where $\|x_i - x_j\|^2$ represents the squared Euclidean distance between the nodes and $\alpha$ is a hyperparameter. The probability transition matrix $P$ of the tags is then:
$$P_{ij} = \frac{w_{ij}}{\sum_{k} w_{ik}}$$
where $P_{ij}$ represents the probability that a tag transitions from node i to node j. For tagged data, the tag is prior knowledge determined in advance and should not be changed by propagation, so the initial tags are reset after each round of propagation. By iterating this operation, tags can be obtained for entity words over a wider coverage range. Thus, the target tag information can be determined by the tag propagation algorithm.
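The two steps of tag propagation (building the similarity matrix, then propagating and resetting the known tags) can be sketched as follows. This is only an illustrative sketch in plain Python: it assumes entities are already represented as numeric feature vectors, uses the Gaussian edge weight and row-normalized transition matrix of the standard label propagation formulation, and the function name `propagate_tags` and its parameters are hypothetical.

```python
import math
from typing import Dict, List


def propagate_tags(points: List[List[float]], labeled: Dict[int, int],
                   num_classes: int, alpha: float = 1.0, iterations: int = 50) -> List[int]:
    """Return a predicted class index for every node (assumed formulation, see lead-in)."""
    n = len(points)
    # Step 1: similarity matrix, w_ij = exp(-||x_i - x_j||^2 / alpha^2).
    weights = [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j])) / alpha ** 2)
                for j in range(n)] for i in range(n)]
    # Row-normalize into transition probabilities P_ij = w_ij / sum_k w_ik.
    transition = [[w / sum(row) for w in row] for row in weights]

    # Start every node from a uniform distribution over the classes.
    scores = [[1.0 / num_classes] * num_classes for _ in range(n)]

    def reset_known(current: List[List[float]]) -> None:
        # Tagged nodes are prior knowledge: force them back to their one-hot tag.
        for node, cls in labeled.items():
            current[node] = [1.0 if c == cls else 0.0 for c in range(num_classes)]

    reset_known(scores)
    for _ in range(iterations):
        # Step 2: each node takes the P-weighted average of all nodes' tag distributions.
        scores = [[sum(transition[i][j] * scores[j][c] for j in range(n))
                   for c in range(num_classes)] for i in range(n)]
        reset_known(scores)
    return [max(range(num_classes), key=lambda c: scores[i][c]) for i in range(n)]
```

For instance, with four one-dimensional points `[[0.0], [0.1], [5.0], [5.1]]` where only the first and last are tagged, the untagged points take the tag of their nearby tagged neighbor.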
In other embodiments, the semantic features of the entity may be input into the tag recognition model to obtain the target tag information output by the tag recognition model.
The tag recognition model may be a deep-network-based model, for example, a deep network model based on the gated recurrent unit (GRU). The GRU is a variant of long short-term memory (LSTM) that has a simpler structure than LSTM while still performing well. Compared with traditional neural network frameworks, on the one hand, the model takes into account the sequential dependency among the words in an entity word, which better matches the basic assumption of natural language processing that the expression of semantics is influenced by word order; on the other hand, the GRU-based method effectively alleviates the exploding gradient and vanishing gradient problems that long-range dependencies cause in a traditional recurrent neural network (RNN), so that model training is more stable. After the tags of the entity words are obtained, whether the tags of two entity words overlap can be judged, and thereby whether the two entity words are correlated in terms of semantic features.
Fig. 6 is a schematic diagram of a network structure of a tag recognition model provided according to an embodiment of the present disclosure. As shown in fig. 6, the network includes an input layer (input), an embedding layer (embedding), and a fully connected layer (FC), with a dropout layer (dropout) between the embedding layer and the FC layer. The result is propagated through the GRU and then fed to a max pooling layer (max pool). The max pool output passes through a further fully connected layer (FC) and dropout layer (dropout), followed by an FC layer or an activation function (sigmoid), and is finally connected to the prediction layer (predictions).
The dropout layer randomly discards a certain ratio of the feature node values in the current training batch (that is, sets those values to 0). This reduces the interaction among feature detectors (hidden layer nodes), where interaction means that some detectors can only function by depending on other detectors. The ultimate goal is to reduce overfitting during model training, reduce the dependence on particular local features, and enhance the generalization performance of the model.
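The discarding behavior described above can be illustrated with a short sketch. It is a plain-Python illustration rather than the model's actual layer: the function name `dropout` is hypothetical, and the sketch only zeroes values during training, omitting the rescaling of surviving values that deep learning frameworks commonly apply.

```python
import random
from typing import List


def dropout(features: List[float], ratio: float, rng: random.Random) -> List[float]:
    """Set each feature value to 0 with probability `ratio` (training-time behavior only)."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    # Each value is dropped independently, so no node can rely on a specific neighbor surviving.
    return [0.0 if rng.random() < ratio else value for value in features]
```

Passing an explicit `random.Random` instance keeps the sketch reproducible; at inference time the layer would simply be skipped.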
Therefore, the embodiment can determine the target tag information in various modes, ensure the accuracy of the target tag information, improve the fault tolerance, and further be beneficial to judging the correlation between the entity to be matched and the reference entity.
S304: determining the correlation between the entity to be matched and the reference entity according to the tag information to be matched and the reference tag information.
After the tag information to be matched and the reference tag information are determined, the correlation between the entity to be matched and the reference entity is determined according to the tag information to be matched and the reference tag information; that is, the correlation between the entities is determined according to the correlation of their tag information. Because the tags are specific, the correlation between the entities can be determined quickly and accurately in this embodiment.
In some embodiments, a tag overlap value between tag information to be matched and reference tag information may be determined first.
The tag overlap value may represent, for example, the similarity, the degree of coincidence, or any other possible correlation information between the tag information to be matched and the reference tag information. For example, the number of identical characters in the tag information to be matched and the reference tag information may be calculated as the tag overlap value, or the tag overlap value may be calculated in any other possible manner, which is not limited.
Further, the tag overlap value is compared with an overlap threshold, where the overlap threshold can be set according to the actual application scenario. If the tag overlap value is greater than or equal to the overlap threshold, it is determined that the entity to be matched has correlation with the reference entity; if the tag overlap value is smaller than the overlap threshold, it is determined that the entity to be matched does not have correlation with the reference entity. Comparing the tag overlap value with the overlap threshold simplifies the calculation process, and the overlap threshold can be set flexibly to meet the requirements of different scenarios.
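The threshold comparison above can be sketched as follows. This is a minimal sketch that assumes the tag information is a set of tag strings and takes the number of shared tags as the tag overlap value, which is one of the "other possible manners" rather than the character-count example; the function names and the default threshold of 1 are hypothetical.

```python
from typing import Set


def tag_overlap_value(tags_to_match: Set[str], reference_tags: Set[str]) -> int:
    """Assumed overlap measure: the number of tags shared by both entities."""
    return len(tags_to_match & reference_tags)


def is_correlated_by_tags(tags_to_match: Set[str], reference_tags: Set[str],
                          overlap_threshold: int = 1) -> bool:
    """Correlated when the overlap value is greater than or equal to the threshold."""
    return tag_overlap_value(tags_to_match, reference_tags) >= overlap_threshold
```

Raising `overlap_threshold` makes the check stricter, which corresponds to tuning the threshold for a particular supervision scenario.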
In the embodiment of the disclosure, an entity pair including an entity to be matched and a reference entity is acquired; a reference type corresponding to the reference entity is determined according to the priority order of a plurality of candidate types, the reference type being one of the candidate types; and the correlation between the entity to be matched and the reference entity is determined by a correlation matching method corresponding to the reference type. In this way, the correlation between the entity to be matched and the reference entity can be determined effectively, the influence of human subjective factors on the accuracy of the determination is reduced, and both the efficiency and the effect of correlation determination between entities are improved. In addition, searching the universal knowledge base quickly establishes whether a corresponding candidate entity exists, which speeds up correlation matching. The target tag information can be determined in various ways, which ensures the accuracy of the target tag information, improves fault tolerance, and further helps in judging the correlation between the entity to be matched and the reference entity. In addition, comparing the tag overlap value with the overlap threshold simplifies the calculation process, and the overlap threshold can be set flexibly to meet the requirements of different scenarios.
Fig. 7 is a schematic diagram according to a fourth embodiment of the present disclosure.
As shown in fig. 7, the entity correlation determination method includes:
S701: acquiring an entity pair, wherein the entity pair comprises: an entity to be matched and a reference entity.
S702: determining a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types.
For descriptions of S701 and S702, refer to the above embodiments; they are not repeated here.
S703: if the reference type is a target type, a plurality of historical entities associated with the reference entity are obtained.
In the embodiment of the disclosure, in the case that the reference type is the target type, a plurality of history entities related to the reference entity may also be acquired.
For example, a plurality of entities related to the reference entity may be mined from historical medical record data, authoritative medical books, pharmacopoeias, and medical records data as historical entities. Fig. 8 is a schematic diagram of the construction of a knowledge base of historical entities provided according to an embodiment of the present disclosure. As shown in fig. 8, the knowledge base may be constructed from the historical entities in the historical medical record data, authoritative medical books, pharmacopoeias, and medical records data.
S704: determining the correlation between the entity to be matched and the reference entity according to the plurality of historical entities.
Further, a correlation between the entity to be matched and the reference entity is determined according to historical entities in the plurality of knowledge bases. Therefore, the correlation between the entity to be matched and the reference entity can be determined by combining the historical data, so that the correlation matching result is more authoritative.
In some embodiments, a plurality of semantic similarities between the entity to be matched and a plurality of historical entities, respectively, may be calculated.
The semantic similarity describes the similarity between the entity to be matched and a historical entity at the level of semantic features. For example, the semantic similarity between the entity to be matched and each historical entity can be calculated by a semantic recognition algorithm, so that a plurality of semantic similarities are obtained.
Further, the plurality of semantic similarities are each compared with a similarity threshold. If any one of the plurality of semantic similarities is greater than or equal to the similarity threshold, it is determined that the entity to be matched and the reference entity have correlation; if all of the semantic similarities are smaller than the similarity threshold, it is determined that the entity to be matched and the reference entity have no correlation. That is, if there is at least one historical entity whose semantic similarity is greater than or equal to the similarity threshold, the entity to be matched is correlated with the reference entity. Therefore, the correlation between the entity to be matched and the reference entity can be determined quickly through the semantic similarities and the similarity threshold, and the scheme can be applied to different application scenarios by flexibly adjusting the similarity threshold.
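The "any similarity reaches the threshold" rule above can be sketched as follows. The disclosure does not specify the semantic recognition algorithm, so this sketch assumes each entity has already been encoded as a numeric vector and uses cosine similarity as a stand-in; the function names and the default threshold of 0.8 are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Stand-in semantic similarity between two entity vectors (an assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_correlated_by_history(entity_vec: Sequence[float],
                             history_vecs: List[Sequence[float]],
                             similarity_threshold: float = 0.8) -> bool:
    """Correlated if at least one historical entity reaches the similarity threshold."""
    return any(cosine_similarity(entity_vec, h) >= similarity_threshold for h in history_vecs)
```

Because `any` stops at the first historical entity that reaches the threshold, the check does not need to score the whole knowledge base when a match is found early.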
In the embodiment of the disclosure, an entity pair including an entity to be matched and a reference entity is acquired; a reference type corresponding to the reference entity is determined according to the priority order of a plurality of candidate types, the reference type being one of the candidate types; and the correlation between the entity to be matched and the reference entity is determined by a correlation matching method corresponding to the reference type. In this way, the correlation between the entity to be matched and the reference entity can be determined effectively, the influence of human subjective factors on the accuracy of the determination is reduced, and both the efficiency and the effect of correlation determination between entities are improved. Moreover, the correlation between the entity to be matched and the reference entity can be determined in combination with historical data, which makes the correlation matching result more authoritative. In addition, the correlation can be determined quickly through the semantic similarities and the similarity threshold, and the similarity threshold can be adjusted flexibly so that the scheme can be applied to different application scenarios.
Fig. 9 is a schematic structural diagram of an entity relevance determining system according to an embodiment of the present disclosure. As shown in fig. 9, the system mainly includes: a medical entity relationship prediction module, a medical consumption item rationality detection module, and a medical insurance fund accounting and supervision module.
The medical entity relationship prediction module is mainly used for implementing the above embodiments. It can determine the relevance between medical entities (the reference entity and the entity to be matched) through general item retrieval (in the universal knowledge base), through a tag prediction model, or through a knowledge base (a plurality of historical entities).
The medical consumption item rationality detection module is built on the medical entity relationship prediction module. It extracts the contents of the main diagnosis, other diagnosis, operation or other operation, and anesthesia mode fields from the medical record front-page data, each in turn serving as entity_1, and extracts the charging consumption item name as entity_2, so as to construct relation pairs <entity_1, entity_2> for relation judgment. For a patient, if a charging consumption item is correlated with any content in the main diagnosis, other diagnosis, operation or other operation, and anesthesia mode fields, the charge is judged to be reasonable; otherwise, it is judged to be unreasonable.
The medical insurance fund accounting and supervision module takes the reasonable and unreasonable items identified in the medical insurance charges and, according to the recorded consumption quantity, unit price, and number of times of each item, calculates the total amount that can reasonably be reimbursed and the total unreasonable amount for the patient's visit, so that supervision of medical insurance payment is realized more reasonably and efficiently.
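The rationality check and the accounting step described above can be sketched together as follows. This is a hedged illustration only: the `ChargeItem` record, the `audit_charges` function, and the assumption that an item's amount equals unit price × quantity × times are all hypothetical, and the correlation predicate is passed in so that any of the matching methods of the embodiments could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class ChargeItem:
    name: str          # charging consumption item name, used as entity_2
    unit_price: float
    quantity: int
    times: int = 1     # number of times the item was charged (assumed multiplier)


def audit_charges(diagnosis_fields: List[str], items: Iterable[ChargeItem],
                  is_correlated: Callable[[str, str], bool]) -> Tuple[float, float]:
    """Return (reasonable total, unreasonable total) for one patient visit."""
    reasonable = 0.0
    unreasonable = 0.0
    for item in items:
        amount = item.unit_price * item.quantity * item.times
        # An item is reasonable if it correlates with ANY field content (entity_1).
        if any(is_correlated(entity_1, item.name) for entity_1 in diagnosis_fields):
            reasonable += amount
        else:
            unreasonable += amount
    return reasonable, unreasonable
```

Here `diagnosis_fields` would hold the contents of the main diagnosis, other diagnosis, operation or other operation, and anesthesia mode fields for the patient.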
Fig. 10 is a schematic diagram according to a fifth embodiment of the present disclosure.
As shown in fig. 10, the entity-relevance determining device 100 includes:
a first obtaining module 101, configured to obtain an entity pair, where the entity pair includes: an entity to be matched and a reference entity;
a determining module 102, configured to determine a reference type corresponding to the reference entity according to a plurality of description information corresponding to a plurality of candidate types, where the reference type is one of the plurality of candidate types; and
a matching module 103, configured to determine the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type.
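The division of work among the three modules above can be pictured with a small sketch. It is an assumption-laden illustration, not the apparatus itself: the `ReferenceType` enumeration, the `determine_correlation` function, and the idea of passing the type-determination step and the per-type matching methods in as callables are all hypothetical.

```python
from enum import Enum
from typing import Callable, Dict


class ReferenceType(Enum):
    GENERAL = "general"
    TAG = "tag"
    TARGET = "target"


Matcher = Callable[[str, str], bool]


def determine_correlation(entity_to_match: str, reference_entity: str,
                          determine_type: Callable[[str], ReferenceType],
                          matchers: Dict[ReferenceType, Matcher]) -> bool:
    """Pick the matching method for the reference entity's type and apply it to the pair."""
    reference_type = determine_type(reference_entity)   # role of the determining module
    matcher = matchers[reference_type]                   # role of the matching module
    return matcher(entity_to_match, reference_entity)
```

Keeping the matchers in a dictionary keyed by type means a new candidate type only needs a new entry rather than a change to the dispatch logic.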
Optionally, in some embodiments of the present disclosure, as shown in fig. 11, fig. 11 is a schematic diagram according to a sixth embodiment of the present disclosure, the entity correlation determining apparatus 110 includes: the first obtaining module 111, the determining module 112, and the matching module 113, where the matching module 113 includes:
a first matching submodule 1131, configured to determine that the entity to be matched and the reference entity have correlation when the reference type is a general type.
Optionally, in some embodiments of the present disclosure, as shown in fig. 11, the matching module 113 further includes:
a first determining submodule 1132, configured to determine tag information to be matched of the entity to be matched when the reference type is a tag type, and determine reference tag information of the reference entity;
a second matching submodule 1133, configured to determine a correlation between the entity to be matched and the reference entity according to the tag information to be matched and the reference tag information.
Optionally, in some embodiments of the present disclosure, the second matching submodule 1133 is specifically configured to: determine a tag overlap value between the tag information to be matched and the reference tag information; when the tag overlap value is greater than or equal to the overlap threshold, determine that the entity to be matched has correlation with the reference entity; and when the tag overlap value is smaller than the overlap threshold, determine that the entity to be matched and the reference entity have no correlation.
Optionally, in some embodiments of the present disclosure, as shown in fig. 11, the matching module 113 further includes:
a second determining submodule 1134, configured to obtain a plurality of historical entities related to the reference entity when the reference type is the target type;
a third matching sub-module 1135, configured to determine, according to the plurality of historical entities, a correlation between the entity to be matched and the reference entity.
Optionally, in some embodiments of the present disclosure, the third matching submodule 1135 is specifically configured to: determining a plurality of semantic similarities between the entity to be matched and a plurality of historical entities respectively; when any semantic similarity among the plurality of semantic similarities is greater than or equal to a similarity threshold, determining that the entity to be matched has correlation with the reference entity; and when the semantic similarities are smaller than the similarity threshold value, determining that the entity to be matched and the reference entity have no correlation.
Optionally, in some embodiments of the present disclosure, the first matching submodule 1131 is specifically configured to: determine that the reference type is the universal type when a candidate entity corresponding to the reference entity can be retrieved from the universal knowledge base.
Optionally, in some embodiments of the present disclosure, as shown in fig. 11, the apparatus 110 further includes: a second obtaining module 114, configured to obtain a plurality of candidate entities, where the plurality of candidate entities respectively have a plurality of corresponding entity usage scale values; and a third obtaining module 115, configured to obtain the candidate entities whose entity usage scale values are greater than or equal to a scale threshold, and construct the universal knowledge base from those candidate entities.
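The construction and use of the universal knowledge base described above can be sketched as follows. This is a minimal sketch assuming the entity usage scale values are available as a mapping from entity name to a number; the function names and the sample values in the usage note are hypothetical.

```python
from typing import Dict, Set


def build_universal_knowledge_base(usage_scale: Dict[str, float],
                                   scale_threshold: float) -> Set[str]:
    """Keep the candidate entities whose usage scale value reaches the threshold."""
    return {entity for entity, value in usage_scale.items() if value >= scale_threshold}


def is_universal_type(reference_entity: str, knowledge_base: Set[str]) -> bool:
    """The reference type is universal when the entity can be retrieved from the base."""
    return reference_entity in knowledge_base
```

For example, with hypothetical usage values `{"blood routine test": 9800, "rare implant": 12}` and a threshold of 1000, only the first entity would enter the knowledge base, and a set lookup keeps the later retrieval fast.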
Optionally, in some embodiments of the present disclosure, the first determining submodule 1132 is specifically configured to: performing label mapping processing on the entity to be matched to obtain label information to be matched; and performing label mapping processing on the reference entity to obtain reference label information.
Optionally, in some embodiments of the present disclosure, the first determining submodule 1132 is specifically configured to: perform semantic analysis on the entity to be determined to obtain entity semantic features, wherein the entity to be determined is the entity to be matched or the reference entity; and determine target tag information according to the entity semantic features in combination with a setting mode, wherein the target tag information is the tag information to be matched or the reference tag information.
Optionally, in some embodiments of the present disclosure, the first determining submodule 1132 is specifically configured to: determine candidate semantic features matched with the entity semantic features from a tag knowledge base, and take the candidate tag information to which the candidate semantic features belong as the target tag information; or determine an associated entity related to the entity to be determined according to the entity semantic features, and take the associated tag information of the associated entity as the target tag information; or input the entity semantic features into the tag recognition model to obtain the target tag information output by the tag recognition model.
Optionally, in some embodiments of the disclosure, the description information is a priority order corresponding to the respective candidate types, and the priority orders corresponding to the different candidate types are different.
It can be understood that the entity correlation determining apparatus 110 in fig. 11 of the present embodiment and the entity correlation determining apparatus 100 in the foregoing embodiment may have the same functions and structures, as may the first obtaining module 111 and the first obtaining module 101, the determining module 112 and the determining module 102, and the matching module 113 and the matching module 103.
It should be noted that the foregoing explanation of the entity correlation determination method is also applicable to the entity correlation determination apparatus of the present embodiment, and will not be repeated here.
In the embodiment of the disclosure, an entity pair including an entity to be matched and a reference entity is acquired; a reference type corresponding to the reference entity is determined according to the priority order of a plurality of candidate types, the reference type being one of the candidate types; and the correlation between the entity to be matched and the reference entity is determined by a correlation matching method corresponding to the reference type. In this way, the correlation between the entity to be matched and the reference entity can be determined effectively, the influence of human subjective factors on the accuracy of the determination is reduced, and both the efficiency and the effect of correlation determination between entities are improved.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 12 is a block diagram of an electronic device for implementing the entity correlation determination method of an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 12, the device 1200 includes a computing unit 1201, which may perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1202 or a computer program loaded from a storage unit 1208 into a Random Access Memory (RAM) 1203. In the RAM 1203, various programs and data required for the operation of the device 1200 may also be stored. The computing unit 1201, the ROM 1202, and the RAM 1203 are connected to each other via a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
Various components in device 1200 are connected to I/O interface 1205, including: an input unit 1206 such as a keyboard, mouse, etc.; an output unit 1207 such as various types of displays, speakers, and the like; a storage unit 1208 such as a magnetic disk, an optical disk, or the like; and a communication unit 1209, such as a network card, modem, wireless communication transceiver, etc. The communication unit 1209 allows the device 1200 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks.
The computing unit 1201 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 1201 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The computing unit 1201 performs the various methods and processes described above, for example, the entity correlation determination method.
For example, in some embodiments, the entity correlation determination method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 1208. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 1200 via ROM 1202 and/or communication unit 1209. When the computer program is loaded into the RAM 1203 and executed by the computing unit 1201, one or more steps of the above-described entity correlation determination method may be performed. Alternatively, in other embodiments, the computing unit 1201 may be configured to perform the entity-relevance determination method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the entity correlation determination method of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable entity correlation determination device, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), the Internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that overcomes the difficult management and weak service scalability of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed herein can be achieved; no limitation is imposed herein.
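As an illustration of steps that can run either sequentially or in parallel with the same result, consider the comparison of one entity to be matched against several historical entities. The following is a minimal sketch, not the disclosed implementation: the function names are hypothetical, and `difflib.SequenceMatcher` is only a plain string-similarity stand-in for whatever semantic similarity measure an embodiment actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher
from typing import List


def similarity(a: str, b: str) -> float:
    """Stand-in score in [0, 1]; an assumption, not a semantic model."""
    return SequenceMatcher(None, a, b).ratio()


def any_similar_sequential(entity: str, history: List[str], threshold: float) -> bool:
    """Compare the entity with each historical entity one after another."""
    return any(similarity(entity, h) >= threshold for h in history)


def any_similar_parallel(entity: str, history: List[str], threshold: float) -> bool:
    """Run the same independent comparisons concurrently, then combine them."""
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda h: similarity(entity, h), history))
    return any(score >= threshold for score in scores)
```

Because each comparison is independent of the others, both functions return the same decision for the same inputs; only the execution order differs.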
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (20)

1. An entity correlation determination method, comprising:
obtaining an entity pair, wherein the entity pair comprises: an entity to be matched and a reference entity;
determining a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types; and
adopting a correlation matching method corresponding to the reference type to determine the correlation between the entity to be matched and the reference entity;
wherein the plurality of candidate types include a universal type, and the determining the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type comprises:
if the reference type is the universal type, determining that the entity to be matched and the reference entity have correlation;
the plurality of candidate types include a tag type, and the determining the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type comprises:
if the reference type is the tag type, determining tag information to be matched of the entity to be matched, and determining reference tag information of the reference entity; and
determining the correlation between the entity to be matched and the reference entity according to the tag information to be matched and the reference tag information;
the plurality of candidate types include a target type, and the determining the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type comprises:
if the reference type is the target type, acquiring a plurality of historical entities related to the reference entity; and
determining the correlation between the entity to be matched and the reference entity according to the plurality of historical entities.
2. The method of claim 1, wherein the determining the correlation between the entity to be matched and the reference entity according to the tag information to be matched and the reference tag information comprises:
determining a tag overlap value between the tag information to be matched and the reference tag information;
if the tag overlap value is greater than or equal to an overlap threshold, determining that the entity to be matched and the reference entity have correlation; and
if the tag overlap value is smaller than the overlap threshold, determining that the entity to be matched and the reference entity have no correlation.
3. The method of claim 1, wherein the determining the correlation between the entity to be matched and the reference entity according to the plurality of historical entities comprises:
determining a plurality of semantic similarities between the entity to be matched and the plurality of historical entities respectively;
if any semantic similarity among the plurality of semantic similarities is greater than or equal to a similarity threshold, determining that the entity to be matched and the reference entity have correlation; and
if all of the plurality of semantic similarities are smaller than the similarity threshold, determining that the entity to be matched and the reference entity have no correlation.
4. The method of claim 1, the method further comprising:
if the candidate entity corresponding to the reference entity can be retrieved from the universal knowledge base, determining that the reference type is the universal type.
5. The method of claim 4, further comprising, prior to the obtaining the entity pair:
acquiring a plurality of candidate entities, wherein the plurality of candidate entities respectively have a plurality of corresponding entity usage proportion values; and
acquiring the candidate entities whose entity usage proportion values are greater than or equal to a proportion threshold, and constructing the universal knowledge base from the acquired candidate entities.
6. The method of claim 1, wherein the determining tag information to be matched of the entity to be matched, and determining reference tag information of the reference entity comprises:
performing tag mapping processing on the entity to be matched to obtain the tag information to be matched; and
performing the tag mapping processing on the reference entity to obtain the reference tag information.
7. The method of claim 6, further comprising:
performing semantic analysis on an entity to be judged to obtain entity semantic features, wherein the entity to be judged is the entity to be matched or the reference entity; and
determining target tag information according to the entity semantic features in combination with a setting manner, wherein the target tag information is the tag information to be matched or the reference tag information.
8. The method of claim 7, wherein the determining target tag information according to the entity semantic features in combination with a setting manner comprises:
determining candidate semantic features matched with the entity semantic features from a tag knowledge base, and taking candidate tag information to which the candidate semantic features belong as the target tag information; or
determining an associated entity related to the entity to be judged according to the entity semantic features, and taking associated tag information of the associated entity as the target tag information; or
inputting the entity semantic features into a tag recognition model to obtain the target tag information output by the tag recognition model.
9. The method of claim 1, wherein the description information is a priority order corresponding to the respective candidate types, and the priority orders corresponding to the different candidate types are different.
10. An entity correlation determination apparatus, comprising:
a first acquisition module configured to obtain an entity pair, wherein the entity pair comprises: an entity to be matched and a reference entity;
a determining module configured to determine a reference type corresponding to the reference entity according to a plurality of pieces of description information respectively corresponding to a plurality of candidate types, wherein the reference type is one of the plurality of candidate types; and
a matching module configured to determine the correlation between the entity to be matched and the reference entity by adopting a correlation matching method corresponding to the reference type;
wherein the plurality of candidate types include a universal type, and the matching module comprises:
a first matching sub-module configured to determine that the entity to be matched and the reference entity have correlation when the reference type is the universal type;
the plurality of candidate types include a tag type, and the matching module further comprises:
a first determining sub-module configured to determine, when the reference type is the tag type, tag information to be matched of the entity to be matched and reference tag information of the reference entity; and
a second matching sub-module configured to determine the correlation between the entity to be matched and the reference entity according to the tag information to be matched and the reference tag information;
the plurality of candidate types include a target type, and the matching module further comprises:
a second determining sub-module configured to acquire a plurality of historical entities related to the reference entity when the reference type is the target type; and
a third matching sub-module configured to determine the correlation between the entity to be matched and the reference entity according to the plurality of historical entities.
11. The apparatus of claim 10, wherein the second matching sub-module is specifically configured to:
determine a tag overlap value between the tag information to be matched and the reference tag information;
when the tag overlap value is greater than or equal to an overlap threshold, determine that the entity to be matched and the reference entity have correlation; and
when the tag overlap value is smaller than the overlap threshold, determine that the entity to be matched and the reference entity have no correlation.
12. The apparatus of claim 10, wherein the third matching sub-module is specifically configured to:
determine a plurality of semantic similarities between the entity to be matched and the plurality of historical entities respectively;
when any semantic similarity among the plurality of semantic similarities is greater than or equal to a similarity threshold, determine that the entity to be matched and the reference entity have correlation; and
when all of the plurality of semantic similarities are smaller than the similarity threshold, determine that the entity to be matched and the reference entity have no correlation.
13. The apparatus of claim 10, wherein the first matching sub-module is specifically configured to:
determine that the reference type is the universal type when a candidate entity corresponding to the reference entity can be retrieved from a universal knowledge base.
14. The apparatus of claim 13, wherein the apparatus further comprises:
a second acquisition module configured to acquire a plurality of candidate entities, wherein the plurality of candidate entities respectively have a plurality of corresponding entity usage proportion values; and
a third acquisition module configured to acquire the candidate entities whose entity usage proportion values are greater than or equal to a proportion threshold, and to construct the universal knowledge base from the acquired candidate entities.
15. The apparatus of claim 10, wherein the first determining sub-module is configured to:
perform tag mapping processing on the entity to be matched to obtain the tag information to be matched; and
perform the tag mapping processing on the reference entity to obtain the reference tag information.
16. The apparatus of claim 15, wherein the first determining sub-module is configured to:
perform semantic analysis on an entity to be judged to obtain entity semantic features, wherein the entity to be judged is the entity to be matched or the reference entity; and
determine target tag information according to the entity semantic features in combination with a setting manner, wherein the target tag information is the tag information to be matched or the reference tag information.
17. The apparatus of claim 16, wherein the first determining sub-module is configured to:
determine candidate semantic features matched with the entity semantic features from a tag knowledge base, and take candidate tag information to which the candidate semantic features belong as the target tag information; or
determine an associated entity related to the entity to be judged according to the entity semantic features, and take associated tag information of the associated entity as the target tag information; or
input the entity semantic features into a tag recognition model to obtain the target tag information output by the tag recognition model.
18. The apparatus of claim 10, wherein the description information is a priority order corresponding to the respective candidate types, and the priority orders corresponding to different candidate types are different.
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
20. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1-9.
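The type-dependent matching recited in claims 1 to 3 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: the type names, the function names, the use of a Jaccard ratio as the tag overlap value, the dictionary-based inputs, and the default thresholds are all hypothetical, and the similarity function is supplied by the caller.

```python
from typing import Callable, Dict, List, Set

# Hypothetical labels for the three candidate types named in the claims.
UNIVERSAL, TAG, TARGET = "universal", "tag", "target"


def tag_overlap(a: Set[str], b: Set[str]) -> float:
    """Jaccard ratio of two tag sets (an assumed choice of overlap value)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_correlated(
    to_match: str,
    reference: str,
    reference_type: str,
    tags: Dict[str, Set[str]],
    history: Dict[str, List[str]],
    similarity: Callable[[str, str], float],
    overlap_threshold: float = 0.5,      # assumed value
    similarity_threshold: float = 0.8,   # assumed value
) -> bool:
    """Pick the matching method according to the reference type."""
    if reference_type == UNIVERSAL:
        # Universal type: the pair is treated as correlated.
        return True
    if reference_type == TAG:
        # Tag type: compare the tag overlap value with the overlap threshold.
        overlap = tag_overlap(tags.get(to_match, set()), tags.get(reference, set()))
        return overlap >= overlap_threshold
    if reference_type == TARGET:
        # Target type: correlated if any historical entity is similar enough.
        return any(
            similarity(to_match, past) >= similarity_threshold
            for past in history.get(reference, [])
        )
    raise ValueError(f"unknown reference type: {reference_type!r}")
```

Passing the similarity function as a parameter keeps the sketch independent of any particular semantic model; a caller would substitute whichever measure an embodiment uses.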

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110944911.XA CN113792115B (en) 2021-08-17 2021-08-17 Entity correlation determination method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113792115A CN113792115A (en) 2021-12-14
CN113792115B true CN113792115B (en) 2024-03-22

Family

ID=78876033

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110944911.XA Active CN113792115B (en) 2021-08-17 2021-08-17 Entity correlation determination method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113792115B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114579626B (en) * 2022-03-09 2023-08-11 北京百度网讯科技有限公司 Data processing method, data processing device, electronic equipment and medium
CN116303392B (en) * 2023-03-02 2023-09-01 重庆市规划和自然资源信息中心 Multi-source data table management method for real estate registration data
CN117421416B (en) * 2023-12-19 2024-03-26 数据空间研究院 Interactive search method and device and electronic equipment

Citations (7)

Publication number Priority date Publication date Assignee Title
CN111241282A (en) * 2020-01-14 2020-06-05 北京百度网讯科技有限公司 Text theme generation method and device and electronic equipment
CN111967262A (en) * 2020-06-30 2020-11-20 北京百度网讯科技有限公司 Method and device for determining entity tag
CN112115697A (en) * 2020-09-25 2020-12-22 北京百度网讯科技有限公司 Method, device, server and storage medium for determining target text
CN112699667A (en) * 2020-12-29 2021-04-23 京东数字科技控股股份有限公司 Entity similarity determination method, device, equipment and storage medium
CN112860866A (en) * 2021-02-09 2021-05-28 北京百度网讯科技有限公司 Semantic retrieval method, device, equipment and storage medium
WO2021151353A1 (en) * 2020-10-20 2021-08-05 平安科技(深圳)有限公司 Medical entity relationship extraction method and apparatus, and computer device and readable storage medium
CN113257383A (en) * 2021-06-16 2021-08-13 腾讯科技(深圳)有限公司 Matching information determination method, display method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10019516B2 (en) * 2014-04-04 2018-07-10 University Of Southern California System and method for fuzzy ontology matching and search across ontologies
CN111831854A (en) * 2020-06-03 2020-10-27 北京百度网讯科技有限公司 Video tag generation method and device, electronic equipment and storage medium

Non-Patent Citations (2)

Title
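An improved entity relation extraction algorithm: OptMultiR; 延浩然; 靳小龙; 贾岩涛; 程学旗; Journal of Chinese Information Processing (No. 09); full text *
A graph-based collective entity linking algorithm for Chinese; 刘峤; 钟云; 李杨; 刘瑶; 秦志光; Journal of Computer Research and Development (No. 02); full text *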

Also Published As

Publication number Publication date
CN113792115A (en) 2021-12-14

Similar Documents

Publication Publication Date Title
CN113792115B (en) Entity correlation determination method, device, electronic equipment and storage medium
US10902588B2 (en) Anatomical segmentation identifying modes and viewpoints with deep learning across modalities
US11670420B2 (en) Drawing conclusions from free form texts with deep reinforcement learning
Fang et al. Feature Selection Method Based on Class Discriminative Degree for Intelligent Medical Diagnosis.
CN113241135A (en) Disease risk prediction method and system based on multi-mode fusion
CN113033622B (en) Training method, device, equipment and storage medium for cross-modal retrieval model
CN110162786B (en) Method and device for constructing configuration file and extracting structured information
US20220004706A1 (en) Medical data verification method and electronic device
CN112786144B (en) Knowledge graph method, doctor's advice quality control method, device, equipment and medium
CN114420309B (en) Method for establishing medicine synergistic effect prediction model, prediction method and corresponding device
CN111564223A (en) Infectious disease survival probability prediction method, and prediction model training method and device
Spanier et al. A new method for the automatic retrieval of medical cases based on the RadLex ontology
WO2022227171A1 (en) Method and apparatus for extracting key information, electronic device, and medium
Dubey et al. Enabling CT-Scans for covid detection using transfer learning-based neural networks
Banerjee et al. A scalable machine learning approach for inferring probabilistic US-LI-RADS categorization
CN113094476A (en) Risk early warning method, system, equipment and medium based on natural language processing
Gao et al. Accuracy analysis of triage recommendation based on CNN, RNN and RCNN models
Saad et al. Novel extreme regression-voting classifier to predict death risk in vaccinated people using VAERS data
Yaşar et al. A novel study to increase the classification parameters on automatic three-class COVID-19 classification from CT images, including cases from Turkey
CN114461085A (en) Medical input recommendation method, device, equipment and storage medium
CN113724017A (en) Pricing method and device based on neural network, electronic equipment and storage medium
CN111640517A (en) Medical record encoding method and device, storage medium and electronic equipment
CN111275558A (en) Method and device for determining insurance data
CN115033747B (en) Abnormal state searching method and device
CN114840560B (en) Unstructured data conversion and storage method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant