WO2022095892A1

WO2022095892A1 - Method and apparatus for generating push information

Info

Publication number: WO2022095892A1
Application number: PCT/CN2021/128398
Authority: WO
Inventors: 黄亮; 李鑫; 郭旭炀; 康西龙
Original assignee: 北京京东拓先科技有限公司
Priority date: 2020-11-09
Filing date: 2021-11-03
Publication date: 2022-05-12
Also published as: CN112287121A; KR20230092002A

Abstract

The present application relates to the field of artificial intelligence, the technical field of natural language processing, the technical field of knowledge graph, and the technical field of big data. Disclosed are a method and apparatus for generating push information, an electronic device, and a computer readable storage medium. The specific implementation solution is as follows: acquiring standard representation information corresponding to representation information in input information of a user; determining, on the basis of a pre-constructed medical knowledge graph, at least one medical status entity hit by the standard representation information, the medical knowledge graph recording the correspondence between the representation information and the medical status entity, and the correspondence being obtained by extraction from abstract information of the medical literature; and generating a push information set on the basis of the medical status entity, sending the push information set to the user, and determining a push message pushed to the user by using a knowledge graph constructed on the basis of the abstract information of the medical literature. The cost of determining a push message is reduced while improving the quality of the push message.

Description

Method and device for generating push information

This patent application claims priority to Chinese patent application No. 202011241131.0, entitled "Method and Device for Pushing Information" and filed on November 9, 2020, the entire content of which is incorporated herein by reference.

Technical Field

The present application relates to the field of artificial intelligence, specifically to the technical fields of natural language processing, knowledge graphs, and big data, and in particular to a method, apparatus, electronic device, and computer-readable storage medium for generating push information.

Background

With the development of society, and in order to better meet users' retrieval needs, more and more methods based on Internet big data and knowledge graphs are used to match the query information input by users and to generate corresponding push messages, thereby providing retrieval services.

In the prior art, a knowledge graph is usually constructed based on the knowledge of domain experts.

SUMMARY OF THE INVENTION

The present application provides a method, apparatus, electronic device, and storage medium for generating push information.

In a first aspect, embodiments of the present application provide a method for generating push information, including: acquiring standard representation information corresponding to representation information in input information of a user; determining, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records a correspondence between representation information and medical state entities, and the correspondence is extracted from abstract information of medical literature; and generating a push information set based on the medical state entity, and sending the push information set to the user.

In some embodiments, the medical knowledge graph is determined based on the following steps: acquiring abstract text information of a plurality of medical documents to obtain an abstract text information set; using an entity recognition neural network to determine the set of entities hit in the abstract text information set, wherein the entity set includes information in the abstract text information set that is related to representation information and medical state entities; performing medical language normalization matching on the entity set to obtain a normalized entity set; classifying and labeling the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and obtaining the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.

In some embodiments, the entity recognition neural network includes a bidirectional long short-term memory network and a conditional random field.

In some embodiments, generating a push information set based on the medical state entity and sending the push information set to the user includes: sorting the medical state entities using a probabilistic graphical model, and selecting a preset number of the medical state entities according to the sorting result to generate the push information set; and sending the push information set to the user.

In some embodiments, the step of generating standard representation information includes: acquiring user input information; identifying representation information contained in the input information to obtain a recognition result; and determining the standard representation information based on the normalized semantics of the recognition result.

In some embodiments, determining the standard representation information based on the normalized semantics of the recognition result includes: performing expansion based on the normalized semantics of the recognition result to generate an extended representation information set; and using the extended representation information in the extended representation information set as the standard representation information.

In some embodiments, determining, based on the pre-constructed medical knowledge graph, the at least one medical state entity hit by the standard representation information includes: in response to determining that selection information for the standard representation information has been received, determining, using the pre-constructed medical knowledge graph, the at least one medical state entity hit by the standard representation information.

In a second aspect, embodiments of the present application provide an apparatus for generating push information, including: a standard representation information acquiring unit, configured to acquire standard representation information corresponding to representation information in input information of a user; a medical state entity determining unit, configured to determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records a correspondence between representation information and medical state entities, and the correspondence is extracted from abstract information of medical literature; and a push information sending unit, configured to generate a push information set based on the medical state entity and send the push information set to the user.

In some embodiments, the apparatus further includes a medical knowledge graph determination unit, which specifically includes: an initial information acquisition subunit, configured to acquire abstract text information of a plurality of medical documents to obtain an abstract text information set; an entity recognition subunit, configured to use an entity recognition neural network to determine the set of entities hit in the abstract text information set, wherein the entity set includes information in the abstract text information set that is related to representation information and medical state entities; a normalization matching subunit, configured to perform medical language normalization matching on the entity set to obtain a normalized entity set; a classification and labeling subunit, configured to classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and a medical knowledge graph generation subunit, configured to obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.

In some embodiments, the entity recognition neural network in the entity recognition subunit includes a bidirectional long short-term memory network and a conditional random field.

In some embodiments, the push information sending unit is further configured to: sort the medical state entities using a probabilistic graphical model, and select a preset number of the medical state entities according to the sorting result to generate the push information set; and send the push information set to the user.

In some embodiments, the apparatus further includes a standard information generation unit, including: an initial information acquisition subunit, configured to acquire the user's input information; an information identification subunit, configured to identify the representation information contained in the input information to obtain a recognition result; and a standard representation information determination subunit, configured to determine the standard representation information based on the normalized semantics of the recognition result.

In some embodiments, the standard representation information determination subunit is further configured to: perform expansion based on the normalized semantics of the recognition result to generate an extended representation information set; and use the extended representation information in the extended representation information set as the standard representation information.

In some embodiments, the medical state entity determining unit is further configured to: in response to determining that selection information for the standard representation information has been received, determine, using the pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information.

In a third aspect, an embodiment of the present application provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to execute the method for generating push information described in any implementation.

In a fourth aspect, embodiments of the present application provide a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the method for generating push information described in any implementation.

In the present application, after the standard representation information corresponding to the representation information in the user's input information is obtained, at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from abstract information of medical literature. A push information set is generated based on the medical state entity and sent to the user. Because the push messages pushed to the user are determined using a knowledge graph constructed from the abstract information of medical literature, the cost of determining the push messages is reduced and their quality is improved.

It should be understood that the content described in this section is not intended to identify key or critical features of the embodiments of the application, nor is it intended to limit the scope of the application. Other features of the present application will become readily understood from the following description.

Description of Drawings

The accompanying drawings are provided for a better understanding of the present solution and do not constitute a limitation of the present application. In the drawings:

FIG. 1 is an exemplary system architecture to which embodiments of the present application may be applied;

FIG. 2 is a flowchart of an embodiment of a method for generating push information according to the present application;

FIG. 3 is a flowchart of an implementation of determining a medical knowledge graph in the method for generating push information according to the present application;

FIG. 4 is a flowchart of another embodiment of a method for generating push information according to the present application;

FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating push information according to the present application;

FIG. 6 is a block diagram of an electronic device suitable for implementing the method for generating push information according to the embodiment of the present application.

Detailed Description

Exemplary embodiments of the present application are described below with reference to the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.

It should be noted that the embodiments in the present application and the features of the embodiments may be combined with each other in the case of no conflict. The present application will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method, apparatus, electronic device, and computer-readable storage medium for generating push information of the present application may be applied.

As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is a medium used to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

The user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, for example to send the user's input information. Retrieval applications, such as navigation applications, encyclopedia query applications, and online consultation applications, may be installed on the terminal devices 101, 102, and 103.

The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When the terminal devices 101, 102, and 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, for sending the user's input information) or as a single piece of software or software module. No specific limitation is imposed here.

The server 105 may be a server that provides various services, for example, a server that provides retrieval services and generates push information for the terminal devices 101, 102, and 103. For example, the server may: obtain standard representation information corresponding to the representation information in the user's input information; determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph is determined based on the relationship between representation information and medical state entities in the medical literature; generate a push information set based on the medical state entity; and send the push information set to the user.

It should be noted that the method for generating push information provided by the embodiments of the present application is generally performed by the server 105, and accordingly, the apparatus for generating push information is generally provided in the server 105.

It should be noted that the server may be hardware or software. When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server. When the server is software, it may be implemented as multiple software or software modules for providing distributed services, or may be implemented as a single software or software module. There is no specific limitation here.

In addition, the method for generating push information may also be executed by the terminal devices 101, 102, and 103, and correspondingly, the apparatus for generating push information may also be provided in the terminal devices 101, 102, and 103. In this case, the exemplary system architecture 100 may omit the server 105 and the network 104.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.

With continued reference to FIG. 2, a flow 200 of an embodiment of the method for generating push information according to the present application is shown. The method for generating push information includes the following steps:

Step 201: Acquire standard representation information corresponding to the representation information in the user's input information.

In this embodiment, the execution body of the method for generating push information (for example, the server 105 shown in FIG. 1) may obtain the user's input information, and the standard representation information corresponding to the representation information in that input information, from a local or non-local human-computer interaction device (for example, the terminal devices 101, 102, and 103 shown in FIG. 1), which is not limited in this application.

Here, representation information refers to the way information is presented in a thinking system such as the mind or a computer system, that is, a way of recording or expressing information: a formal system that can clearly express certain entities or certain types of information, together with rules explaining how the system fulfils that function. Representation information can therefore be understood as a symbol or signal that refers to something; that is, it conveys the relevant information about a thing when the thing itself is absent. An entity generally refers to a piece of text that has a specific meaning or strong referential force, and typically includes names of people, places, and organizations, dates and times, proper nouns, and so on. The concept of an entity can thus be very broad: any special text fragment required by the business can be called an entity.

It should be understood that the user's input information is usually shaped by the user's own level of knowledge and cultural background, and contains one or more pieces of representation information expressing what the user actually means. This information is converted into standard representation information that the above execution body can identify and understand. Because the models used to pre-train the execution body are usually trained on the standard forms of expression provided by authoritative sources in the corresponding field, the standard representation information is the standard form of expression provided by authoritative sources in that field. For example, in the medical field, if the user's input information is a colloquial expression such as "stomach pain" that is not a standard expression in the medical field, it is converted into standard medical expressions such as "abdominal pain" and "stomach colic" to obtain the standard representation information.
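The conversion from a colloquial expression to standard expressions can be illustrated with a minimal sketch. It assumes a simple lookup table; the table `COLLOQUIAL_TO_STANDARD` and the function `to_standard_representation` are hypothetical names introduced here, the entries are illustrative only, and the application does not state that normalization is implemented this way.

```python
from typing import Dict, List

# Hypothetical synonym table: colloquial phrase -> standard medical expressions.
# Entries are illustrative and are not taken from any official terminology.
COLLOQUIAL_TO_STANDARD: Dict[str, List[str]] = {
    "stomach pain": ["abdominal pain", "stomach colic"],
    "tummy ache": ["abdominal pain"],
}


def to_standard_representation(user_input: str) -> List[str]:
    """Return the standard expressions whose colloquial form occurs in the input."""
    text = user_input.lower()
    standard: List[str] = []
    for phrase, expansions in COLLOQUIAL_TO_STANDARD.items():
        if phrase in text:
            for term in expansions:
                if term not in standard:  # keep first-seen order, drop duplicates
                    standard.append(term)
    return standard


print(to_standard_representation("I have had Stomach Pain since this morning"))
```

A production system would more plausibly use a trained recognizer and a maintained terminology resource; the table lookup only shows the shape of the input and output.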

Similarly, the execution body may, after obtaining the user's input information, process it locally to obtain the standard representation information corresponding to the representation information in the input information, or may directly obtain standard representation information that another, non-local terminal device has already derived from the representation information in the user's input information.

Step 202: Based on a pre-constructed medical knowledge graph, determine at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from abstract information of medical literature.

In this embodiment, the standard representation information obtained in step 201 is matched against a pre-constructed medical knowledge graph that records the correspondence between representation information and medical state entities, and the one or more medical state entities hit by the standard representation information are determined.

The correspondence between representation information and medical state entities recorded in the medical knowledge graph is extracted from the abstract information of multiple medical documents. Taking the abstract of one medical document as an example, if first representation information and a first medical state entity both appear in the abstract, a correspondence is considered to exist between the first representation information and the first medical state entity. The medical knowledge graph is obtained from the correspondences between the multiple pieces of representation information and the medical state entities that appear in the abstract information of the multiple medical documents.

It should be understood that the abstract information of a medical document may contain multiple pieces of representation information and multiple medical state entities at the same time. One piece of representation information may correspond to multiple medical state entities, and multiple pieces of representation information may correspond to the same medical state entity; such one-to-many and many-to-one correspondences are also recorded when the medical knowledge graph is generated.
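The matching in step 202 can be pictured as a lookup in a mapping from representation information to medical state entities. The following is a minimal sketch under that assumption; `MEDICAL_KG`, its entries, and `hit_entities` are hypothetical and purely illustrative, and the application does not prescribe a storage format for the graph.

```python
from typing import Dict, List, Set

# Hypothetical graph fragment: representation information -> medical state entities.
# One representation may map to several entities, and several representations
# may share an entity, as described in the text above.
MEDICAL_KG: Dict[str, Set[str]] = {
    "abdominal pain": {"gastritis", "appendicitis"},
    "fever": {"influenza", "appendicitis"},
}


def hit_entities(standard_terms: List[str], graph: Dict[str, Set[str]]) -> Set[str]:
    """Collect every medical state entity hit by any of the standard terms."""
    hits: Set[str] = set()
    for term in standard_terms:
        hits |= graph.get(term, set())  # unknown terms simply hit nothing
    return hits


print(sorted(hit_entities(["abdominal pain", "fever"], MEDICAL_KG)))
```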

Step 203: Generate a push information set based on the hit medical state entity, and send the push information set to the user.

In this embodiment, after step 202, one or more medical state entities hit by the standard representation information have been obtained from the pre-constructed medical knowledge graph. When multiple medical state entities are hit, a screening rule may be used to select those that meet the requirements, or the hit medical state entities may be sorted. A push information set is then generated from the resulting one or more medical state entities, that is, the push information set contains one or more medical state entities. The push information set is then sent to the user who provided the input information in step 201, thereby determining the final content pushed to the user, so that the user obtains push information generated from the input information.
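Sorting the hit entities and keeping a preset number of them might look like the sketch below. It deliberately swaps in a simple count-based score (how many of the user's standard terms point to each entity) as a stand-in for whatever ranking model an implementation uses; the function `build_push_set`, the parameter `top_k`, and the sample graph are hypothetical.

```python
from typing import Dict, List, Set


def build_push_set(standard_terms: List[str],
                   graph: Dict[str, Set[str]],
                   top_k: int = 3) -> List[str]:
    """Rank hit entities by how many standard terms hit them and keep top_k."""
    scores: Dict[str, int] = {}
    for term in standard_terms:
        for entity in graph.get(term, set()):
            scores[entity] = scores.get(entity, 0) + 1
    # Sort by descending score; break ties alphabetically for a stable result.
    ranked = sorted(scores, key=lambda entity: (-scores[entity], entity))
    return ranked[:top_k]


sample_graph = {
    "abdominal pain": {"gastritis", "appendicitis"},
    "fever": {"influenza", "appendicitis"},
}
print(build_push_set(["abdominal pain", "fever"], sample_graph, top_k=2))
```

The tie-break on the entity name is only there to make the output deterministic; any screening rule could replace it.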

In the method for generating push information provided by the embodiments of the present application, after the standard representation information corresponding to the representation information in the user's input information is obtained, at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from abstract information of medical literature. A push information set is generated based on the medical state entity and sent to the user. Because the push message to be pushed to the user is determined using a knowledge graph constructed from the abstract information of medical literature, the cost of determining the push message is reduced and its quality is improved.

In some optional implementations of this embodiment, FIG. 3 shows a flow 300 of the steps for determining the medical knowledge graph, which specifically includes:

Step 301: Acquire abstract text information of a plurality of medical documents to obtain a set of abstract text information.

Specifically, a large number of medical documents can be obtained from medical literature retrieval databases such as PubMed and the Chinese Biomedical Literature database, and the abstract text of each document is then extracted to obtain the abstract text information set.

The abstract text information may be in English or Chinese. English abstracts are preferred for extraction, because they avoid the word segmentation that Chinese abstract text requires, which further improves the efficiency of generating the abstract text information set.

It should be understood that, when extracting the abstract text information of the obtained medical documents, the title information of the medical documents may also be extracted for subsequent reference.

Step 302: Use an entity recognition neural network to determine the set of entities hit in the abstract text information set, wherein the entity set includes information in the abstract text information set that is related to representation information and medical state entities.

Specifically, an entity recognition neural network extracts entities from unstructured input text and, depending on business requirements, can recognize additional entity categories, such as product names, models, and prices. An entity recognition neural network, such as a Deep Web or NER (named entity recognition) network, performs entity recognition on the content of the abstract text information set to determine the entities hit in it, where the entity information includes information related to representation information and medical state entities. The entity set is finally obtained from the hit entities.

It should be understood that entities satisfying a preset rule may be selected from the hit entities, and the entity set is then obtained from the entities that satisfy the preset rule.

Step 303: Perform medical language normalization matching on the entity set to obtain a normalized entity set.

Specifically, although the entity recognition neural network can identify medical entities, the recognized entity names are not necessarily standardized (for example, a colloquial name for diarrhea is not standardized, whereas "diarrhea" is), and the recognized category is not necessarily correct (for example, the category recognized for "amoxicillin" being "drug"). After the corpus has passed through the entity recognition neural network, a candidate entity set is obtained. To normalize the entities automatically, each candidate is approximately matched against the entity names and synonyms in a medical knowledge database such as the UMLS database, and a matching entity is taken as a normalized entity. The code of the corresponding entity is then attached to the matching entity, for example, a CUI code (the UMLS Concept Unique Identifier) that uniquely identifies the normalized candidate entity, to obtain the normalized entity set. Existing methods such as key-value pair encoding can also be used for this purpose.
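The approximate search over entity names and synonyms can be sketched with the Python standard library. This is only an illustration under stated assumptions: `SYNONYM_INDEX` is a hypothetical in-memory stand-in for a terminology database, the codes in it are made-up placeholders rather than real UMLS CUIs, and string similarity via `difflib` is one possible matching strategy, not necessarily the one used in the application.

```python
import difflib
from typing import Dict, Optional, Tuple

# Hypothetical index: surface name or synonym -> (normalized name, concept code).
# The codes are placeholders, not real UMLS CUIs.
SYNONYM_INDEX: Dict[str, Tuple[str, str]] = {
    "diarrhea": ("diarrhea", "C-EX-0001"),
    "loose stools": ("diarrhea", "C-EX-0001"),
    "abdominal pain": ("abdominal pain", "C-EX-0002"),
}


def normalize_entity(name: str, cutoff: float = 0.8) -> Optional[Tuple[str, str]]:
    """Approximately match a recognized entity name against known names and synonyms.

    Returns (normalized name, code) for the closest entry at or above the
    similarity cutoff, or None when nothing is close enough.
    """
    matches = difflib.get_close_matches(
        name.lower(), list(SYNONYM_INDEX), n=1, cutoff=cutoff
    )
    if not matches:
        return None
    return SYNONYM_INDEX[matches[0]]


print(normalize_entity("diarrhoea"))     # spelling variant of a known name
print(normalize_entity("Loose stools"))  # known synonym, different casing
print(normalize_entity("xyz"))           # no sufficiently close entry
```

Keying every synonym to the same (name, code) pair is what lets different surface forms collapse onto one normalized entity.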

Step 304: Classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set.

Specifically, because medical knowledge databases do not comprehensively define the entity types used here (representation information and medical state entities), the obtained entities cannot be directly distinguished as representation information or medical state entities. The normalized entities are therefore matched against existing medical knowledge information, existing medical knowledge graphs, and the like, and are classified and labeled to determine whether each entity is representation information, a medical state entity, or neither.
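The three-way labeling described above (representation information, medical state entity, or neither) can be sketched as a membership check against reference vocabularies. The two vocabularies and the function names below are hypothetical placeholders for whatever existing medical knowledge an implementation consults.

```python
from typing import Iterable, Optional, Set, Tuple

# Hypothetical reference vocabularies drawn from existing medical knowledge.
KNOWN_REPRESENTATIONS: Set[str] = {"abdominal pain", "fever"}
KNOWN_MEDICAL_STATES: Set[str] = {"gastritis", "influenza"}


def label_entity(normalized_name: str) -> Optional[str]:
    """Label an entity as 'representation', 'medical_state', or None (neither)."""
    if normalized_name in KNOWN_REPRESENTATIONS:
        return "representation"
    if normalized_name in KNOWN_MEDICAL_STATES:
        return "medical_state"
    return None


def split_entities(normalized_entities: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Split normalized entities into the two sets used to build the graph."""
    representations: Set[str] = set()
    medical_states: Set[str] = set()
    for name in normalized_entities:
        label = label_entity(name)
        if label == "representation":
            representations.add(name)
        elif label == "medical_state":
            medical_states.add(name)
        # entities labeled None are dropped from both sets
    return representations, medical_states


print(split_entities(["fever", "influenza", "amoxicillin"]))
```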

Step 305: Obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.

Specifically, whether a correspondence exists between a medical state entity and a piece of representation information can be determined from the number of times the medical state entity appears in the abstract text information set and the number of times the medical state entity and the representation information appear together in the same medical document abstract. For example, whether a correspondence exists between medical state entity A and representation information B may be determined according to whether the probability of A appearing in the abstract text information set is close to the probability of A and B co-occurring in the abstract text information. Alternatively, a threshold condition on the number of co-occurrences may be set in advance; when the number of co-occurrences of representation information D and medical state entity C satisfies the threshold condition, a correspondence is considered to exist between D and C. The medical knowledge graph is obtained from the correspondences collected in this way.
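The threshold variant described above can be sketched as follows. Each abstract is assumed to have already been reduced to a pair of sets (the representation information and the medical state entities found in it); the function `build_graph`, the parameter `min_cooccurrence`, and the sample data are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict, List, Set, Tuple

Abstract = Tuple[Set[str], Set[str]]  # (representation info, medical state entities)


def build_graph(abstracts: List[Abstract],
                min_cooccurrence: int = 2) -> Dict[str, Set[str]]:
    """Keep a (representation, entity) edge when the pair co-occurs often enough."""
    pair_counts: Counter = Counter()
    for representations, entities in abstracts:
        # Every representation/entity pair within one abstract counts once.
        pair_counts.update(product(representations, entities))

    graph: Dict[str, Set[str]] = {}
    for (representation, entity), count in pair_counts.items():
        if count >= min_cooccurrence:
            graph.setdefault(representation, set()).add(entity)
    return graph


sample = [
    ({"fever", "cough"}, {"influenza"}),
    ({"fever"}, {"influenza", "pneumonia"}),
    ({"cough"}, {"pneumonia"}),
]
print(build_graph(sample, min_cooccurrence=2))
```

With the sample data only the pair ("fever", "influenza") co-occurs twice, so it is the only edge kept; lowering the threshold to 1 keeps every observed pair.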

In this implementation, generating the medical knowledge graph from the abstract information of medical documents first achieves wide coverage of representation information and entity information by drawing on a large number of medical documents, which avoids the narrow information coverage of prior-art methods that rely on expert knowledge to generate knowledge graphs. In addition, using abstract information condenses the medical documents, which avoids the low recognition efficiency caused by the excessive amount of content in full medical documents and improves the efficiency of generating the knowledge graph.

In some optional implementations of this embodiment, the entity recognition neural network includes a bidirectional long short-term memory network and a conditional random field.

Specifically, the bidirectional long short-term memory network (Bi-directional Long Short-Term Memory, Bi-LSTM for short) is a recurrent network. Its biggest difference from a traditional neural network is that the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous moment, so its main feature is that it can store information from the previous moment. Although an RNN (Recurrent Neural Network) can theoretically retain all preceding information, vanishing or exploding gradients occur as the number of hidden layers increases. LSTM (Long Short-Term Memory), which includes a forget gate, an input gate and an output gate, can effectively address the long-term dependency problem. To make the information expressed by the network richer and the inference more accurate, a bidirectional network structure, namely Bi-LSTM, is adopted. Bi-LSTM is formed by concatenating two LSTMs, one processing the input sequence in the forward direction and one in the reverse direction, so that both past and future features are taken into account.

The advantage of LSTM is that, through the bidirectional setting, it can learn the dependencies within the observation sequence (the input words). During training, LSTM automatically extracts features of the observation sequence according to the target (such as recognizing entities). Its disadvantage is that it cannot learn the relationships within the state sequence (the output labels). In named entity recognition tasks there are constraints between labels; for example, an E-type label (representing the beginning of an entity) will not be followed by another E-type label. Therefore, when LSTM is used for sequence labeling, it saves a great deal of complicated feature engineering but cannot learn the labeling context.

When Bi-LSTM is used for named entity recognition, its output is a score for each entity label, and the label with the highest score is selected. In some cases, however, Bi-LSTM does not produce the correct entity label, and a conditional random field (CRF for short) needs to be added. CRF combines the maximum entropy model and the hidden Markov model. It can model the hidden states and learn the characteristics of the state sequence, but its disadvantage is that sequence features must be extracted manually.

Therefore, when the bidirectional long short-term memory network and the conditional random field are used in combination, the shortcomings each has when used alone are avoided and the advantages of both are obtained, achieving higher-quality entity recognition.
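The benefit of adding a label-transition layer on top of per-token scores can be illustrated with a small Viterbi decoder. This is a hedged sketch, not the network of the application: the per-token `emissions` stand in for Bi-LSTM output scores, the `transitions` table stands in for learned CRF transition scores, and the B/I/O label scheme and all numbers are assumptions chosen for the example.

```python
def greedy_decode(emissions, labels):
    """Pick the highest-scoring label for each token independently."""
    return [max(labels, key=lambda lab: scores[lab]) for scores in emissions]


def viterbi_decode(emissions, transitions, labels):
    """Find the label sequence with the best total score, where the total
    is the sum of per-token scores plus label-to-label transition scores."""
    # best[t][label] = (best score of a path ending in label at t, previous label)
    best = [{lab: (emissions[0][lab], None) for lab in labels}]
    for t in range(1, len(emissions)):
        row = {}
        for cur in labels:
            prev, score = max(
                ((p, best[t - 1][p][0] + transitions.get((p, cur), 0.0)) for p in labels),
                key=lambda item: item[1],
            )
            row[cur] = (score + emissions[t][cur], prev)
        best.append(row)

    # Backtrack from the best final label.
    last = max(labels, key=lambda lab: best[-1][lab][0])
    path = [last]
    for t in range(len(emissions) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    return path[::-1]


labels = ["B", "I", "O"]
# Assumed scores for a two-token input.
emissions = [
    {"B": 1.0, "I": 0.0, "O": 1.2},
    {"B": 0.0, "I": 2.0, "O": 0.5},
]
# Assumed constraint: an inside label may not directly follow an outside label.
transitions = {("O", "I"): -10.0}

greedy_path = greedy_decode(emissions, labels)
crf_path = viterbi_decode(emissions, transitions, labels)
print(greedy_path, crf_path)
```

Greedy decoding yields the inconsistent sequence O, I, whereas decoding with the transition penalty yields B, I. This is the kind of labeling-context error that the combined model is meant to avoid.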

In some optional implementations of this embodiment, generating a push information set based on the medical state entity and sending the push information set to the user includes: sorting the medical state entities by using a probability graph model, and selecting a preset number of the medical state entities according to the sorting result to generate the push information set; and sending the push information set to the user.

Specifically, after the category of each entity is obtained, the correlation between entities can be measured by the number of co-occurrences between the representation information and the medical state entity, for example, using the formula:

P(Dis_i | Sym_j) = P(Dis_i, Sym_j) / P(Sym_j)

Here, P(Dis_i | Sym_j) represents the correlation probability between the representation information and the medical state entity, P(Dis_i, Sym_j) represents the probability based on the number of co-occurrences between the representation information and the medical state entity, and P(Sym_j) represents the probability of the representation information appearing. More specifically, the formula describes the contribution of the j-th representation information (Sym) to the i-th medical state entity (Dis). For example, if "cough" appears 100 times in 10 million documents and "pulmonary tuberculosis" and "cough" appear together in the same abstract (#co_occurrence) 5 times, then the contribution of cough to pulmonary tuberculosis is P(pulmonary tuberculosis | cough) = 5/100 = 0.05. To simplify the calculation, the individual pieces of representation information (symptoms) are assumed to be independent of one another, which gives a probabilistic graphical model (Naive Bayes) over the representation information. The posterior probability of the medical state entity is as follows:

P(Dis_i | Sym_j, Sym_{j+1}, ...) = P(Dis_i) · P(Dis_i | Sym_j) · P(Dis_i | Sym_{j+1}) · ...

Here, P(Dis_i) is the prior probability of the medical state entity. To prevent the result from becoming too small through repeated multiplication, the logarithm of both sides of the above formula can be taken, which turns the product into a sum, and the posterior probability of each medical state entity is then obtained.

After the posterior probability of each medical state entity is obtained, the medical state entities can be sorted by probability and a preset number of them selected to generate the push information set, so that the user receives the medical state entities with a high hit probability, further improving the quality of the push information.
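The scoring and selection described above can be sketched in a few lines. The sketch assumes the contributions P(Dis | Sym) have already been estimated from co-occurrence counts. The function names, the smoothing floor used for unseen pairs, the placeholder entity names and all numeric values other than the 5-in-100 cough example are assumptions made for illustration.

```python
import math


def contribution(cooccurrence_count, symptom_count):
    """P(Dis | Sym) estimated as #co_occurrence / #occurrences of Sym."""
    return cooccurrence_count / symptom_count


def rank_entities(priors, contributions, symptoms, top_n, floor=1e-9):
    """Score each entity as log P(Dis) + sum of log P(Dis | Sym) and
    return the top_n entity names, best first.

    floor is an assumed smoothing value for pairs never seen together,
    so that the logarithm stays defined.
    """
    scores = {}
    for dis, prior in priors.items():
        log_score = math.log(prior)
        for sym in symptoms:
            log_score += math.log(contributions.get((dis, sym), floor))
        scores[dis] = log_score
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]


# The worked example from the text: 5 co-occurrences out of 100 occurrences.
cough_to_tb = contribution(5, 100)

# Made-up priors and contributions for three placeholder entities.
priors = {"condition_a": 0.2, "condition_b": 0.5, "condition_c": 0.3}
contributions = {
    ("condition_a", "s1"): 0.4,
    ("condition_a", "s2"): 0.5,
    ("condition_b", "s1"): 0.1,
    ("condition_b", "s2"): 0.1,
    ("condition_c", "s1"): 0.3,
}
top_two = rank_entities(priors, contributions, ["s1", "s2"], top_n=2)
print(cough_to_tb, top_two)
```

Summing logarithms preserves the ordering of the products while avoiding numerical underflow when many symptoms are combined.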

In some optional implementations of this embodiment, the step of generating the standard representation information includes:

acquiring the input information of the user; identifying the representation information contained in the input information to obtain a recognition result; and determining the standard representation information based on the normalized semantics of the recognition result.

Specifically, a normalized word may be used to represent several common expressions of similar descriptions, and semantic normalization may consist of replacing the corresponding words in the input information with the normalized word, so as to unify different expressions of the same meaning. A standard representation information database can therefore be constructed in advance from existing information; after the recognition result of the user's input information is obtained, the recognition result is semantically normalized to obtain the standard representation information. This avoids the situation in which the user cannot express ideas and needs accurately in standardized descriptive language and the above-mentioned execution body consequently fails to understand them, so that push information meeting the user's needs can be generated smoothly.
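One simple way to picture the normalization step is a lookup from common expressions to a normalized word. The sketch below is only an illustration under stated assumptions: the table `STANDARD_TERMS`, its entries and the helper names are hypothetical, and a practical system would likely need fuzzier matching than an exact, case-insensitive lookup.

```python
# Hypothetical standard representation database: normalized word -> common variants.
STANDARD_TERMS = {
    "diarrhea": ["diarrhea", "loose stools", "the runs"],
    "abdominal pain": ["abdominal pain", "stomach ache", "tummy ache"],
}


def build_index(standard_terms):
    """Invert the database so that each variant maps to its normalized word."""
    index = {}
    for standard, variants in standard_terms.items():
        for variant in variants:
            index[variant.lower()] = standard
    return index


def normalize(recognized, index):
    """Return the normalized word for a recognized expression, or None if unknown."""
    return index.get(recognized.strip().lower())


INDEX = build_index(STANDARD_TERMS)
print(normalize("Loose stools", INDEX))
```

Returning None for an unknown expression leaves the caller free to decide how to handle input that the database does not cover.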

With continued reference to FIG. 4, a process 400 of another embodiment of the method for generating push information is shown, which specifically includes the following steps:

Step 401: Obtain input information of a user.

Step 402: Identify the representation information contained in the input information to obtain the identification result.

Specifically, the entity recognition neural network in the implementation manner corresponding to FIG. 3 may be used to identify the user's input information, so as to determine the representation information existing therein.

Step 403: Expand based on the normalized semantics of the recognition result to generate an expanded representation information set.

Specifically, the input information of the user rarely contains standard representation information. Therefore, based on the normalized semantic result obtained in some implementations of the embodiment shown in FIG. 2, a similarity-based expansion can be performed on that result. For example, if the normalized semantic result is "stomach cramps", it can be extended to closely related representation information of the same type according to its content, so as to obtain more reference information related to the user's input information, that is, extended representation information. This allows more medical state entities to be obtained subsequently from the resulting representation information set, thereby improving the quality of the generated push information set.

When the technical solution of determining the push set based on the probability graph model in some implementations of the embodiment shown in FIG. 2 is adopted, the probability graph model may also be used:

P(Sym_i | Sym_j, Sym_{j+1}, ...) = P(Sym_i) · P(Sym_i | Sym_j) · P(Sym_i | Sym_{j+1}) · ...

to determine the extended representation information, where P(Sym_j, Sym_{j+1} . The specific principle is similar to the process described above of determining the correspondence between representation information and medical state entities from their co-occurrence relationship, and is not repeated here.
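The same style of scoring can be applied between pieces of representation information to suggest extensions. The following sketch is an assumption-laden illustration: the count-based estimates, the function name `expand_symptoms`, the rule of discarding zero-score candidates, and every number in the sample data are invented for the example.

```python
def expand_symptoms(selected, symptom_counts, pair_counts, total_docs, top_k):
    """Score each candidate symptom as P(Sym_i) times the product of
    P(Sym_i | Sym_j) over the selected symptoms, estimated from counts,
    and return the top_k candidates with a non-zero score."""
    scores = {}
    for cand, cand_count in symptom_counts.items():
        if cand in selected:
            continue
        score = cand_count / total_docs  # P(Sym_i)
        for sym in selected:
            co = pair_counts.get(frozenset((cand, sym)), 0)
            score *= co / symptom_counts[sym]  # P(Sym_i | Sym_j)
        scores[cand] = score
    ranked = sorted((c for c in scores if scores[c] > 0), key=scores.get, reverse=True)
    return ranked[:top_k]


# Made-up counts, for illustration only.
total_docs = 1000
symptom_counts = {"diarrhea": 100, "abdominal pain": 200, "bloating": 50, "headache": 300}
pair_counts = {
    frozenset(("diarrhea", "abdominal pain")): 40,
    frozenset(("diarrhea", "bloating")): 20,
}
extended = expand_symptoms(["diarrhea"], symptom_counts, pair_counts, total_docs, top_k=2)
print(extended)
```

Candidates that never co-occur with the selected representation information receive a score of zero and are left out of the extended set.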

Step 404: Take the extended representation information in the extended representation information set as standard representation information.

Step 405: In response to determining that selection information for the standard representation information is received, use a pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.

It should be understood that there may be one or more pieces of standard representation information. When there are multiple pieces, they can be presented to the user who entered the input information, and selection information is generated from the pieces the user selects and sends back. The standard representation information included in the selection information, that is, the standard representation information chosen by the user, is then determined. Obtaining the standard representation information expected by the user through human-computer interaction better meets the user's needs and improves the quality of the subsequently generated push information.

Step 406: Generate a push information set based on the medical state entity, and send the push information set to the user.

Specifically, if the standard representation information in the selection information sent by the user in step 405 hits multiple medical state entities, a push information set is generated based on those medical state entities and sent to the user. When the push information set is generated from multiple medical state entities, the medical state entities may be sorted and screened according to predetermined rules, for example, in the implementation of the embodiment shown in FIG.

In this embodiment, some contents in steps 404 and 405 are similar to steps 202-203 in the embodiment shown in FIG. 2, and the repeated contents are not described again. In this embodiment, the recognition result of the representation information contained in the user's input information is used to determine the extended representation information set, and the standard representation information is determined from that set. The finally selected standard representation information is then determined from the result of the human-computer interaction with the user, the corresponding medical state entities are determined, and the push information set is generated and pushed to the user, so that the user is provided with push information of higher quality that is closer to the user's actual needs.
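The control flow of process 400 can be summarized in a short sketch. Everything here is hypothetical scaffolding: the function `process_400`, its callback parameters, the stub callbacks, the placeholder entity names and the choice to return an empty set when no selection is received are assumptions made for illustration, not details taken from the application.

```python
from typing import Callable, Dict, List, Set


def process_400(
    user_input: str,
    recognize: Callable[[str], List[str]],
    expand: Callable[[List[str]], List[str]],
    choose: Callable[[List[str]], List[str]],
    graph: Dict[str, Set[str]],
    rank: Callable[[Set[str]], List[str]],
    top_n: int,
) -> List[str]:
    # Steps 401-402: obtain the input and recognize representation information.
    recognized = recognize(user_input)
    # Steps 403-404: expand and treat the candidates as standard representation information.
    candidates = recognized + [c for c in expand(recognized) if c not in recognized]
    # Step 405: proceed only once selection information has been received.
    selected = choose(candidates)
    if not selected:
        return []  # assumed behavior when nothing is selected
    hits: Set[str] = set()
    for sym in selected:
        hits |= graph.get(sym, set())
    # Step 406: rank the hit entities and keep a preset number of them.
    return rank(hits)[:top_n]


# Stub callbacks and a toy graph with placeholder entity names.
toy_graph = {
    "diarrhea": {"condition_x", "condition_y"},
    "abdominal pain": {"condition_y", "condition_z"},
}
push_set = process_400(
    "continuous diarrhea from last night to this morning",
    recognize=lambda text: ["diarrhea"] if "diarrhea" in text else [],
    expand=lambda syms: ["abdominal pain", "bloating"],
    choose=lambda cands: [c for c in cands if c != "bloating"],
    graph=toy_graph,
    rank=sorted,
    top_n=2,
)
print(push_set)
```

Passing the recognition, expansion, selection and ranking steps in as callbacks keeps the sketch independent of any particular model or user interface.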

In order to deepen understanding, this application also provides a specific implementation scheme in combination with a specific application scenario. In this specific application scenario, the user's input information is "continuous diarrhea from last night to this morning", and the predetermined number of medical state entities to be extracted is three.

After the user's input information is obtained, the entity recognition neural network is used to identify it, and the representation information "diarrhea" is determined. The recognition result is then semantically normalized to "diarrhea" and expanded, and the resulting extended representation information "abdominal pain", "bloating", "indigestion" and "stomach colic" is presented to the user.

In response to receiving the user's selection information for the standard representation information "diarrhea", which includes the extended representation information "abdominal pain" and "indigestion", a pre-constructed medical knowledge graph is used to determine the medical state entities hit by "diarrhea", "abdominal pain" and "indigestion". The medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical documents.

The medical state entities hit are "irritable bowel syndrome", "lactose intolerance", "gastroparesis", "celiac disease", "gastritis" and "peptic ulcer". After the medical state entities are sorted, the order is: "gastritis", "peptic ulcer", "irritable bowel syndrome", "lactose intolerance", "celiac disease" and "gastroparesis".

Therefore, three medical state entities, namely "gastritis", "peptic ulcer" and "lactose intolerance", are extracted, a push information set is generated from them, and the push information set is pushed to the user.

It can be seen from this application scenario that, in the method for generating push information provided by the embodiments of the present application, after the standard representation information corresponding to the representation information in the user's input information is obtained, at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph. The medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical documents. A push information set is generated based on the medical state entity and sent to the user. Because the push message sent to the user is determined using a knowledge graph constructed from the abstract information of medical documents, the cost of determining the push message is reduced and the quality of the push message is improved.

As shown in FIG. 5, the apparatus 500 for generating push information in this embodiment may include: a standard representation information acquiring unit 501, configured to acquire standard representation information corresponding to the representation information in the user's input information; a medical state entity determination unit 502, configured to determine at least one medical state entity hit by the standard representation information based on a pre-constructed medical knowledge graph, wherein the medical knowledge graph records a correspondence between the representation information and the medical state entity, and the correspondence is extracted from the abstract information of medical documents; and a push information sending unit 503, configured to generate a push information set based on the medical state entity and send the push information set to the user.

In some optional implementations of this embodiment, the apparatus for generating push information further includes a medical knowledge graph determination unit, which includes: an initial information acquisition subunit, configured to acquire abstract text information of a plurality of medical documents to obtain an abstract text information set; an entity recognition subunit, configured to use an entity recognition neural network to determine the entity set hit in the abstract text information set, wherein the entity set includes the information in the abstract text information set that is related to representation information and medical state entities; a normalization matching subunit, configured to perform medical language normalization matching on the entity set to obtain a normalized entity set; a classification and labeling subunit, configured to classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and a medical knowledge graph generation subunit, configured to obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.

In some optional implementations of this embodiment, the entity recognition neural network in the entity recognition subunit includes: a bidirectional long short-term memory network and a conditional random field.

In some optional implementations of this embodiment, the push information sending unit is further configured to: sort the medical state entities by using a probability graph model, and select a preset number of the medical state entities according to the sorting result to generate a push information set; and send the push information set to the user.

In some optional implementations of this embodiment, the apparatus for generating push information further includes a standard information generating unit, which includes: an initial information acquiring subunit, configured to acquire the input information of the user; an information identifying subunit, configured to identify the representation information contained in the input information to obtain a recognition result; and a standard representation information determination subunit, configured to determine the standard representation information based on the normalized semantics of the recognition result.

In some optional implementations of this embodiment, the standard representation information determination subunit is further configured to: expand based on the normalized semantics of the recognition result to generate an extended representation information set; and take the extended representation information in the extended representation information set as the standard representation information.

In some optional implementations of this embodiment, the medical state entity determination unit is further configured to: in response to determining that selection information for the standard representation information is received, use a pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.

This embodiment exists as an apparatus embodiment corresponding to the foregoing method embodiment. For the same content, reference is made to the description of the foregoing method embodiment, which will not be repeated here. With the device for generating push information provided in the embodiment of the present application, the knowledge graph constructed based on the abstract information of medical documents is used to determine the push message to be pushed to the user, thereby reducing the cost of determining the push message and improving the quality of the push message.

FIG. 6 is a block diagram of an electronic device for the method for generating push information according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. Electronic devices may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the application described and/or claimed herein.

As shown in FIG. 6, the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other ways as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices may be connected, each providing some of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). One processor 601 is taken as an example in FIG. 6.

The memory 602 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor executes the method for generating push information provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions, and the computer instructions are used to cause the computer to execute the method for generating push information provided by the present application.

As a non-transitory computer-readable storage medium, the memory 602 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for generating push information in the embodiments of the present application (for example, the standard representation information acquiring unit 501, the medical state entity determination unit 502 and the push information sending unit 503 shown in FIG. 5). The processor 601 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions and modules stored in the memory 602, that is, it implements the method for generating push information in the above method embodiments.

The memory 602 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created by the use of the electronic device for generating push information, and the like. Additionally, the memory 602 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory 602 may optionally include memories located remotely from the processor 601, and these remote memories may be connected through a network to the electronic device for generating push information. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The electronic device for executing the method for generating push information may further include: an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603 and the output device 604 may be connected by a bus or in other ways, and connection by a bus is taken as an example in FIG. 6.

The input device 603 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device for generating push information. Examples of the input device include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, a joystick, and the like. The output device 604 may include a display device, auxiliary lighting devices (for example, LEDs), haptic feedback devices (for example, vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.

Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuit systems, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

These computer programs (also referred to as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, magnetic disks, optical disks, memories, programmable logic devices (PLDs)) for providing machine instructions and/or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (for example, a mouse or trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).

The systems and techniques described herein may be implemented on a computing system that includes back-end components (for example, as a data server), or a computing system that includes middleware components (for example, an application server), or a computing system that includes front-end components (for example, a user's computer having a graphical user interface or web browser through which the user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the Internet.

A computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.

According to the technical solutions of the embodiments of the present application, after the standard representation information corresponding to the representation information in the user's input information is obtained, at least one medical state entity hit by the standard representation information is determined based on the pre-constructed medical knowledge graph. The medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical documents. A push information set is generated based on the medical state entity and sent to the user. Because the push message sent to the user is determined using a knowledge graph constructed from the abstract information of medical documents, the cost of determining the push message is reduced and the quality of the push message is improved.

It should be understood that the various forms of flow shown above may be used with steps reordered, added or deleted. For example, the steps described in the present application can be executed in parallel, sequentially or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.

The above-mentioned specific embodiments do not constitute a limitation on the protection scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may occur depending on design requirements and other factors. Any modifications, equivalent replacements and improvements made within the spirit and principles of this application shall be included within the protection scope of this application.

Claims

A method for generating push information, comprising:

obtaining standard representation information corresponding to the representation information in the user's input information;

determining, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information; wherein the medical knowledge graph records the correspondence between the representation information and the medical state entity, and the correspondence is extracted from the abstract information of medical documents;

generating a push information set based on the hit medical state entity, and sending the push information set to the user.
The method of claim 1, wherein the medical knowledge graph is determined based on the steps of:

obtaining abstract text information of multiple medical documents to obtain an abstract text information set;

using an entity recognition neural network to determine the entity set hit in the abstract text information set; wherein the entity set includes the information in the abstract text information set that is related to representation information and medical state entities;

performing medical language normalization matching on the entity set to obtain a normalized entity set;

classifying and labeling the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set;

obtaining the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
The method of claim 2, wherein the entity recognition neural network comprises a bidirectional long short-term memory network and a conditional random field.
The method according to claim 1, wherein generating the push information set based on the hit medical state entity and sending the push information set to the user comprises:

sorting the medical state entities using a probabilistic graphical model, and selecting a preset number of the medical state entities according to the sorting result to generate the push information set; and

sending the push information set to the user.
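The sorting and selection step can be illustrated with a small sketch. The claim does not specify which probabilistic graphical model is used; the example below substitutes a naive Bayes model, one of the simplest such models, purely as an assumption. The count table `cooc`, the `prior` probabilities, and the smoothing constant `alpha` are hypothetical inputs.

```python
import math
from typing import Dict, List


def rank_entities(
    observed: List[str],
    cooc: Dict[str, Dict[str, int]],
    prior: Dict[str, float],
    top_k: int = 2,
    alpha: float = 1.0,
) -> List[str]:
    """Rank medical state entities by a naive Bayes log-score and keep top_k.

    cooc maps representation -> {state entity: co-occurrence count};
    prior maps state entity -> prior probability (must be > 0).
    """
    vocab = list(cooc)  # all known representations
    scores: Dict[str, float] = {}
    for state, p in prior.items():
        total = sum(cooc[r].get(state, 0) for r in vocab)
        log_score = math.log(p)
        for rep in observed:
            count = cooc.get(rep, {}).get(state, 0)
            # Laplace-smoothed estimate of P(representation | state).
            log_score += math.log((count + alpha) / (total + alpha * len(vocab)))
        scores[state] = log_score
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    return ranked[:top_k]
```

Here `top_k` plays the role of the preset number of medical state entities retained in the push information set.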
The method according to claim 1, wherein the generating step of the standard characterization information comprises:

Get the user's input information;

Identify the representation information contained in the input information to obtain the identification result;

The standard representation information is determined based on the normalized semantics of the recognition result.
The method according to claim 5, wherein determining the standard representation information based on the normalized semantics of the recognition result comprises:

performing expansion based on the normalized semantics of the recognition result to generate an extended representation information set; and

using the extended representation information in the extended representation information set as the standard representation information.
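One possible reading of the expansion step is sketched below, assuming expansion means adding semantically related terms to each normalized term. The `RELATED_TERMS` table and its contents are hypothetical; the source does not state how related terms are obtained.

```python
from typing import Dict, List

# Hypothetical table of semantically related terms for each normalized term.
RELATED_TERMS: Dict[str, List[str]] = {
    "fever": ["pyrexia", "elevated body temperature"],
    "cough": ["dry cough"],
}


def expand(recognized: List[str]) -> List[str]:
    """Return the extended representation set, preserving order, no duplicates."""
    expanded: List[str] = []
    for term in recognized:
        for candidate in [term, *RELATED_TERMS.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

Each element of the returned list can then be offered to the user as a candidate item of standard representation information.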
The method according to claim 6, wherein determining, based on the pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information comprises:

in response to determining that selection information for the standard representation information is received, determining, using the pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information.
An apparatus for generating push information, comprising:

a standard representation information acquisition unit, configured to acquire standard representation information corresponding to representation information in input information of a user;

a medical state entity determination unit, configured to determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records a correspondence between representation information and medical state entities, and the correspondence is extracted from abstract information of medical literature; and

a push information sending unit, configured to generate a push information set based on the hit medical state entity, and send the push information set to the user.
The apparatus of claim 8, further comprising:

a medical knowledge graph determination unit, comprising:

an initial information acquisition subunit, configured to acquire abstract text information of a plurality of medical documents to obtain an abstract text information set;

an entity recognition subunit, configured to determine, using an entity recognition neural network, an entity set hit in the abstract text information set, wherein the entity set comprises information in the abstract text information set that relates to representation information and to medical state entities;

a normalization matching subunit, configured to perform medical language normalization matching on the entity set to obtain a normalized entity set;

a classification and labeling subunit, configured to classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and

a medical knowledge graph generation subunit, configured to obtain the medical knowledge graph based on a co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
The apparatus according to claim 9, wherein the entity recognition neural network in the entity recognition subunit comprises a bidirectional long short-term memory network and a conditional random field.
The apparatus according to claim 8, wherein the push information sending unit is further configured to:

sort the medical state entities using a probabilistic graphical model, and select a preset number of the medical state entities according to the sorting result to generate the push information set; and

send the push information set to the user.
The apparatus of claim 8, further comprising:

a standard information generation unit, comprising:

an initial information acquisition subunit, configured to acquire the input information of the user;

an information identification subunit, configured to identify the representation information contained in the input information to obtain a recognition result; and

a standard representation information determination subunit, configured to determine the standard representation information based on normalized semantics of the recognition result.
The apparatus according to claim 12, wherein the standard representation information determination subunit is further configured to:

perform expansion based on the normalized semantics of the recognition result to generate an extended representation information set; and

use the extended representation information in the extended representation information set as the standard representation information.
The apparatus of claim 13, wherein the medical state entity determination unit is further configured to:

in response to determining that selection information for the standard representation information is received, determine, using the pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information.
An electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to execute the method for generating push information according to any one of claims 1-7.
A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the method for generating push information according to any one of claims 1-7.