WO2022095892A1 - Method and apparatus for generating push information - Google Patents

Publication number
WO2022095892A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
medical
entity
push
representation
Prior art date
Application number
PCT/CN2021/128398
Other languages
French (fr)
Chinese (zh)
Inventor
黄亮
李鑫
郭旭炀
康西龙
Original Assignee
北京京东拓先科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京京东拓先科技有限公司
Priority to KR1020237017518A (patent KR20230092002A)
Publication of WO2022095892A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • The present application relates to the field of artificial intelligence, specifically to the technical fields of natural language processing, knowledge graphs, and big data, and in particular to a method, apparatus, electronic device, and computer-readable storage medium for generating push information.
  • The present application provides a method, apparatus, electronic device, and storage medium for generating push information.
  • Embodiments of the present application provide a method for generating push information, including: acquiring standard representation information corresponding to representation information in user input information; determining, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical literature; and generating a push information set based on the medical state entity and sending the push information set to the user.
  • The medical knowledge graph is determined based on the following steps: acquiring abstract text information of a plurality of medical documents to obtain an abstract text information set; using an entity recognition neural network to determine the set of entities hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to the representation information and the medical state entities; performing medical language normalization matching on the entity set to obtain a canonical entity set; classifying and labeling the canonical entities in the canonical entity set to obtain a representation information set and a medical state entity set; and obtaining the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
  • The entity recognition neural network includes a bidirectional long short-term memory network and a conditional random field.
  • Generating a push information set based on the medical state entity and sending the push information set to the user includes: sorting the medical state entities using a probabilistic graph model; selecting a preset number of the medical state entities according to the sorting result to generate the push information set; and sending the push information set to the user.
  • The step of generating the standard representation information includes: acquiring user input information; identifying the representation information contained in the input information to obtain a recognition result; and determining the standard representation information based on the normalized semantics of the recognition result.
  • Determining the standard representation information based on the normalized semantics of the recognition result includes: expanding based on the normalized semantics of the recognition result to generate an extended representation information set; and using the extended representation information in the extended representation information set as the standard representation information.
  • Determining, based on the pre-constructed medical knowledge graph, the at least one medical state entity hit by the standard representation information includes: in response to determining that selection information for the standard representation information is received, using the pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
  • Embodiments of the present application provide an apparatus for generating push information, including: a standard representation information acquiring unit, configured to acquire standard representation information corresponding to representation information in user input information; a medical state entity determining unit, configured to determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical documents; and a push information sending unit, configured to generate a push information set based on the medical state entity and send the push information set to the user.
  • A medical knowledge graph determination unit is further included, which specifically includes: an initial information acquisition subunit, configured to acquire abstract text information of a plurality of medical documents to obtain an abstract text information set; an entity recognition subunit, configured to use an entity recognition neural network to determine the set of entities hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to the representation information and the medical state entities; a normalization matching subunit, configured to perform medical language normalization matching on the entity set to obtain a canonical entity set; a classification and labeling subunit, configured to classify and label the canonical entities in the canonical entity set to obtain a representation information set and a medical state entity set; and a medical knowledge graph generation subunit, configured to obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
  • The entity recognition neural network in the entity recognition subunit includes a bidirectional long short-term memory network and a conditional random field.
  • The push information sending unit is further configured to: sort the medical state entities using a probabilistic graph model; select a preset number of the medical state entities according to the sorting result to generate a push information set; and send the push information set to the user.
  • A standard information generation unit is further included, including: an initial information acquisition subunit, configured to acquire user input information; an information identification subunit, configured to identify the representation information contained in the input information to obtain a recognition result; and a standard representation information determination subunit, configured to determine the standard representation information based on the normalized semantics of the recognition result.
  • The standard representation information determination subunit is further configured to: expand based on the normalized semantics of the recognition result to generate an extended representation information set; and use the extended representation information in the extended representation information set as the standard representation information.
  • The medical state entity determining unit is further configured to: in response to determining that selection information for the standard representation information is received, use the pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
  • An embodiment of the present application provides an electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can execute the method for generating push information described in any implementation manner.
  • Embodiments of the present application provide a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the method for generating push information described in any implementation manner.
  • The present application determines, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical literature; generates a push information set based on the medical state entity; and sends the push information set to the user. Because the push messages sent to users are determined using a knowledge graph constructed from the abstract information of medical literature, the cost of determining the push messages is reduced and the quality of the push messages is improved.
  • FIG. 1 is an exemplary system architecture to which embodiments of the present application may be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for generating push information according to the present application;
  • FIG. 3 is a flowchart of an implementation of determining a medical knowledge graph in the method for generating push information according to the present application;
  • FIG. 4 is a flowchart of another embodiment of a method for generating push information according to the present application;
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for generating push information according to the present application.
  • FIG. 6 is a block diagram of an electronic device suitable for implementing the method for generating push information according to the embodiment of the present application.
  • FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method, apparatus, electronic device, and computer-readable storage medium for generating push information of the present application may be applied.
  • the system architecture 100 may include terminal devices 101 , 102 , and 103 , a network 104 and a server 105 .
  • the network 104 is a medium used to provide a communication link between the terminal devices 101 , 102 , 103 and the server 105 .
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • The user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, for example to send the user's input information.
  • Retrieval applications, such as navigation applications, encyclopedia query applications, and online consultation applications, may be installed on the terminal devices 101, 102, and 103.
  • The terminal devices 101, 102, and 103 may be hardware or software. When they are hardware, they can be various electronic devices with display screens, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.
  • When the terminal devices 101, 102, and 103 are software, they can be installed in the electronic devices listed above. They can be implemented as a plurality of software programs or software modules (for example, for sending user input information), or as a single software program or software module. There is no specific limitation here.
  • The server 105 may be a server that provides various services, for example, a server that provides retrieval services and generates push information for the terminal devices 101, 102, and 103.
  • For example, the server may: obtain standard representation information corresponding to the representation information in the user's input information; determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph is determined based on the relationship between the representation information and the medical state entities in the medical literature; and generate a push information set based on the medical state entity and send the push information set to the user.
  • the method for generating push information provided by the embodiments of the present application is generally performed by the server 105 , and accordingly, the device for generating push information is generally set in the server 105 .
  • The server may be hardware or software.
  • When the server is hardware, it can be implemented as a distributed server cluster composed of multiple servers, or as a single server.
  • When the server is software, it may be implemented as multiple software programs or software modules for providing distributed services, or as a single software program or software module. There is no specific limitation here.
  • the method for generating push information may also be executed by the terminal devices 101 , 102 and 103 , and correspondingly, the apparatus for generating push information may also be set in the terminal devices 101 , 102 and 103 .
  • the example system architecture 100 may also not include the server 105 and the network 104 .
  • The numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • FIG. 2 shows a flow 200 of an embodiment of the method for generating push information according to the present application.
  • the method for generating the push information includes the following steps:
  • Step 201: Acquire standard representation information corresponding to the representation information in the user's input information.
  • The execution body of the method for generating push information may obtain, from a local or non-local human-computer interaction device (for example, the terminal devices 101, 102, and 103 shown in FIG. 1), the user's input information and the standard representation information corresponding to the representation information in the user's input information; this is not limited in this application.
  • Representation information is the way in which information is presented in a thinking system such as the mind or a computer system, that is, a way of recording or expressing information: a formal system that can clearly express certain entities or certain types of information, together with rules explaining how the system performs its function. Representation information can therefore be understood as a symbol or signal that refers to something, that is, it stands for the relevant information of a thing when the thing itself is absent. An entity generally refers to a piece of text that has a specific meaning or strong referential value, usually including names of people, places and organizations, dates and times, proper nouns, and so on. The concept of an entity can therefore be very broad: any special text fragment required by the business can be called an entity.
  • The user's input information is usually expressed according to the user's own cognitive level and cultural background, and contains one or more pieces of representation information that carry the user's real meaning. This information is converted into standard representation information that can be identified and understood by the above-mentioned execution body.
  • Because the training model usually uses the standard forms of expression provided by the authorities in the corresponding field, the above standard representation information is a standard form of expression provided by the authorities in each field.
  • For example, if the user's input information is "stomach pain" and this content is not a standard expression in the medical field, it is converted into a standard expression in the medical field, such as "abdominal pain" or "stomach colic", to obtain the standard representation information.
  • After obtaining the user's input information, the execution body of the method for generating push information can process the input information locally to obtain the standard representation information corresponding to the representation information in the input information, or can directly obtain the standard representation information from other non-local terminals.
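  • As a minimal sketch of the conversion in step 201, assuming a simple phrase lookup table stands in for the recognition and normalization model: the table `STANDARD_TERMS`, its entries other than "stomach pain" → "abdominal pain", and the function name are hypothetical and chosen only for illustration.

```python
# Hypothetical lookup table: colloquial phrase -> standard medical expression.
# Only the "stomach pain" -> "abdominal pain" pair comes from the text above.
STANDARD_TERMS = {
    "stomach pain": "abdominal pain",
    "tummy ache": "abdominal pain",
    "runny nose": "rhinorrhea",
}


def to_standard_representation(user_input):
    """Return the standard expressions for every known phrase found in the input."""
    text = user_input.lower()
    found = []
    for phrase, standard in STANDARD_TERMS.items():
        # Substring matching is a simplification of real representation recognition.
        if phrase in text and standard not in found:
            found.append(standard)
    return found
```

  • A real system would replace the substring check with the recognition and normalization steps described later; the sketch only shows the input and output shape of step 201.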
  • Step 202: Based on a pre-constructed medical knowledge graph, determine at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical literature.
  • The standard representation information obtained in step 201 is matched against the pre-constructed medical knowledge graph, which records the correspondence between representation information and medical state entities, and the one or more medical state entities hit by the standard representation information are determined.
  • The correspondence between the representation information and the medical state entities recorded in the medical knowledge graph is extracted from the abstract information of multiple medical documents.
  • Taking the abstract of one medical document as an example, if the abstract contains first representation information and a first medical state entity, a correspondence is considered to exist between the first representation information and the first medical state entity. The medical knowledge graph is obtained from the multiple correspondences between representation information and medical state entities found in the abstract information of multiple medical documents.
  • Step 203: Generate a push information set based on the hit medical state entity, and send the push information set to the user.
  • One or more medical state entities hit by the standard representation information can be obtained according to the pre-constructed medical knowledge graph.
  • When multiple medical state entities are hit, a screening rule may be used to select the medical state entities that meet the requirements, or the hit medical state entities may be sorted. A push information set is generated from the resulting one or more medical state entities, that is, the push information set contains one or more medical state entities. The push information set is then sent to the user who entered the information in step 201 as the final push content, so that the user obtains push information generated from the input information.
  • In the method for generating push information, after the standard representation information corresponding to the representation information in the user's input information is obtained, at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph, wherein the medical knowledge graph records the correspondence between representation information and medical state entities, and the correspondence is extracted from the abstract information of medical literature. A push information set is generated based on the medical state entity and sent to the user. Because the knowledge graph constructed from the abstract information of medical literature is used to determine the push message sent to the user, the cost of determining the push message is reduced and its quality is improved.
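  • Steps 202 and 203 can be sketched as a lookup followed by a selection. In this illustration the graph is assumed to be a nested dictionary from standard representation to medical state entities with co-occurrence counts, and candidates are ranked by summed counts; the data, the `GRAPH` name, and the ranking rule are all assumptions made for the example, not the method prescribed by the application.

```python
from collections import Counter

# Hypothetical graph: standard representation -> {medical state entity: co-occurrence count}.
GRAPH = {
    "abdominal pain": {"gastritis": 9, "appendicitis": 7},
    "fever": {"appendicitis": 5, "influenza": 20},
}


def generate_push_set(standard_terms, graph, limit=2):
    """Collect every medical state entity hit by the terms and keep the top `limit`."""
    scores = Counter()
    for term in standard_terms:
        # Terms missing from the graph simply hit nothing.
        for entity, count in graph.get(term, {}).items():
            scores[entity] += count
    return [entity for entity, _ in scores.most_common(limit)]
```

  • With the sample data, the terms "abdominal pain" and "fever" hit three entities, and the two with the largest summed counts form the push information set.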
  • A process 300 of the steps for determining the medical knowledge graph is shown, which specifically includes:
  • Step 301: Acquire abstract text information of a plurality of medical documents to obtain an abstract text information set.
  • A large number of medical documents can be obtained through medical document retrieval databases such as PubMed and Chinese Biomedical Documents. After the medical documents are obtained, the abstract text information in them is extracted to obtain the abstract text information set.
  • The abstract text information can be in English or Chinese. English abstract information is preferred, because it avoids the need to segment the text content, as is required for Chinese abstract text information, and thus further improves the efficiency of generating the abstract text information set.
  • The title information of the medical documents may also be extracted for subsequent reference.
  • Step 302: Use an entity recognition neural network to determine the set of entities hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to the representation information and the medical state entities.
  • An entity recognition neural network extracts entities from unstructured input text and, depending on business requirements, can identify further categories of entities (such as product names, models, prices, etc.). An entity recognition neural network such as Deep Web or NER is used to perform entity recognition on the content of the abstract text information set and determine the entities hit in it, where the entity information includes information related to the representation information and the medical state entities. The entity set is finally obtained from the hit entities.
  • Alternatively, entities that satisfy a preset rule can be selected from the hit entities, and the entity set is then obtained from the entities that satisfy the preset rule.
  • Step 303: Perform medical language normalization matching on the entity set to obtain a canonical entity set.
  • Although the entity recognition neural network can identify medical entities, the recognized entity names are not necessarily standardized (for example, a colloquial expression for diarrhea is not standardized, whereas the term "diarrhea" is), and the category is not necessarily correct (for example, the category of "amoxicillin" identified as "drugs").
  • Therefore, after the corpus is passed through the entity recognition neural network and the entity candidate set is obtained, the entity names and their synonyms in a medical knowledge database such as the UMLS database are searched approximately, and a matching entity is a canonical entity.
  • The code of the corresponding entity is added to the matching entity. For example, the CUI code is used to uniquely identify the canonical candidate entity, and the canonical entity set is obtained.
  • Existing methods such as key-value pair encoding can be used to achieve this purpose.
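  • The approximate search in step 303 can be illustrated with the standard library's `difflib.get_close_matches`. This is only a sketch under stated assumptions: `VOCAB` is a hypothetical stand-in for a UMLS-style synonym table, the concept identifiers are placeholders rather than real CUI codes, and the similarity cutoff of 0.8 is arbitrary.

```python
import difflib

# Hypothetical synonym table: surface form -> (canonical name, placeholder concept ID).
# The IDs are placeholders, not real UMLS CUI codes.
VOCAB = {
    "diarrhea": ("diarrhea", "CUI-PLACEHOLDER-1"),
    "diarrhoea": ("diarrhea", "CUI-PLACEHOLDER-1"),
    "abdominal pain": ("abdominal pain", "CUI-PLACEHOLDER-2"),
}


def normalize_entity(candidate, vocab=VOCAB, cutoff=0.8):
    """Approximately match a recognized entity name against the synonym table.

    Returns (canonical name, concept ID) for the closest entry, or None when
    nothing is similar enough.
    """
    matches = difflib.get_close_matches(candidate.lower(), list(vocab), n=1, cutoff=cutoff)
    return vocab[matches[0]] if matches else None
```

  • Storing the table as a dictionary mirrors the key-value encoding mentioned above: each surface form is the key, and the canonical name with its code is the value.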
  • Step 304: Classify and label the canonical entities in the canonical entity set to obtain a representation information set and a medical state entity set.
  • Step 305: Obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
  • Whether a correspondence exists between representation information and a medical state entity can be determined from the number of occurrences of the medical state entity in the abstract text information set and the number of times the medical state entity and the representation information appear together in the same medical document abstract.
  • For example, whether medical state entity A corresponds to representation information B can be determined according to whether the probability of medical state entity A appearing in the abstract text information set is similar to the co-occurrence probability of medical state entity A and representation information B in the abstract text information. Alternatively, a threshold condition on the number of co-occurrences is determined in advance; when the number of co-occurrences of representation information D and medical state entity C satisfies the threshold condition, a correspondence is considered to exist between representation information D and medical state entity C. The medical knowledge graph is obtained from the collected correspondences between representation information and medical state entities.
  • Generating the medical knowledge graph from the abstract information of medical documents first achieves wide coverage of representation information and entity information through a large number of medical documents, which avoids the narrow information coverage of prior-art methods that rely on expert knowledge to generate knowledge graphs. In addition, the abstract information serves as a simplification of the medical documents, which avoids the low recognition efficiency caused by the excessive amount of content in full medical documents and improves the efficiency of generating the knowledge graph.
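  • The threshold variant of step 305 can be sketched as follows, assuming each abstract has already been reduced to a pair of sets (representation information, medical state entities) by steps 302 to 304. The function name, the input layout, and the default threshold of 2 are assumptions for illustration.

```python
from collections import defaultdict
from itertools import product


def build_graph(abstracts, min_cooccurrence=2):
    """Build a co-occurrence graph from per-abstract entity sets.

    abstracts: iterable of (representation_set, medical_state_set), one pair
    per medical document abstract.
    Returns {representation: {medical state entity: co-occurrence count}},
    keeping only pairs whose count reaches the threshold.
    """
    counts = defaultdict(int)
    for representations, states in abstracts:
        # Every representation/state pair in the same abstract co-occurs once.
        for pair in product(sorted(representations), sorted(states)):
            counts[pair] += 1

    graph = defaultdict(dict)
    for (representation, state), n in counts.items():
        if n >= min_cooccurrence:
            graph[representation][state] = n
    return dict(graph)
```

  • Because sets are used per abstract, a pair is counted at most once per document, so the count equals the number of abstracts in which both items appear.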
  • The entity recognition neural network includes a bidirectional long short-term memory network and a conditional random field.
  • The bidirectional long short-term memory network, Bi-LSTM (Bi-directional Long Short-Term Memory), is built on the LSTM (Long Short-Term Memory) network, which is a kind of RNN (Recurrent Neural Network).
  • Bi-LSTM is spliced from two LSTMs, one reading the input sequence forward and one reading it in reverse, so it takes both past and future features into account.
  • Through this bidirectional setting, the LSTM can learn the dependencies between observation sequences (input words).
  • The LSTM can automatically extract the features of observation sequences according to the target (such as recognizing entities), but its disadvantage is that it cannot learn the relationship between state sequences (output annotations). In named entity recognition tasks there are constraints between annotations; for example, an E-type annotation (representing the beginning of an entity) will not be followed by another E-type annotation. So when an LSTM solves a sequence labeling task, although it saves a great deal of complicated feature engineering, it has the disadvantage of not being able to learn the labeling context.
  • When Bi-LSTM is used for named entity recognition, its output is a score for each entity label, and the label with the highest score is selected. However, in some cases Bi-LSTM alone cannot produce the correct entity label. In that case a conditional random field, CRF (Conditional Random Field), needs to be added.
  • CRF combines the maximum entropy model and the hidden Markov model. It can model the hidden state and learn the characteristics of the state sequence, but its disadvantage is that the sequence features must be extracted manually.
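  • The benefit of adding a CRF on top of per-token label scores can be illustrated with Viterbi decoding, which picks the best label sequence under transition scores instead of the best label per token. This sketch uses hand-written numbers in place of Bi-LSTM outputs and a B/I/O tag scheme; the scores, the tag scheme, and the penalty values are assumptions for the example, not values from the application.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions: list of {tag: score}, one dict per token (stand-in for Bi-LSTM output).
    transitions: {(previous_tag, current_tag): score}; missing pairs score 0.0.
    """
    # best[tag] = (score of the best path ending in tag, that path)
    best = {tag: (emissions[0][tag], [tag]) for tag in tags}
    for scores in emissions[1:]:
        step = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[p][0] + transitions.get((p, cur), 0.0))
            total = best[prev][0] + transitions.get((prev, cur), 0.0) + scores[cur]
            step[cur] = (total, best[prev][1] + [cur])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


TAGS = ["B", "I", "O"]
# Heavily penalize label sequences that cannot occur: "I" right after "O", "B" right after "B".
TRANSITIONS = {("O", "I"): -10.0, ("B", "B"): -10.0}
# Per-token scores whose independent argmax would be the invalid sequence O, I, O.
EMISSIONS = [
    {"B": 0.9, "I": 0.1, "O": 1.0},
    {"B": 0.1, "I": 2.0, "O": 0.3},
    {"B": 0.0, "I": 0.1, "O": 1.5},
]
```

  • Choosing the highest-scoring label for each token independently gives O, I, O, which places an inside tag directly after an outside tag. Decoding with the transition penalties yields B, I, O instead, which is the kind of label-context constraint the CRF layer contributes.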
  • generating a push information set based on the medical state entity, and sending the push information set to the user includes: sorting the medical state entities by using a probability graph model, and selecting a preset number of the medical state entities according to the sorting result to generate a push information set; sending the push information set to the user.
  • the correlation between the entities can be measured by the number of co-occurrences between the representation information and the medical state entity, for example, using the formula: $P(Dis_i \mid Sym_j) = \frac{P(Dis_i, Sym_j)}{P(Sym_j)}$
  • $P(Dis_i \mid Sym_j)$ represents the correlation probability between the representation information and the medical state entity
  • $P(Dis_i, Sym_j)$ represents the probability based on the co-occurrence times between the representation information and the medical state entity
  • $P(Sym_j)$ represents the probability of the appearance of the representation information
  • $P(Dis)$ is the prior probability of the medical state entity.
  • the logarithm of both sides of the above formula can be taken, so that cumulative multiplication becomes cumulative addition, from which the posterior probability of each medical state entity is obtained.
  • After the posterior probability of each medical state entity is obtained, the medical state entities can be sorted by probability, and a preset number of medical state entities can be selected to generate a push information set, so that the user obtains the medical state entities with a high hit probability, further improving the quality of the push information.
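The ranking step can be sketched as follows. The source does not give the full scoring formula, so this sketch assumes a naive Bayes style score (log prior plus a sum of log conditional probabilities estimated from co-occurrence counts) with add-one smoothing. The function name, parameters, and smoothing constant are all hypothetical.

```python
import math


def rank_entities(cooccurrence, entity_counts, symptoms, top_k=3, alpha=1.0):
    """Rank medical state entities by an assumed naive Bayes log posterior.

    cooccurrence:  dict mapping (entity, symptom) -> co-occurrence count
    entity_counts: dict mapping entity -> occurrence count (used for the prior)
    symptoms:      representation information observed for the user
    alpha:         smoothing constant (an assumption, not from the source)
    """
    total = sum(entity_counts.values())
    vocabulary = {symptom for (_, symptom) in cooccurrence}
    scores = {}
    for entity, count in entity_counts.items():
        # Log prior of the entity.
        log_posterior = math.log(count / total)
        # Smoothed denominator for the conditional probabilities.
        denominator = sum(
            c for (e, _), c in cooccurrence.items() if e == entity
        ) + alpha * len(vocabulary)
        # Taking logs turns the product over symptoms into a sum.
        for symptom in symptoms:
            numerator = cooccurrence.get((entity, symptom), 0) + alpha
            log_posterior += math.log(numerator / denominator)
        scores[entity] = log_posterior
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]


# Made-up counts for illustration only.
EXAMPLE_ENTITY_COUNTS = {"gastritis": 10, "peptic ulcer": 10, "celiac disease": 10}
EXAMPLE_COOCCURRENCE = {
    ("gastritis", "abdominal pain"): 8,
    ("gastritis", "indigestion"): 6,
    ("peptic ulcer", "abdominal pain"): 5,
    ("peptic ulcer", "indigestion"): 2,
    ("celiac disease", "diarrhea"): 4,
}
```

Working in log space avoids numeric underflow when many pieces of representation information are combined, which is presumably why the text takes the logarithm of both sides.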
  • the step of generating the standard representation information includes:
  • the input information of the user is acquired; the representation information contained in the input information is identified to obtain a recognition result; the standard representation information is determined based on the normalized semantics of the recognition result.
  • the normalized word may be used to represent common expressions of similar descriptions, and semantic normalization may consist of replacing the template word in the input information with the normalized word, so as to unify different wordings of the same meaning. Therefore, a standard representation information database can be constructed in advance from existing information; after the recognition result of the user's input information is obtained, the recognition result is semantically normalized to obtain the standard representation information. This avoids the situation where the user cannot express ideas and needs accurately in standard descriptive language and the above-mentioned executing body therefore cannot understand them, so that push information meeting the user's needs can be generated smoothly.
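A minimal sketch of the normalization lookup described above, assuming the standard representation information database is a simple mapping from a normalized word to its known colloquial variants. The table contents and function name are hypothetical, and a real system would likely use fuzzier matching.

```python
# Hypothetical standard representation database:
# normalized word -> colloquial expressions it covers.
STANDARD_TERMS = {
    "diarrhea": {"diarrhea", "loose stools", "the runs"},
    "abdominal pain": {"abdominal pain", "stomach ache", "tummy ache"},
}

# Invert the table once so that lookups are a single dictionary access.
_VARIANT_TO_STANDARD = {
    variant: standard
    for standard, variants in STANDARD_TERMS.items()
    for variant in variants
}


def normalize(recognized_term):
    """Map a recognized expression to its normalized word, or None if unknown."""
    return _VARIANT_TO_STANDARD.get(recognized_term.strip().lower())
```

Returning `None` for unknown expressions leaves the caller free to fall back to the raw recognition result or to ask the user to rephrase.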
  • FIG. 4 shows a process 400 of another embodiment of a method for generating push information, which specifically includes the following steps:
  • Step 401 Obtain input information of a user.
  • Step 402 Identify the representation information contained in the input information to obtain the identification result.
  • the entity recognition neural network in the implementation manner corresponding to FIG. 3 may be used to identify the user's input information, so as to determine the representation information existing therein.
  • Step 403 Expand based on the normalized semantics of the recognition result to generate an expanded representation information set.
  • the probability graph model when adopting the technical solution of determining the push set based on the probability graph model in some implementations of the embodiment shown in FIG. 2, the probability graph model may also be used:
  • Step 404 take the extended representation information in the extended representation information set as standard representation information.
  • Step 405 in response to determining that selection information for the standard representation information is received, use a pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
  • there may be one piece or multiple pieces of standard representation information here. When there are multiple pieces of standard representation information, they can be presented to the user who entered the input information; based on the selection information generated when the user selects among them, the standard representation information included in the selection information, that is, the standard representation information selected by the user, is determined. Obtaining the standard representation information expected by the user through human-computer interaction better meets the user's needs and improves the quality of the subsequently generated push information.
  • Step 406 Generate a push information set based on the medical state entity, and send the push information set to the user.
  • a push information set is generated based on the multiple medical state entities, and sent to the user, wherein, based on the multiple medical state entities
  • the medical state entities may be sorted and screened according to predetermined rules, for example, in the implementation of the embodiment shown in FIG.
  • steps 404 and 405 are similar to steps 202-203 in the embodiment shown in FIG. 2 , and the repeated contents will not be repeated.
  • the identification result of the included representation information is used to determine the extended representation information set, the standard representation information is determined from the extended representation information set, and the finally selected standard representation information is then determined based on the result of the human-computer interaction with the user; the medical state entity is determined correspondingly, and the push information set is generated and pushed to the user, so as to provide the user with push information of higher quality that is closer to the user's actual needs.
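The flow of steps 401 to 406 can be summarized as a small orchestration sketch. Every callable passed in (`recognize`, `expand`, `ask_user`, `lookup`, `rank`) is a hypothetical stand-in for a component described in the text, and the function itself is only one possible arrangement.

```python
def generate_push_set(user_input, recognize, expand, ask_user, lookup, rank, top_k=3):
    """Sketch of process 400 with injected, hypothetical components.

    recognize(user_input) -> recognized representation information (step 402)
    expand(recognized)    -> list of standard representation candidates (steps 403-404)
    ask_user(candidates)  -> subset selected by the user (selection information, step 405)
    lookup(selected)      -> medical state entities hit in the knowledge graph (step 405)
    rank(entities, selected) -> entities ordered from most to least likely (step 406)
    """
    recognized = recognize(user_input)
    candidates = expand(recognized)
    selected = ask_user(candidates)
    if not selected:
        # No selection information received: nothing to push.
        return []
    entities = lookup(selected)
    return rank(entities, selected)[:top_k]
```

Passing the components in as parameters keeps the sketch independent of any particular recognizer, knowledge graph store, or ranking model.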
  • this application also provides a specific implementation scheme in combination with a specific application scenario.
  • the user's input information is "continuous diarrhea from last night to this morning", and the predetermined number of medical state entities to be extracted is three.
  • After the information input by the user is obtained, the entity recognition neural network is used to identify the user's input information and determine the representation information "diarrhea" in it; the recognition result "diarrhea" is then semantically normalized and expanded to obtain the extended representation information.
  • the extended representation information "abdominal pain", "bloating", "indigestion" and "stomach colic" is presented to the user.
  • In response to receiving the user's selection information for the standard representation information of "diarrhea", which includes the extended representation information "abdominal pain" and "indigestion", a pre-constructed medical knowledge graph is used to determine the medical state entities hit by "diarrhea", "abdominal pain" and "indigestion"; the medical knowledge graph records the correspondence between representation information and medical state entities, which is extracted from the abstract information of medical literature.
  • the medical status entities that got hits were "Irritable Bowel Syndrome”, “Lactose Intolerance”, “Gastroparesis”, “Celiac disease”, “Gastritis” and “Peptic ulcer”.
  • the sorting relationship is: “Gastritis”, “Peptic ulcer”, “Irritable bowel syndrome”, “Lactose intolerance”, “Celiac disease” and “Gastroparesis”.
  • extract three medical status entities namely "gastritis”, “peptic ulcer” and “lactose intolerance”
  • at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph, wherein the medical knowledge graph records the correspondence between the representation information and the medical state entity, and the correspondence is extracted from the abstract information of the medical literature; a push information set is generated based on the medical state entity and sent to the user. Using the knowledge graph constructed from the abstract information of medical literature to determine the push message to be pushed to the user reduces the cost of determining the push message and improves its quality.
  • the apparatus 500 for generating push information in this embodiment may include: a standard representation information acquiring unit 501, configured to acquire standard representation information corresponding to the representation information in the user's input information; a medical state entity determination unit 502, configured to determine at least one medical state entity hit by the standard representation information based on a pre-constructed medical knowledge graph, wherein the medical knowledge graph records a correspondence between the representation information and the medical state entity, and the correspondence is extracted from the abstract information of medical documents; and a push information sending unit 503, configured to generate a push information set based on the medical state entity and send the push information set to the user.
  • the apparatus for generating push information further includes a medical knowledge graph determination unit, including: an initial information acquisition subunit, configured to acquire abstract text information of a plurality of medical documents to obtain an abstract text information set; an entity identification subunit, configured to use the entity recognition neural network to determine the entity set hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to representation information and medical state entities; a normative matching subunit, configured to perform medical language normalization matching on the entity set to obtain a normalized entity set; a classification and labeling subunit, configured to classify and label the normalized entities in the normalized entity set to obtain the representation information set and the medical state entity set; and a medical knowledge graph generation subunit, configured to obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
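The last subunit, building the graph from co-occurrence, can be sketched as below. It assumes the earlier subunits have already reduced each abstract to a pair of sets (representation information, medical state entities), and it counts an edge each time a pair appears in the same abstract. The data layout and function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def build_knowledge_graph(labeled_abstracts):
    """Build a co-occurrence graph from labeled abstracts.

    labeled_abstracts: iterable of (symptoms, entities) pairs, one per abstract,
    where both elements are collections of normalized strings.
    Returns a mapping: symptom -> {entity: co-occurrence count}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for symptoms, entities in labeled_abstracts:
        # Deduplicate within an abstract so each abstract counts once per pair.
        for symptom, entity in product(set(symptoms), set(entities)):
            graph[symptom][entity] += 1
    return graph


def hit_entities(graph, symptoms):
    """Return every medical state entity linked to any of the given symptoms."""
    hits = set()
    for symptom in symptoms:
        hits.update(graph.get(symptom, {}))
    return hits


# Made-up labeled abstracts for illustration only.
EXAMPLE_ABSTRACTS = [
    ({"diarrhea", "abdominal pain"}, {"irritable bowel syndrome"}),
    ({"abdominal pain", "indigestion"}, {"gastritis"}),
    ({"abdominal pain"}, {"gastritis", "peptic ulcer"}),
]
```

The stored counts are the same co-occurrence statistics that a later ranking step could use to estimate probabilities.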
  • the entity recognition neural network in the entity recognition subunit includes: a bidirectional long short-term memory network and a conditional random field.
  • the push information sending unit is further configured to: use a probability graph model to sort the medical state entities, select a preset number of the medical state entities according to the sorting result to generate a push information set, and send the push information set to the user.
  • the apparatus for generating push information further includes a standard information generating unit, including: an initial information acquiring subunit, configured to acquire user input information; an information identifying subunit, configured to identify the representation information contained in the input information to obtain a recognition result; and a standard representation information determination subunit, configured to determine the standard representation information based on the normalized semantics of the recognition result.
  • the standard representation information determination subunit is further configured to: expand based on the normalized semantics of the recognition result to generate an extended representation information set; and use the extended representation information in the extended representation information set as the standard representation information.
  • the medical state entity determination unit is further configured to: in response to determining that selection information for the standard representation information is received, use a pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
  • This embodiment exists as an apparatus embodiment corresponding to the foregoing method embodiment. For the same content, reference is made to the description of the foregoing method embodiment, which will not be repeated here.
  • the knowledge graph constructed based on the abstract information of medical documents is used to determine the push message to be pushed to the user, thereby reducing the cost of determining the push message and improving the quality of the push message.
  • FIG. 6 is a block diagram of an electronic device for the method for generating push information according to an embodiment of the present application.
  • Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital processors, cellular phones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions are by way of example only, and are not intended to limit implementations of the application described and/or claimed herein.
  • the electronic device includes: one or more processors 601, a memory 602, and interfaces for connecting various components, including a high-speed interface and a low-speed interface.
  • the various components are interconnected using different buses and may be mounted on a common motherboard or otherwise as desired.
  • the processor may process instructions executed within the electronic device, including instructions stored in or on memory to display graphical information of the GUI on an external input/output device, such as a display device coupled to the interface.
  • multiple processors and/or multiple buses may be used with multiple memories, if desired.
  • multiple electronic devices may be connected, each providing some of the necessary operations (eg, as a server array, a group of blade servers, or a multiprocessor system).
  • a processor 601 is taken as an example in FIG. 6 .
  • the memory 602 is the non-transitory computer-readable storage medium provided by the present application.
  • the memory stores instructions executable by at least one processor, so that the at least one processor executes the method for generating push information provided by the present application.
  • the non-transitory computer-readable storage medium of the present application stores computer instructions, and the computer instructions are used to cause the computer to execute the method for generating push information provided by the present application.
  • the memory 602 can be used to store non-transitory software programs, non-transitory computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for generating push information in the embodiments of the present application (for example, the standard representation information acquisition unit 501, the medical state entity determination unit 502 and the push information sending unit 503 shown in FIG. 5).
  • the processor 601 executes various functional applications and data processing of the server by running the non-transitory software programs, instructions and modules stored in the memory 602, that is, implementing the method for generating push information in the above method embodiments.
  • the memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required by at least one function, and the storage data area may store data created by the use of the electronic device for generating push information, and the like. Additionally, the memory 602 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 602 may optionally include memory located remotely from the processor 601, and these remote memories may be connected to the electronic device for generating push information through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
  • the electronic device for executing the method for generating push information may further include: an input device 603 and an output device 604 .
  • the processor 601 , the memory 602 , the input device 603 and the output device 604 may be connected by a bus or in other ways, and the connection by a bus is taken as an example in FIG. 6 .
  • the input device 603 can receive input numerical or character information, and generate key signal input related to user settings and function control of the electronic device for generating push information, such as a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, One or more input devices such as mouse buttons, trackballs, joysticks, etc.
  • Output devices 604 may include display devices, auxiliary lighting devices (eg, LEDs), haptic feedback devices (eg, vibration motors), and the like.
  • the display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
  • Various implementations of the systems and techniques described herein can be implemented in digital electronic circuitry, integrated circuit systems, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include being implemented in one or more computer programs executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor, which can be a special-purpose or general-purpose programmable processor, can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (for example, magnetic disks, optical disks, memories, programmable logic devices (PLDs)) for providing machine instructions and/or data to a programmable processor, including machine-readable media that receive machine instructions as machine-readable signals.
  • machine-readable signal refers to any signal used to provide machine instructions and/or data to a programmable processor.
  • the systems and techniques described herein may be implemented on a computer having a display device (eg, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (eg, a mouse or trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (eg, visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input, or tactile input).
  • the systems and techniques described herein may be implemented on a computing system that includes back-end components (eg, as a data server), or a computing system that includes middleware components (eg, an application server), or a computing system that includes front-end components (eg, a user's computer having a graphical user interface or web browser through which a user may interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end components, middleware components, or front-end components.
  • the components of the system may be interconnected by any form or medium of digital data communication (eg, a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
  • a computer system can include clients and servers.
  • Clients and servers are generally remote from each other and usually interact through a communication network.
  • the relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • the medical knowledge graph records the corresponding relationship between the representation information and the medical state entity, the corresponding relationship is extracted from the abstract information of the medical literature, and the push information set is generated based on the medical state entity, and the push information set is sent to the user.

Abstract

The present application relates to the field of artificial intelligence, the technical field of natural language processing, the technical field of knowledge graph, and the technical field of big data. Disclosed are a method and apparatus for generating push information, an electronic device, and a computer readable storage medium. The specific implementation solution is as follows: acquiring standard representation information corresponding to representation information in input information of a user; determining, on the basis of a pre-constructed medical knowledge graph, at least one medical status entity hit by the standard representation information, the medical knowledge graph recording the correspondence between the representation information and the medical status entity, and the correspondence being obtained by extraction from abstract information of the medical literature; and generating a push information set on the basis of the medical status entity, sending the push information set to the user, and determining a push message pushed to the user by using a knowledge graph constructed on the basis of the abstract information of the medical literature. The cost of determining a push message is reduced while improving the quality of the push message.

Description

Method and device for generating push information
This patent application claims priority to the Chinese patent application No. 202011241131.0, filed on November 9, 2020 and entitled "Method and Device for Pushing Information", the full text of which is incorporated into this application by reference.
Technical Field
The present application relates to the field of artificial intelligence, in particular to the technical field of natural language processing, the technical field of knowledge graphs, and the technical field of big data, and in particular, to a method, apparatus, electronic device, and computer-readable storage medium for generating push information.
Background
With the development of society, in order to better meet the retrieval needs of users, approaches based on Internet big data and knowledge graphs are increasingly used to match the query information input by users and generate corresponding push messages, so as to provide retrieval services for users.
In the prior art, a knowledge graph is usually constructed based on domain expert knowledge.
Summary of the Invention
The present application provides a method, apparatus, electronic device, and storage medium for generating push information.
In a first aspect, embodiments of the present application provide a method for generating push information, including: acquiring standard representation information corresponding to representation information in a user's input information; determining, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records the correspondence between the representation information and the medical state entity, and the correspondence is extracted from the abstract information of medical literature; and generating a push information set based on the medical state entity, and sending the push information set to the user.
In some embodiments, the medical knowledge graph is determined based on the following steps: acquiring abstract text information of a plurality of medical documents to obtain an abstract text information set; using an entity recognition neural network to determine the entity set hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to the representation information and medical state entities; performing medical language normalization matching on the entity set to obtain a normalized entity set; classifying and labeling the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and obtaining the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
In some embodiments, the entity recognition neural network includes a bidirectional long short-term memory network and a conditional random field.
In some embodiments, generating a push information set based on the medical state entity and sending the push information set to the user includes: sorting the medical state entities using a probabilistic graph model, and selecting a preset number of the medical state entities according to the sorting result to generate a push information set; and sending the push information set to the user.
In some embodiments, the step of generating the standard representation information includes: acquiring the user's input information; identifying the representation information contained in the input information to obtain a recognition result; and determining the standard representation information based on the normalized semantics of the recognition result.
In some embodiments, determining the standard representation information based on the normalized semantics of the recognition result includes: expanding based on the normalized semantics of the recognition result to generate an extended representation information set; and using the extended representation information in the extended representation information set as the standard representation information.
In some embodiments, determining at least one medical state entity hit by the standard representation information based on the pre-constructed medical knowledge graph includes: in response to determining that selection information for the standard representation information is received, using the pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
In a second aspect, embodiments of the present application provide an apparatus for generating push information, including: a standard representation information acquiring unit, configured to acquire standard representation information corresponding to representation information in a user's input information; a medical state entity determining unit, configured to determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records the correspondence between the representation information and the medical state entity, and the correspondence is extracted from the abstract information of medical literature; and a push information sending unit, configured to generate a push information set based on the medical state entity and send the push information set to the user.
In some embodiments, the apparatus further includes a medical knowledge graph determination unit, which specifically includes: an initial information acquisition subunit, configured to acquire abstract text information of a plurality of medical documents to obtain an abstract text information set; an entity identification subunit, configured to use an entity recognition neural network to determine the entity set hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to the representation information and medical state entities; a normative matching subunit, configured to perform medical language normalization matching on the entity set to obtain a normalized entity set; a classification and labeling subunit, configured to classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and a medical knowledge graph generation subunit, configured to obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
In some embodiments, the entity recognition neural network in the entity identification subunit includes: a bidirectional long short-term memory network and a conditional random field.
In some embodiments, the push information sending unit is further configured to: use a probabilistic graph model to sort the medical state entities, select a preset number of the medical state entities according to the sorting result to generate a push information set, and send the push information set to the user.
In some embodiments, the apparatus further includes a standard information generation unit, including: an initial information acquisition subunit, configured to acquire the user's input information; an information identification subunit, configured to identify the representation information contained in the input information to obtain a recognition result; and a standard representation information determination subunit, configured to determine the standard representation information based on the normalized semantics of the recognition result.
In some embodiments, the standard representation information determination subunit is further configured to: expand based on the normalized semantics of the recognition result to generate an extended representation information set; and use the extended representation information in the extended representation information set as the standard representation information.
In some embodiments, the medical state entity determination unit is further configured to: in response to determining that selection information for the standard representation information is received, use a pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
In a third aspect, an embodiment of the present application provides an electronic device, including: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method for generating push information described in any implementation.
In a fourth aspect, an embodiment of the present application provides a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method for generating push information described in any implementation.
In the present application, after standard representation information corresponding to the representation information in a user's input information is acquired, at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph. The medical knowledge graph records correspondences between representation information and medical state entities, and these correspondences are extracted from abstract information of medical literature. A push information set is generated based on the medical state entity and sent to the user. Because the push messages are determined using a knowledge graph built from the abstract information of medical literature, the cost of determining the push messages is reduced while their quality is improved.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present application, nor is it intended to limit the scope of the present application. Other features of the present application will become easy to understand from the following description.
Description of drawings
The accompanying drawings are provided for a better understanding of the present solution and do not limit the present application. In the drawings:
FIG. 1 is an exemplary system architecture to which embodiments of the present application may be applied;
FIG. 2 is a flowchart of an embodiment of the method for generating push information according to the present application;
FIG. 3 is a flowchart of an implementation of determining the medical knowledge graph in the method for generating push information according to the present application;
FIG. 4 is a flowchart of another embodiment of the method for generating push information according to the present application;
FIG. 5 is a schematic structural diagram of an embodiment of the apparatus for generating push information according to the present application;
FIG. 6 is a block diagram of an electronic device suitable for implementing the method for generating push information according to the embodiments of the present application.
Detailed description
Exemplary embodiments of the present application are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 to which embodiments of the method, apparatus, electronic device, and computer-readable storage medium for generating push information of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104, for example to send the user's input information. Retrieval applications, such as navigation applications, encyclopedia query applications, and online consultation applications, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, for sending the user's input information) or as a single piece of software or software module. No specific limitation is imposed here.
The server 105 may be a server that provides various services, for example a server that provides retrieval services and generates push information for the terminal devices 101, 102, 103. For example, the server acquires standard representation information corresponding to the representation information in a user's input information; determines, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph is determined from the relationships between representation information and medical state entities in medical literature; and generates a push information set based on the medical state entity and sends the push information set to the user.
It should be noted that the method for generating push information provided by the embodiments of the present application is generally performed by the server 105, and accordingly the apparatus for generating push information is generally arranged in the server 105.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules, for example to provide distributed services, or as a single piece of software or software module. No specific limitation is imposed here.
In addition, the method for generating push information may also be performed by the terminal devices 101, 102, 103, and accordingly the apparatus for generating push information may also be arranged in the terminal devices 101, 102, 103. In this case, the exemplary system architecture 100 may omit the server 105 and the network 104.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers as required by the implementation.
With continued reference to FIG. 2, a flow 200 of an embodiment of the method for generating push information according to the present application is shown. The method for generating push information includes the following steps:
Step 201: acquire standard representation information corresponding to the representation information in the user's input information.
In this embodiment, the execution body of the method for generating push information (for example, the server 105 shown in FIG. 1) may acquire, from a local or non-local human-computer interaction device (for example, the terminal devices 101, 102, 103 shown in FIG. 1), the user's input information and the standard representation information corresponding to the representation information in that input information. The present application does not limit this.
Here, representation information is the way in which information is presented in a thinking system such as the mind or a computer system, that is, the way in which information is recorded or expressed: a formal system that can express certain entities or certain types of information clearly, together with rules explaining how that system performs its function. Representation information can therefore be understood as a symbol or signal that stands for something; when the thing itself is absent, the representation carries the relevant information about it. An entity generally refers to a text span that has a specific meaning or is strongly referential, typically including person names, place names, organization names, dates and times, and proper nouns. The notion of an entity can thus be very broad: any special text fragment required by the business can be called an entity.
It should be understood that a user's input information usually reflects the user's own level of knowledge and cultural background, and contains one or more pieces of representation information expressing what the user actually means. After this representation information is obtained, it needs to be converted into standard representation information that the execution body can recognize and understand. Because the pre-training of the execution body usually relies on the standard forms of expression provided by authoritative sources in the corresponding field, the standard representation information is likewise the standard, authoritative form of expression in each field. For example, in the medical field, if the user's input is the colloquial phrase "my stomach hurts", which is not a standard medical expression, it is converted into standard medical expressions such as abdominal pain, stomach pain, gastric colic, and paroxysmal stomach pain to obtain the standard representation information.
Likewise, the execution body of the method for generating push information may process the user's input information locally after acquiring it, so as to obtain the standard representation information corresponding to the representation information in the input information, or it may directly acquire the corresponding standard representation information that a non-local terminal device has already derived from the representation information in the user's input information.
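The conversion from a user's colloquial wording to standard representation information can be sketched as a simple phrase lookup. This is a minimal illustration only: the table contents, the phrases, and the function name `to_standard_representations` are hypothetical, and a real system would presumably use a recognition model rather than substring matching.

```python
from typing import Dict, List

# Hypothetical lookup table: colloquial phrase -> standard medical expressions.
# The entries loosely mirror the example in the text and are illustrative only.
COLLOQUIAL_TO_STANDARD: Dict[str, List[str]] = {
    "stomach hurts": ["abdominal pain", "stomach pain", "gastric colic"],
    "the runs": ["diarrhea"],
}


def to_standard_representations(user_input: str) -> List[str]:
    """Return the standard expressions for every colloquial phrase found in the input."""
    text = user_input.lower()
    standard: List[str] = []
    for phrase, expressions in COLLOQUIAL_TO_STANDARD.items():
        if phrase in text:
            for expression in expressions:
                if expression not in standard:  # keep order, drop duplicates
                    standard.append(expression)
    return standard


result = to_standard_representations("My stomach hurts since last night")
print(result)
```

Returning a list rather than a single term reflects the text's example, in which one colloquial phrase maps to several candidate standard expressions.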
Step 202: based on a pre-constructed medical knowledge graph, determine at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records correspondences between representation information and medical state entities, and the correspondences are extracted from abstract information of medical literature.
In this embodiment, the standard representation information obtained in step 201 is matched against the pre-constructed medical knowledge graph, which records the correspondences between representation information and medical state entities, to determine the one or more medical state entities hit by the standard representation information.
The correspondences between representation information and medical state entities recorded in the medical knowledge graph are extracted from the abstract information of multiple medical documents. Taking the abstract of one medical document as an example, if first representation information and a first medical state entity both appear in it, a correspondence is considered to exist between the first representation information and the first medical state entity. The medical knowledge graph is obtained from the many correspondences between representation information and medical state entities found in the abstracts of multiple medical documents.
It should be understood that the abstract of a single medical document may contain several pieces of representation information and several medical state entities at the same time. Consequently, one piece of representation information may correspond to several medical state entities, and/or several pieces of representation information may correspond to one medical state entity. These correspondences are all recorded when the medical knowledge graph is generated.
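The many-to-many lookup described above can be sketched with a plain dictionary. This is a minimal sketch, assuming the graph is held in memory as a mapping from representation information to a set of medical state entities; the graph contents and the function name `hit_entities` are hypothetical and are not medical facts taken from the application.

```python
from typing import Dict, List, Set

# Toy graph: representation information -> medical state entities it is linked to.
# One representation can map to several entities, and one entity can be reached
# from several representations. The entries are invented for illustration.
KNOWLEDGE_GRAPH: Dict[str, Set[str]] = {
    "cough": {"tuberculosis", "bronchitis"},
    "fever": {"tuberculosis", "influenza"},
}


def hit_entities(standard_representations: List[str]) -> List[str]:
    """Collect every medical state entity linked to at least one representation."""
    hits: Set[str] = set()
    for representation in standard_representations:
        hits |= KNOWLEDGE_GRAPH.get(representation, set())
    return sorted(hits)  # sorted only to make the output deterministic


print(hit_entities(["cough", "fever"]))
```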
Step 203: generate a push information set based on the hit medical state entities, and send the push information set to the user.
In this embodiment, after step 202 above, one or more medical state entities hit by the standard representation information are available from the pre-constructed medical knowledge graph. When several medical state entities are hit, those that meet the requirements may be selected according to a predetermined screening rule, and the hit medical state entities may be ranked. A push information set is then generated from the resulting one or more medical state entities, so that the push information set contains one or more medical state entities. The push information set is sent to the user who entered the information in step 201, thereby determining the final content pushed to that user, so that the user obtains push information generated from the input information.
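The screening and ranking in step 203 can be sketched as a filter followed by a sort. This is a minimal sketch under stated assumptions: the text does not specify the screening rule, so a minimum-score threshold is used here as a stand-in, and the scores, the threshold, and the function name `build_push_set` are all hypothetical.

```python
from typing import Dict, List


def build_push_set(scored_entities: Dict[str, float], min_score: float, limit: int) -> List[str]:
    """Drop entities below `min_score`, sort by score descending, keep at most `limit`."""
    kept = [(name, score) for name, score in scored_entities.items() if score >= min_score]
    # Sort by descending score; ties are broken alphabetically for determinism.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in kept[:limit]]


push_set = build_push_set(
    {"tuberculosis": 0.05, "bronchitis": 0.20, "influenza": 0.01},
    min_score=0.02,
    limit=2,
)
print(push_set)
```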
In the method for generating push information provided by the embodiments of the present application, after standard representation information corresponding to the representation information in the user's input information is acquired, at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph. The medical knowledge graph records correspondences between representation information and medical state entities, and these correspondences are extracted from abstract information of medical literature. A push information set is generated based on the medical state entity and sent to the user. Because the push messages are determined using a knowledge graph built from the abstract information of medical literature, the cost of determining the push messages is reduced while their quality is improved.
In some optional implementations of this embodiment, reference is made to FIG. 3, which shows a flow 300 of the steps for determining the medical knowledge graph, specifically including:
Step 301: acquire abstract text information of a plurality of medical documents to obtain an abstract text information set.
Specifically, a large number of medical documents can be obtained from medical literature retrieval databases such as PubMed and the Chinese biomedical literature database. After the medical documents are obtained, their abstract text information is extracted to obtain the abstract text information set.
The abstract text information may be in English or in Chinese. English abstracts are preferred, because they avoid the word segmentation that Chinese abstract text requires, which further improves the efficiency of generating the abstract text information set.
It should be understood that, when the abstract text information of the obtained medical documents is extracted, the title information of the medical documents may also be extracted for later reference.
Step 302: use an entity recognition neural network to determine the set of entities hit in the abstract text information set, wherein the entity set includes the following information from the abstract text information set: information related to representation information and medical state entities.
Specifically, an entity recognition neural network is a neural network that extracts entities from unstructured input text and that can recognize further categories of entities according to business needs, such as product names, models, and prices. An entity recognition neural network, for example Deep Web or NER, performs entity recognition on the content of the abstract text information set to determine the entities hit in the set, where the entity information includes information related to representation information and medical state entities. The entity set is then obtained from the hit entities.
It should be understood that entities satisfying a preset rule may be selected from the hit entities, and the entity set may then be obtained from the entities that satisfy the preset rule.
Step 303: perform medical-language normalization matching on the entity set to obtain a normalized entity set.
Specifically, although the entity recognition neural network can recognize medical entities, the recognized entity names are not necessarily standard (for example, a colloquial expression such as "the runs" is non-standard, whereas "diarrhea" is standard), and the category is not necessarily correct (for example, "amoxicillin" recognized under the category "drug"). After the corpus has passed through the entity recognition neural network, a set of candidate entities is obtained. To normalize the entities automatically, each entity name is searched approximately against the entity names and their synonyms in a medical knowledge database such as the UMLS library. An entity that matches is a standard entity. The code of the corresponding entity is then attached to the matched entity, for example a CUI code that uniquely identifies the standard candidate entity, yielding the normalized entity set. Other existing approaches, such as key-value pair encoding, can also be used for this purpose.
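The approximate search against entity names and synonyms can be sketched with Python's standard `difflib` module. This is a minimal sketch, not the application's matching method: the vocabulary, the similarity cutoff, and the function name `normalize_entity` are hypothetical, and the codes are placeholders rather than real UMLS CUIs.

```python
import difflib
from typing import Dict, Optional, Tuple

# Hypothetical vocabulary: surface form (name or synonym) -> (canonical name, code).
# The codes are placeholders and are NOT real UMLS CUI values.
VOCABULARY: Dict[str, Tuple[str, str]] = {
    "diarrhea": ("diarrhea", "CUI-PLACEHOLDER-1"),
    "loose stools": ("diarrhea", "CUI-PLACEHOLDER-1"),
    "abdominal pain": ("abdominal pain", "CUI-PLACEHOLDER-2"),
}


def normalize_entity(name: str, cutoff: float = 0.8) -> Optional[Tuple[str, str]]:
    """Return (canonical name, code) for the closest vocabulary entry, or None if no match."""
    matches = difflib.get_close_matches(name.lower(), list(VOCABULARY), n=1, cutoff=cutoff)
    return VOCABULARY[matches[0]] if matches else None


print(normalize_entity("diarrhoea"))     # spelling variant resolves to the canonical entry
print(normalize_entity("Loose Stools"))  # synonym resolves to the same code
print(normalize_entity("headache"))      # not in the toy vocabulary
```

Entities that return `None` would simply be left out of the normalized entity set in this sketch.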
Step 304: classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set.
Specifically, because the medical knowledge database does not comprehensively define the type of each entity (representation information or medical state entity), it cannot by itself tell whether an obtained entity is representation information or a medical state entity. The code information from the preceding step can therefore be used for matching, and the normalized entities can be classified and labeled using existing medical knowledge information, medical knowledge graphs, and the like, to decide whether each entity is representation information, a medical state entity, or neither.
Step 305: obtain the medical knowledge graph based on the co-occurrence relationships between the representation information in the representation information set and the medical state entities in the medical state entity set.
Specifically, whether a correspondence exists between a piece of representation information and a medical state entity can be determined from the number of times the medical state entity appears and the number of times the medical state entity and the representation information appear together in the same medical literature abstract, that is, from their co-occurrence in the abstract text information set. For example, whether a correspondence exists between medical state entity A and representation information B may be determined according to whether the probability of A appearing in the abstract text information set is close to the probability of A and B co-occurring in the abstract text information. Alternatively, a co-occurrence count threshold condition may be set in advance; when the number of co-occurrences of representation information D and medical state entity C satisfies the threshold condition, a correspondence is considered to exist between D and C. The medical knowledge graph is obtained from the collected correspondences between representation information and medical state entities.
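The threshold-based variant of step 305 can be sketched as a co-occurrence count over abstracts. This is a minimal sketch, assuming each abstract has already been reduced to its set of representation information and its set of medical state entities; the sample data, the threshold value, and the function name `build_graph` are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

# One abstract = (representation information found, medical state entities found).
Abstract = Tuple[Set[str], Set[str]]


def build_graph(abstracts: Iterable[Abstract], min_cooccurrence: int) -> Dict[str, List[str]]:
    """Link a representation to an entity when they co-occur in enough abstracts."""
    pair_counts: Counter = Counter()
    for representations, entities in abstracts:
        # Sets guarantee each abstract adds at most one count per pair.
        pair_counts.update(product(representations, entities))
    graph: Dict[str, List[str]] = {}
    for (representation, entity), count in sorted(pair_counts.items()):
        if count >= min_cooccurrence:
            graph.setdefault(representation, []).append(entity)
    return graph


abstracts = [
    ({"cough"}, {"tuberculosis"}),
    ({"cough", "fever"}, {"tuberculosis"}),
    ({"fever"}, {"influenza"}),
]
graph = build_graph(abstracts, min_cooccurrence=2)
print(graph)
```

With a threshold of 2, only the pair that appears in two abstracts survives; lowering the threshold to 1 keeps every observed pair.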
在本实现方式中,基于医学文献的摘要信息来生成医学知识图谱的方式,首先可以通过大量的医学文献实现广范围的表征信息和实体的信息覆盖,避免现有技术中依靠专家知识生成知识图谱方式中信息覆盖范围窄的问题,并且可以通过摘要信息实现对医学文献的简化,避免因医学文献中内容信息量过大导致的医学文献的识别效率较低的技术问题,提升生成知识图谱的效率。In this implementation, the method of generating a medical knowledge graph based on the abstract information of medical documents can firstly achieve a wide range of representation information and entity information coverage through a large number of medical documents, avoiding the need to rely on expert knowledge to generate knowledge graphs in the prior art. The problem of narrow information coverage in the method, and the simplification of medical documents can be realized through abstract information, avoiding the technical problem of low identification efficiency of medical documents caused by the excessive amount of content information in medical documents, and improving the efficiency of generating knowledge graphs .
在本实施例的一些可选实现方式中,实体识别神经网络包括:双向短期记忆网络和条件随机场。In some optional implementations of this embodiment, the entity recognition neural network includes: a bidirectional short-term memory network and a conditional random field.
具体的,双向短期记忆网络,即Bi-LSTM(Bi-directional Long Short-Term Memory,简称Bi-LSTM)记忆网络,与传统神经网络最大的不同在于隐藏层的输入不仅包含了输入层的输出,还包含了上一个时刻隐藏层的输出,其主要特点是可以存储之前时刻的信息。虽然RNN(Recurrent Neural Network,简称RNN)理论上可以保留上文的所有信息,但随着隐藏层层数的增加,存在着梯度消失或梯度爆炸的现象。LSTM(Long Short-Term Memory,简称LSTM)能有效解决长时依赖的问题,包括遗忘门、输入门和输出门。为了使网络表达的信息更丰富,推测更准确,研究采用了双向网络结构,即Bi-LSTM。Bi-LSTM由两个LSTM拼接而成包含一个正向输入序列和反向输入序列,同时考虑了过去的特征和未来的特征。Specifically, the bidirectional short-term memory network, namely Bi-LSTM (Bi-directional Long Short-Term Memory, referred to as Bi-LSTM) memory network, is the biggest difference from the traditional neural network in that the input of the hidden layer not only includes the output of the input layer, It also includes the output of the hidden layer at the previous moment, and its main feature is that it can store the information of the previous moment. Although RNN (Recurrent Neural Network, RNN for short) can theoretically retain all the above information, as the number of hidden layers increases, there is a phenomenon of gradient disappearance or gradient explosion. LSTM (Long Short-Term Memory, referred to as LSTM) can effectively solve the problem of long-term dependence, including forgetting gate, input gate and output gate. In order to make the information expressed by the network richer and the inference more accurate, the research adopts a bidirectional network structure, namely Bi-LSTM. Bi-LSTM is spliced from two LSTMs and contains a forward input sequence and a reverse input sequence, taking into account both past and future features.
LSTM的优点是能够通过双向的设置学习到观测序列(输入的字)之间的依赖,在训练过程中,LSTM能够根据目标(比如识别实体)自动提取观测序列的特征,但是缺点是无法学习到状态序列(输出的标注)之间的关系,要知道,在命名实体识别任务中,标注之间是有一定的关系的,比如E类标注(表示某实体的开头)后面不会再接一个E类标注,所以LSTM在解决序列标注任务时,虽然可以省去很繁杂的特征工程,但是也 存在无法学习到标注上下文的缺点。The advantage of LSTM is that it can learn the dependencies between observation sequences (input words) through bidirectional settings. During the training process, LSTM can automatically extract the features of observation sequences according to the target (such as recognizing entities), but the disadvantage is that it cannot be learned. The relationship between state sequences (output annotations), you must know that in named entity recognition tasks, there is a certain relationship between annotations, such as E-type annotations (representing the beginning of an entity) will not be followed by an E. Class labeling, so when LSTM solves the sequence labeling task, although it can save a lot of complicated feature engineering, it also has the disadvantage of not being able to learn the labeling context.
当用Bi-LSTM来做命名实体识别时,Bi-LSTM的输出为实体标签的分数,且选择最高分数对应的标签。然而某些时候,Bi-LSTM却不能得到真正正确的实体标签,这时候就需要加入条件随机场,即CRF(Conditional Random Field,简称CRF),CRF结合了最大熵模型和隐马尔科夫模型的特点,能对隐含状态建模,学习状态序列的特点,但它的缺点是需要手动提取序列特征。When Bi-LSTM is used for named entity recognition, the output of Bi-LSTM is the score of the entity label, and the label corresponding to the highest score is selected. However, in some cases, Bi-LSTM cannot get the real correct entity label. At this time, it is necessary to add a conditional random field, that is, CRF (Conditional Random Field, CRF for short). CRF combines the maximum entropy model and the hidden Markov model. It can model the hidden state and learn the characteristics of the state sequence, but its disadvantage is that it needs to manually extract the sequence features.
Therefore, when the bidirectional long short-term memory network and the conditional random field are used together, the drawbacks each has when used alone are avoided and the advantages of both are obtained, which enables higher-quality entity recognition.
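The division of labor described above can be illustrated with a minimal sketch of CRF-style decoding over per-token label scores. It assumes a BIO label scheme, hand-picked emission scores standing in for the Bi-LSTM output, and a single forbidden transition (O followed by I); none of these values come from the application, and the function name `viterbi_decode` is purely illustrative.

```python
NEG_INF = -1e9  # large negative score used to forbid a transition (assumed value)

def viterbi_decode(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions:   list of dicts, one per token, mapping label -> score
                 (stands in for the Bi-LSTM output).
    transitions: dict mapping (prev_label, next_label) -> score
                 (stands in for the CRF transition matrix); missing pairs score 0.
    """
    # best[t][y] is the best score of any path that ends in label y at position t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = best[-1][prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

LABELS = ["B", "I", "O"]
emissions = [
    {"B": 1.0, "I": 0.0, "O": 1.2},  # token 1: "O" scores slightly higher than "B"
    {"B": 0.1, "I": 1.5, "O": 0.3},  # token 2: "I" scores highest
]
transitions = {("O", "I"): NEG_INF}  # an entity cannot continue after "O"

greedy = [max(e, key=e.get) for e in emissions]          # per-token argmax
decoded = viterbi_decode(emissions, transitions, LABELS)
print(greedy)   # ['O', 'I'], an invalid label sequence
print(decoded)  # ['B', 'I']
```

Picking the highest score per token yields an invalid sequence, whereas decoding with transition scores recovers a consistent one; this is the label-context information that the CRF layer contributes.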
In some optional implementations of this embodiment, generating the push information set based on the medical state entities and sending the push information set to the user includes: ranking the medical state entities using a probabilistic graphical model, and selecting a preset number of the medical state entities according to the ranking result to generate the push information set; and sending the push information set to the user.
Specifically, after the entity categories are obtained, the correlation between entities can be measured by the number of co-occurrences between representation information and medical state entities, for example using the formula:
$$P(\mathrm{Dis}_i \mid \mathrm{Sym}_j) = \frac{P(\mathrm{Dis}_i, \mathrm{Sym}_j)}{P(\mathrm{Sym}_j)}$$
Here, P(Dis_i | Sym_j) denotes the correlation probability between the representation information and the medical state entity, P(Dis_i, Sym_j) denotes the probability obtained from the number of co-occurrences of the representation information and the medical state entity, and P(Sym_j) denotes the probability that the representation information occurs. More specifically, the formula describes the contribution of the j-th piece of representation information (Sym) to the i-th medical state entity (Dis). For example, if "cough" appears 100 times in 10 million documents and "pulmonary tuberculosis" and "cough" appear together in the same abstract (#co_occurrence) 5 times, then the contribution of cough to pulmonary tuberculosis is P(pulmonary tuberculosis | cough) = 5/100 = 0.05. To simplify the calculation, the symptoms are assumed to be independent of one another, which yields a probabilistic graphical model of the representation information (naive Bayes). The posterior probability of a medical state entity is then given by:
$$P(\mathrm{Dis}_i \mid \mathrm{Sym}_j, \mathrm{Sym}_{j+1}, \ldots) = P(\mathrm{Dis}_i) \cdot P(\mathrm{Dis}_i \mid \mathrm{Sym}_j) \cdot P(\mathrm{Dis}_i \mid \mathrm{Sym}_{j+1}) \cdots$$
Here, P(Dis) is the prior probability of the medical state entity. To keep the repeated multiplication from producing an excessively small result, the logarithm of both sides can be taken, which turns the product into a sum. With this formula, the posterior probability of each medical state entity can be computed when several pieces of representation information are known.
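Written out, the logarithmic form mentioned above would read as follows. This is a sketch derived directly from the product formula; the summation index k, ranging over the known pieces of representation information, is introduced here only for compactness.

$$\log P(\mathrm{Dis}_i \mid \mathrm{Sym}_j, \mathrm{Sym}_{j+1}, \ldots) = \log P(\mathrm{Dis}_i) + \sum_{k \ge j} \log P(\mathrm{Dis}_i \mid \mathrm{Sym}_k)$$

Because the logarithm is monotonically increasing, ranking the medical state entities by this sum gives the same order as ranking them by the original product.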
After the posterior probability of each medical state entity is obtained, the medical state entities can be ranked by probability and a preset number of them selected to generate the push information set, so that the user receives the medical state entities with the higher hit probabilities, which further improves the quality of the push information.
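The ranking step can be sketched as follows, under stated assumptions: the priors and conditional probabilities are made-up illustrative numbers, the smoothing constant for unseen pairs is an assumption that the application does not specify, and the helper names are hypothetical.

```python
import math

def contribution(co_occurrences, symptom_count):
    """P(Dis | Sym) estimated from counts, e.g. 5 / 100 = 0.05 for the cough example."""
    return co_occurrences / symptom_count

def rank_entities(symptoms, prior, cond, top_k, unseen=1e-6):
    """Rank medical state entities by the summed log form of the product formula.

    prior:  dict entity -> P(Dis)
    cond:   dict (entity, symptom) -> P(Dis | Sym)
    unseen: probability assumed for pairs absent from `cond` (an assumption,
            used only to avoid taking log(0)).
    """
    scores = {}
    for entity, p_entity in prior.items():
        log_score = math.log(p_entity)
        for symptom in symptoms:
            log_score += math.log(cond.get((entity, symptom), unseen))
        scores[entity] = log_score
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]

# Illustrative numbers only; they are not taken from the application.
prior = {"gastritis": 0.3, "peptic ulcer": 0.2, "celiac disease": 0.1}
cond = {
    ("gastritis", "diarrhea"): 0.05, ("gastritis", "abdominal pain"): 0.06,
    ("peptic ulcer", "diarrhea"): 0.04, ("peptic ulcer", "abdominal pain"): 0.08,
    ("celiac disease", "diarrhea"): 0.02, ("celiac disease", "abdominal pain"): 0.01,
}
top_two = rank_entities(["diarrhea", "abdominal pain"], prior, cond, top_k=2)
print(top_two)  # ['gastritis', 'peptic ulcer']
```

Summing logarithms instead of multiplying probabilities keeps the scores in a numerically comfortable range when many pieces of representation information are supplied.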
In some optional implementations of this embodiment, the step of generating the standard representation information includes:
acquiring the user's input information; identifying the representation information contained in the input information to obtain a recognition result; and determining the standard representation information based on the normalized semantics of the recognition result.
Specifically, a normalized word is a common form used to represent several similar descriptions, and semantic normalization replaces the template words in an entry with the normalized word so that different expressions with the same meaning take a uniform form. A standard representation information database can therefore be built in advance from existing information. After the recognition result of the user's input information is obtained, the recognition result is semantically normalized to obtain the standard representation information. This prevents the situation in which the user cannot state ideas and needs in standardized descriptive language and the above-mentioned execution body consequently fails to understand them, so that push information meeting the user's needs can be generated smoothly.
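One simple way to picture semantic normalization is a lookup from colloquial variants to a standard term. The sketch below assumes a small hand-written table; the table contents, the name `STANDARD_TERMS` and the function `normalize` are hypothetical, and a production system would more plausibly use a curated medical vocabulary or a learned matcher.

```python
# Hypothetical standard representation database: standard term -> known variants.
STANDARD_TERMS = {
    "diarrhea": ["loose bowels", "the runs", "watery stool"],
    "abdominal pain": ["tummy ache", "belly pain"],
}

# Reverse index built once: variant -> standard term.
VARIANT_TO_STANDARD = {
    variant: standard
    for standard, variants in STANDARD_TERMS.items()
    for variant in variants
}

def normalize(mention):
    """Map a recognized mention to its standard term, or None if it is unknown."""
    key = mention.strip().lower()
    if key in STANDARD_TERMS:  # already a standard term
        return key
    return VARIANT_TO_STANDARD.get(key)

print(normalize("The runs "))    # diarrhea
print(normalize("tummy ache"))   # abdominal pain
print(normalize("sore elbow"))   # None
```

Returning `None` for unknown mentions leaves it to the caller to decide whether to fall back to the raw mention or to ask the user for clarification.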
Continuing with reference to FIG. 4, a flow 400 of another embodiment of the method for generating push information is shown, which specifically includes the following steps:
Step 401: acquire the user's input information.
Step 402: identify the representation information contained in the input information to obtain a recognition result.
Specifically, the entity recognition neural network of the implementation corresponding to FIG. 3 above may be used to recognize the user's input information and determine the representation information it contains.
Step 403: perform expansion based on the normalized semantics of the recognition result to generate an extended representation information set.
Specifically, because the user's input information rarely contains standard representation information, once the normalized semantic result has been obtained as in some implementations of the embodiment shown in FIG. 2, a similarity-based expansion can be performed on that result. For example, if the normalized semantic result is "stomach colic", it can be expanded to the closely related "paroxysmal stomach pain" of the same type. This yields more reference information related to the user's input, namely the extended representation information, so that the representation information set built from it can later hit more medical state entities and the quality of the generated push information set is improved.
When the technical solution of determining the push set based on a probabilistic graphical model, as in some implementations of the embodiment shown in FIG. 2, is adopted, a probabilistic graphical model may also be used here:
$$P(\mathrm{Sym}_i \mid \mathrm{Sym}_j) = \frac{P(\mathrm{Sym}_i, \mathrm{Sym}_j)}{P(\mathrm{Sym}_j)}$$

$$P(\mathrm{Sym}_i \mid \mathrm{Sym}_j, \mathrm{Sym}_{j+1}, \ldots) = P(\mathrm{Sym}_i) \cdot P(\mathrm{Sym}_i \mid \mathrm{Sym}_j) \cdot P(\mathrm{Sym}_i \mid \mathrm{Sym}_{j+1}) \cdots$$
to determine the extended representation information, where P(Sym_j), P(Sym_{j+1}), ... denote the probabilities with which the different pieces of representation information occur. In other words, whether different pieces of representation information are correlated is computed from how often they co-occur. The principle is similar to the process described above of determining the correspondence between representation information and medical state entities from their co-occurrence relationship, and is not repeated here.
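A minimal sketch of this co-occurrence-based expansion is given below. It assumes that each abstract has already been reduced to a set of normalized symptom terms; the toy corpus and the function names are illustrative only and do not come from the application.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_probs(docs):
    """Estimate P(a | b) = count(a and b in the same abstract) / count(b)."""
    single = Counter()
    pair = Counter()
    for terms in docs:
        single.update(terms)
        for a, b in combinations(sorted(terms), 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    return {(a, b): n / single[b] for (a, b), n in pair.items()}

def expand(observed, probs, top_k):
    """Return the top_k terms most likely to accompany the observed term."""
    candidates = {a: p for (a, b), p in probs.items() if b == observed}
    return sorted(candidates, key=candidates.get, reverse=True)[:top_k]

# Toy corpus: each set holds the symptom terms found in one abstract.
docs = [
    {"diarrhea", "abdominal pain"},
    {"diarrhea", "abdominal pain"},
    {"diarrhea", "bloating"},
    {"headache"},
]
probs = cooccurrence_probs(docs)
print(expand("diarrhea", probs, top_k=2))  # ['abdominal pain', 'bloating']
```

In this toy corpus "abdominal pain" accompanies "diarrhea" in two of three abstracts and "bloating" in one of three, which determines the order of the expansion.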
Step 404: take the extended representation information in the extended representation information set as the standard representation information.
Step 405: in response to determining that selection information for the standard representation information has been received, use the pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
It should be understood that the standard representation information here may consist of a single item or of several. When there are several items, they can be presented to the user who entered the input information. The selection information that the user generates by choosing among them is then received, and the standard representation information contained in the selection information, that is, the standard representation information chosen by the user, is determined. The standard representation information that the user expects is thus obtained through human-computer interaction. Standard representation information chosen by the user better matches the user's needs, which improves the quality of the push information generated subsequently.
Step 406: generate a push information set based on the medical state entities, and send the push information set to the user.
Specifically, if the standard representation information in the selection information sent by the user in step 405 hits multiple medical state entities, the push information set is generated from those medical state entities and sent to the user. When the push information set is generated from multiple medical state entities, the medical state entities can be ranked and filtered according to predetermined rules, for example by the probability that the representation information hits each medical state entity, as in the implementation of the embodiment shown in FIG. 2.
In this embodiment, parts of steps 404 and 405 are similar to steps 202-203 of the embodiment shown in FIG. 2, and the repeated content is not described again. In this embodiment, after the user's input information is acquired, the extended representation information set is determined from the recognition result of the representation information contained in the input information, and the standard representation information is determined from the extended representation information set. The standard representation information finally used is then determined from the result of human-computer interaction with the user, the corresponding medical state entities are determined, and a push information set is generated and pushed to the user, so that the user receives push information that is of higher quality and closer to the user's actual needs.
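The overall control flow of steps 401 to 406 can be sketched as a single function that receives each stage as a callable. Every callable name below (`recognize`, `normalize`, `expand`, `choose`, `lookup`, `rank`) is a hypothetical placeholder for the corresponding component, and the stub implementations exist only to make the sketch runnable.

```python
def generate_push_set(user_input, recognize, normalize, expand, choose, lookup, rank, top_k):
    """Sketch of flow 400; each stage is passed in as a callable."""
    mentions = recognize(user_input)                 # step 402: entity recognition
    standard = [normalize(m) for m in mentions]      # normalized semantics
    candidates = list(standard)
    for term in standard:                            # step 403: expansion
        candidates.extend(t for t in expand(term) if t not in candidates)
    selected = choose(candidates)                    # steps 404-405: user selection
    if not selected:
        return []
    entities = lookup(selected)                      # step 405: knowledge graph hits
    return rank(entities, selected)[:top_k]          # step 406: rank and truncate

# Stub components with made-up behavior, for illustration only.
result = generate_push_set(
    "loose bowels since last night",
    recognize=lambda text: ["loose bowels"] if "loose bowels" in text else [],
    normalize={"loose bowels": "diarrhea"}.get,
    expand=lambda term: ["abdominal pain", "bloating"],
    choose=lambda shown: [t for t in shown if t in ("diarrhea", "abdominal pain")],
    lookup=lambda terms: {"gastritis", "peptic ulcer", "celiac disease"},
    rank=lambda entities, terms: sorted(entities),   # alphabetical stand-in for ranking
    top_k=2,
)
print(result)  # ['celiac disease', 'gastritis']
```

Passing the stages in as parameters keeps the sketch independent of any particular recognizer, knowledge graph store or ranking model.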
To aid understanding, the present application also provides a concrete implementation in a specific application scenario. In this scenario, the user's input information is "I have had loose bowels continuously from last night to this morning", and the predetermined number of medical state entities to extract is three.
After the user's input information is acquired, the entity recognition neural network is used to recognize it and determine the representation information it contains, "loose bowels". The recognition result is then semantically normalized to "diarrhea" and expanded to obtain the extended representation information "abdominal pain", "bloating", "indigestion" and "stomach colic", which is presented to the user.
In response to the user's selection information for the standard representation information "diarrhea", where the selection information also includes the extended representation information "abdominal pain" and "indigestion", the pre-constructed medical knowledge graph is used to determine the medical state entities hit by "diarrhea", "abdominal pain" and "indigestion". The medical knowledge graph records the correspondence between representation information and medical state entities, and this correspondence is extracted from the abstract information of medical literature.
The medical state entities hit are "irritable bowel syndrome", "lactose intolerance", "gastroparesis", "celiac disease", "gastritis" and "peptic ulcer". After these medical state entities are ranked using the probabilistic graphical model, the resulting order is: "gastritis", "peptic ulcer", "irritable bowel syndrome", "lactose intolerance", "celiac disease" and "gastroparesis".
Therefore, the three top-ranked medical state entities, namely "gastritis", "peptic ulcer" and "irritable bowel syndrome", are extracted to generate the push information set, which is pushed to the user.
As this application scenario shows, in the method for generating push information provided by the embodiments of the present application, after the standard representation information corresponding to the representation information in the user's input information is acquired, at least one medical state entity hit by the standard representation information is determined based on a pre-constructed medical knowledge graph. The medical knowledge graph records the correspondence between representation information and medical state entities, and this correspondence is extracted from the abstract information of medical literature. A push information set is generated based on the medical state entities and sent to the user. Because the push messages are determined with a knowledge graph built from the abstract information of medical literature, the cost of determining the push messages is reduced while their quality is improved.
As shown in FIG. 5, the apparatus 500 for generating push information of this embodiment may include: a standard representation information acquiring unit 501, configured to acquire standard representation information corresponding to the representation information in the user's input information; a medical state entity determining unit 502, configured to determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, where the medical knowledge graph records the correspondence between representation information and medical state entities and the correspondence is extracted from the abstract information of medical literature; and a push information sending unit 503, configured to generate a push information set based on the medical state entities and send the push information set to the user.
In some optional implementations of this embodiment, the apparatus for generating push information further includes a medical knowledge graph determining unit, which includes: an initial information acquiring subunit, configured to acquire the abstract text information of a plurality of medical documents to obtain an abstract text information set; an entity recognition subunit, configured to use an entity recognition neural network to determine the entity set hit in the abstract text information set, where the entity set includes the information in the abstract text information set that relates to representation information and medical state entities; a normalization matching subunit, configured to perform medical-language normalization matching on the entity set to obtain a normalized entity set; a classification labeling subunit, configured to classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and a medical knowledge graph generating subunit, configured to obtain the medical knowledge graph based on the co-occurrence relationship between the representation information in the representation information set and the medical state entities in the medical state entity set.
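The last subunit, which derives the graph from co-occurrence, can be sketched as below. The sketch assumes that recognition, normalization and classification have already produced, for each abstract, a list of symptom terms and a list of disease terms; the dictionary layout and the function names are assumptions made for illustration.

```python
from collections import defaultdict

def build_graph(abstracts):
    """Build symptom -> {disease: co-occurrence count} from annotated abstracts."""
    graph = defaultdict(lambda: defaultdict(int))
    for entities in abstracts:
        for symptom in entities["symptoms"]:
            for disease in entities["diseases"]:
                graph[symptom][disease] += 1
    return graph

def hit_entities(graph, symptoms):
    """Return every disease linked to at least one of the given symptoms."""
    hits = set()
    for symptom in symptoms:
        hits.update(graph.get(symptom, {}))
    return hits

# Toy annotated abstracts; the contents are made up.
abstracts = [
    {"symptoms": ["cough"], "diseases": ["pulmonary tuberculosis"]},
    {"symptoms": ["cough", "fever"], "diseases": ["pneumonia"]},
    {"symptoms": ["cough"], "diseases": ["pulmonary tuberculosis"]},
]
graph = build_graph(abstracts)
print(graph["cough"]["pulmonary tuberculosis"])   # 2
print(sorted(hit_entities(graph, ["fever"])))     # ['pneumonia']
```

The stored counts are exactly the co-occurrence numbers that the ranking formula needs, so the same structure can serve both the lookup and the ranking.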
In some optional implementations of this embodiment, the entity recognition neural network in the entity recognition subunit includes a bidirectional long short-term memory network and a conditional random field.
In some optional implementations of this embodiment, the push information sending unit is further configured to: rank the medical state entities using a probabilistic graphical model, and select a preset number of the medical state entities according to the ranking result to generate the push information set; and send the push information set to the user.
In some optional implementations of this embodiment, the apparatus for generating push information further includes a standard information generating unit, which includes: an initial information acquiring subunit, configured to acquire the user's input information; an information recognition subunit, configured to identify the representation information contained in the input information to obtain a recognition result; and a standard representation information determining subunit, configured to determine the standard representation information based on the normalized semantics of the recognition result.
In some optional implementations of this embodiment, the standard representation information determining subunit is further configured to: perform expansion based on the normalized semantics of the recognition result to generate an extended representation information set; and take the extended representation information in the extended representation information set as the standard representation information.
In some optional implementations of this embodiment, the medical state entity determining unit is further configured to: in response to determining that selection information for the standard representation information has been received, use the pre-constructed medical knowledge graph to determine at least one medical state entity hit by the standard representation information.
This embodiment is the apparatus embodiment corresponding to the method embodiments above. For content that is the same, refer to the description of the method embodiments; it is not repeated here. The apparatus for generating push information provided by the embodiments of the present application uses a knowledge graph built from the abstract information of medical literature to determine the push messages sent to the user, which reduces the cost of determining the push messages while improving their quality.
FIG. 6 is a block diagram of an electronic device for the method for generating push information according to an embodiment of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present application described and/or claimed herein.
As shown in FIG. 6, the electronic device includes one or more processors 601, a memory 602, and interfaces for connecting the components, including high-speed interfaces and low-speed interfaces. The components are interconnected by different buses and may be mounted on a common motherboard or in other ways as required. The processor can process instructions executed within the electronic device, including instructions stored in or on the memory for displaying graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses may be used together with multiple memories, if required. Likewise, multiple electronic devices may be connected, with each device providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). FIG. 6 takes one processor 601 as an example.
The memory 602 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor executes the method for generating push information provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the method for generating push information provided by the present application.
As a non-transitory computer-readable storage medium, the memory 602 can be used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the method for generating push information in the embodiments of the present application (for example, the standard representation information acquiring unit 501, the medical state entity determining unit 502 and the push information sending unit 503 shown in FIG. 5). By running the non-transitory software programs, instructions and modules stored in the memory 602, the processor 601 executes the various functional applications and data processing of the server, that is, implements the method for generating push information of the method embodiments above.
The memory 602 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required by at least one function; the data storage area may store data created through use of the electronic device for generating push information, and the like. In addition, the memory 602 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device or other non-transitory solid-state storage device. In some embodiments, the memory 602 optionally includes memories located remotely from the processor 601, and these remote memories may be connected over a network to the electronic device for generating push information. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The electronic device for executing the method for generating push information may further include an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603 and the output device 604 may be connected by a bus or in other ways; FIG. 6 takes connection by a bus as an example.
The input device 603 can receive input numeric or character information and generate key signal inputs related to the user settings and function control of the electronic device for generating push information. Examples of such input devices include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball and a joystick. The output device 604 may include a display device, an auxiliary lighting device (for example, an LED), a haptic feedback device (for example, a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, which can receive data and instructions from a storage system, at least one input device and at least one output device, and transmit data and instructions to the storage system, the at least one input device and the at least one output device.
These computer programs (also called programs, software, software applications or code) include machine instructions for a programmable processor and can be implemented in high-level procedural and/or object-oriented programming languages and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, device and/or apparatus (for example, a magnetic disk, an optical disc, a memory or a programmable logic device (PLD)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as machine-readable signals. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide interaction with a user, the systems and techniques described herein can be implemented on a computer that has a display device for displaying information to the user (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor), and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual feedback, auditory feedback or tactile feedback), and input from the user can be received in any form (including acoustic input, voice input or tactile input).
The systems and techniques described herein can be implemented in a computing system that includes a back-end component (for example, as a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer with a graphical user interface or a web browser through which the user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware or front-end components. The components of the system can be interconnected by digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN) and the Internet.
计算机系统可以包括客户端和服务器。客户端和服务器一般远离彼此并且通常通过通信网络进行交互。通过在相应的计算机上运行并且彼此具有客户端-服务器关系的计算机程序来产生客户端和服务器的关系。A computer system can include clients and servers. Clients and servers are generally remote from each other and usually interact through a communication network. The relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
According to the technical solutions of the embodiments of the present application, standard representation information corresponding to the representation information in a user's input information is first obtained. Then, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information is determined, where the medical knowledge graph records the correspondences between representation information and medical state entities, and these correspondences are extracted from the abstract information of medical literature. A push information set is generated based on the hit medical state entity and sent to the user. Because the push messages are determined using a knowledge graph built from the abstract information of medical literature, the cost of determining push messages is reduced while their quality is improved.
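The flow summarized above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `KNOWLEDGE_GRAPH`, the `SYNONYMS` table, the function names, and the weights are hypothetical and do not come from the application. In particular, the ranking below simply sums co-occurrence weights; it is a stand-in, not the probabilistic graphical model mentioned in the claims.

```python
from collections import defaultdict

# Hypothetical toy graph: standard representation information ->
# {medical state entity: co-occurrence weight}. Values are invented.
KNOWLEDGE_GRAPH = {
    "headache": {"migraine": 12, "tension-type headache": 9, "influenza": 4},
    "fever": {"influenza": 15, "pneumonia": 7},
}

# Hypothetical synonym table mapping user wording to standard representations.
SYNONYMS = {
    "head hurts": "headache",
    "headache": "headache",
    "high temperature": "fever",
    "fever": "fever",
}


def standardize(user_input):
    """Return the standard representations whose synonyms appear in the input."""
    text = user_input.lower()
    found = []
    for phrase, standard in SYNONYMS.items():
        if phrase in text and standard not in found:
            found.append(standard)
    return found


def generate_push_set(user_input, top_n=2):
    """Look up hit medical state entities and keep the top_n by summed weight."""
    scores = defaultdict(int)
    for standard in standardize(user_input):
        for entity, weight in KNOWLEDGE_GRAPH.get(standard, {}).items():
            scores[entity] += weight
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [entity for entity, _ in ranked[:top_n]]
```

In this sketch, an entity hit by several standard representations accumulates weight from each of them, so it tends to rank higher than an entity hit by only one.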
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application can be executed in parallel, sequentially, or in a different order; as long as the desired result of the technical solutions disclosed in the present application can be achieved, no limitation is imposed herein.
The above specific embodiments do not limit the protection scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its protection scope.

Claims (16)

  1. A method for generating push information, comprising:
    obtaining standard representation information corresponding to representation information in input information of a user;
    determining, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records correspondences between representation information and medical state entities, and the correspondences are extracted from abstract information of medical literature; and
    generating a push information set based on the hit medical state entity, and sending the push information set to the user.
  2. The method according to claim 1, wherein the medical knowledge graph is determined by the following steps:
    obtaining abstract text information of a plurality of medical documents to obtain an abstract text information set;
    determining, using an entity recognition neural network, an entity set hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to representation information and medical state entities;
    performing medical language normalization matching on the entity set to obtain a normalized entity set;
    classifying and labeling the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and
    obtaining the medical knowledge graph based on co-occurrence relationships between the representation information in the representation information set and the medical state entities in the medical state entity set.
  3. The method according to claim 2, wherein the entity recognition neural network comprises: a bidirectional short-term memory network and a conditional random field.
  4. The method according to claim 1, wherein generating the push information set based on the medical state entity and sending the push information set to the user comprises:
    ranking the medical state entities using a probabilistic graphical model, and selecting a preset number of the medical state entities according to the ranking result to generate the push information set; and
    sending the push information set to the user.
  5. The method according to claim 1, wherein the standard representation information is generated by:
    obtaining the input information of the user;
    recognizing the representation information contained in the input information to obtain a recognition result; and
    determining the standard representation information based on normalized semantics of the recognition result.
  6. The method according to claim 5, wherein determining the standard representation information based on the normalized semantics of the recognition result comprises:
    performing expansion based on the normalized semantics of the recognition result to generate an extended representation information set; and
    using the extended representation information in the extended representation information set as the standard representation information.
  7. The method according to claim 6, wherein determining, based on the pre-constructed medical knowledge graph, the at least one medical state entity hit by the standard representation information comprises:
    in response to determining that selection information for the standard representation information is received, determining, using the pre-constructed medical knowledge graph, the at least one medical state entity hit by the standard representation information.
  8. An apparatus for generating push information, comprising:
    a standard representation information obtaining unit, configured to obtain standard representation information corresponding to representation information in input information of a user;
    a medical state entity determination unit, configured to determine, based on a pre-constructed medical knowledge graph, at least one medical state entity hit by the standard representation information, wherein the medical knowledge graph records correspondences between representation information and medical state entities, and the correspondences are extracted from abstract information of medical literature; and
    a push information sending unit, configured to generate a push information set based on the hit medical state entity, and send the push information set to the user.
  9. The apparatus according to claim 8, further comprising:
    a medical knowledge graph determination unit, comprising:
    an initial information obtaining subunit, configured to obtain abstract text information of a plurality of medical documents to obtain an abstract text information set;
    an entity recognition subunit, configured to determine, using an entity recognition neural network, an entity set hit in the abstract text information set, wherein the entity set includes the following information in the abstract text information set: information related to representation information and medical state entities;
    a normalization matching subunit, configured to perform medical language normalization matching on the entity set to obtain a normalized entity set;
    a classification and labeling subunit, configured to classify and label the normalized entities in the normalized entity set to obtain a representation information set and a medical state entity set; and
    a medical knowledge graph generation subunit, configured to obtain the medical knowledge graph based on co-occurrence relationships between the representation information in the representation information set and the medical state entities in the medical state entity set.
  10. The apparatus according to claim 9, wherein the entity recognition neural network in the entity recognition subunit comprises: a bidirectional short-term memory network and a conditional random field.
  11. The apparatus according to claim 8, wherein the push information sending unit is further configured to: rank the medical state entities using a probabilistic graphical model, and select a preset number of the medical state entities according to the ranking result to generate the push information set; and
    send the push information set to the user.
  12. The apparatus according to claim 8, further comprising:
    a standard information generation unit, comprising:
    an initial information obtaining subunit, configured to obtain the input information of the user;
    an information recognition subunit, configured to recognize the representation information contained in the input information to obtain a recognition result; and
    a standard representation information determination subunit, configured to determine the standard representation information based on normalized semantics of the recognition result.
  13. The apparatus according to claim 12, wherein the standard representation information determination subunit is further configured to:
    perform expansion based on the normalized semantics of the recognition result to generate an extended representation information set; and
    use the extended representation information in the extended representation information set as the standard representation information.
  14. The apparatus according to claim 13, wherein the medical state entity determination unit is further configured to:
    in response to determining that selection information for the standard representation information is received, determine, using the pre-constructed medical knowledge graph, the at least one medical state entity hit by the standard representation information.
  15. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method for generating push information according to any one of claims 1-7.
  16. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method for generating push information according to any one of claims 1-7.
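The graph-construction steps recited in claim 2 can be pictured with a short Python sketch. It rests on loud assumptions: a hypothetical `LEXICON` dictionary stands in for the entity recognition neural network, the medical language normalization matching, and the classification labeling, all at once. A real system would replace that lookup with trained models. Only the final co-occurrence counting is meant to mirror the claim language, and the sample abstracts are invented.

```python
from collections import Counter
from itertools import product

# Hypothetical lexicon: surface form -> (normalized entity, label).
# It stands in for recognition, normalization, and labeling; an assumption only.
LEXICON = {
    "cephalalgia": ("headache", "representation"),
    "headache": ("headache", "representation"),
    "pyrexia": ("fever", "representation"),
    "fever": ("fever", "representation"),
    "migraine": ("migraine", "medical_state"),
    "influenza": ("influenza", "medical_state"),
    "flu": ("influenza", "medical_state"),
}


def extract_entities(abstract):
    """Split an abstract into tokens and return (representations, medical states)."""
    tokens = abstract.lower().replace(",", " ").replace(".", " ").split()
    representations, states = set(), set()
    for token in tokens:
        if token in LEXICON:
            name, label = LEXICON[token]
            if label == "representation":
                representations.add(name)
            else:
                states.add(name)
    return representations, states


def build_graph(abstracts):
    """Count, per abstract, each (representation, medical state) co-occurrence."""
    edges = Counter()
    for abstract in abstracts:
        representations, states = extract_entities(abstract)
        for pair in product(representations, states):
            edges[pair] += 1
    return edges
```

The resulting counter maps each (representation, medical state) pair to the number of abstracts in which both appear, which is one simple way to weight the edges of such a graph.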
PCT/CN2021/128398 2020-11-09 2021-11-03 Method and apparatus for generating push information WO2022095892A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020237017518A KR20230092002A (en) 2020-11-09 2021-11-03 Method and apparatus for generating push information

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011241131.0 2020-11-09
CN202011241131.0A CN112287121A (en) 2020-11-09 2020-11-09 Push information generation method and device

Publications (1)

Publication Number Publication Date
WO2022095892A1 true WO2022095892A1 (en) 2022-05-12

Family

ID=74351859

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/128398 WO2022095892A1 (en) 2020-11-09 2021-11-03 Method and apparatus for generating push information

Country Status (3)

Country Link
KR (1) KR20230092002A (en)
CN (1) CN112287121A (en)
WO (1) WO2022095892A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383413A (en) * 2023-06-05 2023-07-04 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116821712A (en) * 2023-08-25 2023-09-29 中电科大数据研究院有限公司 Semantic matching method and device for unstructured text and knowledge graph
CN117493681A (en) * 2023-11-15 2024-02-02 无锡胤兴智创科技有限公司 Intelligent medical information pushing system and method based on cloud computing

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112287121A (en) * 2020-11-09 2021-01-29 北京沃东天骏信息技术有限公司 Push information generation method and device
CN113016658B (en) * 2021-03-09 2022-05-27 广东海洋大学 Biological information utilization system and method based on big data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106021281A (en) * 2016-04-29 2016-10-12 京东方科技集团股份有限公司 Method for establishing medical knowledge graph, device for same and query method for same
CN106777274A (en) * 2016-06-16 2017-05-31 北京理工大学 A kind of Chinese tour field knowledge mapping construction method and system
US20190102462A1 (en) * 2017-09-29 2019-04-04 International Business Machines Corporation Identification and evaluation white space target entity for transaction operations
CN109658208A (en) * 2019-01-15 2019-04-19 京东方科技集团股份有限公司 Recommended method, device, medium and the electronic equipment of drug
CN112287121A (en) * 2020-11-09 2021-01-29 北京沃东天骏信息技术有限公司 Push information generation method and device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383413A (en) * 2023-06-05 2023-07-04 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116383413B (en) * 2023-06-05 2023-08-29 湖南云略信息技术有限公司 Knowledge graph updating method and system based on medical data extraction
CN116821712A (en) * 2023-08-25 2023-09-29 中电科大数据研究院有限公司 Semantic matching method and device for unstructured text and knowledge graph
CN116821712B (en) * 2023-08-25 2023-12-19 中电科大数据研究院有限公司 Semantic matching method and device for unstructured text and knowledge graph
CN117493681A (en) * 2023-11-15 2024-02-02 无锡胤兴智创科技有限公司 Intelligent medical information pushing system and method based on cloud computing

Also Published As

Publication number Publication date
CN112287121A (en) 2021-01-29
KR20230092002A (en) 2023-06-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21888589; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 20237017518; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21888589; Country of ref document: EP; Kind code of ref document: A1)