WO2020034810A1 - Search method, apparatus, computer device, and storage medium - Google Patents

Search method, apparatus, computer device, and storage medium

Info

Publication number
WO2020034810A1
Authority
WO
WIPO (PCT)
Prior art keywords
corpus
searched
target
semantic
sub
Prior art date
Application number
PCT/CN2019/096978
Other languages
English (en)
French (fr)
Inventor
胡帆
吴迪
Original Assignee
平安医疗健康管理股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安医疗健康管理股份有限公司
Publication of WO2020034810A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • the present application relates to a search method, apparatus, computer equipment, and storage medium.
  • a search method, apparatus, computer device, and storage medium are provided.
  • A search method includes: receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; performing word segmentation on the current medical term to be searched, and obtaining multiple sub-words to be searched corresponding to the current medical term to be searched; obtaining corresponding matching words from a pre-established semantic network according to the sub-words to be searched, and obtaining the codes corresponding to the matching words as sub-codes corresponding to the current medical term to be searched; obtaining the association codes corresponding to each sub-code from the semantic network to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
  • A search device includes: a search request receiving module, configured to receive a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; a to-be-searched sub-word acquisition module, configured to perform word segmentation on the current medical term to be searched and obtain, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; a sub-code acquisition module, configured to obtain corresponding matching words from a pre-established semantic network according to the sub-words to be searched, and obtain the codes corresponding to the matching words as sub-codes corresponding to the current medical term to be searched; an association code acquisition module, configured to obtain the association codes corresponding to each sub-code from the semantic network to obtain a first association code set corresponding to the current medical term to be searched; a first target corpus acquisition module, configured to select a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the first target association code to obtain the target corpus; and a target corpus sending module, configured to send the target corpus to the terminal.
  • A computer device includes a memory and one or more processors. The memory stores computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform the following steps: receiving a search request sent by a terminal, the search request carrying the current medical term to be searched and a type identifier corresponding to the target corpus; performing word segmentation on the current medical term to be searched, and obtaining, according to the segmentation result, multiple sub-words to be searched corresponding to the current medical term to be searched; obtaining corresponding matching words from a pre-established semantic network according to the sub-words to be searched, and obtaining the codes corresponding to the matching words as sub-codes corresponding to the current medical term to be searched; obtaining the association codes corresponding to each sub-code from the semantic network to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
  • One or more non-volatile computer-readable storage media store computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving a search request sent by a terminal, the search request carrying the current medical term to be searched and a type identifier corresponding to the target corpus; performing word segmentation on the current medical term to be searched, and obtaining, according to the segmentation result, multiple sub-words to be searched corresponding to the current medical term to be searched; obtaining corresponding matching words from a pre-established semantic network according to the sub-words to be searched, and obtaining the codes corresponding to the matching words as sub-codes corresponding to the current medical term to be searched; obtaining the association codes corresponding to each sub-code from the semantic network to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
  • FIG. 1 is an application scenario diagram of a search method according to one or more embodiments.
  • FIG. 2 is a schematic flowchart of a search method according to one or more embodiments.
  • FIG. 3 is a schematic flowchart of the steps of generating a semantic network according to one or more embodiments.
  • FIG. 4 is a structural block diagram of a search apparatus according to one or more embodiments.
  • FIG. 5 is an internal structural diagram of a computer device according to one or more embodiments.
  • the search method provided in this application can be applied to the application environment shown in FIG. 1.
  • the terminal 102 communicates with the server 104 through a network.
  • the terminal 102 sends a search request to the server 104 carrying the current medical term to be searched and the type identifier corresponding to the target corpus.
  • After receiving the search request, the server 104 performs word segmentation on the current medical term to be searched to obtain multiple sub-words to be searched, and then obtains, from the pre-established semantic network, the matching word corresponding to each sub-word to be searched and the code of that matching word. The server 104 then looks up the association codes corresponding to each obtained code in the semantic network, takes the corpus corresponding to the association codes whose type identifier is the same as the type identifier of the target corpus as the target corpus, and finally sends the target corpus to the terminal 102.
  • the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
  • a search method is provided.
  • The method is described taking its application to the server in FIG. 1 as an example, and includes the following steps:
  • Step S202: a search request sent by the terminal is received; the search request carries the current medical term to be searched and the type identifier corresponding to the target corpus.
  • The current medical term to be searched refers to the original corpus currently used for the search, including but not limited to disease names, anatomical words, and disease words; the target corpus refers to the corpus expected to be obtained through the search; the type identifier is used to uniquely identify the corpus dimension to which the target corpus belongs. In one embodiment, the type identifier may be the name of the corpus dimension to which the target corpus belongs.
  • For example, when the target corpus is a drug corpus, the type identifier corresponding to the target corpus is an identifier that uniquely identifies the drug corpus dimension.
  • the terminal may provide a search interface, and the search interface may include input controls such as input boxes, drop-down selection boxes, and confirmation search controls.
  • Step S204: word segmentation is performed on the current medical term to be searched, and a plurality of sub-words to be searched corresponding to the current medical term to be searched are obtained according to the segmentation result.
  • The word segmentation result refers to the word sequence obtained from the word segmentation.
  • For example, the word segmentation result obtained for "open cerebellar hemorrhage" is "open / cerebellum / bleeding".
  • Words with medical meaning are selected from the word segmentation result, including words related to the anatomical part, such as "cerebellum" and "index finger"; disease-related words, such as "fracture", "bleeding", and "dislocation"; and degree/type-related words, such as "comminuted", "open", and "chronic".
  • The words with clear meanings selected from the word segmentation result are used as the sub-words to be searched. For example, when the medical term to be searched is "open cerebellar hemorrhage", all three words in the word segmentation result can be used as sub-words to be searched for that term.
  • Step S206: obtain a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtain the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched.
  • The semantic network is a form of expressing the structure of medical knowledge in a network format.
  • The semantic network includes corpora of multiple semantic dimensions, such as anatomical site corpus, degree/type corpus, disease corpus, medicine corpus, inspection item corpus, and surgical item corpus. Anatomical site corpus covers anatomical sites, such as hip and coccyx; disease corpus refers to the specific medical description of a disease, such as fracture, bleeding, and dislocation; degree/type corpus refers to the medical description of the severity of a disease or of the corresponding disease type, for example, comminuted, open, chronic, and acute.
  • The corpus of each dimension is coded according to preset rules, and each corpus is marked with its code in the semantic network.
  • In the semantic network, for any two corpora that belong to different semantic dimensions, a co-occurrence frequency greater than a preset threshold indicates that the two corpora are semantically associated, and an association relationship is established between their codes. Two codes with an association relationship are association codes of each other and are connected by an "edge" in the semantic network, so the association codes of any code can be found by following these edges. A matching word is a word that matches the sub-word to be searched.
  • After obtaining the sub-words to be searched, the server finds, in the pre-established semantic network, a word matching each sub-word to be searched as its matching word, and then obtains the code of the matching word as a sub-code of the current medical term to be searched, so that the current medical term to be searched corresponds to multiple codes.
  • Step S208: obtain the association codes corresponding to each sub-code from the semantic network to obtain a first association code set corresponding to the current medical term to be searched.
  • The association codes are combined to obtain the first association code set.
  • Step S210: select a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the first target association code to obtain the target corpus.
  • Since the semantic network includes corpora of multiple semantic dimensions while the target corpus belongs to only one or several semantic dimensions, the target association code needs to be selected from the first association code set according to the type identifier corresponding to the target corpus.
  • The type identifier corresponding to the semantic dimension may be carried in the code. After the first association code set is obtained, each association code is compared with the type identifier corresponding to the target corpus. If the comparison succeeds, that is, if an association code contains the type identifier, that association code is selected as the target association code.
  • Alternatively, a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension may be established in advance. After the first association code set is obtained, the type identifier corresponding to each association code in the set is looked up according to the mapping relationship, and the association codes whose type identifier is the same as the type identifier corresponding to the target corpus are determined as the target association codes.
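The two selection strategies described above (type identifier carried in the code, or looked up through a pre-built mapping) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the sample codes, and every type identifier other than "JP" are hypothetical.

```python
import re


def type_id_from_code(code):
    """Return the leading letters of a code, e.g. 'JP' for 'JP3.1'.

    Assumes (hypothetically) that the type identifier is the letter prefix.
    """
    match = re.match(r"[A-Za-z]+", code)
    return match.group(0) if match else ""


def select_by_prefix(association_codes, target_type_id):
    """Strategy 1: the type identifier is carried in the code itself."""
    return sorted(
        code for code in association_codes
        if type_id_from_code(code) == target_type_id
    )


def select_by_mapping(association_codes, code_to_type, target_type_id):
    """Strategy 2: the type identifier is looked up in a pre-built mapping."""
    return sorted(
        code for code in association_codes
        if code_to_type.get(code) == target_type_id
    )


# Hypothetical first association code set; "YP", "JC", "SS" are made-up prefixes.
first_association_set = {"YP12", "YP7.2", "JC4", "SS9.1"}
print(select_by_prefix(first_association_set, "YP"))
```

The mapping variant is useful when codes do not embed their dimension, at the cost of maintaining a separate table.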
  • the server can obtain corresponding corpora according to the target association code, and these corpora are the target corpora.
  • Step S212: send the target corpus to the terminal.
  • The server sends the obtained target corpus to the terminal through the network.
  • In the above search method, after receiving a search request carrying the current medical term to be searched and the type identifier corresponding to the target corpus, the server performs word segmentation on the current medical term to be searched to obtain multiple sub-words to be searched, obtains the corresponding matching words from the pre-established semantic network, and obtains the codes corresponding to the matching words as sub-codes. It then finds the association codes corresponding to each sub-code, selects the target association codes from the association code set according to the type identifier corresponding to the target corpus, and obtains the corpus corresponding to the target association codes as the target corpus. With this method, for any of the different descriptions of the same term, the server can segment the description, obtain matching words, and retrieve all related corpora from the semantic network to obtain the target corpus, which improves the comprehensiveness of medical data search.
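Steps S206 to S210 can be sketched end to end on toy data. The sketch assumes segmentation has already produced the sub-words and that matching is an exact dictionary lookup; all codes, prefixes, and corpus names below are placeholders invented for illustration, and `search` is a hypothetical helper name.

```python
# Hypothetical toy data: matching word -> code.
WORD_TO_CODE = {"open": "CD2", "cerebellum": "JP1.4", "bleeding": "JB5"}

# Hypothetical semantic network: code -> set of association codes.
EDGES = {
    "CD2": {"YP3"},
    "JP1.4": {"YP8", "JC2"},
    "JB5": {"YP3", "SS6"},
}

# Hypothetical code -> corpus table.
CODE_TO_CORPUS = {
    "YP3": "drug A",
    "YP8": "drug B",
    "JC2": "examination C",
    "SS6": "surgery D",
}


def search(sub_words, target_type_id):
    """Return the target corpora for already-segmented sub-words."""
    # S206: matching words -> sub-codes (exact match stands in for real matching).
    sub_codes = [WORD_TO_CODE[word] for word in sub_words if word in WORD_TO_CODE]

    # S208: union of the association codes of every sub-code.
    association_set = set()
    for code in sub_codes:
        association_set |= EDGES.get(code, set())

    # S210: keep codes whose prefix equals the target type identifier.
    target_codes = sorted(
        code for code in association_set if code.startswith(target_type_id)
    )
    return [CODE_TO_CORPUS[code] for code in target_codes]


print(search(["open", "cerebellum", "bleeding"], "YP"))
```

Here the type identifier is assumed to be a code prefix; the mapping-table variant would replace only the filtering line.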
  • The above method further includes a step of generating the semantic network, which specifically includes:
  • Step S302: obtain semantic trees of multiple preset semantic dimensions.
  • The semantic tree of each semantic dimension corresponds to a type identifier.
  • The semantic tree of each semantic dimension includes multiple node corpora.
  • The corpus of each preset dimension may first be extracted from a standardized medical corpus, and a semantic tree may be constructed in advance according to the semantic relationships among the corpora of each dimension.
  • The preset semantic dimensions include, but are not limited to, anatomical parts, degrees, diseases, medicines, examination items, and surgical items. The type identifier uniquely identifies the semantic dimension to which the semantic tree belongs and can be composed of a preset number of letters; for example, the type identifier for anatomical parts can be "JP".
  • Table 1 gives an example of a partial semantic tree for the part "ear":
  • Step S304: encode the node corpora of the semantic tree according to the type identifier and a preset encoding rule.
  • The code of a node corpus can be composed of the type identifier and numbers according to the preset encoding rule. For example, the ear in the table above is coded as JP3; the outer ear, middle ear, and inner ear are coded as JP3.1, JP3.2, and JP3.3, respectively; the auricle, external auditory canal, and eardrum are coded as JP3.1.1, JP3.1.2, and JP3.1.3, respectively; and so on.
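The hierarchical coding rule in this example can be sketched as a recursive walk over the tree. The nested-dictionary representation and the function name `encode` are assumptions made for illustration; only the node names and resulting codes come from the text above.

```python
def encode(node_name, children, code, out=None):
    """Assign `code` to `node_name`, then number its children 1, 2, 3, ...

    `children` maps each child name to its own dict of children
    (an assumed representation of the semantic tree).
    """
    out = {} if out is None else out
    out[node_name] = code
    for index, (child, grandchildren) in enumerate(children.items(), start=1):
        encode(child, grandchildren, f"{code}.{index}", out)
    return out


# Partial "ear" tree as described in the text; the root code JP3 is given.
ear_tree = {
    "outer ear": {"auricle": {}, "external auditory canal": {}, "eardrum": {}},
    "middle ear": {},
    "inner ear": {},
}

codes = encode("ear", ear_tree, "JP3")
for name, code in codes.items():
    print(code, name)
```

Because each child code extends its parent's code, the dimension and the path to the root can both be read off the code itself.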
  • Step S306: calculate the co-occurrence frequency between each node corpus of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions.
  • The co-occurrence frequency refers to the frequency with which two corpora appear together within a set context; the greater the co-occurrence frequency, the greater the degree of correlation between the two words.
  • The co-occurrence frequency is often expressed in the form of a co-occurrence matrix.
  • The co-occurrence matrix can be calculated using a pairs algorithm or a stripes algorithm implemented on a MapReduce model.
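As a single-machine illustration of the "pairs" idea (emit one count per co-occurring pair), the sketch below counts cross-dimension pairs over a list of contexts. It does not use MapReduce, it uses raw counts as the frequency, and the context data, codes, and the `dimension_of` helper are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def dimension_of(code):
    """Assumed helper: the dimension is the letter prefix of the code."""
    return "".join(ch for ch in code if ch.isalpha())


def cooccurrence_counts(contexts):
    """Count how often two codes of different dimensions share a context."""
    counts = Counter()
    for context in contexts:
        # Sort so each unordered pair always maps to the same key.
        for first, second in combinations(sorted(set(context)), 2):
            if dimension_of(first) != dimension_of(second):
                counts[(first, second)] += 1
    return counts


# Each context is the set of codes observed together (made-up data).
contexts = [
    ["JP3", "JB1", "YP2"],
    ["JP3", "JB1"],
    ["JP3", "JP3.1"],  # same dimension, so this pair is ignored
]
counts = cooccurrence_counts(contexts)
print(counts[("JB1", "JP3")])
```

In a MapReduce setting the inner loop would correspond to the map phase and the counter update to the reduce phase.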
  • Step S308: establish an association relationship between the codes corresponding to two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
  • The preset threshold can be set according to the required degree of association between two interrelated node corpora in the semantic network: the higher the required degree of association, the greater the preset threshold.
  • For two node corpora whose co-occurrence frequency is greater than the preset threshold, the corresponding codes are connected by an edge, that is, an association relationship between the codes corresponding to the two node corpora is established.
  • Once these edges are established, the semantic network is obtained. In this semantic network, a search starting from any code can retrieve all the codes associated with it.
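Step S308 amounts to thresholding the co-occurrence values and storing the surviving pairs as undirected edges. A minimal sketch, assuming the co-occurrence values are held in a dictionary keyed by code pairs (the data and the function name are hypothetical):

```python
from collections import defaultdict


def build_semantic_network(cooccurrence, threshold):
    """Connect two codes by an undirected edge when their value exceeds threshold."""
    network = defaultdict(set)
    for (first, second), frequency in cooccurrence.items():
        if frequency > threshold:
            network[first].add(second)
            network[second].add(first)
    return dict(network)


# Made-up co-occurrence values between codes of different dimensions.
cooccurrence = {("JB1", "JP3"): 2, ("JB1", "YP2"): 1, ("JP3", "YP2"): 5}
network = build_semantic_network(cooccurrence, threshold=1)
print(sorted(network["JP3"]))
```

Storing each edge in both directions makes "find the association codes of any code" a single dictionary lookup.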
  • The method further includes: obtaining the type identifier corresponding to each sub-code; selecting second target association codes from the association code set according to the type identifiers corresponding to the sub-codes; obtaining the association codes corresponding to the second target association codes from the semantic network to obtain a second association code set corresponding to the current medical term to be searched; and selecting third target association codes from the second association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the third target association codes to obtain the target corpus.
  • The association codes corresponding to the sub-codes include two types: the first type is codes whose type identifier is the same as the type identifier corresponding to the target corpus, and the second type is codes whose type identifier is different from it.
  • The second type includes codes whose type identifier is the same as that of a sub-code. The corpora corresponding to these codes are semantically related to the current medical term to be searched and can be used to expand it, further improving the comprehensiveness of the data search.
  • Specifically, the association codes whose type identifier is the same as that of a sub-code are selected from the association code set, and the association codes corresponding to them are then looked up in the semantic network. The association codes obtained in this way are the result of the extended search. From them, the association codes whose type identifier is the same as the type identifier corresponding to the target corpus are selected as target association codes, and the corpora corresponding to these target association codes are obtained.
  • These corpora, together with the corpus obtained in step S210, are used as the target corpus corresponding to the current medical term to be searched, which expands the number of target corpora and further improves the comprehensiveness of medical data search.
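The extended search can be sketched as a second hop through the network, restricted to association codes that share a dimension with one of the sub-codes. The network contents, the prefix-based `type_id` helper, and all codes are hypothetical; the sketch returns codes rather than corpora to stay short.

```python
import re


def type_id(code):
    """Assumed helper: the type identifier is the letter prefix of the code."""
    match = re.match(r"[A-Za-z]+", code)
    return match.group(0) if match else ""


def associated(codes, network):
    """Union of the association codes of every code in `codes`."""
    result = set()
    for code in codes:
        result |= network.get(code, set())
    return result


def extended_search(sub_codes, network, target_type_id):
    first_set = associated(sub_codes, network)
    sub_types = {type_id(code) for code in sub_codes}
    # Second target association codes: same dimension as some sub-code.
    second_targets = {code for code in first_set if type_id(code) in sub_types}
    second_set = associated(second_targets, network)
    # First and third target association codes, merged.
    hits = {c for c in first_set | second_set if type_id(c) == target_type_id}
    return sorted(hits)


network = {
    "JP1": {"JB7", "YP1"},
    "JB5": {"YP2"},
    "JB7": {"JP1", "YP9"},
}
print(extended_search(["JP1", "JB5"], network, "YP"))
```

In this toy network the code YP9 is reachable only through JB7, so it appears only because of the second hop.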
  • In step S206, obtaining the corresponding matching words from the pre-established semantic network according to the sub-words to be searched includes: traversing the semantic tree corresponding to the semantic dimension to which each sub-word to be searched belongs; calculating the matching degree between the sub-word to be searched and each traversed node corpus; and taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
  • The semantic dimension to which the sub-word to be searched belongs can be determined first, and the semantic tree corresponding to that semantic dimension is then traversed.
  • For each node corpus traversed, the matching degree between the node corpus and the sub-word to be searched is calculated.
  • The semantic dimension to which the sub-word to be searched belongs can be obtained through part-of-speech tagging. Specifically, when the part-of-speech tagging result of a word is an anatomical part, the semantic dimension to which the word belongs is the anatomical part.
  • When calculating the matching degree, word2vec can be used to obtain the word vectors of the sub-word to be searched and of the node corpus, respectively; the vector distance or the cosine of the angle between the two word vectors is then calculated and used as the matching degree.
  • Searching for a matching word by traversing only the semantic tree corresponding to the semantic dimension to which the sub-word to be searched belongs improves the efficiency of obtaining matching words, and thereby the overall search efficiency.
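The matching step can be illustrated with cosine similarity over word vectors. The sketch below does not train or load a word2vec model; the two-dimensional vectors are made up purely to show the selection of the node corpus with the maximum matching degree, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_match(query_vector, node_vectors):
    """Return the node corpus whose vector is most similar to the query."""
    return max(
        node_vectors,
        key=lambda name: cosine_similarity(query_vector, node_vectors[name]),
    )


# Made-up vectors standing in for word2vec output for one semantic tree.
node_vectors = {"fracture": [1.0, 0.0], "bleeding": [0.0, 1.0]}
query = [0.1, 0.9]  # hypothetical vector of the sub-word to be searched
print(best_match(query, node_vectors))
```

If a vector distance were used instead of the cosine value, the best match would be the minimum rather than the maximum.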
  • In step S204, obtaining the plurality of sub-words to be searched corresponding to the current medical term to be searched according to the segmentation result includes: when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each of the mutually exclusive words, and using the word with the larger weight as a sub-word to be searched.
  • Mutually exclusive words are words that have a mutually exclusive relationship: when the two words appear at the same time, the semantics of one of them can be ignored. For example, in "soft tissue injury and semi-fracture", "injury" and "fracture" are mutually exclusive words.
  • A mutual exclusion dictionary may be established in advance, and mutual exclusion weights are set for each pair of mutually exclusive words.
  • The server can determine whether mutually exclusive words exist in the segmentation result by looking them up in the mutual exclusion dictionary. When mutually exclusive words exist, the server obtains the mutual exclusion weight corresponding to each of them and uses the word with the larger weight as a sub-word to be searched. For example, in "soft tissue injury and semi-fracture", if the mutual exclusion weight of "fracture" is greater than that of "injury", then "fracture" is taken as the sub-word to be searched.
  • The server may first determine whether the segmentation result contains two or more words that belong to the disease semantic dimension. If so, these words are looked up in the mutual exclusion dictionary to determine whether they are mutually exclusive.
  • Handling mutually exclusive words in this way can improve the accuracy of the search.
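The mutual exclusion check can be sketched as a dictionary lookup over word pairs. The dictionary layout (a pair of words mapped to per-word weights), the weight values, and the function name are assumptions for illustration only.

```python
# Hypothetical mutual exclusion dictionary: word pair -> weight of each word.
MUTEX_WEIGHTS = {
    frozenset({"injury", "fracture"}): {"injury": 1, "fracture": 2},
}


def resolve_mutex(words, mutex_weights):
    """Drop the lower-weighted word of every mutually exclusive pair."""
    dropped = set()
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            weights = mutex_weights.get(frozenset({first, second}))
            if weights:
                # Keep the word with the larger weight; drop the other.
                dropped.add(min((first, second), key=weights.get))
    return [word for word in words if word not in dropped]


segmentation = ["soft tissue", "injury", "fracture"]
print(resolve_mutex(segmentation, MUTEX_WEIGHTS))
```

Using a `frozenset` as the key makes the lookup independent of the order in which the two words appear in the segmentation result.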
  • Although the steps in the flowcharts of FIGS. 2 and 3 are displayed sequentially in the direction of the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they can be performed in other orders. Moreover, at least some of the steps in FIGS. 2 and 3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time, but may be performed at different times, and they are not necessarily performed sequentially, but may be performed in turn or alternately with at least part of other steps or of the sub-steps or stages of other steps.
  • A search device 400 is provided, which includes a search request receiving module 402, a to-be-searched sub-word acquisition module 404, a sub-code acquisition module 406, an association code acquisition module 408, a first target corpus acquisition module 410, and a target corpus sending module 412.
  • The search request receiving module 402 is configured to receive a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to the target corpus;
  • the to-be-searched sub-word acquisition module 404 is configured to perform word segmentation on the current to-be-searched medical term, and obtain a plurality of to-be-searched sub-words corresponding to the currently-to-search medical term according to the segmentation result;
  • the sub-code obtaining module 406 is configured to obtain a corresponding matching word from a pre-established semantic network according to the sub-words to be searched, and obtain a code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched;
  • the association code acquisition module 408 is configured to obtain an association code corresponding to each sub-code from the semantic network to obtain a first association code set corresponding to the medical term to be searched currently;
  • the first target corpus acquisition module 410 is configured to select a first target association code from a first association code set according to a type identifier corresponding to the target corpus, obtain a corpus corresponding to the first target association code to obtain a target corpus;
  • the target corpus sending module 412 is configured to send the target corpus to the terminal.
  • The apparatus further includes a semantic network generation module, configured to: obtain semantic trees of multiple preset semantic dimensions, where the semantic tree of each semantic dimension corresponds to a type identifier and contains multiple node corpora; encode the node corpora of the semantic trees according to the type identifiers and a preset encoding rule; calculate the co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and establish an association relationship between the codes corresponding to two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
  • The device further includes a second target corpus acquisition module, configured to: obtain the type identifier corresponding to each sub-code; select second target association codes from the association code set according to the type identifiers corresponding to the sub-codes; obtain the association codes corresponding to the second target association codes from the semantic network to obtain a second association code set corresponding to the current medical term to be searched; and select third target association codes from the second association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the third target association codes to obtain the target corpus.
  • The sub-code acquisition module 406 is further configured to traverse the semantic tree corresponding to the semantic dimension to which each sub-word to be searched belongs; calculate the matching degree between the sub-word to be searched and each traversed node corpus; and take the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
  • The to-be-searched sub-word acquisition module 404 is further configured to, when any two words in the segmentation result are mutually exclusive words, obtain the mutual exclusion weight corresponding to each of the mutually exclusive words and use the word with the larger weight as a sub-word to be searched.
  • Each module in the search device may be implemented in whole or in part by software, hardware, or a combination thereof.
  • The above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and perform the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 5.
  • the computer device includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium.
  • The database of the computer device is used to store medical data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by a processor to implement a search method.
  • FIG. 5 is only a block diagram of the part of the structure related to the solution of this application, and does not limit the computer device to which the solution is applied.
  • A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • a computer device includes a memory and one or more processors.
  • Computer-readable instructions are stored in the memory.
  • When the computer-readable instructions are executed by the processors, the one or more processors perform the following steps: receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
  • When the processor executes the computer-readable instructions, the following steps are further implemented: obtaining semantic trees of a plurality of preset semantic dimensions, where the semantic tree of each semantic dimension corresponds to a type identifier and contains a plurality of node corpora; encoding the node corpora of each semantic tree according to the type identifier and a preset coding rule; calculating the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and establishing an association relationship between the codes of any two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
  • Before obtaining the corpus corresponding to the first target association code to obtain the target corpus, the processor further implements the following steps when executing the computer-readable instructions: obtaining the type identifier corresponding to the sub-code; selecting a second target association code from the first association code set according to the type identifier corresponding to the sub-code; obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched; and selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus. Obtaining the corpus corresponding to the first target association code to obtain the target corpus then includes: obtaining the corpus corresponding to the first target association code and the corpus corresponding to the third target association code to obtain the target corpus.
  • Obtaining a corresponding matching word from the pre-established semantic network according to the sub-word to be searched includes: traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word belongs; calculating the matching degree between the sub-word to be searched and each traversed node corpus; and taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
  • Obtaining a plurality of sub-words to be searched corresponding to the current medical term to be searched according to the segmentation result includes: when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word, and using the word with the larger weight as a sub-word to be searched.
  • Before selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus, the processor further implements the following step when executing the computer-readable instructions: establishing a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension, to obtain a mapping relationship table. Selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus then includes: looking up, in the mapping relationship table, the type identifier corresponding to each first association code in the first association code set; and determining each first association code whose type identifier is the same as the type identifier corresponding to the target corpus as a first target association code.
  • One or more non-volatile computer-readable storage media store computer-readable instructions. When the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps: receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
  • When the computer-readable instructions are executed by the processor, the following steps are further implemented: obtaining semantic trees of a plurality of preset semantic dimensions, where the semantic tree of each semantic dimension corresponds to a type identifier and contains a plurality of node corpora; encoding the node corpora of each semantic tree according to the type identifier and a preset coding rule; calculating the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and establishing an association relationship between the codes of any two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
  • Before obtaining the corpus corresponding to the first target association code to obtain the target corpus, the computer-readable instructions further implement the following steps when executed by the processor: obtaining the type identifier corresponding to the sub-code; selecting a second target association code from the first association code set according to the type identifier corresponding to the sub-code; obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched; and selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus. Obtaining the corpus corresponding to the first target association code to obtain the target corpus then includes: obtaining the corpus corresponding to the first target association code and the corpus corresponding to the third target association code to obtain the target corpus.
  • Obtaining a corresponding matching word from the pre-established semantic network according to the sub-word to be searched includes: traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word belongs; calculating the matching degree between the sub-word to be searched and each traversed node corpus; and taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
  • Obtaining a plurality of sub-words to be searched corresponding to the current medical term to be searched according to the segmentation result includes: when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word, and using the word with the larger weight as a sub-word to be searched.
  • Before selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus, the computer-readable instructions further implement the following step when executed by the processor: establishing a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension, to obtain a mapping relationship table. Selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus then includes: looking up, in the mapping relationship table, the type identifier corresponding to each first association code in the first association code set; and determining each first association code whose type identifier is the same as the type identifier corresponding to the target corpus as a first target association code.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

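A search method, comprising: receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus (S202); segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched (S204); obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched (S206); obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain an association code set corresponding to the current medical term to be searched (S208); selecting a target association code from the association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the target association code to obtain the target corpus (S210); and sending the target corpus to the terminal (S212).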

Description

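Search method, apparatus, computer device, and storage medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 2018109232587, filed with the Chinese Patent Office on August 14, 2018 and entitled "Search method, apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to a search method, an apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, the medical data stored in computers keeps accumulating and has gradually reached a massive scale. Medical workers often need to use a computer to retrieve the data they want from this massive body of medical data, for example data related to a certain type of disease, including drugs, examination items, surgical items, and so on.
In conventional technology, after obtaining a medical-term search word, a computer typically searches the massive data for medical data that contains that search word. However, the inventors realized that, because the same medical term can be expressed in many different ways, the data found in this way is not comprehensive.
Summary
According to the various embodiments disclosed in this application, a search method, an apparatus, a computer device, and a storage medium are provided.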
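A search method includes: receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
A search apparatus includes: a search request receiving module, configured to receive a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; a to-be-searched sub-word acquisition module, configured to segment the current medical term to be searched and obtain, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; a sub-code acquisition module, configured to obtain a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtain the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; an association code acquisition module, configured to obtain, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; a first target corpus acquisition module, configured to select a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the first target association code to obtain the target corpus; and a target corpus sending module, configured to send the target corpus to the terminal.
A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the following steps:
receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
The details of one or more embodiments of this application are set forth in the drawings and the description below. Other features and advantages of this application will become apparent from the description, the drawings, and the claims.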
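Brief description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram of an application scenario of the search method according to one or more embodiments;
FIG. 2 is a schematic flowchart of the search method according to one or more embodiments;
FIG. 3 is a schematic flowchart of the steps of generating the semantic network according to one or more embodiments;
FIG. 4 is a structural block diagram of the search apparatus according to one or more embodiments;
FIG. 5 is a diagram of the internal structure of the computer device according to one or more embodiments.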
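Detailed description
To make the technical solutions and advantages of this application clearer, this application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
The search method provided in this application can be applied in the application environment shown in FIG. 1, in which a terminal 102 communicates with a server 104 over a network. The terminal 102 sends the server 104 a search request carrying the current medical term to be searched and the type identifier corresponding to the target corpus. After receiving the search request, the server 104 segments the current medical term to be searched to obtain a plurality of sub-words to be searched, then obtains from a pre-established semantic network the matching word corresponding to each sub-word and the code of that matching word, then looks up in the semantic network the association codes corresponding to each obtained code, takes as the target corpus the corpus corresponding to those association codes whose type identifier is the same as that of the target corpus, and finally sends the target corpus to the terminal 102.
The terminal 102 may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device. The server 104 may be implemented as an independent server or as a server cluster composed of multiple servers.
In some embodiments, as shown in FIG. 2, a search method is provided. Taking the method as applied to the server in FIG. 1 as an example, it includes the following steps:
Step S202: receive a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus.
The current medical term to be searched is the original corpus currently used for the search, including but not limited to disease names, anatomical-site words, disease words, and so on. The target corpus is the corpus that the search is expected to return. The type identifier uniquely identifies the corpus dimension to which the target corpus belongs. In one embodiment, the type identifier may be the name of the corpus dimension to which the target corpus belongs. For example, when drugs related to a disease such as "开放性小脑出血" (open cerebellar hemorrhage) need to be searched, "开放性小脑出血" is the current medical term to be searched, the target corpus is corpus of the drug class, and the type identifier corresponding to the target corpus is the identifier that uniquely identifies the drug corpus dimension.
In some embodiments, the terminal may provide a search interface containing input controls, such as an input box and a drop-down selection box, and a confirm-search control. When the user enters the current medical term to be searched in the input box and selects one or more target corpus types in the drop-down selection box, and the terminal detects a click on the confirm-search control, the terminal obtains the type identifier of the target corpus, generates a search request from that type identifier and the current medical term to be searched, and sends the search request to the server.
Step S204: segment the current medical term to be searched, and obtain, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched.
The segmentation result is the word sequence produced by segmentation. For example, segmenting "开放性小脑出血" yields the segmentation result "开放性/小脑/出血" (open / cerebellum / hemorrhage).
In this embodiment, after the segmentation result is obtained, words with a medical meaning are selected from it. These include words related to anatomical sites, such as "小脑" (cerebellum) and "食指" (index finger); words related to diseases, such as "骨折" (fracture), "出血" (hemorrhage), and "脱位" (dislocation); and words related to degree or type, such as "粉碎性" (comminuted), "开放性" (open), and "慢性" (chronic). The words with a definite meaning selected from the segmentation result are then used as the sub-words to be searched. For example, when the current medical term to be searched is "开放性小脑出血", all three words in its segmentation result can serve as sub-words to be searched for that term.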
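The selection of sub-words in step S204 can be sketched as a simple lexicon filter. This is only an illustrative sketch under stated assumptions: the word segmenter itself is out of scope, and the `MEDICAL_LEXICON` table, its dimension labels, and the function name `select_subwords` are hypothetical, not part of the original disclosure.

```python
# Hypothetical lexicon mapping a word to its semantic dimension.
# The entries reuse the example words from the description; the labels are assumed.
MEDICAL_LEXICON = {
    "小脑": "anatomical_site",    # cerebellum
    "食指": "anatomical_site",    # index finger
    "骨折": "disease",            # fracture
    "出血": "disease",            # hemorrhage
    "脱位": "disease",            # dislocation
    "粉碎性": "degree_type",      # comminuted
    "开放性": "degree_type",      # open
    "慢性": "degree_type",        # chronic
}


def select_subwords(segmentation_result):
    """Keep only the segmented words that carry a medical meaning."""
    return [word for word in segmentation_result if word in MEDICAL_LEXICON]


# Segmentation result of "开放性小脑出血", as given in the description.
subwords = select_subwords(["开放性", "小脑", "出血"])
```

With this toy lexicon, all three words of the example term are kept, while a function word such as "的" would be dropped.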
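Step S206: obtain a corresponding matching word from the pre-established semantic network according to each sub-word to be searched, and obtain the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched.
Specifically, a semantic network is a form of expressing the structure of medical knowledge in a network format. The semantic network includes corpora of multiple semantic dimensions, such as anatomical-site corpus, degree/type corpus, disease corpus, drug corpus, examination-item corpus, and surgical-item corpus. Anatomical-site corpus refers to descriptions of the anatomical sites of the human body, such as the hip and the coccyx. Disease corpus refers to specific medical descriptions of diseases, such as fracture, hemorrhage, and dislocation. Degree/type corpus refers to medical descriptions of the severity of a disease or of the type of a disease, for example comminuted, open, chronic, and acute.
The corpus of each dimension is encoded according to a preset rule, and each corpus is labeled with its code in the semantic network. In the semantic network, for any two corpora that belong to different semantic dimensions, a co-occurrence frequency greater than a preset threshold indicates that the two corpora are semantically associated. An association relationship is established between the codes of two associated corpora, so that the two codes are association codes of each other; two associated codes are connected by an "edge" in the semantic network. Through these edges, the association codes of any code can be found in the semantic network. A matching word is a word that matches a sub-word to be searched.
In this embodiment, after obtaining a sub-word to be searched, the server looks up in the pre-established semantic network the word that matches it as its matching word, and then obtains the code of that matching word as a sub-code of the current medical term to be searched. In this way, the current medical term to be searched corresponds to multiple codes.
Step S208: obtain, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched.
Specifically, because association relationships between the codes of the semantic network have been established in advance, once the codes of the matching words have been obtained as sub-codes, the association codes of each sub-code can be looked up in the semantic network, and all of these association codes together form the first association code set.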
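The lookup in step S208 amounts to a union of neighbor sets in an undirected graph. The following minimal sketch assumes an in-memory adjacency structure; the class `SemanticNetwork`, the function `first_association_set`, and all codes other than the "JP" prefix (which the description uses for anatomical sites) are hypothetical illustrations.

```python
from collections import defaultdict


class SemanticNetwork:
    """Undirected graph whose vertices are codes and whose edges are associations."""

    def __init__(self):
        self._edges = defaultdict(set)

    def associate(self, code_a, code_b):
        # Two associated codes are association codes of each other.
        self._edges[code_a].add(code_b)
        self._edges[code_b].add(code_a)

    def associated_codes(self, code):
        # Return a copy so callers cannot mutate the network by accident.
        return set(self._edges.get(code, set()))


def first_association_set(network, sub_codes):
    """Union of the association codes of every sub-code (step S208)."""
    result = set()
    for code in sub_codes:
        result |= network.associated_codes(code)
    return result


# Hypothetical codes: "JP" = anatomical site, "JB" = disease,
# "YP" = drug, "JC" = examination item.
net = SemanticNetwork()
net.associate("JP1.2", "YP7")
net.associate("JB4", "YP7")
net.associate("JB4", "JC2")

first_set = first_association_set(net, ["JP1.2", "JB4"])
```

Using a set for the result means a code associated with several sub-codes (here "YP7") appears only once in the first association code set.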
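Step S210: select a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the first target association code to obtain the target corpus.
Specifically, because the semantic network contains corpora of multiple semantic dimensions while the target corpus belongs to only one or a few of them, the target association codes need to be selected from the first association code set according to the type identifier corresponding to the target corpus.
In some embodiments, when the corpus of each semantic dimension is encoded, the type identifier of that semantic dimension can be included in the code. After the first association code set is obtained, each association code is compared with the type identifier corresponding to the target corpus. If the comparison succeeds, that is, if an association code contains that type identifier, the association code is selected as a target association code.
In another embodiment, a mapping relationship between the codes of each semantic dimension and the type identifier of that semantic dimension is established in advance. After the first association code set is obtained, the type identifier of each first association code in the set is looked up through the mapping relationship, and the association codes whose type identifier is the same as the type identifier corresponding to the target corpus are determined to be target association codes.
Further, the server can obtain the corresponding corpora from the target association codes; these corpora are the target corpus.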
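Both selection variants of step S210 can be sketched in a few lines. The sketch below assumes, for the first variant, that the type identifier is the leading run of letters of a code (as in the "JP" example of the description); the function names, the "YP" and "JC" prefixes, and the mapping-table contents are hypothetical.

```python
import re


def type_of_code_by_prefix(code):
    """Assumed convention: the leading letters of a code are its type identifier."""
    match = re.match(r"[A-Za-z]+", code)
    return match.group(0) if match else None


def select_by_embedded_identifier(codes, target_type):
    """Variant 1: the type identifier is carried inside the code itself."""
    return sorted(c for c in codes if type_of_code_by_prefix(c) == target_type)


def select_by_mapping_table(codes, mapping_table, target_type):
    """Variant 2: a pre-built table maps each code to its type identifier."""
    return sorted(c for c in codes if mapping_table.get(c) == target_type)


first_set = {"YP7", "JC2", "YP12"}
by_prefix = select_by_embedded_identifier(first_set, "YP")

# Hypothetical mapping table using dimension names as type identifiers.
table = {"YP7": "drug", "JC2": "examination", "YP12": "drug"}
by_table = select_by_mapping_table(first_set, table, "drug")
```

The first variant needs no extra storage but ties the code format to the identifier; the second keeps codes opaque at the cost of maintaining the table.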
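Step S212: send the target corpus to the terminal.
Specifically, the server sends the obtained target corpus to the terminal over the network.
In the above search method, after receiving a search request carrying the current medical term to be searched and the type identifier corresponding to the target corpus, the server segments the current medical term to be searched to obtain a plurality of sub-words to be searched, obtains the corresponding matching words from the pre-established semantic network, obtains the codes of the matching words as sub-codes, looks up the association codes of each sub-code, and finally selects the target association codes from the association code set according to the type identifier corresponding to the target corpus and obtains the corpora corresponding to the target association codes as the target corpus. With the method of this application, for any of the different descriptions of the same term, the server can segment the term, obtain the matching words, and retrieve all associated corpora from the semantic network to obtain the target corpus, which improves the comprehensiveness of medical data search.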
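In some embodiments, as shown in FIG. 3, the above method further includes a step of generating the semantic network, which specifically includes:
Step S302: obtain semantic trees of a plurality of preset semantic dimensions, where the semantic tree of each semantic dimension corresponds to a type identifier and contains a plurality of node corpora.
Specifically, the corpora of each preset dimension can first be extracted from a standardized medical corpus, and a semantic tree can be built in advance for each dimension according to the semantic relationships among its corpora. The preset semantic dimensions include but are not limited to anatomical site, degree, disease, drug, examination item, surgical item, and so on. The type identifier uniquely identifies the semantic dimension to which a semantic tree belongs and may consist of a preset number of letters; for example, the anatomical-site dimension may be identified as "JP". Table 1 below gives part of the semantic tree for the site "ear" as an example:
Table 1
[Table 1 is provided as images in the original publication: Figure PCTCN2019096978-appb-000001 and Figure PCTCN2019096978-appb-000002.]
Step S304: encode the node corpora of each semantic tree according to the type identifier and a preset coding rule.
Specifically, the code of a node corpus can be composed of the type identifier and digits according to the preset coding rule. For example, for the table above, the ear may be encoded as JP3; the outer ear, middle ear, and inner ear as JP3.1, JP3.2, and JP3.3 respectively; the auricle, external auditory canal, and tympanic membrane as JP3.1.1, JP3.1.2, and JP3.1.3 respectively; and so on.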
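The hierarchical coding rule of step S304 can be illustrated with a short recursive sketch. It assumes each child receives its parent's code plus a dot and its 1-based position, which reproduces the codes given in the description; the tuple representation of the tree and the function `encode_tree` are hypothetical, and the root code "JP3" is taken as given.

```python
def encode_tree(node, code, out=None):
    """Assign `code` to `node` and `code.i` to its i-th child, recursively.

    A node is a pair (name, children), where children is a list of nodes.
    Returns a dict mapping node name to code.
    """
    if out is None:
        out = {}
    name, children = node
    out[name] = code
    for index, child in enumerate(children, start=1):
        encode_tree(child, f"{code}.{index}", out)
    return out


# Partial semantic tree for the site "ear", following the codes in the description.
ear = ("ear", [
    ("outer ear", [
        ("auricle", []),
        ("external auditory canal", []),
        ("tympanic membrane", []),
    ]),
    ("middle ear", []),
    ("inner ear", []),
])

codes = encode_tree(ear, "JP3")
```

Because each code embeds the path from the root, the ancestors of a node can be recovered from its code alone by dropping trailing segments.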
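Step S306: calculate the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions.
Specifically, for the semantic tree of each semantic dimension, the co-occurrence frequency between each of its node corpora and the node corpora of the semantic trees of the other semantic dimensions is calculated. The co-occurrence frequency is the frequency with which two corpora appear together within a preset context range; the higher the co-occurrence frequency, the more strongly the two words are associated. Co-occurrence frequencies are often expressed as a co-occurrence matrix, which can be computed, for example, with the pairs algorithm or the stripes algorithm implemented on the MapReduce model.
Step S308: establish an association relationship between the codes of any two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
Specifically, the preset threshold can be set differently depending on how strongly two associated node corpora in the semantic network are required to be related. The higher the required degree of association between two associated node corpora, the larger the preset threshold.
In this embodiment, for two node corpora whose co-occurrence frequency is greater than the preset threshold, their codes are connected by an edge, that is, an association relationship is established between the codes of the two node corpora. Once the association relationships between the codes of the semantic trees have been established, the semantic network is obtained. In this semantic network, searching from any code retrieves all codes associated with it.
It can be understood that, because there is a one-to-one mapping between codes and node corpora, once association relationships are established between codes, the corpora corresponding to those codes are naturally associated as well.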
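Steps S306 and S308 can be sketched in memory as pair counting followed by thresholding. This is a simplified stand-in for the MapReduce pairs approach mentioned in the description, not that implementation: it uses raw co-occurrence counts as the frequency, treats each input list as one context window, and assumes that the leading letters of a code give its dimension. All function names and codes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def dimension_of(code):
    # Assumption: the letters of a code form its type identifier, e.g. "JP3.1" -> "JP".
    return "".join(ch for ch in code if ch.isalpha())


def cooccurrence_counts(contexts):
    """Count how often two codes of different dimensions share a context."""
    counts = Counter()
    for context in contexts:
        # Sorting makes every unordered pair appear under one canonical key.
        for a, b in combinations(sorted(set(context)), 2):
            if dimension_of(a) != dimension_of(b):
                counts[(a, b)] += 1
    return counts


def build_edges(counts, threshold):
    """Keep the pairs whose co-occurrence count exceeds the threshold (step S308)."""
    return {pair for pair, n in counts.items() if n > threshold}


contexts = [
    ["JP3", "JB1", "YP7"],
    ["JP3", "JB1"],
    ["JB1", "YP7"],
    ["JP3", "JP4"],  # same dimension, so this pair is never counted
]
counts = cooccurrence_counts(contexts)
edges = build_edges(counts, threshold=1)
```

Raising `threshold` keeps only the more strongly associated pairs, which matches the description's remark that a stricter association requirement calls for a larger preset threshold.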
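In some embodiments, the above method further includes: obtaining the type identifier corresponding to the sub-code; selecting a second target association code from the association code set according to the type identifier corresponding to the sub-code; obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched; and selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the third target association code to obtain the target corpus.
Specifically, the association codes of the sub-codes fall into two classes. The first class consists of codes whose type identifier is the same as the type identifier corresponding to the target corpus; the second class consists of codes whose type identifier differs from it. The second class includes codes whose type identifier is the same as that of a sub-code. The corpora corresponding to these codes are semantically related to the current medical term to be searched and can be used to perform an extended search on it, further improving the comprehensiveness of the data search.
In this embodiment, the association codes whose type identifier is the same as that of a sub-code are selected from the association code set. Taking these association codes as the basis, their own association codes are then looked up in the semantic network; the association codes obtained at this point are the result of the extended search. From these, the association codes whose type identifier is the same as the type identifier corresponding to the target corpus are again selected as target association codes, and the corpora corresponding to these target association codes are obtained. These corpora, together with the corpora obtained in step S210, serve as the target corpus for the current medical term to be searched, which expands the amount of target corpus and further improves the comprehensiveness of medical data search.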
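The extended search can be sketched as a two-hop lookup over the semantic network. The sketch assumes the network is a plain dict from a code to the set of its association codes and that the leading letters of a code are its type identifier; the function names and every code in the example are hypothetical.

```python
def type_of(code):
    # Assumption: the letters of a code form its type identifier.
    return "".join(ch for ch in code if ch.isalpha())


def search_codes(network, sub_codes, target_type):
    """Return the first and third target association codes combined."""
    # First association code set (step S208).
    first_set = set()
    for code in sub_codes:
        first_set |= network.get(code, set())

    # First target association codes: same type as the target corpus (step S210).
    first_targets = {c for c in first_set if type_of(c) == target_type}

    # Second target association codes: same type as one of the sub-codes.
    sub_types = {type_of(c) for c in sub_codes}
    second_targets = {c for c in first_set if type_of(c) in sub_types}

    # Second association code set: neighbors of the second target codes.
    second_set = set()
    for code in second_targets:
        second_set |= network.get(code, set())

    # Third target association codes: same type as the target corpus.
    third_targets = {c for c in second_set if type_of(c) == target_type}
    return first_targets | third_targets


# Hypothetical network; edges only connect codes of different dimensions.
network = {
    "JP3": {"JB2", "YP7"},
    "JB1": {"JP4"},
    "JB2": {"JP3", "YP9"},
    "JP4": {"JB1", "YP8"},
    "YP7": {"JP3"},
    "YP8": {"JP4"},
    "YP9": {"JB2"},
}
result = search_codes(network, ["JP3", "JB1"], "YP")
```

Here only "YP7" is reachable in one hop; "YP8" and "YP9" are found through the related codes "JP4" and "JB2", which is the expansion the embodiment describes.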
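In some embodiments, in step S206, obtaining a corresponding matching word from the pre-established semantic network according to the sub-word to be searched includes: traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word belongs; calculating the matching degree between the sub-word to be searched and each traversed node corpus; and taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
Specifically, the semantic dimension to which the sub-word to be searched belongs can be determined first, and the semantic tree of that dimension is then traversed. Each time a node corpus is visited, the matching degree between that node corpus and the sub-word to be searched is calculated. When the traversal of the semantic tree is complete, all matching degrees are sorted, and the node corpus with the maximum matching degree is taken as the matching word of the sub-word to be searched. In some embodiments, the semantic dimension of a sub-word to be searched can be obtained through part-of-speech tagging; specifically, when the tagging result for a word is "anatomical site", the semantic dimension to which the word belongs is anatomical site.
In some embodiments, when calculating the matching degree, word2vec can be used to obtain the word vectors of the sub-word to be searched and of the node corpus, and the vector distance or the cosine of the angle between the two word vectors is then calculated and used as the matching degree.
In this embodiment, the matching word is found by traversing only the semantic tree of the semantic dimension to which the sub-word to be searched belongs. Compared with traversing the entire semantic network, this improves the efficiency of obtaining matching words and therefore the overall search efficiency.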
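The traversal-and-match step can be sketched with cosine similarity over precomputed word vectors. The sketch does not call word2vec; it assumes the vectors are already available in a dict, and the toy two-dimensional vectors, the tree representation, and the function names are all hypothetical. Cosine similarity is used as the matching degree, so a larger value means a better match.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def iter_nodes(node):
    """Depth-first traversal of a (name, children) tree, yielding node names."""
    name, children = node
    yield name
    for child in children:
        yield from iter_nodes(child)


def best_match(subword, tree, vectors):
    """Node corpus of `tree` whose vector is most similar to that of `subword`."""
    query = vectors[subword]
    return max(iter_nodes(tree), key=lambda name: cosine(query, vectors[name]))


tree = ("ear", [
    ("outer ear", [("auricle", []), ("external auditory canal", [])]),
])

# Toy vectors standing in for word2vec output.
vectors = {
    "ear canal": (1.0, 0.1),
    "ear": (0.5, 0.5),
    "outer ear": (0.4, 0.7),
    "auricle": (0.1, 1.0),
    "external auditory canal": (0.9, 0.2),
}
match = best_match("ear canal", tree, vectors)
```

If a vector distance were used instead of the cosine value, the node with the smallest distance would be the best match, so the `max` above would become a `min`.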
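In some embodiments, in step S204, obtaining a plurality of sub-words to be searched corresponding to the current medical term to be searched according to the segmentation result includes: when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word, and using the word with the larger weight as a sub-word to be searched.
Mutually exclusive words are words between which a mutual exclusion relation exists: when two words appear together and the meaning of one of them can be ignored, the two words are in a mutual exclusion relation and are mutually exclusive words of each other. For example, in "soft tissue injury with fracture", "injury" and "fracture" are mutually exclusive words.
Specifically, a mutual exclusion dictionary can be built in advance, with a mutual exclusion weight set for each word of every mutually exclusive pair. The server can look up the mutual exclusion dictionary to determine whether the segmentation result contains mutually exclusive words. If it does, the server obtains the mutual exclusion weight of each mutually exclusive word and uses the word with the larger weight as a sub-word to be searched. For example, in "soft tissue injury with fracture", if the mutual exclusion weight of "fracture" is greater than that of "injury", "fracture" is used as the sub-word to be searched. In some embodiments, to make the check for mutually exclusive words more efficient, the server can first determine whether there are two or more words belonging to the disease semantic dimension and, if so, look these words up in the mutual exclusion dictionary to determine whether they are mutually exclusive.
In this embodiment, checking for mutually exclusive words improves the precision of the search.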
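The mutual exclusion check can be sketched as a dictionary lookup over word pairs. The layout of the dictionary (a `frozenset` pair mapped to per-word weights), the weight values, and the function name are hypothetical; on equal weights this sketch arbitrarily drops the later word, a case the description does not address.

```python
def resolve_mutual_exclusion(words, exclusion_weights):
    """Drop the lower-weighted word of every mutually exclusive pair.

    exclusion_weights maps frozenset({a, b}) to {a: weight_a, b: weight_b}.
    """
    dropped = set()
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            weights = exclusion_weights.get(frozenset((first, second)))
            if weights is None:
                continue  # the two words are not mutually exclusive
            dropped.add(first if weights[first] < weights[second] else second)
    return [w for w in words if w not in dropped]


# Hypothetical mutual exclusion dictionary for the example in the description.
exclusion_weights = {
    frozenset(("injury", "fracture")): {"injury": 0.3, "fracture": 0.7},
}

subwords = resolve_mutual_exclusion(
    ["soft tissue", "injury", "fracture"], exclusion_weights
)
```

Keying the dictionary by an unordered pair means the lookup works regardless of the order in which the two words appear in the segmentation result.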
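It should be understood that although the steps in the flowcharts of FIGS. 2 and 3 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are performed, and they may be performed in other orders. Moreover, at least some of the steps in FIGS. 2 and 3 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment and may be performed at different times, and they are not necessarily performed sequentially; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.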
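In some embodiments, as shown in FIG. 4, a search apparatus 400 is provided, including a search request receiving module 402, a to-be-searched sub-word acquisition module 404, a sub-code acquisition module 406, an association code acquisition module 408, a first target corpus acquisition module 410, and a target corpus sending module 412, where:
the search request receiving module 402 is configured to receive a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus;
the to-be-searched sub-word acquisition module 404 is configured to segment the current medical term to be searched and obtain, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched;
the sub-code acquisition module 406 is configured to obtain a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtain the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched;
the association code acquisition module 408 is configured to obtain, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched;
the first target corpus acquisition module 410 is configured to select a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the first target association code to obtain the target corpus; and
the target corpus sending module 412 is configured to send the target corpus to the terminal.
In some embodiments, the apparatus further includes a semantic network generation module, configured to: obtain semantic trees of a plurality of preset semantic dimensions, where the semantic tree of each semantic dimension corresponds to a type identifier and contains a plurality of node corpora; encode the node corpora of each semantic tree according to the type identifier and a preset coding rule; calculate the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and establish an association relationship between the codes of any two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
In some embodiments, the apparatus further includes a second target corpus acquisition module, configured to: obtain the type identifier corresponding to the sub-code; select a second target association code from the association code set according to the type identifier corresponding to the sub-code; obtain, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched; and select a third target association code from the second association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the third target association code to obtain the target corpus.
In some embodiments, the sub-code acquisition module 406 is further configured to: traverse, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word belongs; calculate the matching degree between the sub-word to be searched and each traversed node corpus; and take the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
In some embodiments, the to-be-searched sub-word acquisition module 404 is further configured to, when any two words in the segmentation result are mutually exclusive words, obtain the mutual exclusion weight corresponding to each mutually exclusive word and use the word with the larger weight as a sub-word to be searched.
For the specific limitations of the search apparatus, refer to the limitations of the search method above; they are not repeated here. Each module in the above search apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or be independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call them and perform the operations corresponding to each module.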
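In some embodiments, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 5. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database of the computer device is used to store medical data. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions, when executed by the processor, implement a search method.
Those skilled in the art will understand that the structure shown in FIG. 5 is only a block diagram of the part of the structure related to the solution of this application and does not limit the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.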
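A computer device includes a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the processors, cause the one or more processors to perform the following steps: receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
In some embodiments, the processor further implements the following steps when executing the computer-readable instructions: obtaining semantic trees of a plurality of preset semantic dimensions, where the semantic tree of each semantic dimension corresponds to a type identifier and contains a plurality of node corpora; encoding the node corpora of each semantic tree according to the type identifier and a preset coding rule; calculating the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and establishing an association relationship between the codes of any two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
In some embodiments, before obtaining the corpus corresponding to the first target association code to obtain the target corpus, the processor further implements the following steps when executing the computer-readable instructions: obtaining the type identifier corresponding to the sub-code; selecting a second target association code from the first association code set according to the type identifier corresponding to the sub-code; obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched; and selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus. Obtaining the corpus corresponding to the first target association code to obtain the target corpus then includes: obtaining the corpus corresponding to the first target association code and the corpus corresponding to the third target association code to obtain the target corpus.
In some embodiments, obtaining a corresponding matching word from the pre-established semantic network according to the sub-word to be searched includes: traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word belongs; calculating the matching degree between the sub-word to be searched and each traversed node corpus; and taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
In some embodiments, obtaining a plurality of sub-words to be searched corresponding to the current medical term to be searched according to the segmentation result includes: when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word, and using the word with the larger weight as a sub-word to be searched.
In some embodiments, before selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus, the processor further implements the following step when executing the computer-readable instructions: establishing a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension, to obtain a mapping relationship table. Selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus then includes: looking up, in the mapping relationship table, the type identifier corresponding to each first association code in the first association code set; and determining each first association code whose type identifier is the same as the type identifier corresponding to the target corpus as a first target association code.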
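One or more non-volatile computer-readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps: receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus; segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched; obtaining a corresponding matching word from a pre-established semantic network according to each sub-word to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched; obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched; selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and sending the target corpus to the terminal.
In some embodiments, the computer-readable instructions further implement the following steps when executed by the processor: obtaining semantic trees of a plurality of preset semantic dimensions, where the semantic tree of each semantic dimension corresponds to a type identifier and contains a plurality of node corpora; encoding the node corpora of each semantic tree according to the type identifier and a preset coding rule; calculating the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and establishing an association relationship between the codes of any two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
In some embodiments, before obtaining the corpus corresponding to the first target association code to obtain the target corpus, the computer-readable instructions further implement the following steps when executed by the processor: obtaining the type identifier corresponding to the sub-code; selecting a second target association code from the first association code set according to the type identifier corresponding to the sub-code; obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched; and selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus. Obtaining the corpus corresponding to the first target association code to obtain the target corpus then includes: obtaining the corpus corresponding to the first target association code and the corpus corresponding to the third target association code to obtain the target corpus.
In some embodiments, obtaining a corresponding matching word from the pre-established semantic network according to the sub-word to be searched includes: traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word belongs; calculating the matching degree between the sub-word to be searched and each traversed node corpus; and taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
In some embodiments, obtaining a plurality of sub-words to be searched corresponding to the current medical term to be searched according to the segmentation result includes: when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word, and using the word with the larger weight as a sub-word to be searched.
In some embodiments, before selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus, the computer-readable instructions further implement the following step when executed by the processor: establishing a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension, to obtain a mapping relationship table. Selecting the first target association code from the first association code set according to the type identifier corresponding to the target corpus then includes: looking up, in the mapping relationship table, the type identifier corresponding to each first association code in the first association code set; and determining each first association code whose type identifier is the same as the type identifier corresponding to the target corpus as a first target association code.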
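A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features of the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not therefore be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. Therefore, the protection scope of this patent shall be subject to the appended claims.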

Claims (20)

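  1. A search method, comprising:
    receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus;
    segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched;
    obtaining a corresponding matching word from a pre-established semantic network according to the sub-words to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched;
    obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched;
    selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and
    sending the target corpus to the terminal.
  2. The method according to claim 1, wherein the step of generating the semantic network comprises:
    obtaining semantic trees of a plurality of preset semantic dimensions, the semantic tree of each semantic dimension corresponding to a type identifier and containing a plurality of node corpora;
    encoding the node corpora of the semantic tree according to the type identifier and a preset coding rule;
    calculating the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and
    establishing an association relationship between the codes of two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
  3. The method according to claim 1, wherein before the obtaining the corpus corresponding to the first target association code to obtain the target corpus, the method further comprises:
    obtaining the type identifier corresponding to the sub-code;
    selecting a second target association code from the first association code set according to the type identifier corresponding to the sub-code;
    obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched;
    selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus; and
    the obtaining the corpus corresponding to the first target association code to obtain the target corpus comprises:
    obtaining the corpus corresponding to the first target association code and the corpus corresponding to the third target association code to obtain the target corpus.
  4. The method according to claim 1, wherein the obtaining a corresponding matching word from a pre-established semantic network according to the sub-words to be searched comprises:
    traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word to be searched belongs;
    calculating the matching degree between the sub-word to be searched and each traversed node corpus; and
    taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
  5. The method according to any one of claims 1 to 4, wherein the obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched comprises:
    when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word; and
    using the word with the larger weight as a sub-word to be searched.
  6. The method according to claim 1, wherein before the selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, the method comprises:
    establishing a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension, to obtain a mapping relationship table;
    the selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus comprises:
    looking up, in the mapping relationship table, the type identifier corresponding to each first association code in the first association code set; and
    determining a first association code whose type identifier is the same as the type identifier corresponding to the target corpus as the first target association code.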
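  7. A search apparatus, comprising:
    a search request receiving module, configured to receive a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus;
    a to-be-searched sub-word acquisition module, configured to segment the current medical term to be searched and obtain, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched;
    a sub-code acquisition module, configured to obtain a corresponding matching word from a pre-established semantic network according to the sub-words to be searched, and obtain the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched;
    an association code acquisition module, configured to obtain, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched;
    a first target corpus acquisition module, configured to select a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtain the corpus corresponding to the first target association code to obtain the target corpus; and a target corpus sending module, configured to send the target corpus to the terminal.
  8. The apparatus according to claim 7, wherein the apparatus further comprises a semantic network generation module;
    the semantic network generation module is configured to: obtain semantic trees of a plurality of preset semantic dimensions, the semantic tree of each semantic dimension corresponding to a type identifier and containing a plurality of node corpora; encode the node corpora of the semantic tree according to the type identifier and a preset coding rule; calculate the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and establish an association relationship between the codes of two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.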
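  9. A computer device, comprising a memory and one or more processors, the memory storing computer-readable instructions which, when executed by the one or more processors, cause the one or more processors to perform the following steps:
    receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus;
    segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched;
    obtaining a corresponding matching word from a pre-established semantic network according to the sub-words to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched;
    obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched;
    selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and
    sending the target corpus to the terminal.
  10. The computer device according to claim 9, wherein the processor further performs the following steps when executing the computer-readable instructions:
    obtaining semantic trees of a plurality of preset semantic dimensions, the semantic tree of each semantic dimension corresponding to a type identifier and containing a plurality of node corpora;
    encoding the node corpora of the semantic tree according to the type identifier and a preset coding rule;
    calculating the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and
    establishing an association relationship between the codes of two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
  11. The computer device according to claim 9, wherein the processor further performs the following steps when executing the computer-readable instructions:
    obtaining the type identifier corresponding to the sub-code;
    selecting a second target association code from the first association code set according to the type identifier corresponding to the sub-code;
    obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched;
    selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus; and
    obtaining the corpus corresponding to the first target association code and the corpus corresponding to the third target association code to obtain the target corpus.
  12. The computer device according to claim 9, wherein the processor further performs the following steps when executing the computer-readable instructions:
    traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word to be searched belongs;
    calculating the matching degree between the sub-word to be searched and each traversed node corpus; and
    taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
  13. The computer device according to any one of claims 9 to 12, wherein the processor further performs the following steps when executing the computer-readable instructions:
    when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word; and
    using the word with the larger weight as a sub-word to be searched.
  14. The computer device according to claim 9, wherein the processor further performs the following steps when executing the computer-readable instructions:
    establishing a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension, to obtain a mapping relationship table;
    looking up, in the mapping relationship table, the type identifier corresponding to each first association code in the first association code set; and
    determining a first association code whose type identifier is the same as the type identifier corresponding to the target corpus as the first target association code.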
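  15. One or more non-volatile computer-readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    receiving a search request sent by a terminal, the search request carrying a current medical term to be searched and a type identifier corresponding to a target corpus;
    segmenting the current medical term to be searched, and obtaining, according to the segmentation result, a plurality of sub-words to be searched corresponding to the current medical term to be searched;
    obtaining a corresponding matching word from a pre-established semantic network according to the sub-words to be searched, and obtaining the code corresponding to the matching word as a sub-code corresponding to the current medical term to be searched;
    obtaining, from the semantic network, the association codes corresponding to each sub-code to obtain a first association code set corresponding to the current medical term to be searched;
    selecting a first target association code from the first association code set according to the type identifier corresponding to the target corpus, and obtaining the corpus corresponding to the first target association code to obtain the target corpus; and
    sending the target corpus to the terminal.
  16. The storage medium according to claim 15, wherein the computer-readable instructions further perform the following steps when executed by the processor:
    obtaining semantic trees of a plurality of preset semantic dimensions, the semantic tree of each semantic dimension corresponding to a type identifier and containing a plurality of node corpora;
    encoding the node corpora of the semantic tree according to the type identifier and a preset coding rule;
    calculating the pairwise co-occurrence frequency between the node corpora of the semantic tree of each dimension and the node corpora of the semantic trees of the other dimensions; and
    establishing an association relationship between the codes of two node corpora whose co-occurrence frequency is greater than a preset threshold, to generate the semantic network.
  17. The storage medium according to claim 15, wherein the computer-readable instructions further perform the following steps when executed by the processor:
    obtaining the type identifier corresponding to the sub-code;
    selecting a second target association code from the first association code set according to the type identifier corresponding to the sub-code;
    obtaining, from the semantic network, the association codes corresponding to the second target association code to obtain a second association code set corresponding to the current medical term to be searched;
    selecting a third target association code from the second association code set according to the type identifier corresponding to the target corpus; and
    obtaining the corpus corresponding to the first target association code and the corpus corresponding to the third target association code to obtain the target corpus.
  18. The storage medium according to claim 15, wherein the computer-readable instructions further perform the following steps when executed by the processor:
    traversing, according to the sub-word to be searched, the semantic tree corresponding to the semantic dimension to which the sub-word to be searched belongs;
    calculating the matching degree between the sub-word to be searched and each traversed node corpus; and
    taking the node corpus with the maximum matching degree as the matching word corresponding to the sub-word to be searched.
  19. The storage medium according to any one of claims 15 to 18, wherein the computer-readable instructions further perform the following steps when executed by the processor:
    when any two words in the segmentation result are mutually exclusive words, obtaining the mutual exclusion weight corresponding to each mutually exclusive word; and
    using the word with the larger weight as a sub-word to be searched.
  20. The storage medium according to claim 15, wherein the computer-readable instructions further perform the following steps when executed by the processor:
    establishing a mapping relationship between the codes of each semantic dimension and the type identifier corresponding to that semantic dimension, to obtain a mapping relationship table;
    looking up, in the mapping relationship table, the type identifier corresponding to each first association code in the first association code set; and
    determining a first association code whose type identifier is the same as the type identifier corresponding to the target corpus as the first target association code.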
PCT/CN2019/096978 2018-08-14 2019-07-22 搜索方法、装置、计算机设备和存储介质 WO2020034810A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810923258.7A CN109215796B (zh) 2018-08-14 2018-08-14 搜索方法、装置、计算机设备和存储介质
CN201810923258.7 2018-08-14

Publications (1)

Publication Number Publication Date
WO2020034810A1 true WO2020034810A1 (zh) 2020-02-20

Family

ID=64988597

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/096978 WO2020034810A1 (zh) 2018-08-14 2019-07-22 搜索方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN109215796B (zh)
WO (1) WO2020034810A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111552780A (zh) * 2020-04-29 2020-08-18 微医云(杭州)控股有限公司 医用场景的搜索处理方法、装置、存储介质及电子设备
CN111899822A (zh) * 2020-06-28 2020-11-06 广州万孚生物技术股份有限公司 医疗机构数据库构建方法、查询方法、装置、设备和介质
CN112395408A (zh) * 2020-11-19 2021-02-23 平安科技(深圳)有限公司 停用词表生成方法、装置、电子设备及存储介质

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215796B (zh) * 2018-08-14 2023-04-25 深圳平安医疗健康科技服务有限公司 搜索方法、装置、计算机设备和存储介质
CN110704578B (zh) * 2019-10-09 2022-08-09 北京秒针人工智能科技有限公司 关联关系确定方法、装置、电子设备及可读存储介质
CN111291137B (zh) * 2020-01-22 2023-05-09 奇安信科技集团股份有限公司 基于实体关系的搜索方法和系统
CN111339193B (zh) * 2020-02-21 2023-06-27 腾讯云计算(北京)有限责任公司 类别的编码方法及装置
CN111341458B (zh) * 2020-02-27 2020-11-03 国家卫生健康委科学技术研究所 基于多层级结构相似度的单基因病名称推荐方法和系统
CN111581337A (zh) * 2020-03-19 2020-08-25 平安科技(深圳)有限公司 医疗文本搜索方法、装置、计算机设备及存储介质
CN111985241B (zh) * 2020-09-03 2023-08-08 深圳平安智慧医健科技有限公司 医学信息查询方法、装置、电子设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102043812A (zh) * 2009-10-13 2011-05-04 北京大学 一种医疗信息的检索方法及系统
US20170140008A1 (en) * 2015-11-13 2017-05-18 Google Inc. Suggestion-based differential diagnostics
CN107680689A (zh) * 2017-05-05 2018-02-09 平安科技(深圳)有限公司 医疗文本的潜在疾病推断方法、系统及可读存储介质
CN108133756A (zh) * 2017-12-26 2018-06-08 医渡云(北京)技术有限公司 医疗数据搜索方法及装置、存储介质、电子设备
CN109215796A (zh) * 2018-08-14 2019-01-15 平安医疗健康管理股份有限公司 搜索方法、装置、计算机设备和存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101089841B (zh) * 2006-06-14 2010-07-21 联想(北京)有限公司 基于知识编码的精确搜索方法和系统
JP5131923B2 (ja) * 2008-11-11 2013-01-30 日本電信電話株式会社 単語間関連度判定装置、単語間関連度判定方法、プログラムおよび記録媒体
JP6109664B2 (ja) * 2013-07-17 2017-04-05 Kddi株式会社 言語体系の間で同義語句に対する特定の感情を推定するプログラム、装置及び方法
CN104156415B (zh) * 2014-07-31 2017-04-12 沈阳锐易特软件技术有限公司 解决医疗数据标准编码对照问题的映射处理系统及方法
CN107731269B (zh) * 2017-10-25 2020-06-26 山东众阳软件有限公司 基于原始诊断数据和病历文件数据的疾病编码方法及系统

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
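CN111552780A (zh) * 2020-04-29 2020-08-18 微医云(杭州)控股有限公司 Search processing method, apparatus, storage medium, and electronic device for medical scenarios
CN111552780B (zh) * 2020-04-29 2023-09-29 微医云(杭州)控股有限公司 Search processing method, apparatus, storage medium, and electronic device for medical scenarios
CN111899822A (zh) * 2020-06-28 2020-11-06 广州万孚生物技术股份有限公司 Medical institution database construction method, query method, apparatus, device, and medium
CN111899822B (zh) * 2020-06-28 2024-01-30 广州万孚生物技术股份有限公司 Medical institution database construction method, query method, apparatus, device, and medium
CN112395408A (zh) * 2020-11-19 2021-02-23 平安科技(深圳)有限公司 Stop word list generation method, apparatus, electronic device, and storage medium
CN112395408B (zh) * 2020-11-19 2023-11-07 平安科技(深圳)有限公司 Stop word list generation method, apparatus, electronic device, and storage medium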

Also Published As

Publication number Publication date
CN109215796A (zh) 2019-01-15
CN109215796B (zh) 2023-04-25

Similar Documents

Publication Publication Date Title
WO2020034810A1 (zh) Search method, apparatus, computer device, and storage medium
US11227118B2 (en) Methods, devices, and systems for constructing intelligent knowledge base
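CN111814447B (zh) Electronic case duplicate-checking method and apparatus based on word-segmented text, and computer device
WO2019136993A1 (zh) Text similarity calculation method, apparatus, computer device, and storage medium
CN110168523B (zh) Change monitoring cross-graph query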
US20210312139A1 (en) Method and apparatus of generating semantic feature, method and apparatus of training model, electronic device, and storage medium
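JP6456162B2 (ja) Anonymization processing device, anonymization processing method, and program
WO2020052162A1 (zh) Disease data mapping method, apparatus, computer device, and storage medium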
US11334609B2 (en) Semantic structure search device and semantic structure search method
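WO2022148055A1 (zh) File retrieval method and computing device
WO2020206910A1 (zh) Product information push method, apparatus, computer device, and storage medium
WO2020114100A1 (zh) Information processing method, apparatus, and computer storage medium
WO2021151358A1 (zh) Triage information recommendation method based on an interpretation model, apparatus, device, and medium
WO2019148712A1 (zh) Phishing website detection method, apparatus, computer device, and storage medium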
EP3940550A1 (en) Data compression methods and systems based on key-value store
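CN112215008A (zh) Entity recognition method based on semantic understanding, apparatus, computer device, and medium
CN110134965B (zh) Method, apparatus, device, and computer-readable storage medium for information processing
WO2020034808A1 (zh) Decision data acquisition method, apparatus, computer device, and storage medium
CN109213775B (zh) Search method, apparatus, computer device, and storage medium
CN114138817A (zh) Data query method based on a relational database, device, medium, and product
WO2021103594A1 (zh) Rapport (tacit understanding) degree detection method, device, server, and readable storage medium
JP2020527762A (ja) Method and apparatus for identifying medical entities in medical text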
EP3926453A1 (en) Partitioning method and apparatus therefor
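CN112800179A (zh) Associated database query method, apparatus, storage medium, and electronic device
WO2018214956A1 (zh) Machine translation method, apparatus, and storage medium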

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19849671; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19849671; Country of ref document: EP; Kind code of ref document: A1)