WO2020132918A1 - Drug prediction method, device, computer equipment, and storage medium - Google Patents

Drug prediction method, device, computer equipment, and storage medium

Info

Publication number
WO2020132918A1
Authority
WO
WIPO (PCT)
Prior art keywords
symptom
vector
medicine
drug
target
Prior art date
Application number
PCT/CN2018/123761
Other languages
English (en)
French (fr)
Inventor
熊友军
罗沛鹏
廖洪涛
Original Assignee
深圳市优必选科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市优必选科技有限公司
Publication of WO2020132918A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • The present invention relates to the field of computer processing, and in particular to a drug prediction method, device, computer equipment, and storage medium.
  • Predicting drugs from symptoms can serve as an assistive medical technology and can relieve the pressure on hospital doctors to a certain extent.
  • An embodiment of the present invention provides a drug prediction method, the method including:
  • The target drug corresponding to the symptom information is determined according to the vector distance.
  • An embodiment of the present invention provides a drug prediction device, the device including:
  • An obtaining module, configured to obtain symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
  • A vector determination module, configured to determine the symptom vector corresponding to each symptom;
  • A calculation module, configured to calculate, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database;
  • A medicine determination module, configured to determine the target drug corresponding to the symptom information according to the vector distance.
  • An embodiment of the present invention provides a computer device, including a memory and a processor.
  • The memory stores a computer program.
  • When the computer program is executed by the processor, the processor is caused to perform the following steps:
  • The target drug corresponding to the symptom information is determined according to the vector distance.
  • An embodiment of the present invention provides a computer-readable storage medium that stores a computer program.
  • When the computer program is executed by a processor, the processor is caused to perform the following steps:
  • The target drug corresponding to the symptom information is determined according to the vector distance.
  • After obtaining the symptom information for which a drug is to be predicted, the above drug prediction method, device, computer equipment, and storage medium determine the symptom vector corresponding to each symptom, calculate the vector distance between each symptom vector and the drug vector of each drug in the drug database, and then determine the target drug corresponding to the symptom information according to the vector distance.
  • By converting the matching relationship between symptoms and drugs into a vector operation, the above drug prediction method can quickly find the target drug through a distance calculation between the symptom vector and the drug vector. This greatly improves search speed, and therefore drug prediction efficiency, and the search method also helps improve prediction accuracy.
  • FIG. 1 is an application environment diagram of a medicine prediction method in an embodiment
  • FIG. 2 is a flowchart of a medicine prediction method in an embodiment
  • FIG. 3 is a schematic diagram of the principle of CBOW and Skip-gram prediction in an embodiment
  • FIG. 5 is a schematic diagram of visualization of different symptoms in a two-dimensional space in an embodiment
  • FIG. 6 is a schematic flowchart of a method for training a word vector model in an embodiment
  • FIG. 7 is a flowchart of a method for determining a target drug in an embodiment
  • FIG. 8 is a schematic flowchart of a method for predicting drugs in an embodiment
  • FIG. 9 is a structural block diagram of a medicine prediction device in an embodiment
  • FIG. 10 is a structural block diagram of a medicine prediction device in another embodiment
  • FIG. 11 is a structural block diagram of a medicine prediction device in still another embodiment
  • FIG. 12 is an internal structure diagram of a computer device in an embodiment.
  • FIG. 1 is an application environment diagram of a medicine prediction method in an embodiment.
  • The drug prediction method is applied to a drug prediction system.
  • The drug prediction system includes a terminal 110 and a server 120.
  • The terminal 110 and the server 120 are connected through a network.
  • The terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like.
  • The server 120 may be implemented as an independent server or as a server cluster composed of multiple servers.
  • The terminal 110 obtains the symptom information for which a drug is to be predicted, the symptom information including at least one symptom, and uploads the symptom information to the server 120.
  • After obtaining the symptom information, the server 120 determines the symptom vector corresponding to each symptom, calculates the vector distance between each symptom vector and the drug vector corresponding to each drug in the drug database, determines the target drug corresponding to the symptom information according to the vector distance, and returns the target drug to the terminal 110.
  • The above drug prediction method may also be applied directly on the terminal 110. In that case the terminal 110 obtains the symptom information, which includes at least one symptom, determines the symptom vector corresponding to each symptom, calculates the vector distance between each symptom vector and the drug vector corresponding to each drug in the drug database, and determines the target drug corresponding to the symptom information according to the vector distance.
  • the medicine prediction method can be applied to a terminal or a server.
  • the application to a terminal is taken as an example.
  • the medicine prediction method specifically includes the following steps:
  • Step 202: Obtain the symptom information for which a drug is to be predicted, the symptom information including at least one symptom.
  • The symptom information describes the characteristics of the illness.
  • The symptom information includes one or more symptoms.
  • Symptoms are the characteristics of the illness, such as headache and fever.
  • Step 204: Determine the symptom vector corresponding to each symptom.
  • The symptom vector is the vector representation of a symptom.
  • The symptom vector can be obtained by training a word vector model (for example, a word2vec model).
  • Word vectors trained by word2vec cluster words that frequently co-occur, so when symptoms often appear together, their word vectors are very close in space. For a given drug, a group of symptoms often appears together; for example, "headache" and "fever" often appear together.
  • The symptom vector and the symptom are stored in association. Once a symptom is obtained, the corresponding symptom vector can be found quickly from the correspondence between symptoms and symptom vectors.
  • Step 206: According to the symptom vector of each symptom, calculate the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database.
  • The drug database stores each drug and the drug vector corresponding to each drug. The corresponding drug is found by calculating the spatial vector distance between the symptom vector and the drug vector.
  • Vector distance refers to the distance between vectors.
  • The Euclidean distance can be used to calculate the vector distance. Let $d$ denote the vector distance, and let $X_{1i}$ and $X_{2i}$ denote the $i$-th components of the symptom vector and the drug vector, respectively. The vector distance is then $d = \sqrt{\sum_{i=1}^{n} (X_{1i} - X_{2i})^2}$, where $n$ is the dimension of the vectors. The smaller the calculated vector distance, the closer the symptom is to the drug.
  • Step 208: Determine the target drug corresponding to the symptom information according to the vector distance.
  • The drugs are sorted by the calculated vector distance from small to large, and the top-ranked drug is used as the predicted target drug.
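Steps 206 and 208 for a single symptom can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the drug names, the two-dimensional vectors, and the helper names `euclidean_distance` and `rank_drugs` are hypothetical and do not come from the patent.

```python
import math

def euclidean_distance(x1, x2):
    """Euclidean distance between two equal-length vectors."""
    if len(x1) != len(x2):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def rank_drugs(symptom_vector, drug_database):
    """Return (drug, distance) pairs sorted from nearest to farthest."""
    distances = [
        (drug, euclidean_distance(symptom_vector, vector))
        for drug, vector in drug_database.items()
    ]
    return sorted(distances, key=lambda pair: pair[1])

# Hypothetical drug vectors, for illustration only.
drug_database = {
    "drug_a": [0.0, 0.0],
    "drug_b": [3.0, 4.0],
}
ranking = rank_drugs([0.0, 1.0], drug_database)
target_drug = ranking[0][0]  # the top-ranked (nearest) drug
```

Sorting the full list, rather than taking only the minimum, keeps the ranked alternatives available if more than one candidate drug is wanted.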
  • When the symptom information includes multiple symptoms, the vector distance between each symptom vector and the drug vector is calculated separately, and the calculated vector distances are averaged to obtain the average vector distance to that drug vector.
  • The target drug is determined by comparing the average vector distance corresponding to each drug.
  • The average vector distance for a drug is $\bar{d} = \frac{1}{K}\sum_{k=1}^{K} d_k$, where $d_k$ is the vector distance between the $k$-th symptom vector and the drug vector and $K$ is the number of symptoms. The average vector distance between the symptom information and each drug is obtained in this way, the average vector distances are sorted, and the drug with the shortest average vector distance is taken as the target drug.
  • With the traditional text matching method, a different order of symptoms is likely to return different search results.
  • By converting the text matching problem into a mathematical vector operation, the above drug prediction method greatly improves prediction speed and places no requirement on the order of symptoms: different symptom orders yield the same prediction. The vector distances also make it easy to rank the retrieved drugs. Compared with traditional text matching, the method is therefore more efficient and also improves flexibility, accuracy, and operability.
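The multi-symptom case, and the claim that symptom order does not matter, can be illustrated with a short Python sketch. The vectors and names below (`average_distance`, `predict_drug`, `drug_a`, `drug_b`) are hypothetical; the sketch assumes Euclidean distance and a plain arithmetic mean, as described above.

```python
import math

def euclidean_distance(x1, x2):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x1, x2)))

def average_distance(symptom_vectors, drug_vector):
    """Mean of the distances from each symptom vector to one drug vector."""
    if not symptom_vectors:
        raise ValueError("at least one symptom vector is required")
    total = sum(euclidean_distance(v, drug_vector) for v in symptom_vectors)
    return total / len(symptom_vectors)

def predict_drug(symptom_vectors, drug_database):
    """Pick the drug whose vector has the smallest average distance."""
    return min(
        drug_database,
        key=lambda drug: average_distance(symptom_vectors, drug_database[drug]),
    )

# Hypothetical vectors, for illustration only.
symptoms = [[1.0, 0.0], [0.0, 1.0]]
drug_database = {"drug_a": [0.5, 0.5], "drug_b": [5.0, 5.0]}
```

Because the average is taken over a set of distances, calling `predict_drug(symptoms, drug_database)` and `predict_drug(symptoms[::-1], drug_database)` select the same drug.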
  • Before determining the symptom vector corresponding to each symptom, the method further includes:
  • Step 210: Obtain the symptoms corresponding to each drug to obtain a symptom training sample set.
  • The symptom training sample set includes multiple symptom training samples.
  • The symptoms corresponding to a drug can be obtained from the symptoms listed as treated in the drug's instructions. These symptoms are collated, with each symptom treated as one word and the words separated by spaces.
  • The drugs are first deduplicated: drugs with the same name that treat the same symptoms, for example drugs that differ only in specification or dosage, are merged.
  • The symptoms of the same drug are put together as one training sample, so different drugs correspond to different training samples. Because a drug often treats only a few symptoms, and too small a corpus leads to poor training, the number of symptoms needs to be expanded.
  • A sample can be expanded into a new corpus by repeatedly copying the symptoms the drug treats. For example, copying the three symptoms headache, fever, and nasal congestion twice yields the new training sample {headache, fever, nasal congestion, headache, fever, nasal congestion, headache, fever, nasal congestion}, thereby expanding the training sample set.
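The corpus preparation in step 210 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_training_samples`, the `extra_copies` parameter, and the use of underscores to keep a multi-word symptom as a single token are all assumptions.

```python
def build_training_samples(drug_symptoms, extra_copies=2):
    """Build one space-separated training sample per distinct drug.

    drug_symptoms: iterable of (drug_name, list_of_symptoms) pairs.
    extra_copies: how many additional times the symptom list is repeated.
    """
    seen = set()
    samples = []
    for name, symptoms in drug_symptoms:
        # Same name and same treated symptoms counts as a duplicate entry.
        key = (name, tuple(sorted(symptoms)))
        if key in seen:
            continue
        seen.add(key)
        samples.append(" ".join(list(symptoms) * (1 + extra_copies)))
    return samples

# Hypothetical records: the second entry duplicates the first.
records = [
    ("drug_a", ["headache", "fever", "nasal_congestion"]),
    ("drug_a", ["headache", "fever", "nasal_congestion"]),
]
samples = build_training_samples(records)
```

With `extra_copies=2`, each symptom appears three times in the sample, matching the headache, fever, nasal congestion example above.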
  • Step 212: The symptom training samples are used as input to the word vector model for unsupervised training, to obtain the symptom vector corresponding to each symptom.
  • After the training is completed, the symptom vector corresponding to each symptom can be obtained.
  • The word vector model can be the word2vec model.
  • Trained on the symptom training samples, the word vector model clusters words that frequently co-occur, so symptoms that appear together obtain word vectors that are very close in space. Through the word vector model, the symptom vector corresponding to each symptom can be obtained.
  • The symptoms in the instructions for each drug are segmented into symptom words, and the word2vec model is then used to train the symptoms into symptom vectors.
  • Word2vec comes in two variants, CBOW and Skip-gram.
  • CBOW predicts the probability of the current word from its context; Skip-gram does the opposite, predicting the probability of the context from the current word.
  • Figure 4 is a schematic diagram of the principle of CBOW and Skip-gram prediction.
  • Here, w(t) is a word in the text, and w(t-1) and w(t+1) are the previous word and the next word of w(t) in the text, respectively.
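The difference between CBOW and Skip-gram is easiest to see in the training examples each one builds from a text. The sketch below only generates those (input, target) pairs; it does not train a model, and the function names and the three-word sample are hypothetical.

```python
def context_windows(words, window=1):
    """Yield (center_word, context_words) for every position t in the text."""
    for t, center in enumerate(words):
        left = words[max(0, t - window):t]
        right = words[t + 1:t + 1 + window]
        yield center, left + right

def cbow_examples(words, window=1):
    """CBOW: the context is the input, the current word w(t) is the target."""
    return [(context, center) for center, context in context_windows(words, window)]

def skipgram_examples(words, window=1):
    """Skip-gram: w(t) is the input, each context word is a target."""
    return [
        (center, neighbor)
        for center, context in context_windows(words, window)
        for neighbor in context
    ]

words = ["headache", "fever", "cough"]
```

With `window=1`, the CBOW example for "fever" is `(["headache", "cough"], "fever")`, while Skip-gram produces the two pairs `("fever", "headache")` and `("fever", "cough")`.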
  • the dimension of the symptom vector can be determined according to the number of medicines. For example, 10 dimensions can be used, and the context window can be set to 3.
  • Step 214: The drug vector corresponding to a drug is calculated according to the symptom vectors of the symptoms corresponding to the drug, and the drug vector is stored in the drug database.
  • The drug vector corresponding to the drug can be calculated according to the symptoms corresponding to the drug.
  • The symptom vectors corresponding to the drug may be averaged to obtain an average vector, and the average vector may then be used as the drug vector of the drug.
  • Each drug can correspond to multiple symptoms. Expressed mathematically, a drug can be represented by a point in space, and the symptoms it treats are distributed around that center point. Figure 5 visualizes the distribution of different drugs and their symptoms, reduced from three dimensions to two. In the figure, each symptom is represented by a dot, and dots of the same type correspond to the symptoms of the same drug. The figure shows that the symptoms treated by each drug are very close in space, while the symptoms of different drugs are far apart. The points for the drugs themselves are not marked in the figure.
  • FIG. 6 is a schematic flowchart of a method for training a word vector model in an embodiment.
  • (1) Deduplicate the drugs first, merging drugs with the same name that treat the same symptoms.
  • (2) Organize the symptoms treated according to the drug instructions, treating each symptom as a word, separated by spaces.
  • (3) Use the word2vec model to train the symptoms into symptom vectors; the dimension can be determined according to the number of drugs.
  • (4) Store the symptom vectors.
  • (5) The word vector model reflects the distribution of drug symptoms.
  • (6) Using the symptom vectors of the symptoms treated by each drug, find the drug vector (the geometric center of its symptom vectors) for each drug.
  • Calculating the drug vector corresponding to a drug according to the symptom vectors of the symptoms corresponding to the drug includes: when the drug corresponds to multiple symptoms, obtaining the symptom vector corresponding to each symptom; calculating the average vector of the multiple symptom vectors; and using the average vector as the drug vector of the corresponding drug.
  • When a drug corresponds to multiple symptoms, the drug vector is determined from the symptom vector of each symptom.
  • The geometric center of the multiple symptom vectors can be used as the drug vector of the corresponding drug.
  • The multiple symptom vectors are averaged to obtain an average vector.
  • The average vector is the vector at the geometric center of the multiple symptom vectors.
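Computing a drug vector as the geometric center of its symptom vectors amounts to a component-wise mean. The following sketch assumes symptom vectors are plain lists of floats; the symptom vectors, drug names, and the helper `drug_vector` are hypothetical.

```python
def drug_vector(symptom_vectors):
    """Component-wise mean (geometric center) of a drug's symptom vectors."""
    if not symptom_vectors:
        raise ValueError("a drug needs at least one symptom vector")
    count = len(symptom_vectors)
    return [sum(components) / count for components in zip(*symptom_vectors)]

# Hypothetical trained symptom vectors and drug-to-symptom lists.
symptom_vectors = {
    "headache": [1.0, 2.0],
    "fever": [3.0, 4.0],
    "cough": [5.0, 9.0],
}
drug_symptoms = {"drug_a": ["headache", "fever"], "drug_b": ["cough"]}

# Build the drug database: one centroid vector per drug.
drug_database = {
    drug: drug_vector([symptom_vectors[s] for s in symptoms])
    for drug, symptoms in drug_symptoms.items()
}
```

A drug with a single symptom simply inherits that symptom's vector, as `drug_b` does here.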
  • Determining the target drug corresponding to the symptom information according to the vector distance includes:
  • Step 208A: When the symptom information includes multiple symptoms, calculate the average vector distance according to the vector distance between each symptom vector and the drug vector.
  • The vector distance between each symptom vector and each drug vector is calculated separately, and the average vector distance between the multiple symptom vectors and a given drug vector is calculated from the vector distances between those symptom vectors and that drug vector.
  • The average vector distance for a drug is $\bar{d} = \frac{1}{K}\sum_{k=1}^{K} d_k$, where $d_k$ is the vector distance between the $k$-th symptom vector and the drug vector and $K$ is the number of symptoms. The average vector distance between the symptom information and each drug is then found.
  • Step 208B: Calculate the target vector distance between the symptom vectors corresponding to the symptom information and the drug vector according to the average vector distance and the number of symptoms.
  • The target vector distance decreases as the number of symptoms increases.
  • The target vector distance is calculated using a piecewise formula in K (the formula is not reproduced here), in which the weighted target vector distance is computed from the average vector distance of the multiple symptoms, and K represents the number of symptoms.
  • Step 208C: Determine the target drug corresponding to the symptom information according to the target vector distance between the multiple symptom vectors and the drug vector of each drug.
  • The target vector distances are sorted, and the drug corresponding to the shortest target vector distance is used as the target drug.
  • Before obtaining the symptom information for which a drug is to be predicted, the method further includes: acquiring a consultation dialogue text and performing word segmentation on it to obtain a plurality of words; when a word can be found in the symptom entity database, the word is used as a symptom in the symptom information.
  • The consultation dialogue text is text describing the user's symptoms.
  • The consultation dialogue text may be text obtained by recognizing the user's speech, or text entered directly. After the consultation dialogue text is obtained, word segmentation is performed on it to obtain multiple words.
  • The symptom entity database stores the words for various symptoms. Each obtained word is matched against the words in the symptom entity database; if the word is found there, it is a word describing a symptom and is used as a symptom in the symptom information.
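The lookup against the symptom entity database can be sketched as a set membership test over the segmented words. The `segment` function below is only a stand-in that splits English text into lowercase alphabetic tokens; a real system for Chinese dialogue would need a proper word segmenter, and all names and data here are hypothetical.

```python
import re

def segment(text):
    """Stand-in word segmentation: lowercase alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def extract_symptoms(text, symptom_entities):
    """Keep the segmented words that are found in the symptom entity database."""
    symptoms = []
    for word in segment(text):
        if word in symptom_entities and word not in symptoms:
            symptoms.append(word)
    return symptoms

# Hypothetical symptom entity database.
symptom_entities = {"headache", "fever", "cough"}
symptoms = extract_symptoms("I have a headache and fever", symptom_entities)
```

Storing the entity database as a set keeps each lookup constant-time, and skipping repeated words avoids counting the same symptom twice.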
  • The consultation dialogue text is input to a symptom entity recognition model, which recognizes the text to obtain the corresponding symptom entities (i.e., words describing symptoms).
  • After word segmentation is performed on the consultation dialogue text to obtain a plurality of words, the method further includes: obtaining a word mapping relationship table, and obtaining the target word corresponding to each word according to the table. Using the word as a symptom in the symptom information when the word can be found in the symptom entity database then includes: when the target word can be found in the symptom entity database, using the target word as a symptom in the symptom information.
  • Words describing the headache symptom include colloquial expressions such as "brain pain" as well as the standard "headache". Therefore, after word segmentation is performed on the text to obtain multiple words, a word mapping relationship table is obtained, and the target word corresponding to each word is looked up in the table.
  • The word mapping relationship table converts colloquial symptom words into standard symptom words (target words). When the target word can be found in the symptom entity database, the target word is used as a symptom in the symptom information.
  • For example, if the word obtained directly by text segmentation is "brain pain" and the word mapping relationship table records the mapping between "brain pain" and "headache", the target word "headache" is obtained, and "headache" is then used as a symptom in the symptom information.
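The word mapping step can be sketched as a dictionary lookup applied before the symptom entity check. The sketch assumes that a word with no entry in the mapping table maps to itself; that fallback, the function names, and the sample data are assumptions for illustration.

```python
def normalize_words(words, word_mapping):
    """Map colloquial words to standard target words; unmapped words pass through."""
    return [word_mapping.get(word, word) for word in words]

def symptoms_from_words(words, word_mapping, symptom_entities):
    """Use a target word as a symptom only if the entity database contains it."""
    symptoms = []
    for target in normalize_words(words, word_mapping):
        if target in symptom_entities and target not in symptoms:
            symptoms.append(target)
    return symptoms

# Hypothetical mapping table and symptom entity database.
word_mapping = {"brain pain": "headache"}
symptom_entities = {"headache", "fever"}
symptoms = symptoms_from_words(
    ["brain pain", "fever", "today"], word_mapping, symptom_entities
)
```

Normalizing before the lookup means the symptom entity database only needs to hold the standard terms.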
  • After the target drug is determined, the method further includes: acquiring the target symptoms corresponding to the target drug; comparing the target symptoms with the symptoms in the symptom information; and, when the target symptoms contain all the symptoms in the symptom information, determining that the target drug is safe.
  • After the target drug is predicted, the symptoms treated by the predicted drug are checked to ensure the safety of the prediction: the drug must be able to treat the symptoms described by the user before it is recommended. The target symptoms corresponding to the target drug are therefore obtained and compared with the symptoms in the symptom information, and the predicted target drug is determined to be safe only when the target symptoms contain all the symptoms in the symptom information.
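The safety check described here is a subset test: every reported symptom must be among the symptoms the target drug treats. A minimal sketch, with a hypothetical function name and sample data, compares symptom words directly:

```python
def is_target_drug_safe(target_symptoms, reported_symptoms):
    """True only if the drug's treated symptoms cover every reported symptom."""
    return set(reported_symptoms) <= set(target_symptoms)

# Hypothetical symptoms treated by the predicted target drug.
target_symptoms = ["headache", "fever", "nasal_congestion"]

covered = is_target_drug_safe(target_symptoms, ["headache", "fever"])
not_covered = is_target_drug_safe(target_symptoms, ["headache", "cough"])
```

In this example `covered` is true and `not_covered` is false, because the drug's instructions do not list "cough".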
  • The target symptom vector of each target symptom and the symptom vector of each symptom in the symptom information can be obtained respectively, and whether two symptoms are the same is determined from the vector distance between the target symptom vector and the symptom vector. For example, if the symptom vector of symptom 1 is v1 and the symptom vector of symptom 2 is v2, and the vector distance between them is 0, then symptom 1 and symptom 2 are the same.
  • FIG. 8 is a schematic flowchart of a drug prediction method in an embodiment. It includes the following steps: (1) Obtain the consultation dialogue text and segment it into words. (2) Extract one or more symptoms from the word segmentation results (for example, "headache" and "fever" in "I have a headache and fever"). (3) Obtain the symptom vector corresponding to each symptom from the trained word vectors. (4) Calculate the vector distance between the symptoms and each drug: obtain the drug vector of each drug and use the Euclidean distance to calculate the vector distance between symptom and drug. When a drug is predicted from multiple symptoms, the average vector distance between the multiple symptoms and the drug must be calculated.
  • (5) The vector distance is weighted according to the number of symptoms to obtain the target vector distance between the multiple symptoms and the drug, and the predicted target drug is determined according to the target vector distance.
  • (6) The target drug must be able to treat all the symptoms presented.
  • As shown in FIG. 9, in one embodiment, a drug prediction device is proposed, and the device includes:
  • the obtaining module 902 is used to obtain the symptom information of the medicine to be predicted, and the symptom information includes at least one symptom;
  • the vector determination module 904 is used to determine the symptom vector corresponding to each symptom
  • the calculation module 906 is configured to calculate the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database according to the symptom vector of each symptom;
  • the medicine determination module 908 is configured to determine the target medicine corresponding to the symptom information according to the vector distance.
  • the above medicine prediction device further includes:
  • the sample determination module 910 is used to obtain symptoms corresponding to each medicine to obtain a symptom training sample set, and the symptom training sample set includes multiple symptom training samples;
  • the input and output module 912 is used to perform unsupervised training on the symptom training samples as input to the word vector model to obtain symptom vectors corresponding to each symptom;
  • the storage module 914 is configured to calculate the medicine vector corresponding to the medicine according to the symptom vector of the symptom corresponding to the medicine, and store the medicine vector into the medicine database.
  • The storage module is further used to obtain the symptom vector corresponding to each symptom when the medicine corresponds to multiple symptoms, calculate the average vector of the multiple symptom vectors, and use the average vector as the medicine vector of the corresponding medicine.
  • The medicine determination module is further configured to: when the symptom information includes multiple symptoms, calculate an average vector distance according to the vector distance between each symptom vector and the medicine vector; calculate, according to the average vector distance and the number of symptoms, the target vector distance between the multiple symptom vectors corresponding to the symptom information and the medicine vector; and determine the target medicine corresponding to the symptom information according to the target vector distance between the multiple symptom vectors and the medicine vector of each medicine.
  • the above medicine prediction device further includes:
  • The text obtaining module 916 is used to obtain consultation dialogue text and perform word segmentation on it to obtain multiple words.
  • the symptom determination module 918 is configured to use the word as a symptom in the symptom information when the word can be found in the symptom entity database.
  • The above drug prediction device further includes a mapping module for acquiring a word mapping relationship table and acquiring the target word corresponding to each word according to the table; the symptom determination module is also used to use the target word as a symptom in the symptom information when the target word can be found in the symptom entity database.
  • The above drug prediction device further includes a comparison module for acquiring the target symptoms corresponding to the target drug, comparing the target symptoms with the symptoms in the symptom information, and determining that the target drug is safe when the target symptoms contain all the symptoms in the symptom information.
  • FIG. 12 shows an internal structure diagram of a computer device in an embodiment.
  • the computer device may be a terminal or a server.
  • the computer device includes a processor, a memory, and a network interface connected by a system bus.
  • the memory includes a non-volatile storage medium and an internal memory.
  • The non-volatile storage medium of the computer device stores an operating system and may also store a computer program.
  • When the computer program is executed by the processor, it may cause the processor to implement the drug prediction method.
  • A computer program may also be stored in the internal memory.
  • When that computer program is executed by the processor, it may cause the processor to execute the drug prediction method.
  • the network interface is used to communicate with the outside world.
  • FIG. 12 is only a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied.
  • A specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different component arrangement.
  • the medicine prediction method provided in this application may be implemented in the form of a computer program, and the computer program may run on a computer device as shown in FIG. 12.
  • the memory device of the computer device may store various program templates constituting the medicine prediction device.
  • A computer device includes a memory and a processor.
  • The memory stores a computer program.
  • When the computer program is executed by the processor, the processor is caused to perform the following steps: obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom; determining the symptom vector corresponding to each symptom; calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database; and determining the target drug corresponding to the symptom information according to the vector distance.
  • A computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps: obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom; determining the symptom vector corresponding to each symptom; calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database; and determining the target drug corresponding to the symptom information according to the vector distance.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM random access memory
  • DRAM dynamic RAM
  • SDRAM synchronous DRAM
  • DDRSDRAM double data rate SDRAM
  • ESDRAM enhanced SDRAM
  • SLDRAM synchronous link (Synchlink) DRAM
  • RDRAM Rambus direct RAM
  • DRDRAM direct Rambus dynamic RAM
  • RDRAM Rambus dynamic RAM

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Chemical & Material Sciences (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Epidemiology (AREA)
  • Medicinal Chemistry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

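A drug prediction method, apparatus, computer device, and storage medium. The method includes: obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom (202); determining a symptom vector corresponding to each symptom (204); calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database (206); and determining, according to the vector distance, the target drug corresponding to the symptom information (208). The drug prediction method greatly improves prediction efficiency.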

Description

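Drug prediction method, apparatus, computer device, and storage medium
Technical Field
The present invention relates to the field of computer processing, and in particular to a drug prediction method, apparatus, computer device, and storage medium.
Background
With the rise of artificial intelligence, intelligent medical consultation has become a trend. Predicting drugs from symptoms can serve as an assistive medical technology and can, to some extent, reduce the pressure on hospital doctors.
However, conventional machine consultation relies on text matching: it searches the drug instructions in a database directly by symptom and then selects a drug from the results. This kind of text matching is very rigid, and drug prediction is inefficient.
Summary
In view of this, it is necessary to address the above problem by providing a drug prediction method, apparatus, computer device, and storage medium with high prediction efficiency.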
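In a first aspect, an embodiment of the present invention provides a drug prediction method, the method including:
obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
determining a symptom vector corresponding to each symptom;
calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database;
determining, according to the vector distance, the target drug corresponding to the symptom information.
In a second aspect, an embodiment of the present invention provides a drug prediction apparatus, the apparatus including:
an obtaining module, configured to obtain symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
a vector determining module, configured to determine a symptom vector corresponding to each symptom;
a calculating module, configured to calculate, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database;
a drug determining module, configured to determine, according to the vector distance, the target drug corresponding to the symptom information.
In a third aspect, an embodiment of the present invention provides a computer device including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps:
obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
determining a symptom vector corresponding to each symptom;
calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database;
determining, according to the vector distance, the target drug corresponding to the symptom information.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the following steps:
obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
determining a symptom vector corresponding to each symptom;
calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database;
determining, according to the vector distance, the target drug corresponding to the symptom information.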
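With the above drug prediction method, apparatus, computer device, and storage medium, after the symptom information is obtained, the symptom vector corresponding to each symptom is determined, the vector distance between the symptom vector and the drug vector of each drug in the drug database is calculated from the symptom vector of each symptom, and the target drug corresponding to the symptom information is determined from the vector distance. The method converts the matching relationship between symptoms and drugs into vector arithmetic: the target drug can be found quickly through a distance computation between symptom vectors and drug vectors. This greatly increases lookup speed and therefore the efficiency of drug prediction, and this way of searching also helps improve prediction accuracy.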
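Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
In the drawings:
Fig. 1 is a diagram of the application environment of the drug prediction method in one embodiment;
Fig. 2 is a flowchart of the drug prediction method in one embodiment;
Fig. 3 is a schematic diagram of the principle of CBOW and Skip-gram prediction in one embodiment;
Fig. 4 is a flowchart of the drug prediction method in another embodiment;
Fig. 5 is a visualization of different symptoms in two-dimensional space in one embodiment;
Fig. 6 is a schematic flowchart of the word vector model training method in one embodiment;
Fig. 7 is a flowchart of the method for determining the target drug in one embodiment;
Fig. 8 is a schematic flowchart of the drug prediction method in one embodiment;
Fig. 9 is a structural block diagram of the drug prediction apparatus in one embodiment;
Fig. 10 is a structural block diagram of the drug prediction apparatus in another embodiment;
Fig. 11 is a structural block diagram of the drug prediction apparatus in yet another embodiment;
Fig. 12 is a diagram of the internal structure of the computer device in one embodiment.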
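Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present invention, not to limit it.
Fig. 1 is a diagram of the application environment of the drug prediction method in one embodiment. Referring to Fig. 1, the drug prediction method is applied to a drug prediction system. The drug prediction system includes a terminal 110 and a server 120, which are connected through a network. The terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 120 may be implemented as a standalone server or as a server cluster composed of multiple servers. The terminal 110 obtains the symptom information for which a drug is to be predicted, the symptom information including at least one symptom, and uploads it to the server 120. After obtaining the symptom information, the server 120 determines the symptom vector corresponding to each symptom, calculates, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database, determines the target drug corresponding to the symptom information according to the vector distance, and returns the target drug to the terminal 110.
In another embodiment, the drug prediction method may be applied directly on the terminal 110. The terminal 110 obtains the symptom information for which a drug is to be predicted, the symptom information including at least one symptom, determines the symptom vector corresponding to each symptom, calculates, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database, and determines the target drug corresponding to the symptom information according to the vector distance.
As shown in Fig. 2, a drug prediction method is provided. The method may be applied to a terminal or to a server; this embodiment takes application to a terminal as an example. The drug prediction method includes the following steps: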
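Step 202: obtain symptom information for which a drug is to be predicted, the symptom information including at least one symptom.
The symptom information is information describing the characteristics of an illness and includes one or more symptoms. A symptom is a characteristic of being ill, such as headache or fever. To predict a suitable drug for a sick person, animal, or plant, the symptom information of that person, animal, or plant must be obtained so that the drug can be predicted from it.
Step 204: determine the symptom vector corresponding to each symptom.
A symptom vector is the vector representation of a symptom. Symptom vectors can be obtained by training a word vector model (for example, a word2vec model). In one embodiment, the treated symptoms listed in the drug instructions are segmented into words and then used to train a word2vec model. Word vectors trained by word2vec cluster words that frequently co-occur, so symptoms that often appear together end up with word vectors that are close to each other in space. For drugs, symptoms of the same kind tend to appear together; for example, "headache" and "fever" often co-occur. After the symptom vector of each symptom has been obtained by training the word vector model, each symptom vector is stored in association with its symptom, so that when a symptom is obtained, its symptom vector can be looked up quickly from this correspondence.
Step 206: calculate, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in the drug database.
The drug database stores each drug together with its drug vector. The matching drug is found by calculating the spatial vector distance between the symptom vector and the drug vectors. The vector distance is the distance between vectors; in one embodiment, the Euclidean distance may be used. Let d denote the vector distance, and let X_{1i} and X_{2i} denote the i-th components of the symptom vector and the drug vector, respectively. The vector distance is then calculated as:
$$d = \sqrt{\sum_{i} \left(X_{1i} - X_{2i}\right)^{2}}$$
The smaller the calculated vector distance, the closer the symptom is to the drug.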
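The Euclidean distance of step 206 can be sketched in a few lines of Python. This is only an illustration: the function name `euclidean_distance` and the three-dimensional toy vectors are assumptions made for the example, not values from the application.

```python
import math


def euclidean_distance(symptom_vec, drug_vec):
    """Euclidean distance d between a symptom vector and a drug vector."""
    if len(symptom_vec) != len(drug_vec):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x1 - x2) ** 2 for x1, x2 in zip(symptom_vec, drug_vec)))


# Hypothetical toy vectors, chosen so the distances are easy to check by hand.
headache = [1.0, 2.0, 2.0]
drug_a = [1.0, 0.0, 2.0]
drug_b = [4.0, 6.0, 2.0]

print(euclidean_distance(headache, drug_a))  # 2.0
print(euclidean_distance(headache, drug_b))  # 5.0
```

In this toy case `drug_a` is closer to the symptom than `drug_b`, so it would rank first.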
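Step 208: determine, according to the vector distance, the target drug corresponding to the symptom information.
After the vector distance between the symptom vector and the drug vector of each drug has been calculated, the distances are sorted from smallest to largest and the top-ranked drug is taken as the predicted target drug. When there are multiple symptoms, the vector distance between each symptom vector and the drug vector is calculated separately, and these distances are averaged to obtain the average vector distance to that drug vector. The target drug is determined by comparing the average vector distances of the drugs.
In one embodiment, when multiple symptoms are used to predict one drug, let $\bar{d}$ be the average vector distance of the multiple symptoms, K the number of symptoms, and d_j the vector distance between a single symptom and the drug. The average vector distance for one drug is then:
$$\bar{d} = \frac{1}{K}\sum_{j=1}^{K} d_{j}$$
The average vector distance between the symptom information and every drug is then calculated, the average vector distances are sorted, and the drug with the shortest average vector distance is taken as the target drug. With conventional text matching, a different order of symptoms is likely to return different content, and when a patient has multiple symptoms the entire database must be queried multiple times, which is inefficient; ranking the drugs once they are found is also a tedious task. The above drug prediction method converts the text matching problem into a vector arithmetic problem. This greatly increases prediction speed, imposes no requirement on the order of the symptoms (the same symptoms in a different order yield exactly the same prediction), and makes it easy to rank the retrieved drugs by vector distance. Compared with conventional text matching, the method is therefore not only more efficient but also considerably more flexible, accurate, and practical.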
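The averaging and ranking described above can be sketched as follows. The helper names (`average_distance`, `rank_drugs`), the two-dimensional vectors, and the drug names are assumptions for illustration only; the sketch also shows that reordering the symptoms leaves the ranking unchanged.

```python
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_distance(symptom_vecs, drug_vec):
    """Mean of the K single-symptom distances d_j to one drug vector."""
    return sum(euclidean_distance(s, drug_vec) for s in symptom_vecs) / len(symptom_vecs)


def rank_drugs(symptom_vecs, drug_vectors):
    """Return (drug, average distance) pairs sorted from nearest to farthest."""
    scored = [(name, average_distance(symptom_vecs, vec))
              for name, vec in drug_vectors.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical drug database and symptom vectors.
drug_vectors = {"drug_a": [0.0, 0.0], "drug_b": [10.0, 10.0]}
symptoms = [[3.0, 4.0], [0.0, 1.0]]

ranking = rank_drugs(symptoms, drug_vectors)
target_drug = ranking[0][0]  # the drug with the shortest average distance
print(target_drug)
```

Because the average is a symmetric function of the per-symptom distances, `rank_drugs(list(reversed(symptoms)), drug_vectors)` returns the same ranking.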
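With the above drug prediction method, apparatus, computer device, and storage medium, after the symptom information is obtained, the symptom vector corresponding to each symptom is determined, the vector distance between the symptom vector and the drug vector of each drug in the drug database is calculated from the symptom vector of each symptom, and the target drug corresponding to the symptom information is determined from the vector distance. The method converts the matching relationship between symptoms and drugs into vector arithmetic: the target drug can be found quickly through a distance computation between symptom vectors and drug vectors. This greatly increases lookup speed, that is, the efficiency of drug prediction, and this way of searching also helps improve prediction accuracy.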
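As shown in Fig. 3, in one embodiment, before the symptom vector corresponding to each symptom is determined, the method further includes:
Step 210: obtain the symptoms corresponding to each drug to produce a symptom training sample set, the symptom training sample set including multiple symptom training samples.
The symptoms corresponding to a drug can be obtained from the symptoms treated according to the drug instructions. The treated symptoms are collated, with each symptom treated as a single word and the words separated by spaces. In one embodiment, after the drugs are obtained, they are first deduplicated: drugs with the same name that treat the same symptoms are merged, for example the same drug sold in different specifications or dosages. In one embodiment, the symptoms of one drug are placed together as one training sample, so different drugs correspond to different training samples. Because a drug usually treats only a few symptoms, and too small a corpus leads to poor training results, the number of symptoms needs to be expanded. In one embodiment, new corpus text can be produced by repeatedly copying the treated symptoms. For example, copying the three symptoms headache, fever, and nasal congestion twice yields the new training sample {headache, fever, nasal congestion, headache, fever, nasal congestion, headache, fever, nasal congestion}, which enlarges the training sample set.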
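A minimal sketch of the sample construction in step 210, assuming the drug records are already available as (name, symptom list) pairs. The function name `build_training_samples`, the `copies` parameter, and the example records are hypothetical.

```python
def build_training_samples(drug_records, copies=2):
    """Deduplicate drugs and expand each symptom list by repeating it.

    drug_records: iterable of (drug_name, [symptom, ...]) pairs.
    copies: how many extra copies of the symptom list to append, so that
            copies=2 turns 3 symptoms into 9 tokens as in the example above.
    """
    seen = set()
    samples = []
    for name, symptoms in drug_records:
        key = (name, frozenset(symptoms))  # same name and same treated symptoms
        if key in seen:
            continue
        seen.add(key)
        samples.append(list(symptoms) * (copies + 1))
    return samples


# Hypothetical records; the second entry stands for a different dosage of the same drug.
records = [
    ("cold_tablet", ["headache", "fever", "nasal_congestion"]),
    ("cold_tablet", ["headache", "fever", "nasal_congestion"]),
    ("stomach_pill", ["stomachache", "nausea"]),
]

samples = build_training_samples(records)
# One space-separated line per drug, ready to feed to a word vector trainer.
corpus_lines = [" ".join(sample) for sample in samples]
print(corpus_lines)
```

Each line of `corpus_lines` plays the role of one "sentence" for the word vector model.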
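Step 212: use the symptom training samples as input to a word vector model for unsupervised training to obtain the symptom vector corresponding to each symptom.
After the symptom training samples are obtained, they are used as input to the word vector model for unsupervised training; once training is complete, the symptom vector corresponding to each symptom is available. The word vector model may be a word2vec model. Trained on the symptom training samples, the word vector model clusters words that frequently co-occur, so symptoms that often appear together end up with word vectors that are close in space. The word vector model thus yields the symptom vector corresponding to each symptom.
In one embodiment, the symptoms in each drug's instructions are segmented into symptom words, and a word2vec model is then used to train the symptoms into symptom vectors. Word2vec comes in two variants, CBOW and Skip-gram. CBOW predicts the probability of the current word from its context; Skip-gram does the opposite and predicts the probability of the context from the current word. Fig. 4 is a schematic diagram of the principle of CBOW and Skip-gram prediction. w(t) is a word in the text, and w(t-1) and w(t+1) are the words immediately before and after w(t) in the text. The dimension of the symptom vectors can be chosen according to the number of drugs; for example, 10 dimensions may be used, and the context window may be set to 3.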
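To make the difference between CBOW and Skip-gram concrete, the sketch below only generates the (input, prediction target) pairs that each variant would learn from; it does not train a network, and in practice an existing word2vec implementation would be used. All function names are assumptions for illustration.

```python
def context_windows(tokens, window=3):
    """Yield (center word, context words) for every position t in the sample."""
    for t, center in enumerate(tokens):
        left = tokens[max(0, t - window):t]
        right = tokens[t + 1:t + 1 + window]
        yield center, left + right


def cbow_examples(tokens, window=3):
    """CBOW: the context words are the input, the center word w(t) is predicted."""
    return [(context, center) for center, context in context_windows(tokens, window)]


def skipgram_examples(tokens, window=3):
    """Skip-gram: the center word w(t) is the input, each context word is predicted."""
    return [(center, word)
            for center, context in context_windows(tokens, window)
            for word in context]


sample = ["headache", "fever", "nasal_congestion"]

# A window of 1 keeps the output short; the text above suggests a window of 3.
for context, center in cbow_examples(sample, window=1):
    print("CBOW     ", context, "->", center)
for center, word in skipgram_examples(sample, window=1):
    print("Skip-gram", center, "->", word)
```

Symptoms that share a training sample appear in each other's context windows, which is why their learned vectors end up close together.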
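Step 214: calculate the drug vector corresponding to a drug from the symptom vectors of the symptoms corresponding to that drug, and store the drug vector in the drug database.
After the symptom vector corresponding to each symptom has been calculated, the drug vector of a drug can be calculated from the symptoms corresponding to that drug. In one embodiment, the multiple symptom vectors corresponding to the drug are averaged, and the resulting average vector is used as the drug vector of that drug.
The efficacy of each drug can correspond to multiple symptoms. Expressed mathematically, a drug can be represented as a point in space, and the symptoms it treats are points distributed around that center point. Fig. 5 shows a visualization of the distribution of the symptoms of different drugs, reduced from three-dimensional space to two-dimensional space. In the figure, each symptom is represented by a point, and points of the same type correspond to the symptoms of the same drug. The figure shows that the symptoms treated by one drug are distributed close together in space, while the symptoms of different drugs lie relatively far apart. The points for the drugs themselves are not marked in the figure.
Fig. 6 is a schematic flowchart of the word vector model training method in one embodiment. (1) First deduplicate the drugs, merging drugs with the same name that treat the same symptoms. (2) Collate the symptoms treated according to the drug instructions, treating each symptom as a single word and separating the words with spaces. (3) Expand the number of symptoms: because a drug usually treats only a few symptoms and too small a corpus trains poorly, the treated symptoms are copied repeatedly to form new corpus text and enlarge the training set. (4) Use a word2vec model to train the symptoms into symptom vectors; the dimension can be chosen according to the number of drugs. (5) Store the symptom vectors; the word vector model reflects the distribution of the drugs' symptoms. (6) From the symptom vectors of the symptoms each drug treats, compute each drug's drug vector (the geometric center of its symptoms). (7) Store the drug vector of each drug.
In one embodiment, calculating the drug vector corresponding to a drug from the symptom vectors of the symptoms corresponding to that drug includes: when the drug corresponds to multiple symptoms, obtaining the symptom vector corresponding to each symptom; and calculating the average vector of the multiple symptom vectors and using the average vector as the drug vector of the corresponding drug.
When a drug corresponds to multiple symptoms, after the symptom vector corresponding to each symptom is obtained, the drug vector is determined from these symptom vectors. The geometric center of the multiple symptom vectors can be used as the drug vector of the corresponding drug. Specifically, the multiple symptom vectors are averaged to obtain an average vector, which is the vector of the geometric center of the symptom vectors, and this average vector is used as the drug vector.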
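The geometric-center computation of step 214 can be sketched as below. The function names, the in-memory dictionary standing in for the drug database, and the two-dimensional vectors are assumptions for illustration.

```python
def mean_vector(vectors):
    """Component-wise average (geometric center) of equally sized vectors."""
    if not vectors:
        raise ValueError("at least one symptom vector is required")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def build_drug_database(drug_symptoms, symptom_vectors):
    """Map each drug to the average of the vectors of the symptoms it treats."""
    return {
        drug: mean_vector([symptom_vectors[s] for s in symptoms])
        for drug, symptoms in drug_symptoms.items()
    }


# Hypothetical trained symptom vectors and drug-to-symptom mapping.
symptom_vectors = {
    "headache": [1.0, 2.0],
    "fever": [3.0, 4.0],
    "cough": [5.0, 0.0],
}
drug_symptoms = {"drug_a": ["headache", "fever"], "drug_b": ["cough"]}

drug_database = build_drug_database(drug_symptoms, symptom_vectors)
print(drug_database)  # {'drug_a': [2.0, 3.0], 'drug_b': [5.0, 0.0]}
```

A drug with a single symptom simply takes that symptom's vector as its drug vector.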
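As shown in Fig. 7, in one embodiment, determining the target drug corresponding to the symptom information according to the vector distance includes:
Step 208A: when the symptom information includes multiple symptoms, calculate the average vector distance from the vector distances between each symptom vector and the drug vector.
When the symptom information includes multiple symptoms, the vector distance between each symptom vector and each drug vector is calculated separately, and the average vector distance between the multiple symptom vectors and a drug vector is calculated from the vector distances between those symptom vectors and that same drug vector.
In one embodiment, when multiple symptoms are used to predict one drug, let $\bar{d}$ be the average vector distance of the multiple symptoms, K the number of symptoms, and d_j the vector distance between a single symptom and the drug. The average vector distance for one drug is then:
$$\bar{d} = \frac{1}{K}\sum_{j=1}^{K} d_{j}$$
The average vector distance between the symptom information and every drug is then calculated.
Step 208B: calculate, from the average vector distance and the number of symptoms, the target vector distance between the multiple symptom vectors corresponding to the symptom information and the drug vector.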
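When predicting a drug, a single symptom provides only limited information, and each additional symptom should contribute far more to the prediction than a single symptom alone. Therefore, to speed up prediction, a reasonable weight is set according to the number of symptoms so that the vector distance shrinks faster. In one embodiment, the target vector distance decreases as the number of symptoms increases. In a specific embodiment, the target vector distance is calculated with the following formula. When
Figure PCTCN2018123761-appb-000006
holds,
Figure PCTCN2018123761-appb-000007
and when
Figure PCTCN2018123761-appb-000008
holds,
Figure PCTCN2018123761-appb-000009
where
Figure PCTCN2018123761-appb-000010
denotes the weighted target vector distance,
Figure PCTCN2018123761-appb-000011
is the average vector distance of the multiple symptoms, and K denotes the number of symptoms. With this formula, each additional symptom shrinks the spatial distance to the drug multiplicatively.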
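Step 208C: determine the target drug corresponding to the symptom information according to the target vector distances between the multiple symptom vectors and the drug vector of each drug.
After the target vector distance between the multiple symptom vectors and the drug vector of each drug has been calculated, the target vector distances are sorted, and the drug with the shortest target vector distance is taken as the target drug.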
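In one embodiment, before the symptom information is obtained, the method further includes: obtaining a consultation dialogue text and performing word segmentation on it to obtain multiple words; and, when a word can be found in a symptom entity database, using that word as a symptom in the symptom information.
The consultation dialogue text is text describing the user's symptoms. It may be text obtained by recognizing the user's speech, or text entered directly. After the consultation dialogue text is obtained, word segmentation is performed on it to obtain multiple words. The symptom entity database stores words for all kinds of symptoms. Each obtained word is matched against the words in the symptom entity database; if the word is found there, it is a word describing a symptom and is used as a symptom in the symptom information. In another embodiment, after the consultation dialogue text is obtained, it is fed into a symptom entity recognition model, which recognizes the corresponding symptom entities (that is, the words describing symptoms) in the text.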
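The database lookup described above can be sketched as a simple membership test. The sketch assumes the dialogue text has already been segmented by some word segmenter; the function name `extract_symptoms` and the in-memory set standing in for the symptom entity database are hypothetical.

```python
def extract_symptoms(words, symptom_entities):
    """Keep the segmented words that appear in the symptom entity database."""
    symptoms = []
    for word in words:
        if word in symptom_entities and word not in symptoms:
            symptoms.append(word)  # preserve order, skip repeats
    return symptoms


# Hypothetical symptom entity database and an already segmented sentence
# ("我有点头痛发烧", roughly "I have a bit of a headache and a fever").
symptom_entities = {"头痛", "发烧", "鼻塞"}
words = ["我", "有点", "头痛", "发烧"]

print(extract_symptoms(words, symptom_entities))  # ['头痛', '发烧']
```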
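In one embodiment, after the consultation dialogue text is obtained and segmented into multiple words, the method further includes: obtaining a word mapping table and obtaining, according to the word mapping table, the target word corresponding to each word. In this case, using a word as a symptom in the symptom information when it can be found in the symptom entity database includes: when the target word can be found in the symptom entity database, using the target word as a symptom in the symptom information.
The same symptom can be expressed in many ways; for example, colloquial words for headache include "脑壳痛" and "头疼". Therefore, after the text is segmented into multiple words, the word mapping table is obtained and the target word corresponding to each word is looked up in it. The word mapping table converts colloquial symptom words into standard symptom words (target words). When the target word can be found in the symptom entity database, it is used as a symptom in the symptom information. For example, if segmentation directly yields the word "脑壳痛" and the word mapping table records the mapping from "脑壳痛" to "头痛" (headache), the target word "头痛" is obtained and is then used as a symptom in the symptom information.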
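A minimal sketch of the word mapping step, using the "脑壳痛" example from the text. The function name, the dictionary-based mapping table, and the fallback of keeping unmapped words unchanged are assumptions for illustration.

```python
def extract_standard_symptoms(words, word_mapping, symptom_entities):
    """Map colloquial words to standard words, then keep the known symptoms."""
    symptoms = []
    for word in words:
        target = word_mapping.get(word, word)  # unmapped words are kept as-is
        if target in symptom_entities and target not in symptoms:
            symptoms.append(target)
    return symptoms


# Hypothetical mapping table and symptom entity database.
word_mapping = {"脑壳痛": "头痛", "头疼": "头痛"}
symptom_entities = {"头痛", "发烧"}

print(extract_standard_symptoms(["我", "脑壳痛", "发烧"], word_mapping, symptom_entities))
# ['头痛', '发烧']
```

Two colloquial variants of the same symptom collapse to a single standard entry, so the symptom count K is not inflated.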
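In one embodiment, after the drug corresponding to the symptom information is determined according to the vector distance, the method further includes: obtaining the target symptoms corresponding to the target drug; and comparing the target symptoms with the symptoms in the symptom information, and determining that the target drug is safe when the target symptoms contain all of the symptoms in the symptom information.
After the target drug is predicted, a symptom check is performed on it to ensure the safety of the prediction: a drug may be recommended only if it can treat the symptoms the user described. The target symptoms corresponding to the target drug are therefore obtained and compared with the symptoms in the symptom information, and the predicted target drug is deemed safe only when the target symptoms contain all of the symptoms in the symptom information. In one embodiment, symptom matching can be done by obtaining the target symptom vector of a target symptom and the symptom vector of a symptom in the symptom information, and deciding from the vector distance between them whether the two are the same symptom. For example, let the symptom vector of symptom 1 be v_1 and that of symptom 2 be v_2; if the vector distance between them is 0, symptom 1 and symptom 2 are the same.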
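The safety check amounts to a subset test: every reported symptom must be among the symptoms the target drug treats. The sketch below compares symptom names directly; the function name `is_target_drug_safe` and the example lists are assumptions for illustration.

```python
def is_target_drug_safe(target_symptoms, reported_symptoms):
    """True only if the drug's treated symptoms cover every reported symptom."""
    return set(reported_symptoms) <= set(target_symptoms)


# Hypothetical symptoms treated by the predicted drug.
treated = ["headache", "fever", "nasal_congestion"]

print(is_target_drug_safe(treated, ["headache", "fever"]))  # True
print(is_target_drug_safe(treated, ["headache", "cough"]))  # False
```

When the check fails, the flow described in the text would report that no drug treats these symptoms rather than return the candidate.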
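Fig. 8 is a schematic flowchart of the drug prediction method in one embodiment, which includes the following steps: (1) Obtain the consultation dialogue text and segment it into words. (2) Extract one or more symptoms from the segmentation result (for example, "headache" and "fever" from "I have a bit of a headache and a fever"). (3) Obtain the symptom vector corresponding to each symptom from the trained word vectors. (4) Calculate the vector distance between symptoms and drugs: obtain the drug vector of each drug and use the Euclidean distance to calculate the vector distance between each symptom and each drug; when multiple symptoms are used to predict one drug, also calculate the average vector distance between the symptoms and the drug. (5) Weight the vector distance according to the number of symptoms to obtain the target vector distance between the multiple symptoms and each drug, and determine the predicted target drug from the target vector distances. (6) Run the safety check on the predicted target drug: the target drug must be able to treat all of the reported symptoms. (7) Return the predicted target drug, or report that no drug treats these symptoms.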
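As shown in Fig. 9, in one embodiment, a drug prediction apparatus is provided, the apparatus including:
an obtaining module 902, configured to obtain symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
a vector determining module 904, configured to determine a symptom vector corresponding to each symptom;
a calculating module 906, configured to calculate, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database;
a drug determining module 908, configured to determine, according to the vector distance, the target drug corresponding to the symptom information.
As shown in Fig. 10, in one embodiment, the drug prediction apparatus further includes:
a sample determining module 910, configured to obtain the symptoms corresponding to each drug to produce a symptom training sample set, the symptom training sample set including multiple symptom training samples;
an input/output module 912, configured to use the symptom training samples as input to a word vector model for unsupervised training to obtain the symptom vector corresponding to each symptom;
a storage module 914, configured to calculate the drug vector corresponding to a drug from the symptom vectors of the symptoms corresponding to that drug and to store the drug vector in the drug database.
In one embodiment, the storage module is further configured to: when the drug corresponds to multiple symptoms, obtain the symptom vector corresponding to each symptom; and calculate the average vector of the multiple symptom vectors and use the average vector as the drug vector of the corresponding drug.
In one embodiment, the drug determining module is further configured to: when the symptom information includes multiple symptoms, calculate the average vector distance from the vector distances between each symptom vector and the drug vector; calculate, from the average vector distance and the number of symptoms, the target vector distance between the multiple symptom vectors corresponding to the symptom information and the drug vector; and determine the target drug corresponding to the symptom information according to the target vector distances between the multiple symptom vectors and the drug vector of each drug.
As shown in Fig. 11, in one embodiment, the drug prediction apparatus further includes:
a text obtaining module 916, configured to obtain a consultation dialogue text and perform word segmentation on it to obtain multiple words;
a symptom determining module 918, configured to use a word as a symptom in the symptom information when the word can be found in a symptom entity database.
In one embodiment, the drug prediction apparatus further includes a mapping module, configured to obtain a word mapping table and obtain, according to the word mapping table, the target word corresponding to each word; the symptom determining module is further configured to use the target word as a symptom in the symptom information when the target word can be found in the symptom entity database.
In one embodiment, the drug prediction apparatus further includes a comparing module, configured to obtain the target symptoms corresponding to the target drug, compare the target symptoms with the symptoms in the symptom information, and determine that the target drug is safe when the target symptoms contain all of the symptoms in the symptom information.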
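Fig. 12 shows the internal structure of the computer device in one embodiment. The computer device may be a terminal or a server. As shown in Fig. 12, the computer device includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the drug prediction method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the drug prediction method. The network interface is used to communicate with the outside. A person skilled in the art will understand that the structure shown in Fig. 12 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine some components, or have a different component arrangement.
In one embodiment, the drug prediction method provided in the present application may be implemented as a computer program that can run on the computer device shown in Fig. 12. The memory of the computer device may store the program modules that make up the drug prediction apparatus, for example the obtaining module 902, the vector determining module 904, the calculating module 906, and the drug determining module 908.
A computer device includes a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the following steps: obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom; determining a symptom vector corresponding to each symptom; calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database; and determining, according to the vector distance, the target drug corresponding to the symptom information.
A computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps: obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom; determining a symptom vector corresponding to each symptom; calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database; and determining, according to the vector distance, the target drug corresponding to the symptom information.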
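A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or another medium used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined in any way. For brevity, not every possible combination of the technical features in the above embodiments has been described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the patent scope of the present application. It should be noted that a person of ordinary skill in the art can make a number of variations and improvements without departing from the concept of the present application, and these all fall within the scope of protection of the present application. The scope of protection of this patent is therefore defined by the appended claims.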

Claims (10)

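  1. A drug prediction method, characterized in that the method includes:
    obtaining symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
    determining a symptom vector corresponding to each symptom;
    calculating, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database;
    determining, according to the vector distance, the target drug corresponding to the symptom information.
  2. The method according to claim 1, characterized in that, before the determining of a symptom vector corresponding to each symptom, the method further includes:
    obtaining the symptoms corresponding to each drug to produce a symptom training sample set, the symptom training sample set including multiple symptom training samples;
    using the symptom training samples as input to a word vector model for unsupervised training to obtain the symptom vector corresponding to each symptom;
    calculating the drug vector corresponding to a drug from the symptom vectors of the symptoms corresponding to that drug, and storing the drug vector in the drug database.
  3. The method according to claim 2, characterized in that the calculating of the drug vector corresponding to a drug from the symptom vectors of the symptoms corresponding to that drug includes:
    when the drug corresponds to multiple symptoms, obtaining the symptom vector corresponding to each symptom;
    calculating the average vector of the multiple symptom vectors, and using the average vector as the drug vector of the corresponding drug.
  4. The method according to claim 1, characterized in that the determining, according to the vector distance, of the target drug corresponding to the symptom information includes:
    when the symptom information includes multiple symptoms, calculating the average vector distance from the vector distances between each symptom vector and the drug vector;
    calculating, from the average vector distance and the number of symptoms, the target vector distance between the multiple symptom vectors corresponding to the symptom information and the drug vector;
    determining the target drug corresponding to the symptom information according to the target vector distances between the multiple symptom vectors and the drug vector of each drug.
  5. The method according to claim 1, characterized in that, before the obtaining of the symptom information for which a drug is to be predicted, the method further includes:
    obtaining a consultation dialogue text and performing word segmentation on the consultation dialogue text to obtain multiple words;
    when a word can be found in a symptom entity database, using the word as a symptom in the symptom information.
  6. The method according to claim 5, characterized in that, after the obtaining of the consultation dialogue text and the performing of word segmentation on it to obtain multiple words, the method further includes:
    obtaining a word mapping table, and obtaining, according to the word mapping table, the target word corresponding to each word;
    wherein the using of a word as a symptom in the symptom information when the word can be found in the symptom entity database includes:
    when the target word can be found in the symptom entity database, using the target word as a symptom in the symptom information.
  7. The method according to claim 1, characterized in that, after the determining, according to the vector distance, of the drug corresponding to the symptom information, the method further includes:
    obtaining the target symptoms corresponding to the target drug;
    comparing the target symptoms with the symptoms in the symptom information, and determining that the target drug is safe when the target symptoms contain all of the symptoms in the symptom information.
  8. A drug prediction apparatus, characterized in that the apparatus includes:
    an obtaining module, configured to obtain symptom information for which a drug is to be predicted, the symptom information including at least one symptom;
    a vector determining module, configured to determine a symptom vector corresponding to each symptom;
    a calculating module, configured to calculate, according to the symptom vector of each symptom, the vector distance between the symptom vector and the drug vector corresponding to each drug in a drug database;
    a drug determining module, configured to determine, according to the vector distance, the target drug corresponding to the symptom information.
  9. A computer device, including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 7.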
PCT/CN2018/123761 2018-12-24 2018-12-26 Drug prediction method, apparatus, computer device, and storage medium WO2020132918A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811585289.2A CN111429991B (zh) 2018-12-24 2018-12-24 Drug prediction method, apparatus, computer device, and storage medium
CN201811585289.2 2018-12-24

Publications (1)

Publication Number Publication Date
WO2020132918A1 (zh) 2020-07-02

Family

ID=71126861

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123761 WO2020132918A1 (zh) 2018-12-24 2018-12-26 Drug prediction method, apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN111429991B (zh)
WO (1) WO2020132918A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112036478A (zh) * 2020-08-28 2020-12-04 平安医疗健康管理股份有限公司 Method, apparatus, and computer device for identifying chronic-disease reimbursable drugs

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
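CN111863243A (zh) * 2020-07-22 2020-10-30 乌镇互联网医院(桐乡)有限公司 Pharmacy pre-consultation method, apparatus, storage medium, and electronic device
CN112309565B (zh) * 2020-08-28 2024-09-20 北京京东世纪贸易有限公司 Method, apparatus, electronic device, and medium for matching drug information with condition information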

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
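CN107092797A (zh) * 2017-04-26 2017-08-25 广东亿荣电子商务有限公司 Deep-learning-based drug recommendation algorithm
CN107247868A (zh) * 2017-05-18 2017-10-13 深思考人工智能机器人科技(北京)有限公司 Artificial-intelligence-assisted consultation system
CN107330288A (zh) * 2017-07-10 2017-11-07 叮当(深圳)健康机器人科技有限公司 Method and apparatus for obtaining medication information
WO2018006629A1 (zh) * 2016-07-06 2018-01-11 北京搜狗科技发展有限公司 Prescription matching method and apparatus, and apparatus for prescription matching
CN107591189A (zh) * 2017-08-29 2018-01-16 科大智能科技股份有限公司 Recommendation system based on OTC drugs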

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
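CN104200069B (zh) * 2014-08-13 2017-08-04 周晋 Medication recommendation system and method based on symptom analysis and machine learning
CN105224794A (zh) * 2015-09-19 2016-01-06 石庆平 Intelligent prescription review system and method
CN106202893B (zh) * 2016-06-30 2019-05-14 山东诺安诺泰信息系统有限公司 Drug recommendation method
CN107194203A (zh) * 2017-06-09 2017-09-22 西安电子科技大学 Drug repositioning method based on miRNA data and tissue-specific networks
CN107403069B (zh) * 2017-07-31 2020-05-12 京东方科技集团股份有限公司 System and method for analyzing drug-disease associations
CN107463791A (zh) * 2017-08-25 2017-12-12 上海中医药大学附属岳阳中西医结合医院 Method and system for screening traditional Chinese medicines using efficacy curves based on set-pair-analysis four-element connection numbers
CN108231153A (zh) * 2018-02-08 2018-06-29 康美药业股份有限公司 Drug recommendation method, electronic device, and storage medium
CN109065172B (zh) * 2018-07-04 2023-04-11 平安科技(深圳)有限公司 Method, apparatus, computer device, and storage medium for obtaining condition information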

Also Published As

Publication number Publication date
CN111429991B (zh) 2023-06-13
CN111429991A (zh) 2020-07-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18944451

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18944451

Country of ref document: EP

Kind code of ref document: A1