WO2021017306A1 - Personalized search method, system, and device employing user portrait, and storage medium - Google Patents

Personalized search method, system, and device employing user portrait, and storage medium

Info

Publication number
WO2021017306A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
category
patient
article
word
Prior art date
Application number
PCT/CN2019/118070
Other languages
French (fr)
Chinese (zh)
Inventor
周晓峰
朱威
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021017306A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Definitions

  • the embodiments of the present application relate to the field of search technology, and in particular, to a personalized search method, system, device, and storage medium based on user portraits.
  • the current implementation of the article search function is usually based on the search term entered by the user.
  • articles that meet the search conditions are retrieved from multiple articles and displayed, while articles that do not meet the conditions are hidden; the filtered articles are then displayed in a certain order.
  • the articles are sorted from largest to smallest according to the matching degree, and the search results are provided to users.
  • the inventor found that, for patient users, the searched articles cannot fully meet their actual needs, and the search accuracy is insufficient.
  • the purpose of the embodiments of the present application is to provide a personalized search method, system, device, and storage medium based on user portraits, which can perform accurate searches based on user portraits and input information.
  • the embodiments of the present application provide a personalized search method based on user portraits, including:
  • the fourth weight coefficient of the second category word of each article is calculated, where each article includes the first keyword, the first category word, the second keyword and/or the second category word;
  • the search result page is output.
  • an embodiment of the present application also provides a personalized search system based on user portraits, including:
  • the first obtaining module is configured to obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;
  • the second acquisition module is configured to acquire input information of the patient, and acquire a second keyword and/or a second category word corresponding to the second keyword according to the input information;
  • the third obtaining module is configured to obtain a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;
  • the first calculation module is used to calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word in each article in the first article set, where each article includes the first keyword, the first category word, the second keyword and/or the second category word;
  • the second calculation module is configured to calculate the first matching degree between each article and the patient search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient;
  • the result output module is used to output the search result page based on the first matching degree of each article.
  • an embodiment of the present application further provides a computer device; the computer device includes a memory and a processor, the memory stores computer-readable instructions that can run on the processor, and when the computer-readable instructions are executed by the processor, the following steps are implemented:
  • the fourth weight coefficient of the second category word of each article is calculated, where each article includes the first keyword, the first category word, the second keyword and/or the second category word;
  • the search result page is output.
  • the embodiments of the present application also provide a non-volatile computer-readable storage medium.
  • the non-volatile computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions can be executed by at least one processor, so that the at least one processor executes the following steps:
  • the fourth weight coefficient of the second category word of each article is calculated, where each article includes the first keyword, the first category word, the second keyword and/or the second category word;
  • the search result page is output.
  • FIG. 1 is a flowchart of Embodiment 1 of a personalized search method based on a user portrait according to an embodiment of the application.
  • FIG. 2 is a flowchart of step S100 in FIG. 1 according to an embodiment of the application.
  • FIG. 3 is a flowchart of pre-establishing a disease classification system according to an embodiment of the application.
  • FIG. 4 is a flowchart of step S100B in FIG. 2 according to an embodiment of the application.
  • FIG. 5 is a flowchart of step S100B2 in FIG. 4 according to an embodiment of the application.
  • FIG. 6 is a flowchart of Embodiment 2 of a personalized search method based on a user portrait according to an embodiment of the application.
  • FIG. 7 is a schematic diagram of program modules in Embodiment 3 of a personalized search system based on user portraits according to an embodiment of the application.
  • FIG. 8 is a schematic diagram of the hardware structure of Embodiment 4 of the computer device according to the embodiment of the application.
  • FIG. 1 shows a flowchart of the steps of a personalized search method based on a user portrait in Embodiment 1 of the present application. It can be understood that the flowchart in this method embodiment is not used to limit the order of execution of the steps. The following is an exemplary description with the server as the execution subject. The details are as follows.
  • Step S100 Obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category.
  • the mapping relationship between the first keyword and the first category word is configured in advance. For example, if the patient is interested in eating or cooking, the first category words of the patient's interest category are diet, recipe, etc. The patient searches for the corresponding diet, recipe, etc. according to their own condition; for a diabetic patient, for instance, the first category words are diabetic diet, diabetic recipes, etc.
  • the first keywords for diabetic diet are low salt and low fat, and can further include coarse grains such as buckwheat, oatmeal, and corn flour, soybeans and soy products, vegetables, etc.
  • step S100 further includes:
  • Step S100A Obtain personal information input by the patient through a terminal device, and query the patient's associated information from a designated server based on the personal information, the associated information including the patient's historical operation records.
  • the patient's personal information is acquired through a pre-configured electronic page, the electronic page includes multiple fields, and the multiple fields correspond to personal information such as gender, age, and past medical history.
  • the patient's electronic medical record can be obtained from the designated server, such as a medical sharing platform, and the patient's medical history information can be extracted from the electronic medical record. If the patient's electronic medical record cannot be obtained, then when a search request from the patient is received, electronic pages with preset question-and-answer information, such as follow-up, patient education Q&A, and medicine Q&A, are pushed for the patient to choose from, so as to obtain the patient's feedback on these preset questions and answers.
  • the patient's associated information can also be queried by obtaining the patient's historical operation record.
  • the historical operation record includes the patient's login and query information and related articles; the patient's degree of interest in disease-related articles can be analyzed from historical operation information such as viewing duration, clicks, reposts, or comments.
  • step S100A further includes pre-establishing a disease classification system:
  • Step S100AA Obtain sample user information of multiple sample users and sample associated information of the sample users.
  • Step S100AB Using the TF-IDF model, extract multiple sample keywords from the sample user information and the sample associated information of the sample users.
  • the TF-IDF (Term Frequency-Inverse Document Frequency) model is used to evaluate the importance of a word to the sample user information and the sample associated information of the sample users.
  • the TF-IDF model calculates the weight value of each word, the words are sorted by weight value, and all words with a weight value greater than a preset value are taken as sample keywords.
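As a minimal sketch of the thresholding just described (not the application's actual implementation), the following Python computes TF-IDF weights over whitespace-tokenized texts and keeps the words whose weight exceeds a preset value. The function name `tfidf_keywords`, the tokenization, and the unsmoothed `tf * log(N / df)` formula are assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_keywords(documents, threshold):
    """Return, per document, (word, weight) pairs whose TF-IDF weight exceeds threshold."""
    n_docs = len(documents)
    # Assumption: simple lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in documents]
    # Document frequency: number of documents that contain each word.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        # Sort by weight, largest first, then keep only words above the preset value.
        ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
        results.append([(word, weight) for word, weight in ranked if weight > threshold])
    return results
```

A word that occurs in every text gets weight zero under this formula, so it never passes a positive threshold.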
  • Step S100AC Take multiple sample keyword sets as the input of the first-layer sample neural network model, and use the first sample category word in the classification system as the output of the first-layer sample neural network model, to train the ability of the first-layer sample neural network model to predict the corresponding category word from the keywords.
  • Step S100AD Stop training once the m-th layer sample neural network model is reached, where 2 ≤ m ≤ M, and M is the total number of sample category words included in the classification system.
  • Step S100AE Take the training result of the (m-1)-th layer sample neural network model and the multiple sample keywords as the input of the m-th layer sample neural network model, and take the m-th sample category word in the classification system as the output of the m-th layer sample neural network model, to train the ability of the m-th layer sample neural network model to predict the corresponding category word from the keywords.
  • Step S100AF extract multiple keywords from the personal information and the associated information with the patient through the TF-IDF model.
  • Step S100AG Input the multiple keywords into the sample neural network model and, using each layer's trained ability to predict the corresponding category word from the keywords, output the multiple category words corresponding to the multiple keywords.
  • Step S100AH Associate the multiple category words with the corresponding keywords to obtain the disease classification system; the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; the multiple category words include at least disease causes and disease medications.
  • category words include: disease cause, disease medication, disease prevention, disease examination, disease diagnosis, treatment, common sense, care, cutting-edge information, etc.; a category word can be further expanded into subcategory words, such as hazards, complications, etc. Further, taking disease causes as an example, the category word can also be subdivided into keyword sets such as smoking, drinking, and other bad habits.
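The nesting described for the disease classification system (sub-disease category, then category word, then one or more keyword sets) could be represented roughly as below. This is only an illustrative sketch: the dictionary contents, the name `DISEASE_CLASSIFICATION`, and the helper `category_words_for` are hypothetical and not taken from the application.

```python
# Hypothetical example data; only the nesting mirrors the description:
# sub-disease category -> category word -> list of keyword sets.
DISEASE_CLASSIFICATION = {
    "diabetes": {
        "disease cause": [{"smoking", "drinking"}],
        "disease medication": [{"insulin"}],
    },
}


def category_words_for(keyword, system=DISEASE_CLASSIFICATION):
    """Return (sub-disease category, category word) pairs whose keyword sets contain keyword."""
    hits = []
    for disease, categories in system.items():
        for category_word, keyword_sets in categories.items():
            if any(keyword in keyword_set for keyword_set in keyword_sets):
                hits.append((disease, category_word))
    return hits
```

Looking up a keyword then yields the category words it is associated with, which is the direction of mapping the later steps rely on.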
  • Step S100B construct a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait tags corresponding to multiple dimensions.
  • step S100B further includes:
  • step S100B1 the word vector of the first keyword of the personal information and related information is obtained through the word2vec model.
  • Step S100B2 input the word vector of the first keyword into a prediction model, and output the correlation probability of the patient and each portrait label through the prediction model to obtain a user portrait of the patient.
  • step S100B2 further includes:
  • step S100B2A a mapping relationship between the disease classification system and the patient is established to form a user portrait of the patient, each category word corresponds to a dimension, and each keyword corresponds to a portrait label.
  • Step S100B2B input the word vector of the keyword set of the disease classification system into the prediction model, and calculate the first association probability according to the softmax layer of the prediction model.
  • Step S100B2C input the word vectors of the category words of the disease classification system into the prediction model, and calculate the second correlation probability according to the softmax layer of the prediction model.
  • Step S100B2D Obtain the correlation probability of the user portrait of the patient according to the first correlation probability and the second correlation probability.
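Steps S100B2B to S100B2D can be illustrated with a small sketch. The text does not say how the first and second association probabilities are combined, so the product used here is purely an assumption, as are the function names and the use of raw scores in place of a real prediction model's pre-softmax outputs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def portrait_probabilities(label_scores, dimension_scores, label_dimension):
    """Combine keyword-level and category-level probabilities per portrait label.

    label_scores: raw score per portrait label (keyword).
    dimension_scores: raw score per dimension (category word).
    label_dimension: which dimension each portrait label belongs to.
    Assumption: the combined probability is the product of the two probabilities.
    """
    labels = list(label_scores)
    dimensions = list(dimension_scores)
    first = dict(zip(labels, softmax([label_scores[l] for l in labels])))
    second = dict(zip(dimensions, softmax([dimension_scores[d] for d in dimensions])))
    return {label: first[label] * second[label_dimension[label]] for label in labels}
```

Any other combination rule, such as a weighted average, would fit the same interface; the source leaves this choice open.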
  • step S100B further includes:
  • the real-time historical operation information of the patient is analyzed, the patient's attention point is obtained according to the real-time historical operation information, and the attention point is mapped to the user portrait of the patient to update the user portrait of the patient.
  • the patient does not provide personal information
  • Step S100C Acquire the patient's interest category according to the user portrait.
  • the user portrait holds the correlation probability between the patient and each portrait label, and the patient's interest category is determined according to the correlation probability: portrait tags of the same category are associated, and similar portrait tags whose correlation probability exceeds the preset range are selected as the interest category.
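One way to read step S100C is as a threshold filter followed by grouping. The sketch below assumes the "preset range" is a single lower bound and that each portrait tag maps to exactly one category; the function name `interest_categories` is hypothetical.

```python
from collections import defaultdict


def interest_categories(tag_probabilities, tag_category, threshold):
    """Group portrait tags by category, keeping only tags above the threshold."""
    grouped = defaultdict(list)
    for tag, probability in tag_probabilities.items():
        if probability > threshold:
            grouped[tag_category[tag]].append(tag)
    return dict(grouped)
```

Categories with no tag above the threshold simply do not appear in the result.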
  • Step S102 Obtain input information of the patient, and obtain a second keyword and/or a second category word corresponding to the second keyword according to the input information.
  • the mapping relationship between the second keyword and the second category word is configured in advance.
  • the step of obtaining the second keyword is as follows: traverse the input information according to the keyword set to obtain one or more second keywords from the input information. For example, when a patient searches, it is determined that the patient has entered diet-related information.
  • the search query can be of the form "what can and can't be eaten with XXX" (where XXX is a condition, such as diabetes), and may involve protein, meat, seafood, etc.
  • protein, meat, seafood, etc. are second category words.
  • pork, beef, chicken, etc. are second keywords. Pork, beef, chicken, etc., as second keywords, are mapped to the second category word "meat".
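The traversal in step S102 can be sketched as follows, using the pork/beef/chicken to meat mapping from the example. Substring matching on lowercased text is an assumption made for brevity, and `KEYWORD_TO_CATEGORY` and `extract_second_terms` are hypothetical names.

```python
# Pre-configured mapping from second keywords to second category words
# (entries taken from the example in the text).
KEYWORD_TO_CATEGORY = {"pork": "meat", "beef": "meat", "chicken": "meat"}


def extract_second_terms(input_text, keyword_to_category=KEYWORD_TO_CATEGORY):
    """Scan the input for known keywords and return them with their category words."""
    text = input_text.lower()
    # Assumption: a keyword counts as present if it occurs as a substring.
    keywords = [keyword for keyword in keyword_to_category if keyword in text]
    categories = sorted({keyword_to_category[keyword] for keyword in keywords})
    return keywords, categories
```

A production system would need proper word segmentation, particularly for Chinese input, which this sketch does not attempt.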
  • Step S104 Obtain a first article set according to the first keyword, the first category word, the second keyword, and the second category word, where the first article set includes at least one article.
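  • as long as an article contains at least one of the first keyword, the first category word, the second keyword, and/or the second category word, the article is identified and retrieved.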
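The retrieval in step S104 amounts to keeping every article that mentions at least one of the four kinds of terms. A minimal sketch, assuming articles are dictionaries with a `text` field (a hypothetical layout) and plain substring matching:

```python
def first_article_set(articles, terms):
    """Keep the articles whose text contains at least one of the given terms."""
    wanted = [term.lower() for term in terms]
    return [
        article
        for article in articles
        if any(term in article["text"].lower() for term in wanted)
    ]
```

Here `terms` would be the union of the first keywords, first category words, second keywords, and second category words.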
  • Step S106 Calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word of each article in the first article set, where each article includes the first keyword, the first category word, the second keyword, and/or the second category word.
  • the weight coefficient of a keyword set in the title is the weight coefficient of the title plus the weight coefficient of the keyword set.
  • the weight coefficient of a keyword set in the subject is the weight coefficient of the subject plus the weight coefficient of the keyword set.
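The additive rule for title and subject can be written down directly. The numeric location weights below are hypothetical placeholders, since the text gives no values, and `located_weight` is an illustrative name.

```python
# Hypothetical location weights; the source does not specify values.
LOCATION_WEIGHTS = {"title": 0.5, "subject": 0.3}


def located_weight(keyword_set_weight, location, location_weights=LOCATION_WEIGHTS):
    """Add the weight of the location (title or subject) to the keyword set's own weight."""
    return location_weights.get(location, 0.0) + keyword_set_weight
```

Locations without a configured weight, such as the body text in this sketch, leave the keyword set's weight unchanged.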
  • the TF-IDF (term frequency-inverse document frequency) model and the LDA (Latent Dirichlet Allocation, a document topic generation model) model are used to calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word.
  • the LDA model can identify topic words hidden in a large-scale document set or corpus.
  • Step S108 According to the first weighting coefficient, the second weighting coefficient, the third weighting coefficient, and the fourth weighting coefficient, the first matching degree between each article and the patient search target is calculated.
  • the first weight coefficients of each article are added to obtain the total first weight coefficient;
  • the second weight coefficients are added to obtain the total second weight coefficient;
  • the third weight coefficients are added to obtain the total third weight coefficient;
  • the fourth weight coefficients are added to obtain the total fourth weight coefficient;
  • the total first weight coefficient, the total second weight coefficient, the total third weight coefficient, and the total fourth weight coefficient are added to obtain the first matching degree.
  • the first matching degree of each article in the first article collection is obtained according to the above-mentioned method.
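The summation in step S108 is simple enough to show in a few lines. The function name `first_matching_degree` and the list-based inputs are assumptions; each list holds the weight coefficients of one kind of term found in a single article.

```python
def first_matching_degree(first, second, third, fourth):
    """Sum each group of weight coefficients, then add the four totals together."""
    totals = (sum(first), sum(second), sum(third), sum(fourth))
    return sum(totals)
```

An article that lacks one kind of term contributes an empty list for it, so that total is zero.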
  • Step S110 based on the first matching degree of each article, output a search result page.
  • the first matching degrees of the articles in the first article set are sorted uniformly (for example, from largest to smallest) to obtain the first search result of the patient; other factors, such as bidding ranking, may also be added to the sorting.
  • the first search result is displayed, so that the patient can obtain more accurate search article information.
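The ordering in step S110 can be sketched as a descending sort. How an extra factor such as bidding ranking enters the ordering is not specified, so the additive `boost` below is an assumption, and `rank_articles` is a hypothetical name.

```python
def rank_articles(matching_degrees, boost=None):
    """Return article ids sorted by matching degree, largest first.

    matching_degrees: mapping from article id to first matching degree.
    boost: optional mapping from article id to an extra score
           (assumption: added to the matching degree before sorting).
    """
    boost = boost or {}
    return sorted(
        matching_degrees,
        key=lambda article_id: matching_degrees[article_id] + boost.get(article_id, 0.0),
        reverse=True,
    )
```

Calling it without `boost` reproduces the plain largest-to-smallest ordering described in the text.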
  • the search method is different.
  • the personalized search is not turned on, and the search method is only for the patient's input information. It includes the following steps:
  • Step S120 Obtain search information input by the patient through a terminal device, and obtain, from a designated server according to the search information, a fifth keyword of the search information and a fifth category word corresponding to the fifth keyword.
  • a keyword set and category words are established; the keyword set includes a plurality of fifth keywords, the category words include a plurality of fifth category words, and the mapping relationship between the fifth keyword and the fifth category word is configured in advance.
  • Step S122 Obtain a second article collection according to the fifth keyword and the fifth category word, wherein the second article collection includes at least one article.
  • a related second article set is obtained from the database: as long as an article contains at least one fifth keyword and/or fifth category word, the article is identified and retrieved.
  • the fifth weight coefficient of the fifth keyword and the sixth weight coefficient of the fifth category word of each article in the second article set are calculated, where each article includes the fifth keyword and/or the fifth category word.
  • the fifth weight coefficient of the fifth keyword and the sixth weight coefficient of the fifth category word are calculated through the TF-IDF (term frequency-inverse document frequency) model and the LDA (Latent Dirichlet Allocation, a document topic generation model) model.
  • the weight coefficient of a keyword set in the title is the weight coefficient of the title plus the weight coefficient of the keyword set.
  • the weight coefficient of a keyword set in the subject is the weight coefficient of the subject plus the weight coefficient of the keyword set.
  • Step S126 calculating the second matching degree of each article in the second article collection according to the fifth weighting coefficient and the sixth weighting coefficient.
  • the fifth weight coefficients of the keywords are added to obtain a total fifth weight coefficient;
  • the sixth weight coefficients of the category words are added to obtain a total sixth weight coefficient;
  • the total fifth weight coefficient is added to the total sixth weight coefficient to obtain the second matching degree.
  • the second matching degree of each article in the second article collection is obtained according to the above-mentioned method.
  • the second matching degrees of the articles in the second article set are sorted from largest to smallest to obtain the second search result of the patient, and the second search result is displayed so that the patient obtains the searched article information.
  • FIG. 7 shows a schematic diagram of the program modules of Embodiment 3 of the personalized search system based on user portraits of this application.
  • the personalized search system 20 based on user portraits may include or be divided into one or more program modules; the one or more program modules are stored in a storage medium and executed by one or more processors to complete this application and implement the above-mentioned personalized search method based on user portraits.
  • the program module referred to in the embodiments of the present application is a series of computer-readable instruction segments capable of completing specific functions, and is better suited than the program itself to describing the execution process of the personalized search system 20 based on user portraits in the storage medium. The following description introduces the functions of each program module in this embodiment:
  • the first obtaining module 200 is configured to obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category.
  • the mapping relationship between the first keyword and the first category word is configured in advance. For example, if the patient is interested in eating or cooking, the first category words of the patient's interest category are diet, recipe, etc. The patient searches for the corresponding diet, recipe, etc. according to their own condition; for a diabetic patient, for instance, the first category words are diabetic diet, diabetic recipes, etc.
  • the first keywords for diabetic diet are low salt and low fat, and can further include coarse grains such as buckwheat, oatmeal, and corn flour, soybeans and soy products, vegetables, etc.
  • the first obtaining module 200 is further configured to:
  • the patient's personal information is acquired through a pre-configured electronic page, the electronic page includes multiple fields, and the multiple fields correspond to personal information such as gender, age, and past medical history.
  • the patient's electronic medical record can be obtained from the designated server, such as a medical sharing platform, and the patient's medical history information can be extracted from the electronic medical record. If the patient's electronic medical record cannot be obtained, then when a search request from the patient is received, electronic pages with preset question-and-answer information, such as follow-up, patient education Q&A, and medicine Q&A, are pushed for the patient to choose from, so as to obtain the patient's feedback on these preset questions and answers.
  • the patient's associated information can also be queried by obtaining the patient's historical operation record, which includes the patient's login and query information and related articles; the patient's degree of interest in disease-related articles can be analyzed from historical operation information such as viewing duration, clicks, reposts, or comments.
  • the first acquisition module 200 is also used to establish a disease classification system in advance:
  • the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; the multiple category words include at least disease causes and disease medications.
  • category words include: disease cause, disease medication, disease prevention, disease examination, disease diagnosis, treatment, common sense, care, cutting-edge information, etc.; a category word can be further expanded into subcategory words, such as hazards, complications, etc. Further, taking disease causes as an example, the category word can also be subdivided into keyword sets such as smoking, drinking, and other bad habits.
  • based on the personal information and the associated information, construct a user portrait of the patient, the user portrait including a plurality of portrait tags corresponding to multiple dimensions.
  • the first obtaining module 200 is further configured to:
  • the user portrait holds the correlation probability between the patient and each portrait label, and the patient's interest category is determined according to the correlation probability: portrait tags of the same category are associated, and similar portrait tags whose correlation probability exceeds the preset range are selected as the interest category.
  • the first obtaining module 200 is further configured to:
  • each category word corresponds to a dimension
  • each keyword corresponds to a portrait label
  • the correlation probability of the user portrait of the patient is obtained according to the first correlation probability and the second correlation probability.
  • the TF-IDF (Term Frequency-Inverse Document Frequency) model is used to evaluate the importance of a word to the sample user information and the sample associated information of the sample users.
  • the TF-IDF model calculates the weight value of each word, the words are sorted by weight value, and all words with a weight value greater than a preset value are taken as sample keywords.
  • the first obtaining module 200 is further configured to:
  • the real-time historical operation information of the patient is analyzed, the patient's attention point is obtained according to the real-time historical operation information, and the attention point is mapped to the user portrait of the patient to update the user portrait of the patient.
  • the patient does not provide personal information
  • the second acquisition module 201 is configured to acquire input information of the patient, and acquire a second keyword and/or a second category word corresponding to the second keyword according to the input information.
  • the mapping relationship between the second keyword and the second category word is configured in advance.
  • the step of obtaining the second keyword is as follows: traverse the input information according to the keyword set to obtain one or more second keywords from the input information. For example, when a patient searches, it is determined that the patient has entered diet-related information.
  • the search query can be of the form "what can and can't be eaten with XXX" (where XXX is a condition, such as diabetes), and may involve protein, meat, seafood, etc.
  • protein, meat, seafood, etc. are second category words.
  • pork, beef, chicken, etc. are second keywords. Pork, beef, chicken, etc., as second keywords, are mapped to the second category word "meat".
  • the third obtaining module 202 is configured to obtain a first article set according to the first keyword, the first category word, the second keyword, and the second category word, where the first article set includes at least one article.
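  • as long as an article contains at least one of the first keyword, the first category word, the second keyword, and/or the second category word, the article is identified and retrieved.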
  • the first calculation module 203 is configured to calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word of each article in the first article set, where each article includes the first keyword, the first category word, the second keyword, and/or the second category word.
  • the weight coefficient of a keyword set in the title is the weight coefficient of the title plus the weight coefficient of the keyword set.
  • the weight coefficient of a keyword set in the subject is the weight coefficient of the subject plus the weight coefficient of the keyword set.
  • the TF-IDF (term frequency-inverse document frequency) model and the LDA (Latent Dirichlet Allocation, a document topic generation model) model are used to calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word.
  • the second calculation module 204 is configured to calculate the first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient.
  • the first weight coefficients of each article are added to obtain the total first weight coefficient
  • the second weight coefficients are added to obtain the total second weight coefficient
  • the third weight coefficients are added to obtain the total third weight coefficient
  • the fourth weight coefficients are added to obtain the total fourth weight coefficient
  • the total first weight coefficient, the total second weight coefficient, the total third weight coefficient, and the total fourth weight coefficient are added to obtain the first matching degree.
  • the first matching degree of each article in the first article collection is obtained according to the above-mentioned method.
  • the result output module 205 is configured to output a search result page based on the first matching degree of each article.
  • the first matching degrees of the articles in the first article set are sorted in a uniform order (for example, from largest to smallest) to obtain the first search result of the patient; other factors, such as bidding ranking, may also be added to the sorting.
  • the first search result is displayed, so that the patient can obtain more accurate search article information.
  • the computer device 2 is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions.
  • the computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers).
  • the computer device 2 includes at least, but is not limited to, a memory 21, a processor 22, a network interface 23, and a personalized search system 20 based on user portraits, which can be communicatively connected to each other through a system bus. Specifically:
  • the memory 21 includes at least one type of non-volatile computer-readable storage medium.
  • the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like.
  • the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or memory of the computer device 2.
  • the memory 21 may also be an external storage device of the computer device 2, for example, a plug-in hard disk, a smart media card (SMC), a Secure Digital (SD) card, a flash card, etc.
  • the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device.
  • the memory 21 is generally used to store the operating system and various application software installed in the computer device 2, such as the program code of the personalized search system 20 according to the user portrait in the third embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 22 is generally used to control the overall operation of the computer device 2.
  • the processor 22 is used to run the program code or process the data stored in the memory 21, for example, to run the personalized search system 20 based on user portraits, so as to implement the personalized search methods based on user portraits of the first and second embodiments.
  • the network interface 23 may include a wireless network interface or a wired network interface.
  • the network interface 23 is generally used to establish a communication connection between the server 2 and other electronic devices.
  • the network interface 23 is used to connect the server 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the server 2 and the external terminal.
  • the network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.
  • FIG. 8 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the personalized search system 20 based on user portraits stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to complete the application.
  • FIG. 7 shows a schematic diagram of the program modules of the third embodiment of the personalized search system 20 based on user portraits.
  • the personalized search system 20 based on user portraits can be divided into a first acquisition module 200, a second acquisition module 201, a third acquisition module 202, a first calculation module 203, a second calculation module 204, and a result output module 205.
  • the program module referred to in this application refers to an instruction segment of a series of computer-readable instructions that can complete specific functions. The specific functions of the program modules 200-205 have been described in detail in the third embodiment, and will not be repeated here.
  • This embodiment also provides a non-volatile computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, server, App application mall, etc., on which computer-readable instructions are stored, and the corresponding functions are realized when the instructions are executed by a processor.
  • the non-volatile computer-readable storage medium of this embodiment is used to store the personalized search system 20 according to the user portrait, and when executed by the processor, the following steps are implemented:
  • the fourth weight coefficient of the second category word, where each article includes the first keyword, the first category word, the second keyword and/or the second category word;
  • the search result page is output.
  • the embodiment of the present invention obtains the user portrait by analyzing the patient's personal information and related information, then combines the patient's input information to obtain keywords and the category words related to them, retrieves articles containing those keywords and category words, calculates the weight coefficients of the keywords and category words of each article in the article set, obtains the matching degree of each article from those weight coefficients, and sorts the articles from largest to smallest, improving the accuracy of the patient's article search.
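
As a rough illustration of the matching-degree calculation and ranking described in the fragments above, the following Python sketch sums the four groups of weight coefficients per article and sorts the articles from largest to smallest. The names `ArticleWeights`, `first_matching_degree`, and `rank_articles` are hypothetical, and the plain unweighted sum is taken directly from the text; the source does not say how individual coefficients are produced beyond naming TF-IDF and LDA, so they are treated here as given inputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArticleWeights:
    """Weight coefficients found for one article (hypothetical container)."""
    article_id: str
    first_keyword: List[float]    # first weight coefficients
    first_category: List[float]   # second weight coefficients
    second_keyword: List[float]   # third weight coefficients
    second_category: List[float]  # fourth weight coefficients


def first_matching_degree(w: ArticleWeights) -> float:
    """Add the four totals to obtain the first matching degree."""
    return (
        sum(w.first_keyword)
        + sum(w.first_category)
        + sum(w.second_keyword)
        + sum(w.second_category)
    )


def rank_articles(articles: List[ArticleWeights]) -> List[ArticleWeights]:
    """Sort articles by first matching degree, largest first."""
    return sorted(articles, key=first_matching_degree, reverse=True)
```

Other ranking factors mentioned in the text, such as bidding ranking, would need an additional sort key and are left out of this sketch.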

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A personalized search method employing a user portrait comprises: acquiring a first keyword and/or a first category word mapped to the first keyword, wherein the first keyword is associated with a category of interest of a patient (S100); acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword (S102); acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set comprises at least one article (S104); calculating a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word of each article in the first article set, wherein each article comprises the first keyword, the first category word, the second keyword, and/or the second category word (S106); calculating a first matching degree between each article and a search target of the patient according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient (S108); and outputting a search result page on the basis of the first matching degree of each article (S110). By using the method, an accurate search is performed.

Description

Personalized search method, system, device, and storage medium based on user portrait
This application claims priority to Chinese patent application No. 201910694255.5, entitled "Personalized search method, system, device, and storage medium based on user portrait", filed with the Chinese Patent Office on July 30, 2019, the entire contents of which are incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of search technology, and in particular to a personalized search method, system, device, and storage medium based on user portraits.
Background
With the development of the Internet, articles published by many people can be found online, which enables information sharing and lets people find the material they need on the network.
At present, article search is usually performed on the search term entered by the user. When articles containing a keyword set matching the search term are found, the articles that satisfy the conditions are selected from the candidates and displayed, while those that do not are hidden. The selected articles are then presented in a certain order, sorted by article matching degree from largest to smallest, and the search result is provided to the user. However, the inventors found that, for patient users, the articles found do not fully meet their actual needs, and the search accuracy is insufficient.
Summary of the invention
In view of this, the purpose of the embodiments of the present application is to provide a personalized search method, system, device, and storage medium based on user portraits, which can perform accurate searches based on the user portrait and the input information.
To achieve the above purpose, an embodiment of the present application provides a personalized search method based on user portraits, including:
acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, each article including the first keyword, the first category word, the second keyword, and/or the second category word;
calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
outputting a search result page based on the first matching degree of each article.
To achieve the above purpose, an embodiment of the present application further provides a personalized search system based on user portraits, including:
a first acquisition module, configured to acquire a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
a second acquisition module, configured to acquire input information of the patient, and to acquire, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
a third acquisition module, configured to acquire a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
a first calculation module, configured to calculate, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, each article including the first keyword, the first category word, the second keyword, and/or the second category word;
a second calculation module, configured to calculate a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
a result output module, configured to output a search result page based on the first matching degree of each article.
To achieve the above purpose, an embodiment of the present application further provides a computer device including a memory and a processor, the memory storing computer-readable instructions that can run on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, each article including the first keyword, the first category word, the second keyword, and/or the second category word;
calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
outputting a search result page based on the first matching degree of each article.
To achieve the above purpose, an embodiment of the present application further provides a non-volatile computer-readable storage medium storing computer-readable instructions that can be executed by at least one processor, so that the at least one processor performs the following steps:
acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, each article including the first keyword, the first category word, the second keyword, and/or the second category word;
calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
outputting a search result page based on the first matching degree of each article.
Description of the drawings
FIG. 1 is a flowchart of Embodiment 1 of the personalized search method based on user portraits according to an embodiment of the present application.
FIG. 2 is a flowchart of step S100 in FIG. 1 according to an embodiment of the present application.
FIG. 3 is a flowchart of pre-establishing a disease classification system according to an embodiment of the present application.
FIG. 4 is a flowchart of step S100B in FIG. 2 according to an embodiment of the present application.
FIG. 5 is a flowchart of step S100B2 in FIG. 4 according to an embodiment of the present application.
FIG. 6 is a flowchart of Embodiment 2 of the personalized search method based on user portraits according to an embodiment of the present application.
FIG. 7 is a schematic diagram of the program modules of Embodiment 3 of the personalized search system based on user portraits according to an embodiment of the present application.
FIG. 8 is a schematic diagram of the hardware structure of Embodiment 4 of the computer device according to an embodiment of the present application.
Detailed description
Embodiment 1
Referring to FIG. 1, a flowchart of the steps of the personalized search method based on user portraits according to Embodiment 1 of the present application is shown. It can be understood that the flowchart in this method embodiment does not limit the order in which the steps are executed. The following is an exemplary description with a server as the execution subject. The details are as follows.
Step S100: acquire a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's category of interest.
Specifically, the mapping relationship between the first keyword and the first category word is configured in advance. For example, if the patient is interested in diet or in cooking, the first category words of the patient's category of interest are diet, recipes, and so on, and the patient searches for the corresponding diet or recipes according to his or her own condition. For a diabetic patient, for instance, the first category words are diabetic diet, diabetic recipes, and so on; the first keywords for diabetic diet are low salt and low fat, and may further include coarse grains such as buckwheat, oatmeal, and corn flour, as well as soybeans and soy products, vegetables, and so on.
Exemplarily, referring to FIG. 2, step S100 further includes:
Step S100A: acquire the personal information entered by the patient through a terminal device, and query the patient's associated information from a designated server based on the personal information, the associated information including the patient's historical operation records.
Exemplarily, the patient's personal information is acquired through a pre-configured electronic page that includes multiple fields corresponding to personal information such as gender, age, and past medical history.
Specifically, provided the relevant permissions have been obtained, the patient's electronic medical record can be retrieved from the designated server, such as a medical sharing platform, and the patient's medical history information can be extracted from it. If the electronic medical record cannot be obtained, then when a search request from the patient is received, electronic pages containing preset question-and-answer information (such as follow-up, patient education Q&A, and medication Q&A) are pushed for the patient to choose from, so as to collect the patient's feedback on this preset information.
Specifically, the patient's associated information can also be queried from the patient's historical operation records. These records include the information and related articles the patient queried after logging in, and the patient's degree of interest in disease-related articles. The degree of interest can be derived by analyzing historical operation information such as how long the patient viewed an article and whether the patient liked or disliked, reposted, or commented on it.
Exemplarily, referring to FIG. 3, step S100A further includes pre-establishing a disease classification system:
Step S100AA: acquire multiple pieces of sample user information of multiple sample users and the sample associated information of the sample users.
Step S100AB: extract multiple sample keywords from the sample user information and the sample associated information through the TF-IDF model.
Specifically, the TF-IDF (Term Frequency-Inverse Document Frequency) model is used to evaluate how important a word is to the sample user information and the sample associated information. The TF-IDF model yields a weight value for each word; the words are sorted by weight, and all words whose weight exceeds a preset value are taken as sample keywords.
Step S100AC: take the multiple sample keyword sets as the input of the layer-1 sample neural network model and the first sample category word in the classification system as its output, and train the layer-1 sample neural network model to predict the corresponding category word from keywords.
Step S100AD: continue until the m-th layer sample neural network model is reached, then stop training, where 2≤m≤M and M is the total number of sample category words included in the classification system.
Step S100AE: take the training result of the (m-1)-th layer sample neural network model together with the multiple sample keywords as the input of the m-th layer sample neural network model, take the m-th sample category word in the classification system as its output, and train the m-th layer sample neural network model to predict the corresponding category word from keywords.
Step S100AF: extract multiple keywords from the personal information and the patient's associated information through the TF-IDF model.
Step S100AG: input the multiple keywords into the sample neural network model; each layer of the model predicts the corresponding category word from the keywords, and the model outputs the category words corresponding to the respective keywords.
Step S100AH: associate the category words with their corresponding keywords to obtain the disease classification system. The disease classification system includes multiple sub-disease categories; each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set. The category words include at least disease cause and disease medication.
Specifically, the keyword sets in the disease classification system are input into the sample neural network model, which outputs the corresponding category words. Category words include disease cause, disease medication, disease prevention, disease examination, disease diagnosis, treatment, common knowledge, nursing, and cutting-edge information. A category word can be further expanded into sub-category words, such as hazards and complications. Taking disease cause as an example, the category word can be further subdivided into keyword sets for unhealthy habits such as smoking and drinking.
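
The TF-IDF keyword extraction used in steps S100AB and S100AF can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not specify the TF-IDF variant or how weights are combined across documents, so the sketch uses relative term frequency, a natural-log inverse document frequency, and keeps each word's maximum weight over all documents. The function name `tfidf_keywords` and the pre-tokenized input format are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tfidf_keywords(
    documents: List[List[str]], threshold: float
) -> List[Tuple[str, float]]:
    """Return (word, weight) pairs whose TF-IDF weight exceeds threshold,
    sorted from largest to smallest weight."""
    n_docs = len(documents)

    # Document frequency: number of documents containing each word.
    doc_freq: Counter = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    # Assumption: a word's weight is its highest TF-IDF over all documents.
    best: Dict[str, float] = {}
    for tokens in documents:
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            best[term] = max(best.get(term, 0.0), tf * idf)

    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return [(term, weight) for term, weight in ranked if weight > threshold]
```

With this formulation, a word that appears in every document receives a weight of zero and is never selected, which matches the intent of picking words that distinguish one user's information from another's.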
Step S100B: construct a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait labels corresponding to multiple dimensions.
Exemplarily, referring to FIG. 4, step S100B further includes:
Step S100B1: obtain the word vectors of the first keywords of the personal information and the associated information through the word2vec model.
Step S100B2: input the word vectors of the first keywords into a prediction model, and output the association probability between the patient and each portrait label through the prediction model, so as to obtain the user portrait of the patient.
Exemplarily, referring to FIG. 5, step S100B2 further includes:
Step S100B2A: establish a mapping relationship between the disease classification system and the patient to form the user portrait of the patient, where each category word corresponds to one dimension and each keyword corresponds to one portrait label.
Step S100B2B: input the word vectors of the keyword sets of the disease classification system into the prediction model, and calculate a first association probability through the softmax layer of the prediction model.
Step S100B2C: input the word vectors of the category words of the disease classification system into the prediction model, and calculate a second association probability through the softmax layer of the prediction model.
Step S100B2D: obtain the association probability of the patient's user portrait according to the first association probability and the second association probability.
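
Steps S100B2B and S100B2C rely on the softmax layer of the prediction model to turn raw scores into association probabilities. The Python sketch below shows only that final normalization; the scores themselves would come from the prediction model, which the source does not describe in detail. The function names and the example labels are hypothetical, and how the first and second association probabilities are combined in step S100B2D is not stated in the source, so it is deliberately not modeled here.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(label_scores: Dict[str, float]) -> Dict[str, float]:
    """Map each portrait label's raw score to an association probability."""
    labels = list(label_scores)
    probs = softmax([label_scores[label] for label in labels])
    return dict(zip(labels, probs))
```

The resulting probabilities sum to one across the labels, so they can be compared directly against a preset cutoff when selecting the patient's categories of interest.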
示例性地,步骤S100B进一步包括:Exemplarily, step S100B further includes:
分析所述患者的实时历史操作信息,根据实时历史操作信息获取患者的关注点,将关注点映射到患者的用户画像上,以更新所述患者的用户画像。The real-time historical operation information of the patient is analyzed, the patient's attention point is obtained according to the real-time historical operation information, and the attention point is mapped to the user portrait of the patient to update the user portrait of the patient.
具体地,若患者未给出个人信息,可以先从指定服务器中获取患者的关联信息,再通过样本神经网络模型获取患者的关联信息的疾病分类体系,并通过word2vec模型得到对应的多个词向量;将所述多个词向量输入到预测模型中,通过预测模型输出患者与各个画像标签的关联概率,以得到患者的用户画像;其中,预测模型可以是深度学习模型等。Specifically, if the patient does not provide personal information, you can first obtain the patient's related information from the designated server, then obtain the disease classification system of the patient's related information through the sample neural network model, and obtain the corresponding multiple word vectors through the word2vec model Input the multiple word vectors into the prediction model, and output the correlation probability of the patient and each portrait label through the prediction model to obtain the user portrait of the patient; wherein the prediction model may be a deep learning model.
Step S100C: obtain the patient's categories of interest from the user portrait.
Specifically, the user portrait holds the association probability between the patient and each portrait label, and the patient's categories of interest are determined from these probabilities: portrait labels of the same kind are grouped together, and the same-kind labels whose association probability exceeds a preset range are set as a category of interest.
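The selection in step S100C can be sketched as follows. This is a minimal illustration, assuming the "preset range" is a single lower bound (`threshold`) and that each portrait label is already mapped to the category word it belongs to; the function name and the data are hypothetical.

```python
from collections import defaultdict


def interest_categories(label_probs, label_to_category, threshold):
    """Group labels by category and keep those whose probability exceeds the threshold."""
    grouped = defaultdict(list)
    for label, prob in label_probs.items():
        if prob > threshold:
            grouped[label_to_category[label]].append(label)
    return dict(grouped)


# Hypothetical association probabilities taken from a user portrait.
label_probs = {"low salt": 0.45, "low fat": 0.30, "smoking": 0.05, "insulin": 0.20}
label_to_category = {
    "low salt": "diet",
    "low fat": "diet",
    "smoking": "disease cause",
    "insulin": "disease medication",
}
interests = interest_categories(label_probs, label_to_category, threshold=0.25)
```

Here only the "diet" category survives, with the labels "low salt" and "low fat"; categories with no label above the bound are dropped.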
Step S102: obtain the patient's input information, and obtain from it a second keyword and/or a second category word corresponding to the second keyword.
Specifically, the mapping between second keywords and second category words is configured in advance.
The second keyword is obtained as follows: the input information is traversed against the keyword set to obtain one or more second keywords from it. For example, when a patient searches and the input is detected to concern diet, the query may be "what can and cannot be eaten with XXX" (XXX being a symptom, e.g. diabetes), which may involve proteins, meat, seafood and so on. Here proteins, meat, seafood, etc. are second category words, and pork, beef, chicken, etc. are second keywords. The second keywords pork, beef, chicken, etc. are mapped to the second category word "meat".
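The traversal and mapping in step S102 can be sketched in a few lines of Python. This is an illustration under stated assumptions: matching is done by plain case-insensitive substring search, the preconfigured mapping is a dictionary, and the names `extract_keywords` and `keyword_to_category` are hypothetical.

```python
def extract_keywords(text, keyword_to_category):
    """Return the keywords found in the text and the category words they map to."""
    lowered = text.lower()
    # Traverse the preconfigured keyword set and keep every keyword present in the input.
    keywords = [kw for kw in keyword_to_category if kw in lowered]
    categories = sorted({keyword_to_category[kw] for kw in keywords})
    return keywords, categories


# Hypothetical preconfigured mapping from second keywords to second category words.
keyword_to_category = {
    "pork": "meat",
    "beef": "meat",
    "chicken": "meat",
    "shrimp": "seafood",
}
keywords, categories = extract_keywords("Can a diabetic eat beef or chicken?", keyword_to_category)
```

For this query the keywords are "beef" and "chicken", and both map to the single category word "meat". Substring search is the simplest choice; a production system would more likely tokenize the input first.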
Step S104: obtain a first article set according to the first keyword, the first category word, the second keyword and the second category word, where the first article set includes at least one article.
Specifically, any article that contains at least one of the first keyword, the first category word, the second keyword and/or the second category word is identified and selected.
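The selection rule in step S104 amounts to an "at least one term matches" filter. A minimal sketch, assuming each article is a dictionary with hypothetical `title` and `body` fields and that matching is a case-insensitive substring test:

```python
def select_articles(articles, terms):
    """Keep every article whose title or body contains at least one of the terms."""
    wanted = [term.lower() for term in terms]
    selected = []
    for article in articles:
        text = (article["title"] + " " + article["body"]).lower()
        if any(term in text for term in wanted):
            selected.append(article)
    return selected


# Hypothetical article store.
articles = [
    {"title": "Diabetic diet basics", "body": "Prefer low salt and low fat meals."},
    {"title": "Running shoes", "body": "How to pick a pair."},
    {"title": "Cooking beef", "body": "Lean cuts are a good choice."},
]
first_article_set = select_articles(articles, ["diabetic diet", "beef", "meat"])
```

The first and third articles are selected; the second contains none of the terms and is left out.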
Step S106: for each article in the first article set, calculate a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword and a fourth weight coefficient of the second category word, where each article includes the first keyword, the first category word, the second keyword and/or the second category word.
Exemplarily, each article includes the first keyword, the first category word, the second keyword and/or the second category word.
Exemplarily, if the title and the body of the article contain the keyword set or the category words, and the title and the body carry different weight coefficients, then the weight coefficient of the keyword set in the title is the title's weight coefficient plus the keyword set's weight coefficient; likewise, the weight coefficient of the keyword set in the body is the body's weight coefficient plus the keyword set's weight coefficient.
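The title/body rule can be written out as a short sketch. The additive rule itself comes from the text; the function name and the concrete values of `title_weight` and `body_weight` are assumptions chosen for illustration.

```python
def positional_weights(term_weight, in_title, in_body, title_weight=2.0, body_weight=1.0):
    """Add the position's weight coefficient to the term's own weight coefficient."""
    weights = {}
    if in_title:
        weights["title"] = title_weight + term_weight
    if in_body:
        weights["body"] = body_weight + term_weight
    return weights


# A term with weight 0.5 that appears in both the title and the body.
weights = positional_weights(0.5, in_title=True, in_body=True)
```

With the assumed position weights, the same term counts 2.5 in the title and 1.5 in the body, so a title match contributes more than a body match.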
Specifically, the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword and the fourth weight coefficient of the second category word are calculated with a TF-IDF (term frequency-inverse document frequency) model and an LDA (Latent Dirichlet Allocation, a document topic generation model) model. The LDA model can identify latent topic words in a large document collection or corpus.
Step S108: calculate a first matching degree between each article and the patient's search target from the first weight coefficient, the second weight coefficient, the third weight coefficient and the fourth weight coefficient.
Specifically, for each article, the first weight coefficients are summed to give a total first weight coefficient, the second weight coefficients are summed to give a total second weight coefficient, the third weight coefficients are summed to give a total third weight coefficient, and the fourth weight coefficients are summed to give a total fourth weight coefficient; finally, the total first, total second, total third and total fourth weight coefficients are added together to give the first matching degree. The first matching degree of every article in the first article set is obtained in this way.
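The arithmetic of step S108 can be shown with a small worked example. The sketch assumes the weight coefficients of one article have already been computed and grouped into four lists; the group names and the numbers are hypothetical.

```python
def matching_degree(weights_by_group):
    """Sum each group's weight coefficients, then add the group totals together."""
    totals = {group: sum(values) for group, values in weights_by_group.items()}
    return sum(totals.values()), totals


# Hypothetical weight coefficients for a single article.
article_weights = {
    "first_keyword": [0.5, 0.25],   # first weight coefficients
    "first_category": [0.25],       # second weight coefficients
    "second_keyword": [1.0],        # third weight coefficients
    "second_category": [],          # fourth weight coefficients (no match)
}
degree, totals = matching_degree(article_weights)
```

Here the group totals are 0.75, 0.25, 1.0 and 0, so the first matching degree of the article is 2.0. A group with no matching term simply contributes zero.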
Step S110: output a search result page based on the first matching degree of each article.
Specifically, the articles in the first article set are sorted by their first matching degree in a single direction (for example, from largest to smallest) to obtain the patient's first search result; other factors, such as paid (bid) ranking, may also be incorporated into the ordering. The first search result is displayed so that the patient obtains more accurate article search results.
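The ordering in step S110 is a descending sort on the matching degree. The sketch below also shows one possible way to fold in an additional factor; adding a per-article bonus to the matching degree is an assumption, since the text only says that other factors may be included. The names `rank_articles` and `extra_score` are hypothetical.

```python
def rank_articles(degrees, extra_score=None):
    """Return article ids ordered from the largest to the smallest score."""
    extra_score = extra_score or {}
    # ASSUMPTION: extra factors (e.g. paid ranking) are modeled as an additive bonus.
    return sorted(
        degrees,
        key=lambda article_id: degrees[article_id] + extra_score.get(article_id, 0.0),
        reverse=True,
    )


# Hypothetical first matching degrees keyed by article id.
degrees = {"a": 1.0, "b": 2.0, "c": 1.5}
by_match_only = rank_articles(degrees)
with_bonus = rank_articles(degrees, extra_score={"a": 2.0})
```

Sorting on the matching degree alone puts "b" first; with the bonus, "a" moves to the top because its combined score becomes 3.0.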
Embodiment 2
Referring to FIG. 6, this embodiment differs from Embodiment 1 in the search method: here personalized search is not enabled, and the search is performed only on the information input by the patient. The method includes the following steps:
Step S120: obtain the search information input by the patient through a terminal device and, according to the search information, obtain from a designated server a fifth keyword of the search information and a fifth category word corresponding to the fifth keyword.
Specifically, a keyword set and category words are established; the keyword set includes a plurality of fifth keywords and the category words include a plurality of fifth category words, and the mapping between fifth keywords and fifth category words is configured in advance.
Step S122: obtain a second article set according to the fifth keyword and the fifth category word, where the second article set includes at least one article.
Specifically, the relevant second article set is retrieved from the database according to the fifth keyword and the fifth category word of the search information; any article that contains at least one fifth keyword and/or fifth category word is identified and retrieved.
Step S124: each article in the second article set includes the fifth keyword and/or the fifth category word; for each article, calculate a fifth weight coefficient of the fifth keyword and a sixth weight coefficient of the fifth category word.
Specifically, the fifth weight coefficient of the fifth keyword and the sixth weight coefficient of the fifth category word are calculated with a TF-IDF (term frequency-inverse document frequency) model and an LDA (Latent Dirichlet Allocation, a document topic generation model) model.
Exemplarily, if the title and the body of the article contain the keyword set or the category words, and the title and the body carry different weight coefficients, then the weight coefficient of the keyword set in the title is the title's weight coefficient plus the keyword set's weight coefficient; likewise, the weight coefficient of the keyword set in the body is the body's weight coefficient plus the keyword set's weight coefficient.
Step S126: calculate a second matching degree for each article in the second article set from the fifth weight coefficient and the sixth weight coefficient.
Specifically, the fifth weight coefficients of the keywords are summed to give a total fifth weight coefficient, and the sixth weight coefficients of the category words are summed to give a total sixth weight coefficient; finally, the total fifth weight coefficient is added to the total sixth weight coefficient to give the second matching degree. The second matching degree of every article in the second article set is obtained in this way.
Step S128: output a search result page based on the second matching degree of each article in the second article set.
Specifically, the articles in the second article set are sorted by their second matching degree from largest to smallest to obtain the patient's second search result, and the second search result is displayed to give the patient the article search results.
Embodiment 3
Continuing to refer to FIG. 7, a schematic diagram of the program modules of Embodiment 3, the personalized search system based on a user portrait, is shown. In this embodiment, the personalized search system 20 based on a user portrait may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to carry out the present application and implement the personalized search method based on a user portrait described above. A program module in the embodiments of this application refers to a series of computer-readable instruction segments capable of performing a specific function, and is better suited than the program itself to describing how the personalized search system 20 executes in the storage medium. The functions of each program module of this embodiment are described below:
The first acquisition module 200 is configured to obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's category of interest.
Specifically, the mapping between first keywords and first category words is configured in advance. For example, if the patient is interested in diet or in cooking, the first category words of the patient's category of interest are diet, recipes and so on, and the patient searches for diets and recipes suited to their own condition. For a diabetic patient, the first category words are diabetic diet, diabetic recipes, etc.; the first keywords for diabetic diet are low salt and low fat, and may further include coarse grains such as buckwheat, oatmeal and corn flour, soybeans and soy products, vegetables and so on.
Exemplarily, the first acquisition module 200 is further configured to:
obtain the personal information input by the patient through a terminal device, and query the patient's associated information from a designated server based on that personal information, the associated information including the patient's historical operation records;
Exemplarily, the patient's personal information is collected through a preconfigured electronic page containing multiple fields that correspond to personal information such as gender, age and past medical history.
Specifically, provided the relevant permissions have been obtained, the patient's electronic medical record may be retrieved from the designated server, such as a medical sharing platform, and the patient's medical history extracted from it. If the electronic medical record cannot be obtained, then upon receiving a search request from the patient, electronic pages with preset question-and-answer information (such as follow-up, patient-education Q&A and medication Q&A) are pushed for the patient to choose from, so as to collect the patient's feedback on that preset information.
Specifically, the patient's associated information may also be queried through the patient's historical operation records, which include the information and related articles the patient looked up after logging in and the patient's degree of interest in disease-related articles. The degree of interest can be derived by analyzing historical operation information such as how long the patient viewed an article and whether the patient disliked, forwarded or commented on it.
Exemplarily, the first acquisition module 200 is further configured to establish the disease classification system in advance:
obtain multiple pieces of sample user information for multiple sample users, together with the sample associated information of those sample users;
extract multiple sample keywords from the sample user information and the sample associated information through the TF-IDF model;
take the multiple sample keyword sets as the input of the layer-1 sample neural network model and the 1st sample category word in the classification system as its output, and train the layer-1 sample neural network model's ability to predict the corresponding category word from keywords;
stop training upon reaching the layer-m sample neural network model, where 2≤m≤M and M is the total number of sample category words included in the classification system;
take the training result of the layer-(m-1) sample neural network model together with the multiple sample keywords as the input of the layer-m sample neural network model and the m-th sample category word in the classification system as its output, and train the layer-m sample neural network model's ability to predict the corresponding category word from keywords;
extract multiple keywords from the personal information and the patient's associated information through the TF-IDF model;
input the multiple keywords into the sample neural network model and, through each layer's ability to predict the corresponding category word from keywords, output the multiple category words corresponding to the multiple keywords;
associate the multiple category words with their corresponding keywords to obtain the disease classification system; the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; the multiple category words include at least disease cause and disease medication.
Specifically, the keyword sets in the disease classification system are input into the sample neural network model, which outputs the corresponding category words. Category words include disease cause, disease medication, disease prevention, disease examination, disease diagnosis, treatment, general knowledge, nursing care, latest developments and so on. Category words may be further expanded into subcategory words such as harms and complications. Taking disease cause as an example, this category word can be further subdivided into keyword sets for unhealthy habits such as smoking and drinking.
construct the patient's user portrait based on the personal information and the associated information, the user portrait including multiple portrait labels corresponding to multiple dimensions.
Exemplarily, the first acquisition module 200 is further configured to:
obtain the word vectors of the first keywords of the personal information and the associated information through the word2vec model;
input the word vectors of the first keywords into the prediction model, and output through the prediction model the association probability between the patient and each portrait label, so as to obtain the patient's user portrait;
obtain the patient's categories of interest from the user portrait.
Specifically, the user portrait holds the association probability between the patient and each portrait label, and the patient's categories of interest are determined from these probabilities: portrait labels of the same kind are grouped together, and the same-kind labels whose association probability exceeds a preset range are set as a category of interest.
Exemplarily, the first acquisition module 200 is further configured to:
establish a mapping between the disease classification system and the patient to form the patient's user portrait, where each category word corresponds to one dimension and each keyword corresponds to one portrait label;
input the word vectors of the keyword sets of the disease classification system into the prediction model, and compute a first association probability from the softmax layer of the prediction model;
input the word vectors of the category words of the disease classification system into the prediction model, and compute a second association probability from the softmax layer of the prediction model;
obtain the association probability of the patient's user portrait from the first association probability and the second association probability.
Specifically, the TF-IDF (term frequency-inverse document frequency) model is used to evaluate how important a word is to the multiple pieces of sample user information and the sample associated information of the sample users. The TF-IDF model computes a weight value for each word, the words are sorted by weight value, and all words whose weight value exceeds a preset value are taken as sample keywords.
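The keyword extraction described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the model used in the application: it uses the textbook TF-IDF formula (term frequency times the logarithm of the inverse document frequency, without smoothing), takes a word's best weight across documents as its score, and expects non-empty, already tokenized documents. The function names and sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(documents):
    """Return one {term: weight} dict per document; documents are non-empty token lists."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def sample_keywords(documents, preset_value):
    """Sort words by their best TF-IDF weight and keep those above the preset value."""
    best = {}
    for doc_weights in tfidf_weights(documents):
        for term, weight in doc_weights.items():
            best[term] = max(best.get(term, 0.0), weight)
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [term for term, weight in ranked if weight > preset_value]


# Hypothetical tokenized sample user information.
documents = [["diabetes", "diet", "diet"], ["diabetes", "insulin"]]
keywords = sample_keywords(documents, preset_value=0.1)
```

In this example "diabetes" appears in every document, so its inverse document frequency is zero and it is filtered out, while "diet" and "insulin" are kept as sample keywords.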
Exemplarily, the first acquisition module 200 is further configured to:
analyze the patient's real-time historical operation information, derive the patient's points of interest from it, and map those points of interest onto the patient's user portrait so as to update the portrait.
Specifically, if the patient has not provided personal information, the patient's associated information may first be obtained from the designated server. The disease classification system for that associated information is then obtained through the sample neural network model, and the corresponding word vectors are obtained through the word2vec model. The word vectors are input into the prediction model, which outputs the association probability between the patient and each portrait label, yielding the patient's user portrait. The prediction model may be, for example, a deep learning model.
The second acquisition module 201 is configured to obtain the patient's input information, and to obtain from it a second keyword and/or a second category word corresponding to the second keyword.
Specifically, the mapping between second keywords and second category words is configured in advance.
The second keyword is obtained as follows: the input information is traversed against the keyword set to obtain one or more second keywords from it. For example, when a patient searches and the input is detected to concern diet, the query may be "what can and cannot be eaten with XXX" (XXX being a symptom, e.g. diabetes), which may involve proteins, meat, seafood and so on. Here proteins, meat, seafood, etc. are second category words, and pork, beef, chicken, etc. are second keywords. The second keywords pork, beef, chicken, etc. are mapped to the second category word "meat".
The third acquisition module 202 is configured to obtain a first article set according to the first keyword, the first category word, the second keyword and the second category word, where the first article set includes at least one article.
Specifically, any article that contains at least one of the first keyword, the first category word, the second keyword and/or the second category word is identified and selected.
The first calculation module 203 is configured to calculate, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword and a fourth weight coefficient of the second category word, where each article includes the first keyword, the first category word, the second keyword and/or the second category word.
Exemplarily, each article includes the first keyword, the first category word, the second keyword and/or the second category word.
Exemplarily, if the title and the body of the article contain the keyword set or the category words, and the title and the body carry different weight coefficients, then the weight coefficient of the keyword set in the title is the title's weight coefficient plus the keyword set's weight coefficient; likewise, the weight coefficient of the keyword set in the body is the body's weight coefficient plus the keyword set's weight coefficient.
Specifically, the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword and the fourth weight coefficient of the second category word are calculated with a TF-IDF (term frequency-inverse document frequency) model and an LDA (Latent Dirichlet Allocation, a document topic generation model) model.
The second calculation module 204 is configured to calculate a first matching degree between each article and the patient's search target from the first weight coefficient, the second weight coefficient, the third weight coefficient and the fourth weight coefficient.
Specifically, for each article, the first weight coefficients are summed to give a total first weight coefficient, the second weight coefficients are summed to give a total second weight coefficient, the third weight coefficients are summed to give a total third weight coefficient, and the fourth weight coefficients are summed to give a total fourth weight coefficient; finally, the total first, total second, total third and total fourth weight coefficients are added together to give the first matching degree. The first matching degree of every article in the first article set is obtained in this way.
The result output module 205 is configured to output a search result page based on the first matching degree of each article.
Specifically, the articles in the first article set are sorted by their first matching degree in a single direction (for example, from largest to smallest) to obtain the patient's first search result; other factors, such as paid (bid) ranking, may also be incorporated into the ordering. The first search result is displayed so that the patient obtains more accurate article search results.
Embodiment 4
Referring to FIG. 8, a schematic diagram of the hardware architecture of the computer device of Embodiment 4 of this application is shown. In this embodiment, the computer device 2 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server or a cabinet server (including a standalone server or a server cluster composed of multiple servers). As shown in FIG. 8, the computer device 2 includes at least, but is not limited to, a memory 21, a processor 22, a network interface 23 and the personalized search system 20 based on a user portrait, which can be communicatively connected to one another through a system bus. Wherein:
In this embodiment, the memory 21 includes at least one type of non-volatile computer-readable storage medium. The readable storage medium includes flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disk and the like. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as the hard disk or internal memory of the computer device 2. In other embodiments, the memory 21 may be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the computer device 2. The memory 21 may also include both an internal storage unit of the computer device 2 and an external storage device. In this embodiment, the memory 21 is generally used to store the operating system and the various application software installed on the computer device 2, such as the program code of the personalized search system 20 based on a user portrait of Embodiment 3. In addition, the memory 21 may be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip. The processor 22 is generally used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is used to run the program code stored in the memory 21 or to process data, for example to run the personalized search system 20 based on a user portrait, so as to implement the personalized search methods based on a user portrait of Embodiments 1 and 2.
The network interface 23 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the server 2 and other electronic devices. For example, the network interface 23 is used to connect the server 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the server 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, the Global System of Mobile communication (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi.
It should be noted that FIG. 8 shows only the computer device 2 with components 20-23; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
In this embodiment, the personalized search system 20 based on user portraits stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to carry out the present application.
For example, FIG. 7 shows a schematic diagram of the program modules of Embodiment 3 of the personalized search system 20 based on user portraits. In that embodiment, the personalized search system 20 may be divided into a first acquisition module 200, a second acquisition module 201, a third acquisition module 202, a first calculation module 203, a second calculation module 204, and a result output module 205. A program module, as referred to in this application, is a segment of a series of computer-readable instructions capable of performing a specific function. The specific functions of the program modules 200-205 have been described in detail in Embodiment 3 and are not repeated here.
Embodiment 5
This embodiment further provides a non-volatile computer-readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disk, a server, or an app store, on which computer-readable instructions are stored that implement the corresponding functions when executed by a processor. The non-volatile computer-readable storage medium of this embodiment is used to store the personalized search system 20 based on user portraits, which implements the following steps when executed by a processor:
acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword, and/or the second category word;
calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
outputting a search result page based on the first matching degree of each article.
In the embodiments of the present invention, the patient's personal information and associated information are analyzed to obtain a user portrait; keywords and the category words related to them are then obtained in combination with the input information of the patient's search; an article set containing the keywords and category words is retrieved; weight coefficients are calculated for the keywords and category words of each article in the article set; the matching degree of each article is obtained from those weight coefficients; and the articles are sorted in descending order of matching degree, thereby improving the accuracy of the patient's article search.
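The weighting and ranking steps summarized above can be illustrated with a minimal Python sketch. This section does not state how the weight coefficients or the matching degree are computed, so everything below is an assumption made only for illustration: the names `Article`, `term_weight`, `match_degree`, and `rank_articles` are hypothetical, each weight coefficient is taken to be the relative frequency of a single-token term in the article text, and the matching degree is taken to be a weighted sum with arbitrary mixing factors.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Article:
    title: str
    text: str


def term_weight(term: str, text: str) -> float:
    """Assumed weight coefficient: relative frequency of a single-token term."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens)


def match_degree(
    article: Article,
    terms: Sequence[str],
    mix: Tuple[float, float, float, float] = (0.2, 0.1, 0.5, 0.2),
) -> float:
    """Assumed matching degree: weighted sum of the four weight coefficients.

    `terms` is ordered as (first keyword, first category word,
    second keyword, second category word). The mixing factors are
    illustrative values, not taken from the source.
    """
    weights = [term_weight(t, article.text) for t in terms]
    return sum(m * w for m, w in zip(mix, weights))


def rank_articles(articles: List[Article], terms: Sequence[str]) -> List[Article]:
    """Keep articles containing at least one term, sorted by descending match."""
    candidates = [
        a for a in articles if any(term_weight(t, a.text) > 0 for t in terms)
    ]
    return sorted(candidates, key=lambda a: match_degree(a, terms), reverse=True)
```

Calling `rank_articles` with the four terms derived from the user portrait and from the patient's input returns the candidate articles in the order in which a search result page could list them; a real system would replace the whitespace tokenizer and the assumed weighting with whatever the earlier embodiments define.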
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of the present application.

Claims (21)

  1. A personalized search method based on user portraits, comprising:
    acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
    acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
    acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
    calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword, and/or the second category word;
    calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
    outputting a search result page based on the first matching degree of each article.
  2. The personalized search method according to claim 1, further comprising a step of acquiring the category of interest of the patient, including:
    acquiring personal information input by the patient through a terminal device, and querying associated information of the patient from a designated server based on the personal information, the associated information including historical operation records of the patient;
    constructing a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait labels corresponding to multiple dimensions;
    acquiring the category of interest of the patient according to the user portrait.
  3. The personalized search method according to claim 2, wherein the step of acquiring personal information input by the patient through a terminal device, and querying associated information of the patient from a designated server based on the personal information, further includes establishing a disease classification system in advance:
    acquiring multiple pieces of sample user information of multiple sample users and sample associated information of the sample users;
    extracting multiple sample keyword sets from the multiple pieces of sample user information and the sample associated information of the sample users through a TF-IDF model;
    taking the multiple sample keyword sets as the input of a layer-1 sample neural network model and the 1st sample category word in the classification system as the output of the layer-1 sample neural network model, and training the layer-1 sample neural network model to predict the corresponding category word from keywords;
    stopping training upon reaching the layer-m sample neural network model, where 2≤m≤M and M is the total number of sample category words included in the classification system;
    taking the training result of the layer-(m-1) sample neural network model and the multiple sample keywords as the input of the layer-m sample neural network model and the m-th sample category word in the classification system as the output of the layer-m sample neural network model, and training the layer-m sample neural network model to predict the corresponding category word from keywords;
    extracting multiple keywords from the personal information and the associated information of the patient through the TF-IDF model;
    inputting the multiple keywords into the sample neural network model and, using the trained ability of each layer of the sample neural network model to predict the corresponding category word from keywords, outputting multiple category words respectively corresponding to the multiple keywords; and
    associating the multiple category words with the corresponding keywords to obtain the disease classification system, wherein the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set, and wherein the multiple category words include at least disease cause and disease medication.
  4. The personalized search method according to claim 2, wherein the step of constructing a user portrait of the patient based on the personal information and the associated information of the patient, the user portrait including multiple portrait labels corresponding to multiple dimensions, includes:
    obtaining a word vector of the first keyword of the personal information and the associated information through a word2vec model;
    inputting the word vector of the first keyword into a prediction model, and outputting, through the prediction model, the association probability between the patient and each portrait label, so as to obtain the user portrait of the patient.
  5. The personalized search method according to claim 4, wherein the step of inputting the word vector of the first keyword into a prediction model, and outputting, through the prediction model, the association probability between the patient and each portrait label, so as to obtain the user portrait of the patient, includes:
    establishing a mapping relationship between the disease classification system and the patient to form the user portrait of the patient, each category word corresponding to one dimension and each keyword corresponding to one portrait label;
    inputting the word vectors of the keyword sets of the disease classification system into the prediction model, and calculating a first association probability by the softmax layer of the prediction model;
    inputting the word vectors of the category words of the disease classification system into the prediction model, and calculating a second association probability by the softmax layer of the prediction model;
    obtaining the association probability of the user portrait of the patient according to the first association probability and the second association probability.
  6. The personalized search method according to claim 2, wherein the step of constructing a user portrait of the patient based on the personal information and the associated information of the patient further includes:
    analyzing real-time historical operation information of the patient, obtaining the patient's points of interest according to the real-time historical operation information, and mapping the points of interest onto the user portrait of the patient so as to update the user portrait of the patient.
  7. The personalized search method according to claim 1, further including, after the step of outputting a search result page based on the first matching degree of each article:
    acquiring search information input by the patient through a terminal device, and acquiring from a designated server, according to the search information, a fifth keyword of the search information and a fifth category word corresponding to the fifth keyword;
    acquiring a second article set according to the fifth keyword and the fifth category word, wherein the second article set includes at least one article;
    calculating, for each article in the second article set, a fifth weight coefficient of the fifth keyword and a sixth weight coefficient of the fifth category word, wherein each article in the second article set includes the fifth keyword and/or the fifth category word;
    calculating a second matching degree of each article in the second article set according to the fifth weight coefficient and the sixth weight coefficient;
    outputting a search result page based on the second matching degree of each article in the second article set.
  8. A personalized search system based on user portraits, comprising:
    a first acquisition module, configured to acquire a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
    a second acquisition module, configured to acquire input information of the patient, and to acquire, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
    a third acquisition module, configured to acquire a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
    a first calculation module, configured to calculate, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword, and/or the second category word;
    a second calculation module, configured to calculate a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient;
    a result output module, configured to output a search result page based on the first matching degree of each article.
  9. The personalized search system according to claim 8, wherein the first acquisition module is further configured to:
    acquire personal information input by the patient through a terminal device, and query associated information of the patient from a designated server based on the personal information, the associated information including historical operation records of the patient;
    construct a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait labels corresponding to multiple dimensions;
    acquire the category of interest of the patient according to the user portrait.
  10. The personalized search system according to claim 9, wherein the first acquisition module is further configured to:
    acquire multiple pieces of sample user information of multiple sample users and sample associated information of the sample users;
    extract multiple sample keyword sets from the multiple pieces of sample user information and the sample associated information of the sample users through a TF-IDF model;
    take the multiple sample keyword sets as the input of a layer-1 sample neural network model and the 1st sample category word in the classification system as the output of the layer-1 sample neural network model, and train the layer-1 sample neural network model to predict the corresponding category word from keywords;
    stop training upon reaching the layer-m sample neural network model, where 2≤m≤M and M is the total number of sample category words included in the classification system;
    take the training result of the layer-(m-1) sample neural network model and the multiple sample keywords as the input of the layer-m sample neural network model and the m-th sample category word in the classification system as the output of the layer-m sample neural network model, and train the layer-m sample neural network model to predict the corresponding category word from keywords;
    extract multiple keywords from the personal information and the associated information of the patient through the TF-IDF model;
    input the multiple keywords into the sample neural network model and, using the trained ability of each layer of the sample neural network model to predict the corresponding category word from keywords, output multiple category words respectively corresponding to the multiple keywords; and
    associate the multiple category words with the corresponding keywords to obtain the disease classification system, wherein the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set, and wherein the multiple category words include at least disease cause and disease medication.
  11. The personalized search system according to claim 9, wherein the first acquisition module is further configured to:
    obtain a word vector of the first keyword of the personal information and the associated information through a word2vec model;
    input the word vector of the first keyword into a prediction model, and output, through the prediction model, the association probability between the patient and each portrait label, so as to obtain the user portrait of the patient.
  12. The personalized search system according to claim 11, wherein the first acquisition module is further configured to:
    establish a mapping relationship between the disease classification system and the patient to form the user portrait of the patient, each category word corresponding to one dimension and each keyword corresponding to one portrait label;
    input the word vectors of the keyword sets of the disease classification system into the prediction model, and calculate a first association probability by the softmax layer of the prediction model;
    input the word vectors of the category words of the disease classification system into the prediction model, and calculate a second association probability by the softmax layer of the prediction model;
    obtain the association probability of the user portrait of the patient according to the first association probability and the second association probability.
  13. The personalized search system according to claim 9, wherein the first acquisition module is further configured to:
    analyze real-time historical operation information of the patient, obtain the patient's points of interest according to the real-time historical operation information, and map the points of interest onto the user portrait of the patient so as to update the user portrait of the patient.
  14. A computer device, comprising a memory and a processor, the memory storing user-portrait-based computer-readable instructions executable on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
    acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
    acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
    calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword, and/or the second category word;
    calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
    outputting a search result page based on the first matching degree of each article.
  15. The personalized search system according to claim 14, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    acquiring personal information input by the patient through a terminal device, and querying associated information of the patient from a designated server based on the personal information, the associated information including historical operation records of the patient;
    constructing a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait labels corresponding to multiple dimensions;
    acquiring the category of interest of the patient according to the user portrait.
  16. The personalized search system according to claim 15, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    acquiring multiple pieces of sample user information of multiple sample users and sample associated information of the sample users;
    extracting multiple sample keyword sets from the multiple pieces of sample user information and the sample associated information of the sample users through a TF-IDF model;
    taking the multiple sample keyword sets as the input of a layer-1 sample neural network model and the 1st sample category word in the classification system as the output of the layer-1 sample neural network model, and training the layer-1 sample neural network model to predict the corresponding category word from keywords;
    stopping training upon reaching the layer-m sample neural network model, where 2≤m≤M and M is the total number of sample category words included in the classification system;
    taking the training result of the layer-(m-1) sample neural network model and the multiple sample keywords as the input of the layer-m sample neural network model and the m-th sample category word in the classification system as the output of the layer-m sample neural network model, and training the layer-m sample neural network model to predict the corresponding category word from keywords;
    extracting multiple keywords from the personal information and the associated information of the patient through the TF-IDF model;
    inputting the multiple keywords into the sample neural network model and, using the trained ability of each layer of the sample neural network model to predict the corresponding category word from keywords, outputting multiple category words respectively corresponding to the multiple keywords; and
    associating the multiple category words with the corresponding keywords to obtain the disease classification system, wherein the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set, and wherein the multiple category words include at least disease cause and disease medication.
  17. The personalized search system according to claim 15, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining a word vector of the first keyword of the personal information and the associated information through a word2vec model;
    inputting the word vector of the first keyword into a prediction model, and outputting, through the prediction model, the association probability between the patient and each portrait label, so as to obtain the user portrait of the patient.
  18. The personalized search system according to claim 17, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    establishing a mapping relationship between the disease classification system and the patient to form the user portrait of the patient, wherein each category word corresponds to one dimension and each keyword corresponds to one portrait label;
    inputting the word vectors of the keyword sets of the disease classification system into the prediction model, and calculating a first association probability through the softmax layer of the prediction model;
    inputting the word vectors of the category words of the disease classification system into the prediction model, and calculating a second association probability through the softmax layer of the prediction model;
    obtaining the association probability of the user portrait of the patient according to the first association probability and the second association probability.
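By way of illustration only, the softmax-based association probabilities of claims 17 and 18 can be sketched as follows. The claims do not disclose the architecture of the prediction model or the rule for merging the two probabilities, so the single linear scoring layer, the weighted average with the hypothetical parameter `alpha`, and the toy two-dimensional vectors (standing in for word2vec output) are all assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_probabilities(word_vector, label_weights):
    """Score each portrait label by a dot product, then normalize with softmax."""
    labels = list(label_weights)
    logits = [sum(w * x for w, x in zip(label_weights[label], word_vector))
              for label in labels]
    return dict(zip(labels, softmax(logits)))

def combine(first, second, alpha=0.5):
    """Blend the two association probabilities per label (assumed rule)."""
    return {label: alpha * first[label] + (1 - alpha) * second[label]
            for label in first}

# Hypothetical label weights and toy vectors standing in for word2vec output.
label_weights = {"diabetes": [2.0, 0.0], "hypertension": [0.0, 2.0]}
first = label_probabilities([1.0, 0.0], label_weights)   # keyword-set vector
second = label_probabilities([0.5, 0.5], label_weights)  # category-word vector
portrait = combine(first, second)
```

A convex combination keeps the result a valid probability distribution over the portrait labels, which is one reason it is a convenient placeholder; other merging rules would also be consistent with the claim wording.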
  19. [Corrected according to Rule 91 09.01.2020]
  20. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions executable on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
    acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
    acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
    calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword, and/or the second category word;
    calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
    outputting a search result page based on the first matching degree of each article.
  21. A non-volatile computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions are executable by at least one processor to cause the at least one processor to perform the following steps:
    acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with a category of interest of a patient;
    acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword;
    acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set includes at least one article;
    calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword, and/or the second category word;
    calculating a first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient; and
    outputting a search result page based on the first matching degree of each article.
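By way of illustration only, the weighting and ranking steps shared by claims 20 and 21 can be sketched as follows. The claims do not specify how a weight coefficient is computed or how the four coefficients are merged, so the relative term frequency used in `term_weight`, the weighted sum in `matching_degree`, the hypothetical importance factors `(0.2, 0.1, 0.5, 0.2)`, and the toy articles are all assumptions.

```python
def term_weight(article_tokens, term):
    """Assumed weight coefficient: relative frequency of term in the article."""
    if not article_tokens:
        return 0.0
    return article_tokens.count(term) / len(article_tokens)

def matching_degree(article_tokens, terms, factors=(0.2, 0.1, 0.5, 0.2)):
    """Weighted sum of the four weight coefficients.

    terms = (first keyword, first category word,
             second keyword, second category word)
    """
    return sum(f * term_weight(article_tokens, t)
               for f, t in zip(factors, terms))

def rank_articles(articles, terms):
    """Return article titles ordered by descending matching degree."""
    scored = [(matching_degree(tokens, terms), title)
              for title, tokens in articles.items()]
    return [title for _, title in sorted(scored, key=lambda s: (-s[0], s[1]))]

# Toy first article set (hypothetical data).
articles = {
    "A": ["insulin", "dose", "insulin", "diabetes"],
    "B": ["diet", "diabetes", "exercise", "sleep"],
}
# Portrait-derived terms first, then terms derived from the patient's input.
terms = ("diabetes", "disease", "insulin", "medication")
ranking = rank_articles(articles, terms)
```

Giving the second keyword, which comes from the patient's current input, the largest factor is a design choice of this sketch; it lets the explicit query dominate while the portrait-derived terms still influence the order of the search result page.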
PCT/CN2019/118070 2019-07-30 2019-11-13 Personalized search method, system, and device employing user portrait, and storage medium WO2021017306A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910694255.5A CN110580278B (en) 2019-07-30 2019-07-30 Personalized search method, system, equipment and storage medium according to user portraits
CN201910694255.5 2019-07-30

Publications (1)

Publication Number Publication Date
WO2021017306A1 (en) 2021-02-04

Family

ID=68811173

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118070 WO2021017306A1 (en) 2019-07-30 2019-11-13 Personalized search method, system, and device employing user portrait, and storage medium

Country Status (2)

Country Link
CN (1) CN110580278B (en)
WO (1) WO2021017306A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661990A (en) * 2022-03-23 2022-06-24 北京百度网讯科技有限公司 Method, apparatus, device, medium and product for data prediction and model training
CN114756745A (en) * 2022-03-29 2022-07-15 重庆义康鑫科技有限公司 Intelligent information recommendation method and device based on big data analysis
CN117076658A (en) * 2023-08-22 2023-11-17 南京朗拓科技投资有限公司 Quotation recommendation method, device and terminal based on information entropy

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139386A (en) * 2020-01-19 2021-07-20 浙江爱多特大健康科技有限公司 Information processing method, device and equipment and computer storage medium
CN111326142A (en) * 2020-01-21 2020-06-23 青梧桐有限责任公司 Text information extraction method and system based on voice-to-text and electronic equipment
CN111538751B (en) * 2020-03-23 2021-05-04 重庆特斯联智慧科技股份有限公司 Tagged user portrait generation system and method for Internet of things data
CN111859094A (en) * 2020-08-10 2020-10-30 广州驰兴通用技术研究有限公司 Information analysis method and system based on cloud computing
CN113821730A (en) * 2021-11-23 2021-12-21 北京嘉和海森健康科技有限公司 Medical information pushing method and device and electronic equipment
CN114512241B (en) * 2021-12-27 2024-05-03 中国人民解放军总医院第一医学中心 Frequency analysis-based intelligent searching method and system for esophageal vein tumor information

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120084277A1 (en) * 2010-09-10 2012-04-05 Veveo, Inc. Method of and system for conducting personalized federated search and presentation of results therefrom
CN104484380A (en) * 2014-12-09 2015-04-01 百度在线网络技术(北京)有限公司 Personalized search method and personalized search device
CN105426528A (en) * 2015-12-15 2016-03-23 中南大学 Retrieving and ordering method and system for commodity data
CN109446402A (en) * 2017-08-29 2019-03-08 阿里巴巴集团控股有限公司 A kind of searching method and device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160062967A1 (en) * 2014-08-27 2016-03-03 Tll, Llc System and method for measuring sentiment of text in context
CN105930425A (en) * 2016-04-18 2016-09-07 乐视控股(北京)有限公司 Personalized video recommendation method and apparatus
CN107273476A (en) * 2017-06-08 2017-10-20 广州优视网络科技有限公司 A kind of article search method, device and server
CN108090162A (en) * 2017-12-13 2018-05-29 北京百度网讯科技有限公司 Information-pushing method and device based on artificial intelligence
CN109993618A (en) * 2017-12-29 2019-07-09 北京京东尚科信息技术有限公司 Object search method, system and computer system, computer readable storage medium
CN109189904A (en) * 2018-08-10 2019-01-11 上海中彦信息科技股份有限公司 Individuation search method and system
CN109740016A (en) * 2019-01-03 2019-05-10 百度在线网络技术(北京)有限公司 Method, apparatus, server and the computer readable storage medium of music query

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114661990A (en) * 2022-03-23 2022-06-24 北京百度网讯科技有限公司 Method, apparatus, device, medium and product for data prediction and model training
CN114756745A (en) * 2022-03-29 2022-07-15 重庆义康鑫科技有限公司 Intelligent information recommendation method and device based on big data analysis
CN117076658A (en) * 2023-08-22 2023-11-17 南京朗拓科技投资有限公司 Quotation recommendation method, device and terminal based on information entropy
CN117076658B (en) * 2023-08-22 2024-05-03 南京朗拓科技投资有限公司 Quotation recommendation method, device and terminal based on information entropy

Also Published As

Publication number Publication date
CN110580278A (en) 2019-12-17
CN110580278B (en) 2023-05-26

Similar Documents

Publication Publication Date Title
WO2021017306A1 (en) Personalized search method, system, and device employing user portrait, and storage medium
Yang et al. Yum-me: a personalized nutrient-based meal recommender system
KR102411921B1 (en) A method for calculating relevance, an apparatus for calculating relevance, a data query apparatus, and a non-transitory computer-readable storage medium
US8898180B2 (en) Method and system for querying information
CN108733766B (en) Data query method and device and readable medium
EP2823410B1 (en) Entity augmentation service from latent relational data
US8478749B2 (en) Method and apparatus for determining relevant search results using a matrix framework
EP2866421B1 (en) Method and apparatus for identifying a same user in multiple social networks
WO2017024884A1 (en) Search intention identification method and device
CN110569349B (en) Method, system, equipment and storage medium for pushing ill teaching article based on big data
CN112948540B (en) Information query method, device, electronic equipment and computer readable medium
US20230298481A1 (en) Food description processing methods and apparatuses
CN112052297B (en) Information generation method, apparatus, electronic device and computer readable medium
CN110569419A (en) question-answering system optimization method and device, computer equipment and storage medium
CN107885875B (en) Synonymy transformation method and device for search words and server
CN112307190A (en) Medical literature sorting method and device, electronic equipment and storage medium
CN115630144A (en) Document searching method and device and related equipment
US20120059786A1 (en) Method and an apparatus for matching data network resources
CN115917528A (en) Question inquiry device, method, equipment and storage medium
CN116798590A (en) Processing method, device, equipment and medium for constructing medicine management prediction model
Yang et al. Integrating PPI datasets with the PPI data from biomedical literature for protein complex detection
Liang et al. JST-RR model: joint modeling of ratings and reviews in sentiment-topic prediction
Altosaar et al. RankFromSets: Scalable set recommendation with optimal recall
US20220327361A1 (en) Method for Training Joint Model, Object Information Processing Method, Apparatus, and System
US11727051B2 (en) Personalized image recommendations for areas of interest

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19939066

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19939066

Country of ref document: EP

Kind code of ref document: A1