WO2021017306A1

WO2021017306A1 - Personalized search method, system, and device employing user portrait, and storage medium

Info

Publication number: WO2021017306A1
Application number: PCT/CN2019/118070
Authority: WO
Inventors: 周晓峰; 朱威
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-07-30
Filing date: 2019-11-13
Publication date: 2021-02-04
Also published as: CN110580278A; CN110580278B

Abstract

A personalized search method employing a user portrait comprises: acquiring a first keyword and/or a first category word mapped to the first keyword, wherein the first keyword is associated with a category of interest of a patient (S100); acquiring input information of the patient, and acquiring, according to the input information, a second keyword and/or a second category word corresponding to the second keyword (S102); acquiring a first article set according to the first keyword, the first category word, the second keyword, and the second category word, wherein the first article set comprises at least one article (S104); calculating a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word of each article in the first article set, wherein each article comprises the first keyword, the first category word, the second keyword, and/or the second category word (S106); calculating a first matching degree between each article and a search target of the patient according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient (S108); and outputting a search result page on the basis of the first matching degree of each article (S110). By using the method, an accurate search is performed.

Description

Personalized search method, system, equipment and storage medium based on user portrait

This application claims priority to Chinese patent application No. 201910694255.5, filed with the Chinese Patent Office on July 30, 2019 and entitled "personalized search method, system, equipment and storage medium based on user portrait", the entire content of which is incorporated in this application by reference.

Technical field

The embodiments of the present application relate to the field of search technology, and in particular, to a personalized search method, system, device, and storage medium based on user portraits.

Technical background

With the development of the Internet, articles published by many people can be found on the Internet, which realizes data sharing, and people can find the information they need on the Internet.

The article search function is currently usually implemented on the basis of the search term entered by the user. Articles whose keyword set matches the search term are selected from multiple articles and displayed, while articles that do not meet the condition are hidden. The filtered articles are then displayed in a certain order: they are sorted by matching degree from largest to smallest, and the search results are provided to the user. However, the inventors found that, for patient users, the retrieved articles cannot fully meet their actual needs, and the search accuracy is insufficient.

Summary of the invention

In view of this, the purpose of the embodiments of the present application is to provide a personalized search method, system, device, and storage medium based on user portraits, which can perform accurate searches based on user portraits and input information.

In order to achieve the foregoing objectives, the embodiments of the present application provide a personalized search method based on user portraits, including:

Acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;

Acquiring input information of the patient, and acquiring a second keyword and/or a second category word corresponding to the second keyword according to the input information;

Obtaining a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;

Calculating, for each article in the first article set, the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word, where each article includes the first keyword, the first category word, the second keyword and/or the second category word;

Calculating the first matching degree between each article and the patient's search target according to the first weighting coefficient, the second weighting coefficient, the third weighting coefficient, and the fourth weighting coefficient; and

Outputting a search result page based on the first matching degree of each article.

In order to achieve the foregoing objective, an embodiment of the present application also provides a personalized search system based on user portraits, including:

The first obtaining module is configured to obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;

The second acquisition module is configured to acquire input information of the patient, and acquire a second keyword and/or a second category word corresponding to the second keyword according to the input information;

The third obtaining module is configured to obtain a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;

The first calculation module is configured to calculate, for each article in the first article set, the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word, where each article includes the first keyword, the first category word, the second keyword and/or the second category word;

The second calculation module is configured to calculate the first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient;

The result output module is used to output the search result page based on the first matching degree of each article.

In order to achieve the foregoing objective, an embodiment of the present application further provides a computer device. The computer device includes a memory and a processor, the memory stores computer-readable instructions that can run on the processor, and when the computer-readable instructions are executed by the processor, the following steps are implemented:

In order to achieve the above objective, the embodiments of the present application also provide a non-volatile computer-readable storage medium. The non-volatile computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions can be executed by at least one processor, so that the at least one processor executes the following steps:

Description of the drawings

FIG. 1 is a flowchart of Embodiment 1 of a personalized search method based on a user portrait according to an embodiment of the application.

Fig. 2 is a flowchart of step S100 in Fig. 1 of an embodiment of the application.

Figure 3 is a flow chart of pre-establishing a disease classification system according to an embodiment of the application.

Fig. 4 is a flowchart of step S100B in Fig. 2 of the embodiment of the application.

Fig. 5 is a flowchart of step S100B2 in Fig. 4 of an embodiment of the application.

Fig. 6 is a flowchart of a second embodiment of a personalized search method based on a user portrait according to an embodiment of the application.

FIG. 7 is a schematic diagram of program modules in Embodiment 3 of a personalized search system based on user portraits according to an embodiment of the application.

FIG. 8 is a schematic diagram of the hardware structure of Embodiment 4 of the computer device according to the embodiment of the application.

Detailed description

Embodiment 1

FIG. 1 shows a flowchart of the steps of a personalized search method based on a user portrait in Embodiment 1 of the present application. It can be understood that the flowchart in this method embodiment does not limit the order in which the steps are executed. The following description takes the server as the executing entity by way of example. The details are as follows.

Step S100: Obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category.

Specifically, the mapping relationship between the first keyword and the first category word is configured in advance. For example, if the patient is interested in eating or cooking, the first category words of the patient's interest category are diet, recipes, and the like. The patient searches for the corresponding diet, recipes, etc. according to their own condition. For a diabetic patient, for instance, the first category words are diabetic diet, diabetic recipes, etc. The first keywords of diabetic diet are low salt and low fat, and may further include coarse grains such as buckwheat, oatmeal, and corn flour, as well as soybeans and soy products, vegetables, etc.

Exemplarily, referring to FIG. 2, step S100 further includes:

Step S100A: Obtain personal information input by the patient through a terminal device, and query the patient's associated information from a designated server based on the personal information, the associated information including the patient's historical operation records.

Exemplarily, the patient's personal information is acquired through a pre-configured electronic page, the electronic page includes multiple fields, and the multiple fields correspond to personal information such as gender, age, and past medical history.

Specifically, on the premise of obtaining the relevant permissions, the patient's electronic medical record can be obtained from the designated server, such as a medical sharing platform, and the patient's medical history information can be extracted from the electronic medical record. If the patient's electronic medical record cannot be obtained, then when a search request from the patient is received, electronic pages with preset question-and-answer information, such as follow-up, patient education Q&A, and medication Q&A, are pushed for the patient to choose from, so as to obtain the patient's feedback on these preset questions and answers.

Specifically, the patient's associated information can also be queried by obtaining the patient's historical operation records. The historical operation records include the patient's login and query information and the related articles. The patient's degree of interest in disease-related articles can be analyzed from historical operation information such as the time spent viewing an article, clicks, reposts, or comments.

Exemplarily, referring to Fig. 3, step S100A further includes pre-establishing a disease classification system:

Step S100AA: Obtain sample user information of multiple sample users and sample associated information of the sample users.

Step S100AB: Using the TF-IDF model, extract multiple sample keywords from the sample user information and the sample associated information of the sample users.

Specifically, the TF-IDF (Term Frequency-Inverse Document Frequency) model is used to evaluate how important a word is to the sample user information and the sample associated information of the sample users. The TF-IDF model calculates a weight value for each word, the words are sorted by weight value, and all words whose weight value is greater than a preset value are taken as sample keywords.
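The keyword extraction described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `tfidf_keywords`, the whitespace tokenization, and the unsmoothed logarithmic IDF are all assumptions, since the text only states that a weight is computed per word, the words are sorted, and those above a preset value are kept.

```python
import math
from collections import Counter

def tfidf_keywords(documents, threshold):
    """Return, per document, the (word, weight) pairs whose TF-IDF weight
    exceeds `threshold`, sorted from highest to lowest weight.

    Assumptions (not from the source): whitespace tokenization and
    idf = log(N / document frequency) without smoothing.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each word.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights = {
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
        results.append([(word, w) for word, w in ranked if w > threshold])
    return results
```

A word that occurs in every document receives weight zero under this IDF, so it is never selected as a keyword for any positive threshold.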

Step S100AC: Take multiple sample keyword sets as the input of the first-layer sample neural network model and the first sample category word in the classification system as its output, and train the ability of the first-layer sample neural network model to predict the corresponding category word from the keywords.

Step S100AD: Training stops once the m-th layer sample neural network model is reached, where 2≤m≤M and M is the total number of sample category words included in the classification system.

Step S100AE: Take the training result of the (m-1)-th layer sample neural network model together with the multiple sample keywords as the input of the m-th layer sample neural network model, take the m-th sample category word in the classification system as the output of the m-th layer sample neural network model, and train the ability of the m-th layer sample neural network model to predict the corresponding category word from the keywords.

Step S100AF: Extract multiple keywords from the personal information and the patient's associated information through the TF-IDF model.

Step S100AG: Input the multiple keywords into the sample neural network model and, using the ability of each layer of the sample neural network model to predict the corresponding category word from keywords, output the multiple category words corresponding to the multiple keywords.

Step S100AH: Associate multiple category words with corresponding keywords to obtain the disease classification system; the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; wherein the multiple category words include at least disease causes and disease medications.

Specifically, the keyword set in the disease classification system is input into the sample neural network model, and the corresponding category words are output. Category words include: disease cause, disease medication, disease prevention, disease examination, disease diagnosis, treatment, common sense, care, cutting-edge information, etc. A category word can be further expanded into subcategory words, such as hazards and complications. Further, taking disease cause as an example, the category word can be subdivided into keyword sets such as smoking, drinking, and other bad habits.

Step S100B, construct a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait tags corresponding to multiple dimensions.

Exemplarily, referring to FIG. 4, step S100B further includes:

In step S100B1, the word vector of the first keyword of the personal information and related information is obtained through the word2vec model.

Step S100B2, input the word vector of the first keyword into a prediction model, and output the correlation probability of the patient and each portrait label through the prediction model to obtain a user portrait of the patient.

Exemplarily, referring to FIG. 5, step S100B2 further includes:

In step S100B2A, a mapping relationship between the disease classification system and the patient is established to form a user portrait of the patient, each category word corresponds to a dimension, and each keyword corresponds to a portrait label.

Step S100B2B, input the word vector of the keyword set of the disease classification system into the prediction model, and calculate the first association probability according to the softmax layer of the prediction model.

Step S100B2C, input the word vectors of the category words of the disease classification system into the prediction model, and calculate the second correlation probability according to the softmax layer of the prediction model.

Step S100B2D: Obtain the correlation probability of the user portrait of the patient according to the first correlation probability and the second correlation probability.
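As a rough illustration of steps S100B2B to S100B2D, the sketch below shows a softmax over raw scores and one possible way of merging the two resulting probabilities. The softmax itself is standard; the merging rule (a weighted average controlled by a hypothetical parameter `alpha`) is purely an assumption, because the text does not state how the first and second correlation probabilities are combined.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def combine(first_prob, second_prob, alpha=0.5):
    """Merge the keyword-level and category-level probabilities.

    ASSUMPTION: a weighted average is used only for illustration;
    the source does not specify the combination rule.
    """
    return alpha * first_prob + (1 - alpha) * second_prob
```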

Exemplarily, step S100B further includes:

The real-time historical operation information of the patient is analyzed, the patient's attention point is obtained according to the real-time historical operation information, and the attention point is mapped to the user portrait of the patient to update the user portrait of the patient.

Specifically, if the patient does not provide personal information, the patient's associated information can first be obtained from the designated server. The disease classification system for the patient's associated information is then obtained through the sample neural network model, and the corresponding word vectors are obtained through the word2vec model. The word vectors are input into the prediction model, which outputs the correlation probability between the patient and each portrait label, yielding the user portrait of the patient. The prediction model may be a deep learning model.

Step S100C: Acquire the patient's interest category according to the user portrait.

Specifically, the user portrait holds the correlation probability between the patient and each portrait label, and the patient's interest category is determined from these correlation probabilities: portrait labels of the same category are associated with one another, and the same-category portrait labels whose correlation probability exceeds a preset threshold are selected as the interest category.
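The selection in step S100C can be sketched as a simple filter. The nested-dictionary layout of the portrait (category word to portrait label to correlation probability) and the function name `interest_categories` are assumptions made for illustration only.

```python
def interest_categories(portrait, threshold):
    """Group portrait labels by category and keep those above `threshold`.

    `portrait` maps a category word to a dict of
    {portrait label: correlation probability} (an assumed layout).
    Categories with no label above the threshold are dropped.
    """
    selected = {}
    for category, labels in portrait.items():
        kept = [label for label, prob in labels.items() if prob > threshold]
        if kept:
            selected[category] = kept
    return selected
```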

Step S102: Obtain input information of the patient, and obtain a second keyword and/or a second category word corresponding to the second keyword according to the input information.

Specifically, the mapping relationship between the second keyword and the second category word is configured in advance.

The second keyword is obtained as follows: the input information is traversed against the keyword set to obtain one or more second keywords from the input information. For example, when a patient searches and it is determined that the patient has entered diet-related information, the search may take the form "what can and cannot be eaten with XXX" (XXX being a condition, such as diabetes), and the answer may concern protein, meat, seafood, etc. Here, protein, meat, seafood, etc. are second category words, and pork, beef, chicken, etc. are second keywords. The second keywords pork, beef, chicken, etc. are mapped to the second category word meat.
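A minimal sketch of this traversal, assuming the pre-configured mapping is available as a dictionary from keyword to category word. The function name `extract_terms` is hypothetical, and plain substring matching stands in for whatever word segmentation the real system uses.

```python
def extract_terms(input_text, keyword_to_category):
    """Find second keywords in the input and map them to category words.

    ASSUMPTION: case-insensitive substring matching is used here only
    as a stand-in for proper tokenization or word segmentation.
    """
    text = input_text.lower()
    # Traverse the configured keyword set and keep those present in the input.
    keywords = [kw for kw in keyword_to_category if kw in text]
    # Map each matched keyword to its pre-configured category word.
    categories = sorted({keyword_to_category[kw] for kw in keywords})
    return keywords, categories
```

With a mapping in which pork, beef, and chicken all point to meat, an input mentioning pork and beef yields those two second keywords and the single second category word meat.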

Step S104: Obtain a first article set according to the first keyword, the first category word, the second keyword, and the second category word, where the first article set includes at least one article.

Specifically, an article is identified and selected into the first article set as long as it includes at least one of the first keyword, the first category word, the second keyword, and the second category word.

Step S106: For each article in the first article set, calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word, where each article includes the first keyword, the first category word, the second keyword and/or the second category word.
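The selection rule of step S104 amounts to an "any term matches" filter, which might look like the sketch below. The article layout (a dict with `title` and `body` fields) and the function name `first_article_set` are assumptions for illustration.

```python
def first_article_set(articles, terms):
    """Keep every article that contains at least one of `terms`.

    `articles` is a list of dicts with 'title' and 'body' strings
    (an assumed layout); `terms` holds the first/second keywords and
    first/second category words. Matching is case-insensitive.
    """
    lowered = [term.lower() for term in terms]
    selected = []
    for article in articles:
        text = (article["title"] + " " + article["body"]).lower()
        if any(term in text for term in lowered):
            selected.append(article)
    return selected
```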

Exemplarily, each article includes the first keyword, the first category word, the second keyword, and/or the second category word.

Exemplarily, if the title and the body of an article contain the keyword set or the category words, and the title and the body have different weight coefficients, then the weight coefficient of a keyword set found in the title is the weight coefficient of the title plus the weight coefficient of the keyword set. Similarly, the weight coefficient of a keyword set found in the body is the weight coefficient of the body plus the weight coefficient of the keyword set.

Specifically, the TF-IDF (term frequency-inverse document frequency) model and the LDA (Latent Dirichlet Allocation, a document topic generation model) model are used to calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word. The LDA model can identify topic words hidden in a large-scale document set or corpus.
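The position-dependent weighting described above can be sketched as follows. The default values `title_weight=2.0` and `body_weight=1.0` and the function name `located_weights` are hypothetical; the text only says that the two positions carry different weight coefficients and that the position weight is added to the keyword's own weight.

```python
def located_weights(keyword, keyword_weight, title, body,
                    title_weight=2.0, body_weight=1.0):
    """Return the weight of `keyword` for each position where it occurs.

    The position weights are illustrative defaults, not values from
    the source. Each result is position weight + keyword weight.
    """
    weights = {}
    needle = keyword.lower()
    if needle in title.lower():
        weights["title"] = title_weight + keyword_weight
    if needle in body.lower():
        weights["body"] = body_weight + keyword_weight
    return weights
```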

Step S108: According to the first weighting coefficient, the second weighting coefficient, the third weighting coefficient, and the fourth weighting coefficient, the first matching degree between each article and the patient search target is calculated.

Specifically, for each article, the first weight coefficients are added to obtain a total first weight coefficient, the second weight coefficients are added to obtain a total second weight coefficient, the third weight coefficients are added to obtain a total third weight coefficient, and the fourth weight coefficients are added to obtain a total fourth weight coefficient. Finally, the four totals are added to obtain the first matching degree. The first matching degree of each article in the first article set is obtained in this way.
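The summation in step S108 can be written out directly. In this sketch each argument is the list of weight coefficients of one kind found in a single article; the function name `first_matching_degree` and the list-based inputs are assumptions for illustration.

```python
def first_matching_degree(first, second, third, fourth):
    """Sum each kind of weight coefficient, then sum the four totals.

    Each argument is a list of the weight coefficients of that kind
    (first keyword, first category word, second keyword, second
    category word) found in one article; a list may be empty.
    """
    totals = [sum(first), sum(second), sum(third), sum(fourth)]
    return sum(totals)
```

For example, an article with first weight coefficients 1 and 2, one second weight coefficient 3, no third weight coefficient, and one fourth weight coefficient 4 has a first matching degree of 10.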

Step S110, based on the first matching degree of each article, output a search result page.

Specifically, the articles in the first article set are sorted uniformly by their first matching degree (for example, from largest to smallest) to obtain the first search result of the patient. Other factors, such as bid ranking, may also be added to the sorting. The first search result is displayed, so that the patient obtains more accurate article search results.

Embodiment 2

Please refer to FIG. 6. This embodiment differs from the first embodiment in the search method: personalized search is not enabled, and the search relies only on the patient's input information. It includes the following steps:
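The ordering in step S110 might be sketched as below. Using a bid value as a tie-breaker is only one assumed way of "adding other factors"; the text does not say how bid ranking enters the ordering, and the field names `matching_degree` and `bid` are hypothetical.

```python
def rank_articles(articles):
    """Sort articles from largest to smallest first matching degree.

    `articles` is a list of dicts with 'id' and 'matching_degree' keys
    and an optional 'bid' key. ASSUMPTION: the bid is used only to
    break ties between equal matching degrees.
    """
    return sorted(
        articles,
        key=lambda a: (a["matching_degree"], a.get("bid", 0.0)),
        reverse=True,
    )
```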

Step S120: Obtain search information input by the patient through a terminal device, and, according to the search information, obtain from a designated server a fifth keyword of the search information and a fifth category word corresponding to the fifth keyword.

Specifically, a keyword set and category words are established. The keyword set includes a plurality of fifth keywords, the category words include a plurality of fifth category words, and the mapping relationship between the fifth keywords and the fifth category words is configured in advance.

Step S122: Obtain a second article collection according to the fifth keyword and the fifth category word, wherein the second article collection includes at least one article.

Specifically, a related second article set is obtained from the database according to the fifth keyword and the fifth category word of the search information. An article is identified and retrieved as long as it contains at least one fifth keyword and/or fifth category word.

Step S124: Each article in the second article set includes the fifth keyword and/or the fifth category word; calculate, for each article, the fifth weight coefficient of the fifth keyword and the sixth weight coefficient of the fifth category word.

Specifically, the fifth weight coefficient of the fifth keyword and the sixth weight coefficient of the fifth category word are calculated through the TF-IDF (term frequency-inverse document frequency) model and the LDA (Latent Dirichlet Allocation, a document topic generation model) model.

Exemplarily, if the title and the body of an article contain the keyword set or the category words, and the title and the body have different weight coefficients, then the weight coefficient of a keyword set found in the title is the weight coefficient of the title plus the weight coefficient of the keyword set. Similarly, the weight coefficient of a keyword set found in the body is the weight coefficient of the body plus the weight coefficient of the keyword set.

Step S126: Calculate the second matching degree of each article in the second article set according to the fifth weight coefficient and the sixth weight coefficient.

Specifically, the fifth weight coefficients of the keywords are added to obtain a total fifth weight coefficient, the sixth weight coefficients of the category words are added to obtain a total sixth weight coefficient, and finally the total fifth weight coefficient is added to the total sixth weight coefficient to obtain the second matching degree. The second matching degree of each article in the second article set is obtained in this way.

Step S128: Output a search result page based on the second matching degree of each article in the second article set.

Specifically, the articles in the second article set are sorted by their second matching degree from largest to smallest to obtain the second search result of the patient, and the second search result is displayed so that the patient obtains the article search results.
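Steps S126 and S128 of this embodiment can be sketched together. The input layout (article id mapped to a pair of coefficient lists) and the function name `second_search_result` are assumptions for illustration.

```python
def second_search_result(articles):
    """Compute each article's second matching degree and rank the articles.

    `articles` maps an article id to a pair
    (fifth weight coefficients, sixth weight coefficients), each a list.
    Returns (id, second matching degree) pairs, largest degree first.
    """
    degrees = {
        article_id: sum(fifth) + sum(sixth)
        for article_id, (fifth, sixth) in articles.items()
    }
    ranked = sorted(degrees, key=degrees.get, reverse=True)
    return [(article_id, degrees[article_id]) for article_id in ranked]
```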

Embodiment 3

Please continue to refer to FIG. 7, which shows a schematic diagram of the program modules of Embodiment 3 of the personalized search system based on user portraits of this application. In this embodiment, the personalized search system 20 based on user portraits may include or be divided into one or more program modules. The one or more program modules are stored in a storage medium and executed by one or more processors to complete this application and implement the above personalized search method based on user portraits. A program module in the embodiments of the present application refers to a series of computer-readable instruction segments capable of completing a specific function, and is better suited than the program itself to describing the execution of the personalized search system 20 based on user portraits in the storage medium. The following description introduces the functions of each program module in this embodiment:

The first obtaining module 200 is configured to obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category.

Exemplarily, the first obtaining module 200 is further configured to:

Acquiring personal information input by the patient through the terminal device, and querying the patient's associated information from a designated server based on the personal information, the associated information including the patient's historical operation records;

Specifically, the patient's associated information can also be queried by obtaining the patient's historical operation records. The historical operation records include the patient's login and query information and the related articles. The patient's degree of interest in disease-related articles can be analyzed from historical operation information such as the time spent viewing an article, clicks, reposts, or comments.

Exemplarily, the first acquisition module 200 is also used to establish a disease classification system in advance:

Acquire sample user information of multiple sample users and sample associated information of the sample users;

Using the TF-IDF model, extract multiple sample keywords from the sample user information and the sample associated information of the sample users;

Take multiple sample keyword sets as the input of the first-layer sample neural network model and the first sample category word in the classification system as its output, and train the ability of the first-layer sample neural network model to predict the corresponding category word from the keywords;

Stop training once the m-th layer sample neural network model is reached, where 2≤m≤M and M is the total number of sample category words included in the classification system;

Take the training result of the (m-1)-th layer sample neural network model together with the multiple sample keywords as the input of the m-th layer sample neural network model, take the m-th sample category word in the classification system as the output of the m-th layer sample neural network model, and train the ability of the m-th layer sample neural network model to predict the corresponding category word from the keywords;

Extract multiple keywords from the personal information and the patient's associated information through the TF-IDF model;

Input the multiple keywords into the sample neural network model and, using the ability of each layer of the sample neural network model to predict the corresponding category word from keywords, output the multiple category words corresponding to the multiple keywords;

Associate multiple category words with corresponding keywords to obtain the disease classification system; the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; wherein the multiple category words include at least disease causes and disease medications.

Based on the personal information and the associated information, construct a user portrait of the patient, the user portrait including a plurality of portrait tags corresponding to multiple dimensions.

Exemplarily, the first obtaining module 200 is further configured to:

Obtain the word vector of the first keyword of the personal information and related information through the word2vec model;

Inputting the word vector of the first keyword into a prediction model, and outputting the correlation probability of the patient and each portrait label through the prediction model to obtain a user portrait of the patient;

Acquire the patient's interest category according to the user portrait.

Exemplarily, the first obtaining module 200 is further configured to:

Establishing a mapping relationship between the disease classification system and the patient to form a user portrait of the patient, each category word corresponds to a dimension, and each keyword corresponds to a portrait label;

Input the word vector of the keyword set of the disease classification system into the prediction model, and calculate the first association probability according to the softmax layer of the prediction model;

Input the word vector of the category word of the disease classification system into the prediction model, and calculate the second association probability according to the softmax layer of the prediction model;

The correlation probability of the user portrait of the patient is obtained according to the first correlation probability and the second correlation probability.

Exemplarily, the first obtaining module 200 is further configured to:

The second acquisition module 201 is configured to acquire input information of the patient, and acquire a second keyword and/or a second category word corresponding to the second keyword according to the input information.

The third obtaining module 202 is configured to obtain a first article set according to the first keyword, the first category word, the second keyword, and the second category word, where the first article set includes at least one article.

The first calculation module 203 is configured to calculate, for each article in the first article set, the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word, where each article includes the first keyword, the first category word, the second keyword and/or the second category word.

Specifically, a TF-IDF (term frequency-inverse document frequency) model and an LDA (Latent Dirichlet Allocation, a document topic generation model) model are used to calculate the first weight coefficient of the first keyword, the second weight coefficient of the first category word, the third weight coefficient of the second keyword, and the fourth weight coefficient of the second category word.
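
The TF-IDF part of this calculation can be sketched as follows. This is only an illustration under stated assumptions: articles are already tokenized, the smoothed inverse document frequency `log((1 + N) / (1 + df)) + 1` is one common variant chosen here rather than taken from the source, and the LDA component is not covered. The function name `tf_idf_weights` and the sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf_weights(
    docs: Dict[str, List[str]], terms: List[str]
) -> Dict[str, Dict[str, float]]:
    """Compute a TF-IDF weight for each query term in each tokenized article."""
    n_docs = len(docs)
    # Document frequency: number of articles that contain each term.
    df = {t: sum(1 for tokens in docs.values() if t in tokens) for t in terms}
    weights: Dict[str, Dict[str, float]] = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        length = len(tokens) or 1  # guard against empty articles
        weights[doc_id] = {
            # Term frequency times smoothed IDF (the smoothing is an assumption).
            t: (counts[t] / length) * (math.log((1 + n_docs) / (1 + df[t])) + 1)
            for t in terms
        }
    return weights


# Hypothetical tokenized articles.
docs = {
    "a1": ["hypertension", "cause", "hypertension", "diet"],
    "a2": ["asthma", "medication"],
}
w = tf_idf_weights(docs, ["hypertension", "medication"])
```

A term that does not occur in an article receives a weight of zero, so an article that contains only some of the four terms still gets a well-defined set of coefficients.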

The second calculation module 204 is configured to calculate the first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient.

The result output module 205 is configured to output a search result page based on the first matching degree of each article.
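
The source does not state how the four weight coefficients are combined into the first matching degree. The sketch below assumes a weighted sum with hypothetical `importance` values and orders the articles from the largest matching degree to the smallest; the function names and sample coefficients are likewise illustrative.

```python
from typing import Dict, List, Tuple

Coefficients = Tuple[float, float, float, float]


def matching_degree(
    coeffs: Coefficients,
    importance: Coefficients = (0.2, 0.1, 0.5, 0.2),
) -> float:
    """Combine the four weight coefficients into one score.

    The weighted sum and the `importance` values are assumptions.
    """
    return sum(c * w for c, w in zip(coeffs, importance))


def rank_articles(article_coeffs: Dict[str, Coefficients]) -> List[Tuple[str, float]]:
    """Score every article and sort by matching degree, largest first."""
    scored = [(aid, matching_degree(c)) for aid, c in article_coeffs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical (first, second, third, fourth) weight coefficients per article.
article_coeffs = {
    "a1": (0.1, 0.1, 0.1, 0.1),
    "a2": (0.4, 0.2, 0.6, 0.3),
    "a3": (0.0, 0.0, 0.3, 0.0),
}
ranked = rank_articles(article_coeffs)
```

The ranked list can then be rendered as the search result page, with the best-matching article shown first.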

Example four

Refer to FIG. 8, which is a schematic diagram of the hardware architecture of the computer device according to the fourth embodiment of the present application. In this embodiment, the computer device 2 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers). As shown in FIG. 8, the computer device 2 includes at least, but is not limited to, a memory 21, a processor 22, a network interface 23, and a personalized search system 20 based on user portraits, which can be communicatively connected to each other through a system bus. Wherein:

In this embodiment, the memory 21 includes at least one type of non-volatile computer-readable storage medium. The readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, for example, a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card. Of course, the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device. In this embodiment, the memory 21 is generally used to store the operating system and various application software installed in the computer device 2, such as the program code of the personalized search system 20 based on user portraits of the third embodiment. In addition, the memory 21 can also be used to temporarily store various types of data that have been output or will be output.

In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 is generally used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is used to run the program code or process the data stored in the memory 21, for example, to run the personalized search system 20 based on user portraits, so as to implement the personalized search methods based on user portraits of the first and second embodiments.

The network interface 23 may include a wireless network interface or a wired network interface. The network interface 23 is generally used to establish a communication connection between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.

It should be pointed out that FIG. 8 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.

In this embodiment, the personalized search system 20 based on user portraits stored in the memory 21 may also be divided into one or more program modules, and the one or more program modules are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to implement this application.

For example, FIG. 7 shows a schematic diagram of the program modules of the third embodiment of the personalized search system 20 based on user portraits. In this embodiment, the personalized search system 20 based on user portraits can be divided into a first acquisition module 200, a second acquisition module 201, a third acquisition module 202, a first calculation module 203, a second calculation module 204, and a result output module 205. The program module referred to in this application is a series of computer-readable instruction segments that can complete specific functions. The specific functions of the program modules 200-205 have been described in detail in the third embodiment and are not repeated here.

Example five

This embodiment also provides a non-volatile computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, server, app store, etc., on which computer-readable instructions are stored, and the corresponding functions are realized when the instructions are executed by a processor. The non-volatile computer-readable storage medium of this embodiment is used to store the personalized search system 20 based on user portraits, and when executed by the processor, the following steps are implemented:

The embodiments of the present invention obtain the user portrait by analyzing the patient's personal information and associated information, then combine it with the patient's input information to obtain keywords and the category words related to those keywords, and retrieve the articles that contain the keywords and category words. The weight coefficients of the keywords and category words of each article in the article collection are then calculated, the matching degree of each article is obtained from these weight coefficients, and the articles are sorted by matching degree from largest to smallest, which improves the accuracy of the patient's article searches.

The serial numbers of the foregoing embodiments of the present application are for description only and do not represent the relative merits of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform. Of course, they can also be implemented by hardware, but in many cases the former is the better implementation.

The above are only preferred embodiments of this application and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included in the scope of patent protection of this application.

Claims

A personalized search method based on user portraits, including:

Acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;

Acquiring input information of the patient, and acquiring a second keyword and/or a second category word corresponding to the second keyword according to the input information;

Obtaining a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;

Calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword and/or the second category word;

Calculating the first matching degree between each article and the patient's search target according to the first weighting coefficient, the second weighting coefficient, the third weighting coefficient, and the fourth weighting coefficient; and

Based on the first matching degree of each article, the search result page is output.
The personalized search method according to claim 1, further comprising the step of obtaining the patient's interest category, including:

Acquiring personal information input by the patient through a terminal device, and querying the patient's associated information from a designated server based on the personal information, the associated information including historical operation records of the patient;

Constructing a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait tags corresponding to multiple dimensions;

Acquire the patient's interest category according to the user portrait.
The personalized search method according to claim 2, wherein the step of acquiring personal information input by the patient through a terminal device, and querying the patient's associated information from a designated server based on the personal information, further comprises pre-establishing a disease classification system, including:

Acquiring sample user information of multiple sample users and sample associated information of the sample users;

Using the TF-IDF model, extracting multiple sample keyword sets from the sample user information and the sample associated information of the sample users;

Taking the multiple sample keyword sets as the input of the first-layer sample neural network model, taking the first sample category word in the classification system as the output of the first-layer sample neural network model, and training the ability of the first-layer sample neural network model to predict the corresponding category word from keywords;

Stopping training once the m-th layer sample neural network model is reached, where 2≤m≤M, and M is the total number of sample category words included in the classification system;

Taking the training result of the (m-1)-th layer sample neural network model and the multiple sample keywords as the input of the m-th layer sample neural network model, taking the m-th sample category word in the classification system as the output of the m-th layer sample neural network model, and training the ability of the m-th layer sample neural network model to predict the corresponding category word from keywords;

Extracting multiple keywords from the personal information and the associated information of the patient through the TF-IDF model;

Inputting the multiple keywords into the sample neural network model, and, based on the ability of each layer of the sample neural network model to predict the corresponding category word from keywords, outputting multiple category words corresponding to the multiple keywords; and

Associating the multiple category words with the corresponding keywords to obtain the disease classification system, wherein the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; wherein the multiple category words include at least disease causes and disease medications.
The personalized search method according to claim 2, wherein the step of constructing a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait labels corresponding to multiple dimensions, includes:

Obtain the word vector of the first keyword of the personal information and related information through the word2vec model;

The word vector of the first keyword is input into a prediction model, and the correlation probability between the patient and each portrait label is output through the prediction model to obtain a user portrait of the patient.
The personalized search method according to claim 4, wherein the step of inputting the word vector of the first keyword into a prediction model, and outputting the correlation probability between the patient and each portrait label through the prediction model to obtain a user portrait of the patient, includes:

Establishing a mapping relationship between the disease classification system and the patient to form a user portrait of the patient, wherein each category word corresponds to a dimension and each keyword corresponds to a portrait label;

Inputting the word vector of the keyword set of the disease classification system into the prediction model, and calculating the first association probability according to the softmax layer of the prediction model;

Inputting the word vector of the category word of the disease classification system into the prediction model, and calculating the second association probability according to the softmax layer of the prediction model;

Obtaining the correlation probability of the user portrait of the patient according to the first association probability and the second association probability.
The personalized search method according to claim 2, wherein the step of constructing a user portrait of the patient based on the personal information and the associated information with the patient, further comprising:

The real-time historical operation information of the patient is analyzed, the patient's attention point is obtained according to the real-time historical operation information, and the attention point is mapped to the user portrait of the patient to update the user portrait of the patient.
The personalized search method according to claim 1, after the step of outputting the search result page based on the first matching degree of each article, further comprising:

Acquiring search information input by the patient through a terminal device, and acquiring, from a designated server according to the search information, a fifth keyword of the search information and a fifth category word corresponding to the fifth keyword;

Obtaining a second article collection according to the fifth keyword and the fifth category word, wherein the second article collection includes at least one article;

Calculating a fifth weight coefficient of the fifth keyword and a sixth weight coefficient of the fifth category word for each article in the second article collection, wherein each article in the second article collection includes the fifth keyword and/or the fifth category word;

Calculating the second matching degree of each article in the second article collection according to the fifth weight coefficient and the sixth weight coefficient;

Outputting a search result page based on the second matching degree of each article in the second article collection.
A personalized search system based on user portraits, including:

The first obtaining module is configured to obtain a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;

The second acquisition module is configured to acquire input information of the patient, and acquire a second keyword and/or a second category word corresponding to the second keyword according to the input information;

The third obtaining module is configured to obtain a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;

The first calculation module is configured to calculate, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword and/or the second category word;

The second calculation module is configured to calculate the first matching degree between each article and the patient's search target according to the first weight coefficient, the second weight coefficient, the third weight coefficient, and the fourth weight coefficient;

The result output module is used to output the search result page based on the first matching degree of each article.
The personalized search system according to claim 8, wherein the first obtaining module is further configured to:

Acquiring personal information input by the patient through a terminal device, and querying the patient's associated information from a designated server based on the personal information, the associated information including historical operation records of the patient;

Constructing a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait tags corresponding to multiple dimensions;

Acquire the patient's interest category according to the user portrait.
The personalized search system according to claim 9, wherein the first acquisition module is further configured to:

Acquire sample user information of multiple sample users and sample associated information of the sample users;

Using the TF-IDF model, extract multiple sample keyword sets from the sample user information and the sample associated information of the sample users;

Take the multiple sample keyword sets as the input of the first-layer sample neural network model, take the first sample category word in the classification system as the output of the first-layer sample neural network model, and train the ability of the first-layer sample neural network model to predict the corresponding category word from keywords;

Stop training once the m-th layer sample neural network model is reached, where 2≤m≤M, and M is the total number of sample category words included in the classification system;

Take the training result of the (m-1)-th layer sample neural network model and the multiple sample keywords as the input of the m-th layer sample neural network model, take the m-th sample category word in the classification system as the output of the m-th layer sample neural network model, and train the ability of the m-th layer sample neural network model to predict the corresponding category word from keywords;

Extract multiple keywords from the personal information and the associated information of the patient through the TF-IDF model;

Input the multiple keywords into the sample neural network model, and, based on the ability of each layer of the sample neural network model to predict the corresponding category word from keywords, output multiple category words corresponding to the multiple keywords; and

Associate the multiple category words with the corresponding keywords to obtain the disease classification system, wherein the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; wherein the multiple category words include at least disease causes and disease medications.
The personalized search system according to claim 9, wherein the first acquisition module is further configured to:

Obtain the word vector of the first keyword of the personal information and related information through the word2vec model;

The word vector of the first keyword is input into a prediction model, and the correlation probability between the patient and each portrait label is output through the prediction model to obtain a user portrait of the patient.
The personalized search system according to claim 11, wherein the first obtaining module is further configured to:

Establish a mapping relationship between the disease classification system and the patient to form a user portrait of the patient, wherein each category word corresponds to a dimension and each keyword corresponds to a portrait label;

Input the word vector of the keyword set of the disease classification system into the prediction model, and calculate the first association probability according to the softmax layer of the prediction model;

Input the word vectors of the category words of the disease classification system into the prediction model, and calculate the second association probability according to the softmax layer of the prediction model;

Obtain the correlation probability of the user portrait of the patient according to the first association probability and the second association probability.
The personalized search system according to claim 9, wherein the first acquisition module is further configured to:

The real-time historical operation information of the patient is analyzed, the patient's attention point is obtained according to the real-time historical operation information, and the attention point is mapped to the user portrait of the patient to update the user portrait of the patient.
A computer device comprising a memory and a processor, wherein the memory stores computer-readable instructions for personalized search based on a user portrait that can run on the processor, and the computer-readable instructions, when executed by the processor, implement the following steps:

Acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;

Acquiring input information of the patient, and acquiring a second keyword and/or a second category word corresponding to the second keyword according to the input information;

Obtaining a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;

Calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword and/or the second category word;

Calculating the first matching degree between each article and the patient's search target according to the first weighting coefficient, the second weighting coefficient, the third weighting coefficient, and the fourth weighting coefficient; and

Based on the first matching degree of each article, the search result page is output.
The computer device according to claim 14, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:

Acquiring personal information input by the patient through a terminal device, and querying the patient's associated information from a designated server based on the personal information, the associated information including the patient's historical operation records;

Constructing a user portrait of the patient based on the personal information and the associated information, the user portrait including multiple portrait tags corresponding to multiple dimensions;

Acquire the patient's interest category according to the user portrait.
The computer device according to claim 15, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:

Acquiring sample user information of multiple sample users and sample associated information of the sample users;

Using the TF-IDF model, extracting multiple sample keyword sets from the sample user information and the sample associated information of the sample users;

Taking the multiple sample keyword sets as the input of the first-layer sample neural network model, taking the first sample category word in the classification system as the output of the first-layer sample neural network model, and training the ability of the first-layer sample neural network model to predict the corresponding category word from keywords;

Stopping training once the m-th layer sample neural network model is reached, where 2≤m≤M, and M is the total number of sample category words included in the classification system;

Taking the training result of the (m-1)-th layer sample neural network model and the multiple sample keywords as the input of the m-th layer sample neural network model, taking the m-th sample category word in the classification system as the output of the m-th layer sample neural network model, and training the ability of the m-th layer sample neural network model to predict the corresponding category word from keywords;

Extracting multiple keywords from the personal information and the associated information of the patient through the TF-IDF model;

Inputting the multiple keywords into the sample neural network model, and, based on the ability of each layer of the sample neural network model to predict the corresponding category word from keywords, outputting multiple category words corresponding to the multiple keywords; and

Associating the multiple category words with the corresponding keywords to obtain the disease classification system, wherein the disease classification system includes multiple sub-disease categories, each sub-disease category includes multiple category words, and each category word corresponds to at least one keyword set; wherein the multiple category words include at least disease causes and disease medications.
The computer device according to claim 15, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:

Obtain the word vector of the first keyword of the personal information and related information through the word2vec model;

The word vector of the first keyword is input into a prediction model, and the correlation probability between the patient and each portrait label is output through the prediction model to obtain a user portrait of the patient.
The computer device according to claim 17, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:

Establishing a mapping relationship between the disease classification system and the patient to form a user portrait of the patient, wherein each category word corresponds to a dimension and each keyword corresponds to a portrait label;

Inputting the word vector of the keyword set of the disease classification system into the prediction model, and calculating the first association probability according to the softmax layer of the prediction model;

Inputting the word vector of the category word of the disease classification system into the prediction model, and calculating the second association probability according to the softmax layer of the prediction model;

Obtaining the correlation probability of the user portrait of the patient according to the first association probability and the second association probability.
[Corrected according to Rule 91 09.01.2020]
A computer device comprising a memory and a processor, wherein the memory stores computer-readable instructions that can run on the processor, and the computer-readable instructions, when executed by the processor, implement the following steps:

Acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;

Acquiring input information of the patient, and acquiring a second keyword and/or a second category word corresponding to the second keyword according to the input information;

Obtaining a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;

Calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword and/or the second category word;

Calculating the first matching degree between each article and the patient's search target according to the first weighting coefficient, the second weighting coefficient, the third weighting coefficient, and the fourth weighting coefficient; and

Based on the first matching degree of each article, the search result page is output.
A non-volatile computer-readable storage medium in which computer-readable instructions are stored, wherein the computer-readable instructions can be executed by at least one processor to cause the at least one processor to perform the following steps:

Acquiring a first keyword and/or a first category word mapped to the first keyword, the first keyword being associated with the patient's interest category;

Acquiring input information of the patient, and acquiring a second keyword and/or a second category word corresponding to the second keyword according to the input information;

Obtaining a first article collection according to the first keyword, the first category word, the second keyword, and the second category word, where the first article collection includes at least one article;

Calculating, for each article in the first article set, a first weight coefficient of the first keyword, a second weight coefficient of the first category word, a third weight coefficient of the second keyword, and a fourth weight coefficient of the second category word, wherein each article includes the first keyword, the first category word, the second keyword and/or the second category word;

Calculating the first matching degree between each article and the patient's search target according to the first weighting coefficient, the second weighting coefficient, the third weighting coefficient, and the fourth weighting coefficient; and

Based on the first matching degree of each article, the search result page is output.