WO2021237707A1 - Question recommendation method and device, system, electronic device, and readable storage medium - Google Patents

Question recommendation method and device, system, electronic device, and readable storage medium

Info

Publication number
WO2021237707A1
Authority
WO
WIPO (PCT)
Prior art keywords
question
user
candidate
questions
data
Prior art date
Application number
PCT/CN2020/093390
Other languages
English (en)
French (fr)
Inventor
王瑜
贺王强
王洪
王玉峰
雷一鸣
Original Assignee
京东方科技集团股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司
Priority to PCT/CN2020/093390 (WO2021237707A1)
Priority to CN202080000849.2A (CN114072782A)
Priority to EP20900702.0A (EP4002141A4)
Priority to JP2022504676A (JP2023535849A)
Priority to US17/281,310 (US20220198300A1)
Publication of WO2021237707A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/335: Filtering based on additional data, e.g. user or group profiles
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00: Computing arrangements using knowledge-based models
    • G06N5/04: Inference or reasoning models
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/332: Query formulation
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • the embodiments of the present disclosure relate to a question recommendation method, a question recommendation device, a question recommendation system, an electronic device, and a non-transitory readable storage medium.
  • At least one embodiment of the present disclosure provides a question recommendation method, including: obtaining a candidate question set of a user, wherein the candidate question set includes a plurality of candidate questions; obtaining user behavior data, and obtaining a user interest parameter based on the user behavior data; based on the user interest parameter and the plurality of candidate questions, obtaining at least one similarity feature between each candidate question of the plurality of candidate questions and the user interest parameter; based on basic user information, the plurality of candidate questions, and the at least one similarity feature, sorting the plurality of candidate questions to obtain a question sequence; and, based on the order of the question sequence, recommending at least one candidate question in the question sequence to the user.
  • sorting the multiple candidate questions based on the basic user information, the multiple candidate questions, and the at least one similarity feature to obtain a question sequence includes: using a ranking model, composing the basic user information, the multiple candidate questions, and the at least one similarity feature into an input feature vector of the ranking model to obtain a score corresponding to each of the multiple candidate questions, and sorting the multiple candidate questions according to the size of the scores to obtain the question sequence.
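The score-then-sort flow described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the fixed linear scorer stands in for a trained ranking model, and the function names, weights, and feature layout are all assumptions.

```python
from typing import Callable, List, Sequence, Tuple


def rank_questions(
    questions: Sequence[str],
    feature_vectors: Sequence[Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
) -> List[Tuple[str, float]]:
    """Score every candidate question and return (question, score) pairs,
    ordered from the highest score to the lowest."""
    scored = [(q, score_fn(fv)) for q, fv in zip(questions, feature_vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in for a trained ranking model: a fixed linear scorer.
WEIGHTS = [0.5, 0.3, 0.2]


def linear_score(features: Sequence[float]) -> float:
    return sum(w * x for w, x in zip(WEIGHTS, features))


# Each input feature vector would combine basic user information, the
# candidate question, and its similarity features (values here are made up).
questions = ["q_diet", "q_exercise", "q_medication"]
features = [[0.9, 0.1, 0.0], [0.2, 0.8, 0.5], [0.1, 0.1, 0.1]]
question_sequence = rank_questions(questions, features, linear_score)
```

Passing the scorer in as a callable keeps the sorting logic independent of whichever ranking model is actually used.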
  • obtaining, based on the user interest parameter and the multiple candidate questions, at least one similarity feature between each candidate question of the multiple candidate questions and the user interest parameter includes: using at least one similarity matching model to obtain the at least one similarity feature between each candidate question and the user interest parameter.
  • the at least one similarity matching model includes at least one of: a cosine similarity model, a Jaccard similarity model, an edit distance similarity model, a word mover's distance similarity model, and a deep semantic matching similarity model.
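Three of the listed similarity measures (cosine, Jaccard, and edit distance) can be computed with the standard library alone, as sketched below. Whitespace tokenization, lower-casing, and the normalization of edit distance into a similarity in [0, 1] are assumptions made for illustration; the word mover's distance and deep semantic matching models need trained embeddings and are not shown.

```python
import math
from collections import Counter
from typing import List, Sequence


def cosine_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Cosine similarity between term-frequency vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def jaccard_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Size of the token intersection divided by the size of the union."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance, computed with one rolling row of the DP table."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (cs != ct)))
        prev = curr
    return prev[-1]


def edit_similarity(s: str, t: str) -> float:
    """Edit distance rescaled so that identical strings score 1.0."""
    longest = max(len(s), len(t))
    return 1.0 - edit_distance(s, t) / longest if longest else 1.0


def similarity_features(interest: str, question: str) -> List[float]:
    """Similarity features between one interest text and one candidate question."""
    a, b = interest.lower().split(), question.lower().split()
    return [
        cosine_similarity(a, b),
        jaccard_similarity(a, b),
        edit_similarity(interest.lower(), question.lower()),
    ]
```

Each candidate question then contributes one such feature list to the input of the ranking model.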
  • the ranking model includes a Wide&Deep model.
  • obtaining a candidate question set of the user includes: accessing a data knowledge base, where the data knowledge base includes a plurality of knowledge question sets; obtaining basic user information, and establishing a user tag set based on the basic user information; and associating the user tag set with the data knowledge base to obtain the candidate question set from the plurality of knowledge question sets.
  • the user tag set includes a multi-level tag set
  • the multi-level tag set includes tags at multiple levels
  • tags at different levels are of different types.
  • the question recommendation method is used to recommend questions about diseases
  • the first-level tag of the multi-level tag set is the age group
  • the second-level tag is the time period
  • the third-level tag is the disease type
  • the fourth-level tag is the complication.
  • each of the plurality of knowledge question sets includes: a standard question, a standard answer corresponding to the standard question, and an extended question corresponding to the standard question.
  • obtaining the candidate question set of the user further includes: establishing the data knowledge base.
  • the establishment of the data knowledge base includes: crawling a data set from the network, and classifying the data set according to intent to form the multiple knowledge question sets, so as to establish the data knowledge base.
  • the question recommendation method is used to recommend questions about a disease, and the data set comes from at least one of: a consultation data set between doctors and patients, hot questions associated with the disease, and reward questions associated with the disease.
  • associating the user tag set with the data knowledge base, and obtaining a candidate question set from the plurality of knowledge question sets includes: establishing a mapping relationship between the user tag set and the standard questions in the data knowledge base, matching the user tag set with the standard questions in the data knowledge base, and composing the candidate question set from the knowledge question sets corresponding to the matched standard questions.
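One simple way to realize such a mapping is a lookup table from each standard question to the tags it requires, as in the sketch below. The identifiers, tag names, and the "all required tags present" matching rule are hypothetical; the disclosure does not fix a particular matching rule here.

```python
from typing import Dict, List, Set

# Hypothetical mapping: a standard question matches when every tag it
# requires is present in the user tag set.
STANDARD_QUESTION_TAGS: Dict[str, Set[str]] = {
    "sq_diet": {"diabetes"},
    "sq_foot_care": {"diabetes", "diabetic foot"},
    "sq_blood_pressure": {"hypertension"},
}

# Hypothetical knowledge question sets, keyed by standard question id.
KNOWLEDGE_QUESTION_SETS: Dict[str, List[str]] = {
    "sq_diet": [
        "Which food should the diabetic patient eat?",
        "What foods are good for diabetes?",
    ],
    "sq_foot_care": ["How should a diabetic foot be cared for?"],
    "sq_blood_pressure": ["How often should blood pressure be measured?"],
}


def candidate_question_set(user_tags: Set[str]) -> List[str]:
    """Collect the questions of every knowledge question set whose
    standard question matches the user tag set."""
    candidates: List[str] = []
    for sq_id, required in STANDARD_QUESTION_TAGS.items():
        if required <= user_tags:  # subset test: all required tags present
            candidates.extend(KNOWLEDGE_QUESTION_SETS[sq_id])
    return candidates
```

For a user tagged `{"diabetes", "old age"}`, only the `sq_diet` set matches, so its two questions form the candidate question set.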
  • obtaining the candidate question set of the user includes: retrieving a pre-stored candidate question set of the user.
  • obtaining the user interest parameter based on the user behavior data includes: analyzing the user behavior data, and converting the questions the user has clicked or the words and sentences the user is interested in into the user interest parameter.
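As one possible reading of this conversion, the sketch below turns a list of clicked questions into a short list of the most frequent content words. The keyword-frequency representation, the stop-word list, and the `top_k` parameter are assumptions; the disclosure does not prescribe how the user interest parameter is encoded.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical stop-word list for the illustration.
STOPWORDS = {"what", "which", "are", "for", "the", "to", "should", "a", "is"}


def user_interest_parameter(clicked_questions: Iterable[str], top_k: int = 5) -> List[str]:
    """Return the top_k most frequent non-stop-word tokens in the clicked questions."""
    tokens: List[str] = []
    for question in clicked_questions:
        tokens.extend(re.findall(r"[a-z0-9]+", question.lower()))
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


clicks = ["What foods are good for diabetes?", "Which foods lower blood sugar?"]
interest = user_interest_parameter(clicks, top_k=3)
```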
  • At least one embodiment of the present disclosure also provides a question recommendation device, including: a set acquisition circuit configured to acquire a user's candidate question set, wherein the candidate question set includes a plurality of candidate questions; a behavior analysis circuit configured to obtain user behavior data and obtain a user interest parameter based on the user behavior data; a feature generation circuit configured to obtain, based on the user interest parameter and the plurality of candidate questions, at least one similarity feature between each of the plurality of candidate questions and the user interest parameter; a question sorting circuit configured to sort the plurality of candidate questions based on basic user information, the plurality of candidate questions, and the at least one similarity feature to obtain a question sequence; and a recommendation circuit configured to recommend at least one candidate question in the question sequence to the user based on the order of the question sequence.
  • the question sorting circuit includes: a question sorting sub-circuit configured to: use a ranking model, composing the basic user information, the multiple candidate questions, and the at least one similarity feature into the input feature vector of the ranking model, to obtain the score corresponding to each of the multiple candidate questions, and sort the multiple candidate questions according to the size of the scores to obtain the question sequence.
  • the set acquisition circuit includes: a knowledge base access circuit configured to access a data knowledge base, wherein the data knowledge base includes a plurality of knowledge question sets;
  • the information acquisition circuit is configured to acquire basic user information and establish a user tag set based on the basic user information;
  • the candidate set generation circuit is configured to associate the user tag set with the data knowledge base, and obtain a candidate question set from the plurality of knowledge question sets, wherein the candidate question set includes a plurality of candidate questions;
  • the set acquisition circuit further includes: a knowledge base establishment circuit configured to crawl a data set from the network and classify the data set according to intent to form the multiple knowledge question sets, thereby establishing the data knowledge base.
  • the candidate set generation circuit includes: a candidate set generation sub-circuit configured to establish a mapping relationship between the user tag set and the standard questions of the data knowledge base, match the user tag set with the standard questions in the data knowledge base, and compose the candidate question set from the knowledge question sets corresponding to the matched standard questions.
  • the behavior analysis circuit includes: a behavior analysis sub-circuit configured to analyze the user behavior data, and convert the questions the user has clicked or the words and sentences the user is interested in into the user interest parameter.
  • the feature generation circuit includes: a feature generation sub-circuit configured to use at least one similarity matching model, based on the user interest parameter and the multiple candidate questions, to obtain at least one similarity feature between each candidate question and the user interest parameter.
  • At least one embodiment of the present disclosure also provides a question recommendation system, including a terminal and a question recommendation server.
  • the terminal is configured to send request data to the question recommendation server;
  • the question recommendation server is configured to, in response to the request data: obtain a candidate question set of a user, wherein the candidate question set includes a plurality of candidate questions; obtain user behavior data, and obtain a user interest parameter based on the user behavior data; based on the user interest parameter and the plurality of candidate questions, obtain at least one similarity feature between each of the plurality of candidate questions and the user interest parameter; and, based on basic user information, the plurality of candidate questions, and the at least one similarity feature, sort the plurality of candidate questions to obtain a question sequence; the terminal is also configured to display the first N candidate questions in the question sequence, where N is an integer greater than or equal to 1.
  • At least one embodiment of the present disclosure also provides an electronic device, including: a processor and a memory, the memory including one or more computer program modules; wherein the one or more computer program modules are stored in the memory and are configured to be executed by the processor, and the one or more computer program modules include instructions for executing the question recommendation method described in any one of the foregoing embodiments.
  • At least one embodiment of the present disclosure also provides a non-transitory readable storage medium having computer instructions stored thereon, wherein the computer instructions, when executed by a processor, execute the question recommendation method described in any one of the foregoing embodiments.
  • FIG. 1A is an exemplary flowchart of a question recommendation method provided by at least one embodiment of the present disclosure
  • FIG. 1B is an exemplary flowchart of another question recommendation method provided by at least one embodiment of the present disclosure
  • FIG. 2A shows a user interface of a certain platform according to at least one embodiment of the present disclosure
  • FIG. 2B shows a schematic diagram of establishing a user tag set according to at least one embodiment of the present disclosure
  • FIG. 3A shows a schematic diagram of a rule scheme between a user tag set and a data knowledge base according to at least one embodiment of the present disclosure
  • FIG. 3B shows another user interface of a certain platform according to at least one embodiment of the present disclosure
  • FIG. 4A is a schematic structural diagram of a Wide&Deep model provided by at least one embodiment of the present disclosure
  • FIG. 4B shows still another user interface of a certain platform according to at least one embodiment of the present disclosure
  • FIG. 4C shows still another user interface of a certain platform according to at least one embodiment of the present disclosure
  • FIG. 5A is an exemplary flowchart of another question recommendation method provided by at least one embodiment of the present disclosure.
  • FIG. 5B is a schematic block diagram of the question recommendation method in FIG. 5A provided by at least one embodiment of the present disclosure
  • FIG. 6 is a schematic block diagram of a question recommendation device provided by at least one embodiment of the present disclosure.
  • FIG. 7 is a schematic block diagram of a question recommendation system provided by at least one embodiment of the present disclosure.
  • FIG. 8 is a schematic block diagram of an electronic device provided by at least one embodiment of the present disclosure.
  • FIG. 9 is a schematic block diagram of a terminal provided by at least one embodiment of the present disclosure.
  • FIG. 10 is a schematic block diagram of a non-transitory readable storage medium provided by at least one embodiment of the present disclosure.
  • FIG. 11 shows an exemplary scene diagram of a question recommendation system provided by at least one embodiment of the present disclosure.
  • Chronic disease is a general term for non-communicable diseases in which damage accumulates over a long period.
  • Common chronic diseases mainly include cardiovascular and cerebrovascular diseases, cancer, diabetes, and chronic respiratory diseases.
  • Cardiovascular and cerebrovascular diseases include hypertension, stroke and coronary heart disease.
  • Statistics show that one of the causes of chronic diseases is an unhealthy lifestyle.
  • Unhealthy lifestyles include an irrational diet, insufficient exercise, tobacco use, and excessive alcohol use. Therefore, in addition to medical treatment (for example, treatment with drugs), doctors need to provide patients with chronic diseases with reasonable advice (for example, dietary advice, exercise advice, etc.).
  • At least one embodiment of the present disclosure provides a question recommendation method, a question recommendation device, a question recommendation system, an electronic device, and a non-transitory readable storage medium.
  • the question recommendation method includes: obtaining a user's candidate question set, which includes multiple candidate questions; obtaining user behavior data, and obtaining a user interest parameter based on the user behavior data; obtaining, based on the user interest parameter and the multiple candidate questions, at least one similarity feature between each of the candidate questions and the user interest parameter; sorting the multiple candidate questions based on basic user information, the multiple candidate questions, and the at least one similarity feature to obtain a question sequence; and recommending at least one candidate question in the question sequence to the user based on the order of the question sequence.
  • the question recommendation method provided by at least one embodiment of the present disclosure not only effectively avoids inappropriate answers that may result from a patient's unclear expression, but can also make personalized question recommendations based on individual factors and the user's own characteristics (for example, basic user information, click behavior, browsing behavior, etc.), so that users can acquire knowledge related to their own health in a more targeted way.
  • by using a ranking model, the question recommendation method can also make the question recommendations relevant, personalized, and diversified while taking the final feedback order into account, so that the results more closely meet users' needs and the user experience is effectively improved.
  • FIG. 1A is an exemplary flow chart of a question recommendation method provided by at least one embodiment of the present disclosure
  • FIG. 1B is an exemplary flow chart of another question recommendation method provided by at least one embodiment of the present disclosure.
  • the question recommendation method 10 provided by at least one embodiment of the present disclosure can be applied to scenarios such as medical intelligent question and answer, online consultation, health consultation, etc., for example, can be applied to a chronic disease health knowledge intelligent question answering system.
  • the question recommendation method 10 may include the following operations:
  • Step S100: Obtain a user's candidate question set, where the candidate question set includes multiple candidate questions;
  • Step S140: Obtain user behavior data, and obtain a user interest parameter based on the user behavior data;
  • Step S150: Based on the user interest parameter and the multiple candidate questions, obtain at least one similarity feature between each candidate question of the multiple candidate questions and the user interest parameter;
  • Step S160: Sort the multiple candidate questions based on the user's basic information, the multiple candidate questions, and the at least one similarity feature to obtain a question sequence;
  • Step S170: Recommend at least one candidate question in the question sequence to the user based on the order of the question sequence.
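The steps above can be read as a small pipeline. The sketch below wires steps S140 to S170 together with the per-step logic passed in as callables; the function name, the parameter names, and the toy stand-ins used in the example are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Sequence


def recommend_questions(
    candidates: Sequence[str],                      # output of step S100
    behavior_data: Sequence[str],
    basic_info: Dict[str, object],
    to_interest: Callable[[Sequence[str]], object],
    similarity: Callable[[object, str], List[float]],
    score: Callable[[Dict[str, object], str, List[float]], float],
    top_n: int = 3,
) -> List[str]:
    interest = to_interest(behavior_data)                           # step S140
    feats = {q: similarity(interest, q) for q in candidates}        # step S150
    ranked = sorted(                                                # step S160
        candidates,
        key=lambda q: score(basic_info, q, feats[q]),
        reverse=True,
    )
    return ranked[:top_n]                                           # step S170


# Toy stand-ins: interest is a set of words, similarity is the word overlap.
result = recommend_questions(
    candidates=["diet for diabetes", "exercise tips", "diabetes diet plan"],
    behavior_data=["diabetes diet"],
    basic_info={"age": 68},
    to_interest=lambda clicks: set(" ".join(clicks).lower().split()),
    similarity=lambda interest, q: [float(len(interest & set(q.lower().split())))],
    score=lambda info, q, f: f[0],
    top_n=2,
)
```

Because Python's sort is stable even with `reverse=True`, candidates with equal scores keep their original relative order.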
  • step S100 in the question recommendation method 10 may specifically include steps S110-S130, so the question recommendation method 10 may specifically include steps S110-S170, as shown in the corresponding flowchart.
  • steps S110-S170 can be performed sequentially or in another adjusted order.
  • for example, step S110 can be performed first and then step S120, or step S120 can be performed first and then step S110.
  • part or all of the operations in steps S110-S160 may also be executed in parallel.
  • for example, step S110 and step S120 may be executed in parallel.
  • the embodiments of the present disclosure do not limit the execution order of the steps, which can be adjusted according to actual conditions.
  • step S110 to step S160 can be implemented by a server or a local end, which is not limited in the embodiment of the present disclosure.
  • implementing the question recommendation method 10 provided by at least one embodiment of the present disclosure may selectively perform some of the steps in steps S110-S170, or may perform some additional steps in addition to steps S110-S170. The embodiment does not specifically limit this.
  • Step S110 Access a data knowledge base, where the data knowledge base includes multiple knowledge question sets.
  • the data knowledge base may be a data knowledge base associated with chronic diseases (for example, diabetes, hypertension, etc.), and it includes complete basic health knowledge related to chronic diseases.
  • the data knowledge base includes multiple knowledge question sets.
  • the basic health knowledge in the data knowledge base is classified according to intent (such as diet, exercise, medication, examination, complications, surgery, treatment, symptoms, etc.), and under each intent the standard questions and the corresponding extended questions and standard answers are organized (for example, manually) to form a knowledge question set, thereby forming the data knowledge base.
  • each of the plurality of knowledge question sets may include a standard question, a standard answer corresponding to the standard question, and an extended question corresponding to the standard question.
  • for example, the user is a diabetic patient.
  • under the "diet" intent, the organized standard question may include "Which food should the diabetic patient eat?"
  • the extended questions corresponding to the standard question may include, for example, "What foods are good for diabetes?", "What foods are suitable for diabetes?", "Which foods should diabetes eat to lower blood sugar?" and so on.
  • the knowledge question set under the above-mentioned "diet" intent may include the aforementioned standard question, the extended questions, and their corresponding standard answer.
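A knowledge question set of this shape can be modeled as a small record type. The sketch below uses the "diet" example from the text; the class and field names are assumptions, and the standard answer is left as a placeholder because the text does not give one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KnowledgeQuestionSet:
    """One knowledge question set: a standard question, its standard answer,
    and the extended questions that paraphrase the standard question."""
    intent: str
    standard_question: str
    standard_answer: str
    extended_questions: List[str] = field(default_factory=list)

    def all_questions(self) -> List[str]:
        """Standard question followed by its extended questions."""
        return [self.standard_question, *self.extended_questions]


diet_set = KnowledgeQuestionSet(
    intent="diet",
    standard_question="Which food should the diabetic patient eat?",
    standard_answer="<standard answer text>",  # placeholder, not from the source
    extended_questions=[
        "What foods are good for diabetes?",
        "What foods are suitable for diabetes?",
        "Which foods should diabetes eat to lower blood sugar?",
    ],
)
```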
  • the standard questions and extended questions listed in the embodiments of the present disclosure are only illustrative, and can be adjusted and updated according to application scenarios and medical practice.
  • the standard answer can also be adjusted and updated according to application scenarios, medical practice, etc., and according to the language used (for example, Chinese or English), which is not specifically limited in the embodiments of the present disclosure.
  • the data knowledge base may be established in advance; the established data knowledge base may be pre-stored locally or on the server and used by the server when implementing the question recommendation method 10, or it can be read from other devices.
  • the embodiment of the present disclosure does not specifically limit this, and can be set according to actual needs. A detailed description of the establishment of a data knowledge base will be explained below.
  • Step S120 Obtain basic user information, and establish a user tag set based on the basic user information.
  • the user's basic information may include, for example, the user's age, gender, height, weight, waist circumference, and lifestyle habits.
  • the basic user information also includes the type of diabetes, confirmed chronic complications, existing symptoms, and so on.
  • the user's basic information also includes the user's medical history, fasting blood glucose level, or two-hour postprandial blood glucose level, etc.
  • the user's basic information also includes the user's diastolic blood pressure, systolic blood pressure, type of hypertension comorbidity, symptoms, and so on.
  • the embodiments of the present disclosure do not specifically limit the content included in the user's basic information, and can be set according to actual needs.
  • the user's basic information may come from the user's online real-time information; for example, if permission is obtained, it may come from an online health record platform, an information management system of a medical institution such as a hospital or a physical examination institution (e.g., a laboratory information management system), an electronic device for physical examination reports, etc.
  • when the user is a registered user of a specific software platform (for example, a health management platform) and has actively filled in and saved his or her basic information (for example, name, gender, age, height, weight, waist circumference, living habits, etc.), the user's basic information can be obtained directly from the storage library (for example, the backend) associated with the specific platform.
  • when the user is not a registered user of the specific platform (for example, the health management platform), or has not completed or saved his or her basic information on the specific platform, the basic user information can be collected through a third-party platform (for example, the information management system of a hospital or a medical examination institution) or related electronic devices (for example, bracelets, smart watches, etc.).
  • the embodiments of the present disclosure do not specifically limit this, and can be adjusted according to actual conditions.
  • the user's basic information can also be obtained from the personal information entered by the user in a text box, which can be adjusted according to the actual situation.
  • Fig. 2A shows a user interface of a certain platform according to an embodiment of the present disclosure.
  • users can fill in basic information according to their own situation (for example, name, gender, age, height, weight, and waist circumference) and then, from the lifestyle options provided on the user interface (for example, frequent drinking (more than three times a week), smoking, a salty diet, a fondness for fried foods, a fondness for sweets, often staying up late (falling asleep later than 12 o'clock on average), rarely exercising, etc.), check the habits that apply to them; this basic user information is then saved in a database (for example, the backend) associated with the health management platform.
  • the health management platform may interface with at least one (for example, multiple) medical institutions, and obtain basic information and examination results of patients participating in physical examinations at these medical institutions from at least one (for example, multiple) medical institutions.
  • the health management platform can also obtain basic user information from smart terminals (for example, smart measuring instruments, smart bracelets, smart watches, smart clothes, etc.), as well as at least one physical sign data of the patient detected by the sensors in the smart terminal (For example, pulse, body temperature, heart rate, respiration, brain electricity, electrocardiogram, blood pressure, blood sugar, myoelectricity, etc.).
  • the health management platform may periodically (for example, daily) obtain the examination results of patients participating in physical examination at these medical institutions from multiple medical institutions, and store them in a database (or memory) associated with the health management platform in advance.
  • the embodiments of the present disclosure do not specifically limit the source of the user's basic information, and can be set according to actual needs.
  • the user's basic information can also be filled in by the user online, for example, a corresponding web page can be provided, and the user fills in his own information in the web page, and the web page sends the information filled in by the user to the server, and the server organizes the information Then get the user's basic information.
  • a tag set is established based on the acquired basic user information.
  • the user's label may include: the user's gender, name, age, and so on.
  • in one example, the label also includes the name of the disease, such as type I diabetes; in another example, the label also includes the name of the complication, such as diabetic foot.
  • FIG. 2B shows a schematic diagram of establishing a user tag set according to at least one embodiment of the present disclosure.
  • the dashed box on the left is the user portrait, that is, the user's basic information.
  • the user's basic information includes: "User is diabetic A", "Height 172cm", "Weight 81kg", "Male", "68 years old", "Memory loss", "Long-term leg ulcers", "Thickened and enlarged toes", "Big toe bone protruding", "Often drinks alcohol and smokes", "Rarely exercises".
  • the label is obtained through entity recognition and other methods.
  • the entity recognition method includes, for example, conditional random fields, deep learning, and other methods for obtaining the labels.
  • the user tag set includes: "blood sugar", "old age", "overweight", "complication", "recipe", "neuropathy" and "diabetic foot".
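The text obtains these tags through entity recognition (for example, conditional random fields or deep learning). As a much simpler stand-in, the sketch below derives a few of the tags with hand-written rules. The thresholds (age 60 and above for "old age", a body mass index of 25 and above for "overweight"), the symptom keyword table, and the field names are all assumptions made for illustration, not part of the disclosure.

```python
from typing import Dict, List

# Hypothetical keyword table mapping symptom phrases to tags.
SYMPTOM_TAGS: Dict[str, str] = {
    "leg ulcers": "diabetic foot",
    "memory loss": "neuropathy",
}


def build_user_tags(info: Dict[str, object]) -> List[str]:
    """Derive a user tag set from basic user information with simple rules."""
    tags: List[str] = []
    if info["age"] >= 60:                      # assumed threshold
        tags.append("old age")
    bmi = info["weight_kg"] / (info["height_cm"] / 100) ** 2
    if bmi >= 25:                              # assumed threshold
        tags.append("overweight")
    for keyword, tag in SYMPTOM_TAGS.items():
        matched = any(keyword in s.lower() for s in info["symptoms"])
        if matched and tag not in tags:
            tags.append(tag)
    return tags


user_a = {
    "age": 68,
    "height_cm": 172,
    "weight_kg": 81,
    "symptoms": ["Memory loss", "Long-term leg ulcers"],
}
tags = build_user_tags(user_a)
```

With the example profile from the text (68 years old, 172 cm, 81 kg), the body mass index is about 27.4, so both "old age" and "overweight" are produced.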
  • the user tag set may include a multi-level tag set; the multi-level tag set includes tags at multiple levels, and tags at different levels are of different types.
  • nearly two hundred basic rules can be established according to different ages, time periods, disease types, presence or absence of symptoms, presence or absence of complications, etc., and then spread to different diabetes types and multiple types.
  • the first-level tag of the multi-level tag set is age
  • the second-level tag is time period
  • the third-level tag is disease type
  • the fourth-level tag is complication.
  • Step S130 Associate the user tag set with the data knowledge base, and obtain a candidate question set from multiple knowledge question sets, where the candidate question set includes multiple candidate questions.
  • the user tag set established in step S120 can be matched with the data knowledge base accessed in step S110, and the candidate question set can be obtained through rule association.
  • Fig. 3A shows a schematic diagram of a rule scheme between a user tag set and a data knowledge base according to at least one embodiment of the present disclosure.
  • the operation in step S130 will be described in detail below with reference to FIG. 3A, taking the basic diabetic rule scheme as an example.
  • the age group is used as the first-level label and is associated directly with the user (for example, a diabetic patient) by rule.
  • the first-level label can be associated with three categories: minors under 18 years of age (newborns and children), adults aged 18-59, and middle-aged and elderly people aged 60 and above.
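  • The three age categories (under 18, 18-59, and 60 and above) can be sketched as a small helper. The function name `age_group` and the returned label strings are hypothetical.

```python
def age_group(age_years: int) -> str:
    """Map an age in years to one of the three first-level categories."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years < 18:
        return "minor"    # newborns and children
    if age_years < 60:
        return "adult"    # 18-59 years of age
    return "elderly"      # 60 years of age and above


print(age_group(68))
```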
  • the corresponding labels include "type 1, type 2 diabetes", "children", and "6:00-8:00"
  • the corresponding multiple candidate questions can include: "What should children with diabetes eat for breakfast?", "What are the precautions for children with type 1 and type 2 diabetes after breakfast?", "What equipment do diabetic patients use to measure blood sugar?", etc.
  • the corresponding labels include "type 1, type 2 diabetes", "children", and "11:00-15:00"
  • the corresponding multiple candidate questions can include: "What should children with diabetes eat for lunch?", "What are the precautions for children with type 1 and type 2 diabetes after lunch?", "What exercises are suitable for children with type 1 and type 2 diabetes after lunch?", etc.
  • the corresponding labels include "type 1, type 2 diabetes", "children", and "17:00-20:00"
  • the corresponding multiple candidate questions can include: "What should patients with type 1 and type 2 diabetes eat for dinner?", "What are the precautions for diabetic patients after dinner?", "What kind of exercise is suitable for children with diabetes after dinner?", "How do children with type 1 and type 2 diabetes take medicine at night?", etc. It should be noted that the recommended questions listed in the embodiments of the present disclosure are only exemplary, and not restrictive.
  • the corresponding labels include "type 1, type 2 diabetes", "adult", and "6:00-8:00"
  • the corresponding multiple candidate questions can include: "What should adult diabetic patients eat for breakfast?", "What are the precautions for adult patients with type 1 and type 2 diabetes after breakfast?", "Can adult diabetic patients drink milk in the morning?", etc.
  • the corresponding labels include "type 1, type 2 diabetes", "adult", and "11:00-15:00"
  • the corresponding multiple candidate questions can include: "What should adult patients with type 1 and type 2 diabetes eat for lunch?", "What are the precautions for adult diabetic patients after lunch?", "What exercises are suitable for adult patients with type 1 and type 2 diabetes after lunch?", "How do adult patients with type 1 and type 2 diabetes take medicine at noon?", etc.
  • the corresponding labels include "type 1, type 2 diabetes", "adult", and "17:00-20:00"
  • the corresponding multiple candidate questions can include: "What should adults with type 1 and type 2 diabetes eat for dinner?", "What should adults with type 1 and type 2 diabetes pay attention to after dinner?", "What kind of exercise is suitable for adults with type 1 and type 2 diabetes after dinner?", "How do adults with type 1 and type 2 diabetes take medicine at night?", etc.
  • the recommended questions listed in the embodiments of the present disclosure are only exemplary, and not restrictive.
  • the corresponding labels include "diabetes", "gestational diabetes", "elderly", and "6:00-8:00"
  • the corresponding multiple candidate questions can include: "What should elderly diabetic patients and gestational diabetes patients eat for breakfast?", "What are the precautions for elderly diabetic patients and gestational diabetes patients after breakfast?", "How do elderly diabetic patients and gestational diabetes patients take medicine in the morning?", and so on.
  • during dinner time (for example, 17:00-20:00), they may ask about recipes and nutrition recommendations, exercise recommendations and precautions, precautions for measuring blood glucose 2 h after dinner, medication, and other related matters.
  • the corresponding labels include "diabetes", "gestational diabetes", "elderly", and "17:00-20:00"
  • the corresponding multiple candidate questions can include: "What should elderly diabetic patients and gestational diabetes patients eat for dinner?", "What are the precautions for elderly diabetic patients and gestational diabetes patients after dinner?", "What exercise is suitable for elderly diabetic patients and gestational diabetes patients after dinner?", "How do elderly diabetic patients and gestational diabetes patients take medicine at night?", and so on.
  • the recommended questions listed in the embodiments of the present disclosure are only exemplary, and not restrictive.
  • the corresponding multiple candidate questions may include: "What should a patient with prediabetes eat for breakfast?", "What are the precautions for patients with prediabetes after breakfast?", "How do patients with prediabetes take medicine in the morning?", etc.
  • lunch hours (for example, 11:00-15:00)
  • the corresponding multiple candidate questions may include: "What should a patient with prediabetes eat for lunch?", "What are the precautions for patients with prediabetes after lunch?", "How do patients with prediabetes take medicine at noon?", and so on.
  • during dinner time (for example, 17:00-20:00), they may ask about recipes and nutrition advice, exercise advice, and precautions.
  • the corresponding multiple candidate questions may include: "What should a patient with prediabetes eat for dinner?", "What are the precautions for patients with prediabetes after dinner?", "What exercises are suitable for patients with prediabetes after dinner?", "How do patients with prediabetes take medicine at night?", and so on. It should be noted that the recommended questions listed in the embodiments of the present disclosure are only exemplary, and not restrictive.
  • the second-level label (e.g., time of day)
  • the third-level label (e.g., disease type)
  • Complications are used as the fourth-level label for rule association. About 10 years after the onset of diabetes, 30% to 40% of patients will have at least one complication, such as cardiovascular disease, kidney disease, retinopathy, neuropathy, lower extremity vascular disease, diabetic foot, and so on.
  • the user's existing symptoms are used to select and recommend questions related to one or more complications, and these related questions are then returned to the corresponding user
  • the set of candidate questions is composed of multiple candidate questions corresponding to the user.
  • the user's existing symptoms are combined with clinical data to determine the recommended questions related to one or more complications. For example, in one example, high blood pressure, pain in the precordial area, palpitation, chest tightness, etc. are symptoms of cardiovascular disease complications.
  • the user's label set can include the label "cardiovascular disease", so that questions related to cardiovascular disease complications can be recommended, for example, "What are the symptoms of cardiovascular disease complications?", "Why do you feel palpitation and chest tightness?", "What should you do if you have palpitation and chest tightness?", etc.
  • the embodiments of the present disclosure do not specifically limit this.
  • foamy urine, difficulty in urination, lower extremity edema, eyelid edema, etc. are symptoms of diabetic nephropathy complications.
  • blurred vision, decreased vision, darkening of vision, etc. are symptoms of retinopathy complications.
  • slurred speech, memory loss, and persistent numbness, tingling, and swelling of the hands and feet are symptoms of neuropathy complications.
  • intermittent numbness and weakness of the lower extremities, claudication, and nocturnal pain in the lower extremities are symptoms of lower extremity vascular disease complications.
  • long-term ulcers in the lower limbs, thickening and swelling of the ends of the fingers or toes, and protruding big toe bones are symptoms of diabetic foot complications.
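  • The symptom-to-complication association described above can be sketched as a lookup table. The symptom phrases are shortened from the examples in the text; the dictionary layout and the function name `complication_tags` are hypothetical, and a real system would rely on clinical data rather than exact string matches.

```python
# Symptom phrases abbreviated from the examples in the text (illustrative only).
COMPLICATION_SYMPTOMS = {
    "cardiovascular disease": {"high blood pressure", "precordial pain",
                               "palpitation", "chest tightness"},
    "diabetic nephropathy": {"foamy urine", "difficulty in urination",
                             "lower extremity edema", "eyelid edema"},
    "retinopathy": {"blurred vision", "decreased vision"},
    "neuropathy": {"slurred speech", "memory loss",
                   "numbness of hands and feet"},
    "lower extremity vascular disease": {"claudication",
                                         "nocturnal pain in lower extremities"},
    "diabetic foot": {"long-term leg ulcers", "thickened and enlarged toes",
                      "protruding big toe bone"},
}


def complication_tags(symptoms):
    """Return, sorted by name, every complication sharing at least one symptom."""
    reported = set(symptoms)
    return sorted(name for name, known in COMPLICATION_SYMPTOMS.items()
                  if known & reported)


print(complication_tags(["memory loss", "long-term leg ulcers"]))
```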
  • if a complication has been diagnosed, related questions can be recommended based on the complication. For example, in an example, if the user's tag set includes "cardiovascular disease", that tag is matched in the data knowledge base.
  • the questions may include: "How should diabetic patients with cardiovascular disease complications be treated?", "What are the symptoms of cardiovascular disease complications in diabetic patients?", etc.
  • the embodiments of the present disclosure do not specifically limit this.
  • the classification of various diseases and of the symptoms of complications described in the embodiments of the present disclosure only illustrates how to establish the mapping relationship between the user tag set and the data knowledge base; that is, the specific rules and schemes above are described for illustrative purposes only.
  • the classification and symptom analysis of specific diseases can be adjusted and set based on a large amount of clinical data, professional experience, etc.
  • the embodiments of the present disclosure do not specifically limit this.
  • the above descriptions all concern the case where the user indicates that they have diabetes; if the user indicates that they do not have diabetes, the following operations can be performed.
  • if the user meets at least three of the following conditions: a sedentary lifestyle; a first-degree relative with a history of diabetes; high blood pressure; dyslipidemia; a history of impaired glucose regulation; a history of delivering a large baby; being a woman with a history of gestational diabetes; being a patient with atherosclerotic cardiovascular or cerebrovascular disease; and the user also has one of the following conditions: 6.1 mmol/L ≤ fasting blood glucose < 7 mmol/L, or 7.8 mmol/L ≤ 2 h postprandial blood glucose < 11.1 mmol/L, then the data knowledge base is matched according to the prediabetes rules to form a candidate question set.
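  • The prediabetes rule above (at least three risk conditions plus one blood glucose range) can be sketched as follows. The function and key names are hypothetical. The comparison operators are garbled in the source text, so the sketch assumes an inclusive lower bound and an exclusive upper bound for both ranges.

```python
RISK_FACTORS = {
    "sedentary", "first_degree_relative_with_diabetes", "high_blood_pressure",
    "dyslipidemia", "impaired_glucose_regulation_history", "large_baby_history",
    "gestational_diabetes_history",
    "atherosclerotic_cardio_cerebrovascular_disease",
}


def matches_prediabetes_rule(risk_factors, fasting_mmol_l=None,
                             postprandial_2h_mmol_l=None):
    """True if >= 3 listed risk factors hold and one glucose range is met."""
    enough_risk = len(set(risk_factors) & RISK_FACTORS) >= 3
    # Assumed bounds: lower bound inclusive, upper bound exclusive.
    fasting_hit = fasting_mmol_l is not None and 6.1 <= fasting_mmol_l < 7.0
    post_hit = (postprandial_2h_mmol_l is not None
                and 7.8 <= postprandial_2h_mmol_l < 11.1)
    return enough_risk and (fasting_hit or post_hit)


print(matches_prediabetes_rule(
    {"sedentary", "dyslipidemia", "high_blood_pressure"}, fasting_mmol_l=6.5))
```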
  • the user is a hypertensive patient.
  • the rule scheme for hypertensive patients is different from the rule scheme for diabetic patients.
  • in the rule scheme for hypertensive patients, the user can be directly classified as having prehypertension, mild hypertension, moderate hypertension, or severe hypertension according to the diastolic and systolic blood pressure values, so that different rule schemes are executed.
  • in the rule scheme for hypertensive patients, for the first-level label (i.e., age group), a separate stage is set for elderly people over 80 years old, which is classified as critical elderly hypertension, and blood pressure must be monitored at any time.
  • in the rule scheme for hypertensive patients, it is also necessary to monitor blood pressure at specific times while monitoring blood sugar.
  • the rule scheme provided by the embodiments of the present disclosure is only illustrative; the embodiments of the present disclosure do not limit the specific rule scheme, which can be set according to actual needs.
  • the user tag set is matched by using methods such as entity recognition, keyword matching, and deep learning.
  • entity recognition refers to the recognition of entities with specific meanings in the text, for example, names of people, place names, and time.
  • keyword matching methods include broad matching, exact matching, phrase matching, and negative matching.
  • if the text content of the label is "diabetes",
  • it can be matched to recommended questions in the data knowledge base that contain the word "diabetes", for example, "Recipes for diabetics?", "What exercise is suitable for diabetic patients?", "Symptoms of diabetic patients?", etc.
  • the embodiments of the present disclosure do not specifically limit this.
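  • A minimal sketch of the keyword matching step, assuming a plain case-insensitive substring test stands in for the matching methods named above. The function name `match_questions` and the sample questions are hypothetical.

```python
def match_questions(tag, questions):
    """Return the questions whose text contains the tag, ignoring case."""
    needle = tag.casefold()
    return [q for q in questions if needle in q.casefold()]


knowledge_questions = [
    "Recipes for diabetes?",
    "What exercise is suitable for patients with diabetes?",
    "What are the symptoms of high blood pressure?",
]
print(match_questions("diabetes", knowledge_questions))
```

  • Entity recognition or a learned matcher would replace the substring test in practice; the sketch only shows where the tag text meets the knowledge base.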
  • an elderly person over 60 years old with type 1 diabetes and complications will usually want, at 12 noon, to get corresponding dietary advice and blood glucose monitoring advice, and to understand questions related to the complications.
  • a corresponding user tag set is created based on the user's basic information, that is, a set including the tags "type 1 diabetes", "complication", "over 60 years old", and "12 noon".
  • the label set is associated with the data knowledge base, and a candidate question set is obtained from multiple knowledge question sets included in the data knowledge base, and the candidate question set includes multiple candidate questions.
  • multiple related candidate questions can be matched from the data knowledge base through methods such as entity recognition, keyword matching, and deep learning to form a candidate question set.
  • the multiple candidate questions corresponding to the user in the above example may include: “Three meals a day recipes for diabetic patients?", "What kind of exercise is suitable for elderly diabetic patients?", "How often do diabetic patients monitor blood sugar per day?”, “Symptoms of Type 1 Diabetes?”, “How to treat diabetic complications and peripheral neuropathy?”.
  • the corresponding knowledge question set (including standard questions, extended questions, and standard answers) can be quickly retrieved from the data knowledge base through the user tag set, which provides convenient conditions for the subsequent steps.
  • Step S140 Obtain user behavior data, and obtain user interest parameters based on the user behavior data.
  • user behavior data (for example, user behavior logs) can be acquired from software on a client terminal or a Web server, and user behavior data can also be customized.
  • user behavior data can include all visits, browsing, clicks, and other behavior data generated when the user visits a website. That is to say, user behavior data can reflect the user's specific behavior, for example, which link the user clicked, which page was opened, which search term was used, etc.
  • user interest parameters can be obtained by analyzing these user behavior data.
  • the user's feedback behavior may include explicit feedback behavior and implicit feedback behavior.
  • explicit feedback behavior includes the user's explicit feedback on the answer, such as clearly choosing whether the answer is helpful.
  • on a certain software platform (for example, a health management platform), after providing the answer to a question, the platform asks the user: "Does this answer help you?".
  • from the user's click on "Yes" or "No", the user's feedback on the answer can be clearly known, which can reflect the user's points of interest and concerns.
  • Implicit feedback behavior does not directly reflect the user's preferences but reflects them indirectly, for example, through how often the user clicks and browses within a certain period of time.
  • the health knowledge read by the user can be summarized by the maximal marginal relevance (MMR) algorithm
  • the MMR algorithm extracts sentences from the document according to their importance to form the summary
  • the term frequency-inverse document frequency (TF-IDF) method can then be used to obtain the high-frequency words in the summary. These high-frequency words (also called keywords) are also important features that reflect the user's points of interest and attention.
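  • A minimal TF-IDF keyword sketch in plain Python follows. It assumes whitespace tokenization (Chinese text would first need word segmentation) and the unsmoothed weighting tf × log(N / df); the function name `tfidf_keywords` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(documents, doc_index, top_k=3):
    """Rank the words of one document by TF-IDF against the whole collection."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    target = tokenized[doc_index]
    tf = Counter(target)
    scores = {
        word: (count / len(target)) * math.log(n_docs / df[word])
        for word, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


summaries = [
    "blood sugar diet diet",
    "blood pressure exercise",
    "blood sugar exercise",
]
print(tfidf_keywords(summaries, 0, top_k=2))
```

  • Words that appear in every document (here "blood") receive a weight of zero, so the ranking favors terms specific to what this user read.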
  • user behavior data can be obtained by analyzing the logs of the application stored in the device.
  • the log may be a log stored after the device was started up and run this time, or it may be a log that has been stored after the device was booted up and run last time.
  • the embodiment of the present disclosure does not specifically limit this, and can be adjusted according to actual conditions.
  • the user’s clicked questions or browsed high-frequency words, keywords, etc. reflecting the user’s points of interest and concerns are converted into user interest parameters.
  • the user interest parameter may be a numerical value.
  • Word embedding is performed on the question clicked by the user or the words and sentences that the user is interested in, and the embedding vector of the words and sentences of the user's interest is generated, which constitutes the aforementioned "user interest parameter".
  • Word Embedding can be understood as a mapping relationship that maps, or embeds, a word from a text space into a numerical vector space through a certain method. In other words, Word Embedding can express words and complete sentences in the form of vectors.
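  • To make the mapping concrete, the sketch below averages per-word vectors from a toy lookup table to embed a sentence. The table `WORD_VECTORS`, its values, and the function name `embed_sentence` are made up for illustration; the disclosure does not state which embedding method is used, and a trained embedding model would normally supply the vectors.

```python
# Toy lookup table; the words and vector values are invented for illustration.
WORD_VECTORS = {
    "diabetes": [0.9, 0.1, 0.0],
    "diet": [0.2, 0.8, 0.1],
    "exercise": [0.1, 0.3, 0.9],
}


def embed_sentence(sentence, dim=3):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [WORD_VECTORS[w] for w in sentence.lower().split()
               if w in WORD_VECTORS]
    if not vectors:
        return [0.0] * dim
    return [sum(component) / len(vectors) for component in zip(*vectors)]


print(embed_sentence("Diabetes diet"))
```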
  • Step S150 Based on the user interest parameter and the multiple candidate questions, obtain at least one similarity feature between each candidate question of the multiple candidate questions and the user interest parameter.
  • obtaining at least one similarity feature between each candidate question of the multiple candidate questions and the user interest parameter may include: using at least one similarity matching model to obtain, based on the user interest parameter and the multiple candidate questions, at least one similarity feature between each candidate question and the user interest parameter.
  • the at least one similarity matching model includes at least one of: a cosine similarity model, a Jaccard similarity model, an edit distance (Levenshtein) similarity model, a word mover's distance (WMD) similarity model, and a deep semantic matching (DSSM) similarity model.
  • word embedding is likewise performed on any candidate question (for example, a standard question or an extended question)
  • to obtain the candidate question Embedding vector B, which is also a numerical vector.
  • the user interest parameter A and the Embedding vector B of a given candidate question are input into the multiple similarity models mentioned above.
  • each similarity model outputs the similarity characteristics between the numerical vectors A and B.
  • the larger the value of the similarity feature the closer the words and sentences corresponding to vector A and the words and sentences corresponding to vector B are.
  • the embodiment of the present disclosure does not limit the number of similarity matching models used. For example, in an example, if five similarity matching models are used, the vectors A and B may have five similarity characteristics. For example, in an example, if three similarity matching models are used, the vectors A and B may have three similarity features.
  • Cosine similarity uses the cosine of the angle between two vectors as a measure of the difference between two individuals. The closer the cosine value is to 1, the more similar the two vectors A and B are. The following formula is usually used to calculate the cosine similarity:
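  • For two n-dimensional vectors A and B, the standard definition of cosine similarity, assumed here to be the intended formula, is:

```latex
\cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
             = \frac{\sum_{i=1}^{n} A_i B_i}
                    {\sqrt{\sum_{i=1}^{n} A_i^{2}} \; \sqrt{\sum_{i=1}^{n} B_i^{2}}}
```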
  • Jaccard distance uses the proportion of differing elements among all elements of two sets to measure the degree of distinction between the two sets. It is expressed by the following formula, where J(A,B) is the Jaccard similarity coefficient.
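  • For two sets A and B, the standard definitions of the Jaccard similarity coefficient and the Jaccard distance, assumed here to be the intended formula, are:

```latex
J(A, B) = \frac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert},
\qquad
d_J(A, B) = 1 - J(A, B)
          = \frac{\lvert A \cup B \rvert - \lvert A \cap B \rvert}{\lvert A \cup B \rvert}
```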
  • Edit distance, also known as Levenshtein distance, refers to the minimum number of character operations required to convert string A into string B.
  • the permitted character operations include modifying a character, inserting a character, and deleting a character.
  • the similarity features output by the edit distance similarity model are continuous.
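  • The edit distance with the three permitted operations can be computed with the usual dynamic-programming recurrence. The sketch below is a generic implementation with unit cost per operation, not code from the disclosure.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost modify, insert, and delete."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,           # delete ch_a
                               current[j - 1] + 1,        # insert ch_b
                               previous[j - 1] + cost))   # modify or keep
        previous = current
    return previous[-1]


print(edit_distance("kitten", "sitting"))
```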
  • Word mover's distance (WMD) considers the similarity between two documents as a whole and measures their semantic similarity by finding the minimum total distance over pairings of all words between the two documents. The similarity feature output by the WMD similarity model is continuous.
  • DSSM is a deep semantic matching model that maps the two items being matched into a low-dimensional space, so that the relevance problem is transformed into a distance between low-dimensional vectors.
  • the model can not only be used to predict the semantic similarity of two sentences, but also can obtain the low-dimensional semantic vector expression of the sentence.
  • the similarity features output by the DSSM model are discrete.
  • the above five similarity matching models can be applied to the user interest parameter and the multiple candidate questions at the same time, so that for each candidate question, five similarity features with respect to the user interest parameter are obtained, one from each of the five similarity matching models.
  • the similarity matching models used in the embodiments of the present disclosure are not limited to those described above; other similarity matching models can also be used, as long as the same or similar technical effects can be achieved, that is, as long as the similarity between two vectors can be calculated, and the embodiments of the present disclosure do not specifically limit this.
  • the embodiment of the present disclosure does not limit the number of similarity matching models used, and can be set according to actual requirements.
  • Step S160 Based on the user's basic information, multiple candidate questions, and at least one similarity feature, sort the multiple candidate questions to obtain a question sequence.
  • sorting multiple candidate questions based on user basic information, multiple candidate questions, and at least one similarity feature to obtain a question sequence includes: using a ranking model, combining the user basic information, the multiple candidate questions, and the at least one similarity feature into the input feature vector of the ranking model; obtaining the score corresponding to each of the multiple candidate questions; and sorting the multiple candidate questions according to the scores to obtain the question sequence.
  • the basic information of a certain user includes: gender is male (for example, the corresponding discrete feature is "0") and suffering from diabetes (for example, the corresponding discrete feature is "1");
  • the Embedding vector corresponding to one of the multiple candidate questions is [0.3,0.5,0.6];
  • the similarity features between the candidate question and the user interest parameter include: a cosine similarity of 0.85, a Jaccard distance of 0.91, an edit distance of 3, a WMD of 1.17, and a DSSM value of 2.
  • combining them gives the input feature vector of the ranking model, [0,1,0.3,0.5,0.6,0.85,0.91,3,1.17,2]. It should be noted that the feature data provided in this embodiment is only exemplary, and the specific values of the feature data can be set according to experimental results or actual conditions, which is not specifically limited in the embodiments of the present disclosure.
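  • The assembly of the input feature vector in this example amounts to concatenating the three feature groups in a fixed order. A minimal sketch, with the hypothetical function name `build_input_vector`:

```python
def build_input_vector(user_features, question_embedding, similarity_features):
    """Concatenate the three feature groups in a fixed order."""
    return (list(user_features) + list(question_embedding)
            + list(similarity_features))


x = build_input_vector(
    user_features=[0, 1],                          # male, has diabetes
    question_embedding=[0.3, 0.5, 0.6],            # candidate question vector
    similarity_features=[0.85, 0.91, 3, 1.17, 2],  # cosine, Jaccard, edit, WMD, DSSM
)
print(x)
```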
  • the basic user information, multiple candidate questions, and at least one similarity feature form the input feature vector of the ranking model, and the individual factors of the user (for example, user basic information, user behavior data, etc.) are fully considered for personalized problem recommendation. This allows users to more specifically grasp knowledge related to their own health.
  • the ranking model adopted is the classic Wide&Deep model proposed by Google.
  • the vector [0,1,0.3,0.5,0.6,0.85,0.91,3,1.17,2] is input to the Wide&Deep model as the input feature vector.
  • the core idea of the model is to combine the memorization ability of the linear model with the generalization ability of the deep neural network model, reflecting both relevance and diversity in the recommendation scenario.
  • This model is used to score multiple candidate questions in the question candidate set, and the corresponding multiple candidate questions are sorted in the order of the scores from high to low to obtain the question sequence. Then, according to actual needs, the first N questions in the question sequence are shown to the user, where N is an integer greater than or equal to 1.
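  • The sort-and-truncate step can be sketched as follows, assuming the scores have already been produced by the ranking model. The function name `top_n_questions` and the sample scores are hypothetical.

```python
def top_n_questions(scored_questions, n):
    """Sort (question, score) pairs by score, high to low, and keep the first n."""
    if n < 1:
        raise ValueError("n must be an integer >= 1")
    ranked = sorted(scored_questions, key=lambda pair: pair[1], reverse=True)
    return [question for question, _ in ranked[:n]]


scored = [
    ("What exercise is suitable after dinner?", 0.42),
    ("What should diabetic patients eat for dinner?", 0.91),
    ("How do patients take medicine at night?", 0.67),
]
print(top_n_questions(scored, 2))
```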
  • the score of a candidate question can be expressed as a conditional probability, p(y|x)
  • x represents the input feature vector.
  • the input feature vector x includes the discrete features of the aforementioned user basic information, the continuous Embedding vector of the candidate question itself, and at least one similarity feature between the candidate question and the user interest parameter (the similarity feature may be continuous or discrete).
  • the probability value P(y 1
  • FIG. 4A is a schematic structural diagram of a Wide&Deep model provided by at least one embodiment of the present disclosure.
  • the Wide&Deep model includes two parts, namely the Wide part (the left side of Fig. 4A) and the Deep part (the right side of Fig. 4A).
  • the Wide&Deep model balances the memorization capability of the Wide model and the generalization capability of the Deep model.
  • the two parts of the model require different input features.
  • the Wide model is a generalized linear model (for example, logistic regression), the formula is as follows:
  • x represents the feature vector [x1,x2,x3...]
  • w represents the parameter vector [w1,w2,w3....]
  • b is the bias term, and
  • y is the output label.
  • the output is a probability value between 0 and 1.
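  • With x, w, b, and y as defined above, a generalized linear model has the following form. Applying the sigmoid function σ to obtain a probability between 0 and 1 is the usual logistic-regression convention and is assumed here.

```latex
y = w^{T} x + b,
\qquad
P(y = 1 \mid x) = \sigma\left(w^{T} x + b\right)
                = \frac{1}{1 + e^{-\left(w^{T} x + b\right)}}
```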
  • the input features used in the Wide part are discrete features, for example, discrete features of user basic information and discrete similarity features.
  • the Deep model is a feed-forward neural network.
  • the input of the deep neural network model is continuous dense features.
  • the sparse high-dimensional features need to be embedded (dimensionality reduction) to convert them into low-dimensional dense features, which are then used as the input of the first hidden layer; the embeddings are trained and updated by back-propagation according to the final error (loss).
  • the activation function f of the hidden layer usually adopts the ReLU function, which mitigates vanishing gradients. Therefore, the input features used in the Deep part are continuous features, such as the Embedding vector of the candidate question and the continuous similarity features.
  • the gradient is calculated according to the final error and back-propagated to the Wide and Deep parts to continuously update the parameters of the model and obtain the final model. It should be noted that training the Wide model and the Deep model at the same time is not the same as ensembling two separately trained models; instead, the weighted sum of the outputs of the two models is used as the final prediction result:
  • W_wide is the weight of the Wide part
  • x is the original feature vector
  • φ(x) is the cross feature.
  • W_deep is the weight of the Deep part.
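  • With the symbols defined above, the weighted sum can be written in the form used in the published Wide & Deep formulation, which is assumed to be what is meant here. The sigmoid function σ, the bias b, and the final-layer activations a^{(l_f)} of the Deep part are carried over from that formulation and are not defined in this text.

```latex
P(Y = 1 \mid x) = \sigma\left( W_{wide}^{T} \, [\, x, \phi(x) \,]
                + W_{deep}^{T} \, a^{(l_f)} + b \right)
```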
  • Model training uses joint training. In ensemble learning, each model is trained individually and the models are fused only at the final prediction stage, whereas in joint training the models are fused during the training stage: the training error is fed back to the Wide and Deep models at the same time to update their weights. The Wide model therefore focuses on cross products of discrete features, memorizing feature interactions through non-linear transformations of the original features, while the Deep model focuses on generalization.
  • the deep neural network uses low-dimensional dense features and requires only a small amount of feature engineering, yet it can better generalize to feature combinations that have not appeared in the training samples, which improves the generalization ability of the model. After the model is trained, it is deployed to the question recommendation scenario.
  • the question recommendation method 10 uses a ranking model (i.e., the Wide&Deep model), composing the user basic information, the multiple candidate questions, and the at least one similarity feature into the input feature vector of the ranking model.
  • the ranking model returns a score for each of the multiple candidate questions included in the candidate question set.
  • the multiple candidate questions are sorted according to the scores to obtain the question sequence.
  • for example, in one example, the candidate questions are sorted by score from high to low.
  • for example, in another example, the candidate questions are sorted by score from low to high. It should be noted that other orders can also be used; the embodiments of the present disclosure do not specifically limit this.
  • Step S170 Recommend at least one candidate question in the question sequence to the user based on the order of the question sequence. For example, in an example, the question sequence is sorted in the order of the scores from high to low, and the first N candidate questions in the question sequence are recommended to the user, where N is an integer greater than or equal to 1. For example, the first 5 questions (or another number) of the question sequence are selected as the final recommended questions and recommended to the user.
  • FIG. 4B shows a user interface diagram of a recommendation question provided to a user by a certain platform (for example, a health management platform).
  • the health management platform has recommended 5 questions to the user based on the above question recommendation method (for example, "Can high blood pressure be treated with surgery?", "What are the symptoms of high blood pressure?", "What are the symptoms of high blood pressure?", "How often should hypertensive patients check their blood pressure every day?", and "What exercise should hypertensive patients do?").
  • the final recommendation question to the user, which is not specifically limited in
  • when the user is typing a question into the input text box, the platform can match the text entered by the user against the data in the data knowledge base, for example through entity recognition and keyword matching, and display the matched questions above the input text box in the user interface. For example, as shown in FIG. 4C, when the user enters the word "hypertension" in the input text box, recommended questions such as "hypertension diet", "hypertension diet precautions", and "pregnancy hypertension diet" are prompted above the input text box.
  • the embodiments of the present disclosure do not specifically limit this, and it can be adjusted according to actual conditions.
  • steps S110-S170 can be selectively executed according to actual conditions, which are not specifically limited in the embodiments of the present disclosure.
  • steps S110-S130 can be omitted, and the pre-stored set of candidate questions for the user can be directly retrieved.
  • for example, when a user used a certain platform (for example, a health management platform) for the first time, the platform already generated a corresponding candidate question set for the user and saved it in the database associated with the platform,
  • so the platform does not need to repeat steps S110-S130 to obtain the user's candidate question set, but can obtain it by retrieving the candidate question set previously stored in the associated database
  • the embodiments of the present disclosure do not specifically limit this, and may be adjusted according to actual conditions.
  • the question recommendation method 10 not only effectively avoids the inappropriate answers that may result from a patient's unclear expression, but also fully considers the user's individual factors (for example, user basic information, user behavior data, etc.) to make personalized question recommendations, so that users can grasp knowledge related to their own health in a more targeted way.
  • in addition, the ranking model (i.e., the Wide&Deep model) fuses a variety of features, such as user basic information and the outputs of various similarity matching models, so that the question recommendation is relevant, personalized and diversified while also taking the final feedback order into account, achieving an effect that more closely meets the needs of users.
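
For orientation, a Wide&Deep model adds the logit of a linear ("wide") part to the logit of a feed-forward ("deep") part and passes the sum through a sigmoid. The sketch below shows only that scoring step in plain Python. The parameter layout, the feature split, and the hand-set weights are assumptions for illustration; they are not the trained model of the disclosure.

```python
import math


def dense(vector, weights, bias):
    """One fully connected layer: weights is a list of rows, one row per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) + b for row, b in zip(weights, bias)]


def wide_deep_score(wide_x, deep_x, params):
    """Score one candidate question from its wide and deep feature vectors."""
    # Wide part: a plain linear model over (for example) similarity features.
    wide_logit = sum(w * x for w, x in zip(params["wide_w"], wide_x))
    # Deep part: a small feed-forward network with ReLU activations.
    hidden = deep_x
    for weights, bias in params["hidden"]:
        hidden = [max(0.0, h) for h in dense(hidden, weights, bias)]
    deep_logit = sum(w * h for w, h in zip(params["deep_out_w"], hidden))
    # Both logits are summed before the sigmoid.
    logit = wide_logit + deep_logit + params["bias"]
    return 1.0 / (1.0 + math.exp(-logit))


# Hand-set toy parameters (hypothetical, not learned).
params = {
    "wide_w": [1.0, -1.0],
    "hidden": [([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])],
    "deep_out_w": [0.5, 0.5],
    "bias": 0.0,
}
print(wide_deep_score([1.0, 1.0], [2.0, -2.0], params))
```

In practice the weights would be learned jointly from click or feedback data with a deep learning framework; the sketch only makes the fusion of the two parts concrete.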
  • FIG. 5A is an exemplary flowchart of another question recommendation method provided by at least one embodiment of the present disclosure
  • FIG. 5B is a schematic block diagram of the question recommendation method as shown in FIG. 5A provided by at least one embodiment of the present disclosure.
  • the question recommendation method 50 includes step S510-step S580.
  • step S510-step S580 can be executed in sequence, or in other adjusted order.
  • step S520 can be executed first and then step S530, or step S530 can be executed first and then step S520 can be executed.
  • part or all of the operations in step S510-step S580 can also be executed in parallel.
  • step S520 and step S530 can be executed in parallel.
  • step S510 to step S580 can be implemented by the server or the local end, which is not limited in the embodiment of the present disclosure.
  • the implementation of the question recommendation method 50 may selectively execute some of steps S510-S580, or may execute additional steps besides steps S510-S580, which is not specifically limited in the embodiments of the present disclosure.
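
Running two independent steps in parallel, as mentioned above for steps S520 and S530, could look like the following sketch. It assumes that S520 corresponds to accessing the data knowledge base and S530 to building the user tag set, and that neither depends on the other's result; the two callables are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor


def run_independent_steps(access_knowledge_base, build_user_tags):
    """Run two steps that do not depend on each other concurrently and return both results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        kb_future = pool.submit(access_knowledge_base)
        tags_future = pool.submit(build_user_tags)
        # result() blocks until the corresponding step has finished.
        return kb_future.result(), tags_future.result()


knowledge_base, tags = run_independent_steps(
    lambda: {"diet": ["hypertension diet"]},
    lambda: ["senior", "hypertension"],
)
print(knowledge_base, tags)
```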
  • steps S520-S580 included in the question recommendation method 50 shown in FIG. 5A are basically the same as steps S110-S170 included in the question recommendation method 10 shown in FIG. 1B.
  • for the description of steps S520-S580, please refer to the relevant description of steps S110-S170 in FIG. 1B above, which will not be repeated here.
  • FIG. 5A further includes step S510, namely, establishing a data knowledge base.
  • a data set can be captured from the network, and the data set can be classified according to intent to form multiple knowledge problem sets, thereby building a data knowledge base.
  • a web crawler can be used to grab a data collection containing a large amount of data from a network such as the Internet.
  • a web crawler, also known as a web spider, is a program or script that automatically crawls information on the World Wide Web according to certain rules.
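
A breadth-first crawler can be sketched with the standard library alone. The `fetch` callable is an assumption: it is injected so the example runs without network access, and in real use it might wrap an HTTP client. The in-memory `site` and its URLs are hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Visit pages breadth-first from start_url; fetch(url) returns HTML or None."""
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved
        pages[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


site = {
    "http://example.test/": '<a href="/a">A</a><a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a>',
    "http://example.test/b": "no links",
}
print(list(crawl("http://example.test/", site.get)))
```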
  • when the above-mentioned question recommendation method 50 is applied to a medical intelligent question-and-answer scenario, it can be used to recommend questions about a certain disease, and the data set can come from at least one of: doctor-patient consultation data sets on the Internet, hot questions related to the disease, and reward questions related to the disease.
  • the reward question may be a question that requires paid consultation on certain websites (for example, Xunyiwenyao.com, 39health.com, etc.), which is not specifically limited in the embodiment of the present disclosure.
  • the TF-IDF method can be used to extract high-frequency keywords from the data set
  • for example, high-frequency keywords such as "symptoms", "treatment", "blood sugar", "diet", "medication", "check", "insulin", "diabetic foot", etc.
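
Keyword extraction with TF-IDF can be illustrated with a small pure-Python sketch. It assumes the documents are already tokenized and uses a smoothed inverse document frequency; the function name, the smoothing choice, and the sample documents are illustrative assumptions.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=3):
    """docs: list of token lists. Return the top_k (token, score) pairs by summed TF-IDF."""
    n = len(docs)
    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for doc in docs for tok in set(doc))
    totals = Counter()
    for doc in docs:
        for tok, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log((1 + n) / (1 + df[tok])) + 1  # smoothed IDF
            totals[tok] += tf * idf
    return totals.most_common(top_k)


docs = [
    ["diabetes", "diet"],
    ["diabetes", "insulin"],
    ["diabetes", "diet", "diet"],
]
for token, score in tfidf_keywords(docs):
    print(token, round(score, 3))
```

Chinese source text would first need word segmentation before tokens can be counted; that step is outside this sketch.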
  • classifying the large amount of data in the data set by intent facilitates the construction of the data knowledge base.
  • for example, the deep learning algorithm Text-CNN is used to classify the data set by intent, and the standard questions and the corresponding extended questions and answers under each intent are organized (for example, manually), thereby establishing a complete data knowledge base of basic health knowledge for chronic diseases such as diabetes and hypertension.
  • for example, in one example, based on the extracted high-frequency keywords, multiple types of intents can be manually determined.
  • the types of intent can be determined manually according to, for example, how much attention people pay to a topic and how frequently keywords occur, which is not specifically limited in the embodiments of the present disclosure.
  • the following intention categories are manually determined: diet, exercise, medication, examination, complications, surgery, treatment, symptoms, etc., which are not specifically limited in the embodiments of the present disclosure.
  • the data corresponding to each type of intent (for example, the questions that match this type of intent) is organized manually
  • this data is used as training data to train a model with the deep learning algorithm Text-CNN
  • the trained model outputs the intent category corresponding to each piece of data, thereby realizing the intent classification of a large amount of data.
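
The structure of a Text-CNN classifier (embedding lookup, convolution filters over token windows, max-over-time pooling, and a softmax output layer) can be sketched as a forward pass in plain Python. This is a toy with hand-set weights and no training loop; the vocabulary ids, filters, and labels are hypothetical, and a real model would be trained on the labeled data with a deep learning framework.

```python
import math


def conv_max_pool(embedded, kernel, bias):
    """Slide one filter (width x dim) over the sequence, apply ReLU, keep the maximum."""
    width = len(kernel)
    best = 0.0  # ReLU output is never negative
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            total += sum(k * x for k, x in zip(kernel[offset], embedded[start + offset]))
        best = max(best, total)
    return best


def text_cnn_predict(token_ids, embeddings, filters, out_weights, out_bias, labels):
    """Return (predicted intent label, class probabilities) for one tokenized question."""
    embedded = [embeddings[t] for t in token_ids]
    features = [conv_max_pool(embedded, kernel, bias) for kernel, bias in filters]
    logits = [sum(w * f for w, f in zip(row, features)) + b
              for row, b in zip(out_weights, out_bias)]
    peak = max(logits)  # subtract the maximum for a numerically stable softmax
    exps = [math.exp(z - peak) for z in logits]
    probs = [e / sum(exps) for e in exps]
    return labels[probs.index(max(probs))], probs


# Toy setup: token 0 signals "diet", token 1 signals "exercise" (hand-set, not learned).
embeddings = {0: [1.0, 0.0], 1: [0.0, 1.0]}
filters = [([[1.0, 0.0]], 0.0), ([[0.0, 1.0]], 0.0)]
label, probs = text_cnn_predict([0, 0], embeddings, filters,
                                [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0],
                                ["diet", "exercise"])
print(label)
```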
  • step S510 is executed to establish a data knowledge base, which is stored in a server, a memory, or a database.
  • step S510 can be omitted, and the data knowledge base can be directly accessed, thereby improving the processing efficiency.
  • step S510 can be executed from time to time, or the data knowledge base can be updated by other suitable methods, so that the data knowledge base is updated and optimized and the candidate questions obtained in the subsequent steps are closer to the needs of users and to the current level of social awareness.
  • the technical effect achieved by the question recommendation method 50 shown in FIG. 5A is similar to the technical effect achieved by the question recommendation method 10 described above in conjunction with FIG. 1B, and will not be repeated here.
  • for the description of each block shown in FIG. 5B, reference may be made to the above detailed description of each step in FIG. 5A and FIG. 1B, which will not be repeated here.
  • Fig. 6 is a schematic block diagram of a question recommendation device provided by at least one embodiment of the present disclosure.
  • the question recommendation device 60 includes: a set acquisition module, a behavior analysis module, a feature generation module, a question sorting module, and a recommendation module.
  • these modules can be implemented by software, hardware, firmware, or any combination thereof; for example, they can be implemented as a set acquisition circuit 600, a behavior analysis circuit 640, a feature generation circuit 650, a question sorting circuit 660, and a recommendation circuit 670, respectively.
  • the set acquisition circuit 600 is configured to acquire a candidate question set of the user, and the candidate question set includes a plurality of candidate questions.
  • the set acquisition circuit 600 includes a knowledge base access circuit 610, an information acquisition circuit 620, and a candidate set generation circuit 630.
  • the knowledge base access circuit 610 is configured to access a data knowledge base including a plurality of knowledge question sets.
  • the information acquisition circuit 620 is configured to acquire basic user information and establish a user tag set based on the basic user information.
  • the candidate set generating circuit 630 is configured to associate a user tag set with a data knowledge base, and obtain a candidate question set from a plurality of knowledge question sets, and the candidate question set includes a plurality of candidate questions.
  • the behavior analysis circuit 640 is configured to obtain user behavior data, and obtain user interest parameters based on the user behavior data.
  • the feature generation circuit 650 is configured to obtain at least one similarity feature between each candidate question and the user interest parameter based on the user interest parameter and the multiple candidate questions.
  • the question sorting circuit 660 is configured to sort a plurality of candidate questions based on basic user information, a plurality of candidate questions, and at least one similarity feature to obtain a question sequence.
  • the recommendation circuit 670 is configured to recommend at least one candidate question in the question sequence to the user based on the order of the question sequence.
  • for the specific operations that the knowledge base access circuit 610, the information acquisition circuit 620, the candidate set generation circuit 630, the behavior analysis circuit 640, the feature generation circuit 650, the question sorting circuit 660, and the recommendation circuit 670 are configured to perform, reference may be made to the related descriptions of the question recommendation methods 10 and 50 provided in at least one embodiment of the present disclosure above, which will not be repeated here.
  • the question sorting circuit 660 in the question recommending device 60 includes a question sorting sub-circuit 661.
  • the question sorting sub-circuit 661 is configured to adopt a ranking model, compose the basic user information, the multiple candidate questions, and the at least one similarity feature into the input feature vector of the ranking model, obtain a score corresponding to each of the multiple candidate questions, and sort the multiple candidate questions by score to obtain the question sequence.
  • the specific operations that the question sorting sub-circuit 661 is configured to perform can refer to the relevant descriptions of the question recommendation methods 10 and 50 provided in at least one embodiment of the present disclosure above, and details are not repeated here.
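
Composing the input feature vector and sorting by score can be sketched as below. The scoring function is passed in as `score_fn` so any ranking model can be plugged in; the numeric encodings of the user information and of each candidate question are assumptions, since the disclosure does not fix them here.

```python
def rank_questions(user_features, candidates, similarity_features, score_fn):
    """Rank candidate questions by model score, highest first.

    candidates: question -> list of question features
    similarity_features: question -> list of similarity values
    """
    scored = []
    for question, question_features in candidates.items():
        # Input vector = user information + question features + similarity features.
        vector = list(user_features) + list(question_features) + list(similarity_features[question])
        scored.append((score_fn(vector), question))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [question for _, question in scored]


candidates = {"q1": [0.0], "q2": [1.0]}
similarities = {"q1": [0.2], "q2": [0.9]}
# `sum` is a placeholder scorer standing in for a trained ranking model.
print(rank_questions([1.0], candidates, similarities, sum))
```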
  • the question recommendation device 60 further includes a knowledge base establishing circuit 601.
  • the knowledge base establishment circuit 601 is configured to grab a data set from the network and classify the data set according to intent to form multiple knowledge problem sets and establish a data knowledge base.
  • the specific operations that the knowledge base establishing circuit 601 is configured to perform can refer to the related description of the problem recommendation method 50 provided in at least one embodiment of the present disclosure, which will not be repeated here.
  • the candidate set generation circuit 630 in the question recommendation device 60 includes a candidate set generation sub-circuit 631.
  • the candidate set generation sub-circuit 631 is configured to establish a mapping relationship between the user tag set and the standard questions of the data knowledge base, and to match the user tag set to the standard questions in the data knowledge base; the knowledge question sets corresponding to the matched standard questions constitute the candidate question set.
  • for example, for the specific operations that the candidate set generation sub-circuit 631 is configured to perform, refer to the related descriptions of the question recommendation methods 10 and 50 provided in at least one embodiment of the present disclosure, which will not be repeated here.
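
The mapping from user tags to standard questions, and from matched standard questions to their knowledge question sets, can be sketched with two dictionaries. The dictionary layout, the function name, and the sample data are assumptions for illustration.

```python
def build_candidate_set(user_tags, tag_to_standard, knowledge_sets):
    """Collect the questions of every knowledge question set matched by the user's tags."""
    candidates = []
    for tag in user_tags:
        for standard_question in tag_to_standard.get(tag, []):
            for question in knowledge_sets.get(standard_question, []):
                if question not in candidates:  # avoid duplicates across tags
                    candidates.append(question)
    return candidates


standard = "What are the symptoms of high blood pressure?"
tag_to_standard = {"hypertension": [standard]}
knowledge_sets = {standard: [standard, "How do I know if I have hypertension?"]}
print(build_candidate_set(["senior", "hypertension"], tag_to_standard, knowledge_sets))
```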
  • the behavior analysis circuit 640 in the question recommendation device 60 includes a behavior analysis sub-circuit 641.
  • the behavior analysis sub-circuit 641 is configured to analyze user behavior data, and convert questions clicked by the user or words and sentences that the user is interested in into user interest parameters.
  • the specific operations that the behavior analysis sub-circuit 641 is configured to perform can refer to the related descriptions of the problem recommendation methods 10 and 50 provided in at least one embodiment of the present disclosure above, which will not be repeated here.
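
One simple way to turn clicked questions into user interest parameters is a normalized word-frequency profile. This representation is an assumption chosen for illustration; the disclosure does not prescribe a specific encoding at this point.

```python
from collections import Counter


def interest_parameters(clicked_questions, stop_words=frozenset()):
    """Map the words of clicked questions to their relative frequencies."""
    counts = Counter()
    for question in clicked_questions:
        for token in question.lower().split():
            token = token.strip("?.,!\"'")  # drop surrounding punctuation
            if token and token not in stop_words:
                counts[token] += 1
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()} if total else {}


print(interest_parameters(["Hypertension diet", "hypertension exercise"]))
```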
  • the feature generation circuit 650 in the question recommendation device 60 includes a feature generation sub-circuit 651.
  • the feature generation sub-circuit 651 is configured to adopt at least one similarity matching model to obtain at least one similarity feature between each candidate question and the user interest parameter based on the user interest parameter and multiple candidate questions.
  • the specific operations that the feature generation sub-circuit 651 is configured to perform can refer to the related descriptions of the problem recommendation methods 10 and 50 provided by at least one embodiment of the present disclosure, and details are not described herein again.
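
Three of the similarity matching models named in the summary (cosine, Jaccard, and edit distance) can be computed without external libraries. The sketch below assumes whitespace tokenization and normalizes the edit distance into a similarity in [0, 1]; both choices are illustrative assumptions.

```python
import math
from collections import Counter


def cosine_similarity(a_tokens, b_tokens):
    """Cosine of the angle between two bag-of-words count vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def jaccard_similarity(a_tokens, b_tokens):
    """Size of the token intersection divided by the size of the token union."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def similarity_features(interest_text, question):
    """Return [cosine, Jaccard, normalized edit similarity] for one candidate question."""
    a, b = interest_text.split(), question.split()
    longest = max(len(interest_text), len(question)) or 1
    return [
        cosine_similarity(a, b),
        jaccard_similarity(a, b),
        1 - edit_distance(interest_text, question) / longest,
    ]


print(similarity_features("hypertension diet", "hypertension diet precautions"))
```

Word mover's distance and deep semantic matching require word embeddings or a trained network and are therefore left out of this sketch.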
  • the set acquisition circuit 600, the knowledge base access circuit 610, the information acquisition circuit 620, the candidate set generation circuit 630, the behavior analysis circuit 640, the feature generation circuit 650, the question sorting circuit 660, and the recommendation circuit 670, as well as the feature generation sub-circuit 651, the behavior analysis sub-circuit 641, the candidate set generation sub-circuit 631, the question sorting sub-circuit 661, and the knowledge base establishment circuit 601 in the embodiments of the present disclosure can be implemented by hardware such as processors and controllers, by software that implements the related functions, or by a combination of the two; the embodiments of the present disclosure do not limit their specific implementation.
  • the question recommendation device 60 may also include more circuits and is not limited to the aforementioned set acquisition circuit 600, knowledge base access circuit 610, information acquisition circuit 620, candidate set generation circuit 630, behavior analysis circuit 640, feature generation circuit 650, question sorting circuit 660, recommendation circuit 670, feature generation sub-circuit 651, behavior analysis sub-circuit 641, candidate set generation sub-circuit 631, question sorting sub-circuit 661, and knowledge base establishment circuit 601; this may be determined according to actual needs, and the embodiments of the present disclosure do not limit this.
  • the question recommendation device 60 provided in the embodiments of the present disclosure can implement the aforementioned question recommendation methods 10 and 50 and achieve technical effects similar to those of the methods, which will not be repeated here.
  • FIG. 7 is a schematic block diagram of a question recommendation system provided by at least one embodiment of the present disclosure.
  • the question recommendation system 70 includes a terminal 710 and a question recommendation server 720, and the terminal 710 and the question recommendation server 720 are signally connected.
  • the terminal 710 is configured to send the request data to the question recommendation server 720.
  • the question recommendation server 720 is configured to, in response to the request data: access a data knowledge base that includes a plurality of knowledge question sets; obtain a candidate question set of the user, the candidate question set including a plurality of candidate questions; obtain user behavior data, and obtain user interest parameters based on the user behavior data; obtain, based on the user interest parameters and the plurality of candidate questions, at least one similarity feature between each candidate question and the user interest parameters; and sort the plurality of candidate questions based on user basic information, the plurality of candidate questions, and the at least one similarity feature to obtain a question sequence.
  • the terminal 710 is also configured to display the first N candidate questions in the question sequence, and N is an integer greater than or equal to 1.
  • for example, for the above-mentioned operations that the question recommendation server 720 is configured to perform, reference may be made to the question recommendation methods 10 and 50 provided in at least one embodiment of the present disclosure, which will not be repeated here.
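
The exchange between the terminal and the question recommendation server can be sketched as two functions: the server ranks and truncates to the first N questions, and the terminal formats them for display. The request and response dictionaries, the function names, and the `recommend` callable are hypothetical; transport between the two sides is omitted.

```python
def handle_request(request, recommend, n=3):
    """Server side: rank questions for the requesting user and return the first n."""
    if n < 1:
        raise ValueError("N must be an integer greater than or equal to 1")
    question_sequence = recommend(request["user_id"])
    return {"user_id": request["user_id"], "questions": question_sequence[:n]}


def render(response):
    """Terminal side: format the returned questions as a numbered list."""
    return "\n".join(f"{i}. {q}" for i, q in enumerate(response["questions"], 1))


response = handle_request(
    {"user_id": "user-1"},
    lambda user_id: ["hypertension diet", "hypertension exercise", "blood pressure checks"],
    n=2,
)
print(render(response))
```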
  • the terminal 710 included in the question recommendation system 70 may be implemented as a client (such as a mobile phone, a computer, etc.), and the question recommendation server 720 may be implemented as a server side (such as a server).
  • the question recommendation system 70 may also include a knowledge base server 730 storing a data knowledge base.
  • the knowledge base server 730 is in signal connection with the question recommendation server 720, and is configured to respond to the request information of the question recommendation server 720 and return data corresponding to the request information in the data knowledge base to the question recommendation server 720.
  • the data in the data knowledge base can be stored directly on the question recommendation server 720 or in another separately provided storage device; alternatively, the question recommendation server 720 can establish the data knowledge base itself and then store it on the question recommendation server 720 or in another separately provided storage device, which is not specifically limited in the embodiments of the present disclosure.
  • the question recommendation system 70 provided by at least one embodiment of the present disclosure can implement the question recommendation methods 10 and 50 provided in the foregoing embodiments and achieve technical effects similar to those of the methods, which will not be repeated here.
  • FIG. 8 is a schematic diagram of an electronic device provided by at least one embodiment of the present disclosure.
  • the electronic device 80 includes a processor 810 and a memory 820.
  • the memory 820 includes one or more computer program modules 821.
  • One or more computer program modules 821 are stored in the memory 820 and configured to be executed by the processor 810.
  • the one or more computer program modules 821 include instructions for executing any question recommendation method provided by at least one embodiment of the present disclosure.
  • when executed by the processor 810, the instructions can carry out one or more steps of the question recommendation method provided by at least one embodiment of the present disclosure.
  • the memory 820 and the processor 810 may be interconnected by a bus system and/or other forms of connection mechanisms (not shown).
  • the memory 820 and the processor 810 may be set on the server side (or the cloud), for example, in the aforementioned question recommendation server 720, for executing one or more steps of the question recommendation methods described in FIG. 1A, FIG. 1B, and FIG. 5A.
  • the processor 810 may be a central processing unit (CPU), a digital signal processor (DSP), or other forms of processing units with data processing capabilities and/or program execution capabilities, such as field programmable gate arrays (FPGA), etc.;
  • the central processing unit (CPU) can be an X86 or ARM architecture.
  • the processor 810 may be a general-purpose processor or a special-purpose processor, and may control other components in the electronic device 80 to perform desired functions.
  • the memory 820 may include any combination of one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • Volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
  • Non-volatile memory may include, for example, read only memory (ROM), hard disk, erasable programmable read only memory (EPROM), portable compact disk read only memory (CD-ROM), USB memory, flash memory, etc.
  • One or more computer program modules 821 may be stored on the computer-readable storage medium, and the processor 810 may run one or more computer program modules 821 to implement various functions of the electronic device 80.
  • the computer-readable storage medium may also store various application programs and various data, such as data used and/or generated by the application programs.
  • FIG. 9 is a schematic block diagram of a terminal provided by at least one embodiment of the present disclosure.
  • the terminal is the display terminal 900, which can be applied to the question recommendation method provided in the embodiment of the present disclosure, for example.
  • the display terminal 900 may provide user access logs reflecting user behavior data (for example, access logs recorded through cookies and the like by applications such as a browser running in the system), and display at least one candidate question recommended to the user.
  • the display terminal 900 may include a processing device (such as a central processing unit, a graphics processor, etc.) 910, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 920 or a program loaded from a storage device 980 into a random access memory (RAM) 930.
  • the RAM 930 also stores various programs and data required for the operation of the display terminal 900.
  • the processing device 910, the ROM 920, and the RAM 930 are connected to each other through a bus 940.
  • An input/output (I/O) interface 950 is also connected to the bus 940.
  • the following devices can be connected to the I/O interface 950: input devices 960 such as touch screens, touch pads, keyboards, mice, cameras, microphones, accelerometers, and gyroscopes; output devices 970 such as liquid crystal displays (LCD), speakers, and vibrators; storage devices 980 such as magnetic tapes and hard disks; and a communication device 990.
  • the communication device 990 may allow the display terminal 900 to perform wireless or wired communication with other electronic devices to exchange data.
  • although FIG. 9 shows the display terminal 900 having various devices, it should be understood that it is not required to implement or have all of the illustrated devices; the display terminal 900 may alternatively implement or have more or fewer devices.
  • FIG. 10 is a schematic block diagram of a non-transitory readable storage medium 100 provided by at least one embodiment of the present disclosure.
  • the non-transitory readable storage medium 100 includes computer program instructions 111 stored thereon.
  • when the computer program instructions 111 are executed by a processor, one or more steps of the question recommendation method 10 or 50 provided in at least one embodiment of the present disclosure are executed.
  • the storage medium may be any combination of one or more computer-readable storage media.
  • for example, one computer-readable storage medium contains computer-readable program code for obtaining a user's candidate question set; another computer-readable storage medium contains computer-readable program code for obtaining user behavior data and obtaining user interest parameters based on the user behavior data.
  • another computer-readable storage medium contains computer-readable program code for obtaining, based on the user interest parameters and the multiple candidate questions, at least one similarity feature between each of the multiple candidate questions and the user interest parameters; and yet another computer-readable storage medium contains computer-readable program code for sorting the candidate questions to obtain the question sequence.
  • each of the above-mentioned program codes can also be stored in the same computer-readable medium, which is not limited in the embodiments of the present disclosure.
  • when the program code is read by a computer, the computer can execute the program code stored in the computer storage medium and perform, for example, the question recommendation method provided by any embodiment of the present disclosure.
  • the storage medium may include a memory card of a smart phone, a storage component of a tablet computer, a hard disk of a personal computer, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disk read-only memory (CD-ROM), flash memory, or any combination of the foregoing storage media, and may also be another suitable storage medium.
  • the readable storage medium may also be the memory 820 in FIG. 8, and the relevant description may refer to the foregoing content, which will not be repeated here.
  • the storage medium 100 can be applied to the question recommendation server 720, and a person skilled in the art can make a selection according to the specific scenario, which is not limited here.
  • FIG. 11 shows an exemplary scene diagram of a question recommendation system provided by at least one embodiment of the present disclosure.
  • the question recommendation system 300 may include a user terminal 310, a network 320, a server 330, and a database 340.
  • the user terminal 310 may be the computer 310-1 or the portable terminal 310-2 shown in FIG. 11. It is understandable that the user terminal may also be any other type of electronic device capable of receiving, processing, and displaying data, including but not limited to a desktop computer, a notebook computer, a tablet computer, a smart home device, a wearable device, in-vehicle electronic equipment, medical electronic equipment, etc.
  • the network 320 may be a single network, or a combination of at least two different networks.
  • the network 320 may include, but is not limited to, one or a combination of several of a local area network, a wide area network, a public network, a private network, the Internet, and a mobile communication network.
  • the server 330 may be a single server or a server group, and each server in the server group is connected through a wired network or a wireless network.
  • the wired network may, for example, use twisted pair, coaxial cable, or optical fiber transmission for communication
  • the wireless network may use, for example, a 3G/4G/5G mobile communication network, Bluetooth, Zigbee, or WiFi.
  • the present disclosure does not limit the types and functions of the network here.
  • the server group may be centralized, such as a data center, or distributed.
  • the server can be local or remote.
  • the server 330 may be a general-purpose server or a dedicated server, and may be a virtual server or a cloud server.
  • the database 340 may be used to store various data used, generated, and output during the operation of the user terminal 310 and the server 330.
  • the database 340 may be connected or communicated with the server 330 or a part of the server 330 via the network 320, or directly connected or communicated with the server 330, or may be connected or communicated with the server 330 through a combination of the above two methods.
  • the database 340 may be a stand-alone device.
  • the database 340 may also be integrated in at least one of the user terminal 310 and the server 330.
  • the database 340 may be set on the user terminal 310 or on the server 330.
  • the database 340 may also be distributed, with one part set on the user terminal 310 and the other part set on the server 330.
  • the user terminal 310 may send the request data to the server 330 via the network 320 or other technologies (for example, Bluetooth communication, infrared communication, etc.).
  • the server 330 obtains the user's candidate question set in response to the request data, where the candidate question set includes multiple candidate questions.
  • the server 330 obtains user behavior data, and obtains user interest parameters based on the user behavior data.
  • the user behavior data is transmitted from the user terminal 310 to the server 330 via the network 320.
  • the server 330 obtains at least one similarity feature between each candidate question of the multiple candidate questions and the user interest parameter based on the user interest parameter and the multiple candidate questions.
  • the server 330 sorts the multiple candidate questions based on the user's basic information, the multiple candidate questions, and the at least one similarity feature to obtain a question sequence, and then sends the top N candidate questions in the question sequence to the user terminal 310 via the network 320 or other technologies (for example, Bluetooth communication, infrared communication, etc.).
  • finally, the user terminal 310 displays the top N candidate questions received from the server 330.
  • the term “plurality” refers to two or more than two unless specifically defined otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

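The present disclosure provides a question recommendation method, device, and system, an electronic device, and a non-transitory readable storage medium. The question recommendation method includes: obtaining a candidate question set of a user, which includes multiple candidate questions; obtaining user behavior data, and obtaining user interest parameters based on the user behavior data; obtaining, based on the user interest parameters and the multiple candidate questions, at least one similarity feature between each of the multiple candidate questions and the user interest parameters; sorting the multiple candidate questions based on basic user information, the multiple candidate questions, and the at least one similarity feature to obtain a question sequence; and recommending at least one candidate question in the question sequence to the user based on the order of the question sequence.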

Description

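Question recommendation method and device, system, electronic device, and readable storage medium
Technical Field
Embodiments of the present disclosure relate to a question recommendation method, a question recommendation device, a question recommendation system, an electronic device, and a non-transitory readable storage medium.
Background
The rapid development of Internet technology has brought many conveniences to people's lives. People can search for and browse information of interest over the Internet, and can also use it for online consultation, online medical inquiry, and the like. Over time, the Internet has accumulated a massive amount of content. Traditional search engine web pages return many results, which may include much duplicated and irrelevant content, making it difficult for users to find, within a short time, the information they are interested in or the questions they want to consult about.
Summary
At least one embodiment of the present disclosure provides a question recommendation method, including: obtaining a candidate question set of a user, where the candidate question set includes multiple candidate questions; obtaining user behavior data, and obtaining user interest parameters based on the user behavior data; obtaining, based on the user interest parameters and the multiple candidate questions, at least one similarity feature between each of the multiple candidate questions and the user interest parameters; sorting the multiple candidate questions based on basic user information, the multiple candidate questions, and the at least one similarity feature to obtain a question sequence; and recommending at least one candidate question in the question sequence to the user based on the order of the question sequence.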
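For example, in a question recommendation method provided by at least one embodiment of the present disclosure, sorting the multiple candidate questions based on the basic user information, the multiple candidate questions, and the at least one similarity feature to obtain a question sequence includes: adopting a ranking model, composing the basic user information, the multiple candidate questions, and the at least one similarity feature into the input feature vector of the ranking model, obtaining a score corresponding to each of the multiple candidate questions, and sorting the corresponding candidate questions by score to obtain the question sequence.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, obtaining, based on the user interest parameters and the multiple candidate questions, at least one similarity feature between each of the multiple candidate questions and the user interest parameters includes: adopting at least one similarity matching model to obtain, based on the user interest parameters and the multiple candidate questions, at least one similarity feature between each candidate question and the user interest parameters.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, the at least one similarity matching model includes at least one of: a cosine similarity model, a Jaccard similarity model, an edit distance similarity model, a word mover's distance similarity model, and a deep semantic matching similarity model.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, the ranking model includes a Wide&Deep model.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, obtaining the candidate question set of the user includes: accessing a data knowledge base, where the data knowledge base includes multiple knowledge question sets; obtaining basic user information, and establishing a user tag set based on the basic user information; and associating the user tag set with the data knowledge base to obtain the candidate question set from the multiple knowledge question sets.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, the user tag set includes a multi-level tag set, the multi-level tag set includes tags of multiple levels, and tags of different levels are of different types.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, the question recommendation method is used to recommend questions about a disease; the first-level tag of the multi-level tag set is the age group, the second-level tag is the time period, the third-level tag is the disease type, and the fourth-level tag is the complication.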
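
The four-level tag set described above can be represented as a small record type. In this sketch the age thresholds, the field names, and the input dictionary keys are assumptions for illustration only; the disclosure specifies the four levels but not these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserTags:
    age_group: str      # level 1: age group
    time_period: str    # level 2: time period
    disease_type: str   # level 3: disease type
    complication: str   # level 4: complication


def age_group(age):
    """Map an age to a coarse group (the thresholds are hypothetical)."""
    if age < 18:
        return "minor"
    if age < 60:
        return "adult"
    return "senior"


def build_user_tags(info):
    """Build the multi-level tag set from basic user information."""
    return UserTags(
        age_group=age_group(info["age"]),
        time_period=info["time_period"],
        disease_type=info["disease_type"],
        complication=info.get("complication", "none"),
    )


# The field values below are made-up sample inputs.
print(build_user_tags({"age": 65, "time_period": "morning", "disease_type": "hypertension"}))
```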
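For example, in a question recommendation method provided by at least one embodiment of the present disclosure, each of the multiple knowledge question sets includes: a standard question, a standard answer corresponding to the standard question, and extended questions corresponding to the standard question.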
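
A knowledge question set with its three parts can be modeled as a simple data class. The class name, the field names, and the sample content (including the placeholder answer text) are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeQuestionSet:
    standard_question: str
    standard_answer: str
    extended_questions: list = field(default_factory=list)

    def all_questions(self):
        """Return the standard question followed by its extended questions."""
        return [self.standard_question, *self.extended_questions]


entry = KnowledgeQuestionSet(
    standard_question="What are the symptoms of high blood pressure?",
    standard_answer="(answer text stored in the knowledge base)",
    extended_questions=["How do I know if I have hypertension?"],
)
print(entry.all_questions())
```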
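For example, in a question recommendation method provided by at least one embodiment of the present disclosure, obtaining the candidate question set of the user further includes: establishing the data knowledge base.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, establishing the data knowledge base includes: capturing a data set from a network, and classifying the data set by intent to form the multiple knowledge question sets, thereby establishing the data knowledge base.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, the question recommendation method is used to recommend questions about a disease, and the data set comes from at least one of: doctor-patient consultation data sets, hot questions associated with the disease, and reward questions associated with the disease.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, associating the user tag set with the data knowledge base to obtain the candidate question set from the multiple knowledge question sets includes: establishing a mapping relationship between the user tag set and the standard questions in the data knowledge base, and matching the user tag set to the standard questions in the data knowledge base, where the knowledge question sets corresponding to the matched standard questions constitute the candidate question set.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, obtaining the candidate question set of the user includes: retrieving a pre-stored candidate question set of the user.
For example, in a question recommendation method provided by at least one embodiment of the present disclosure, obtaining the user interest parameters based on the user behavior data includes: analyzing the user behavior data, and converting questions the user has clicked or words and sentences the user is interested in into the user interest parameters.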
本公开至少一个实施例还提供了一种问题推荐装置,包括:集合获取电路,被配置为获取用户的候选问题集合,其中,所述候选问题集合包括多个候选问题;行为分析电路,被配置为获取用户行为数据,并基于所述用户行为数据得到用户兴趣参数;特征生成电路,被配置为基于所述用户兴趣参数和所述多个候选问题,得到所述多个候选问题中的每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征;问题排序电路,被配置为基于用户基本信息、所述多个候选问题和所述至少一个相似度特征,对所述多个候选问题进行 排序以得到问题序列,以及推荐电路,被配置为基于所述问题序列的顺序,将所述问题序列中的至少一个候选问题推荐给用户。
例如,在本公开至少一个实施例提供的问题推荐装置中,所述问题排序电路包括:问题排序子电路,被配置为:采用排序模型,将所述用户基本信息、所述多个候选问题和所述至少一个相似度特征组成所述排序模型的输入特征向量,得到所述多个候选问题中每一个候选问题对应的分数,按照所述分数的大小对对应的多个候选问题进行排序,以得到所述问题序列。
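For example, in the question recommendation apparatus provided by at least one embodiment of the present disclosure, the set acquisition circuit includes: a knowledge base access circuit configured to access a data knowledge base, wherein the data knowledge base includes a plurality of knowledge question sets; an information acquisition circuit configured to acquire basic user information and to build a user tag set based on the basic user information; and a candidate set generation circuit configured to associate the user tag set with the data knowledge base and to obtain the candidate question set from the plurality of knowledge question sets, wherein the candidate question set includes a plurality of candidate questions.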
例如,在本公开至少一个实施例提供的问题推荐装置中,所述集合获取电路还包括:知识库建立电路,被配置为从网络抓取数据集合,并对所述数据集合按照意图分类,以形成所述多个知识问题集,建立所述数据知识库。
例如,在本公开至少一个实施例提供的问题推荐装置中,所述候选集生成电路包括:候选集生成子电路,被配置为建立所述用户标签集合与所述数据知识库的标准问题之间的映射关系,将所述用户标签集合匹配所述数据知识库中的标准问题,与所匹配到的标准问题相对应的知识问题集组成所述候选问题集合。
例如,在本公开至少一个实施例提供的问题推荐装置中,所述行为分析电路包括:行为分析子电路,被配置为分析所述用户行为数据,将用户点击过的问题或者用户感兴趣的词句转化为所述用户兴趣参数。
例如,在本公开至少一个实施例提供的问题推荐装置中,特征生成电路包括:特征生成子电路,被配置为采用至少一个相似度匹配模型,基于所述用户兴趣参数和所述多个候选问题,得到所述每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征。
本公开至少一个实施例还提供了一种问题推荐系统,包括终端和问题推荐服务器。所述终端被配置为将请求数据发送给所述问题推荐服务器;所述问题推荐服务器被配置为:响应于所述请求数据:获取用户的候选问题集合,其中,所述候选问题集合包括多个候选问题;获取用户行为数据,并基于所述用户行为数据得到用户兴趣参数;基于所述用户兴趣参数和所述多个候选问题,得到所述多个候选问题中的每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征;以及基于用户基本信息、所述多个候选问题和所述至少一个相似度特征,对所述多个候选问题进行排序以得到问题序列;所述终端还被配置为,显示所述问题序列中的前N个候选问题,其中,N为大于或等于1的整数。
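At least one embodiment of the present disclosure further provides an electronic device, including a processor and a memory, the memory including one or more computer program modules, wherein the one or more computer program modules are stored in the memory and configured to be executed by the processor, and the one or more computer program modules include instructions for performing the question recommendation method of any of the above embodiments.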
本公开的至少一个实施例还提供了一种非瞬时可读存储介质,其上存储有计算机指令,其中,所述计算机指令被处理器执行时执行上述任一实施例所述的问题推荐方法。
附图说明
为了更清楚地说明本公开实施例的技术方案,下面将对实施例的附图作简单地介绍,显而易见地,下面描述的附图仅仅涉及本公开的一些实施例,而非对本公开的限制。
图1A为本公开至少一实施例提供的一种问题推荐方法的示例流程图;
图1B为本公开至少一实施例提供的另一种问题推荐方法的示例流程图;
图2A示出了根据本公开至少一实施例的某一平台的用户界面;
图2B示出了根据本公开至少一实施例的建立用户标签集合的示意图;
图3A示出了根据本公开至少一实施例的用户标签集合与数据知识库之间的一种规则方案的示意图;
图3B示出了根据本公开至少一实施例的某一平台的另一用户界面;
图4A为本公开至少一实施例提供的Wide&Deep模型的结构示意图;
图4B示出了根据本公开至少一实施例的某一平台的再一用户界面;
图4C示出了根据本公开至少一实施例的某一平台的又一用户界面;
图5A是本公开至少一实施例提供的另一种问题推荐方法的示例性流程图;
图5B是本公开至少一实施例提供的如图5A中的问题推荐方法的示意框图;
图6是本公开至少一实施例提供的一种问题推荐装置的示意框图;
图7是本公开至少一实施例提供的一种问题推荐系统的示意框图;
图8是本公开至少一实施例提供的一种电子设备的示意框图;
图9为本公开至少一实施例提供的一种终端的示意框图;
图10为本公开至少一实施例提供的一种非瞬时可读存储介质的示意框图;以及
图11示出了本公开至少一个实施例提供的问题推荐系统的示例性的场景图。
具体实施方式
为使本公开实施例的目的、技术方案和优点更加清楚,下面将结合附图,对本公开实施例的技术方案进行清楚、完整地描述。显然,所描述的实施例是本公开的一部分实施例,而不是全部的实施例。基于所描述的本公开的实施例,本领域普通技术人员在无需创造性劳动的前提下所获得的所有其他实施例,都属于本公开保护的范围。
除非另外定义,本公开使用的技术术语或者科学术语应当为本公开所属领域内具有一般技能的人士所理解的通常意义。本公开中使用的“第一”、“第二”以及类似的词语并不表示任何顺序、数量或者重要性,而只是用来区分不同的组成部分。同样,“一个”、“一”或者“该”等类似词语也不表示数量限制,而是表示存在至少一个。“包括”或者“包含”等类似的词语意指出现该词前面的元件或者物件涵盖出现在该词后面列举的元件或者物件及其 等同,而不排除其他元件或者物件。“连接”或者“相连”等类似的词语并非限定于物理的或者机械的连接,而是可以包括电性的连接,不管是直接的还是间接的。“上”、“下”、“左”、“右”等仅用于表示相对位置关系,当被描述对象的绝对位置改变后,则该相对位置关系也可能相应地改变。
随着经济发展和生活水平的提高,越来越多的人患有不同程度的慢性病。慢性病是指不构成传染、具有长期积累形成疾病形态损害的疾病的总称。常见的慢性病主要有心脑血管疾病、癌症、糖尿病、慢性呼吸系统疾病等,心脑血管疾病包含高血压、脑卒中和冠心病等。资料显示,慢性病的诱因之一是不健康的生活方式。例如,不健康的生活方式包括饮食不合理、运动不足、使用烟草以及过量使用酒精等。因此,针对慢性病患者,医生除了需要针对患者的病情对患者进行医疗治疗(例如,通过药物进行治疗)以外,还需要提供给患者合理化的应对慢性病的建议(例如,饮食建议、运动建议等)作为配合医疗治疗的辅助手段,以更好的控制和预防慢性病。随着网络技术的发展,智慧医疗提供了更多的便利,尤其是对于慢性病防控和自检相关的健康知识,可以提高患者对疾病的认知水平,进行更有效地防治。因此,面向慢性病患者健康教育的智能问答系统可以高效准确地解答患者的疑惑与问题,同时在当前国内医疗资源较为缺乏的条件下,还可以减轻医务工作者的负担,有利于长期健康管理。
本公开的发明人注意到,目前,医疗问答系统或者在线问诊依赖于患者描述的病情与患者提出的问题,通常会出现由于患者表达不够清晰而导致系统反馈的答案与患者信息不够贴切或者顺序不匹配的问题。此外,一般医疗问答系统并未利用用户信息来提供有针对性的健康知识,难以提供个性化和多样化的服务。
本公开的至少一个实施例提供了一种问题推荐方法、问题推荐装置、问题推荐系统、电子设备和非瞬时可读存储介质。该问题推荐方法包括:获取用户的候选问题集合,候选问题集合包括多个候选问题;获取用户行为数据,并基于用户行为数据得到用户兴趣参数;基于用户兴趣参数和多个候选问题,得到多个候选问题中的每一个候选问题与用户兴趣参数之间的至少一个相似度特征;基于用户基本信息、多个候选问题和至少一个相似度特征,对多个候选问题进行排序以得到问题序列,以及基于问题序列的顺序,将问题序列中的至少一个候选问题推荐给用户。
本公开的至少一个实施例提供的问题推荐方法不仅有效避免了由于患者表达不清晰而可能出现的反馈答案不贴切的问题,而且可以针对个体因素、自身特征等(例如,用户基本信息、用户的点击行为、浏览行为等)进行个性化问题推荐,使得用户能更有针对性地掌握与自身健康相关的知识。至少另一些实施例中,该问题推荐方法还可以通过采用排序模型使得问题推荐具有相关性、个性化和多样化,同时又注重了最终的反馈顺序,达到了更贴合用户需求的效果,有效提升了用户体验。
下面通过几个示例或实施例对根据本公开的至少一个实施例提供的问题推荐方法进行非限制性的说明,如下面所描述的,在不相互抵触的情况下这些具体示例或实施例中不同特征可以相互组合,从而得到新的示例或实施例,这些新的示例或实施例也都属于本公 开保护的范围。
图1A为本公开至少一实施例提供的一种问题推荐方法的示例流程图;图1B为本公开至少一实施例提供的另一种问题推荐方法的示例流程图。
本公开至少一实施例提供的问题推荐方法10可以应用于医疗智能问答、在线问诊、健康咨询等场景中,例如,可以应用于慢性病健康知识智能问答系统。例如,在一个实施例中,如图1A所示,该问题推荐方法10可以包括以下操作:
步骤S100:获取用户的候选问题集合,其中,候选问题集合包括多个候选问题;
步骤S140:获取用户行为数据,并基于用户行为数据得到用户兴趣参数;
步骤S150:基于用户兴趣参数和多个候选问题,得到多个候选问题中的每一个候选问题与用户兴趣参数之间的至少一个相似度特征;
步骤S160:基于用户基本信息、多个候选问题和至少一个相似度特征,对多个候选问题进行排序以得到问题序列;以及
步骤S170:基于问题序列的顺序,将问题序列中的至少一个候选问题推荐给用户。
例如,在一个实施例中,如图1B所示,该问题推荐方法10中的步骤S100可以具体包括步骤S110-S130,因此该问题推荐方法10可以具体包括步骤S110-S170。
下面结合图1A和图1B来详细说明本公开的实施例提供的问题推荐方法10,例如,在本公开一实施例中,该问题推荐方法10具体可以包括步骤S110-步骤S170,如图1B所示。例如,步骤S110-步骤S170可以顺序执行,也可以按调整后的其他次序执行,例如,在一个示例中,可以先执行步骤S110再执行步骤S120,在另一个示例中,可以先执行步骤S120再执行步骤S110,又例如,步骤S110-步骤S160中的部分或全部操作也可以并行执行,例如,步骤S110和步骤S120可以并行执行,本公开的实施例对各个步骤的执行顺序不作限制,可以根据实际情况调整。例如,步骤S110-步骤S160可以由服务器或本地端实现,本公开的实施例对此不做限制。例如,在一些示例中,实施本公开至少一实施例提供的问题推荐方法10可以选择地执行步骤S110-S170中的部分步骤,也可以执行除了步骤S110-S170以外的一些附加步骤,本公开的实施例对此不做具体限制。
下面以示例的方式结合附图详细说明本公开所提供的问题推荐方法10。
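Step S110: access a data knowledge base, wherein the data knowledge base includes a plurality of knowledge question sets. For example, when the question recommendation method 10 is applied to an intelligent question-answering system for chronic-disease health knowledge, the data knowledge base may be a data knowledge base associated with chronic diseases (for example, diabetes, hypertension, etc.) that contains complete basic health knowledge related to those diseases. The data knowledge base includes a plurality of knowledge question sets. For example, the basic health knowledge in the data knowledge base is classified by intent (for example, diet, exercise, medication, examination, complications, surgery, treatment, symptoms, etc.), and under each intent the standard questions, together with their corresponding extended questions and standard answers, are curated (for example, manually) to form one knowledge question set; these sets together constitute the data knowledge base.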
例如,本公开的至少一个实施例中,多个知识问题集中的每一个可以包括标准问题、与该标准问题对应的标准答案,以及与该标准问题对应的扩展问题。例如,在一个示例中,用户为糖尿病患者,在“饮食”意图下,整理后的标准问题可以包括“糖尿病患者宜吃哪 些食物?”,与该标准问题对应的扩展问题可以包括,例如,“糖尿病吃哪些食物好?”、“糖尿病适合吃什么食物?”、“糖尿病应该吃哪些食物降血糖?”等。在这种情况下,在上述“饮食”意图下的知识问题集则可以包括前述标准问题、扩展问题以及它们对应的标准答案。
需要说明的是,在本公开的实施例中列举的标准问题、扩展问题仅仅是示意性的,可以根据应用场景和医疗实践等进行调整和更新。此外,类似地,标准答案也可以根据应用场景和医疗实践等进行调整和更新,也可以根据所采用的语言(例如汉语、英语)而进行调整和更新,本公开的实施例对此不作具体限制。
例如,在本公开的实施例中,数据知识库可以是已经建立好的,该已经建立好的知识数据库可以预先存储在本地或服务器中,也可以是在实施该问题推荐方法10时例如由服务器建立的,还可以是从其他设备读取的,本公开的实施例对此不作具体限制,可以根据实际需求来设置。关于建立数据知识库的详细描述将在下文说明。
步骤S120:获取用户基本信息,并基于用户基本信息建立用户标签集合。
例如,在本公开的至少一实施例中,用户基本信息可以例如包括用户的年龄、性别、身高、体重、腰围以及生活习惯等。例如,在一个示例中,若用户是糖尿病患者,则用户基本信息还包括糖尿病类型、已确诊的慢性并发症、已存在的症状等。例如,另一个示例中,若用户不是糖尿病患者,则用户基本信息还包括用户病史、空腹血糖值或餐后两小时血糖值等。例如,在又一个示例中,若用户是高血压患者,则用户基本信息还包括用户舒张压、收缩压值、高血压合并症类型以及症状等。本公开的实施例对用户基本信息所包含的内容不做具体限制,可以根据实际需求来设置。
例如,在一个示例中,用户基本信息可以来自用户在线实时信息,例如在获得许可的情况下可以来自线上的健康档案平台、医疗机构(例如,医院或体检机构)的信息管理系统(例如,实验室信息管理系统)、体征检查报告电子化装置等。
例如,在一个示例中,当用户是某一软件平台(例如,健康管理平台)的注册用户时,用户在注册时主动填写并保存了自己的基本信息(例如,姓名、性别、年龄、身高、体重、腰围、生活习惯等),则可以直接从与该特定平台相关联的存储库(例如,后台)中获取用户的基本信息。例如,在另一个示例中,当用户不是该特定平台(健康管理平台)的注册用户时或者用户未在该特定平台中完善或保存自己的基本信息,则可以通过第三方平台(例如,某医院或体检机构的信息管理系统等)或者相关电子装置(例如,手环、智能手表等)等来采集用户的基本信息,本公开的实施例对此不作具体限制,可以根据实际情况进行调整。例如,在一个示例中,当用户首次使用某网页或某平台时,也可以根据用户在文本框中输入的个人信息来获取用户的基本信息,本公开的实施例对此不作具体限制,可以根据实际情况进行调整。
图2A示出了根据本公开实施例的某一平台的用户界面。如图2A所示,用户可以根据自身情况填写基本信息(例如,包括姓名、性别、年龄、身高、体重、腰围),然后对照用户界面上提供的具体生活习惯选项(例如,包括经常饮酒(每周超过三次)、吸烟、饮 食偏咸、喜食油炸食品、喜食甜食、经常熬夜(平均入睡时间晚于12点)很少运动等),勾选自己符合的生活习惯,并将这些用户基本信息保存到与该健康管理平台相关联的数据库(例如,后台)中。
例如,健康管理平台可以与至少一个(例如,多个)医疗机构对接,并从至少一个(例如,多个)医疗机构获取在这些医疗机构参与体征检查的患者的基本信息和检查结果。例如,健康管理平台还可以从智能终端(例如,智能测量仪、智能手环、智能手表、智能衣物等)获取用户的基本信息,以及智能终端中的传感器检测到的患者的至少一条的体征数据(例如,脉搏、体温、心率、呼吸、脑电、心电、血压、血糖、肌电等)。例如,健康管理平台可以定期(例如,每天)从多个医疗机构获取在这些医疗机构参与体征检查的患者的检查结果,并预先存储在与该健康管理平台关联的数据库(或存储器)中。需要说明的是,本公开的实施例对用户基本信息的来源不作具体限制,可以根据实际需求来设置。例如,在其他一些示例中,用户基本信息也可以由用户在线填写,例如可以提供相应的网页,用户在该网页中填写自身的信息,该网页将用户填写的信息发送至服务器,服务器整理这些信息后得到用户基本信息。
例如,在一个示例中,根据上述内容获取到用户基本信息以后,基于获取到的用户基本信息,建立标签集合。
例如,用户的标签可以包括:用户性别、名称、年龄等。例如,在一个示例中,标签还包括疾病名称,诸如I型糖尿病等,例如,在另一个示例中,标签还包括并发症名称,诸如糖尿病足等。
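The process of building a tag set from the acquired basic user information is described below with reference to Fig. 2B. Fig. 2B is a schematic diagram of building a user tag set according to at least one embodiment of the present disclosure. As shown in Fig. 2B, the dashed box on the left is the user profile, that is, the basic user information. This basic user information includes: "the user is diabetic patient A", "height 172 cm", "weight 81 kg", "male", "68 years old", "memory decline", "long-term ulcers of the lower limbs", "thickened and enlarged toe tips", "protruding big-toe bone", "frequent drinking and smoking", and "little exercise". Tags are then obtained, for example, by entity recognition, which may in turn use methods such as conditional random fields or deep learning. The dashed box on the right shows the tags obtained for this user, that is, the user tag set includes: "blood glucose", "elderly", "overweight", "complication", "recipes", "neuropathy" and "diabetic foot".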
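The tag-building step of Fig. 2B can be sketched as follows. This is a minimal rule-based stand-in for the entity-recognition methods named above, not the method of the disclosure: the names `UserProfile`, `build_tag_set` and `SYMPTOM_TAGS`, the BMI cutoff of 24 and the symptom-to-complication table are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical symptom -> complication tag table (illustrative only).
SYMPTOM_TAGS = {
    "memory decline": "neuropathy",
    "long-term lower-limb ulcers": "diabetic foot",
    "thickened toe tips": "diabetic foot",
}


@dataclass
class UserProfile:
    age: int
    height_cm: float
    weight_kg: float
    disease: str
    symptoms: list = field(default_factory=list)


def build_tag_set(profile: UserProfile) -> set:
    """Derive a flat tag set from basic user information."""
    tags = set()
    if profile.disease == "diabetes":
        tags.add("blood glucose")
    if profile.age >= 60:
        tags.add("elderly")
    bmi = profile.weight_kg / (profile.height_cm / 100) ** 2
    if bmi >= 24:  # assumed overweight cutoff, not taken from the source
        tags.add("overweight")
    for symptom in profile.symptoms:
        if symptom in SYMPTOM_TAGS:
            tags.add("complication")
            tags.add(SYMPTOM_TAGS[symptom])
    return tags
```

A production system would more likely replace the lookup table with a trained entity recognizer; the sketch only shows the shape of the input and output.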
需要说明的是,本公开的上述实施例的描述中所列举的具体标签仅仅是示意性的,而不是限制性的。
例如,在本公开至少一实施例中,用户标签集合可以包括多级标签集合,多级标签集合包括多级标签,不同级标签的类型不同。
例如,在本公开至少一实施例中,可以根据不同的年龄、时间段、疾病类型、有无症状、有无并发症等建立了近两百种基础规则,再发散到不同的糖尿病类型、多种症状组合以及多种并发症类型组合,因此,多级标签集合可以涵盖用户的基本信息与兴趣导向。
例如,在一个示例中,多级标签集合的第一级标签为年龄段,第二级标签为时间段, 第三级标签为疾病类型,第四级标签为并发症。
需要说明的是,本公开的实施例对用户标签集合的具体规则不作限制,可以根据实际需求进行设置。
步骤S130:将用户标签集合与数据知识库相关联,从多个知识问题集得到候选问题集合,其中,候选问题集合包括多个候选问题。
例如,在本公开至少一实施例中,可以将在步骤S120中建立的用户标签集合与在步骤S110访问的数据知识库进行匹配,根据规则关联得到问题候选集合。
例如,在本公开的至少一实施例中,在步骤S130中,将用户标签集合与数据知识库相关联,从多个知识问题集得到候选问题集合,可以包括:首先,建立用户标签集合与数据知识库中的标准问题之间的映射关系,将用户标签集合匹配数据知识库中的标准问题,然后,与所匹配到的标准问题相对应的知识问题集组成候选问题集合。
图3A示出了根据本公开至少一实施例的用户标签集合与数据知识库之间的一种规则方案的示意图。下面结合图3A,以糖尿病基本规则方案为例详细说明步骤S130中的操作。
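As shown in Fig. 3A, in at least one embodiment of the present disclosure, the age group serves as the first-level tag and is the first to be rule-associated directly with the user (for example, a diabetic patient). For example, the first-level tag may associate the user with one of three classes: minors under 18 years of age (including newborns and children), adults aged 18 to 59, and middle-aged and elderly people aged 60 and above.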
对于不同年龄段,结合第二级标签(即一天中的不同时间段)、第三级标签(糖尿病类型),分别有相应的不同规则方案。
例如,各级标签具体执行时没有先后顺序的限制,例如,年龄、时间、糖尿病类型是单一的选择,在前边分类会更方便一些,并发症类型等会多选,较复杂一些,放在了后边的位置。
例如,在本公开至少一实施例中,对于1型、2型儿童糖尿病患者,在早餐时段(例如,6:00-8:00),他们可能会咨询食谱与营养建议、以及早餐后2h血糖注意事项等相关问题。例如,在一个示例中,对于1型、2型儿童糖尿病患者,在早餐时段(例如,6:00-8:00),其对应的标签包括“1型、2型糖尿病”、“儿童”和“6:00-8:00”,其对应的多个候选问题可以包括:“儿童糖尿病患者早餐应该吃什么?”、“1型、2型儿童糖尿病患者早餐后有哪些注意事项?”“糖尿病患者测量血糖用什么仪器”等。对于1型、2型儿童糖尿病患者,在午餐时段(例如,11:00-15:00),他们可能会咨询食谱与营养建议、运动建议和注意事项等相关问题。例如,在一个示例中,对于1型、2型儿童糖尿病患者,在午餐时段(例如,11:00-15:00),其对应的标签包括“1型、2型糖尿病”、“儿童”和“11:00-15:00”,其对应的多个候选问题可以包括:“儿童糖尿病患者午餐应该吃什么?”、“1型、2型儿童糖尿病患者午餐后有哪些注意事项?”、“1型、2型儿童糖尿病患者午餐后适合做什么运动?”等。对于1型、2型儿童糖尿病患者,在晚餐时段(例如,17:00-20:00),他们可能会咨询食谱与营养建议、运动建议和注意事项、用药相关事项等相关问题。例如,在一个示例中,对于1型、2型儿童糖尿病患者,在晚餐时段(例如, 17:00-20:00),其对应的标签包括“1型、2型糖尿病”、“儿童”和“17:00-20:00”,其对应的多个候选问题可以包括:“1型、2型儿童糖尿病患者晚餐应该吃什么?”、“糖尿病患者晚餐后有哪些注意事项?”、“儿童糖尿病患者晚餐后适合做什么运动?”“1型、2型儿童糖尿病患者晚上如何用药?”等。需要说明的是,本公开的实施例所列举的推荐问题仅仅是示例性的,而不是限制性的。
例如,在本公开至少一实施例中,对于1型、2型成人糖尿病患者,在早餐时段(例如,6:00-8:00)他们可能会咨询食谱与营养建议、以及早餐后2h测血糖注意事项等相关问题。例如,在一个示例中,对于1型、2型成人糖尿病患者,在早餐时段(例如,6:00-8:00),其对应的标签包括“1型、2型糖尿病”、“成人”和“6:00-8:00”,其对应的多个候选问题可以包括:“成人糖尿病患者早餐应该吃什么?”、“1型、2型成人糖尿病患者早餐后有哪些注意事项?”和“成人糖尿病患者早上可以喝牛奶吗?”等。在午餐时段(例如,11:00-15:00)他们可能会咨询有食谱与营养建议、运动建议和注意事项等相关问题。例如,在一个示例中,对于1型、2型成人糖尿病患者,在午餐时段(例如,11:00-15:00),其对应的标签包括“1型、2型糖尿病”、“成人”和“11:00-15:00”,其对应的多个候选问题可以包括:“1型、2型成人糖尿病患者午餐应该吃什么?”、“成人糖尿病患者午餐后有哪些注意事项?”、“1型、2型成人糖尿病患者午餐后适合做什么运动?”“1型、2型成人糖尿病患者中午如何用药?”等。在晚餐时段(例如,17:00-20:00)他们可能会咨询食谱与营养建议、运动建议和注意事项、晚餐后2h测血糖注意事项、用药相关事项等相关问题。例如,在一个示例中,对于1型、2型成人糖尿病患者,在晚餐时段(例如,17:00-20:00),其对应的标签包括“1型、2型糖尿病”、“成人”和“17:00-20:00”,其对应的多个候选问题可以包括:“1型、2型成人糖尿病患者晚餐应该吃什么?”、“1型、2型成人糖尿病患者晚餐后有哪些注意事项?”、“1型、2型成人糖尿病患者晚餐后适合做什么运动?”“1型、2型成人糖尿病患者晚上如何用药?”等。需要说明的是,本公开的实施例所列举的推荐问题仅仅是示例性的,而不是限制性的。
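For example, in at least one embodiment of the present disclosure, elderly diabetic patients and gestational diabetes patients may, during the breakfast slot (for example, 6:00-8:00), ask about recipes and nutrition advice, precautions for measuring blood glucose 2 h after breakfast, and related questions. For example, in one example, for elderly diabetic patients and gestational diabetes patients in the breakfast slot (for example, 6:00-8:00), the corresponding tags include "diabetes", "gestational diabetes", "elderly" and "6:00-8:00", and the corresponding candidate questions may include: "What should elderly diabetic patients and gestational diabetes patients eat for breakfast?", "What should elderly diabetic patients and gestational diabetes patients pay attention to after breakfast?" and "How should elderly diabetic patients and gestational diabetes patients take medication in the morning?". During the lunch slot (for example, 11:00-15:00), they may ask about recipes and nutrition advice, precautions for measuring blood glucose 2 h after lunch, exercise advice and precautions, and related questions. For example, in one example, for elderly diabetic patients and gestational diabetes patients in the lunch slot (for example, 11:00-15:00), the corresponding tags include "diabetes", "gestational diabetes", "elderly" and "11:00-15:00", and the corresponding candidate questions may include: "What should elderly diabetic patients and gestational diabetes patients eat for lunch?", "What should elderly diabetic patients and gestational diabetes patients pay attention to after lunch?", "What exercise is suitable for elderly diabetic patients and gestational diabetes patients after lunch?" and "How should elderly diabetic patients and gestational diabetes patients take medication at noon?". During the dinner slot (for example, 17:00-20:00), they may ask about recipes and nutrition advice, exercise advice and precautions, precautions for measuring blood glucose 2 h after dinner, medication matters, and related questions. For example, in one example, for elderly diabetic patients and gestational diabetes patients in the dinner slot (for example, 17:00-20:00), the corresponding tags include "diabetes", "gestational diabetes", "elderly" and "17:00-20:00", and the corresponding candidate questions may include: "What should elderly diabetic patients and gestational diabetes patients eat for dinner?", "What should elderly diabetic patients and gestational diabetes patients pay attention to after dinner?", "What exercise is suitable for elderly diabetic patients and gestational diabetes patients after dinner?" and "How should elderly diabetic patients and gestational diabetes patients take medication in the evening?". It should be noted that the recommended questions listed in the embodiments of the present disclosure are merely exemplary and not limiting.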
例如,在本公开至少一实施例中,对于糖尿病前期患者,在早餐时段(例如,6:00-8:00)他们可能会咨询食谱与营养建议以及早餐后2h血糖注意事项等相关问题。例如,在一个示例中,对于糖尿病前期患者,在早餐时段(例如,6:00-8:00),其对应的多个候选问题可以包括:“糖尿病前期患者早餐应该吃什么?”、“糖尿病前期患者早餐后有哪些注意事项?”和“糖尿病前期患者早上如何用药?”等。在午餐时段(例如,11:00-15:00)他们可能会咨询食谱与营养建议、运动建议和注意事项等相关问题。例如,在一个示例中,对于糖尿病前期患者,在午餐时段(例如,11:00-15:00),其对应的多个候选问题可以包括:“糖尿病前期患者午餐应该吃什么?”、“糖尿病前期患者午餐后有哪些注意事项?”和“糖尿病前期患者中午如何用药?”等。在晚餐时段(例如,17:00-20:00)他们可能会咨询食谱与营养建议、运动建议和注意事项等相关问题。例如,在一个示例中,对于糖尿病前期患者,在晚餐时段(例如,17:00-20:00),其对应的多个候选问题可以包括:“糖尿病前期患者晚餐应该吃什么?”、“糖尿病前期患者晚餐后有哪些注意事项?”、“糖尿病前期患者晚餐后适合做什么运动?”“糖尿病前期患者晚上如何用药?”等。需要说明的是,本公开的实施例所列举的推荐问题仅仅是示例性的,而不是限制性的。
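The first three tag levels used in the rule examples above (age group, time slot, disease type), followed by the complication level, can be derived mechanically. The sketch below assumes the age bands and meal slots quoted in this embodiment; the function names, the English tag strings and the choice to treat slot boundaries as inclusive are assumptions for illustration.

```python
from datetime import time

# Meal slots as quoted in the examples of this embodiment.
MEAL_SLOTS = [
    ("breakfast", time(6, 0), time(8, 0)),
    ("lunch", time(11, 0), time(15, 0)),
    ("dinner", time(17, 0), time(20, 0)),
]


def age_band(age: int) -> str:
    """First-level tag: under 18, 18 to 59, 60 and above."""
    if age < 18:
        return "minor"
    if age < 60:
        return "adult"
    return "elderly"


def meal_slot(now: time):
    """Second-level tag; returns None outside the listed slots."""
    for name, start, end in MEAL_SLOTS:
        if start <= now <= end:  # inclusive bounds are an assumption
            return name
    return None


def multilevel_tags(age, now, disease_type, complications):
    """Return (level 1, level 2, level 3, level 4) as a tuple."""
    return (age_band(age), meal_slot(now), disease_type,
            tuple(sorted(complications)))
```

Keeping the single-valued levels first and the multi-valued complication level last mirrors the ordering rationale given in the text.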
例如,在本公开的至少一实施例中,在第一级标签(例如,年龄段)、第二级标签(例如,一天中的时间段)和第三级标签(例如,疾病类型)之后,将并发症作为第四级标签来进行规则关联。根据临床数据显示,糖尿病患者在发病后10年左右,将有30%~40%的患者会发生至少一种并发症,例如,心血管疾病、肾脏疾病、视网膜病变、神经病变、下肢血管病变、糖尿病足等。
例如,在一些示例规则方案中,若没有确诊的并发症,则通过用户已存在的症状来选择推荐与某一种或多种并发症相关的问题,然后将这些相关的问题返回至对应该用户的候选问题集,即组成对应该用户的多个候选问题。例如,在没有确诊的并发症的情况下,通过用户已存在的症状,结合临床数据,来对应推荐相关的一种或多种并发症相关的问题。例如,在一个示例中,血压高、心前区疼痛以及心慌、胸闷等属于心血管疾病并发症症状。例如,当用户已存在血压高、心前区疼痛症状时,则在用户的标签集合里面可以包括标签“心血管疾病”,从而可以推荐一些与心血管疾病并发症相关的问题,例如,“心血管疾病并发症有哪些症状?”“为什么会心慌、胸闷?”、“出现心慌、胸闷怎么办?”等, 本公开的实施例对此不作具体限制。例如,在另一个示例中,小便泡沫多、小便困难与下肢水肿、眼睑水肿等属于糖尿病肾病并发症症状。例如,在另一个示例中,视线模糊、视力下降、眼前发黑等属于视网膜病变并发症。例如,在另一个示例中,口齿不清、记忆力下降、以及持续性手足麻木、刺痛、肿胀等属于神经病变并发症。例如,在另一个示例中,间断性出现下肢麻木无力、跛行以及下肢夜间疼痛等属于下肢血管病变并发症。例如,又一个示例中,下肢长期溃疡以及手指或脚趾末端增厚膨大、大脚趾骨突出等属于糖尿病足并发症症状。
例如,在已经确诊有并发症的情况下,则可以根据该并发症来推荐相关问题,例如,在一个示例中,用户的标签包括“心血管疾病”,则通过该标签在数据知识库里匹配到的问题可以包括:“糖尿病患者存在心血管疾病并发症,该如何治疗?”、“糖尿病患者的心血管疾病并发症有哪些症状?”等等,本公开的实施例对此不作具体限制。
需要说明的是,在本公开的实施例中所描述的各种疾病的分类、各种症状的在并发症里的归类仅仅是为了说明如何建立用户标签集合与数据知识库中之间的映射关系,即仅仅出于说明的目的来描述上述具体规则方案,对具体疾病的归类和症状分析可以根据大量临床数据,专业人士的经验判断等来调整和设置,本公开的实施例对此不作具体限制。
参考图3A可知,上述描述均为用户选择自己有糖尿病的情况,而在用户选择自己无糖尿病的情况下,可以如下操作。例如,如果用户满足以下条件中的至少三种:经常静坐、一级亲属有糖尿病史、高血压、血脂异常、有糖调节受损史、有巨大儿生产史、用户为妊娠糖尿病史的妇女、用户为动脉粥样硬化性心脑血管疾病患者等,并且同时还具备以下条件之一:6.1mmol/L<空腹血糖<7mmol/L、7.8mmol/L<餐后2h血糖<11.1mmol/L,则按照糖尿病前期规则来匹配数据知识库,从而组成候选问题集合。
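The prediabetes rule just described combines a count over risk factors with a blood-glucose range check. A minimal sketch, assuming glucose values in mmol/L and reading "one of the following conditions" as "at least one"; the function name and the dictionary keys are hypothetical.

```python
def matches_prediabetes_rule(risk_factors, fasting=None, post_meal_2h=None):
    """Return True when the prediabetes rule scheme should be applied.

    risk_factors: mapping of factor name -> bool (e.g. sedentary lifestyle,
                  first-degree relative with diabetes, hypertension, ...).
    fasting, post_meal_2h: blood glucose in mmol/L, or None if unknown.
    """
    enough_risk = sum(1 for present in risk_factors.values() if present) >= 3
    fasting_hit = fasting is not None and 6.1 < fasting < 7.0
    post_meal_hit = post_meal_2h is not None and 7.8 < post_meal_2h < 11.1
    return enough_risk and (fasting_hit or post_meal_hit)
```

The strict inequalities follow the ranges as written in the text; whether the boundary values themselves should count is a clinical decision the sketch does not take.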
例如,本公开的至少一实施例中,用户为高血压患者。高血压患者的规则方案不同于糖尿病患者的规则方案。例如,在一个示例中,高血压患者的规则方案可以直接按照舒张压与收缩压值大小判断为高血压前期、轻度高血压、中度高血压与重度高血压,从而执行不同的规则方案。例如,在高血压患者的规则方案中,对于第一级标签(即,年龄段),特别设置了80岁以上老人的阶段,将其分类为危重老年高血压,要随时监测血压。除此之外,由于糖尿病是高血压一种常见的并发症,因此,在高血压患者的规则方案中,还需要在特定时间监测血压的同时监测血糖。
需要说明的是,本公开的实施例所提供的规则方案仅仅是示意性的,本公开的实施例对具体规则方案不作限制,可以根据实际需求来设置。
例如,在本公开的实施例中,在建立用户标签集合与数据知识库中的标准问题之间的映射关系后,通过采用例如实体识别、关键词匹配、深度学习等方法,将用户标签集合匹配数据知识库中的标准问题,然后,从数据知识库中调取与所匹配到的标准问题相对应的知识问题集,从而组成候选问题集合。其中,实体识别是指识别文本中具有特定意义的实体,例如,人名、地名、时间等。例如,关键字匹配方法包括广泛匹配、精确匹配、短语匹配、否定匹配等方式。例如,在一个示例中,标签的文本内容是“糖尿病”,则可以匹 配到数据知识库中包含“糖尿病”三个字的推荐问题,例如,“糖尿病人的食谱?”、“糖尿病人适合什么运动?”、“糖尿病患者的症状?”等,本公开的实施例对此不作具体限制。
例如,在一个示例中,一位患1型糖尿病、有并发症的60岁以上的老人在中午12时通常会想要获取对应的食谱、血糖监测的建议,并且了解与并发症的相关问题等,通过前述步骤,根据此用户的基本信息建立相应的用户标签集合(即包括标签:“1型糖尿病”、“有并发症”、“60岁以上”和“中午12时”),将该用户标签集合与数据知识库相关联,从数据知识库中所包括的多个知识问题集得到候选问题集合,候选问题集合包括多个候选问题。例如,根据用户标签集合,可以通过例如实体识别、关键词匹配、深度学习等方法,从数据知识库中匹配到多个相关的候选问题,以组成候选问题集合。例如,在上述示例中的用户所对应的多个候选问题可以包括:“糖尿病人一日三餐食谱?”、“老年糖尿病人适合什么运动?”、“糖尿病患者每天血糖监测的频率?”、“1型糖尿病的症状?”、“糖尿病并发症周围神经病变怎么治疗?”。
因此,通过上述建立用户标签集合与数据知识库中标准问题之间的映射关系,为后续建立规则关联时,快速地通过用户标签集合从数据知识库中调取相应的知识问题集(包括标准问题、扩展问题以及标准答案)提供了便利的条件。
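The mapping from a user tag set to knowledge question sets can be illustrated with plain keyword matching, one of the matching methods mentioned above. The `QuestionSet` structure and the `candidate_question_sets` function are hypothetical names; the substring test stands in for whatever entity-recognition or deep-learning matcher an implementation actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionSet:
    standard: str                                  # standard question
    answer: str                                    # standard answer
    extended: list = field(default_factory=list)   # extended questions


def candidate_question_sets(tags, knowledge_base):
    """Keep every question set whose standard question mentions a tag."""
    hits = []
    for question_set in knowledge_base:
        text = question_set.standard.lower()
        if any(tag.lower() in text for tag in tags):
            hits.append(question_set)
    return hits
```

Because the match is made against the standard question, the extended questions and the standard answer come along with each hit without being searched themselves.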
步骤S140:获取用户行为数据,并基于用户行为数据得到用户兴趣参数。
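For example, in at least one embodiment of the present disclosure, the user behavior data (for example, user behavior logs) may be obtained from software on the client terminal or on a web server, or may be collected in a customized manner; the embodiments of the present disclosure place no limitation on this. For example, the user behavior data may include all access, browsing, clicking and similar behavior data generated when the user visits a website; that is, the user behavior data reflects the user's specific actions, such as which link the user clicked, which page the user opened, and which search term the user used. For example, in at least one embodiment of the present disclosure, the user interest parameter is obtained by analyzing this user behavior data. For example, in one example, the user's feedback behavior may include explicit feedback and implicit feedback. Explicit feedback is feedback in which the user states an opinion of an answer outright, for example by choosing whether the answer was helpful. As shown in Fig. 3B, in one example, on a particular software platform (for example, a health management platform), when the user asks a question such as "How often should a hypertensive patient monitor blood pressure each day?", the platform, after providing the corresponding answer, asks the user: "Was this answer helpful to you?". The user's click on "Yes" or "No" gives unambiguous feedback on the answer and thus reflects the user's points of interest and concern. Implicit feedback does not reflect the user's preferences directly but does so indirectly, for example through how often the user clicks and browses within a given period. For example, in one example, the health knowledge read by the user may be summarized with the Maximal Marginal Relevance (MMR) algorithm, which extracts sentences from a document by importance to form a summary, and the high-frequency words in the summary are then obtained with the term frequency-inverse document frequency (TF-IDF) method; these high-frequency words (also called keywords) are likewise important features reflecting the user's points of interest and concern.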
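The TF-IDF keyword step can be sketched with the standard library alone. The sketch assumes the texts have already been tokenized (word segmentation for Chinese is out of scope here) and uses the plain `tf * log(N / df)` weighting; the function name and the tie-breaking by alphabetical order are assumptions.

```python
import math
from collections import Counter


def tfidf_keywords(docs, doc_index, top_k=3):
    """Return the top_k highest-weighted tokens of docs[doc_index].

    docs: list of token lists, one per document (or summary).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))        # count each token once per doc
    target = docs[doc_index]
    term_freq = Counter(target)
    scores = {
        token: (count / len(target)) * math.log(n_docs / doc_freq[token])
        for token, count in term_freq.items()
    }
    # Highest score first; alphabetical order breaks ties deterministically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
```

Tokens that occur in every document receive weight zero, which is why generic words drop out of the keyword list.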
例如,在一个示例中,若用户初次使用某一设备中的特定软件平台(例如,健康管理平台),则可以根据该设备中存储的应用程序的日志分析得到用户行为数据。该日志可以 是该设备本次开机运行之后所存储的日志,也可以是该设备上次开机运行之后所存储的日志,本公开的实施例对此不作具体限制,可以根据实际情况进行调整。
例如,在一个示例中,将反映用户的兴趣点和关注点的用户点击过的问题或者浏览过的高频词、关键词等转化为用户兴趣参数,例如,该用户兴趣参数可以是一种数值向量,用以反映用户的兴趣点和关注点。例如,将用户点击过的问题或者用户感兴趣的词句进行文字嵌入(Word Embedding),生成该用户兴趣词句的嵌入(Embedding)向量,即构成上述“用户兴趣参数”。其中Word Embedding可以理解为是一种映射关系,能够将文本空间里的某个词语,通过一定的方法,映射或者嵌入到另一个数值向量空间,也就是说,Word Embedding能够将词汇、完整句子用向量的形式表达出来。
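One simple way to turn clicked questions or keywords into a numeric user interest parameter is to average per-word embedding vectors. The averaging strategy, the toy embedding table and the function name below are assumptions for illustration; the text does not specify how the sentence-level vector is pooled.

```python
def embed_tokens(tokens, embedding_table, dim):
    """Average the embedding vectors of the known tokens.

    embedding_table: mapping token -> list of floats of length dim.
    Unknown tokens are skipped; an all-zero vector is returned if
    no token is known.
    """
    vectors = [embedding_table[t] for t in tokens if t in embedding_table]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

In practice the table would come from a pretrained word-embedding model rather than being written by hand.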
步骤S150:基于用户兴趣参数和多个候选问题,得到多个候选问题中的每一个候选问题与用户兴趣参数之间的至少一个相似度特征。
例如,在本公开至少一个实施例中,基于用户兴趣参数和多个候选问题,得到多个候选问题中的每一个候选问题与用户兴趣参数之间的至少一个相似度特征可以包括:采用至少一个相似度匹配模型,基于用户兴趣参数和多个候选问题,得到每一个候选问题与用户兴趣参数之间的至少一个相似度特征。
例如,在本公开至少一个实施例中,至少一个相似度匹配模型包括:余弦相似度模型、杰卡德(Jaccard)相似度模型、编辑距离(Levenshtein)相似度模型、词移距离(WMD)相似度模型和深度语义匹配(DSSM)相似度模型中的至少一种。
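For example, in one example, given a user interest parameter A (for example, a numeric vector), any candidate question in the candidate set (for example, a standard question or an extended question) is passed through word embedding to generate the embedding vector B of that candidate question; the vector B is likewise a numeric vector. Word embedding can be understood as a mapping that takes a word from the text space and, by some method, maps or embeds it into a numeric vector space; in other words, word embedding expresses words and complete sentences in vector form. The user interest parameter A and the vector B of a candidate question are fed to each of the similarity models described above, and each similarity model outputs one similarity feature between the numeric vectors A and B; the larger the value of the similarity feature, the closer the words or sentences corresponding to vector A are to those corresponding to vector B. The embodiments of the present disclosure do not limit the number of similarity matching models used. For example, in one example, five similarity matching models are used, so there are five similarity features between vectors A and B. For example, in another example, three similarity matching models are used, so there are three similarity features between vectors A and B.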
下面简单介绍上面提及的几个相似度匹配模型。
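(1) Cosine similarity: cosine similarity uses the cosine of the angle between two vectors to measure the difference between two individuals. The closer the cosine is to 1, the more similar the two vectors A and B are. Cosine similarity (also called cosine distance) is usually computed with the following formula:
$$\cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\;\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$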
此外,通过余弦相似度模型输出的相似度特征是连续型。
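The cosine measure of item (1) takes only a few lines to compute. A minimal sketch on plain Python lists; returning 0.0 for a zero-length vector is an assumption, since the cosine is undefined in that case.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for the undefined case
    return dot / (norm_a * norm_b)
```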
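(2) Jaccard distance: the Jaccard distance measures how distinct two sets are by the proportion of differing elements among all elements of the two sets. It is expressed by the following formula, where J(A,B) is the Jaccard similarity coefficient.
$$d_J(A,B) = 1 - J(A,B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}, \qquad J(A,B) = \frac{|A \cap B|}{|A \cup B|}$$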
此外,通过杰卡德相似度模型输出的相似度特征是连续型。
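The Jaccard measure of item (2) operates on sets, for example the sets of tokens of two questions. A minimal sketch; treating two empty sets as identical (similarity 1.0) is an assumption.

```python
def jaccard_similarity(a, b):
    """|A intersect B| / |A union B| for two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # assumed convention: two empty sets are identical
    return len(set_a & set_b) / len(union)


def jaccard_distance(a, b):
    """Proportion of differing elements among all elements."""
    return 1.0 - jaccard_similarity(a, b)
```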
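(3) Edit distance, also known as Levenshtein distance, is the minimum number of character operations needed to transform string A into string B. The permitted character operations are substituting one character, inserting one character and deleting one character. In general, the smaller the edit distance between two strings, the more similar they are. If two strings are equal, their edit distance is 0. In addition, the similarity feature output by the edit-distance similarity model is of the continuous type.
(4) Word Mover's Distance (WMD) considers the similarity of two documents as a whole: it measures the semantic similarity of the documents by finding the pairing that minimizes the sum of the distances between all words of the two documents. The similarity feature output by the WMD similarity model is of the continuous type.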
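The edit distance of item (3) is classically computed by dynamic programming. The sketch below keeps only two rows of the table and uses the three permitted operations (substitute, insert, delete), each with cost 1.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))      # distance from "" to b[:j]
    for i, char_a in enumerate(a, start=1):
        current = [i]                        # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (0 if char_a == char_b else 1)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]
```

Since a smaller distance means greater similarity, an implementation that feeds this value to a ranking model alongside cosine or Jaccard scores may want to normalize or invert it; the text leaves that choice open.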
(5)DSSM模型:DSSM是一种深度语义匹配模型,将匹配的两者映射到低维空间,相关性问题转化为低维空间向量的距离。该模型既可以用来预测两个句子的语义相似度,又可以获得句子的低维语义向量表达。此外,通过DSSM模型输出的相似度特征是离散型。
例如,在本公开至少一实施例中,对用户兴趣参数和多个候选问题可以同时采用上述五个相似度匹配模型,则可以针对多个候选问题中的每一个候选问题,得到与该用户兴趣参数之间的、分别对应于五个相似度匹配模型的五个相似度特征。
需要说明的是,本公开的实施例采用的相似度匹配模型可以不仅限于以上描述的相似度匹配模型,也可以采用其他相似度匹配模型,只要能实现相同或相似的技术效果,即,能计算两个向量之间的相似度即可,本公开的实施例对此不作具体限制。此外,本公开的实施例对采用的相似度匹配模型的数量也不作限制,可以根据实际需求来设置。
步骤S160:基于用户基本信息、多个候选问题和至少一个相似度特征,对多个候选问题进行排序以得到问题序列。
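For example, in at least one embodiment of the present disclosure, sorting the plurality of candidate questions to obtain the question sequence based on the basic user information, the plurality of candidate questions and the at least one similarity feature includes: using a ranking model, composing the basic user information, the plurality of candidate questions and the at least one similarity feature into an input feature vector of the ranking model, obtaining a score for each of the plurality of candidate questions, and sorting the corresponding candidate questions by score (for example, from high to low) to obtain the question sequence.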
例如,在本公开至少一实施例中,某一用户的基本信息包括:性别为男(例如,其对应的离散特征是“0”),患有糖尿病(例如,其对应的离散特征是“1”);多个候选问题中的某一候选问题对应的Embedding向量是[0.3,0.5,0.6];该候选问题与用户兴趣参数之间的相似度特征包括:余弦相似度为0.85、Jaccard距离为0.91、编辑距离为3、WMD为1.17、DSSM为2。在这样的情况下,将它们组成排序模型的输入特征向量为向量[0,1,0.3,0.5,0.6,0.85,0.91,3,1.17,2],需要说明的是,本实施例提供的特征数据仅仅是示例性的,特征数据的具体取值可以根据实验结果或实际情况来设置,本公开的实施例对此不作具体限制。
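The assembly of the input feature vector in the example above is a plain concatenation. The sketch below reproduces the numbers from that example; the function name and the argument order are assumptions.

```python
def build_feature_vector(user_discrete, question_embedding, similarities):
    """Concatenate user features, question embedding and similarity features."""
    return list(user_discrete) + list(question_embedding) + list(similarities)


features = build_feature_vector(
    user_discrete=[0, 1],                      # male, has diabetes
    question_embedding=[0.3, 0.5, 0.6],        # candidate question embedding
    similarities=[0.85, 0.91, 3, 1.17, 2],     # cosine, Jaccard, edit, WMD, DSSM
)
```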
这里,将用户基本信息、多个候选问题和至少一个相似度特征组成排序模型的输入特征向量,充分考虑了用户的个体因素(例如,用户基本信息、用户行为数据等)进行个性化问题推荐,使得用户能更有针对性地掌握与自身健康相关的知识。
例如,在本公开至少一实施例中,采用的排序模型为Google公司提出的经典模型Wide&Deep模型。在上述示例中,也就是将向量[0,1,0.3,0.5,0.6,0.85,0.91,3,1.17,2]作为输入特征向量输入到Wide&Deep模型。该模型的核心思想是结合线性模型的记忆能力和深度神经网络模型的泛化能力,在推荐场景中体现了相关性和多样性的融合。通过此模型来给问题候选集中的多个候选问题打分,并且按照分数从高到低的顺序对对应的多个候选问题进行排序,以得到问题序列。然后根据实际需求,将问题序列中的前N个问题展示给用户,其中N为大于或等于1的整数。
例如,在本公开至少一实施例中,一个候选问题的分数可以表示为一个条件概率,p(y|x)。其中,y表示对应于某一用户行为的标签,例如,用户点击该候选问题,则y=1,如果用户未点击该候选问题,则y=0。其中,x表示输入特征向量。该输入特征向量x包括前述用户基本信息的离散型特征、候选问题自身的连续型Embedding向量以及该候选问题与用户兴趣参数之间的至少一个相似度特征(相似度特征可能是连续型也可能是离散型)。例如,将根据y=1时输出的概率值P(y=1|x)作为最终该候选问题的分数,从而通过此排序模型(即,Wide&Deep模型),可以输出问题候选集中每一个候选问题的分数(即,概率值)。
下面结合附图4A简单介绍Wide&Deep模型的原理。图4A为本公开至少一实施例提供的Wide&Deep模型的结构示意图。
如图4A所示,Wide&Deep模型包括了两部分,即Wide部分(图4A的左侧)和Deep部分(图4A的右侧)。Wide&Deep模型平衡了Wide模型和Deep模型的记忆能力和泛化能力。这两部分模型需要不同的输入特征。
对于Wide部分,Wide模型是一个广义的线性模型(例如,逻辑回归),公式如下:
$$y = w^{T}x + b$$
其中,x表示特征向量[x1,x2,x3…]、w表示参数向量[w1,w2,w3….]、b为偏置项、y为输出的标签(label),其经过sigmoid函数后,输出为0~1之间的概率值。该Wide部分使用的输入特征是离散型特征,例如,用户基本信息的离散型特征以及离散型相似度特征。
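For the Deep part, the Deep model is a feed-forward neural network. In general, a deep neural network takes continuous dense features as input; sparse high-dimensional features must first pass through an embedding (dimensionality reduction) that converts them into low-dimensional dense features, which then serve as the input of the first hidden layer, and the embeddings are updated by back-propagation from the final error (loss). The activation function f of the hidden layers is usually the ReLU function, which guards against vanishing gradients. The Deep part therefore uses continuous features as input, for example the embedding vector of the candidate question and the continuous similarity features.
During model training, gradients are computed from the final error and back-propagated to both the Wide part and the Deep part, each of which keeps updating its own parameters until the final model is obtained. Note that training the Wide model and the Deep model at the same time does not by itself amount to model fusion; rather, the weighted sum of the outputs of the two models serves as the final prediction:
$$P(Y=1 \mid x) = \sigma\!\left(W_{wide}^{T}\,[x, \phi(x)] + W_{deep}^{T}\,a^{(l_f)} + b\right)$$
In the above sigmoid function, $W_{wide}$ is the weight of the Wide part, $x$ is the original feature vector, $\phi(x)$ denotes the cross features (for example, new features formed by combining features after one-hot encoding), $W_{deep}$ is the weight applied to the output of the final activation layer of the neural network of the Deep part, $l$ denotes the hidden layer, $f$ denotes the activation function, $a$ denotes the input features, and $b$ is the bias term.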
模型训练采用的是联合训练,相比于集成学习中单个模型进行单独训练,模型只是在最终的预测阶段进行融合,而联合训练是在训练阶段进行的模型融合,训练误差会同时反馈Wide和Deep模型中进行权重更新。因此,Wide模型关注于离散特征的叉乘,对原始特征做非线性变换产生特征相互作用的记忆,Deep模型则关注于泛化,深度神经网络使用低维稠密特征,只需少量的特征工程,却可以更好地泛化训练样本中未出现过的特征组合,提高了模型的泛化能力。模型训练完成后,将其部署到问题推荐场景中。
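The forward pass of the weighted combination described above can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the trained model of the disclosure: all weights are supplied by the caller, the cross features are taken to be pairwise products of the wide inputs, and every function name is hypothetical. Training (joint back-propagation through both parts) is omitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(vector):
    return [max(0.0, v) for v in vector]


def dense(weights, biases, inputs):
    """One fully connected layer: weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def cross_features(x):
    """Pairwise products of the wide inputs (assumed form of phi(x))."""
    return [x[i] * x[j] for i in range(len(x)) for j in range(i + 1, len(x))]


def wide_deep_score(x_wide, x_deep, w_wide, hidden_layers, w_deep, bias):
    """Probability that the user clicks the candidate question.

    x_wide: discrete features; x_deep: continuous features.
    hidden_layers: list of (weights, biases) pairs for the Deep part.
    """
    wide_input = list(x_wide) + cross_features(x_wide)
    wide_logit = sum(w * v for w, v in zip(w_wide, wide_input))
    activation = list(x_deep)
    for weights, biases in hidden_layers:
        activation = relu(dense(weights, biases, activation))
    deep_logit = sum(w * v for w, v in zip(w_deep, activation))
    return sigmoid(wide_logit + deep_logit + bias)
```

In a real deployment one would more likely rely on an existing deep-learning framework for both training and inference; the sketch only makes the data flow of the two branches explicit.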
需要说明的是,关于Wide&Deep模型的更多信息可以参考其他相关参考文献,本公开的上述描述仅仅是示意性的介绍。
例如,在本公开一示例中,问题推荐方法10通过采用排序模型(即,Wide&Deep模型),将用户基本信息、多个候选问题和至少一个相似度特征组成该排序模型的输入特征向量,对问题候选集中包括的多个候选问题的每一个返回一个分数。按照该分数大小,对多个候选问题进行排序,以得到问题序列。例如,在一个示例中,按照分数从高到低的顺序排序,例如,在另一个示例中,按照分数从低到高的顺序排序,需要说明的是,还可以按照其他次序排序,本公开的实施例对此不作具体限制。
步骤S170:基于问题序列的顺序,将问题序列中的至少一个候选问题推荐给用户。例如,在一个示例中,问题序列是按照分数从高到低的顺序来排序,则将问题序列中的前N个候选问题推荐给用户,其中,N为大于或等于1的整数。例如,选取该问题序列中的例如前5个(也可以为其他数值)问题作为最终的推荐问题推荐给用户。
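Selecting the first N questions of the sequence reduces to a sort by score followed by a slice. A minimal sketch, with the function name and the default N of 5 chosen to match the example in the text.

```python
def top_n_questions(questions, scores, n=5):
    """Return the n highest-scoring questions, best first."""
    ranked = sorted(zip(questions, scores), key=lambda pair: pair[1], reverse=True)
    return [question for question, _ in ranked[:n]]
```

Python's sort is stable, so questions with equal scores keep their original relative order.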
图4B示出了某一平台(例如,健康管理平台)提供给用户的推荐问题的用户界面图。例如,在一个示例中,如图4B所示,在用户向文本输入框中输入想要咨询的问题之前,健康管理平台已经根据上述问题推荐方法,将5个推荐问题(例如“高血压可以手术治疗吗?”、“高血压要做哪些检查?”、“高血压的症状有哪些?”、“高血压患者每天血压检测的频率?”和“高血压患者做什么运动好?”)推荐给用户。需要说明的是,将最终的推荐问题提供给用户的方式是多种多样的,本公开的实施例对此不作具体限制。
例如,在一个实施例中,当用户正在向输入文本框中输入想要咨询的问题时,该平台可以根据用户输入的文本信息,通过例如实体识别和关键词匹配方法等,匹配数据知识库中的相关问题,并将这些问题显示在用户界面中的输入文本框的上方。例如,如图4C所示,当用户在输入文本框中输入“高血压”三个字时,在输入文本框的上方提示有“高血压的食疗方法”、“高血压饮食方面注意事项”和“妊娠高血压食谱”的推荐问题,本公开的实施例对此不作具体限制,可以根据实际情况进行调整。
需要说明的是,可以根据实际情况,选择性地执行步骤S110-S170中的部分步骤,本公开的实施例对此不作具体限制。
例如,在一个实施例中,在同一用户多次使用某一平台的情况下,如果其基本信息尚未发生变化,则可以省略步骤S110-S130,直接调取预先存储的该用户的候选问题集合。例如,在一个示例中,用户第一次使用某一平台(例如,健康管理平台)时,该平台已经针对该用户生成了相应的候选问题集合,并且将其保存到了与平台相关联的数据库中,则当用户第二次使用该平台时,可能无需重复步骤S110-S130来获取该用户的候选问题集合,而是通过调取先前存储在该关联数据库中的候选问题集合来实现获取用户候选问题集合的目的,本公开的实施例对此不作具体限制,可以根据实际情况进行调整。
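Reusing a stored candidate question set while the basic user information is unchanged can be sketched as a cache keyed by the user and a fingerprint of the profile. The in-memory dictionary, the SHA-256 fingerprint and all names below are assumptions; the text only states that the set is saved in a database associated with the platform.

```python
import hashlib
import json

_candidate_cache = {}  # stand-in for the platform's associated database


def profile_fingerprint(profile: dict) -> str:
    """Stable hash of the basic user information."""
    payload = json.dumps(profile, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_candidate_set(user_id, profile, build_candidates):
    """Rebuild the candidate set (steps S110-S130) only when the profile changed."""
    key = (user_id, profile_fingerprint(profile))
    if key not in _candidate_cache:
        _candidate_cache[key] = build_candidates(profile)
    return _candidate_cache[key]
```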
本公开至少一实施例提供的问题推荐方法10不仅有效避免了由于患者表达不清晰而可能出现的反馈答案不贴切的问题,还充分考虑了用户的个体因素(例如,用户基本信息、用户行为数据等)进行个性化问题推荐,使得用户能更有针对性地掌握与自身健康相关的知识,而且,通过排序模型(即,Wide&Deep模型)融合了用户基本信息与各种相似度匹配模型的输出等多种特征,使得问题推荐具有相关性、个性化和多样化,同时又注重了最终的反馈顺序,达到了更贴合用户需求的效果。
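Fig. 5A is an exemplary flowchart of another question recommendation method provided by at least one embodiment of the present disclosure, and Fig. 5B is a schematic block diagram of the question recommendation method of Fig. 5A provided by at least one embodiment of the present disclosure. As shown in Fig. 5A, the question recommendation method 50 includes steps S510 to S580. For example, steps S510 to S580 may be performed sequentially or in another, adjusted order; for example, step S520 may be performed before step S530, or step S530 may be performed before step S520. As another example, some or all of the operations in steps S510 to S580 may be performed in parallel; for example, steps S520 and S530 may be performed in parallel. The embodiments of the present disclosure do not limit the order in which the steps are performed, which may be adjusted according to the actual situation. For example, steps S510 to S580 may be implemented by a server or locally, and the embodiments of the present disclosure place no limitation on this. For example, in some examples, implementing the question recommendation method 50 may selectively perform only some of steps S510 to S580, or may perform additional steps beyond steps S510 to S580; the embodiments of the present disclosure do not specifically limit this.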
参考图5A和图1B,图5A中示出的问题推荐方法50所包括的步骤S520-S580与图1B中示出的问题推荐方法10所包括的步骤S110-S170基本一致,因此,关于步骤S520-S580的说明可参考以上图1B中的步骤S110-S170的相关描述,此处不再赘述。
相比于图1B示出问题推荐方法10,图5A还包括步骤S510,即建立数据知识库。例如,在本公开一实施例中,可以从网络中抓取数据集合,并对数据集合按照意图分类,以形成多个知识问题集,从而建立数据知识库。
例如,在一个示例中,可以利用网络爬虫,从诸如互联网的网络中抓取包含大量数据的数据集合。其中,网络爬虫,又名网页蜘蛛,是一种按照一定的规则,自动地抓取万维网信息的程序或者脚本。
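For example, in an embodiment of the present disclosure, when the question recommendation method 50 is applied to a medical intelligent question-answering scenario, the method may be used to recommend questions about a certain disease, and the data collection may come from at least one of: doctor-patient consultation datasets available on the network, hot questions associated with the disease, and bounty questions. For example, in one example, a bounty question may be a question for which a fee must be paid to consult on certain websites (for example, 寻医问药网, 39健康网, etc.); the embodiments of the present disclosure do not specifically limit this.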
例如,利用TF-IDF方法从数据集合中提取高频关键字,例如,“症状”、“治疗”、“血糖”、“饮食”、“服药”、“检查”、“胰岛素”、“糖尿病足”等。按照高频关键词可将来自数据集合中的大量数据进行意图分类,这样可以方便构建数据知识库。例如,通过深度学习算法Text-CNN对数据集合进行意图分类,在每个意图下整理(例如,通过人工整理)标准问题及对应的扩展问题和答案,从而建立糖尿病、高血压等慢性病基本健康知识的完备的数据知识库。例如,在一个示例中,根据提取到的高频关键字,可以人工确定多类意图。例如,人工可以根据人们的关注程度、关键字出现的频率等来确定意图的种类,本公开的实施例对此不作具体限制。例如,在一个示例中,人工确定了以下意图类别:饮食、运动、药物、检查、并发症、手术、治疗、症状等,本公开的实施例对此不作具体限制。然后,在每一类意图下人工整理相应的数据(例如,匹配该类意图的问题),将其作为训练数据,通过深度学习算法Text-CNN进行模型训练,然后,将来自数据集合的大量数据输入该训练后的模型,该模型则输出每一数据对应的意图类别,从而实现对大量数据的意图分类。为了提高意图分类的准确性,可以在每个意图下再进行人工过滤、整理和补充标准问题及对应的扩展问题和答案。例如,在第一次执行该问题推荐方法50时,通过执行步骤S510以建立知识数据库,该知识数据库被存储在服务器、存储器或数据库中。在以后执行该问题推荐方法50时,可以省略步骤S510,而直接访问知识数据库,由此提高处理效率。例如,可以间或执行步骤S510,或者采用其他适用的方式对知识数据库进行更新,由此可以实现知识数据库的更新和优化,使得后续步骤中得到的候选问题更加贴近用户需求,也更加贴近当前社会的认知水平。
图5A中示出的问题推荐方法50实现的技术效果和上述结合图1B说明的问题推荐方法10实现的技术效果相似,在此不再赘述。关于图5B中示出的各个框图的描述可以参见以上对图5A和图1B中的各个步骤的详细说明,在此不再赘述。
本公开的至少一个实施例还提供了一种问题推荐装置。图6是本公开至少一实施例提供的一种问题推荐装置的示意框图。如图6所示,该问题推荐装置60包括:集合获取模块、行为分析模块、特征生成模块、问题排序模块和推荐模块,这些模块可以通过软件、硬件、固件或它们的任意组合实现,例如,可以分别实现为集合获取电路600、行为分析电路640、特征生成电路650、问题排序电路660和推荐电路670。
例如,在一个示例中,集合获取电路600被配置为获取用户的候选问题集合,该候选问题集合包括多个候选问题。例如,该集合获取电路600包括知识库访问电路610、信息获取电路620和候选集生成电路630。例如,知识库访问电路610被配置为访问数据知识库,该数据知识库包括多个知识问题集。例如,信息获取电路620被配置为获取用户基本信息,并基于用户基本信息建立用户标签集合。例如,候选集生成电路630被配置为将用户标签集合与数据知识库相关联,从多个知识问题集得到候选问题集合,该候选问题集合包括多个候选问题。例如,行为分析电路640被配置为获取用户行为数据,并基于用户行为数据得到用户兴趣参数。例如,特征生成电路650被配置为基于用户兴趣参数和多个候 选问题,得到多个候选问题中的每一个候选问题与用户兴趣参数之间的至少一个相似度特征。例如,问题排序电路660被配置为基于用户基本信息、多个候选问题和至少一个相似度特征,对多个候选问题进行排序以得到问题序列。例如,推荐电路670被配置为基于问题序列的顺序,将问题序列中的至少一个候选问题推荐给用户。
例如,知识库访问电路610、信息获取电路620、候选集生成电路630、行为分析电路640、特征生成电路650、问题排序电路660和推荐电路670被配置执行的具体操作均可以参见上文中本公开的至少一个实施例提供的问题推荐方法10和50的相关描述,在此不再赘述。
例如,在本公开至少一实施例中,问题推荐装置60中的问题排序电路660包括问题排序子电路661。问题排序子电路661被配置为采用排序模型,将用户基本信息、多个候选问题和至少一个相似度特征组成排序模型的输入特征向量,得到多个候选问题中每一个候选问题对应的分数,按照分数大小对对应的多个候选问题进行排序,以得到问题序列。
例如,问题排序子电路661被配置执行的具体操作可以参见上文中本公开的至少一个实施例提供的问题推荐方法10和50的相关描述,在此不再赘述。
例如,在本公开至少一实施例中,问题推荐装置60中还包括知识库建立电路601。该知识库建立电路601被配置为从网络抓取数据集合,并对数据集合按照意图分类,以形成多个知识问题集,建立数据知识库。
例如,知识库建立电路601被配置执行的具体操作可以参见本公开的至少一个实施例提供的问题推荐方法50的相关描述,在此不再赘述。
例如,在本公开至少一实施例中,问题推荐装置60中的候选集生成电路630包括候选集生成子电路631。候选集生成子电路631被配置为建立用户标签集合与数据知识库的标准问题之间的映射关系,将用户标签集合匹配数据知识库中的标准问题,与所匹配到的标准问题相对应的知识问题集组成候选问题集合。
例如,候选集生成子电路631被配置执行的具体操作可以参见本公开的至少一个实施例提供的问题推荐方法10和50的相关描述,在此不再赘述。
例如,在本公开至少一实施例中,问题推荐装置60中的行为分析电路640包括行为分析子电路641。行为分析子电路641被配置为分析用户行为数据,将用户点击过的问题或者用户感兴趣的词句转化为用户兴趣参数。
例如,行为分析子电路641被配置执行的具体操作可以参见上文中本公开的至少一个实施例提供的问题推荐方法10和50的相关描述,在此不再赘述。
例如,在本公开至少一实施例中,问题推荐装置60中的特征生成电路650包括特征生成子电路651。特征生成子电路651被配置为采用至少一个相似度匹配模型,基于用户兴趣参数和多个候选问题,得到每一个候选问题与用户兴趣参数之间的至少一个相似度特征。
例如,特征生成子电路651被配置执行的具体操作可以参见本公开的至少一个实施例提供的问题推荐方法10和50的相关描述,在此不再赘述。
需要说明的是,本公开实施例中的集合获取电路600、知识库访问电路610、信息获取电路620、候选集生成电路630、行为分析电路640、特征生成电路650、问题排序电路660、推荐电路670以及特征生成子电路651、行为分析子电路641、候选集生成子电路631、问题排序子电路661和知识库建立电路601可以由诸如处理器、控制器等的硬件、能实施相关功能的软件或者两者相结合来实现,本公开的实施例对它们的具体实施方式不作限制。
还需要说明的是,本公开的实施例中,问题推荐装置60还可以包括更多的电路,而不限于上述集合获取电路600、知识库访问电路610、信息获取电路620、候选集生成电路630、行为分析电路640、特征生成电路650、问题排序电路660、推荐电路670以及特征生成子电路651、行为分析子电路641、候选集生成子电路631、问题排序子电路661和知识库建立电路601,这可以根据实际需求而定,本公开的实施例对此不作限制。
应当理解的是,本公开实施例提供的问题推荐装置60可以实施前述问题推荐方法10和50,也可以实现与前述问题推荐方法10和50相似的技术效果,在此不作赘述。
本公开的至少一个实施例还提供了一种问题推荐系统。图7是本公开的至少一个实施例提供的一种问题推荐系统的示意性框图。如图7所示,问题推荐系统70包括终端710和问题推荐服务器720,并且终端710和问题推荐服务器720信号连接。终端710被配置为将请求数据发送给问题推荐服务器720。问题推荐服务器720被配置为:响应于请求数据,访问数据知识库,该数据知识库包括多个知识问题集;获取用户的候选问题集合,候选问题集合包括多个候选问题;获取用户行为数据,并基于用户行为数据得到用户兴趣参数;基于用户兴趣参数和多个候选问题,得到多个候选问题中的每一个候选问题与用户兴趣参数之间的至少一个相似度特征;以及基于用户基本信息、多个候选问题和至少一个相似度特征,对多个候选问题进行排序以得到问题序列。例如,终端710还被配置为,显示问题序列中的前N个候选问题,并且N为大于或等于1的整数。
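For example, for the above operations that the question recommendation server 720 is configured to perform, reference may be made to the question recommendation methods 10 and 50 provided by at least one embodiment of the present disclosure, and the details are not repeated here.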
例如,在一个示例中,问题推荐系统70包括的终端710可以实现为客户端(例如手机、电脑等),问题推荐服务器720可以实现为服务端(例如服务器)。
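For example, in one example, as shown in Fig. 7, in addition to the terminal 710 and the question recommendation server 720, the question recommendation system 70 may further include a knowledge base server 730 that stores the data knowledge base. The knowledge base server 730 is in signal connection with the question recommendation server 720 and is configured to, in response to request information from the question recommendation server 720, return the data in the data knowledge base that corresponds to the request information to the question recommendation server 720. It should be noted that, when the question recommendation system 70 does not include the knowledge base server 730, the data of the data knowledge base may be stored directly on the question recommendation server 720 or on another separately provided storage device; alternatively, the question recommendation server 720 may build the data knowledge base itself and then store it on the question recommendation server 720 or on another separately provided storage device. The embodiments of the present disclosure do not specifically limit this.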
本公开的至少一个实施例提供的一种问题推荐系统70可以实施前述实施例提供的问 题推荐方法10和50,也可以实现与前述实施例提供的问题推荐方法10和50相似的技术效果,在此不作赘述。
本公开的至少一个实施例还提供了一种电子设备。图8是本公开至少一实施例提供的一种电子设备的示意图。例如,如图8所示,该电子设备80包括处理器810和存储器820。存储器820包括一个或多个计算机程序模块821。一个或多个计算机程序模块821被存储在存储器820中并被配置为由处理器810执行,该一个或多个计算机程序模块821包括用于执行本公开的至少一个实施例提供的任一问题推荐方法的指令,其被处理器810执行时,可以执行本公开的至少一个实施例提供的问题推荐方法中的一个或多个步骤。存储器820和处理器810可以通过总线系统和/或其它形式的连接机构(未示出)互连。
例如,存储器820和处理器810可以设置在服务器端(或云端),例如设置在前述的问题推荐服务器720中,以用于执行图1A、图1B和图5A描述的问题推荐方法中的一个或多个步骤。
例如,处理器810可以是中央处理单元(CPU)、数字信号处理器(DSP)或者具有数据处理能力和/或程序执行能力的其它形式的处理单元,例如现场可编程门阵列(FPGA)等;例如,中央处理单元(CPU)可以为X86或ARM架构等。处理器810可以为通用处理器或专用处理器,可以控制电子设备80中的其它组件以执行期望的功能。
例如,存储器820可以包括一个或多个计算机程序产品的任意组合,计算机程序产品可以包括各种形式的计算机可读存储介质,例如易失性存储器和/或非易失性存储器。易失性存储器例如可以包括随机存取存储器(RAM)和/或高速缓冲存储器(cache)等。非易失性存储器例如可以包括只读存储器(ROM)、硬盘、可擦除可编程只读存储器(EPROM)、便携式紧致盘只读存储器(CD-ROM)、USB存储器、闪存等。在计算机可读存储介质上可以存储一个或多个计算机程序模块821,处理器810可以运行一个或多个计算机程序模块821,以实现电子设备80的各种功能。在计算机可读存储介质中还可以存储各种应用程序和各种数据以及应用程序使用和/或产生的各种数据等。电子设备80的具体功能和技术效果可以参考上文中关于问题推荐方法的描述,此处不再赘述。
图9为本公开至少一实施例提供的一种终端的示意框图。例如,在本公开至少一实施例中,该终端为显示终端900,例如可应用于本公开实施例提供的问题推荐方法中。例如,该显示终端900可以提供反映用户行为数据的用户访问日志(例如通过运行在该系统中的浏览器等应用程序通过Cookie等记录的访问日志等),并显示被推荐给用户的至少一个候选问题。需要注意的是,图9示出的终端为显示终端900仅仅是一个示例,其不会对本公开实施例的功能和使用范围带来任何限制。
如图9所示,显示终端900可以包括处理装置(例如中央处理器、图形处理器等)910,其可以根据存储在只读存储器(ROM)920中的程序或者从存储装置980加载到随机访问存储器(RAM)930中的程序而执行各种适当的动作和处理。在RAM 930中,还存储有显示终端900操作所需的各种程序和数据。处理装置910、ROM 920以及RAM 930通过总线940彼此相连。输入/输出(I/O)接口950也连接至总线940。
通常,以下装置可以连接至I/O接口950:包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置960;包括例如液晶显示器(LCD)、扬声器、振动器等的输出装置970;包括例如磁带、硬盘等的存储装置980;以及通信装置990。通信装置990可以允许显示终端900与其他电子设备进行无线或有线通信以交换数据。虽然图9示出了具有各种装置的显示终端900,但应理解的是,并不要求实施或具备所有示出的装置,显示终端900可以替代地实施或具备更多或更少的装置。
本公开的至少一个实施例还提供了一种非瞬时可读存储介质。图10是本公开的至少一个实施例提供的非瞬时可读存储介质100的示意框图。例如,如图10所示,该非瞬时可读存储介质100包括存储其上的计算机程序指令111。计算机程序指令111被处理器执行时,执行本公开的至少一个实施例提供的问题推荐方法10或50中的一个或多个步骤。
例如,该存储介质可以是一个或多个计算机可读存储介质的任意组合,例如一个计算机可读存储介质包含用于获取用户的候选问题集合的计算机可读的程序代码,另一个计算机可读存储介质包含获取用户行为数据,并基于用户行为数据得到用户兴趣参数的计算机可读的程序代码,又一个计算机可读存储介质包含基于用户兴趣参数和多个候选问题,得到多个候选问题中的每一个候选问题与用户兴趣参数之间的至少一个相似度特征的计算机可读的程序代码,还有一个计算机可读存储介质包含基于用户基本信息、多个候选问题和至少一个相似度特征,对多个候选问题进行排序以得到问题序列的计算机可读的程序代码。当然,上述各个程序代码也可以存储在同一个计算机可读介质中,本公开的实施例对此不作限制。例如,当该程序代码由计算机读取时,计算机可以执行该计算机存储介质中存储的程序代码,执行例如本公开任一实施例提供的问题推荐方法。
例如,存储介质可以包括智能电话的存储卡、平板电脑的存储部件、个人计算机的硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM)、便携式紧致盘只读存储器(CD-ROM)、闪存、或者上述存储介质的任意组合,也可以为其他适用的存储介质。例如,该可读存储介质也可以为图8中的存储器820,相关描述可以参考前述内容,此处不再赘述。
需要说明的是,该存储介质100可以应用于问题推荐服务器720,技术人员可以根据具体场景进行选择,在此不作限定。
图11示出了本公开的至少一个实施例提供的问题推荐系统的示例性的场景图。如图11所示,该问题推荐系统300可以包括用户终端310、网络320、服务器330以及数据库340。
例如,用户终端310可以是图11中示出的电脑310-1、便携式终端310-2。可以理解的是,用户终端还可以是能够执行数据的接收、处理和显示的任何其他类型的电子设备,其可以包括但不限于台式电脑、笔记本电脑、平板电脑、智能家居设备、可穿戴设备、车载电子设备、医疗电子设备等。
例如,网络320可以是单个网络,或至少两个不同网络的组合。例如,网络320可以包括但不限于局域网、广域网、公用网络、专用网络、因特网、移动通信网络等中的一种 或几种的组合。
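For example, the server 330 may be a single server or a server group, in which the servers are connected through a wired network or a wireless network. The wired network may communicate, for example, over twisted pair, coaxial cable or optical fiber, and the wireless network may use, for example, a 3G/4G/5G mobile communication network, Bluetooth, Zigbee or WiFi. The present disclosure does not limit the type or function of the network. The server group may be centralized, such as a data center, or distributed. The server may be local or remote. For example, the server 330 may be a general-purpose server or a dedicated server, and may be a virtual server, a cloud server, or the like.
For example, the database 340 may be used to store various data used, generated and output in the operation of the user terminal 310 and the server 330. The database 340 may be interconnected or communicate with the server 330 or a part of the server 330 via the network 320, or directly with the server 330, or through a combination of the two. In some embodiments, the database 340 may be a stand-alone device. In other embodiments, the database 340 may be integrated in at least one of the user terminal 310 and the server 330. For example, the database 340 may be provided on the user terminal 310 or on the server 330. As another example, the database 340 may be distributed, with one part provided on the user terminal 310 and another part provided on the server 330.
For example, in one example, the user terminal 310 (for example, the user's mobile phone) first sends request data to the server 330 via the network 320 or another technology (for example, Bluetooth communication, infrared communication, etc.). Next, in response to the request data, the server 330 acquires the user's candidate question set, wherein the candidate question set includes a plurality of candidate questions. The server 330 then acquires user behavior data and obtains the user interest parameter based on the user behavior data; for example, the user behavior data is transmitted from the user terminal 310 to the server 330 through the network 320. The server 330 then obtains, based on the user interest parameter and the plurality of candidate questions, at least one similarity feature between each of the plurality of candidate questions and the user interest parameter. Next, the server 330 sorts the plurality of candidate questions based on the basic user information, the plurality of candidate questions and the at least one similarity feature to obtain the question sequence, and then sends the first N candidate questions of the question sequence to the user terminal 310 via the network 320 or another technology (for example, Bluetooth communication, infrared communication, etc.). Finally, after receiving the first N candidate questions from the server 330, the user terminal 310 displays them.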
在本公开中,术语“多个”指两个或两个以上,除非另有明确的限定。
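Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure herein. The present disclosure is intended to cover any variations, uses or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art not disclosed herein. The specification and the embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.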

Claims (25)

  1. 一种问题推荐方法,包括:
    获取用户的候选问题集合,其中,所述候选问题集合包括多个候选问题;
    获取用户行为数据,并基于所述用户行为数据得到用户兴趣参数;
    基于所述用户兴趣参数和所述多个候选问题,得到所述多个候选问题中的每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征;
    基于用户基本信息、所述多个候选问题和所述至少一个相似度特征,对所述多个候选问题进行排序以得到问题序列;以及
    基于所述问题序列的顺序,将所述问题序列中的至少一个候选问题推荐给所述用户。
  2. 根据权利要求1所述的问题推荐方法,其中,基于所述用户基本信息、所述多个候选问题和所述至少一个相似度特征,对所述多个候选问题进行排序以得到问题序列,包括:
    采用排序模型,将所述用户基本信息、所述多个候选问题和所述至少一个相似度特征组成所述排序模型的输入特征向量,得到所述多个候选问题中每一个候选问题对应的分数,按照所述分数的大小对对应的多个候选问题进行排序,以得到所述问题序列。
  3. 根据权利要求1或2所述的问题推荐方法,其中,基于所述用户兴趣参数和所述多个候选问题,得到所述多个候选问题中的每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征,包括:
    采用至少一个相似度匹配模型,基于所述用户兴趣参数和所述多个候选问题,得到所述每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征。
  4. 根据权利要求3所述的问题推荐方法,其中,所述至少一个相似度匹配模型包括:余弦相似度模型、杰卡德相似度模型、编辑距离相似度模型、词移距离相似度模型和深度语义匹配相似度模型中的至少一种。
  5. 根据权利要求2所述的问题推荐方法,其中,所述排序模型包括Wide&Deep模型。
  6. 根据权利要求1-5中任一项所述的问题推荐方法,其中,获取所述用户的候选问题集合,包括:
    访问数据知识库,其中,所述数据知识库包括多个知识问题集;
    获取所述用户基本信息,并基于所述用户基本信息建立用户标签集合;
    将所述用户标签集合与所述数据知识库相关联,从所述多个知识问题集得到所述候选问题集合。
  7. 根据权利要求6所述的问题推荐方法,其中,所述用户标签集合包括多级标签集合,多级标签集合包括多级标签,不同级标签的类型不同。
  8. 根据权利要求7所述的问题推荐方法,其中,所述问题推荐方法用于推荐关于疾病的问题,所述多级标签集合的第一级标签为年龄段,第二级标签为时间段,第三级标签为疾病类型,第四级标签为并发症。
  9. 根据权利要求6-8中任一项所述的问题推荐方法,其中,所述多个知识问题集中的每一个包括:
    标准问题,与所述标准问题对应的标准答案,以及与所述标准问题对应的扩展问题。
  10. 根据权利要求6-9中任一项所述的问题推荐方法,其中,获取所述用户的候选问题集合,还包括:
    建立所述数据知识库。
  11. 根据权利要求10所述的问题推荐方法,其中,所述建立所述数据知识库包括:
    从网络抓取数据集合,并对所述数据集合按照意图分类,以形成所述多个知识问题集,从而建立所述数据知识库。
  12. 根据权利要求11所述的问题推荐方法,其中,所述问题推荐方法用于推荐关于疾病的问题,并且所述数据集合来自医患间的问诊数据集、与所述疾病相关联的热点问题和与所述疾病相关联的悬赏问题中的至少一种。
  13. 根据权利要求9所述的问题推荐方法,其中,将所述用户标签集合与所述数据知识库相关联,从所述多个知识问题集得到候选问题集合,包括:
    建立所述用户标签集合与所述数据知识库中的标准问题之间的映射关系,将所述用户标签集合匹配所述数据知识库中的标准问题,与所匹配到的标准问题相对应的知识问题集组成所述候选问题集合。
  14. 根据权利要求1-5中任一项所述的问题推荐方法,其中,获取所述用户的候选问题集合,包括:
    调取预先存储的所述用户的候选问题集合。
  15. 根据权利要求1-14中任一项所述的问题推荐方法,其中,基于所述用户行为数据,得到所述用户兴趣参数,包括:
    分析所述用户行为数据,将用户点击过的问题或者用户感兴趣的词句转化为所述用户兴趣参数。
  16. 一种问题推荐装置,包括:
    集合获取电路,被配置为获取用户的候选问题集合,其中,所述候选问题集合包括多个候选问题;
    行为分析电路,被配置为获取用户行为数据,并基于所述用户行为数据得到用户兴趣参数;
    特征生成电路,被配置为基于所述用户兴趣参数和所述多个候选问题,得到所述多个候选问题中的每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征;
    问题排序电路,被配置为基于用户基本信息、所述多个候选问题和所述至少一个相似度特征,对所述多个候选问题进行排序以得到问题序列,以及
    推荐电路,被配置为基于所述问题序列的顺序,将所述问题序列中的至少一个候选问题推荐给用户。
  17. 根据权利要求16所述的问题推荐装置,其中,所述问题排序电路包括:
    问题排序子电路,被配置为:采用排序模型,将所述用户基本信息、所述多个候选问题和所述至少一个相似度特征组成所述排序模型的输入特征向量,得到所述多个候选问题中每一个候选问题对应的分数,按照所述分数的大小对对应的多个候选问题进行排序,以得到所述问题序列。
  18. 根据权利要求16或17所述的问题推荐装置,其中,所述集合获取电路包括:
    知识库访问电路,被配置为访问数据知识库,其中,所述数据知识库包括多个知识问题集;
    信息获取电路,被配置为获取用户基本信息,并基于所述用户基本信息建立用户标签集合;
    候选集生成电路,被配置为将所述用户标签集合与所述数据知识库相关联,从所述多个知识问题集得到候选问题集合,其中,所述候选问题集合包括多个候选问题。
  19. 根据权利要求18所述的问题推荐装置,其中,所述集合获取电路还包括:
    知识库建立电路,被配置为从网络抓取数据集合,并对所述数据集合按照意图分类,以形成所述多个知识问题集,建立所述数据知识库。
  20. 根据权利要求18或19所述的问题推荐装置,其中,所述候选集生成电路包括:
    候选集生成子电路,被配置为建立所述用户标签集合与所述数据知识库的标准问题之间的映射关系,将所述用户标签集合匹配所述数据知识库中的标准问题,与所匹配到的标准问题相对应的知识问题集组成所述候选问题集合。
  21. 根据权利要求16-20中任一项所述的问题推荐装置,其中,所述行为分析电路包括:
    行为分析子电路,被配置为分析所述用户行为数据,将用户点击过的问题或者用户感兴趣的词句转化为所述用户兴趣参数。
  22. 根据权利要求16-21中任一项所述的问题推荐装置,其中,特征生成电路包括:
    特征生成子电路,被配置为采用至少一个相似度匹配模型,基于所述用户兴趣参数和所述多个候选问题,得到所述每一个候选问题与所述用户兴趣参数之间的至少一个相似度特征。
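  23. A question recommendation system, comprising a terminal and a question recommendation server, wherein
    the terminal is configured to send request data to the question recommendation server;
    the question recommendation server is configured to:
    in response to the request data:
    obtain a candidate question set of a user, wherein the candidate question set comprises a plurality of candidate questions;
    obtain user behavior data, and obtain a user interest parameter based on the user behavior data;
    obtain, based on the user interest parameter and the plurality of candidate questions, at least one similarity feature between each of the plurality of candidate questions and the user interest parameter; and
    rank the plurality of candidate questions based on user basic information, the plurality of candidate questions, and the at least one similarity feature, to obtain a question sequence;
    the terminal is further configured to display the first N candidate questions in the question sequence, wherein N is an integer greater than or equal to 1.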
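  24. An electronic device, comprising:
    a processor;
    a memory comprising one or more computer program modules;
    wherein the one or more computer program modules are stored in the memory and configured to be executed by the processor, and the one or more computer program modules comprise instructions for performing the question recommendation method according to any one of claims 1-15.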
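  25. A non-transitory readable storage medium having computer instructions stored thereon, wherein the computer instructions, when executed by a processor, perform the question recommendation method according to any one of claims 1-15.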
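
By way of non-limiting illustration only, the flow recited in claims 13, 15, 17, 22, and 23 might be sketched as follows. Everything in the sketch is an assumption made for illustration and is not taken from the claims: the `KNOWLEDGE_BASE` contents, the function names, the bag-of-words interest parameter, the use of cosine and Jaccard similarity as stand-ins for the "similarity matching model", the use of the user tag set as the only user-information feature, and the fixed linear weights standing in for the ranking model, which the claims leave unspecified.

```python
from collections import Counter
from math import sqrt

# Hypothetical data knowledge base: standard question -> knowledge question set.
KNOWLEDGE_BASE = {
    "diabetes diet": ["Which foods raise blood sugar in diabetes?",
                      "Is fruit safe with diabetes?"],
    "diabetes exercise": ["How much exercise helps control diabetes?"],
    "hypertension diet": ["Does salt raise blood pressure?"],
}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip("?.,!") for w in text.lower().split())
    return [w for w in words if w]


def candidate_set(user_tags):
    """Match user tags to standard questions; pool the matched question sets."""
    candidates = []
    for standard_question, question_set in KNOWLEDGE_BASE.items():
        if any(tag in tokenize(standard_question) for tag in user_tags):
            candidates.extend(question_set)
    return candidates


def interest_parameter(clicked_questions):
    """Turn clicked questions into a bag-of-words interest parameter."""
    return Counter(w for q in clicked_questions for w in tokenize(q))


def cosine(a, b):
    """Cosine similarity between two word-count bags."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard similarity between the vocabularies of two bags."""
    union = a.keys() | b.keys()
    return len(a.keys() & b.keys()) / len(union) if union else 0.0


def rank(user_tags, candidates, interest, weights=(0.6, 0.3, 0.1)):
    """Score each candidate from its feature vector and sort by score.

    The fixed weights are a placeholder for a trained ranking model.
    """
    scored = []
    for question in candidates:
        bag = Counter(tokenize(question))
        tag_overlap = (sum(t in bag for t in user_tags) / len(user_tags)
                       if user_tags else 0.0)
        features = (cosine(bag, interest), jaccard(bag, interest), tag_overlap)
        score = sum(w * f for w, f in zip(weights, features))
        scored.append((score, question))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [question for _, question in scored]


def recommend(user_tags, clicked_questions, top_n=2):
    """Return the first N questions of the ranked question sequence."""
    candidates = candidate_set(user_tags)
    interest = interest_parameter(clicked_questions)
    return rank(user_tags, candidates, interest)[:top_n]


if __name__ == "__main__":
    for q in recommend(["diabetes"], ["How does exercise affect blood sugar?"]):
        print(q)
```

In this sketch the server-side steps are collapsed into `recommend`, and the printed list plays the role of the terminal displaying the first N candidate questions. A practical embodiment could replace the weighted sum with any learned ranking model and the two similarity functions with any text matching model.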
PCT/CN2020/093390 2020-05-29 2020-05-29 Question recommendation method, device and system, electronic device, and readable storage medium WO2021237707A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
PCT/CN2020/093390 WO2021237707A1 (zh) 2020-05-29 2020-05-29 Question recommendation method, device and system, electronic device, and readable storage medium
CN202080000849.2A CN114072782A (zh) 2020-05-29 2020-05-29 Question recommendation method, device and system, electronic device, and readable storage medium
EP20900702.0A EP4002141A4 (en) 2020-05-29 2020-05-29 QUESTION RECOMMENDATION METHOD, APPARATUS AND SYSTEM, AND ELECTRONIC DEVICE AND COMPUTER READABLE RECORDING MEDIA
JP2022504676A JP2023535849A (ja) 2020-05-29 2020-05-29 Question recommendation method, device and system therefor, electronic device, and readable storage medium
US17/281,310 US20220198300A1 (en) 2020-05-29 2020-05-29 Question recommendation method, device and system, electronic device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/093390 WO2021237707A1 (zh) 2020-05-29 2020-05-29 Question recommendation method, device and system, electronic device, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2021237707A1 (zh) 2021-12-02

Family

ID=78745462

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093390 WO2021237707A1 (zh) 2020-05-29 2020-05-29 Question recommendation method, device and system, electronic device, and readable storage medium

Country Status (5)

Country Link
US (1) US20220198300A1 (zh)
EP (1) EP4002141A4 (zh)
JP (1) JP2023535849A (zh)
CN (1) CN114072782A (zh)
WO (1) WO2021237707A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116383481A (zh) * 2023-02-09 2023-07-04 四川云数赋智教育科技有限公司 Personalized test question recommendation method and system based on student profiles

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
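CN114398546A (zh) * 2022-01-06 2022-04-26 北京明略软件系统有限公司 Dish recommendation method and apparatus, storage medium, and electronic apparatus
CN114842930B (zh) * 2022-06-30 2022-09-27 苏州景昱医疗器械有限公司 Data acquisition method, apparatus, and system, and computer-readable storage medium
CN115659058B (zh) * 2022-12-30 2023-04-11 杭州远传新业科技股份有限公司 Method and apparatus for question generation
CN115982429B (zh) * 2023-03-21 2023-08-01 中交第四航务工程勘察设计院有限公司 Knowledge management method and system based on process control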

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
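CN105740476A (zh) * 2016-03-18 2016-07-06 科润智能科技股份有限公司 Associated question recommendation method, apparatus, and system
CN107451199A (zh) * 2017-07-05 2017-12-08 阿里巴巴集团控股有限公司 Question recommendation method, apparatus, and device
CN110188186A (zh) * 2019-04-24 2019-08-30 平安科技(深圳)有限公司 Content recommendation method for the medical field, electronic apparatus, device, and storage medium
CN110377715A (zh) * 2019-07-23 2019-10-25 天津汇智星源信息技术有限公司 Reasoning-based precise intelligent question answering method based on a legal knowledge graph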
US20200034465A1 (en) * 2018-07-30 2020-01-30 International Business Machines Corporation Increasing the accuracy of a statement by analyzing the relationships between entities in a knowledge graph

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
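CN104572734B (zh) * 2013-10-23 2019-04-30 腾讯科技(深圳)有限公司 Question recommendation method, apparatus, and system
JP7324709B2 (ja) * 2017-02-09 2023-08-10 コグノア,インク. Platform and system for digital personalized medicine
CN110096581B (zh) * 2019-04-28 2021-04-20 宁波深擎信息科技有限公司 System and method for constructing recommended questions for a question answering system based on user behavior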

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
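CN116383481A (zh) * 2023-02-09 2023-07-04 四川云数赋智教育科技有限公司 Personalized test question recommendation method and system based on student profiles
CN116383481B (zh) * 2023-02-09 2024-03-29 四川云数赋智教育科技有限公司 Personalized test question recommendation method and system based on student profiles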

Also Published As

Publication number Publication date
EP4002141A4 (en) 2022-09-28
JP2023535849A (ja) 2023-08-22
CN114072782A (zh) 2022-02-18
EP4002141A1 (en) 2022-05-25
US20220198300A1 (en) 2022-06-23

Similar Documents

Publication Publication Date Title
WO2021237707A1 (zh) Question recommendation method, device and system, electronic device, and readable storage medium
Ali et al. An intelligent healthcare monitoring framework using wearable sensors and social networking data
Sorokowska et al. Affective interpersonal touch in close relationships: A cross-cultural perspective
Vassli et al. Acceptance of health-related ICT among elderly people living in the community: A systematic review of qualitative evidence
US20200097814A1 (en) Method and system for enabling interactive dialogue session between user and virtual medical assistant
US11769571B2 (en) Cognitive evaluation of assessment questions and answers to determine patient characteristics
Lin et al. Psychometric evaluation of the Persian eHealth Literacy Scale (eHEALS) among elder Iranians with heart failure
Yan et al. A 12-week pilot study of acceptance of a computer-based chronic disease self-monitoring system among patients with type 2 diabetes mellitus and/or hypertension
US20020035486A1 (en) Computerized clinical questionnaire with dynamically presented questions
US11640403B2 (en) Methods and systems for automated analysis of behavior modification data
US20200211709A1 (en) Method and system to provide medical advice to a user in real time based on medical triage conversation
KR102217307B1 (ko) Novel healthcare monitoring method and apparatus using wearable sensors and social networking data, with machine learning and semantic knowledge-based big data analysis
Kwong et al. A prediction model of blood pressure for telemedicine
US20220384003A1 (en) Patient viewer customized with curated medical knowledge
Nag et al. Live personalized nutrition recommendation engine
US20220139535A1 (en) Efficient determination of a data entity storing healthcare data through mapped entry and/or traversal of a semantic data structure
US20160180051A1 (en) Method and arrangement for matching of diseases and detection of changes for a disease by the use of mathematical models
US20230082381A1 (en) Image and information extraction to make decisions using curated medical knowledge
KR20200113954A (ko) System and method for providing a user-customized health information service
KR20210052122A (ko) System and method for providing a user-customized food information service
US20240087700A1 (en) System and Method for Steering Care Plan Actions by Detecting Tone, Emotion, and/or Health Outcome
KR102360651B1 (ko) System for providing a personalized diet service for improving target disease indicators using medical-field MyData
Kalantari et al. Opportunities and challenges of consumer health information on the internet: Is cyberchondria an emerging challenge
Chen et al. What concerns consumers about hypertension? A comparison between the online health community and the Q&A forum
Dongre et al. Deep Learning-Based Drug Recommendation and ADR Detection Healthcare Model on Social Media

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20900702; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2022504676; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 2020900702; Country of ref document: EP; Effective date: 20220215)
NENP Non-entry into the national phase (Ref country code: DE)