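WO2023029502A1 - Method, apparatus, device, and medium for constructing a user portrait based on medical inquiry sessions - Google Patents

Method, apparatus, device, and medium for constructing a user portrait based on medical inquiry sessions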

Info

Publication number
WO2023029502A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
user
medical
medical inquiry
chief complaint
Prior art date
Application number
PCT/CN2022/087528
Other languages
English (en)
French (fr)
Inventor
赵建双
Original Assignee
康键信息技术(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 康键信息技术(深圳)有限公司
Publication of WO2023029502A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment

Definitions

  • The present application relates to the technical field of machine learning, and in particular to a method, apparatus, device, and medium for constructing user portraits based on medical inquiry sessions.
  • This application aims to solve at least one of the technical problems existing in the prior art. To this end, it proposes a method, apparatus, device, and medium for constructing user portraits based on medical inquiry sessions, which can improve the efficiency of constructing user portraits and reduce labor costs.
  • The method for constructing a user portrait based on a medical inquiry session includes: acquiring chief complaint information input by a user, wherein the chief complaint information is the user's description of the disease; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain medical inquiry questions matching the chief complaint information, wherein the prediction network model is trained on a first data set, the first data set includes a plurality of medical inquiry samples, and each medical inquiry sample includes a medical inquiry question and a corresponding disease; presenting the medical inquiry questions to the user to obtain medical inquiry information input by the user; and constructing a user portrait according to the medical inquiry information.
  • The apparatus for constructing a user portrait based on a medical inquiry session includes: an information acquisition module, used to acquire chief complaint information input by a user, wherein the chief complaint information is the user's description of the disease; a feature extraction module, used to perform feature extraction on the chief complaint information to obtain a first feature vector matrix; a prediction module, used to input the first feature vector matrix into a prediction network model to obtain medical inquiry questions matching the chief complaint information, wherein the prediction network model is trained on a first data set, the first data set includes a plurality of medical inquiry samples, and each medical inquiry sample includes a medical inquiry question and a corresponding disease; an inquiry module, used to present the medical inquiry questions to the user to obtain medical inquiry information input by the user; and a portrait construction module, used to construct a user portrait according to the medical inquiry information.
  • An electronic device includes: at least one memory; at least one processor; and at least one program, wherein the program is stored in the memory and the processor executes the at least one program to implement:
  • The storage medium is a computer-readable storage medium that stores computer-executable instructions. The computer-executable instructions are used to make a computer execute a method for constructing a user portrait based on a medical inquiry session, wherein the method includes: acquiring chief complaint information input by a user, wherein the chief complaint information is the user's description of the disease; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain medical inquiry questions matching the chief complaint information, wherein the prediction network model is trained on a first data set, the first data set includes a plurality of medical inquiry samples, and each medical inquiry sample includes a medical inquiry question and a corresponding disease; presenting the medical inquiry questions to the user to obtain medical inquiry information input by the user; and constructing a user portrait according to the medical inquiry information.
  • The method, apparatus, device, and medium for constructing user portraits based on medical inquiry sessions achieve at least the following beneficial effects: the chief complaint information input by the user is identified, its features are extracted and input to the prediction network model, and the model predicts the medical inquiry questions corresponding to the chief complaint information. The user is then questioned quickly and automatically with these questions to obtain the user's medical inquiry information, and the user portrait is constructed from that information. This improves the efficiency of constructing user portraits, and because no manual consultation is needed to collect the medical inquiry information, labor costs are saved.
  • Through the constructed user portrait it is convenient to select a doctor in the treatment field that matches the user's current illness for further consultation, and the user portrait can be used to recommend items to the user more accurately.
  • FIG. 1 is a flowchart of a method for constructing a user portrait based on a medical inquiry session in an embodiment of the present application.
  • FIG. 2 is a flowchart of obtaining the first feature vector matrix in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of training a word vector model in an embodiment of the present application.
  • FIG. 4 is a flowchart of natural language preprocessing in an embodiment of the present application.
  • FIG. 5 is a flowchart of obtaining the medical inquiry questions in an embodiment of the present application.
  • FIG. 6 is a flowchart of obtaining the medical inquiry information in an embodiment of the present application.
  • FIG. 7 is a flowchart of a method for constructing a user portrait based on a medical inquiry session according to another embodiment of the present application.
  • FIG. 8 is a flowchart of obtaining a user's health labels in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present application.
  • Natural Language Processing (NLP): the use of computers to process, understand, and use human languages (such as Chinese and English). NLP is a branch of artificial intelligence and an interdisciplinary field between computer science and linguistics, often called computational linguistics. It includes syntax analysis, semantic analysis, text understanding, and so on, and is often used in technical fields such as machine translation, handwritten and printed character recognition, speech recognition and text-to-speech conversion, information retrieval, information extraction and filtering, text classification and clustering, and public opinion analysis and opinion mining. It involves data mining, machine learning, knowledge acquisition, knowledge engineering, and artificial intelligence research related to language processing, as well as linguistics research related to language computing.
  • Word2Vec: a tool for training word vectors. Word2Vec assumes that words that often appear together in a sentence are highly similar; that is, for a given center word, it maximizes the probability of the surrounding words. Word2Vec trains a three-layer network, and the last layer uses a Huffman tree for prediction.
  • Huffman: Huffman tree.
  • GloVe: another tool for training word vectors. GloVe is based on co-occurrence counts: first, a vocabulary co-occurrence matrix is constructed in which each row is a word and each column is a sentence, and the matrix records how often each word appears in each sentence. Because sentences are combinations of many words, the matrix has very high dimensionality and must be reduced in dimension.
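  • As an illustration only, the word-by-sentence count matrix described in the GloVe entry can be sketched as follows. The function name `word_sentence_counts` and the sample sentences are hypothetical, and the sketch follows the description given here literally; the published GloVe method counts word-word co-occurrences inside a context window rather than word-sentence counts, and the dimension-reduction step is not shown.

```python
from collections import Counter

def word_sentence_counts(sentences):
    """Return (vocab, matrix) where matrix[i][j] counts word i in sentence j."""
    # Rows are words (sorted for a stable order), columns are sentences.
    vocab = sorted({word for sentence in sentences for word in sentence})
    index = {word: i for i, word in enumerate(vocab)}
    matrix = [[0] * len(sentences) for _ in vocab]
    for j, sentence in enumerate(sentences):
        for word, count in Counter(sentence).items():
            matrix[index[word]][j] = count
    return vocab, matrix

# Hypothetical, already word-segmented sentences.
sentences = [["child", "picky", "eater"], ["child", "stomach", "pain", "pain"]]
vocab, matrix = word_sentence_counts(sentences)
for word, row in zip(vocab, matrix):
    print(word, row)
```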
  • LSTM: Long Short-Term Memory.
  • Bi-directional Long Short-Term Memory (BiLSTM): composed of a forward LSTM and a backward LSTM, so it can use information from both past and future time steps. Compared with a unidirectional LSTM, its final predictions are more accurate.
  • Conditional Random Field (CRF): a discriminative probabilistic model and a type of random field. It is often used to label or analyze sequence data, for example in lexical analysis tasks such as Chinese word segmentation and part-of-speech tagging.
  • BRNN: Bidirectional Recurrent Neural Network.
  • Highway network: a neural network to which "gate" structures are added. The gates address the problem that, as network depth increases, the backward flow of gradient information is blocked and training becomes difficult.
  • Convolutional Neural Network (CNN): a type of feed-forward neural network that includes convolution calculations and has a deep structure. A convolutional neural network can learn representations and can perform translation-invariant classification of input information according to its hierarchical structure. It can be applied in both supervised and unsupervised learning.
  • Artificial intelligence (AI): the use of digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence basic technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and mechatronics.
  • Artificial intelligence software technology mainly includes computer vision technology, robotics technology, biometrics technology, speech processing technology, natural language processing technology, and machine learning/deep learning.
  • The embodiments of the present application provide a method, apparatus, device, and medium for constructing user portraits based on medical inquiry sessions, which can improve the efficiency and accuracy of constructing user portraits.
  • These are described in detail through the following embodiments. First, the method for constructing a user portrait based on a medical inquiry session in the embodiment of the present application is described.
  • The method for constructing a user portrait based on a medical inquiry session provided in the embodiment of the present application relates to the technical field of machine learning.
  • The method can be applied to a terminal or to a server, and can also be software running on the terminal or the server.
  • The terminal can be a smartphone, a tablet computer, a notebook computer, a desktop computer, or a smart watch. The server can be configured as an independent physical server, as a server cluster or distributed system composed of multiple physical servers, or as a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms. The software can be an application that implements the method for constructing a user portrait based on a medical inquiry session, but is not limited to the above forms.
  • Referring to FIG. 1, which is an optional flow chart of the method for constructing a user portrait based on a medical inquiry session in an embodiment of the present application, the method may include, but is not limited to, steps S100 to S500.
  • S400 presenting medical inquiry questions to the user, so as to obtain medical inquiry information input by the user;
  • The chief complaint information acquired in the present application is the user's description of the disease.
  • Different ways of obtaining the chief complaint information may be selected.
  • When the method of the present application is implemented as an application program (Application, APP), the user can enter the chief complaint information into a dialog box of the consultation APP. The chief complaint information can be text entered directly by the user or voice input by the user. For voice input, the consultation APP first performs speech recognition on the voice information and continues with the subsequent processing once the recognized text is obtained.
  • For example, if the text entered into the dialog box is "the child is 7 years old, a little short, and does not eat well", the consultation APP automatically recognizes and obtains the text in the dialog box as the chief complaint information.
  • After the chief complaint information is obtained, natural language preprocessing is performed on it, such as converting traditional Chinese to simplified Chinese, normalizing synonyms, and word segmentation. Feature extraction is then performed to obtain the first feature vector matrix, which includes a word vector for each token in the chief complaint information.
  • The prediction network model is trained on the first data set, which includes a plurality of medical inquiry samples; each medical inquiry sample includes a medical inquiry question and a corresponding disease.
  • For example, the symptom in one medical inquiry sample is "does not eat well", and the corresponding inquiry question is "Picky eaters are clearly picky about food and only eat certain kinds of food. Anorexia is a dislike of all foods. Which situation applies to the child?"; the symptom in another medical inquiry sample is "picky eater", and the corresponding question is "How long has the child been a picky eater?".
  • the predicted consultation questions are presented to the user to obtain the consultation information input by the user.
  • the presentation method can be presented in the form of text through a dialog box of the consultation APP, or can be converted into voice information through voice conversion, and then presented to the user through a speaker.
  • the user answers the medical inquiry question and re-enters the answer information.
  • the answer information input by all users is collected as the consultation information of the current user.
  • A user portrait is constructed. Table 1 shows, for one embodiment, the user portrait constructed according to the medical inquiry information:
  • Table 1
      Tag item            | Tag value
      Gender              | Male
      Age                 | 7 years old
      Height              | Short
      Weight              | -
      Symptoms and course | Picky eating (more than one month), dry stool
      Drug allergy        | None
      ...                 | -
  • In this embodiment, the medical inquiry questions related to the corresponding disease are selected by the prediction network model and put to the user automatically. No manual participation is required, which reduces labor costs and improves the efficiency of asking questions, thereby improving the efficiency of building user portraits.
  • Referring to FIG. 2, which is a flowchart of obtaining the first feature vector matrix in an embodiment of the present application, including:
  • Referring to FIG. 3, which is a schematic diagram of training a word vector model according to an embodiment of the present application.
  • This application uses a large number of medical consultation sample data as training samples. It can be understood that the medical consultation sample data are all word-segmented. Then it is trained by the GloVe algorithm to obtain a trained word vector model.
  • The word vector model converts words into vector representations; that is, it represents each word with a low-dimensional, dense, real-valued word vector so that the relatedness of words can be calculated. If two words are semantically related or similar, the distance between their word vectors is small.
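  • A minimal sketch of how relatedness between word vectors might be measured, assuming cosine similarity as the measure (the text does not name one). The three-dimensional vectors below are made-up values for illustration, not output of the trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical word vectors; real vectors would come from the trained word vector model.
vectors = {
    "picky eating": [0.9, 0.1, 0.0],
    "anorexia": [0.8, 0.2, 0.1],
    "fever": [0.0, 0.1, 0.9],
}

related = cosine_similarity(vectors["picky eating"], vectors["anorexia"])
unrelated = cosine_similarity(vectors["picky eating"], vectors["fever"])
print(related > unrelated)
```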
  • The Word2Vec algorithm can also be used to train word vectors. Compared with the Word2Vec algorithm, the GloVe algorithm has advantages in parallel processing and is faster.
  • Natural language preprocessing may include: removing stop words, converting traditional Chinese to simplified Chinese, normalizing synonyms, word segmentation, etc. After natural language preprocessing, multiple first word segments are obtained and input into the word vector model to obtain multiple first word vectors. It can be understood that the first word segments and the first word vectors are in one-to-one correspondence. Finally, the first feature vector matrix is obtained by combining the first word vectors.
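  • The combination of word vectors into the first feature vector matrix can be sketched as a simple row-wise stack, one row per word segment. The name `build_feature_matrix`, the toy vectors, and the choice of a zero vector for out-of-vocabulary words are assumptions for illustration; the text does not say how unknown words are handled.

```python
def build_feature_matrix(tokens, word_vectors, dim):
    """Stack one word vector per token; unknown tokens map to a zero vector (assumption)."""
    unknown = [0.0] * dim
    return [list(word_vectors.get(token, unknown)) for token in tokens]

# Hypothetical 3-dimensional word vectors standing in for the trained model.
word_vectors = {
    "child": [0.2, 0.7, 0.1],
    "picky eating": [0.9, 0.1, 0.0],
}
tokens = ["child", "picky eating", "recently"]
first_feature_matrix = build_feature_matrix(tokens, word_vectors, dim=3)
print(len(first_feature_matrix), "rows x", len(first_feature_matrix[0]), "columns")
```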
  • Referring to FIG. 4, which is a flowchart of natural language preprocessing in an embodiment of the present application, including:
  • Natural language preprocessing includes: converting traditional Chinese to simplified Chinese, word segmentation, removing stop words, and normalizing synonyms.
  • For example, the input text information is: "Hello, doctor, I have a little stomachache."
  • After word segmentation, it becomes: "[Hello] [,] [doctor] [,] [I] [a little] [stomach pain] [.]"
  • After stop words are removed, it becomes: "[I] [a little] [stomach pain]".
  • Words in the stop-word list can be removed from the word segmentation results to reduce the amount of data in subsequent processing.
  • Synonym normalization replaces words that have the same meaning with one specific word, which also reduces the amount of data in subsequent processing.
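  • A minimal sketch of stop-word removal followed by synonym normalization on already segmented text. The stop-word list and synonym map below are hypothetical stand-ins (the actual lists are not given here), and word segmentation and traditional-to-simplified conversion are assumed to have been done by an external tool.

```python
# Hypothetical stop-word list and synonym map, for illustration only.
STOP_WORDS = {"hello", "doctor", ",", "."}
SYNONYMS = {"bellyache": "stomach pain", "tummy ache": "stomach pain"}

def preprocess(tokens):
    """Drop stop words, then map each remaining token to its canonical synonym."""
    kept = [token for token in tokens if token.lower() not in STOP_WORDS]
    return [SYNONYMS.get(token.lower(), token) for token in kept]

tokens = ["Hello", ",", "doctor", ",", "I", "a little", "bellyache", "."]
result = preprocess(tokens)
print(result)  # → ['I', 'a little', 'stomach pain']
```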
  • Table 2 is a normalized mapping table for synonym conversion:
  • The prediction network model includes a bidirectional recurrent neural network, a highway network, and a convolutional neural network.
  • Inputting the first feature vector matrix into the prediction network model to obtain medical inquiry questions matching the chief complaint information includes:
  • The features in the input first feature vector matrix are first fused by the bidirectional recurrent neural network; that is, the current word vector is concatenated with its adjacent word vectors to learn the semantic features of the current word vector, yielding a first fusion feature vector matrix. The first fusion feature vector matrix is then input into the highway network, and a first depth feature vector matrix is obtained through the multi-layer network. Finally, the convolutional neural network performs feature extraction on the first depth feature vector matrix to obtain a low-dimensional first vector, from which the medical inquiry questions matching the chief complaint information can be obtained.
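  • To make the "gate" idea of the highway stage concrete, the following sketch computes one highway layer in the commonly used form y = T(x) * H(x) + (1 - T(x)) * x, with a tanh transform H and a sigmoid gate T. This form, the function names, and the toy weights are assumptions; the text does not specify the layer equations, and the recurrent and convolutional stages are not shown.

```python
import math

def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer: the gate t mixes the transformed input h with the raw input x."""
    h = [math.tanh(a + b) for a, b in zip(matvec(w_h, x), b_h)]
    t = [1.0 / (1.0 + math.exp(-(a + b))) for a, b in zip(matvec(w_t, x), b_t)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

x = [0.5, -0.2]
zeros = [[0.0, 0.0], [0.0, 0.0]]
# A strongly negative gate bias closes the gate, so the input is carried through unchanged.
carried = highway_layer(x, zeros, [0.0, 0.0], zeros, [-50.0, -50.0])
# A strongly positive gate bias opens the gate, so the output is the transform tanh(0) = 0.
transformed = highway_layer(x, zeros, [0.0, 0.0], zeros, [50.0, 50.0])
print(carried, transformed)
```

  • The carry path (1 - T(x)) * x is what lets gradient information pass through many stacked layers, which is the training difficulty the glossary entry for this network refers to.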
  • Presenting the medical inquiry questions to the user to obtain the medical inquiry information input by the user includes:
  • When the present application presents medical inquiry questions to the user, it also presents structured answer options at the same time, so the medical inquiry information obtained is structured user answer information.
  • For example, when the consultation APP presents a medical inquiry question to the user, it presents the structured answers to that question at the same time. If the inquiry question is "Picky eaters are picky about food and only eat certain types of food. Anorexia is dislike of all foods. What kind of situation does the child belong to?", the user can only select "picky eating" or "anorexia" and is not allowed to enter a free-form answer.
  • The structured user answer information can be used directly to construct the user portrait without further processing of the medical inquiry information, which further improves the efficiency of constructing the user portrait.
  • The medical inquiry information input by the user may also be processed by keyword matching, or by using a feature extraction network to extract keywords, in order to construct the user portrait.
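  • One way to picture structured answers feeding the portrait directly is the sketch below. The `InquiryQuestion` class, the `record_answer` helper, and the tag name "Symptoms" are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InquiryQuestion:
    tag_item: str   # portrait tag this question fills in
    text: str       # question shown to the user
    options: tuple  # the only answers the user may select

def record_answer(portrait, question, answer):
    """Return a copy of the portrait with the selected option stored under the question's tag."""
    if answer not in question.options:
        raise ValueError(f"answer must be one of {question.options}")
    updated = dict(portrait)
    updated[question.tag_item] = answer
    return updated

question = InquiryQuestion(
    tag_item="Symptoms",
    text="What kind of situation does the child belong to?",
    options=("picky eating", "anorexia"),
)
portrait = record_answer({"Age": "7 years old"}, question, "picky eating")
print(portrait)
```

  • Because the answer is restricted to the listed options, it can be written into the portrait as-is, with no keyword extraction step.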
  • Referring to FIG. 7, which is an optional flow chart of the method for constructing a user portrait based on a medical inquiry session according to another embodiment of the present application.
  • the method also includes:
  • a preliminary user portrait is constructed based on the consultation information.
  • a manual consultation is required at this time to make up for the information missed during the automatic questioning.
  • qualified physicians in the field of care can be selected for manual consultation.
  • This session information can be text information obtained through the dialog box of the consultation APP, or it can be voice information during the voice consultation. If it is a voice session information, it is necessary to carry out voice recognition on the voice information, and then proceed to the subsequent processing.
  • After the session information is obtained, natural language preprocessing is performed on it, such as converting traditional Chinese to simplified Chinese, normalizing synonyms, and word segmentation. Feature extraction is then performed to obtain the second feature vector matrix, which includes a word vector for each token in the session information.
  • the same trained word vector model in the above embodiment can be used to extract the feature vector of the conversation information. The way of training the word vector model has been described in detail in the above embodiment, and will not be repeated here.
  • The second feature vector matrix is input into the label extraction network model to obtain the user's health labels, wherein the label extraction network model is trained on a second data set that includes a plurality of pieces of session information and the health label corresponding to each piece of session information.
  • The label extraction network model of the present application annotates the collected session information, and the corresponding health labels are extracted according to the resulting annotations. For example, Table 3 is a correspondence table of the annotated corpus in an embodiment:
  • The second data set contains session information and the health labels corresponding to the session information.
  • The label extraction network model is trained on the second data set, and the trained model can annotate the currently input session information so that health labels can be extracted according to the annotations. For example, when the input session information is "I have a stomachache", the label extraction network model outputs the health label "stomachache"; in this way the session information is filtered to retain the key information related to the disease.
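  • As a sketch of how labels could be read off an annotated sequence, the code below assumes a BIO tagging scheme ("B-" begins an entity, "I-" continues it, "O" is outside). The scheme, the tag name SYMPTOM, and the function name are assumptions; the annotation format actually used is not reproduced here.

```python
def extract_labels(tokens, tags):
    """Collect (label type, text) pairs from a BIO-tagged token sequence."""
    labels, current, kind = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close any entity that was open.
            if current:
                labels.append((kind, " ".join(current)))
            current, kind = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == kind:
            current.append(token)
        else:
            # "O" or an inconsistent "I-" tag ends the open entity.
            if current:
                labels.append((kind, " ".join(current)))
            current, kind = [], None
    if current:
        labels.append((kind, " ".join(current)))
    return labels

tokens = ["I", "have", "a", "stomach", "pain"]
tags = ["O", "O", "O", "B-SYMPTOM", "I-SYMPTOM"]
health_labels = extract_labels(tokens, tags)
print(health_labels)  # → [('SYMPTOM', 'stomach pain')]
```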
  • the health tags can also include the user's personal information, the treatment of the disease, etc., and the training samples in the second data set can be changed according to the specific needs of constructing the user portrait.
  • the update process is to merge the user portrait obtained according to the main complaint information with the user portrait obtained according to the session information.
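  • The merge step can be sketched as a dictionary update. The rule used here, that a non-empty value from the session-based portrait overrides the preliminary value and "-" means "unknown", is an assumption; the text only states that the two portraits are merged. The sample values are made up.

```python
EMPTY_VALUES = (None, "", "-")

def merge_portraits(preliminary, from_session):
    """Merge two portraits; non-empty session values fill in or override preliminary ones."""
    merged = dict(preliminary)
    for tag_item, value in from_session.items():
        if value in EMPTY_VALUES:
            continue  # keep whatever the preliminary portrait already has
        merged[tag_item] = value
    return merged

# Hypothetical example portraits.
preliminary = {"Gender": "male", "Weight": "-", "Symptoms": "picky eating"}
from_session = {"Gender": "-", "Weight": "20 kg", "Drug allergy": "none"}
updated_portrait = merge_portraits(preliminary, from_session)
print(updated_portrait)
```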
  • Table 4 shows a user portrait constructed according to session information in an embodiment:
  • In this way, the user's subsequent consultation experience can be improved, and medical items can be recommended more accurately.
  • Inputting the second feature vector matrix into the label extraction network model to obtain the user's health labels includes:
  • The label extraction network of this application includes a bidirectional long short-term memory network and a conditional random field. Specifically, this application uses BiLSTM-CRF to annotate the input session information, where the BiLSTM is composed of a forward LSTM and a backward LSTM. During training, the annotated session information is first mapped to word vectors by the word vector model, and the word vectors are then input to the BiLSTM layer.
  • The BiLSTM layer outputs, for each word, a score for each label. The outputs of the BiLSTM layer are then used as the input of the CRF layer, which produces the final prediction by learning the order dependencies between labels. The CRF layer learns the transition probabilities between labels in the second data set and uses them to correct the output of the BiLSTM layer, which keeps the predicted label sequence reasonable and improves the accuracy of the resulting health labels.
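  • The correction applied by the CRF layer can be illustrated with Viterbi decoding, the standard way to pick the best label sequence from per-word scores plus label-to-label transition scores. The label set and all scores below are made-up values; in the described system the per-word scores would come from the BiLSTM layer and the transition scores would be learned from the second data set.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[t][y]: score of label y at position t (e.g. from a BiLSTM layer).
    transitions[p][y]: score of moving from label p to label y.
    """
    n = len(labels)
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        prev_best, new_score = [], []
        for y in range(n):
            p = max(range(n), key=lambda q: score[q] + transitions[q][y])
            prev_best.append(p)
            new_score.append(score[p] + transitions[p][y] + emit[y])
        score = new_score
        backpointers.append(prev_best)
    best = max(range(n), key=lambda y: score[y])
    path = [best]
    for prev_best in reversed(backpointers):
        best = prev_best[best]
        path.append(best)
    path.reverse()
    return [labels[y] for y in path]

labels = ["O", "B-SYM", "I-SYM"]
# Hypothetical per-word scores; taken alone, the second word would be tagged I-SYM.
emissions = [[2.0, 0.5, 0.1], [0.2, 1.0, 1.2], [0.1, 0.3, 1.5]]
# Hypothetical transition scores: moving from O straight to I-SYM is heavily penalized.
transitions = [[0.0, 0.0, -10.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
decoded = viterbi(emissions, transitions, labels)
print(decoded)  # → ['O', 'B-SYM', 'I-SYM']
```

  • Picking the top-scoring label for each word independently would give O, I-SYM, I-SYM, which is not a valid sequence; the transition penalty steers decoding to O, B-SYM, I-SYM instead.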
  • The present application also discloses an apparatus for constructing a user portrait based on a medical inquiry session, including:
  • An information acquisition module, used to acquire the chief complaint information input by the user, wherein the chief complaint information is the user's description of the disease;
  • A feature extraction module, used to perform feature extraction on the chief complaint information to obtain the first feature vector matrix;
  • A prediction module, used to input the first feature vector matrix into the prediction network model to obtain medical inquiry questions matching the chief complaint information, wherein the prediction network model is trained on the first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes a medical inquiry question and a corresponding disease;
  • An inquiry module, used to present the medical inquiry questions to the user to obtain the medical inquiry information input by the user;
  • A portrait construction module, used to construct a user portrait according to the medical inquiry information.
  • the specific implementation steps of the device for constructing a user portrait based on a medical consultation session of the present application are the same as the specific implementation steps of the method for constructing a user portrait based on a medical consultation session in the above-mentioned embodiments, and will not be repeated here.
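  • How the five modules hand data to one another can be sketched with plain functions. Every function below is a hypothetical stand-in (for example, `predict_questions` replaces the trained prediction network model with a keyword check); only the order of the calls reflects the module description.

```python
def run_inquiry_session(chief_complaint, extract_features, predict_questions,
                        ask_user, build_portrait):
    """Chain the modules: features -> predicted questions -> user answers -> portrait."""
    features = extract_features(chief_complaint)
    questions = predict_questions(features)
    inquiry_info = [(question, ask_user(question)) for question in questions]
    return build_portrait(inquiry_info)

# Hypothetical stand-ins for the real modules.
def extract_features(text):
    return text.lower().split()

def predict_questions(features):
    if "picky" in features:
        return ["How long has the child been a picky eater?"]
    return []

def ask_user(question):
    return "more than one month"  # canned answer in place of real user input

def build_portrait(inquiry_info):
    return dict(inquiry_info)

session_portrait = run_inquiry_session(
    "The child is a picky eater", extract_features, predict_questions,
    ask_user, build_portrait)
print(session_portrait)
```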
  • The present application also discloses an electronic device, including: at least one memory, at least one processor, and at least one program, wherein the program is stored in the memory and the processor executes the at least one program to implement a method for constructing a user portrait based on a medical inquiry session.
  • The method includes: acquiring the chief complaint information input by the user, wherein the chief complaint information is the user's description of the disease; performing feature extraction on the chief complaint information to obtain the first feature vector matrix; inputting the first feature vector matrix into the prediction network model to obtain medical inquiry questions matching the chief complaint information, wherein the prediction network model is trained on the first data set, the first data set includes a plurality of medical inquiry samples, and each medical inquiry sample includes a medical inquiry question and a corresponding disease; presenting the medical inquiry questions to the user to obtain the medical inquiry information input by the user; and constructing a user portrait based on the medical inquiry information.
  • the electronic device may be any intelligent terminal, including a mobile phone, a tablet computer, a personal digital assistant (PDA), or a vehicle-mounted computer.
  • FIG. 9 illustrates a hardware structure of an electronic device in an embodiment, and the electronic device includes:
  • a processor, which can be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and which executes the relevant programs to realize the technical solutions provided by the embodiments of the present disclosure;
  • a memory, which can be implemented as ROM (Read Only Memory), a static storage device, a dynamic storage device, or RAM (Random Access Memory).
  • the memory can store the operating system and other application programs.
  • when the technical solutions provided by the embodiments are implemented in software or firmware, the relevant program code is stored in the memory and called by the processor to execute the method for constructing a user portrait based on a medical inquiry session;
  • an input/output interface, used to realize information input and output;
  • a communication interface, used to realize communication between this device and other devices, either by wired means (such as USB or a network cable) or by wireless means (such as a mobile network, Wi-Fi, or Bluetooth);
  • a bus, which transfers information between the components of the device, such as the processor, the memory, the input/output interface, and the communication interface;
  • the processor, the memory, the input/output interface and the communication interface are connected to each other within the device through the bus.
  • the present application also discloses a storage medium, which is a computer-readable storage medium storing computer-executable instructions that cause a computer to execute a method for constructing a user portrait based on a medical inquiry session.
  • the method for constructing the user portrait based on the medical inquiry session includes: acquiring the chief complaint information input by the user, wherein the chief complaint information is a description of the user's condition; performing feature extraction on the chief complaint information to obtain the first feature vector matrix;
  • inputting the first feature vector matrix into the prediction network model to obtain inquiry questions matching the chief complaint information, wherein the prediction network model is trained on the first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and the corresponding condition; presenting the inquiry questions to the user to obtain the inquiry information input by the user; and constructing the user portrait from the inquiry information.
  • the computer-readable storage medium may be non-volatile or volatile.
  • memory can be used to store non-transitory software programs and non-transitory computer-executable programs.
  • the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices.
  • the memory optionally includes memory located remotely from the processor, and these remote memories may be connected to the processor via a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the device embodiments described above are only illustrative, and the units described as separate components may or may not be physically separated, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • "At least one (item)" means one or more, and "multiple" means two or more.
  • "And/or" describes the relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural.
  • the character "/" generally indicates an "or" relationship between the objects before and after it.
  • "At least one of the following" or similar expressions refer to any combination of the listed items, including any combination of single or plural items.
  • "At least one item (piece) of a, b or c" can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can each be single or multiple.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store programs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Molecular Biology (AREA)
  • Pathology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

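Provided are a method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session, relating to the technical field of machine learning. The method includes: acquiring chief complaint information input by a user; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set that includes multiple medical inquiry samples, each of which includes an inquiry question and a corresponding condition; presenting the inquiry question to the user to obtain inquiry information input by the user; and constructing a user portrait from the inquiry information. By recognizing the chief complaint information input by the user and using the prediction network model to obtain the corresponding inquiry questions, the user can be questioned automatically and quickly, which improves the efficiency of acquiring inquiry information and reduces labor cost.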

Description

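Method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session
This application claims priority to Chinese patent application No. 202111005960.3, entitled "Method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session", filed with the China Patent Office on August 30, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the technical field of machine learning, and in particular to a method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session.
Background
In related-art methods for generating user portraits from medical data, the inquiry information exchanged between doctor and patient is mostly obtained through manual online consultation; the inquiry information is segmented, screened, and labeled, and the patient's user portrait is built from the labeling results. However, the inventor has realized that obtaining inquiry information through manual online consultation and building the user portrait in this way is inefficient and incurs high labor cost.
Technical Problem
This application aims to solve at least one of the technical problems existing in the prior art. To this end, it proposes a method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session, which can improve the efficiency of constructing user portraits and reduce labor cost.
Technical Solution
A method for constructing a user portrait based on a medical inquiry session according to an embodiment of the first aspect of this application includes: acquiring chief complaint information input by a user, where the chief complaint information is a description of the user's condition; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and a corresponding condition; presenting the inquiry question to the user to obtain inquiry information input by the user; and constructing a user portrait from the inquiry information.
An apparatus for constructing a user portrait based on a medical inquiry session according to an embodiment of the second aspect of this application includes: an information acquisition module, used to acquire chief complaint information input by a user, where the chief complaint information is a description of the user's condition; a feature extraction module, used to perform feature extraction on the chief complaint information to obtain a first feature vector matrix; a prediction module, used to input the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and a corresponding condition; an inquiry module, used to present the inquiry question to the user to obtain inquiry information input by the user; and a portrait construction module, used to construct a user portrait from the inquiry information.
An electronic device according to an embodiment of the third aspect of this application includes: at least one memory; at least one processor; and at least one program, where the program is stored in the memory and the processor executes the at least one program to implement a method for constructing a user portrait based on a medical inquiry session, the method including: acquiring chief complaint information input by a user, where the chief complaint information is a description of the user's condition; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and a corresponding condition; presenting the inquiry question to the user to obtain inquiry information input by the user; and constructing a user portrait from the inquiry information.
A storage medium according to an embodiment of the fourth aspect of this application is a computer-readable storage medium storing computer-executable instructions that cause a computer to execute a method for constructing a user portrait based on a medical inquiry session, the method including: acquiring chief complaint information input by a user, where the chief complaint information is a description of the user's condition; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and a corresponding condition; presenting the inquiry question to the user to obtain inquiry information input by the user; and constructing a user portrait from the inquiry information.
Beneficial Effects
The method, apparatus, device, and medium according to the embodiments of this application have at least the following beneficial effects. The chief complaint information input by the user is recognized, features are extracted from it and input into the prediction network model, and the model yields the inquiry questions corresponding to the chief complaint information. These questions are used to question the user automatically and quickly so as to obtain the user's inquiry information, and the user portrait is constructed from that information. This improves the efficiency of constructing user portraits, and because no manual questioning is needed when collecting inquiry information, labor cost is saved. The constructed user portrait makes it easier to later select a doctor whose specialty matches the user's current condition for further consultation, and allows items to be recommended to the user more precisely.
Additional aspects and advantages of this application will be set forth in part in the description below, and in part will become apparent from that description or be learned through practice of this application.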
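Description of the Drawings
This application is further described below with reference to the drawings and embodiments, in which:
FIG. 1 is a flowchart of the method for constructing a user portrait based on a medical inquiry session according to an embodiment of this application;
FIG. 2 is a flowchart of obtaining the first feature vector matrix according to an embodiment of this application;
FIG. 3 is a schematic diagram of training the word vector model according to an embodiment of this application;
FIG. 4 is a flowchart of natural language preprocessing according to an embodiment of this application;
FIG. 5 is a flowchart of obtaining the inquiry question according to an embodiment of this application;
FIG. 6 is a flowchart of obtaining the inquiry information according to an embodiment of this application;
FIG. 7 is a flowchart of the method for constructing a user portrait based on a medical inquiry session according to another embodiment of this application;
FIG. 8 is a flowchart of obtaining the user's health labels according to an embodiment of this application;
FIG. 9 is a schematic diagram of the hardware structure of the electronic device provided by an embodiment of this application.
Embodiments of the Invention
To make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it.
It should be noted that although functional modules are divided in the apparatus diagrams and a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed with a module division different from that of the apparatus or in an order different from that of the flowcharts. The terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence.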
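First, several terms used in this application are explained:
Natural Language Processing (NLP): NLP uses computers to process, understand, and apply human languages (such as Chinese and English). It is a branch of artificial intelligence and an interdisciplinary field between computer science and linguistics, often also called computational linguistics. NLP includes syntactic analysis, semantic analysis, discourse understanding, and so on. It is commonly used in machine translation, handwritten and printed character recognition, speech recognition and text-to-speech conversion, information retrieval, information extraction and filtering, text classification and clustering, public opinion analysis, and opinion mining. It involves data mining, machine learning, knowledge acquisition, knowledge engineering, and artificial intelligence research related to language processing, as well as linguistic research related to language computation.
Word2Vec: a tool for training word vectors. Word2Vec assumes that words that frequently appear in the same sentence are highly similar; that is, for a center word it maximizes the probability of the surrounding words. Word2Vec is trained with a three-layer network, and the last layer uses a Huffman tree for prediction.
GloVe: another tool for training word vectors, implemented through co-occurrence counts. First, a co-occurrence matrix of the vocabulary is built, in which each row is a word and each column is a sentence; the matrix is used to compute how often each word appears in each sentence. Because sentences are combinations of many words, the dimensionality is very large, so the co-occurrence matrix must be reduced in dimension.
Long Short-Term Memory network (LSTM): a recurrent neural network over time that can learn long-term dependencies and preserve the error. When back-propagating through time and layers, it keeps the error at a more constant level, allowing the recurrent network to learn over many time steps and thereby establish long-range causal links. It is well suited to modeling sequential data such as text.
Bidirectional Long Short-Term Memory network (BiLSTM): a combination of a forward LSTM and a backward LSTM that can use information from both past and future time steps, so its final predictions are more accurate than those of a unidirectional LSTM.
Conditional Random Field (CRF): a discriminative probabilistic model and a type of random field, commonly used to label or analyze sequence data, and frequently used in lexical analysis tasks such as Chinese word segmentation and part-of-speech tagging.
Bidirectional Recurrent Neural Network (BRNN): a network formed by stacking two unidirectional recurrent neural networks, whose output is determined jointly by the states of both networks; the output at the current time step depends on both earlier and later states.
Highway Network: a network with added "gate" structures, which addresses the problem that, as the network gets deeper, the backward flow of gradient information is blocked and training becomes difficult.
Convolutional Neural Network (CNN): a class of feedforward neural networks that include convolution operations and have a deep structure. CNNs have representation learning capability, can perform shift-invariant classification of input information according to their hierarchical structure, and can be applied in both supervised and unsupervised learning.
The embodiments of this application can acquire and process the relevant data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, methods, technology, and application systems that use digital computers or machines controlled by digital computers to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operating and interaction systems, and mechatronics. AI software technologies mainly include computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning/deep learning.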
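On this basis, the embodiments of this application provide a method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session, which can improve the efficiency and accuracy of constructing user portraits.
The method, apparatus, device, and medium are described through the following embodiments, beginning with the method for constructing a user portrait based on a medical inquiry session.
The method provided by the embodiments of this application relates to the technical field of machine learning. It can be applied in a terminal, on a server side, or as software running in a terminal or on a server side. In some embodiments, the terminal may be a smartphone, tablet computer, notebook computer, desktop computer, or smart watch; the server side may be configured as an independent physical server, as a server cluster or distributed system composed of multiple physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms; the software may be an application that implements the method, but is not limited to the above forms.
Referring to FIG. 1, which is an optional flowchart of the method according to an embodiment of this application, the method in FIG. 1 may include, but is not limited to, S100 to S500.
S100: acquire the chief complaint information input by the user;
S200: perform feature extraction on the chief complaint information to obtain the first feature vector matrix;
S300: input the first feature vector matrix into the prediction network model to obtain an inquiry question matching the chief complaint information;
S400: present the inquiry question to the user to obtain the inquiry information input by the user;
S500: construct the user portrait from the inquiry information.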
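The flow S100 to S500 can be sketched as a single function that chains the steps. This is only an illustrative outline: the function name `run_inquiry_session` and its four callable parameters are hypothetical placeholders for components that the application describes but does not name.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]


def run_inquiry_session(
    chief_complaint: str,                                  # S100: text already acquired
    extract_features: Callable[[str], Matrix],             # hypothetical S200 component
    predict_question: Callable[[Matrix], str],             # hypothetical S300 component
    ask_user: Callable[[str], str],                        # hypothetical S400 component
    build_portrait: Callable[[Dict[str, str]], Dict[str, str]],  # hypothetical S500
) -> Dict[str, str]:
    """Chain S200 to S500 for one chief complaint and one inquiry question."""
    features = extract_features(chief_complaint)       # S200
    question = predict_question(features)              # S300
    inquiry_info = {question: ask_user(question)}      # S400
    return build_portrait(inquiry_info)                # S500
```

Passing the components in as callables keeps the outline independent of any particular model or user interface.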
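In some embodiments, in S100, the chief complaint information acquired is a description of the user's condition. Depending on the specific implementation of the method, different ways of acquiring the chief complaint information can be chosen. For example, when the method is implemented as an inquiry application (APP), the user can enter the chief complaint information into a dialog box of the APP. The chief complaint information can be text typed directly by the user or speech input by the user; in the latter case, the APP first performs speech recognition and continues with the subsequent processing on the recognized text. For example, after opening the inquiry APP, the user types "My child is 7 years old, a bit short, and does not eat well" into the dialog box; the APP automatically recognizes and captures the text in the dialog box and uses it as the chief complaint information.
After the chief complaint information is acquired, natural language preprocessing is performed on it, such as traditional-to-simplified Chinese conversion, synonym normalization, and word segmentation, followed by feature extraction to obtain the first feature vector matrix, which contains the word vector of each token in the chief complaint information.
The first feature vector matrix is then input into the prediction network model to obtain an inquiry question matching the chief complaint information. The prediction network model is trained on a first data set that includes multiple medical inquiry samples, each of which includes an inquiry question and a corresponding condition. As a specific example, in one medical inquiry sample the condition is "does not eat well" and the corresponding inquiry question is "Picky eating means being noticeably fussy about food and eating only certain foods, while anorexia means disliking all foods. Which applies to your child?"; in another sample the condition is "picky eating" and the corresponding inquiry question is "Roughly how long has your child been a picky eater?". After the prediction network model has been trained on a large number of medical inquiry samples, inputting the chief complaint information into it yields the corresponding inquiry question.
The predicted inquiry question is presented to the user to obtain the inquiry information input by the user. It can be understood that different ways of presenting the question can be chosen depending on the specific implementation. For example, the question can be presented as text in the dialog box of the inquiry APP, or converted to speech and played to the user through a loudspeaker. Correspondingly, after receiving the inquiry question, the user answers it and inputs the answer. In some embodiments, once the user's answer is obtained, features are extracted from the answer and input into the prediction network model to obtain a new inquiry question; in this way questions are asked continuously, so that the inquiry questions relevant to a specific condition are covered comprehensively. In some other embodiments, one condition may correspond to multiple inquiry questions, which are put to the user in turn according to their weight priority to obtain the user's answers. Finally, all answers input by the user are collected as the inquiry information of the current user.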
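The variant in which one condition maps to several inquiry questions asked by weight priority can be illustrated as follows. This is a minimal sketch under stated assumptions: the application does not say how weights are represented, so the `(question, weight)` pairs and the function name `ask_by_priority` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def ask_by_priority(
    questions: List[Tuple[str, float]],
    ask_user: Callable[[str], str],
) -> Dict[str, str]:
    """Ask questions in descending weight order and collect the answers.

    `questions` holds (question text, weight) pairs; a higher weight is
    assumed to mean a higher priority.
    """
    inquiry_info: Dict[str, str] = {}
    ordered = sorted(questions, key=lambda item: item[1], reverse=True)
    for question, _weight in ordered:
        inquiry_info[question] = ask_user(question)
    return inquiry_info
```

The returned dictionary preserves the order in which the questions were asked, so it can serve directly as the collected inquiry information.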
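The user's inquiry information is acquired and processed, and the user portrait is then constructed. Table 1 shows the user portrait constructed from the inquiry information in one embodiment:
Table 1:
Label item | Label value
Gender |
Age | 7 years old
Height | Short
Weight | -
Symptoms and course | Picky eating (more than one month), dry stools
Drug allergy |
…… | -
In the method disclosed in this application, the prediction network model selects the inquiry questions related to the condition in question and the user is questioned automatically. Compared with traditional manual consultation, no human participation is needed, which reduces labor cost and speeds up questioning, and thus improves the efficiency of constructing user portraits.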
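Referring to FIG. 2, which is a flowchart of obtaining the first feature vector matrix according to an embodiment of this application, the process includes:
S210: perform natural language preprocessing on the chief complaint information to obtain multiple first tokens;
S220: input the multiple first tokens into a pre-trained word vector model to obtain multiple first word vectors;
S230: combine the multiple first word vectors to obtain the first feature vector matrix.
FIG. 3 is a schematic diagram of training the word vector model according to an embodiment of this application. A large amount of medical inquiry sample data is used as training samples; it can be understood that this data has already been word-segmented. Training is then performed with the GloVe algorithm to obtain the trained word vector model. The word vector model converts words into vector representations, that is, it represents each word with a low-dimensional, dense, real-valued word vector, so that the relatedness of words can be computed: if two words are semantically related or similar, the distance between their word vectors is small. In some other embodiments, the Word2Vec algorithm can also be used to train the word vectors; compared with Word2Vec, the GloVe algorithm has an advantage in parallel processing and is faster.
When performing feature extraction on the acquired chief complaint information, natural language preprocessing is performed first; it may include stop-word removal, traditional-to-simplified conversion, synonym normalization, and word segmentation. Preprocessing yields multiple first tokens, which are input into the word vector model to obtain multiple first word vectors. It can be understood that the first tokens and the first word vectors are in one-to-one correspondence. Finally, the first word vectors are combined to obtain the first feature vector matrix.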
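Steps S220 and S230, together with the idea that related words have nearby vectors, can be illustrated with a small lookup table. The sketch below is illustrative only: the three-dimensional vectors in `WORD_VECTORS` are made-up values standing in for a trained GloVe model, and the zero vector used for unknown tokens is an assumption rather than something the application specifies.

```python
import math
from typing import Dict, List

# Hypothetical 3-dimensional vectors; a trained model would supply real ones.
WORD_VECTORS: Dict[str, List[float]] = {
    "我": [0.1, 0.3, 0.2],
    "有点": [0.0, 0.2, 0.4],
    "腹痛": [0.9, 0.1, 0.0],
    "腹泻": [0.8, 0.2, 0.1],
}
UNKNOWN: List[float] = [0.0, 0.0, 0.0]  # assumed fallback for unseen tokens


def to_feature_matrix(tokens: List[str]) -> List[List[float]]:
    """S220 + S230: look up one vector per token and stack them row by row."""
    return [list(WORD_VECTORS.get(token, UNKNOWN)) for token in tokens]


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

With these made-up values, `cosine_similarity` scores "腹痛" (abdominal pain) closer to "腹泻" (diarrhea) than to "有点" (a little), which is the kind of relatedness the passage describes.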
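In some embodiments, referring to FIG. 4, which is a flowchart of natural language preprocessing according to an embodiment of this application, the process includes:
S211: convert the chief complaint information from traditional to simplified Chinese to obtain simplified information;
S212: segment the simplified information into words to obtain pre-segmentation information;
S213: remove stop words from the pre-segmentation information to obtain segmentation information;
S214: normalize synonyms in the segmentation information to obtain the multiple first tokens.
Natural language preprocessing includes traditional-to-simplified conversion, word segmentation, stop-word removal, and synonym normalization. As a specific example, when the input text is "您好,醫生,我有點肚子痛。" ("Hello, doctor, my stomach hurts a little."), traditional-to-simplified conversion gives "您好,医生,我有点肚子痛。"; word segmentation gives "【您好】【,】【医生】【,】【我】【有点】【肚子痛】【。】"; and stop-word removal gives "【我】【有点】【肚子痛】". By configuring a stop-word list file, the words in the list can be removed from the segmentation result, reducing the amount of data in subsequent processing. Synonym normalization replaces all words with the same meaning by one specific word, which likewise reduces the amount of data in subsequent processing. For example, Table 2 is a normalization mapping table for synonym conversion:
Table 2:
Core term | Term to be normalized
腹痛 (abdominal pain) | 肚子痛
腹痛 (abdominal pain) | 腹部疼
腹痛 (abdominal pain) | 腹疼
腹痛 (abdominal pain) | 腹部疼痛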
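The four preprocessing steps S211 to S214 can be walked through on the example sentence with a toy implementation. This is a sketch under stated assumptions: the character map, vocabulary, and stop-word list below are tiny hand-made stand-ins, and forward maximum matching is used for segmentation only because it is simple; the application does not state which segmenter or dictionaries are used. The synonym map is taken from Table 2.

```python
from typing import Dict, List, Set

# Tiny illustrative resources; a real system would load full dictionaries.
TRAD_TO_SIMP: Dict[str, str] = {"醫": "医", "點": "点"}
VOCABULARY: Set[str] = {"您好", "医生", "有点", "肚子痛"}
STOP_WORDS: Set[str] = {"您好", "医生", ",", ",", "。"}
SYNONYMS: Dict[str, str] = {  # mapping from Table 2
    "肚子痛": "腹痛", "腹部疼": "腹痛", "腹疼": "腹痛", "腹部疼痛": "腹痛",
}


def to_simplified(text: str) -> str:
    """S211: map each traditional character to its simplified form."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def segment(text: str, max_len: int = 4) -> List[str]:
    """S212: forward maximum matching against VOCABULARY."""
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no vocabulary word matches.
            if size == 1 or piece in VOCABULARY:
                tokens.append(piece)
                i += size
                break
    return tokens


def preprocess(text: str) -> List[str]:
    """Run S211 to S214 and return the first tokens."""
    tokens = segment(to_simplified(text))
    tokens = [t for t in tokens if t not in STOP_WORDS]   # S213
    return [SYNONYMS.get(t, t) for t in tokens]           # S214
```

On the example sentence, `segment` reproduces the segmentation given in the text, and `preprocess` ends with "肚子痛" replaced by the core term "腹痛".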
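In some embodiments, the prediction network model includes a bidirectional recurrent neural network, a highway network, and a convolutional neural network. Referring to FIG. 5, inputting the first feature vector matrix into the prediction network model to obtain an inquiry question matching the chief complaint information includes:
S310: input the first feature vector matrix into the bidirectional recurrent neural network for feature fusion to obtain a first fused feature vector matrix;
S320: input the first fused feature vector matrix into the highway network for deep processing to obtain a first deep feature vector matrix;
S330: perform feature extraction on the first deep feature vector matrix with the convolutional neural network to obtain a first vector;
S340: obtain the inquiry question matching the chief complaint information from the first vector.
In the prediction network model of this application, the bidirectional recurrent neural network fuses the features in the input first feature vector matrix, that is, it concatenates the current character vector with its neighboring character vectors to learn the semantic features of the current character vector, yielding the first fused feature vector matrix. The first fused feature vector matrix is then input into the highway network, whose multiple layers produce the first deep feature vector matrix. Finally, the convolutional neural network extracts features from the first deep feature vector matrix to obtain a low-dimensional first vector, from which the inquiry question matching the chief complaint information is obtained.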
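The "gate" structure of the highway network in S320 can be made concrete with a single layer written in plain Python. The application does not give the layer's equations, so this sketch assumes the commonly used highway formulation y = T(x) * H(x) + (1 - T(x)) * x, with H a tanh transform and T a sigmoid gate; the function name and weight parameters are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Weights = List[List[float]]


def sigmoid(value: float) -> float:
    return 1.0 / (1.0 + math.exp(-value))


def highway_layer(x: Vector, w_h: Weights, b_h: Vector,
                  w_t: Weights, b_t: Vector) -> Vector:
    """One highway layer: y = T * H(x) + (1 - T) * x (assumed formulation)."""
    def affine(weights: Weights, bias: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]

    transform = [math.tanh(v) for v in affine(w_h, b_h)]   # H(x)
    gate = [sigmoid(v) for v in affine(w_t, b_t)]          # T(x), in (0, 1)
    # The gate blends the transformed signal with the untouched input.
    return [t * h + (1.0 - t) * xi for t, h, xi in zip(gate, transform, x)]
```

When the gate is close to 0 the layer passes its input through almost unchanged, which is what lets gradients flow through many stacked layers; when it is close to 1 the layer behaves like an ordinary tanh layer.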
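In some embodiments, referring to FIG. 6, presenting the inquiry question to the user to obtain the inquiry information input by the user includes:
S410: present the inquiry question and structured answer options to the user;
S420: obtain the inquiry information input by the user from the answer option the user selects.
When presenting an inquiry question to the user, this application also presents structured answer options, so the inquiry information obtained is a structured user answer. For example, when the inquiry APP presents an inquiry question, it simultaneously presents structured answers to that question. If the inquiry question is "Picky eating means being noticeably fussy about food and eating only certain foods, while anorexia means disliking all foods. Which applies to your child?", the user can only choose "picky eating" or "anorexia" and is not allowed to enter a free-form answer. Because the user selects a structured answer, the inquiry information needs no further processing and can be used directly to construct the user portrait, which further improves the efficiency of constructing user portraits. In some other embodiments, the inquiry information input by the user can instead be processed by keyword matching or by using a feature extraction network to extract keywords, in order to construct the user portrait.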
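Restricting the user to structured answer options (S410 and S420) can be sketched as a simple validation step that writes the chosen option straight into the portrait. The function name, the `label` parameter, and the dictionary representation of the portrait are hypothetical choices made for this illustration.

```python
from typing import Dict, List


def record_structured_answer(
    portrait: Dict[str, str],
    label: str,
    options: List[str],
    choice: str,
) -> Dict[str, str]:
    """Return a copy of `portrait` with `label` set to the chosen option.

    Free-form input is rejected: `choice` must be one of `options`.
    """
    if choice not in options:
        raise ValueError(f"answer must be one of {options!r}")
    updated = dict(portrait)
    updated[label] = choice
    return updated
```

Because the stored value is always one of the predefined options, no keyword extraction is needed before it is used as a label value.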
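Referring to FIG. 7, which is an optional flowchart of the method according to another embodiment of this application, the method further includes:
S600: acquire the session information from a manual consultation with the user;
S700: perform feature extraction on the session information to obtain a second feature vector matrix;
S800: input the second feature vector matrix into a label extraction network model to obtain the user's health labels;
S900: update the user portrait according to the user's health labels.
After automatic questioning ends, a preliminary user portrait is constructed from the inquiry information. To make the user portrait more complete, a manual consultation is then needed to fill in the information missed during automatic questioning. In some embodiments, a physician whose specialty meets the requirements can be selected for the manual consultation according to the user portrait constructed during automatic questioning.
After the manual consultation ends, the session information from the consultation is acquired. It can be text obtained from the dialog box of the inquiry APP or speech from a voice consultation; if it is speech, speech recognition is performed first and the subsequent processing continues on the recognized text.
After the session information is acquired, natural language preprocessing is performed on it, such as traditional-to-simplified conversion, synonym normalization, and word segmentation, followed by feature extraction to obtain the second feature vector matrix, which contains the word vector of each token in the session information. The feature vectors can be extracted with the same trained word vector model as in the above embodiments; the way the word vector model is trained has been described in detail above and is not repeated here.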
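The second feature vector matrix is then input into the label extraction network model to obtain the user's health labels. The label extraction network model is trained on a second data set, which includes multiple pieces of session information and the health labels corresponding to them. The label extraction network model annotates the collected session information, and the corresponding health labels are extracted from the resulting annotations. For example, Table 3 shows the correspondence of an annotated corpus in one embodiment:
Table 3:
Source text | Annotation
 | O-O
 | O-O
 | O-O
 | S-spt
 | M-spt
 | E-spt
The second data set consists of session information and the corresponding health labels. The label extraction network model is trained on the prepared session information and corresponding health labels; once trained, it can annotate newly input session information, and health labels are extracted from the annotations. For example, when the input session information is "我有点肚子痛" ("My stomach hurts a little"), the health label obtained after processing by the label extraction network model is "肚子痛" ("stomach ache"); in this way the session information is filtered to obtain the key information related to the condition. It can be understood that health labels can also include the user's personal information, treatments for the condition, and so on; the training samples in the second data set can be changed according to the specific requirements for constructing the user portrait.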
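How health labels might be read off a tagged sequence can be sketched as follows. The reading of the tags is an assumption inferred from Table 3 and the example sentence: "S", "M", and "E" are taken to mark the start, middle, and end of a labeled span, "spt" is taken to be the label type (presumably symptom), and "O-O" is taken to mean "outside any span". The alignment of the six tags with the six characters of "我有点肚子痛" is likewise assumed, since the source-text column of Table 3 is not available.

```python
from typing import List, Optional, Tuple


def extract_labels(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (text, label type) pairs from start/middle/end style tags."""
    labels: List[Tuple[str, str]] = []
    current = ""
    current_type: Optional[str] = None
    for ch, tag in zip(chars, tags):
        position, _, tag_type = tag.partition("-")
        if position == "S":                                   # span starts
            current, current_type = ch, tag_type
        elif position == "M" and current_type == tag_type:    # span continues
            current += ch
        elif position == "E" and current_type == tag_type:    # span ends
            labels.append((current + ch, tag_type))
            current, current_type = "", None
        else:                                                 # outside or malformed
            current, current_type = "", None
    return labels
```

Applied to the example sentence with the tags of Table 3, the function returns the single span "肚子痛" with type "spt", matching the health label named in the text.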
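Finally, the user portrait is updated according to the user's health labels, making the final user portrait more complete. The update merges the user portrait obtained from the chief complaint information with the user portrait obtained from the session information. For example, Table 4 shows the user portrait constructed from the session information in one embodiment:
Table 4:
Figure PCTCN2022087528-appb-000001
After merging with the user portrait constructed in Table 1, the updated user portrait is obtained, as shown in Table 5:
Table 5:
Figure PCTCN2022087528-appb-000002
The label extraction network model extracts the health labels from the session information and the user portrait is updated accordingly, so that the final user portrait is more specific and complete. Constructing the user portrait can improve the user's subsequent consultation experience and the accuracy of medical recommendations delivered to the user.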
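The merge of the two portraits can be sketched as a dictionary update. Tables 4 and 5 are available only as figures, so the merge rule here is an assumption: values from the session-based portrait fill or replace entries of the initial portrait, and empty or "-" values never overwrite existing data. The label names and values in the usage example are hypothetical.

```python
from typing import Dict


def merge_portraits(initial: Dict[str, str], update: Dict[str, str]) -> Dict[str, str]:
    """Merge `update` into a copy of `initial`, skipping empty values."""
    merged = dict(initial)
    for label, value in update.items():
        if value in ("", "-"):
            continue  # nothing new was learned for this label
        merged[label] = value
    return merged


# Hypothetical values for illustration only.
initial_portrait = {"Age": "7 years old", "Weight": "-"}
session_portrait = {"Age": "", "Weight": "20 kg"}
updated_portrait = merge_portraits(initial_portrait, session_portrait)
```

Other policies, such as keeping both values when they conflict, would fit the description equally well.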
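In some embodiments, referring to FIG. 8, inputting the second feature vector matrix into the label extraction network model to obtain the user's health labels includes:
S810: input the second feature vector matrix into the bidirectional long short-term memory network for part-of-speech tagging to obtain label score probabilities;
S820: correct the label order of the label score probabilities with the conditional random field to obtain the user's health labels.
The label extraction network of this application includes a bidirectional long short-term memory network and a conditional random field. Specifically, this application uses BiLSTM-CRF to annotate the input session information, where the BiLSTM consists of a forward LSTM and a backward LSTM. During training, the annotated session information is first mapped to word vectors by the word vector model, and the word vectors are input into the BiLSTM layer, which learns contextual information and outputs, for each word, a score probability for each label. All outputs of the BiLSTM layer are then used as the input of the CRF layer, which learns the order dependencies between labels to produce the final prediction. By learning the transition probabilities between labels in the second data set, the CRF layer corrects the output of the BiLSTM layer, which ensures that the predicted labels are plausible and thus improves the accuracy of the extracted health labels.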
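The way a CRF layer can correct per-position scores is commonly realized at prediction time with Viterbi decoding over emission and transition scores. The sketch below illustrates that general technique and is not taken from the application: the score values are made up, and the function name and the dictionary-based representation are assumptions.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the tag sequence with the highest total score.

    `emissions[i][tag]` is the per-position score (the BiLSTM output in
    the text); `transitions[(prev, cur)]` is the score of moving from
    `prev` to `cur`, with missing pairs treated as 0.0.
    """
    best = [{tag: emissions[0][tag] for tag in tags}]
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        row: Dict[str, float] = {}
        pointer: Dict[str, str] = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            row[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + scores[cur]
            pointer[cur] = prev
        best.append(row)
        backpointers.append(pointer)
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointer in reversed(backpointers):
        path.append(pointer[path[-1]])
    return path[::-1]


# Made-up scores: taken position by position, the best tags would be S, E, E,
# but a strong penalty on the E -> E transition makes the decoder prefer S, M, E.
TAGS = ["O", "S", "M", "E"]
EMISSIONS = [
    {"O": 1.0, "S": 2.0, "M": 0.0, "E": 0.0},
    {"O": 0.0, "S": 0.0, "M": 0.9, "E": 1.0},
    {"O": 0.0, "S": 0.0, "M": 0.0, "E": 2.0},
]
TRANSITIONS = {("E", "E"): -10.0}
decoded = viterbi_decode(EMISSIONS, TRANSITIONS, TAGS)
```

The example shows the correction described in the text: an implausible label order is replaced by a consistent start, middle, end sequence.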
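This application also discloses an apparatus for constructing a user portrait based on a medical inquiry session, including:
an information acquisition module, used to acquire chief complaint information input by a user, where the chief complaint information is a description of the user's condition;
a feature extraction module, used to perform feature extraction on the chief complaint information to obtain a first feature vector matrix;
a prediction module, used to input the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and a corresponding condition;
an inquiry module, used to present the inquiry question to the user to obtain inquiry information input by the user;
a portrait construction module, used to construct a user portrait from the inquiry information.
The specific implementation steps of the apparatus are the same as those of the method for constructing a user portrait based on a medical inquiry session in the above embodiments and are not repeated here.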
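This application also discloses an electronic device, including at least one memory, at least one processor, and at least one program, where the program is stored in the memory and the processor executes the at least one program to implement a method for constructing a user portrait based on a medical inquiry session, the method including: acquiring chief complaint information input by a user, where the chief complaint information is a description of the user's condition; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and a corresponding condition; presenting the inquiry question to the user to obtain inquiry information input by the user; and constructing a user portrait from the inquiry information. The electronic device can be any intelligent terminal, including a mobile phone, a tablet computer, a personal digital assistant (PDA), or a vehicle-mounted computer.
Referring to FIG. 9, which illustrates the hardware structure of an electronic device in one embodiment, the electronic device includes:
a processor, which can be implemented as a general-purpose CPU (Central Processing Unit), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits, and which executes the relevant programs to realize the technical solutions provided by the embodiments of the present disclosure;
a memory, which can be implemented as ROM (Read Only Memory), a static storage device, a dynamic storage device, or RAM (Random Access Memory). The memory can store the operating system and other application programs; when the technical solutions provided by the embodiments of this specification are implemented in software or firmware, the relevant program code is stored in the memory and called by the processor to execute the method for constructing a user portrait based on a medical inquiry session of the embodiments of the present disclosure;
an input/output interface, used to realize information input and output;
a communication interface, used to realize communication between this device and other devices, either by wired means (such as USB or a network cable) or by wireless means (such as a mobile network, Wi-Fi, or Bluetooth);
a bus, which transfers information between the components of the device (such as the processor, the memory, the input/output interface, and the communication interface);
where the processor, the memory, the input/output interface, and the communication interface are communicatively connected to each other within the device through the bus.
This application also discloses a storage medium, which is a computer-readable storage medium storing computer-executable instructions that cause a computer to execute a method for constructing a user portrait based on a medical inquiry session, the method including: acquiring chief complaint information input by a user, where the chief complaint information is a description of the user's condition; performing feature extraction on the chief complaint information to obtain a first feature vector matrix; inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, where the prediction network model is trained on a first data set, the first data set includes multiple medical inquiry samples, and each medical inquiry sample includes an inquiry question and a corresponding condition; presenting the inquiry question to the user to obtain inquiry information input by the user; and constructing a user portrait from the inquiry information.
The computer-readable storage medium can be non-volatile or volatile. As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs. In addition, the memory can include high-speed random access memory and can also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory optionally includes memory located remotely from the processor, and such remote memory can be connected to the processor via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.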
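The embodiments described in the present disclosure are intended to explain its technical solutions more clearly and do not limit them. Those skilled in the art will appreciate that, as technology evolves and new application scenarios emerge, the technical solutions provided by the embodiments of the present disclosure are equally applicable to similar technical problems.
Those skilled in the art will understand that the technical solutions shown in the figures do not limit the embodiments of the present disclosure, which may include more or fewer steps than shown, combine certain steps, or use different steps.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate; they may be located in one place or distributed over multiple network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units in the systems and devices, can be implemented as software, firmware, hardware, or suitable combinations thereof.
The terms "first", "second", "third", "fourth", and the like (if present) in the description of this application and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of this application described here can be implemented in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that in this application "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes the relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following" or similar expressions refer to any combination of the listed items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can each be single or multiple.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and methods can be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed can be indirect coupling or communication connection through some interfaces, apparatuses, or units, and can be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application can be integrated into one processing unit, each unit can exist physically on its own, or two or more units can be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. On this understanding, the technical solution of this application, in essence or in the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes multiple instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store programs.
The embodiments of this application have been described in detail above with reference to the drawings, but this application is not limited to these embodiments; within the knowledge of a person of ordinary skill in the art, various changes can be made without departing from the purpose of this application. In addition, where no conflict arises, the embodiments of this application and the features in the embodiments can be combined with one another.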

Claims (20)

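  1. A method for constructing a user portrait based on a medical inquiry session, comprising:
    acquiring chief complaint information input by a user, wherein the chief complaint information is a description of the user's condition;
    performing feature extraction on the chief complaint information to obtain a first feature vector matrix;
    inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, wherein the prediction network model is trained on a first data set, the first data set comprises multiple medical inquiry samples, and each medical inquiry sample comprises an inquiry question and a corresponding condition;
    presenting the inquiry question to the user to obtain inquiry information input by the user;
    constructing a user portrait from the inquiry information.
  2. The method according to claim 1, wherein performing feature extraction on the chief complaint information to obtain the first feature vector matrix comprises:
    performing natural language preprocessing on the chief complaint information to obtain multiple first tokens;
    inputting the multiple first tokens into a pre-trained word vector model to obtain multiple first word vectors;
    combining the multiple first word vectors to obtain the first feature vector matrix.
  3. The method according to claim 2, wherein performing natural language preprocessing on the chief complaint information to obtain multiple first tokens comprises:
    converting the chief complaint information from traditional to simplified Chinese to obtain simplified information;
    segmenting the simplified information into words to obtain pre-segmentation information;
    removing stop words from the pre-segmentation information to obtain segmentation information;
    normalizing synonyms in the segmentation information to obtain the multiple first tokens.
  4. The method according to claim 1, wherein inputting the first feature vector matrix into the prediction network model to obtain the inquiry question matching the chief complaint information comprises:
    inputting the first feature vector matrix into a bidirectional recurrent neural network for feature fusion to obtain a first fused feature vector matrix;
    inputting the first fused feature vector matrix into a highway network for deep processing to obtain a first deep feature vector matrix;
    performing feature extraction on the first deep feature vector matrix with a convolutional neural network to obtain a first vector;
    obtaining the inquiry question matching the chief complaint information from the first vector.
  5. The method according to claim 1, wherein presenting the inquiry question to the user to obtain the inquiry information input by the user comprises:
    presenting the inquiry question and structured answer options to the user;
    obtaining the inquiry information input by the user from the answer option input by the user.
  6. The method according to any one of claims 1 to 5, further comprising:
    acquiring session information from a manual consultation with the user;
    performing feature extraction on the session information to obtain a second feature vector matrix;
    inputting the second feature vector matrix into a label extraction network model to obtain health labels of the user, wherein the label extraction network model is trained on a second data set, and the second data set comprises multiple pieces of session information and the health labels corresponding to them;
    updating the user portrait according to the health labels.
  7. The method according to claim 6, wherein inputting the second feature vector matrix into the label extraction network model to obtain the health labels of the user comprises:
    inputting the second feature vector matrix into a bidirectional long short-term memory network for part-of-speech tagging to obtain label score probabilities;
    correcting the label order of the label score probabilities with a conditional random field to obtain the health labels of the user.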
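  8. An apparatus for constructing a user portrait based on a medical inquiry session, comprising:
    an information acquisition module, used to acquire chief complaint information input by a user, wherein the chief complaint information is a description of the user's condition;
    a feature extraction module, used to perform feature extraction on the chief complaint information to obtain a first feature vector matrix;
    a prediction module, used to input the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, wherein the prediction network model is trained on a first data set, the first data set comprises multiple medical inquiry samples, and each medical inquiry sample comprises an inquiry question and a corresponding condition;
    an inquiry module, used to present the inquiry question to the user to obtain inquiry information input by the user;
    a portrait construction module, used to construct a user portrait from the inquiry information.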
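  9. An electronic device, comprising:
    at least one memory;
    at least one processor;
    at least one program;
    wherein the program is stored in the memory, and the processor executes the at least one program to implement a method for constructing a user portrait based on a medical inquiry session,
    wherein the method for constructing a user portrait based on a medical inquiry session comprises:
    acquiring chief complaint information input by a user, wherein the chief complaint information is a description of the user's condition;
    performing feature extraction on the chief complaint information to obtain a first feature vector matrix;
    inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, wherein the prediction network model is trained on a first data set, the first data set comprises multiple medical inquiry samples, and each medical inquiry sample comprises an inquiry question and a corresponding condition;
    presenting the inquiry question to the user to obtain inquiry information input by the user;
    constructing a user portrait from the inquiry information.
  10. The electronic device according to claim 9, wherein performing feature extraction on the chief complaint information to obtain the first feature vector matrix comprises:
    performing natural language preprocessing on the chief complaint information to obtain multiple first tokens;
    inputting the multiple first tokens into a pre-trained word vector model to obtain multiple first word vectors;
    combining the multiple first word vectors to obtain the first feature vector matrix.
  11. The electronic device according to claim 10, wherein performing natural language preprocessing on the chief complaint information to obtain multiple first tokens comprises:
    converting the chief complaint information from traditional to simplified Chinese to obtain simplified information;
    segmenting the simplified information into words to obtain pre-segmentation information;
    removing stop words from the pre-segmentation information to obtain segmentation information;
    normalizing synonyms in the segmentation information to obtain the multiple first tokens.
  12. The electronic device according to claim 9, wherein inputting the first feature vector matrix into the prediction network model to obtain the inquiry question matching the chief complaint information comprises:
    inputting the first feature vector matrix into a bidirectional recurrent neural network for feature fusion to obtain a first fused feature vector matrix;
    inputting the first fused feature vector matrix into a highway network for deep processing to obtain a first deep feature vector matrix;
    performing feature extraction on the first deep feature vector matrix with a convolutional neural network to obtain a first vector;
    obtaining the inquiry question matching the chief complaint information from the first vector.
  13. The electronic device according to claim 9, wherein presenting the inquiry question to the user to obtain the inquiry information input by the user comprises:
    presenting the inquiry question and structured answer options to the user;
    obtaining the inquiry information input by the user from the answer option input by the user.
  14. The electronic device according to any one of claims 9 to 13, wherein the method further comprises:
    acquiring session information from a manual consultation with the user;
    performing feature extraction on the session information to obtain a second feature vector matrix;
    inputting the second feature vector matrix into a label extraction network model to obtain health labels of the user, wherein the label extraction network model is trained on a second data set, and the second data set comprises multiple pieces of session information and the health labels corresponding to them;
    updating the user portrait according to the health labels.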
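  15. A storage medium, the storage medium being a computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions that cause a computer to execute a method for constructing a user portrait based on a medical inquiry session,
    wherein the method for constructing a user portrait based on a medical inquiry session comprises:
    acquiring chief complaint information input by a user, wherein the chief complaint information is a description of the user's condition;
    performing feature extraction on the chief complaint information to obtain a first feature vector matrix;
    inputting the first feature vector matrix into a prediction network model to obtain an inquiry question matching the chief complaint information, wherein the prediction network model is trained on a first data set, the first data set comprises multiple medical inquiry samples, and each medical inquiry sample comprises an inquiry question and a corresponding condition;
    presenting the inquiry question to the user to obtain inquiry information input by the user;
    constructing a user portrait from the inquiry information.
  16. The storage medium according to claim 15, wherein performing feature extraction on the chief complaint information to obtain the first feature vector matrix comprises:
    performing natural language preprocessing on the chief complaint information to obtain multiple first tokens;
    inputting the multiple first tokens into a pre-trained word vector model to obtain multiple first word vectors;
    combining the multiple first word vectors to obtain the first feature vector matrix.
  17. The storage medium according to claim 16, wherein performing natural language preprocessing on the chief complaint information to obtain multiple first tokens comprises:
    converting the chief complaint information from traditional to simplified Chinese to obtain simplified information;
    segmenting the simplified information into words to obtain pre-segmentation information;
    removing stop words from the pre-segmentation information to obtain segmentation information;
    normalizing synonyms in the segmentation information to obtain the multiple first tokens.
  18. The storage medium according to claim 15, wherein inputting the first feature vector matrix into the prediction network model to obtain the inquiry question matching the chief complaint information comprises:
    inputting the first feature vector matrix into a bidirectional recurrent neural network for feature fusion to obtain a first fused feature vector matrix;
    inputting the first fused feature vector matrix into a highway network for deep processing to obtain a first deep feature vector matrix;
    performing feature extraction on the first deep feature vector matrix with a convolutional neural network to obtain a first vector;
    obtaining the inquiry question matching the chief complaint information from the first vector.
  19. The storage medium according to claim 15, wherein presenting the inquiry question to the user to obtain the inquiry information input by the user comprises:
    presenting the inquiry question and structured answer options to the user;
    obtaining the inquiry information input by the user from the answer option input by the user.
  20. The storage medium according to any one of claims 15 to 19, wherein the method further comprises:
    acquiring session information from a manual consultation with the user;
    performing feature extraction on the session information to obtain a second feature vector matrix;
    inputting the second feature vector matrix into a label extraction network model to obtain health labels of the user, wherein the label extraction network model is trained on a second data set, and the second data set comprises multiple pieces of session information and the health labels corresponding to them;
    updating the user portrait according to the health labels.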
PCT/CN2022/087528 2021-08-30 2022-04-19 Method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session WO2023029502A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111005960.3 2021-08-30
CN202111005960.3A CN113724882B (zh) 2021-08-30 2021-08-30 Method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session

Publications (1)

Publication Number Publication Date
WO2023029502A1 true WO2023029502A1 (zh) 2023-03-09

Family

ID=78679296

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/087528 WO2023029502A1 (zh) 2021-08-30 2022-04-19 Method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session

Country Status (2)

Country Link
CN (1) CN113724882B (zh)
WO (1) WO2023029502A1 (zh)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
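CN116521822A (zh) * 2023-03-15 2023-08-01 上海帜讯信息技术股份有限公司 User intention recognition method and apparatus based on a 5G message multi-round session mechanism
CN117854713A (zh) * 2024-03-06 2024-04-09 之江实验室 Method for training a traditional Chinese medicine syndrome diagnosis model, and information recommendation method
CN117874633A (zh) * 2024-03-13 2024-04-12 金祺创(北京)技术有限公司 Method and apparatus for generating network data asset portraits based on a deep learning algorithm
CN118051879A (zh) * 2024-04-16 2024-05-17 杭州小策科技有限公司 Crowd portrait analysis method and system for massive data
CN118132736A (zh) * 2024-05-08 2024-06-04 青岛国创智能家电研究院有限公司 Training method, control apparatus, and storage medium for a user portrait recognition system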

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
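CN113724882B (zh) * 2021-08-30 2024-07-12 康键信息技术(深圳)有限公司 Method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session
CN114048283A (zh) * 2022-01-11 2022-02-15 北京仁科互动网络技术有限公司 User portrait generation method and apparatus, electronic device, and storage medium
CN115631852B (zh) * 2022-11-02 2024-04-09 北京大学重庆大数据研究院 Syndrome type recommendation method and apparatus, electronic device, and non-volatile storage medium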

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170103324A1 (en) * 2015-10-13 2017-04-13 Facebook, Inc. Generating responses using memory networks
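CN108922608A (zh) * 2018-06-13 2018-11-30 平安医疗科技有限公司 Intelligent triage guidance method and apparatus, computer device, and storage medium
CN109192300A (zh) * 2018-08-17 2019-01-11 百度在线网络技术(北京)有限公司 Intelligent medical inquiry method and system, computer device, and storage medium
CN111326251A (zh) * 2020-02-13 2020-06-23 北京百度网讯科技有限公司 Inquiry question output method and apparatus, and electronic device
CN113724882A (zh) * 2021-08-30 2021-11-30 康键信息技术(深圳)有限公司 Method, apparatus, device, and medium for constructing a user portrait based on a medical inquiry session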

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
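CN110019793A (zh) * 2017-10-27 2019-07-16 阿里巴巴集团控股有限公司 Text semantic encoding method and apparatus
CN109545394B (zh) * 2018-11-21 2021-08-17 上海依智医疗技术有限公司 Medical inquiry method and apparatus
CN111274365B (zh) * 2020-02-25 2023-09-19 广州七乐康药业连锁有限公司 Intelligent medical inquiry method and apparatus based on semantic understanding, storage medium, and server
CN112084783B (zh) * 2020-09-24 2022-04-12 中国民航大学 Entity recognition method and system based on uncivilized civil aviation passengers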

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
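CN116521822A (zh) * 2023-03-15 2023-08-01 上海帜讯信息技术股份有限公司 User intention recognition method and apparatus based on a 5G message multi-round session mechanism
CN116521822B (zh) * 2023-03-15 2024-02-13 上海帜讯信息技术股份有限公司 User intention recognition method and apparatus based on a 5G message multi-round session mechanism
CN117854713A (zh) * 2024-03-06 2024-04-09 之江实验室 Method for training a traditional Chinese medicine syndrome diagnosis model, and information recommendation method
CN117854713B (zh) * 2024-03-06 2024-06-04 之江实验室 Method for training a traditional Chinese medicine syndrome diagnosis model, and information recommendation method
CN117874633A (zh) * 2024-03-13 2024-04-12 金祺创(北京)技术有限公司 Method and apparatus for generating network data asset portraits based on a deep learning algorithm
CN117874633B (zh) * 2024-03-13 2024-05-28 金祺创(北京)技术有限公司 Method and apparatus for generating network data asset portraits based on a deep learning algorithm
CN118051879A (zh) * 2024-04-16 2024-05-17 杭州小策科技有限公司 Crowd portrait analysis method and system for massive data
CN118051879B (zh) * 2024-04-16 2024-06-11 杭州小策科技有限公司 Crowd portrait analysis method and system for massive data
CN118132736A (zh) * 2024-05-08 2024-06-04 青岛国创智能家电研究院有限公司 Training method, control apparatus, and storage medium for a user portrait recognition system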

Also Published As

Publication number Publication date
CN113724882B (zh) 2024-07-12
CN113724882A (zh) 2021-11-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22862648

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19/06/2024)