CN115934967B - Commodity recommendation method and system based on combination of twin Transformer model and knowledge graph


Info

Publication number
CN115934967B
Authority
CN
China
Prior art keywords
user
recommendation
commodity
medicine
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310011140.8A
Other languages
Chinese (zh)
Other versions
CN115934967A (en
Inventor
肖慧彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lingchuang Beijing Technology Co ltd
Original Assignee
Lingchuang Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lingchuang Beijing Technology Co ltd
Priority to CN202310011140.8A
Publication of CN115934967A
Application granted
Publication of CN115934967B
Legal status: Active
Anticipated expiration

Landscapes

  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a commodity recommendation method, system and storage medium based on the combination of a twin Transformer model and a knowledge graph. The method comprises the following steps: analyzing the user's symptom description to obtain the user's purchase intention; analyzing the user's commodity browsing actions to obtain the user's shopping preferences; performing named entity recognition, semantic recognition and relationship recognition on existing medical knowledge text to obtain a medical knowledge graph; performing latent knowledge discovery on the medical knowledge graph with the twin Transformer model, where knowledge discovery means discovering the substitutability between two drug commodities or the correlation between two symptoms; and recommending drug commodities with high accuracy according to the constructed medical knowledge graph and the acquired user behavior, and feeding the user's behavior back into the recommendation result. The disclosed technical scheme can improve the accuracy of online retail recommendation of drug commodities and make it more convenient and timely for users to find and purchase suitable drugs.

Description

Commodity recommendation method and system based on combination of twin Transformer model and knowledge graph
Technical Field
The invention relates to the application of deep-learning-based recommendation algorithms in the technical field of computer networks, and in particular to a commodity recommendation method, a commodity recommendation system and a non-transitory computer-readable storage medium based on the combination of a twin Transformer model and a knowledge graph.
Background
With the gradual and tight integration of the internet and life, online retailing is becoming increasingly popular in people's daily lives. In competition of each online retail platform, the ability to accurately recommend the required commodity to the user gradually becomes a standard for measuring whether the platform is good or not, so that roles played by online retail recommendation are increasingly important, and the value brought to enterprises is also increased.
However, the algorithms currently applied to online retail recommendation are mainly oriented to commodities such as fast-moving consumer goods, daily necessities, clothing and electronic products. The characteristics of these commodities are usually strongly related to the user's points of interest, so recommendation algorithms focus on mining user interests, and most of them follow a "guess what the user likes" modeling approach.
In retail recommendation of medical and health products, however, existing recommendation algorithms often fail to mine the user's real demands and intentions in time. Medical and health products differ from general commodities: users usually have a shopping demand only when they are ill or unwell. When a user enters the medicine purchasing process, the demand is already clear, and a demand for medicine treating symptoms the user has not had before has little connection with the purchase history. Algorithms that derive points of interest from historical behavior therefore recommend goods that have little correlation with the user's current demand. They bring little benefit in practice: most users show no obvious repurchase behavior, so drugs recommended from points of interest mined from historical shopping records are not helpful.
In an actual retail recommendation scenario for medical and health products, the algorithm must be able to mine the user's demand in real time. After the user enters the medicine purchasing flow, the algorithm should fully mine and use every action the user generates and predict the user's real demand from those actions.
Disclosure of Invention
To solve the above problems, the invention provides a recommendation method and a recommendation system for medical commodities. Specifically, a knowledge graph is applied to the recommendation system, and the professional knowledge carried by a self-trained medical knowledge graph compensates for the weak directivity of user behavior, so that the method and system can mine user demand in real time even when little user behavior data is available and provide high-quality commodity recommendations.
The recommendation method comprises the following steps: analyzing the symptom description that the user enters in the search box to obtain symptom labels usable for retrieval; collecting the user's commodity browsing actions to obtain user preference data usable for building a user portrait; learning from existing medical knowledge text on the network to obtain the correspondence between drugs and symptoms, forming a medical knowledge graph that can be used to query existing relations and discover latent relations; recommending suitable drug commodities to the user according to the symptom labels, the user preference data and the knowledge graph; and adjusting the recommendation method according to the user's browsing actions on the recommended drug commodities.
Preferably, the process of parsing the user's symptom description includes: performing semantic segmentation on the natural-language string that the user enters to describe symptoms, obtaining symptom description keywords usable for retrieval and applicable-age data.
Preferably, the process of parsing the user's symptom description includes: normalizing the natural-language string entered by the user, where normalization comprises converting traditional Chinese characters to simplified Chinese characters, converting homophones and correcting wrongly written characters.
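The normalization step can be illustrated with a minimal Python sketch. The lookup tables and the function name `normalize_query` are hypothetical and only hold a few sample entries; a real system would presumably load complete conversion and correction dictionaries.

```python
# Hypothetical lookup tables; a real system would load full dictionaries.
TRADITIONAL_TO_SIMPLIFIED = {"發": "发", "燒": "烧", "頭": "头"}
# Homophone slip: 稍 and 烧 are both read "shao".
PHRASE_CORRECTIONS = {"发稍": "发烧"}


def normalize_query(text):
    """Convert traditional characters, then fix known homophone/typo phrases."""
    simplified = "".join(TRADITIONAL_TO_SIMPLIFIED.get(ch, ch) for ch in text)
    for wrong, right in PHRASE_CORRECTIONS.items():
        simplified = simplified.replace(wrong, right)
    return simplified
```

Character conversion runs first so that the phrase corrections only need to be written in simplified form.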
Preferably, the user behavior analysis is to collect data of clicking, browsing and residence time of the user, drawing user portraits according to the collected user data, and finally giving different recommendation labels to the user according to different user portraits.
Preferably, the process of acquiring user preference data by collecting the user's browsing of recommended or initial commodities comprises: collecting the user's click, browsing and dwell-time data; collecting data on drugs that the user marks as uninteresting or blocks through long-press feedback; and collecting data on the applicable population of the drug commodities the user browses.
Preferably, the collection and analysis of the user behavior after the user enters the purchasing process can be used for predicting the current requirement of the user and rearranging the commodity display sequence of the commodity page.
Preferably, the construction process of the medical knowledge graph comprises the following steps: identifying drug named entities, which includes standardizing the drug entities; identifying symptom entities, which includes identifying associations between symptom entities; and identifying the correspondence between drugs and symptoms, which includes abstracting relations from existing medical knowledge text and performing knowledge discovery according to the relations between symptom entities in the medical knowledge graph. Preferably, the combination of the twin Transformer learning module and the knowledge graph can be used to explore latent entity relationships; this stage is called the knowledge discovery stage.
Preferably, when analyzing user behaviors, corresponding weights can be adjusted for different user tags, so that accurate portraits and recommendations of users are realized.
Preferably, suitable drug commodities are recommended to the user based on: the symptom description keywords and applicable-age data obtained by parsing the user's symptom description; the user preference data; and the entities of the medical knowledge graph and the relations between them.
Preferably, the recommendation method is optimized according to the browsing action of the user on the recommended medicine commodity, and the optimization method comprises the following steps: list sorting is carried out on the recommended commodities, and the list sorting order of the recommended commodities can be adjusted according to the browsing action of the user; managing life cycle of recommended goods and corresponding labels thereof, and controlling the duration or termination of the life cycle according to browsing actions of users; and feeding back the recommendation method according to the browsing action of the user, and determining whether a recommendation model used by the recommendation method needs to be updated or not.
Preferably, when the recommendation module is maintained, different rules are formulated for different recommendation sections, or life cycle management is added for the various commodities. This prevents meaningless or expired recommendations that have not been browsed for a long time from being recommended repeatedly, avoids wasted back-end computation, and improves the maintainability of each label.
Preferably, when the data tables output by the twin Transformer module are related to each other, knowledge embedding can be performed with the graph embedding module alone, so that the output data of the twin Transformer module is represented visually and serves as one of the inputs of the downstream recommendation module.
Preferably, the recommendation module is trained with an adversarial approach, so that the number of times confusable and non-confusable samples are learned is adjusted dynamically and the model stays balanced.
Preferably, in the recommendation method, a method of sorting verification is adopted for optimizing a plurality of medicine commodities serving as recommendation results, namely, commodities with front sorting are regarded as being clicked and browsed with high probability, and if the commodities are not actually clicked and browsed, the positions of the commodities in a sorting queue are adjusted backwards, so that real-time algorithm optimization is achieved.
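The sorting verification idea can be illustrated with a minimal Python sketch. The function name `demote_unclicked` and the parameters `top_k` and `step` are hypothetical and not taken from the invention; the sketch only assumes that items near the head of the list that were shown but not clicked are moved a fixed number of positions back.

```python
def demote_unclicked(ranking, clicked, top_k=3, step=2):
    """Move top-k items that were shown but not clicked `step` places back."""
    result = list(ranking)
    # Walk the head of the original list from the back so that earlier
    # moves do not disturb the items still waiting to be checked.
    for idx in range(min(top_k, len(ranking)) - 1, -1, -1):
        item = ranking[idx]
        if item not in clicked:
            pos = result.index(item)
            result.pop(pos)
            result.insert(min(pos + step, len(result)), item)
    return result
```

A production system would more likely adjust scores than positions; fixed position shifts are used here only to keep the example short.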
Preferably, before commodities are finally recommended, an ordered recommendation queue is maintained for each user. The scores of the commodities in the queue, which change in real time with the user's operations, are updated incrementally, and the commodities in the queue are deduplicated.
Preferably, when commodities are actually recommended to the user, the user's clicks on commodities instrumented with tracking points are also used to decide whether the recommendation algorithm that produced the current result needs to be evolved or eliminated.
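A per-user recommendation queue with incremental score updates and deduplication might look like the following Python sketch. The class name `RecommendationQueue` and its methods are hypothetical; keeping one score per commodity id in a dictionary is an assumption that makes deduplication automatic.

```python
class RecommendationQueue:
    """Ordered, duplicate-free queue of commodities for one user."""

    def __init__(self):
        self._scores = {}  # commodity id -> accumulated score

    def update(self, item_id, delta):
        # Incremental update: add the change instead of recomputing the score.
        self._scores[item_id] = self._scores.get(item_id, 0.0) + delta

    def ordered(self):
        # Highest score first; ties are broken by id for a stable order.
        return sorted(self._scores, key=lambda k: (-self._scores[k], k))
```

For example, two updates to the same commodity accumulate into a single entry, so the commodity appears only once in `ordered()`.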
Compared with the prior art, the scheme of the invention has the following beneficial effects:
Through the above technical scheme, when recommending for a user's medicine purchasing needs, the recommendation system can mine the user's demand in real time even with little user behavior data and provide high-quality commodity recommendations. Compared with the offline computation adopted by traditional mainstream recommendation algorithms, the technical scheme provided by the invention has a clear advantage in real-time computation and analysis of user demand.
When a user enters a medicine purchasing program, firstly, recording the names or uncomfortable symptoms of medicines searched in a search box by the user, recording data generated in the step as user behavior data of the user, extracting keywords from the data, and then transmitting the keywords into an intelligent recommendation method for keyword searching and analysis. In the analysis process of the intelligent recommendation method, first, a vocabulary for describing symptoms or a direct medicine name is extracted from a symptom description inputted by a user using a natural language processing (Natural Language Processing, NLP) method. Due to the specificity of medicine use, before the recommended result or the search result is output, the invention adds a checking step of medicine indication and user complaint symptom description at the same time, thereby achieving the purposes of reducing the recommended error rate and recommending medical health care commodity accurately and efficiently.
Drawings
FIG. 1 is a schematic diagram of a recommendation system according to the present invention;
FIG. 2 is a schematic diagram of a commodity recommendation method according to the present invention;
FIG. 3 is a schematic diagram of a twin Transformer model;
FIG. 4 is a schematic diagram of the construction process of Transformer-KG;
FIG. 5 is a diagram of an example of the relationships between knowledge graph entities obtained by medical text learning.
Detailed Description
The invention is elucidated below on the basis of embodiments shown in the drawings. The presently disclosed embodiments are considered in all respects to be illustrative and not restrictive. The scope of the present invention is not limited by the following description of the embodiments, but is only indicated by the scope of the claims, and includes all modifications having the same meaning and within the scope of the claims.
In order to solve the technical problems, the invention provides a recommendation method and a recommendation system for recommending medical and health care products. The recommendation method and the recommendation system can perform semantic segmentation on the input of the user in the search box, and the result of the semantic segmentation can be used for analyzing the purchase demand of the user and can also be used for portraying the user. Secondly, the method and the system can also perform autonomous learning according to medical knowledge recorded in natural language, and generate a knowledge graph for optimizing the medicine recommendation process. The natural language is a language which is used for communication and description in life, such as daily dialogue, scientific papers, and the like. Finally, the method and the system can analyze the follow-up browsing action and feedback of the recommendation result according to the user, so that the recommendation method and the system are optimized according to the user experience.
To recommend accurately for different groups of people, customers' purchasing habits must first be analyzed, which requires establishing labels for users. For example, an elderly person with a chronic disease can periodically purchase treatment or maintenance drugs for that disease with a chronic disease prescription, and a family caring for an infant frequently purchases infant-related medicines. "Elderly person with a chronic disease" and "infant" are then labels that can be used to classify users.
[ user input resolution Module ]
Unlike shoppers on online platforms for fast-moving consumer goods, daily necessities, clothing and electronic products, most users buying medicine online do not know which drug suits them best. The disclosed embodiments therefore let users enter natural language describing their symptoms in the search box, such as "fever 38 ℃, slight diarrhea, muscle soreness"; when several symptom descriptions are entered together, they need not be separated by punctuation. To obtain more accurate medication information, the user can also add the patient's age to the symptom description, for example by entering "one year old child, fever 38 ℃ and slight diarrhea" in the search box.
The symptom description sentences filled in the search box by the user can be used as the input of the user input analysis module, and semantic segmentation is carried out through the BERT model. At this time, the user input parsing module may parse the "one year old" age tag, the "child" applicable crowd tag, the "fever" and the "slight diarrhea" symptom descriptive entity from the user input in the search box. The various labels obtained by the user input analysis module can be used as the input of the user portrait module so as to carry out user portrait.
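The kind of output this parsing step produces can be sketched in Python. The text above uses a BERT model for semantic segmentation; the sketch below replaces it with simple keyword lexicons purely for illustration, and the lexicons, the regular expression and the function name `parse_user_input` are all hypothetical.

```python
import re

# Hypothetical lexicons standing in for the learned segmentation model.
CROWD_TERMS = ("child", "infant", "elderly")
SYMPTOM_TERMS = ("slight diarrhea", "muscle soreness", "fever", "diarrhea")


def parse_user_input(text):
    """Extract an age tag, applicable-crowd tags and symptom entities."""
    lowered = text.lower()
    tags = {"age": None, "crowd": [], "symptoms": []}
    age = re.search(r"\b(\w+)[ -]year[ -]old\b", lowered)
    if age:
        tags["age"] = age.group(0)
    tags["crowd"] = [t for t in CROWD_TERMS if re.search(rf"\b{t}\b", lowered)]
    remaining = lowered
    # Match longer symptom phrases first so "slight diarrhea" wins over "diarrhea".
    for term in sorted(SYMPTOM_TERMS, key=len, reverse=True):
        if term in remaining:
            tags["symptoms"].append(term)
            remaining = remaining.replace(term, " ")
    return tags
```

The returned dictionary plays the role of the label set that is passed on to the user portrait module.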
[ user portrayal Module ]
The input to the user portrayal module includes output labels of the user input parsing module, such as an age label of "one year" and an applicable crowd label of "child". Meanwhile, due to the correlation between the two labels, a mapping table can be directly established, and the two labels are all pointed to the applicable crowd labels of 'infants', so that redundant labels are reduced.
In addition, the user portrayal module may also build user tags by analyzing user behavior. For example, if the user periodically purchases a lot of medicines for maintaining or treating chronic diseases, it can be presumed that the user may be a chronic disease patient or a guardian of the chronic disease patient, and at this time, a "chronic disease medicine" label can be newly added for the user.
While building the user portrait, the user portrait module can also manage the user's label data. In actual use, users may receive different kinds of labels by purchasing drugs for different applicable populations or different symptoms. These labels, added according to the user's purchasing habits, can be input into the twin Transformer deep neural network module for further user classification. The user portrait module can add new user labels: if an administrator sets user classification labels (such as "actual reserves") describing users' purchasing habits for certain types of drugs, the module can also run ad hoc queries to quickly delimit the target users for the next round of drug recommendation. The administrator can also export a user label analysis report from the back end of the user portrait module to understand the group characteristics and purchasing tendencies of people who buy medicine online, and so configure and schedule commodities more accurately and reasonably.
[ medical Label Module ]
In learning medical knowledge, symptoms of a disease such as fever, diarrhea, muscular soreness, and the like can be expressed as a symptom entity. In classifying medicines, nouns such as antipyretic medicines, antidiarrheal medicines, health care medicines, etc. may also be used as upper labels of medicine entities.
In extracting the superior label summarizing the drug action from a priori knowledge describing the drug action, the action and symptom information of the drug entity may be extracted from, for example, a drug specification. The label summarizing the action of the medicine entity is an upper label of the medicine entity, and after the upper label summarizing various medicine entities is constructed, the medicine entity needs to be added under the upper label. And the symptom information corresponding to the extracted medicine is a symptom entity, and the label for summarizing the symptom entity is an upper label of the symptom entity.
[ user behavior analysis Module ]
The user behavior analysis module analyzes the user's browsing actions on the commodities recommended on the recommendation page. The user population can be roughly classified by querying which drug category most of the drugs a user visits belong to. Specifically, the recommendation page is divided into sections such as "hot sales", "buy again" and "guess you like", and the user's interest in a commodity is evaluated by collecting the user's click actions and browsing time. Preliminary feedback on the recommendation effect can be collected through long-press direct feedback, for example by selecting "not interested", "block this brand" or "block this drug" after a long press. The back end can then compute the recall and accuracy of the recommendation and modify the recommendation weights of the various suitable commodities for the user. Different weights and different rules can be set for different recommendation sections, or life cycle management can be added for the various commodities. This prevents meaningless or expired recommendations that have not been browsed for a long time from remaining in a recommended position, avoids wasted back-end computation, and improves the maintainability of each label.
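The recall and accuracy figures computed by the back end can be sketched as follows in Python. The sketch assumes that "accuracy" means precision over the recommended list and that "relevant" commodities are those the user clicked or bought; the function name `precision_recall` is hypothetical.

```python
def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against user feedback."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Low precision for a section would be one possible signal for lowering that section's recommendation weights.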
[ twin Transformer deep neural network Module ]
As shown in fig. 3, the twin neural network comprises at least two parallel branches with identical architecture, for example two convolutional neural networks. The parallel architecture lets the network learn similarity, which can be used in place of direct classification. This approach has commonly been used for image data, for example in facial recognition. During training, the branches share the same initial weights and weight updates and have the same hyperparameters, so they remain strictly consistent. This consistency lets the module compare the inputs the branches receive. During training and inference, a single neural network branch (such as a convolutional neural network (CNN) branch) processes and embeds the features, and the embedding is compared with the clusters formed by the results of the other branches. This yields a list of similarity scores that measures the difference between the input and an expected template, and the input is finally classified.
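The weight sharing that defines a twin network can be shown with a minimal Python sketch. The single tanh layer, the cosine similarity and the names `encode` and `siamese_similarity` are assumptions chosen for brevity; they stand in for the CNN or Transformer branches described above.

```python
import math


def encode(x, weights):
    """One shared linear layer followed by tanh; both branches use the same weights."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def cosine(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm else 0.0


def siamese_similarity(x1, x2, weights):
    # The same `weights` object is applied to both inputs, so the two
    # embeddings live in the same space and can be compared directly.
    return cosine(encode(x1, weights), encode(x2, weights))
```

Because both inputs pass through identical parameters, the similarity score is symmetric in its two arguments.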
The Transformer model uses an attention mechanism to speed up model training and plays an important role in semantic segmentation in the field of natural language processing (NLP). Its advantage is that it adds a multi-head attention mechanism on top of a multi-layer neural network. Compared with common sequence models such as the long short-term memory network (LSTM) and the gated recurrent unit (GRU), which strengthen the memory capacity of recurrent neural networks (RNN), and the convolutional neural network (CNN), it has stronger learning capacity and inference performance.
Deep learning can mine more implicit features than traditional shallow machine learning. The neural network is an indispensable part of machine learning and the main building block of deep learning frameworks. It connects elements similar to human neurons into a network topology, which gives it the ability to mine deeper features autonomously. A deep neural network (DNN) adds hidden layers to strengthen the expressive power of the module, and its output layer can have more than one output neuron, so the module can be applied flexibly to classification and regression.
The twin Transformer deep neural network module combines the strong learning capacity of the Transformer structure with the fast learning and strong generalization of the twin network. It is used throughout the medical knowledge learning stage.
Before learning with the twin Transformer deep neural network module, classification labels must be set for the objects to be classified. Medical knowledge learning requires three steps: named entity recognition, entity standardization and entity relation extraction.
[ named entity recognition submodule ]
The named entity association extraction sub-module extracts from the original text the correspondences between entities that can be used to construct the knowledge graph. Learning medical knowledge in text form requires more detailed, multidimensional learning rules than learning user behavior, whose inputs are mostly numeric weights. During learning, the drugs, disease types and symptoms to be covered, together with their IDs, must be listed manually so that the entities are defined accurately. Text input requires a language module, Transformer-LM, which contains lexicons of medical terms, including a drug name lexicon and a symptom expression lexicon. These lexicons require entity tables to be written manually and names, types and IDs to be labeled. The original drug text is labeled in the same way as in general named entity recognition and entity association annotation, which is a conventional technique in the field and is not repeated here. On the basis of Transformer-LM, the named entity association extraction sub-module Transformer-KG is constructed. As shown in fig. 4, it contains a text modeling sub-network Transformer-NODE (which maps text input to a vector sequence), an MLP-Residual sub-network (which predicts the association class of two node vectors), and three outputs: named entity recognition tags, knowledge graph node ids, and the associations between nodes.
The model structure of Transformer-NODE is the same as that of Transformer-LM, and the trained Transformer-LM parameters are used directly to initialize it. Transformer-NODE has two tasks: labeling each character of the input sequence with a node type label and with a node id. For example, "Sanhuang tablet" is a drug name, and the node type labels of its characters are [drug-B, drug-M, drug-E], where drug-B marks the beginning, drug-M the middle and drug-E the end. The corresponding node ids are [23, 23, 23], because the node id of Sanhuang tablet in the knowledge graph is 23; this is preset in the database. Transformer-NODE outputs a sequence of vectors together with a node type label and a node id for each vector. Vectors whose node id is not empty are aggregated by node id using average pooling (avgpool). Each node id thus has one aggregated vector, and an input text yields several node ids with their corresponding vectors. Put simply, Transformer-NODE marks which parts of the text sequence are entities, and the vector of each entity is the average of the vectors of the characters that make up the entity. Both tasks of Transformer-NODE are classification tasks, and the objective function is $L = -\sum_i \log P(c_i = y_i \mid S)$, where $S$ is the input sequence, $c_i$ is the $i$-th character of the sequence and $y_i$ is the correct label of the $i$-th character. Note that each character has two labels, a node id label and a node type label, and both may be NULL.
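The aggregation of character vectors by node id can be sketched in Python as follows. The function name `pool_by_node_id` is hypothetical, `None` stands in for a NULL node id, and the toy two-dimensional vectors are made up for illustration.

```python
from collections import defaultdict


def pool_by_node_id(vectors, node_ids):
    """Average the character vectors that share a node id (avgpool)."""
    groups = defaultdict(list)
    for vec, nid in zip(vectors, node_ids):
        if nid is not None:  # characters outside any entity are skipped
            groups[nid].append(vec)
    return {
        nid: [sum(col) / len(vecs) for col in zip(*vecs)]
        for nid, vecs in groups.items()
    }
```

With three characters tagged with node id 23 and one untagged character, the result holds a single averaged vector under key 23.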
The task of MLP-Residual is to extract associations between entities from a piece of text input. Transformer-NODE generates a number of entity vectors, and the relationship between any two of them must be predicted. MLP-Residual predicts the relation between two node ids from the two input entity vectors. In brief, the prediction uses an MLP encoder in which every layer produces a hidden representation; with a three-layer MLP, an input entity vector yields three hidden representations, from layer 1, layer 2 and layer 3. When two entity vectors are fed to the MLP in turn, three hidden representations are produced for each. The hidden representations from the same layer are multiplied to obtain a score, and the per-layer scores are added to obtain the final score, which is passed through a sigmoid function to produce a probability. That is, MLP-Residual takes two entity vectors as input and outputs the probability that they stand in a given relation.
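The layer-wise scoring described above can be sketched in Python. Several details are assumptions rather than part of the source: tanh is used as the activation, "multiplied" is read as a dot product of same-layer hidden representations, and the names `hidden_reps` and `relation_probability` are hypothetical.

```python
import math


def hidden_reps(v, layers):
    """Return the hidden representation produced by every MLP layer for input v."""
    reps, h = [], v
    for W, b in layers:
        h = [math.tanh(sum(w * x for w, x in zip(row, h)) + bias)
             for row, bias in zip(W, b)]
        reps.append(h)
    return reps


def relation_probability(vi, vj, layers):
    # Dot product of same-layer representations, summed over all layers.
    score = sum(
        sum(a * c for a, c in zip(hi, hj))
        for hi, hj in zip(hidden_reps(vi, layers), hidden_reps(vj, layers))
    )
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid
```

One such scorer would be kept per relation type, and the relation with the highest probability would be chosen.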
The MLP-Residual formula is as follows:
$h_z = \sigma_z(W_z \cdot h_{z-1} + b_z)$
where $v_i$ and $v_j$ are the vector representations of two different entities and $h_z$ is the representation of hidden layer $z$. Taking the logarithm of the predicted value and of its complement gives a residual for the predicted value, and this residual drives the optimization of the model. The final L is the probability of the relationship between the two entities. For any two entities, the relationship between them may be one of {medicine treatment, disease onset, none}. Each relation has its own MLP-Residual responsible for prediction, so three probability values are generated, and the relation with the largest probability is taken as the relation between the two entities. The structure is essentially a twin multilayer MLP whose final result is the sum of the predictions output by each layer. The objective function used is
Wherein W is the weight, which can be solved by stochastic gradient descent using only one misclassified sample at a time. The gradient iteration formula is $W^{(t+1)} = W^{(t)} - \alpha \cdot y_i x_i$, where t is the number of descent rounds, starting from 0 and increasing by one each round; $x_i$ and $y_i$ (the same as $r_i$) are the coordinates of the sample point newly added in each round; and $\alpha$ is the step size. The residual iteration steps of the multilayer MLP on one side of the twin network are as follows: (1) set the initial w to 1, and define the initial value of the w vector and the step size $\alpha$; (2) select from the training set a misclassified sample point $(x_i, y_i)$ satisfying $y_i(w^T x_i + b) < 0$; (3) perform one stochastic gradient descent iteration on the w vector; (4) check whether the training set still contains misclassified points; if not, the algorithm ends, otherwise return to step (2). The other side of the twin network iterates over the set of misclassified sample points in reverse order, fitting the misclassified sample set from the other direction. The residual iterations on the two sides are performed simultaneously and act against each other, fusing the features of the different channels.
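Steps (1) to (4) resemble a perceptron-style iteration and can be sketched as below. The sketch is an assumption-laden illustration: it uses the textbook perceptron update $w \leftarrow w + \alpha \cdot y_i x_i$ (gradient descent on the loss $-y_i(w^T x_i + b)$), whose sign convention may differ from the formula in the text depending on how the gradient is defined; the toy samples are hypothetical; and the two sides are run one after the other rather than simultaneously.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], int]  # (x_i, y_i) with y_i in {+1, -1}


def margin(w: List[float], b: float, x: List[float], y: int) -> float:
    """Signed margin y_i * (w^T x_i + b); negative means misclassified."""
    return y * (sum(wk * xk for wk, xk in zip(w, x)) + b)


def train_side(samples: Sequence[Sample], alpha: float = 0.5,
               max_steps: int = 1000) -> Tuple[List[float], float]:
    """Run the misclassification-driven iteration over samples in the given order."""
    w = [1.0] * len(samples[0][0])  # step (1): initial w set to 1
    b = 0.0
    for _ in range(max_steps):
        # step (2): pick the first misclassified sample in scan order
        wrong = next(((x, y) for x, y in samples if margin(w, b, x, y) < 0), None)
        if wrong is None:
            break  # step (4): no misclassified points remain
        x, y = wrong
        # step (3): one stochastic update using only this sample
        w = [wk + alpha * y * xk for wk, xk in zip(w, x)]
        b += alpha * y
    return w, b


# Hypothetical, linearly separable toy data.
SAMPLES: List[Sample] = [
    ([1.0, -2.0], 1),
    ([-1.0, 2.0], -1),
    ([2.0, -1.0], 1),
    ([-2.0, 1.0], -1),
]

w_fwd, b_fwd = train_side(SAMPLES)                  # one side of the twin
w_rev, b_rev = train_side(list(reversed(SAMPLES)))  # other side, reverse order
```

Scanning the same samples in opposite orders generally corrects different points first, so the two sides can arrive at different separating parameters for the same data.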
[ entity standardization submodule ]
In extracting drug entities or symptom entities from the drug text, a named entity recognition module is used. This module identifies which part of a piece of text is the name of a drug, which part is the name of an indication and which part is the name of a symptom. For example, "Sanhuang tablet" is a medicine name, "heat-clearing and detoxicating" is an indication name, and "conjunctival congestion with swelling and pain" is a symptom name. The named entity recognition module simply extracts spans of the original text and marks their categories. However, because natural language is flexible, the names identified in medical knowledge texts or in user descriptions may be several different expressions of the same indication, which are difficult to match to the standard labels and therefore require a further standardized mapping. For this purpose an entity normalization module is used, which maps the identified entity words to normalized nouns. The common cold, for example, has many descriptions: "cold", "common cold" and "wind-cold" all actually refer to "upper respiratory tract infection" as the standard name of the disease. The entity normalization module is important for keeping the construction lightweight and avoiding redundant entities: it merges substantially identical entities, thereby completing entity standardization.
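The mapping from surface forms to a standard name can be illustrated with a plain lookup table. This is only a sketch of the input and output of normalization: the document performs it with a neural model, and the dictionary `STANDARD_NAMES` and the function name are hypothetical.

```python
from typing import Optional

# Surface forms taken from the example above, mapped to the standard name.
STANDARD_NAMES = {
    "cold": "upper respiratory tract infection",
    "common cold": "upper respiratory tract infection",
    "wind-cold": "upper respiratory tract infection",
}


def normalize_entity(mention: str) -> Optional[str]:
    """Return the standard name for a recognized mention, or None if unknown."""
    return STANDARD_NAMES.get(mention.strip().lower())


merged = {normalize_entity(m) for m in ["Cold", "common cold", " wind-cold "]}
```

All three mentions collapse into a single standard entity, which is what prevents duplicate nodes from entering the knowledge graph.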
[ entity association extraction submodule ]
The entity association extraction submodule establishes associations among the results produced by the named entity recognition module and the symptom descriptions or vocabulary normalized by the entity normalization module. It differs from a general entity relation extraction submodule in that, when several medicines and several indications appear in one passage, it can still associate each medicine with the correct indication. For example: "Xiaoyao pill is a Chinese patent medicine used for treating depression and discomfort, distending pain in the chest and hypochondrium, dizziness, anorexia and diarrhea caused by liver depression and spleen deficiency. If enteritis is caused, medicines such as amoxicillin, enteritis-relieving and intestinal metaplasia may be used." This description mentions four medicines ("Xiaoyao pill", "amoxicillin", "enteritis-relieving" and "intestinal metaplasia") and two indications ("liver depression and spleen deficiency" and "enteritis"). The entity relation extraction submodule associates "Xiaoyao pill" with "liver depression and spleen deficiency", and associates "amoxicillin", "enteritis-relieving" and "intestinal metaplasia" with "enteritis". In summary, learning from the medical knowledge text yields three outputs: named entity recognition tags, node IDs for the knowledge graph, and associations between nodes.
Typically, named entity recognition only determines what type an entity belongs to. In the present invention, entity standardization is performed by the twin Transformer deep neural network during entity extraction. The technical scheme can therefore determine not only the category of an entity but also its node ID; in essence this is a sequence labeling task, solved as a classification problem. There are only 3 major entity categories, whereas the entity IDs number in the tens of thousands of classes; by virtue of the Transformer structure and a pre-training model obtained with it, the knowledge graph can integrate entity ID prediction with named entity recognition.
In addition, because medical knowledge keeps developing and expanding, the twin Transformer deep neural network can perform knowledge discovery on the basis of the initially formed entity correspondences, thereby filling knowledge gaps that the medical text data cannot cover. Knowledge discovery also converts the nodes into a node feature matrix, which is provided to the knowledge graph module as a supplement to its knowledge and to the recommendation module as part of its input.
[ construction of knowledge-graph ]
In form, the medical knowledge graph is a database with a graph structure. It comprises three kinds of nodes (medicines, symptoms and diseases) and two kinds of associations (a medicine treats a disease, and a disease produces certain symptoms). Each node of the knowledge graph has a node ID, and each node ID serves as the primary key of a data table so that it can be associated with other data tables of external systems. The knowledge graph is essentially a set of data tables; constructing these tables requires several algorithm models, and the tables are directly related to other data tables in those models. On a small scale, the knowledge graph can be associated with the data tables output by the twin Transformer deep neural network, and entity lookup and relation retrieval are performed through query operations. On a large scale, it can be associated with real-world medicine sales data tables through a join operation: medicine nodes are added directly to the knowledge graph and the ID of the same medicine is kept consistent.
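The table view of the knowledge graph and its join with a sales table can be sketched with an in-memory SQLite database. The schema, table names and the sales figure are assumptions made for illustration; only the node IDs 23, 102 and 57 are borrowed from the worked example in this description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT NOT NULL, type TEXT NOT NULL);
CREATE TABLE relation (head_id INTEGER REFERENCES node(id),
                       tail_id INTEGER REFERENCES node(id),
                       kind TEXT NOT NULL);
CREATE TABLE sales (node_id INTEGER REFERENCES node(id), units_sold INTEGER NOT NULL);
""")
conn.executemany("INSERT INTO node VALUES (?, ?, ?)", [
    (23, "Sanhuang tablet", "medicine"),
    (102, "Triple energizer heat retention", "disease"),
    (57, "Constipation", "symptom"),
])
conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", [
    (23, 102, "treats"),     # medicine treats disease
    (102, 57, "produces"),   # disease produces symptom
])
conn.execute("INSERT INTO sales VALUES (?, ?)", (23, 120))  # hypothetical sales row


def medicines_for_symptom(symptom_id: int):
    """Walk symptom -> disease -> medicine and join the external sales table."""
    return conn.execute("""
        SELECT m.id, m.name, s.units_sold
        FROM relation AS produces
        JOIN relation AS treats
          ON treats.tail_id = produces.head_id AND treats.kind = 'treats'
        JOIN node AS m ON m.id = treats.head_id
        LEFT JOIN sales AS s ON s.node_id = m.id
        WHERE produces.kind = 'produces' AND produces.tail_id = ?
    """, (symptom_id,)).fetchall()


result = medicines_for_symptom(57)
```

Using the node ID as the shared key is what lets the sales table attach to the graph without any name matching.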
The following illustrates how the twin Transformer deep learning module and the knowledge graph module interact when medical knowledge written in natural language is embedded into the knowledge graph according to the embodiment of the present invention. For example: "Sanhuang tablet, name of a Chinese patent medicine. It is a heat-clearing agent with the effects of clearing away heat and toxic materials, purging fire and relaxing the bowels. It is indicated for heat retention in the triple energizer, with conjunctival congestion with swelling and pain, sores of the mouth and nose, sore throat, restlessness and thirst, yellow urine and constipation." After entity recognition and relation extraction, the text is found to contain the indication of the Sanhuang tablet (body heat and excessive internal heat) and the symptoms corresponding to that indication (eye swelling and pain, oral sores, nasal sores, throat swelling and pain, dysphoria, yellow urine and constipation).
First, the Transformer-NODE model extracts the following entity data table from the original text. The table is generated by the Transformer-NODE, which predicts the id and type corresponding to each entity; either prediction may be NULL, meaning that the entity has no corresponding id or no corresponding type (this can be understood as a prediction error of the model on the id or type):
Entity | Id | Type
Sanhuang tablet | 23 | Medicine
Heat clearing agent | 98 | NULL
Triple energizer heat retention | 102 | Disease
Conjunctival congestion with swelling and pain | 123 | Symptom
Sore throat | 125 | Symptom
Restlessness and thirst | NULL | Symptom
Yellow urine | 47 | Symptom
Constipation | 57 | Symptom
Next, rows whose id or type is NULL must be removed: a NULL id indicates that the entity is outside the coverage of the knowledge graph, and a NULL type indicates that the entity is not a legal entity. The following data table is obtained after filtering:
Entity | Id | Type
Sanhuang tablet | 23 | Medicine
Triple energizer heat retention | 102 | Disease
Conjunctival congestion with swelling and pain | 123 | Symptom
Sore throat | 125 | Symptom
Yellow urine | 47 | Symptom
Constipation | 57 | Symptom
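The filtering step between the two tables can be sketched as follows, with NULL represented by Python's `None`. The row layout and the function name `drop_null_rows` are illustrative assumptions; the values are those of the example tables.

```python
from typing import Dict, List, Optional, Union

Row = Dict[str, Optional[Union[str, int]]]

ENTITY_ROWS: List[Row] = [
    {"entity": "Sanhuang tablet", "id": 23, "type": "Medicine"},
    {"entity": "Heat clearing agent", "id": 98, "type": None},
    {"entity": "Triple energizer heat retention", "id": 102, "type": "Disease"},
    {"entity": "Conjunctival congestion with swelling and pain", "id": 123, "type": "Symptom"},
    {"entity": "Sore throat", "id": 125, "type": "Symptom"},
    {"entity": "Restlessness and thirst", "id": None, "type": "Symptom"},
    {"entity": "Yellow urine", "id": 47, "type": "Symptom"},
    {"entity": "Constipation", "id": 57, "type": "Symptom"},
]


def drop_null_rows(rows: List[Row]) -> List[Row]:
    """Keep only rows whose id and type were both predicted."""
    return [row for row in rows if row["id"] is not None and row["type"] is not None]


filtered = drop_null_rows(ENTITY_ROWS)
kept_ids = [row["id"] for row in filtered]
```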
The vector representation of each entity is output together with the entity table. Finally, based on the entity table and the corresponding vector representations, the MLP-Residual predicts the relationships between nodes, and the results are stored in the node relationship table. See the following table:
Node | 23 | 102 | 123 | 125 | 47 | 57
23 |  | treats |  |  |  |
102 |  |  | produces | produces | produces | produces
123 |  |  |  |  |  |
125 |  |  |  |  |  |
47 |  |  |  |  |  |
57 |  |  |  |  |  |
An empty cell in the table indicates that the prediction found no relationship between the two nodes. The extracted node relationship table is finally stored in the knowledge graph database and becomes part of the content of the knowledge graph.
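How a sparse node relationship table might be assembled from per-pair relation probabilities is sketched below. The probability values are invented for illustration, and the relation labels "treats", "produces" and "none" are shorthand chosen for this sketch; pairs predicted as "none" are simply left out, which corresponds to an empty cell.

```python
from typing import Dict, Tuple

Pair = Tuple[int, int]  # (head node id, tail node id)


def pick_relation(probs: Dict[str, float]) -> str:
    """Choose the relation with the largest predicted probability."""
    return max(probs, key=probs.get)


def build_relation_table(pair_probs: Dict[Pair, Dict[str, float]]) -> Dict[Pair, str]:
    """Store only the pairs whose most probable relation is not 'none'."""
    table = {}
    for pair, probs in pair_probs.items():
        relation = pick_relation(probs)
        if relation != "none":
            table[pair] = relation
    return table


# Hypothetical predictions for three node pairs.
PAIR_PROBS = {
    (23, 102): {"treats": 0.91, "produces": 0.05, "none": 0.20},
    (102, 57): {"treats": 0.08, "produces": 0.87, "none": 0.15},
    (23, 57): {"treats": 0.30, "produces": 0.10, "none": 0.75},
}

relation_table = build_relation_table(PAIR_PROBS)
```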
Compared with the knowledge graphs described in the prior art, the knowledge graph of the embodiment of the present invention has the advantage of an added standardization model. As is well known, the biggest problem with medical materials is confused terminology: the same meaning may be expressed in many ways, and terms that differ by a single word may mean different things. For example, "wind-heat cold" and "wind-cold" look very similar but are actually different colds that require different medication. The standardization model prevents the knowledge confusion that such terminology would otherwise introduce into the knowledge graph, making the knowledge graph more accurate.
When embedding new medical knowledge or discovered potential medical knowledge, embodiments of the present disclosure use a ConvE-like model to perform knowledge embedding on the medical knowledge graph. The existing ConvE model represents the nodes and association types of the knowledge graph as vectors; at the input, the vector of one node and the vector of one association type are reshaped and concatenated to form a two-dimensional matrix. In the embodiment disclosed by the invention, the two-dimensional matrix is instead obtained from the node vector and the association-type vector through a bilinear function. The specific method is shown in the following formula: $X_{ij} = x_i \cdot y_j$, where $x_i$ is the i-th element of the node vector and $y_j$ is the j-th element of the association-type vector.
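The difference between the two input constructions can be sketched in a few lines. This is an interpretation, not the patented code: the bilinear function is read here as the outer product of the node vector and the association-type vector, and the function names and toy vectors are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def reshape_concat(node_vec: Vector, rel_vec: Vector, width: int) -> Matrix:
    """ConvE-style input: reshape both vectors to rows of `width`, then stack them."""
    def reshape(vec: Vector) -> Matrix:
        return [vec[k:k + width] for k in range(0, len(vec), width)]
    return reshape(node_vec) + reshape(rel_vec)


def bilinear_matrix(node_vec: Vector, rel_vec: Vector) -> Matrix:
    """Assumed bilinear input: entry (i, j) is node_vec[i] * rel_vec[j]."""
    return [[x * y for y in rel_vec] for x in node_vec]


stacked = reshape_concat([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], width=2)
outer = bilinear_matrix([1.0, 2.0], [3.0, 4.0, 5.0])
```

In the stacked form each matrix entry comes from only one of the two vectors, whereas in the outer-product form every entry mixes one node element with one association-type element.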
The original ConvE work does not explain the reasoning behind its input construction, which appears to be an empirical attempt of limited technical value. In contrast, the improvement of the embodiment disclosed by the invention uses one of the common technical means for generating a new representation from two vectors; this means is common in image processing but less common in graph-structured neural networks. Experiments show that a model using the improved method converges faster and requires less training time. The improvement can therefore be considered to help the representation learning of the model, allowing it to reach a stable representation more quickly.
When the knowledge graph is associated with the data tables output by the twin Transformer deep neural network, knowledge embedding is performed by the knowledge graph embedding module: the entities output by the twin Transformer deep neural network and the relationships between them are represented as a graph, that is, the relationships between nodes are modeled, and vector representations of the nodes and associations are output, as shown in fig. 5, to serve as data features for downstream applications. At the same time, the knowledge discovery capability of the twin Transformer deep neural network is used to learn the relationships among nodes, so as to compensate for the insufficient coverage of the existing data and of the entity relation extraction module and to mine potential entity associations. The disclosed embodiments of the present invention use an adversarial contrastive learning algorithm to overcome the overfitting that arises when only a small number of positive samples are available for modeling.
For example, the Bailing capsule is a common medicament for treating kidney diseases, and mainly serves to protect the kidney. There is in fact a medicine very similar to the Bailing capsule, the Jinshuibao capsule, which also protects the kidney. In the medicine text database there is abundant information about the Bailing capsule but little about the Jinshuibao capsule, so the entity relationships of the Bailing capsule can be extracted from the existing information, whereas those of the Jinshuibao capsule cannot be extracted directly. Knowledge discovery in the twin Transformer deep neural network module refers to predicting, from limited and relatively vague descriptions, whether an association exists between two entities.
From the existing data we know that the Bailing capsule protects the kidney and benefits the essence. We can therefore take a description of the Bailing capsule as input A, for example: "The Bailing capsule is made by grinding Cordyceps sinensis into powder, and can promote the synthesis of renal tubular epithelial cells, which is beneficial to recovery from acute tubular necrosis." We take a description of kidney disease as input B: "The causes of chronic kidney disease include various primary and secondary glomerulonephritis, tubular injury, renal vascular lesions, and so on. According to GFR, chronic kidney disease can be divided into 5 stages; early discovery and early intervention can markedly reduce the complications of CKD (chronic kidney disease) patients and markedly improve survival. Treatment of CKD includes treatment of the primary disease, management of the various risk factors, and delaying the progression of chronic renal insufficiency. When a CKD patient progresses to stage 5, renal replacement therapy should be performed in time." From these inputs the twin Transformer deep neural network module learns that the relationship between the Bailing capsule and kidney disease is auxiliary treatment. Next, a description of the Jinshuibao capsule is fed to the module as input A: "Jiangxi Jimin Kexin brings good news to patients with chronic kidney disease. The Jinshuibao capsule is derived from natural cordyceps and contains its main components: cordycepic acid, cordyceps polysaccharide and adenosine." The module then predicts with high probability that the relationship between the Jinshuibao capsule and kidney disease is also auxiliary treatment.
Finally, during training of the recommendation module, we found that the module cannot effectively learn useful information with a general neural network training method, because the general method selects negative samples by random negative sampling, and in a medical setting randomly sampled negative samples are very easy to distinguish. A model training mode based on the adversarial idea is therefore adopted to increase the difficulty of model learning. It reduces the weight of easily distinguished negative samples, increases the weight of easily confused samples, and raises the sampling probability and density of the easily confused samples. Difficult samples are thus learned and verified repeatedly while simple samples are learned only a few times, which keeps the model balanced.
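One way to raise the sampling probability of easily confused negatives is to weight each candidate by the score the current model assigns to it. The sketch below uses a softmax over those scores, a swapped-in technique from the general literature on hard-negative sampling rather than the method confirmed by this document; the candidate names, the scores and the temperature parameter are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def sampling_weights(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Softmax over model scores: negatives the model finds plausible get more weight."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negatives(candidates: Sequence[str], scores: Sequence[float],
                     k: int, seed: int = 0) -> List[str]:
    """Draw k negatives, favoring the easily confused ones."""
    rng = random.Random(seed)
    return rng.choices(list(candidates), weights=sampling_weights(scores), k=k)


# Hypothetical negative candidates with the model's current confusion scores.
CANDIDATES = ["easy negative A", "easy negative B", "hard negative C"]
SCORES = [-3.0, -2.5, 1.5]

weights = sampling_weights(SCORES)
batch = sample_negatives(CANDIDATES, SCORES, k=5)
```

Lowering the temperature concentrates sampling further on the hardest negatives, while a high temperature approaches uniform random sampling.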
[ recommendation Module ]
Separate recommendations are made for each shopping section, such as "recent hot buys", "bought this, also bought", "others are viewing" and "guess you like", and different recommendation rules are used for each section.
The commodity recommendation scheme used by the invention starts from two aspects: user behavior and commodity association. In the user-behavior recommendation method, the various collected events, such as clicking, searching, adding to the shopping cart and browsing, are placed into the user behavior sequence. A measure of the time difference is inserted between behaviors to strengthen the influence of time on the recommendation, as shown in fig. 2. The recommendation module calculates the existing related-click probability of the collected commodities: for any commodity a, if a commodity b is recommended as a related commodity, multiple users (marked as User1, User2, User3 and User4 in the figure) generate a related-click probability. The data obtained is then split into two parts, one used to train the model and the other used to verify the recommendation effect of the trained model. Features are extracted for commodity a, including the number of recommendation requests for commodity a, the sales volume of commodity a, the number of searches for commodity a, the number of times commodity a appears on the first page of search results, and so on. LightGBM (Light Gradient Boosting Machine) is a framework that implements the GBDT (Gradient Boosting Decision Tree) algorithm; compared with traditional GBDT machine learning models it supports efficient parallel training and offers faster training, lower memory consumption, better accuracy and support for distributed data processing. Finally, the lambdarank ranking method of LightGBM is used to predict, for any commodity a, whether the probability that a recommended commodity b will be clicked and browsed is higher than that of another recommended commodity c.
If the probability of clicking commodity b is higher than that of clicking commodity c, the recommendation result of the module is valid; if it is lower, the recommendation result is biased. In that case the order of commodity b and commodity c is swapped, the data is fed back into the training of the recommendation module, and the above steps are repeated until a reasonable recommendation result is obtained.
In actual use, the user's browsing of medicine commodities also provides feedback on the results of the recommendation system; this both verifies the effect of the recommendation scheme and refines the user portrait, so that the recommendation results can be optimized.
The modeling method of the recommendation module comprises four steps: model training, model loading, model management and log recording. During model training, user history data is read from the Hadoop Distributed File System (HDFS) and the model is trained on Spark; training generally requires 4-8 iterations, with the model effect improving after each iteration. After training, the model results are stored on HDFS. When the model is used, it does not need to be retrained; only the previously trained model needs to be loaded, which separates the two processes of model training and model inference. Therefore, after model training is completed, a signal is sent to the model inference process, which loads the new model asynchronously and finally replaces the old model. In the model inference process, user data is input into the model in real time, the model calculation module is called, a predicted commodity score is output, and the result is returned. Finally, a recommendation log is recorded, containing the recommendation result returned to the user by each model, so that the whole link is transparent when problems are investigated later. The formula used in the related recommendation evaluation method is as follows:
Wherein, nDCGp is approximately equivalent to the proportion of the top 9 commodities by model-predicted click rate that also belong to the top 9 of the actual commodity click-rate ranking. The nDCGp score calculated by the current statistical method is 0.32, and the score calculated by the model is 0.63.
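For readers unfamiliar with the metric, a common textbook definition of nDCG at cutoff p can be sketched as follows. The exact variant used in this document is not recoverable from the text, so the linear gain, the logarithmic discount and the toy relevance lists below are assumptions.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float], p: int) -> float:
    """Discounted cumulative gain of the first p items, in the order given."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:p], start=1))


def ndcg(relevances: Sequence[float], p: int) -> float:
    """DCG of the predicted order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True), p)
    return dcg(relevances, p) / ideal if ideal > 0 else 0.0


# Relevance of items listed in the order the model ranked them (hypothetical values).
perfect_order = ndcg([3.0, 2.0, 1.0], p=3)
reversed_order = ndcg([1.0, 2.0, 3.0], p=3)
```

A ranking that places the most clicked commodities first scores 1.0, and the score falls as highly clicked commodities are pushed down the list.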
[ recommendation feedback Module ]
Recommendation feedback is based on the event-tracking (buried point) system designed for the recommended commodities. Specifically, the evaluation results given by the event-tracking system are used to decide whether the recommendation algorithm that produced the current recommendation results should be evolved or eliminated. After the evaluation, ABTest rules are set for the recommendation results of different grouped sections or of users with different tags, so that multidimensional data (including but not limited to tracking events) from the different groups can be compared and analyzed.
The commodity recommendation rules reflect the differences between the medicine retail industry and other fast-moving retail industries. When the recommendation rules are formulated, they must be evaluated and differentiated on the basis of extensive industry experience and supplemented, in combination with the operator's business rules, for the special forms of new medicine retail, so that unique rule-engine logic and an adaptation scheme are designed. For verification and evaluation across different groups, the ABTest rule scheme allows the recommendation engine to make optimized recommendations only for users of the APP and applets: users can be flexibly grouped through the ABTest rules, devices are allocated in proportion, and the rules automatically adapt to the WeChat openID and unionID, the IDFA of the iOS system and the IMEI number of the Android system. This achieves accurate delivery, scientific testing and rigorous feedback. When matched with the event-tracking system for recommended commodities, the recommendation engine and the tracking events are tightly coupled, yet neither the generality of the engine nor that of the event-tracking system is affected. Data is fed back promptly, so that recommendations can respond to feedback in real time.
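Proportional grouping of devices can be sketched with a deterministic hash of the device identifier. This is one common way to implement such a split and is an assumption here, not the document's rule engine; the experiment name, group labels and identifiers are hypothetical.

```python
import hashlib
from typing import List, Tuple


def assign_group(device_id: str, experiment: str,
                 split: List[Tuple[str, float]]) -> str:
    """Map a device identifier (openID, IDFA, IMEI, ...) to a group by proportion."""
    digest = hashlib.sha256(f"{experiment}:{device_id}".encode("utf-8")).hexdigest()
    point = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    cumulative = 0.0
    for group, share in split:
        cumulative += share
        if point < cumulative:
            return group
    return split[-1][0]  # guard against rounding when shares sum to just under 1


SPLIT = [("control", 0.5), ("new_rules", 0.5)]
first = assign_group("device-0001", "guess-you-like", SPLIT)
again = assign_group("device-0001", "guess-you-like", SPLIT)
groups_seen = {assign_group(f"device-{n:04d}", "guess-you-like", SPLIT)
               for n in range(1000)}
```

Hashing keeps a device in the same group across requests without storing any assignment, which makes the tracking data of the groups directly comparable.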
Before the final recommendation is delivered to the user, items that have already been recommended must be filtered out. There are three methods. The first is to store the scores of all items browsed by all users; the user data obtained is very comprehensive, but the storage consumed is huge: millions of users and millions of items need at least 10 TB of storage, and subsequent updates are difficult to insert into a 10 TB-level database quickly enough for real-time ranking. The second is to store only the record of items browsed by the user and deduplicate against it each time; however, if the items purchased by the user are confined to a narrow field, the recommendation results barely differ from one request to the next, and deduplication after each subsequent recommendation takes a great deal of time. The third is to maintain an ordered recommendation queue (recommended set) for each user, with a queue size of ten thousand per user. Score updates and deduplication are all carried out within the queue, and the scores between items and users are computed incrementally, that is, only the newly added and changed parts are calculated, with the calculation triggered once an item's view count has changed by a certain amount. Because the recommendation results are stored in advance, the recommendation module does not have to be called for a certain number of subsequent recommendations, and an item is matched against a user only once within a given time range, which saves a large amount of computing resources.
Secondly, owing to the nature of the data structure, the queue can be adjusted dynamically: the queue of an active user can be somewhat longer and that of an inactive user somewhat shorter, so that the shared storage space is allocated on demand and the user experience is optimized. Finally, the queue provides ordering and deduplication automatically, and its data structure can be adjusted flexibly according to the business logic to support complex ordering (for example, the ordering function can be set flexibly using a sorted set). However, this approach has a performance cost that results from the complicated structure, and services are difficult to recover quickly in a real-time architecture where several storage services, such as a highly available in-memory service and an offline database service, coexist. In addition, the user portrait module can be reused to provide feedback correction of the user classification, or different storage space can be allocated to different user types.
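A per-user recommendation queue with built-in ordering, deduplication and a size limit can be sketched in plain Python. The class below is a hypothetical in-memory stand-in for whatever sorted-set storage a deployment would use; its name, methods and the small `max_size` in the usage lines are assumptions for illustration.

```python
from typing import Dict, List, Set


class RecommendationQueue:
    """Bounded, ordered, deduplicated queue of item scores for one user."""

    def __init__(self, max_size: int = 10000) -> None:
        self.max_size = max_size
        self._scores: Dict[str, float] = {}
        self._served: Set[str] = set()

    def upsert(self, item_id: str, score: float) -> None:
        """Insert or update an item score; items already served are ignored."""
        if item_id in self._served:
            return
        self._scores[item_id] = score
        if len(self._scores) > self.max_size:
            worst = min(self._scores, key=self._scores.get)
            del self._scores[worst]  # drop the lowest-scored item on overflow

    def pop_top(self, n: int) -> List[str]:
        """Return the n best items and mark them as served so they never repeat."""
        ranked = sorted(self._scores, key=self._scores.get, reverse=True)[:n]
        for item_id in ranked:
            del self._scores[item_id]
            self._served.add(item_id)
        return ranked


queue = RecommendationQueue(max_size=3)
for item, score in [("a", 0.9), ("b", 0.5), ("c", 0.7), ("d", 0.1)]:
    queue.upsert(item, score)
queue.upsert("b", 0.95)        # incremental update of an existing item
first_page = queue.pop_top(2)
queue.upsert("a", 1.0)         # already served, so ignored
second_page = queue.pop_top(5)
```

Giving active users a larger `max_size` and inactive users a smaller one mirrors the on-demand allocation described above.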
In terms of data management, the invention uses Elasticsearch: performance tests were carried out on actual data, and Elasticsearch is used to maintain and support the data. Elasticsearch is an alternative to MySQL for simple queries. Storing user labels in it improves recommendation accuracy and is particularly helpful for representing scenarios with complex user structures. It also supports storing the multidimensional labels of a single user and filtering them with both simple and complex analyses, which is a great advantage for the later drawing of user portraits. When storing recommendation data, the full recommendation data can be stored in Elasticsearch, with full indexing used to speed up queries, unlike scenarios in which the Redis prefix feature can be used.
Finally, because the recommendation system is combined with the rule engine, business scenarios can be matched quickly and the computing resources consumed during recommendation can be controlled; at the same time, administrators can flexibly add and adjust recommendation rules without restarting the service. The system can therefore update rules without interrupting service, that is, it supports seamless service updates, which further improves the quality of service for users. In terms of timeliness, the recommendation system makes recommendations in real time: information such as user behavior is obtained in real time, and recommendations are made according to the user's current situation, which safeguards the user experience. The recommendation system is implemented as a stateless service, so it can be deployed and restarted rapidly and can recover within seconds, while the high availability of its data is guaranteed by the highly available database at the bottom layer. In terms of deployment, distributed deployment is supported: on the one hand this reduces the load on the central machine room and decouples the modules, ensuring security and load capacity; on the other hand the service can be deployed on any machine to achieve load balancing, with performance adjusted dynamically in real time according to the request volume.
According to the medicine commodity recommendation system disclosed by the invention, user behavior and user preference are analyzed to predict the user's purchasing needs; a medical knowledge graph is obtained by learning from existing medical knowledge texts; the user's purchasing needs are queried in the knowledge graph to obtain a list of commodities available for recommendation; and finally the commodities in the list are checked and deduplicated by the recommendation module to obtain the final recommended commodities. Compared with commodity recommendation methods in the prior art, the method takes more factors into account when determining the recommended commodities, so the recommendation results are more accurate and better targeted.
Fig. 3 illustrates a physical schematic diagram of an electronic device. As shown in fig. 3, the electronic device may include a processor, a communication interface (Communications Interface), a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other via the communication bus. The processor may invoke logic instructions in the memory to perform a recommendation method for medicine commodities, the method comprising: analyzing the symptom description entered by the user in the search box to obtain symptom labels that can be used for retrieval; collecting the user's commodity browsing actions to obtain user preference data that can be used to describe the user portrait; learning from existing medical knowledge texts on the network to obtain the correspondence between medicines and symptoms, forming a medical knowledge graph that can be used to query existing relationships and discover potential relationships; recommending suitable medicine commodities to the user according to the symptom labels used for retrieval, the user preference data and the knowledge graph; and adjusting the recommendation method according to the user's browsing actions on the recommended medicine commodities.
In addition, the online medicine recommendation method disclosed by the invention can be stored in a memory. The logic instructions stored in the memory can be implemented in the form of software and, when sold or used as an independent product, can be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiment of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the above-mentioned online medicine commodity recommendation method, the method comprising: analyzing the symptom description entered by the user in the search box to obtain symptom labels that can be used for retrieval; collecting the user's commodity browsing actions to obtain user preference data that can be used to describe the user portrait; learning from existing medical knowledge texts on the network to obtain the correspondence between medicines and symptoms, forming a medical knowledge graph that can be used to query existing relationships and discover potential relationships; recommending suitable medicine commodities to the user according to the symptom labels used for retrieval, the user preference data and the knowledge graph; and adjusting the recommendation method according to the user's browsing actions on the recommended medicine commodities.
The apparatus embodiments described above are merely illustrative: the modules may or may not be physically separate, i.e. they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence or in the part contributing over the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.

Claims (13)

1. A commodity recommendation method based on combination of a twin Transformer model and a knowledge graph, characterized by comprising: performing semantic segmentation on the user's input in the search box so as to analyze the user's actual needs, obtain symptom labels usable for retrieval, and thereby portray the user; collecting the user's commodity-browsing actions to obtain user preference data usable for describing the user portrait; using the twin Transformer model to learn medical knowledge texts recorded in natural language so as to extract medicine and symptom labels, and normalizing the labels to obtain the correspondence between medicines and symptoms, thereby forming a medical knowledge graph usable for querying existing relations and discovering potential relations; recommending suitable medicine commodities to the user according to the symptom labels provided by the user for retrieval, the user preference data, and the knowledge graph; and adjusting the recommendation strategy according to the user's browsing actions on the recommended medicine commodities,
wherein portraying the user comprises portraying the user according to the user's input and establishing user labels according to the user's purchasing behavior;
discovering the potential relations comprises using a ConvE model optimized through a bilinear function to perform knowledge embedding on the medical knowledge graph, and using the twin Transformer model to predict, from the descriptions, whether a relation exists between two entities;
recommending suitable medicine commodities to the user proceeds from two aspects, user behavior and commodity association: in terms of user behavior, the time difference between user behaviors is measured so as to strengthen the influence of time on the recommendation; in terms of commodity association, the commodity relevant click rate is calculated from the existing data that can be collected, the commodity relevant click rate being the probability that a user, after clicking a first commodity, clicks a second commodity presented as a recommended commodity, and the commodity relevant click rate being used on the one hand to train the model and on the other hand to verify the recommendation effect of the trained model;
before suitable medicine commodities are recommended to the user, the recommended commodities are filtered and a recommendation queue is generated for each user, and scores are calculated only for medicines newly added to or changed in the queue; when the user's browsing volume of medicine commodities has been updated to a certain threshold, the recommendation queue is completely updated; the length of the recommendation queue is set in relation to the user's activity level, so that the limited storage space is allocated on demand and the user experience is optimized;
the commodity recommendation method further comprises combining the recommendation system with a rule engine so as to match business scenarios quickly, enabling an administrator to flexibly add and adjust recommendation rules without restarting the service.
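As an illustration of the commodity relevant click rate defined in claim 1, the following minimal sketch estimates, for each (first commodity, recommended commodity) pair, the fraction of impressions in which the recommended commodity was clicked after the first one. The function name `relevant_click_rates` and the `(first_item, recommended_item, was_clicked)` log format are assumptions made for this example; the patent does not specify a data layout or an estimator.

```python
from collections import Counter


def relevant_click_rates(events):
    """Estimate P(user clicks recommended item | user clicked first item).

    `events` is an iterable of (first_item, recommended_item, was_clicked)
    tuples (a hypothetical log format): one tuple for each time
    `recommended_item` was shown after a click on `first_item`.
    """
    shown = Counter()    # impressions per (first, recommended) pair
    clicked = Counter()  # follow-up clicks per (first, recommended) pair
    for first_item, recommended_item, was_clicked in events:
        pair = (first_item, recommended_item)
        shown[pair] += 1
        if was_clicked:
            clicked[pair] += 1
    # Empirical click rate: clicks divided by impressions for each pair.
    return {pair: clicked[pair] / shown[pair] for pair in shown}


log = [
    ("A", "B", True),
    ("A", "B", False),
    ("A", "B", True),
    ("A", "C", False),
]
rates = relevant_click_rates(log)
```

Rates computed on one part of the log could serve as training targets and rates on a held-out part as a check of the trained model, which mirrors the two uses named in the claim.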
2. The recommendation method of claim 1, wherein the process of parsing the user's symptom description comprises: performing semantic segmentation on the natural-language character string that the user enters to describe symptoms, to obtain symptom description keywords usable for retrieval and applicable-age data.
3. The recommendation method according to claim 2, wherein the process of parsing the user's symptom description comprises: performing normalizing correction on the natural-language character string entered by the user, the normalizing correction comprising conversion from traditional Chinese characters to simplified Chinese characters, conversion of homophones, and correction of wrongly written characters.
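The three normalization steps named in claim 3 can be sketched as successive table lookups. The tables below are tiny hypothetical samples chosen for the example, and the function name `normalize_query` is likewise an assumption; a real system would presumably load full conversion and correction dictionaries.

```python
# Toy lookup tables (hypothetical samples, not from the patent).
TRADITIONAL_TO_SIMPLIFIED = {"頭": "头", "發": "发", "燒": "烧"}
HOMOPHONE_FIXES = {"发稍": "发烧"}  # same pronunciation, wrong character
TYPO_FIXES = {"咳漱": "咳嗽"}       # visually similar wrong character


def normalize_query(text):
    """Apply the three corrections in order and return the cleaned query."""
    # Step 1: character-level traditional -> simplified conversion.
    text = "".join(TRADITIONAL_TO_SIMPLIFIED.get(ch, ch) for ch in text)
    # Steps 2 and 3: phrase-level homophone and typo replacement.
    for table in (HOMOPHONE_FIXES, TYPO_FIXES):
        for wrong, right in table.items():
            text = text.replace(wrong, right)
    return text


cleaned = normalize_query("頭痛發稍咳漱")
```

Running the traditional-to-simplified step first lets the phrase tables be written in simplified characters only.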
4. The recommendation method according to claim 2, wherein the current purchasing demands of the user are predicted based on the parsed data.
5. The recommendation method according to claim 2, wherein the process of retrieving symptom description keywords is performed in a medical knowledge graph generated from existing medical knowledge texts.
6. The recommendation method of claim 1, wherein the process of collecting user preference data from the user's actions of browsing recommended or initial commodities comprises: collecting data on the user's clicks, browsing, and dwell time; collecting the user's long-press feedback indicating no interest in a medicine, or the user's action of blocking the medicine; and collecting data on the applicable population of the medicine commodities browsed by the user.
7. The recommendation method according to claim 1, wherein the medical knowledge graph construction process comprises: identifying a drug named entity, wherein the identification of the drug named entity comprises standardization of the drug entity; identifying symptom entities, the identification of symptom entities including associations between symptom entities; and identifying the corresponding relation between the medicine and the symptom, wherein the identification of the corresponding relation comprises the steps of carrying out relation abstraction according to the existing medical knowledge text and carrying out knowledge discovery according to the relation between symptom entities of the medical knowledge graph.
8. The recommendation method of claim 6, wherein the knowledge discovery process uses a twin Transformer learning module in combination with the medical knowledge graph to discover potential entity relations; when the data tables output by the twin Transformer module are associated with each other, knowledge embedding is performed by the knowledge graph embedding module, so that a visual representation of the output data of the twin Transformer module is obtained and used as one of the inputs of the downstream recommendation module.
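The twin arrangement referred to in claim 8 passes two entity descriptions through one shared encoder and compares the two outputs. The sketch below shows only that structure: the encoder is a hashed bag of character bigrams standing in for the Transformer branch, and the names `encode`, `relation_score`, `predict_relation`, the dimension `DIM`, and the `threshold` value are all assumptions made for illustration, not the patented model.

```python
import math
import zlib

DIM = 64  # size of the toy embedding (an arbitrary choice)


def encode(description):
    """Shared encoder used by both branches (stand-in for a Transformer).

    Hashes character bigrams into DIM buckets and L2-normalizes the counts.
    """
    vec = [0.0] * DIM
    text = description.lower()
    for i in range(len(text) - 1):
        bucket = zlib.crc32(text[i:i + 2].encode("utf-8")) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def relation_score(desc_a, desc_b):
    """Cosine similarity between the two branch outputs."""
    a, b = encode(desc_a), encode(desc_b)
    return sum(x * y for x, y in zip(a, b))


def predict_relation(desc_a, desc_b, threshold=0.5):
    """Flag a candidate relation when the similarity reaches the threshold."""
    return relation_score(desc_a, desc_b) >= threshold
```

Because both inputs go through the same `encode` function, the score is symmetric in its arguments, which is the property that makes the twin layout suitable for asking whether two drugs are substitutes or two symptoms are correlated.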
9. The recommendation method according to claim 1, wherein recommending suitable medicine commodities to the user comprises: recommending medicine commodities to the user based on the symptom description keywords and applicable-age data obtained from the process of parsing the user's symptom description, on the user preference data, and on the entities of the medical knowledge graph and the relations between those entities.
10. The recommendation method according to claim 8, wherein the recommendation method is optimized according to the user's browsing actions on the recommended medicine commodities, the optimization comprising: sorting the recommended commodities into a list, the order of which can be adjusted according to the user's browsing actions; managing the life cycle of recommended commodities and their corresponding labels, and controlling the continuation or termination of the life cycle according to the user's browsing actions; feeding the user's browsing actions back to the recommendation method and determining whether the recommendation model used by the recommendation method needs to be updated; and, in the process of training the recommendation module, training the model with an adversarial approach, so that the learning times of the confusing sample and the confusing sample are dynamically adjusted and the model is balanced.
11. A commodity recommendation system based on combination of a twin Transformer model and a knowledge graph, the system comprising: a user input analysis module for analyzing the symptom description entered by the user into the search box to obtain symptom labels usable for retrieval; a user portrait module for collecting the user's commodity-browsing actions to obtain user preference data usable for describing the user portrait; a knowledge graph module for learning from existing medical knowledge texts on the network to obtain the correspondence between medicines and symptoms, forming a medical knowledge graph usable for querying existing relations and discovering potential relations; a recommendation module for recommending suitable medicine commodities to the user according to the symptom labels and user preference data provided by the user and the knowledge graph; and a recommendation feedback module for adjusting the recommendation method according to the user's browsing actions on the recommended medicine commodities,
wherein the user portrait module portrays the user according to the output of the user input analysis module and establishes user labels according to the user's purchasing behavior;
the knowledge graph module discovers potential relations by using a ConvE model optimized through a bilinear function to perform knowledge embedding on the medical knowledge graph, and by using the twin Transformer model to predict, from the descriptions, whether a relation exists between two entities;
the recommendation module makes recommendations from two aspects, user behavior and commodity association: in terms of user behavior, the time difference between user behaviors is measured so as to strengthen the influence of time on the recommendation; in terms of commodity association, the commodity relevant click rate is calculated from the existing data that can be collected, the commodity relevant click rate being the probability that a user, after clicking a first commodity, clicks a second commodity presented as a recommended commodity, and the commodity relevant click rate being used on the one hand to train the model and on the other hand to verify the recommendation effect of the trained model;
a recommendation queue is generated for each user in the recommendation module and the recommendation feedback module, and scores are calculated only for medicines newly added to or changed in the queue; after the user's browsing volume of medicine commodities has been updated to a certain threshold, the recommendation module is called to update the recommendation queue; the length of the recommendation queue is set in relation to the user's activity level, so that the limited storage space is allocated on demand;
the commodity recommendation system is combined with a rule engine in use, so as to match business scenarios quickly, enabling an administrator to flexibly add and adjust recommendation rules without restarting the service.
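The per-user recommendation queue described in claim 11 can be sketched as follows: only new or changed medicines are scored, a full rebuild is triggered once a browse-count threshold is reached, and the queue length grows with user activity. The class name `RecommendationQueue`, its parameters, the linear length rule, and the assumption that activity is a value in [0, 1] are all hypothetical choices for this example; the patent does not fix any of them.

```python
class RecommendationQueue:
    """Per-user queue with incremental scoring (illustrative sketch)."""

    def __init__(self, score_fn, activity, base_len=10, max_len=50,
                 refresh_threshold=20):
        self.score_fn = score_fn
        self.refresh_threshold = refresh_threshold
        # Assumed rule: more active users (activity in [0, 1]) get longer queues.
        self.max_items = base_len + int(activity * (max_len - base_len))
        self.scores = {}
        self.views_since_refresh = 0

    def upsert(self, items):
        """Score only the newly added or changed items, then trim."""
        for item in items:
            self.scores[item] = self.score_fn(item)
        self._trim()

    def record_view(self, candidates):
        """Count one browse event; rebuild fully once the threshold is hit.

        Returns True when a full rebuild happened.
        """
        self.views_since_refresh += 1
        if self.views_since_refresh >= self.refresh_threshold:
            self.scores = {item: self.score_fn(item) for item in candidates}
            self._trim()
            self.views_since_refresh = 0
            return True
        return False

    def _trim(self):
        # Keep only the highest-scoring items that fit in the queue.
        ranked = sorted(self.scores.items(), key=lambda kv: kv[1], reverse=True)
        self.scores = dict(ranked[: self.max_items])

    def top(self, n=None):
        """Return up to n items, best score first (all items if n is None)."""
        ranked = sorted(self.scores, key=self.scores.get, reverse=True)
        return ranked[:n]
```

Between full rebuilds the cost of an update is proportional to the number of changed medicines rather than to the catalogue size, which is the point of scoring incrementally.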
12. A recommendation system according to claim 11, wherein the modules of the recommendation system are adapted to perform the steps of the method according to any one of claims 2 to 10.
13. A non-transitory computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, can implement the method and steps of any one of claims 1 to 10.
CN202310011140.8A 2023-01-05 2023-01-05 Commodity recommendation method and system based on combination of twin transducer model and knowledge graph Active CN115934967B (en)

Publications (2)

Publication Number Publication Date
CN115934967A CN115934967A (en) 2023-04-07
CN115934967B CN115934967B (en) 2024-02-27

Family

ID=86552454

Country Status (1)

Country Link
CN (1) CN115934967B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377916A (en) * 2021-06-22 2021-09-10 哈尔滨工业大学 Extraction method of main relations in multiple relations facing legal text
CN113961720A (en) * 2021-10-29 2022-01-21 北京百度网讯科技有限公司 Method for predicting entity relationship and method and device for training relationship prediction model
CN115101166A (en) * 2022-07-19 2022-09-23 康键信息技术(深圳)有限公司 Medicine information recommendation method and device, storage medium and computer equipment
CN115186102A (en) * 2022-07-08 2022-10-14 大连民族大学 Dynamic knowledge graph complementing method based on double-flow embedding and deep neural network
CN115293161A (en) * 2022-08-19 2022-11-04 广州中康资讯股份有限公司 Reasonable medicine taking system and method based on natural language processing and medicine knowledge graph

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11016965B2 (en) * 2019-01-22 2021-05-25 International Business Machines Corporation Graphical user interface for defining atomic query for querying knowledge graph databases

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant