CN112435103B - Intelligent recommendation method and system for postmortem diversity interpretation - Google Patents

Intelligent recommendation method and system for postmortem diversity interpretation

Info

Publication number
CN112435103B
Authority
CN
China
Prior art keywords
model
black box
sample data
baseline
interpretation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011507787.2A
Other languages
Chinese (zh)
Other versions
CN112435103A (en
Inventor
杨晓春
丁蕊
马红
王斌
Original Assignee
东北大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 东北大学 filed Critical 东北大学
Priority to CN202011507787.2A priority Critical patent/CN112435103B/en
Publication of CN112435103A publication Critical patent/CN112435103A/en
Application granted granted Critical
Publication of CN112435103B publication Critical patent/CN112435103B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an intelligent recommendation method and system for post-hoc diversity interpretation. Sample data containing commodity interpretation information is collected from users' historical transaction records to construct a sample set, and each piece of sample data in the set is preprocessed. A non-interpretable recommendation model is then taken as the black box model, and n interpretable algorithms are taken as baseline models of the black box model. The models are trained on the preprocessed sample data, and the best-matching baseline models are selected according to the resulting KL divergence values.

Description

Intelligent recommendation method and system for postmortem diversity interpretation
Technical Field
The invention relates to the field of computer technology, and in particular to an intelligent recommendation method and system for post-hoc diversity interpretation.
Background
In a recommendation system, the purpose of interpretable recommendation is to answer the user's question of why an item was recommended. Explaining the reason for a recommendation helps improve the transparency, persuasiveness, effectiveness, credibility and user satisfaction of the recommendation system. Early content-based recommendation can explain directly to the user why an item was chosen from among many candidates. Collaborative filtering derives its explanations from collective intelligence, which is less intuitive than the explanations of many content-based recommendation algorithms. Latent factor models (LFM), including matrix factorization methods based on collaborative filtering (CF), are poorly interpretable. In recent years deep learning has improved the performance of personalized recommendation, but the black box nature of deep models makes recommendation models harder to interpret. Researchers in artificial intelligence have also recognized the importance of explainable AI, which must be addressed in tasks such as deep learning, computer vision, autonomous driving and natural language processing.
Interpretability and effectiveness are often regarded as two conflicting goals in model design, so a trade-off must be made: one can choose a simple model for better interpretability, or a complex model for better accuracy at the cost of interpretability. Recent evidence, however, suggests that the two goals do not necessarily conflict when designing a recommendation model. For example, state-of-the-art techniques such as deep representation learning can help in designing recommendation models that are both effective and interpretable. Developing interpretable deep models is also an attractive direction in the broader AI field; it has driven progress not only in interpretable recommendation research but also in the fundamentals of interpretable machine learning.
Post-hoc interpretable recommendation seeks a method for interpreting a black box recommendation algorithm after deep learning training. By approximating the black box model with an interpretable model, interpretability can be improved while accuracy is maintained. Because the method explores the distribution of the black box model while keeping the original recommendation algorithm fixed, the accuracy is unchanged. Post-hoc interpretable recommendation has attracted the attention of researchers, and a series of algorithms has been developed, mainly interpretable recommendation based on knowledge graphs, on association rules, and on causal reasoning. However, each of these methods yields only a single form of interpretation and cannot provide multiple forms.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention provides an intelligent recommendation method for post-hoc diversity interpretation, comprising the following steps:
step 1: collecting sample data containing commodity explanation information according to a user historical transaction record, and constructing a sample set;
step 2: preprocessing each sample data in a sample set, including:
step 2.1: marking whether each piece of sample data has text comments, social relations and commodity characteristics;
step 2.2: normalizing the commodity rating value in each piece of sample data by formula (1), limiting the rating values to the range (0, 1),
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
wherein $x$ represents the commodity rating value in each piece of sample data, $x_{\max}$ represents the maximum of all rating values in the sample set, $x_{\min}$ represents the minimum of all rating values in the sample set, and $x^{*}$ represents the normalized commodity rating value;
step 3: taking a non-interpretable recommendation model as the black box model, and taking n interpretable algorithms as baseline models of the black box model;
step 4: training the black box model and the baseline models simultaneously on the preprocessed sample data, outputting the recommendation results, and generating the probability distributions of the corresponding model parameters;
step 5: screening out the baseline models whose recommendation results are consistent with that of the black box model, and taking them as candidate baseline models;
step 6: calculating, by KL divergence, the divergence value between the probability distribution generated by the black box model and the probability distribution generated by each candidate baseline model;
step 7: sorting all divergence values in ascending order, and selecting the baseline models corresponding to the first K divergence values as the best-matching baseline models of the black box model, wherein K is less than or equal to n;
step 8: generating K interpretable descriptions of the recommended commodity output by the black box model by using the K best-matching baseline models.
The KL divergence is expressed as:
$$\mathrm{KL}(P\,\|\,Q) = \sum_{i} P(i)\,\log\frac{P(i)}{Q(i)}$$
where $P$ is the probability distribution of the model parameters of the black box model, $Q$ is the probability distribution of the model parameters of the baseline model, and $\mathrm{KL}(P\,\|\,Q)$ is the KL divergence between the probability distributions $P$ and $Q$.
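As a minimal sketch of the KL divergence just defined, assuming the two distributions are discrete and given as equal-length lists of probabilities (the source does not state how the distributions are represented, and the function name and example numbers are hypothetical):

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Return KL(P || Q) = sum_i P(i) * log(P(i) / Q(i)), in nats."""
    if len(p) != len(q):
        raise ValueError("P and Q must be defined over the same outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if q_i == 0.0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


# Hypothetical distributions, for illustration only.
black_box = [0.5, 0.5]
baseline = [0.9, 0.1]
print(kl_divergence(black_box, black_box))  # identical distributions give 0.0
print(kl_divergence(black_box, baseline))
print(kl_divergence(baseline, black_box))   # differs: KL is not symmetric
```

Because the divergence is not symmetric, the order of the arguments matters; the sketch follows the text in placing the black box distribution first.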
A recommendation system for implementing the intelligent recommendation method for post-hoc diversity interpretation comprises a sample acquisition module, a sample data preprocessing module and an interpretable explanation output module;
the sample acquisition module is used for extracting sample data containing commodity interpretation information from a user historical transaction record;
the sample data preprocessing module is used for preprocessing each piece of sample data;
the interpretable explanation output module is used for constructing a baseline model of the black box model, training the black box model and the baseline model simultaneously, searching an optimal matching baseline model according to the KL divergence value, and outputting K interpretable explanations of recommended commodities output by the black box model by using the optimal matching baseline model.
The beneficial effects of the invention are as follows:
The invention provides an intelligent recommendation method and system for post-hoc diversity interpretation. A non-interpretable recommendation model is taken as the black box model, and algorithms with stronger interpretability are taken as baseline models of the black box model. The black box model and the baseline models are trained simultaneously on the constructed sample set, and the baseline models that best match the black box model are found by means of the KL divergence values.
Drawings
FIG. 1 is a flow chart of the intelligent recommendation method for post-hoc diversity interpretation of the invention.
FIG. 2 is a block diagram of the recommendation system of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific embodiments. Existing interpretable models fall mainly into two types: model-based interpretable recommendation algorithms and post-hoc interpretable recommendation algorithms. Model-based interpretable recommendation algorithms mostly need to introduce additional information, including text comments, commodity features and social information, and on that basis use deep learning to build the model and extract features. This type of approach has two major disadvantages: (1) the cost of obtaining the additional information is relatively high; (2) feature-based modeling jointly trains an interpretability objective and an accuracy objective, which usually conflict, so that a model with good accuracy has poor interpretability and a model with strong interpretability loses accuracy. Post-hoc interpretable recommendation seeks a method for interpreting a black box recommendation algorithm after deep learning training. By approximating the black box model with an interpretable model, interpretability can be improved while accuracy is maintained. Because the method explores the distribution of the black box model while keeping the original recommendation algorithm fixed, the accuracy is unchanged. Existing post-hoc methods include interpretable recommendation based on knowledge graphs, on association rules, and on causal reasoning, but each yields only a single form of interpretation. For example, association rules can only produce interpretations of the form "beer → fried chicken" and cannot produce other forms.
In view of this difficulty, the invention focuses on designing a post-hoc interpretable recommendation method and system that offers multiple forms of interpretation, and that can predict the distributions of a number of conventional black box methods and provide reasonable interpretations for them.
As shown in FIG. 1, an intelligent recommendation method for post-hoc diversity interpretation includes:
step 1: collecting the historical transaction records of users' movie purchases over a period of time from the MovieLens official website; for example, collecting the account IDs of registered users from the MovieLens official website, crawling each user's historical purchase records by user ID, and then collecting sample data containing commodity interpretation information from those records to construct a sample set;
step 2: preprocessing each sample data in a sample set, including:
step 2.1: marking and organizing whether each piece of sample data has text comments, social relations and commodity features; specifically, a text field is set to represent comment information, a friend relation matrix is constructed from the friends that registered users follow, and feature information, such as the director and lead actors, is added for each commodity;
step 2.2: normalizing the commodity rating value in each piece of sample data by formula (1), limiting the rating values to the range (0, 1),
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
wherein $x$ represents the commodity rating value in each piece of sample data, $x_{\max}$ represents the maximum of all rating values in the sample set, $x_{\min}$ represents the minimum of all rating values in the sample set, and $x^{*}$ represents the normalized commodity rating value;
step 3: taking a non-interpretable recommendation model as the black box model, and taking n interpretable algorithms as baseline models of the black box model;
Existing algorithms with strong interpretability are surveyed as candidate baseline models, for example: the classical feature-based explicit factor model (EFM) algorithm, which introduces commodity features into interpretable modeling; an interpretable social-network-based algorithm (NNF algorithm), which takes friend relations into account; and a knowledge-graph-based policy-guided path reasoning algorithm (KGPR algorithm), which presents path information collected on the social graph to the user as the interpretation.
In this embodiment, the algorithms with stronger interpretability selected as baseline models are the EFM algorithm, the NNF algorithm and the KGPR algorithm described above. The non-interpretable algorithms selected are the Bayesian personalized ranking algorithm (BPR algorithm) and the adversarial personalized ranking algorithm (APR algorithm).
An existing recommendation model without interpretability is selected as the black box model; that is, the internal structure of the black box recommendation model is not examined, and the model is only endowed with interpretability. The difference between the distributions of the black box model and each baseline model is measured by KL divergence, and the baseline models are fine-tuned to fit the black box model better.
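Before the models of step 3 are trained, the sample data passes through the preprocessing of steps 2.1 and 2.2. The following is a minimal sketch of that preprocessing. The `Sample` record, its field names and the example values are hypothetical, and the handling of the case where all ratings are equal is an assumption, since the source does not cover it. Note that min-max normalization maps the smallest rating to exactly 0 and the largest to exactly 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Sample:
    """One purchase record; all field names are hypothetical."""
    user_id: int
    item_id: int
    rating: float
    comment: Optional[str] = None
    friends: List[int] = field(default_factory=list)
    item_features: Dict[str, str] = field(default_factory=dict)


def mark_sample(s: Sample) -> Dict[str, bool]:
    """Step 2.1: flag which kinds of side information the sample carries."""
    return {
        "has_text_comment": bool(s.comment),
        "has_social_relation": bool(s.friends),
        "has_item_features": bool(s.item_features),
    }


def min_max_normalize(ratings: List[float]) -> List[float]:
    """Step 2.2, formula (1): x* = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(ratings), max(ratings)
    if x_max == x_min:
        # Assumption: the source does not cover this case; map everything to 0.
        return [0.0 for _ in ratings]
    return [(x - x_min) / (x_max - x_min) for x in ratings]


samples = [
    Sample(1, 10, 5.0, comment="great plot", item_features={"director": "A"}),
    Sample(1, 11, 3.0, friends=[2, 3]),
    Sample(2, 10, 1.0),
]
flags = [mark_sample(s) for s in samples]
normalized = min_max_normalize([s.rating for s in samples])
print(flags[0])
print(normalized)  # [1.0, 0.5, 0.0]
```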
Step 4: training the black box model and the baseline model simultaneously by using the preprocessed sample data, and generating probability distribution of corresponding model parameters through the black box model and the baseline model;
step 4.1: initializing parameter probability distributions of the black box model and the baseline model by using Gaussian distribution;
step 4.2: minimizing the loss function of the black box model and the loss function of the baseline model with the Adam gradient descent method until both losses have decreased and remain essentially stable;
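Steps 4.1 and 4.2 can be illustrated with a small pure-Python sketch of the Adam update rule. The toy quadratic loss, the target vector, the hyperparameters and the stability test (loss varying by less than `tol` over the last `window` steps) are all assumptions made for illustration; the source names neither the actual loss functions nor a stopping criterion, and in practice a deep learning toolkit's optimizer would be used.

```python
import math
import random
from collections import deque
from typing import Callable, List, Tuple


def adam_minimize(
    loss_fn: Callable[[List[float]], float],
    grad_fn: Callable[[List[float]], List[float]],
    params: List[float],
    lr: float = 0.05,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
    tol: float = 1e-8,
    window: int = 10,
    max_steps: int = 5000,
) -> Tuple[List[float], float, int]:
    """Run Adam until the loss varies by less than `tol` over `window` steps."""
    params = list(params)
    m = [0.0] * len(params)  # first-moment estimates
    v = [0.0] * len(params)  # second-moment estimates
    recent = deque(maxlen=window)
    step = 0
    for step in range(1, max_steps + 1):
        grads = grad_fn(params)
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
            m_hat = m[i] / (1.0 - beta1 ** step)  # bias correction
            v_hat = v[i] / (1.0 - beta2 ** step)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        recent.append(loss_fn(params))
        if len(recent) == window and max(recent) - min(recent) < tol:
            break  # loss is essentially stable
    return params, recent[-1], step


# Toy stand-in loss (an assumption): squared distance to a fixed target.
target = [2.0, -1.0, 0.5]
loss = lambda p: sum((a - b) ** 2 for a, b in zip(p, target))
grad = lambda p: [2.0 * (a - b) for a, b in zip(p, target)]

rng = random.Random(0)
init = [rng.gauss(0.0, 0.1) for _ in target]  # step 4.1: Gaussian initialization
fitted, final_loss, steps = adam_minimize(loss, grad, init)
print(final_loss < loss(init), steps)
```

In the method itself the same loop would be run for the black box model and for each baseline model, each with its own loss function.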
The baseline model results are then filtered: KL divergence values are retained only for the baseline models whose recommendation results are consistent with that of the black box model. These values are ranked, with lower divergence ranked higher and higher divergence ranked lower, and the top-K interpretable forms are selected and shown to the user. In this way K forms of interpretation can be offered to the user, addressing the user's points of interest.
Step 5: screening out a baseline model consistent with the black box model recommendation result, and taking the baseline model as a candidate baseline model;
step 6: calculating a divergence value between probability distribution generated by the black box model and probability distribution generated by each candidate baseline model by utilizing KL divergence; the smaller the divergence value, the more similar the two distributions are;
the KL divergence (also known as relative entropy) is expressed as:
$$\mathrm{KL}(P\,\|\,Q) = \sum_{i} P(i)\,\log\frac{P(i)}{Q(i)}$$
where $P$ is the probability distribution of the model parameters of the black box model, $Q$ is the probability distribution of the model parameters of the baseline model, and $\mathrm{KL}(P\,\|\,Q)$ is the KL divergence between the probability distributions $P$ and $Q$.
Step 7: sorting all divergence values in ascending order, and selecting the baseline models corresponding to the first K divergence values as the best-matching baseline models of the black box model, wherein K is less than or equal to n;
step 8: generating K interpretable descriptions of the recommended commodity output by the black box model by using the K best-matching baseline models.
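Steps 5 to 7 amount to a filter followed by a sort. The sketch below shows one way this could look, under several assumptions: "consistent" is taken to mean that a baseline recommends exactly the same items as the black box, the distributions are discrete with strictly positive entries, and the item IDs and probabilities are made up. The baseline names EFM, NNF and KGPR come from the text; everything else is hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(P || Q) for discrete distributions with strictly positive entries."""
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q))


def top_k_baselines(
    black_box_items: List[int],
    black_box_dist: Sequence[float],
    baselines: Dict[str, Tuple[List[int], Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    # Step 5: keep baselines whose recommendation matches the black box.
    candidates = {
        name: dist
        for name, (items, dist) in baselines.items()
        if items == black_box_items
    }
    # Step 6: divergence between the black box and each candidate.
    scored = [(name, kl_divergence(black_box_dist, dist))
              for name, dist in candidates.items()]
    # Step 7: ascending order, then the first K entries.
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Hypothetical outputs; the numbers are made up for illustration.
bb_items, bb_dist = [42], [0.6, 0.3, 0.1]
baselines = {
    "EFM": ([42], [0.5, 0.3, 0.2]),
    "NNF": ([42], [0.6, 0.3, 0.1]),
    "KGPR": ([7], [0.6, 0.3, 0.1]),  # different recommendation, filtered out
}
best = top_k_baselines(bb_items, bb_dist, baselines, k=2)
print(best)
```

Each of the K selected baselines would then supply its own form of explanation for the black box recommendation, as in step 8.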
A recommendation system adopting the intelligent recommendation method for post-hoc diversity interpretation can be implemented in the Python programming language on the PyCharm development platform, with a deep learning toolkit used in the training stage. As shown in FIG. 2, the system comprises a sample acquisition module, a sample data preprocessing module and an interpretable explanation output module;
the sample acquisition module is used for extracting sample data containing commodity interpretation information from a user historical transaction record;
the sample data preprocessing module is used for preprocessing each piece of sample data;
the interpretable explanation output module is used for constructing a baseline model of the black box model, training the black box model and the baseline model simultaneously, searching an optimal matching baseline model according to the KL divergence value, and outputting K interpretable explanations of recommended commodities output by the black box model by using the optimal matching baseline model.
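Since the text mentions a Python implementation, the data flow between the three modules might be wired together roughly as follows. This is only a structural sketch: the class name, the callable signatures and the placeholder modules are hypothetical, and the real modules would contain the crawling, preprocessing, training and KL-based selection logic described above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RecommendationSystem:
    """Wires the three modules together; every signature here is hypothetical."""
    acquire: Callable[[List[dict]], List[dict]]      # sample acquisition module
    preprocess: Callable[[List[dict]], List[dict]]   # sample data preprocessing module
    explain: Callable[[List[dict], int], List[str]]  # interpretable explanation output module

    def run(self, transaction_records: List[dict], k: int) -> List[str]:
        samples = self.acquire(transaction_records)
        prepared = self.preprocess(samples)
        return self.explain(prepared, k)


# Placeholder modules that only demonstrate the data flow.
system = RecommendationSystem(
    acquire=lambda records: [r for r in records if r.get("explanation_info")],
    preprocess=lambda samples: [dict(s, preprocessed=True) for s in samples],
    explain=lambda samples, k: [f"explanation {i + 1}" for i in range(k)],
)
records = [{"item": 1, "explanation_info": "comment"}, {"item": 2}]
print(system.run(records, k=2))  # ['explanation 1', 'explanation 2']
```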

Claims (3)

1. An intelligent recommendation method for post-hoc diversity interpretation, characterized by comprising the following steps:
step 1: collecting sample data containing commodity explanation information according to a user historical transaction record, and constructing a sample set;
step 2: preprocessing each sample data in a sample set, including:
step 2.1: marking whether each piece of sample data has text comments, social relations and commodity characteristics;
step 2.2: normalizing the commodity rating value in each piece of sample data by formula (1), limiting the rating values to the range (0, 1),
$$x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1}$$
wherein $x$ represents the commodity rating value in each piece of sample data, $x_{\max}$ represents the maximum of all rating values in the sample set, $x_{\min}$ represents the minimum of all rating values in the sample set, and $x^{*}$ represents the normalized commodity rating value;
step 3: taking a non-interpretable recommendation model as the black box model, and taking n interpretable algorithms as baseline models of the black box model;
step 4: training the black box model and the baseline models simultaneously on the preprocessed sample data, outputting the recommendation results, and generating the probability distributions of the corresponding model parameters;
step 5: screening out the baseline models whose recommendation results are consistent with that of the black box model, and taking them as candidate baseline models;
step 6: calculating, by KL divergence, the divergence value between the probability distribution generated by the black box model and the probability distribution generated by each candidate baseline model;
step 7: sorting all divergence values in ascending order, and selecting the baseline models corresponding to the first K divergence values as the best-matching baseline models of the black box model, wherein K is less than or equal to n;
step 8: generating K interpretable descriptions of the recommended commodity output by the black box model by using the K best-matching baseline models.
2. The intelligent recommendation method for post-hoc diversity interpretation according to claim 1, wherein the KL divergence is expressed as:
$$\mathrm{KL}(P\,\|\,Q) = \sum_{i} P(i)\,\log\frac{P(i)}{Q(i)}$$
where $P$ is the probability distribution of the model parameters of the black box model, $Q$ is the probability distribution of the model parameters of the baseline model, and $\mathrm{KL}(P\,\|\,Q)$ is the KL divergence between the probability distributions $P$ and $Q$.
3. A recommendation system for implementing the intelligent recommendation method for post-hoc diversity interpretation, characterized by comprising a sample acquisition module, a sample data preprocessing module and an interpretable explanation output module;
the sample acquisition module is used for extracting sample data containing commodity interpretation information from a user historical transaction record;
the sample data preprocessing module is used for preprocessing each piece of sample data;
the interpretable explanation output module is used for constructing a baseline model of the black box model, training the black box model and the baseline model simultaneously, searching an optimal matching baseline model according to the KL divergence value, and outputting K interpretable explanations of recommended commodities output by the black box model by using the optimal matching baseline model.
CN202011507787.2A 2020-12-18 2020-12-18 Intelligent recommendation method and system for postmortem diversity interpretation Active CN112435103B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011507787.2A CN112435103B (en) 2020-12-18 2020-12-18 Intelligent recommendation method and system for postmortem diversity interpretation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011507787.2A CN112435103B (en) 2020-12-18 2020-12-18 Intelligent recommendation method and system for postmortem diversity interpretation

Publications (2)

Publication Number Publication Date
CN112435103A CN112435103A (en) 2021-03-02
CN112435103B true CN112435103B (en) 2023-11-24

Family

ID=74696799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011507787.2A Active CN112435103B (en) 2020-12-18 2020-12-18 Intelligent recommendation method and system for postmortem diversity interpretation

Country Status (1)

Country Link
CN (1) CN112435103B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113486242A (en) * 2021-07-13 2021-10-08 同济大学 Non-invasive personalized interpretation method and system based on recommendation system
CN114968788B (en) * 2022-05-27 2024-08-06 浙江大学 Automatic evaluation method, device, medium and equipment for programming capability of artificial intelligent algorithm

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108550046A (en) * 2018-03-07 2018-09-18 阿里巴巴集团控股有限公司 A kind of resource and market recommendation method, apparatus and electronic equipment
CN111222332A (en) * 2020-01-06 2020-06-02 华南理工大学 Commodity recommendation method combining attention network and user emotion
CN111259238A (en) * 2020-01-13 2020-06-09 山西大学 Post-interpretable recommendation method and device based on matrix decomposition

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
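基于深度学习的推荐系统研究综述 [A survey of deep-learning-based recommender systems]; 黄立威; 江碧涛; 吕守业; 刘艳博; 李德毅; 计算机学报 [Chinese Journal of Computers] (07); 191-219 *
机器学习隐私保护研究综述 [A survey of privacy protection in machine learning]; 谭作文; 张连福; 软件学报 [Journal of Software] (07); 201-230 *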

Also Published As

Publication number Publication date
CN112435103A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
US9710760B2 (en) Multi-facet classification scheme for cataloging of information artifacts
Pal et al. Pattern recognition algorithms for data mining
AU2011269676B2 (en) Systems of computerized agents and user-directed semantic networking
Wong et al. Pattern discovery: a data driven approach to decision support
El Morr et al. Descriptive, predictive, and prescriptive analytics
Akerkar et al. Intelligent techniques for data science
CN103649905A (en) Method and system for unified information representation and applications thereof
CN112435103B (en) Intelligent recommendation method and system for postmortem diversity interpretation
Akerkar Advanced data analytics for business
Kumar et al. Sentic computing for aspect-based opinion summarization using multi-head attention with feature pooled pointer generator network
Ahmed et al. Pattern Recognition: An Introduction
Ma et al. A visual analytical approach for transfer learning in classification
Chakraborty et al. Data classification and incremental clustering in data mining and machine learning
Matthews et al. The introduction of a design heuristics extraction method
CN108363759A (en) Subject tree generation method and system based on structural data and Intelligent dialogue method
Jones et al. The Unsupervised Learning Workshop: Get started with unsupervised learning algorithms and simplify your unorganized data to help make future predictions
Sivaselvan Data mining: Techniques and trends
Liu [Retracted] Art Painting Image Classification Based on Neural Network
Stefanovič et al. Travel direction recommendation˙ model based on photos of user social network profile
Trianasari et al. Analysis Of Product Recommendation Models at Each Fixed Broadband Sales Location Using K-Means, DBSCAN, Hierarchical Clustering, SVM, RF, and ANN
Miksatko et al. What’s in a cluster? automatically detecting interesting interactions in student e-discussions
Neau et al. In defense of scene graph generation for human-robot open-ended interaction in service robotics
Farhadloo Statistical Methods for Aspect Level Sentiment Analysis
TU Online Text Retrieval Method Based on Convolution Neural Network.
US12008409B1 (en) Apparatus and a method for determining resource distribution

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant