WO2021066903A1 - Providing explainable product recommendation in a session

Providing explainable product recommendation in a session

Info

Publication number
WO2021066903A1
Authority
WO
WIPO (PCT)
Prior art keywords
product
question
candidate
answer
recommendation
Prior art date
Application number
PCT/US2020/038298
Other languages
French (fr)
Inventor
Xianchao WU
Original Assignee
Microsoft Technology Licensing, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Technology Licensing, Llc filed Critical Microsoft Technology Licensing, Llc
Publication of WO2021066903A1 publication Critical patent/WO2021066903A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems

Definitions

  • Chatbots are becoming increasingly popular and are being applied in more and more scenarios. Chatbots are designed to simulate human utterances and may chat with users through text, voice, images, etc. Generally, a chatbot may identify language content in a message input by a user or apply natural language processing to the message, and then provide the user with a response to the message.
  • At least one question associated with product recommendation may be provided.
  • An answer to the at least one question may be received. It may be determined whether there exists at least one recommended product based at least on the at least one question and the answer.
  • a recommendation reason of the at least one recommended product may be generated.
  • a response including product information and the recommendation reason of the at least one recommended product may be provided.
  • FIG. 1 illustrates an exemplary network architecture in which a chatbot is deployed according to an embodiment.
  • FIG. 2 illustrates an exemplary chatbot system according to an embodiment.
  • FIG. 3 illustrates mapping between a candidate product set and a candidate question set according to an embodiment.
  • FIG. 4 illustrates an exemplary overall process for providing explainable product recommendation according to an embodiment.
  • FIG. 5 illustrates an exemplary specific process for providing explainable product recommendation according to an embodiment.
  • FIG. 6 illustrates an exemplary process for training a recommendation reason generating model according to an embodiment.
  • FIG. 7 illustrates an exemplary process for generating a recommendation reason according to an embodiment.
  • FIG. 8 to FIG. 12 illustrate exemplary chat windows according to the embodiments.
  • FIG. 13 illustrates a flowchart of an exemplary method for providing explainable product recommendation in a session according to an embodiment.
  • FIG. 14 illustrates an exemplary apparatus for providing explainable product recommendation in a session according to an embodiment.
  • FIG. 15 illustrates an exemplary apparatus for providing explainable product recommendation in a session according to an embodiment.
  • a chatbot may chat automatically in a session with a user.
  • "Session" may refer to a time-continuous conversation between two chat participants, and may comprise messages and responses in the conversation.
  • Message may refer to any information input by the user, e.g., a query from the user, an answer of the user to the chatbot's question, an opinion by the user, etc.
  • the term “message” and the term “query” may also be used interchangeably.
  • “Response” may refer to any information provided by the chatbot, e.g., an answer of the chatbot to the user's question, a comment by the chatbot, a question proposed by the chatbot, etc.
  • a chatbot may provide product recommendations to a user in a session with the user.
  • Products may comprise goods, services, etc.
  • Providing product recommendations by a chatbot faces many challenges.
  • a large-scale labeled corpus needs to be prepared for training a machine learning model in order to capture a user's intention or demand expressed in natural language.
  • the intention or demand of the user indicates the user's preference for product attributes of recommended products.
  • the product attributes may comprise various parameters, configurations, characteristics, etc. of the products.
  • There is information asymmetry between the products that the chatbot can provide and the requirements of the user. For example, the user does not know what products the chatbot can provide, and does not know how to find desired products.
  • The user needs to include keywords describing product attributes in a message sent to the chatbot; however, the chatbot may be unable to use these keywords to efficiently find recommended products, or no product corresponding to these keywords may exist at all.
  • Since the user may chat with the chatbot in an inefficient and low-information manner, it will take a long time for the chatbot to gradually collect the user's requirements.
  • Embodiments of the present disclosure propose to provide explainable product recommendation in an efficient and accurate approach during a session between a chatbot and a user.
  • the chatbot may provide explainable product recommendation based on a learning-to-explain (LTE) architecture proposed by the embodiments of the present disclosure.
  • the LTE architecture may dynamically provide a series of questions associated with product recommendation and collect the user's answers to these questions in multiple rounds of session with the user, and may learn new knowledge from the user's answers in order to determine a recommended product and give a recommendation reason for explaining why the product is recommended.
  • the LTE architecture may screen out the recommended product from many candidate products through a relatively short session.
  • Questions provided to the user may direct to various product attributes.
  • options indicating different product attributes may be added to a question so that the user may directly select a desired option in an answer.
  • the natural language sentence may be parsed to identify product attributes desired by the user.
  • the LTE architecture may perform product ranking on a plurality of candidate products. Those candidate products having attributes selected by the user will be ranked higher. Through the product ranking, each candidate product will have a corresponding expected probability, which indicates a likelihood that this candidate product is desired by the user after this round of session. Expected probabilities of the candidate products may be calculated for each round of session. Therefore, as the session proceeds, the expected probability of each candidate product is continuously updated.
  • a weight may be calculated for each candidate product in the product ranking, which may be mapped to an expected probability. Since there is a specific mapping relationship between a weight and an expected probability of a candidate product, they may be used interchangeably herein, or collectively referred to as a probability weight.
  • the LTE architecture may further perform question ranking on a plurality of candidate questions.
  • the predetermined condition may comprise, e.g., an expected probability of a candidate product being above a threshold, etc.
  • These candidate questions may be directly or indirectly predetermined based on attributes of the plurality of candidate products. For example, the plurality of candidate questions may be ranked by considering previously-provided questions and the user's answers, product ranking of the candidate products, etc.
  • the LTE architecture may perform the question ranking through entropy-based ranking and/or policy-based reinforcement learning ranking. Through the question ranking, each candidate question will have a corresponding information gain.
  • a candidate question having, e.g., the maximum information gain may be selected from the ranked candidate questions as a next question to be provided to the user.
  • the maximum information gain may indicate that the candidate question has the highest discrimination, so that the range of candidate products to be considered later can be narrowed as quickly as possible.
  • a weight may also be calculated for each candidate question in the question ranking, which may be mapped to information gain. Since there is a specific mapping relationship between a weight and information gain of a candidate question, they may be used interchangeably herein.
  • the LTE architecture may determine the candidate product as a recommended product and provide the user with product information about the recommended product.
  • the product information may comprise any information about the product, e.g., name, number, price, etc.
  • a recommendation reason for the recommended product may be determined by the LTE architecture.
  • the LTE architecture may select, from questions previously provided to the user, at least one question that contributes the most and corresponding answer of the user, for generating the recommendation reason.
  • a question resulting in the maximum expected probability rise for the recommended product may be selected from historical questions provided to the user, and the recommendation reason may be generated with reference to the selected question and an answer of the user.
  • the LTE architecture may generate the recommendation reason through a pre-trained recommendation reason generating model.
  • the embodiments of the present disclosure may use a product attribute list of candidate products to generate questions in advance, and may directly identify attributes selected by the user from the user's answer, or identify attributes selected by the user by comparing the user's answer with the product attribute list. Therefore, the embodiments of the present disclosure do not require a large-scale labeled corpus for capturing the user’s intentions or requirements.
  • the embodiments of the present disclosure may provide the user with a question attached with options, and the user may answer the question by simply selecting an option. Thus, the chatbot may effectively guide the session to avoid problems caused by information asymmetry between available products and user requirements, and may improve the efficiency of product recommendation.
  • the LTE architecture according to the embodiments of the present disclosure may learn new knowledge in the session in real time and guide the next round of session accordingly, so that the user's intentions or requirements may be accurately understood and the user's participation, interest, etc. may be effectively improved.
  • the embodiments of the present disclosure may provide the user with a recommendation reason of a recommended product, thereby enhancing the user's attention and trust in the recommended product, and further increasing the likelihood that the user will subscribe to the recommended product.
  • the explainable product recommendation according to the embodiments of the present disclosure may be applied in various scenarios.
  • the embodiments of the present disclosure may be used for goods recommendation, e.g., gift recommendation, etc.
  • the embodiments of the present disclosure may be used for service recommendation, e.g., task-oriented reservation recommendation of hotel and restaurant, etc.
  • FIG. 1 illustrates an exemplary network architecture 100 in which a chatbot is deployed according to an embodiment.
  • a network 110 is applied for interconnecting between a terminal device 120 and a chatbot server 130.
  • the network 110 may be any type of network capable of interconnecting network entities.
  • the network 110 may be a single network or a combination of various types of network.
  • the network 110 may be a local area network (LAN), a wide area network (WAN), etc.
  • the network 110 may be a wired network, a wireless network, etc.
  • the network 110 may be a circuit switched network, a packet switched network, etc.
  • the terminal device 120 may be any type of electronic computing device capable of connecting to the network 110, accessing a server or website on the network 110, processing data or signals, etc.
  • the terminal device 120 may be a desktop computer, a notebook computer, a tablet computer, a smart phone, an AI terminal, etc. Although only one terminal device is shown in FIG. 1, it should be appreciated that a different number of terminal devices may be connected to the network 110.
  • the terminal device 120 may be used by a user.
  • the terminal device 120 may comprise a chatbot client 122 which may provide an automated chatting service to the user.
  • the chatbot client 122 may interact with the chatbot server 130.
  • the chatbot client 122 may send a message input by the user to the chatbot server 130 and receive a response associated with the message from the chatbot server 130.
  • the chatbot client 122 may also generate a response to a user-input message locally, rather than interacting with the chatbot server 130.
  • the chatbot server 130 may be connected to or comprise a chatbot database 140.
  • the chatbot database 140 may comprise information that may be used by the chatbot server 130 to generate responses.
  • the chatbot server 130 may also be connected to a product database 150.
  • the product database 150 may comprise various product information about a plurality of candidate products, e.g., product names, product attributes, etc.
  • the product information may be, e.g., provided in advance by product providers or crawled from the network.
  • Although the product database 150 is shown as independent from the chatbot database 140, the product database 150 may also be included in the chatbot database 140. When one or more candidate products are determined as recommended products, product information associated with these recommended products may be provided to the user.
  • FIG. 2 illustrates an exemplary chatbot system 200 according to an embodiment.
  • the chatbot system 200 may comprise a user interface (UI) 210 for presenting a chat window.
  • the chat window may be used by a chatbot to interact with a user.
  • the chatbot system 200 may comprise a core processing module 220.
  • the core processing module 220 is configured for providing processing capabilities during operation of the chatbot through cooperation with other modules in the chatbot system 200.
  • the core processing module 220 may obtain messages input by the user in the chat window, which are stored in a message queue 232.
  • the messages may adopt various multimedia forms, e.g., text, speech, image, video, etc.
  • the core processing module 220 may process the messages in the message queue 232 in a first-in-first-out manner.
  • the core processing module 220 may invoke processing units in an Application Programming Interface (API) module 240 to process messages in various forms.
  • the API module 240 may comprise a text processing unit 242, a speech processing unit 244, an image processing unit 246, etc.
  • the text processing unit 242 may perform text understanding on the text message, and the core processing module 220 may further determine a text response.
  • the speech processing unit 244 may perform speech-to-text conversion on the speech message to obtain a text sentence, the text processing unit 242 may perform text understanding on the obtained text sentence, and the core processing module 220 may further determine a text response. If it is determined to provide responses by speech, the speech processing unit 244 may perform text-to-speech conversion on the text response to generate a corresponding voice response.
  • the image processing unit 246 may perform image recognition on the image message to generate a corresponding text, and the core processing module 220 may further determine a text response. In some cases, the image processing unit 246 may also be used for obtaining an image response based on the text response.
  • the API module 240 may also comprise any other processing units.
  • the API module 240 may comprise a video processing unit which cooperates with the core processing module 220 to process video messages and determine responses.
  • the core processing module 220 may determine responses through database 250.
  • the database 250 may comprise various information accessible by the core processing module 220 for determining responses.
  • the database 250 may comprise a pure chat index set 251.
  • the pure chat index set 251 may comprise index entries that are prepared for free chat between the chatbot and the user, and may be established with data from, e.g., social networks.
  • the database 250 may comprise at least one candidate product set 252.
  • the candidate product set 252 may comprise a list of candidate products, attributes of each candidate product, etc.
  • a candidate product set may be established for each product category, which comprises multiple candidate products that belong to the category.
  • Product categories may be set based on different levels or criteria, and a candidate product may be classified to one or more corresponding categories.
  • the "gift” category may comprise candidate products such as chocolate, jewelry, flowers, crafts, etc.
  • the "electronic goods” category may comprise candidate products such as mobile phones, computers, televisions, earphones, etc., and so on.
  • the candidate products in the candidate product set 252 may also comprise, e.g., candidate products of a specific brand and a specific model, e.g., a mobile phone Z of the brand XX and the model YY, a hotel K, etc.
  • its attributes may comprise, e.g., 4G+5G mobile network, 6-inch screen, fingerprint recognition function, etc.
  • Various information included in the candidate product set 252 may be e.g., provided in advance by product providers or crawled from the network.
  • the database 250 may comprise a candidate question set 253.
  • the candidate question set 253 may comprise a candidate question list determined based at least on attributes of the candidate products in the candidate product set 252.
  • Each candidate question may direct to one or more product attributes, so that when a user provides an answer to the candidate question, attributes desired by the user may be determined based on the answer, and candidate products having the desired attributes may be identified.
  • each candidate question may also be attached with one or more options that correspond to product attributes to which the candidate question directs.
  • the candidate product set 252 and the candidate question set 253 may be linked together via product attributes.
  • FIG. 3 illustrates mapping 300 between a candidate product set and a candidate question set according to an embodiment.
  • a candidate product set $P_i$ comprises a plurality of candidate products $\{p_1, p_2, \ldots, p_M\}$, wherein $i$ represents a category of these candidate products, and $M$ is the number of the candidate products.
  • the candidate product set $P_i$ also comprises a plurality of attributes of each candidate product.
  • For example, attributes of the candidate product $p_m$ comprise $\{a_{m,1}, a_{m,2}, \ldots, a_{m,N}\}$, wherein $N$ is the number of the attributes.
  • a candidate question set $Q_i$ comprises a plurality of candidate questions $\{q_1, q_2, \ldots, q_N\}$, wherein $i$ represents a product category to which the candidate question set directs, and $N$ is the number of the candidate questions.
  • the candidate question set may also comprise reference answers by different candidate products for each candidate question. For example, for the candidate question $q_n$, there may be a reference answer $r_{1,n}$ by the candidate product $p_1$, a reference answer $r_{2,n}$ by the candidate product $p_2$, etc. Therefore, a matrix $R$ composed of the reference answers of the candidate products, as shown in FIG. 3, may be formed.
  • the reference answer matrix $R$ links the candidate product set and the candidate question set together.
  • Each element in $R$ may be represented as $r_{m,n}$, wherein $m$ is a candidate product index and $n$ is a candidate question index. It should be appreciated that although the candidate question set is shown in FIG. 3 as having $N$ candidate questions, the candidate question set may also have any number of candidate questions. Although FIG. 3 shows that each candidate product has $N$ attributes, different candidate products may also have different numbers of attributes.
  • a candidate product 1 is an earphone X which comprises an attribute "in-ear type”
  • a candidate product 2 is an earphone Y which comprises an attribute "over-ear type”
  • a candidate question "Do you prefer an in- ear type earphone or an over-ear type earphone?" may be constructed.
  • a reference answer by the candidate product 1 is "in-ear type”
  • a reference answer by the candidate product 2 is "over-ear type”. If the user's answer is "in-ear type”, it may be determined that the candidate product 1 has the attribute selected by the user, and accordingly, the candidate product 1 may have a higher ranking than the candidate product 2.
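  • As a minimal illustration of how the reference answer matrix may drive product ranking, the following Python sketch encodes the earphone example above; the uniform prior, the multiplicative update, and the 0.01 smoothing value are assumptions for illustration (the full update described later also mixes in historical selection frequencies):

```python
# Sketch: a reference answer matrix linking candidate products to candidate
# questions, and re-ranking products after the user answers one question.

candidate_products = ["earphone X", "earphone Y"]

# ref_answers[m][n] = reference answer of product m to question n
ref_answers = [
    ["in-ear type"],    # earphone X
    ["over-ear type"],  # earphone Y
]

# Start from uniform prior weights over the candidate products.
weights = [1.0 / len(candidate_products)] * len(candidate_products)

def update_weights(weights, question_idx, user_answer, smoothing=0.01):
    """Boost products whose reference answer matches the user's answer.

    A mismatched product keeps a small residual weight (the smoothing value)
    rather than being discarded outright, mirroring the text's suggestion of
    returning 0.01 instead of 0 from the indicator function.
    """
    new = [
        w * (1.0 if ref_answers[m][question_idx] == user_answer else smoothing)
        for m, w in enumerate(weights)
    ]
    total = sum(new)
    return [w / total for w in new]  # renormalize to a distribution

weights = update_weights(weights, 0, "in-ear type")
print(dict(zip(candidate_products, weights)))  # earphone X now ranks higher
```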
  • the database 250 may comprise a session record 254.
  • the session record 254 may comprise historical questions associated with product recommendation provided by the chatbot and corresponding historical answers from the user in the session between the chatbot and the user.
  • the database 250 may comprise a candidate product assessing state 255.
  • the candidate product assessing state 255 may comprise an expected probability or weight assessed for each candidate product after each round of session.
  • the chatbot system 200 may comprise a module set 260, which is a set of functional modules that may be operated by the core processing module 220 for generating or obtaining responses.
  • the module set 260 may comprise a product ranking module 261. Each time an answer to the current question is received from the user, the product ranking module 261 may recalculate an expected probability or weight of each candidate product, and rank candidate products accordingly. The calculated expected probabilities or weights may be used for updating the candidate product assessing state. When one or more candidate products meet with a predetermined condition, the one or more candidate products may be determined as recommended products to be provided to the user.
  • the module set 260 may comprise a question ranking module 262. Before providing the next question, the question ranking module 262 may calculate a weight of each candidate question and rank the candidate questions accordingly. Question ranking may be based at least on the results of the product ranking. For example, the question ranking may consider the current expected probability or weight of each candidate product included in the current candidate product assessing state. Moreover, the question ranking may also consider the session record, etc. The top-ranked candidate question may be selected, or a candidate question may be randomly selected from the multiple top-ranked candidate questions, as the next question to be provided to the user.
  • product ranking module 261 and the question ranking module 262 are shown as separate modules, these two modules may also be combined together so that both the product ranking and the question ranking may be implemented through performing a unified process.
  • the module set 260 may comprise a recommendation reason generating module 263.
  • the recommendation reason generating module 263 may generate a recommendation reason for the determined recommended product through various approaches. In one approach, the recommendation reason generating module 263 may select, from the previously-provided questions, a question resulting in the maximum expected probability rise for the recommended product, and generate the recommendation reason with reference to the selected question and an answer of the user. In another approach, the recommendation reason generating module 263 may generate the recommendation reason through a pre-trained recommendation reason generating model.
  • the module set 260 may comprise a sentence parsing module 264. When the user answers a question or sends a message in a natural language sentence, the sentence parsing module 264 may parse the natural language sentence in order to identify product attributes desired by the user.
  • the module set 260 may comprise a response providing module 265.
  • the response providing module 265 may be configured for providing or delivering a response to a message of the user.
  • the response provided by the response providing module 265 may comprise product information, a recommendation reason, etc. for the determined recommended product.
  • the core processing module 220 may provide the determined response to a response queue or response cache 234.
  • the response cache 234 may ensure that the response sequence is displayed with appropriate timing. Assuming that for a message, two or more responses are determined by the core processing module 220, a time delay setting for the responses may be necessary. For example, if a message input by the user is "Did you have breakfast?", two responses may be determined, e.g., a first response "Yes, I ate bread" and a second response "What about you? Are you still hungry?". In this case, through the response cache 234, the chatbot may ensure that the first response is provided to the user immediately.
  • the chatbot may ensure that the second response is provided with a time delay of, e.g., 1 or 2 seconds, so that the second response will be provided to the user 1 or 2 seconds after the first response.
  • the response cache 234 may manage responses to be sent and appropriate timing for each response.
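  • A minimal sketch of such delayed delivery, assuming an asyncio-based delivery loop (the delay value and the print stand-in for the UI are illustrative):

```python
import asyncio

async def deliver(responses, delay_seconds=1.5):
    """Send the first response immediately and later ones with a delay."""
    for i, response in enumerate(responses):
        if i > 0:
            await asyncio.sleep(delay_seconds)  # space out follow-up responses
        print(f"chatbot> {response}")  # stand-in for pushing to the chat UI

asyncio.run(deliver(["Yes, I ate bread.", "What about you? Are you still hungry?"]))
```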
  • the responses in the response queue or response cache 234 may be further transmitted to the UI 210 so that the responses may be displayed to the user in the chat window.
  • FIG. 4 illustrates an exemplary overall process 400 for providing explainable product recommendation according to an embodiment.
  • a message from a user may be received.
  • the message may indicate the user's intention to obtain product recommendation.
  • the message may be "Recommend a gift for me.”
  • a product category to which a product to be recommended belongs may be determined based at least on the message received at 410. For example, when the received message is "Recommend a gift for me”, it may be determined that the product category is "gift”. For example, when the received message is "Recommend an electronic goods for me”, it may be determined that the product category is "electronic goods”.
  • multiple rounds of session with the user may be conducted based on the LTE architecture proposed by the embodiments of the present disclosure.
  • questions associated with product recommendation under the determined product category may be dynamically provided, the user's answers to these questions may be collected, and a recommended product and a recommendation reason may be determined. This will be discussed in detail later in connection with FIG. 5.
  • product information and the recommendation reason of the recommended product may be provided to the user as a response.
  • FIG. 5 illustrates an exemplary specific process 500 for providing explainable product recommendation according to an embodiment.
  • the process 500 shows an exemplary operating process of the LTE architecture according to the embodiments of the present disclosure.
  • a user's intention to obtain product recommendation has been determined, and a product category has been determined. Therefore, the process 500 will be performed for providing product recommendation under this product category. It should be appreciated that the process 500 may be performed iteratively until a recommended product is determined.
  • a question associated with product recommendation may be provided to the user.
  • the question may direct to product attributes.
  • multiple options may be provided to the user along with the question so that the user may select an answer from these options.
  • an answer to the provided question may be received from the user.
  • the answer of the user may be a direct selection of one or more options in the attached options. Therefore, the selection made by the user to these options may be determined at 506 in order to determine product attributes selected by the user. For example, if three options are attached to the question, and the user's answer is the index “2” of the second option or comprises an expression associated with content of the second option, it may be determined that a product attribute indicated by the second option is desired by the user.
  • the user's answer may be in the form of natural language sentence. Parsing may be performed to the natural language sentence at 508.
  • the parsing may adopt any existing intent-slot parsing techniques to detect slots and corresponding values from the natural language sentence. These values may be used as keywords for retrieving relevant questions from a candidate question set and corresponding answers.
  • the retrieved relevant questions may comprise the question provided to the user and other questions. For example, assuming that the question provided to the user is "Do you like to stay in a quiet place?", and the user's answer is a natural language sentence "I like quiet, but I also like running", at least the keywords "quiet” and “running” may be detected from the natural language sentence. The “quiet” may be an answer to a relevant question "Do you like to stay in a quiet place?", and the "running” may be a corresponding answer to another relevant question "What are your hobbies?". Thus, at least two answers by the user to two relevant questions are obtained from the natural language sentence.
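  • A rough sketch of recovering <question, answer> pairs from one natural language sentence; the keyword table below is an invented stand-in for a real intent-slot parser:

```python
# Map detected keywords (slot values) back to relevant candidate questions.
keyword_to_question = {
    "quiet":   "Do you like to stay in a quiet place?",
    "running": "What are your hobbies?",
}

def extract_qa_pairs(utterance):
    """Return <question, answer> pairs recovered from a user sentence."""
    pairs = []
    for keyword, question in keyword_to_question.items():
        if keyword in utterance.lower():
            pairs.append((question, keyword))
    return pairs

print(extract_qa_pairs("I like quiet, but I also like running"))
# -> both relevant questions receive an answer from one sentence
```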
  • a <question, answer> pair may be added to a session record 512.
  • the question may be the provided question or relevant questions, and the answer may be content of options selected by the user or corresponding answers to the relevant questions. Questions and answers included in the session record 512 may be continuously updated as the session proceeds.
  • product ranking may be performed on a plurality of candidate products 516 based at least on the provided question and the user's answer.
  • an expected probability of each candidate product may be updated, and the calculated expected probability may be included in a candidate product assessing state 518.
  • the candidate product assessing state 518 may be in the form of a vector, wherein the vector's dimension corresponds to the number of candidate products, and a value of each dimension corresponds to an expected probability of a candidate product.
  • an expected probability of a candidate product may be replaced by a weight of the candidate product, so that the candidate product assessing state 518 comprises a weight of each candidate product.
  • the predetermined condition may indicate whether a candidate product reaches a condition of being determined as a recommended product.
  • the predetermined condition may be that an expected probability or weight of a candidate product is above a threshold, wherein the threshold may be set empirically in advance.
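  • For illustration, the stopping test may look like the following sketch, in which the threshold value 0.8 is an assumed empirical setting:

```python
def recommended_indices(assessing_state, threshold=0.8):
    """Indices of candidates whose expected probability exceeds the threshold."""
    return [m for m, p in enumerate(assessing_state) if p > threshold]

state = [0.05, 0.9, 0.05]          # one expected probability per candidate
print(recommended_indices(state))  # [1]: product 1 becomes the recommendation
```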
  • the process 500 may provide a further question to the user.
  • question ranking may be performed on a plurality of candidate questions 524 in order to determine a next question to be provided to the user.
  • the question ranking may be based at least on the result of the product ranking, e.g., the current expected probability or weight of each candidate product included in the current candidate product assessing state.
  • the question ranking may also be based on historical questions and historical answers in the session record 512.
  • a weight or information gain of each candidate question may be calculated, and the candidate questions may be ranked according to weights or information gain.
  • the question ranking may be performed in various approaches, e.g., through entropy-based ranking 522-1, through policy-based reinforcement learning ranking 522-2, etc., which will be discussed in detail later.
  • the next question to be provided to the user may be selected based on weights of the candidate questions. For example, the top-ranked candidate question may be selected as the next question, a candidate question may be randomly selected from the multiple top-ranked candidate questions as the next question, etc.
  • the process 500 iteratively returns to 502 to provide the selected next question to the user, and then performs the subsequent steps.
  • If it is determined at 520 that there is at least one candidate product that meets with the predetermined condition, the at least one candidate product may be determined as a recommended product to be provided to the user at 528.
  • a recommendation reason for the recommended product may be determined.
  • the recommendation reason may be determined in various approaches.
  • At least one question that contributes the most among the questions previously provided to the user, and a corresponding answer of the user, may be selected to generate the recommendation reason.
  • a question resulting in the maximum expected probability rise for the recommended product may be selected from the historical questions provided to the user, and the recommendation reason may be generated with reference to the selected question and an answer of the user.
  • the recommendation reason may be constructed based on the selected question and the answer according to various predefined rules.
  • the recommendation reason may comprise a simple repetition of at least one of the selected question and the answer.
  • the recommendation reason may comprise a transformed expression of at least one of the selected question and the answer.
  • the recommendation reason may comprise an expression "like fast food” transformed from an answer "like burgers of KFC".
  • the recommendation reason may comprise a generalized expression of at least one of the selected question and the answer.
  • content of the question and answer may be semantically generalized through any natural language processing technique.
  • the recommendation reason may comprise some words or phrases commonly used in free chat to make the sentence's expression more natural. For example, expressions such as "The reason why I gave the above recommendation is ...", "Considering ..., I decided to recommend ...", etc. may be added in the recommendation reason.
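  • A minimal sketch of such rule-based reason construction; the templates and the transformation table are illustrative assumptions, not wording from the patent:

```python
import random

TEMPLATES = [
    "The reason why I gave the above recommendation is that you chose "
    "'{answer}' for the question '{question}'.",
    "Considering that you answered '{answer}', I decided to recommend this product.",
]

# Optional transformed expressions, e.g. generalizing a concrete answer.
TRANSFORMS = {"like burgers of KFC": "like fast food"}

def build_reason(question, answer):
    answer = TRANSFORMS.get(answer, answer)      # transformed expression, if any
    return random.choice(TEMPLATES).format(question=question, answer=answer)

print(build_reason("Where will the gift be used?", "Riverside"))
```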
  • a recommendation reason generating model may be previously trained for generating a recommendation reason based on at least one of attributes selected by historical answers in the session record, attributes of a recommended product, description of the recommended product, etc., which will be discussed in detail later.
  • a response including product information and the recommendation reason of the recommended product may be provided to the user.
  • the product information may be extracted from, e.g., the product database 150 in FIG. 1 or the candidate product set 252 in FIG. 2.
  • When the chatbot receives a natural language message sent by the user and the message does not direct to any question associated with product recommendation provided by the chatbot, the natural language message may be directly parsed at 508 in order to determine corresponding relevant questions and corresponding answers, and then the subsequent processing may be performed.
  • other decision conditions for determining recommended products may be defined or added at 520. For example, a threshold of the number of questions provided to the user may be predefined. If it is determined that the number of questions having provided to the user in the process 500 is above the threshold, a recommended product may be directly determined at 528, which may be, e.g., a candidate product currently having the highest expected probability or weight.
  • the next question may be selected so that it may exclude as many candidate products with low possibility of being recommended products as possible, regardless of the user's answer to the question.
  • the next question may be selected so that it may divide the candidate products into two subsets with similar sizes or similar weights. For example, when the answer to the question is a binary answer, e.g., "yes" or "no", the candidate products may be divided into a subset with a reference answer of "yes” and a subset with a reference answer of "no".
  • the two subsets may have equal or approximate number of candidate products, or have candidate products with equal or approximate cumulative weights.
  • the candidate products may be divided into three subsets with reference answers of "cheap”, “medium” and “expensive” respectively, and these subsets have similar sizes or similar weights.
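  • A minimal sketch of preferring the question that splits the remaining candidates into subsets of the most similar cumulative weight; the weights and reference answers are invented for illustration:

```python
def split_imbalance(weights, ref_answers, question_idx):
    """Weight imbalance among the subsets a question induces (lower is better)."""
    by_option = {}
    for m, w in enumerate(weights):
        option = ref_answers[m][question_idx]
        by_option[option] = by_option.get(option, 0.0) + w
    masses = list(by_option.values())
    return max(masses) - min(masses) if len(masses) > 1 else float("inf")

weights = [0.4, 0.3, 0.2, 0.1]
ref_answers = [["yes", "cheap"], ["no", "cheap"], ["yes", "expensive"], ["no", "medium"]]
best = min(range(2), key=lambda n: split_imbalance(weights, ref_answers, n))
print(f"ask question {best} next")  # question 0 splits weight 0.6 / 0.4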
  • those candidate products that conform to the attributes currently selected by the user may be used as a candidate product set to be considered when determining a question for the next round of session.
  • the size of the candidate product set for determining a next question may be continuously reduced.
  • each candidate product $p_m$ may be initially assigned a prior probability weight $w(p_m)$.
  • the weight may be set with reference to search frequency for the candidate product on search engines, or review scores or order frequency for the candidate product on e-commerce websites. Then, $w(\cdot)$ may be normalized as:
    $w(p_m) \leftarrow w(p_m) / \sum_{m'=1}^{M} w(p_{m'})$    Equation (1)
  • For a candidate product $p_m$, its contribution to the selection of a candidate question $q_n$ may be calculated as:
    $c_{m,n}^{l} = freq(p_m \mid q_n, l) + \alpha \cdot I(r_{m,n} = l)$
    wherein $freq(p_m \mid q_n, l)$ represents the frequency of finally selecting the candidate product $p_m$ after users select an option $l$ of the candidate question $q_n$ in historical data.
  • the historical data may be obtained by collecting usage information of a large number of users, thereby reflecting historical usage information of a large number of users.
  • $I(\cdot)$ is an indicator function, which returns 1 when $r_{m,n} = l$ holds, otherwise returns 0, wherein $r_{m,n} = l$ indicates that the option $l$ is a reference answer by the candidate product $p_m$ to the candidate question $q_n$.
  • The parameter $\alpha$ is used for balancing the historical usage information with reference answers in a reference answer matrix $R$. For example, when $\alpha$ is 0, only the historical usage information is considered, and the reference answers are ignored. When $\alpha$ is set to an extremely large value, the reference answers are considered to a greater extent while the historical usage information is ignored.
  • $\alpha$ may also be extended to a time decay function $\alpha(t)$, wherein $t$ represents time.
  • a negative Shannon entropy may be used for a multivariate Bernoulli distribution of the options, and a parameter $M_{m,n}$ may be calculated as:
    $M_{m,n} = \sum_{l} \tilde{c}_{m,n}^{l} \log \tilde{c}_{m,n}^{l}$, wherein $\tilde{c}_{m,n}^{l} = c_{m,n}^{l} / \sum_{l'} c_{m,n}^{l'}$
  • a weight $w(q_n)$ of the candidate question $q_n$ may be calculated as:
    $w(q_n) = \sum_{m=1}^{M} w(p_m) \cdot M_{m,n}$    Equation (3)
  • the above process may be performed for each candidate question in a candidate question set in order to calculate a weight of each candidate question.
  • the candidate questions may be ranked based on weights, and a next question to be provided to the user may be selected from the ranked candidate questions.
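  • As a simplified sketch of this entropy-style question ranking, the following Python code scores each question by the Shannon entropy of the product-weight mass its reference-answer options induce; the full formulas above also mix in historical selection frequencies, which are omitted here for brevity:

```python
import math

def question_score(weights, ref_answers, question_idx):
    """Higher when the question's options split the product weight evenly."""
    mass = {}
    for m, w in enumerate(weights):
        option = ref_answers[m][question_idx]
        mass[option] = mass.get(option, 0.0) + w
    total = sum(mass.values())
    return -sum((p / total) * math.log(p / total) for p in mass.values() if p > 0)

def rank_questions(weights, ref_answers, n_questions, asked):
    """Rank unasked questions by score, most discriminative first."""
    scores = {
        n: question_score(weights, ref_answers, n)
        for n in range(n_questions) if n not in asked
    }
    return sorted(scores, key=scores.get, reverse=True)

weights = [0.25, 0.25, 0.25, 0.25]
ref_answers = [["yes", "cheap"], ["no", "cheap"], ["yes", "cheap"], ["no", "expensive"]]
print(rank_questions(weights, ref_answers, n_questions=2, asked=set()))  # [0, 1]
```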
  • After the user's answer $a$ to the provided question $q_n$ is received, the weight of each candidate product $p_m$ may be updated as:
    $w(p_m) \leftarrow w(p_m) \cdot \left( freq(p_m \mid q_n, a) + \alpha \cdot I(r_{m,n} = a) \right)$    Equation (2)
  • a return value of $I(\cdot)$ may be set to a very small value, e.g., 0.01, instead of 0, in order to avoid completely discarding this candidate product.
  • the calculated weight of each candidate product may be added to a candidate product assessing state. If a certain candidate product meets with a predetermined condition, e.g., the weight or expected probability of the candidate product is above a threshold, the candidate product may be determined as a recommended product.
  • the entropy-based ranking described above may be performed iteratively again. For example, $w(p_m)$ may be normalized through Equation (1) again, and subsequent processing may be continued.
  • the subscripts $t$ and $t+1$ may be used for representing the current stage and the next stage, respectively, thus obtaining $w_t(p_m)$ representing a weight of a candidate product $p_m$ used in the current stage, and $w_{t+1}(p_m)$ representing a weight of the candidate product that will be used in the next stage after the current stage. Then, a weight rise or expected probability rise of the candidate product $p_m$, caused by the current question provided to the user and the user's answer $a$, may be calculated as:
    $\Delta w(p_m) = w_{t+1}(p_m) - w_t(p_m)$
  • a question resulting in the maximum weight rise or maximum expected probability rise of the recommended product and a corresponding answer by the user may be selected for generating a recommendation reason for the recommended product. For example, assuming that the first question causes the expected probability of the recommended product to change from 0 to 0.2, the second question causes the expected probability to change from 0.2 to 0.6, and the third question causes the expected probability to change from 0.6 to 0.9, wherein the expected probability 0.9 is above a threshold 0.8.
  • the first question results in an expected probability rise of 0.2
  • the second question results in an expected probability rise of 0.4
  • the third question results in an expected probability rise of 0.3
  • the second question resulting in the maximum expected probability rise (i.e., 0.4) and a corresponding answer by the user may be selected for generating the recommendation reason.
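  • A minimal sketch of this selection, reusing the probability trajectory from the example above:

```python
# (question, answer, expected probability before, expected probability after)
history = [
    ("question 1", "answer 1", 0.0, 0.2),
    ("question 2", "answer 2", 0.2, 0.6),
    ("question 3", "answer 3", 0.6, 0.9),
]

question, answer, before, after = max(history, key=lambda h: h[3] - h[2])
print(question, "rise:", round(after - before, 2))  # question 2, rise 0.4
```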
  • the recommendation reason may also be generated by the recommendation reason generating model.
  • In one aspect, weights of the candidate questions may be continuously updated during the session for selecting the next question to be provided to the user, and in another aspect, weights or expected probabilities of the candidate products may be continuously updated so that the recommended product may be determined.
  • In addition to Equation (3), the weight of the candidate question $q_n$ may also be calculated through the following process.
  • A quantity $v_n^l$ representing importance of the option $l$ in the candidate question $q_n$ may then be calculated as:
    $v_n^l = \sum_{m=1}^{M} w(p_m) \cdot I(r_{m,n} = l)$
  • the policy-based reinforcement learning algorithm may be used for predicting a specific entity (e.g., a celebrity, etc.) under a constraint of allowing the user to answer single-attribute questions.
  • a single-attribute question may refer to a question aiming at obtaining a binary answer (e.g., yes, no, etc.).
  • the embodiments of the present disclosure adapt this algorithm to be suitable for multi-option questions for explainable product recommendation.
  • the algorithm may be adapted as suitable for a task-oriented scenario in which questions are attached with a plurality of options. The user may select any subset of these options. This may be construed as a jump across several single-attribute questions. Further, when the user expresses requirements in a natural language sentence involving answers to a plurality of questions, a jump across multiple single-attribute questions or combined questions may be achieved in a single round of session with the user.
  • the question ranking may be summarized as a finite Markov decision process (MDP) represented by a 5-tuple $(S, A, P, R, \gamma)$, wherein $S$ is a continuous candidate product assessing state space, and each state $s$ in $S$ represents a vector storing the expected probabilities of the candidate products.
  • $A = \{q_1, q_2, \ldots, q_N\}$ is a candidate question set.
  • $P(s_{t+1} = s' \mid s_t = s, q_t = q)$ represents the state transition probability.
  • $R(s, q)$ represents a reward function or reward network, and $\gamma \in [0, 1]$ is a decay factor to discount the long-term return value.
  • the chatbot may provide a candidate question $q_t$ under the current candidate product assessing state $s$ according to a policy function or policy network $\pi_q$.
  • a reward score may be generated and the candidate product assessing state $s$ may be updated to $s'$.
  • a quadruple $(s_t, q_t, r_t, s_{t+1})$ may be used as an episode in the reinforcement learning process.
  • the candidate product assessing state $s_t$ may keep track of confidence of each candidate product at the time step $t$, e.g., expected probability.
  • $s_{t,m}$ represents confidence that the product $p_m$ is desired by the user at the time step $t$.
  • $s_0$ may take the prior expected probabilities of the candidate products.
  • Through Equation (2), the confidence $s_{t,m}$ of the candidate product $p_m$ may be updated to $s_{t+1,m}$ based on the user's answer to the question $q_t$ at the time step $t$.
  • the embodiments of the present disclosure propose to utilize a neural network-based LTE reward network, which takes a quadruple as input and outputs a reward of the next step.
  • the LTE reward network adopts a multi-layer perceptron (MLP) with sigmoid output in order to learn appropriate immediate rewards during training.
  • the question $q_t$ and the corresponding answer are embedded, and the resulting embedding vector is concatenated with $s_t$ for training the reward network $R$ with, e.g., a squared difference loss function.
  • the reward network is further used for training the policy function to rank the candidate questions and select the next question.
  • the policy function may be trained by using a reinforcement algorithm under, e.g., a cross-entropy loss function.
  • the value network $V$ may be used for scoring the goodness of the current state $s_t$.
  • the value network may estimate how good the current state itself is to be selected in an episode.
  • the value network may use, e.g., a squared difference loss function, and take a cumulative reward as a reference score. After updating, the newly estimated score is subtracted from the cumulative reward for further updating $R$ and $\pi_q$, respectively.
  • the first loop is from line 3 to line 21, which controls the number of epochs to be within $Z$.
  • the second loop is from line 5 to line 9, which applies a policy function to select a question and update the candidate product assessing state.
  • a candidate question may be restricted to be selected and used only once during a session.
  • the result obtained in the second loop is stored in the episode memory $M$ for use in subsequent steps.
  • the third loop is from line 11 to line 13, which applies the reward network to obtain the immediate reward.
  • the fourth loop is from line 14 to line 21, which updates parameters in the policy function and the reward network by picking mini-batches from the episode memory $M$.
  • the policy function and the reward network may adopt an MLP with, e.g., 3 hidden layers, and utilize an algorithm based on an Adam optimizer.
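  • A minimal PyTorch sketch of what such networks may look like; the state and action dimensions, hidden size, learning rate, and one-hot question encoding are all assumptions for illustration, not values from the patent:

```python
import torch
import torch.nn as nn

M, N = 100, 40  # hypothetical: M candidate products, N candidate questions

def mlp(in_dim, out_dim, hidden=128):
    """A 3-hidden-layer perceptron, as suggested in the text."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )

policy = mlp(M, N)      # assessing state -> distribution over questions
reward = mlp(M + N, 1)  # (state, question encoding) -> immediate reward
value = mlp(M, 1)       # assessing state -> goodness of the state

optimizer = torch.optim.Adam(
    list(policy.parameters()) + list(reward.parameters()) + list(value.parameters()),
    lr=1e-3,
)

state = torch.rand(1, M)                    # candidate product assessing state
question_probs = torch.softmax(policy(state), dim=-1)
q_t = torch.argmax(question_probs, dim=-1)  # next question to provide
q_onehot = torch.nn.functional.one_hot(q_t, N).float()
r_t = torch.sigmoid(reward(torch.cat([state, q_onehot], dim=-1)))
print(int(q_t), float(r_t))
```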
  • the recommended product may be determined through comparing an expected probability of each candidate product with a predetermined condition.
  • FIG. 6 illustrates an exemplary process 600 for training a recommendation reason generating model according to an embodiment.
  • Training data for training the recommendation reason generating model may be collected from the network, e.g., an e-commerce website.
  • a product set 610 for generating training data may be identified or specified first.
  • category information 640 of the product may be further obtained from the website that provides the product.
  • category information of a product may comprise a series of categories at different levels. For example, for the product "salmon”, it may correspond to multiple categories at different levels, e.g., "food”, “seafood”, “fish”, etc.
  • reviews 620 for the products in the product set 610 may be collected.
  • the process 600 may perform filtering on the reviews 620 to obtain filtered reviews 622.
  • sentiment analysis may be performed on reviews to filter out negative reviews while retaining positive reviews.
  • a predefined expression pattern may be adopted for detecting validity of the reviews, and invalid reviews containing too few words or too many repeated characters in the expression may be filtered out. It should be appreciated that the term "review" involved below may broadly refer to either or both of the reviews 620 and the filtered reviews 622.
  • the process 600 may extract attribute information 650 of products from the reviews 620 or the filtered reviews 622. For example, for a review "These shoes are great! Super soft, shock-absorbing, and very light", attributes of the product "shoes”, e.g., “soft”, “shock-absorbing", “light”, etc., may be extracted.
  • the e-commerce website may also provide descriptions of products, e.g., products’ characteristics, parameters, etc. These descriptions usually explicitly comprise various attributes of the products expressed in natural language. Therefore, descriptions 630 for the products in the product set 610 may be collected.
  • the process 600 may perform summarization on the product descriptions 630 to obtain product description summaries 632.
  • a product description may be long, so only the main content of the description may be used for subsequent training.
  • Existing unsupervised text ranking algorithms may be used for performing summarization on the description. It should be appreciated that the term "product description” involved below may broadly refer to either or both of the product descriptions 630 and the product description summaries 632.
  • the process 600 may extract attribute information 650 of products from the product descriptions 630 or the product description summaries 632.
  • a training data set 660 in the form of <attribute + description, review> pairs may be formed. Each <attribute + description, review> data pair is associated with a specific product.
  • the training data set 660 may be used for training a recommendation reason generating model 670.
  • the "attribute + description" in the training data may be used as input of the model, and the "review” as output of the model.
  • the recommendation reason generating model 670 may adopt a transformer architecture in which both the encoding part and the decoding part may adopt a self-attention mechanism with positional encoding for sequential dependency learning.
  • the encoding part may process attributes and description of a product, while the decoding part may process previous reviews for the product by different purchasers.
  • the trained recommendation reason generating model 670 may generate recommendation reasons in natural language similar to reviews.
  • the process 600 may be changed in any forms according to specific application requirements and designs.
  • the product description, instead of the reviews, may be used for constructing an explainable reason.
  • the training data may take the form of <attribute, description> so that when training the recommendation reason generating model, the "attribute" may be used as input of the model, and the "description" as output of the model.
  • FIG. 7 illustrates an exemplary process 700 for generating a recommendation reason according to an embodiment.
  • a recommendation reason of the recommended product may be further determined in order to be provided to the user in a response.
  • a recommendation reason generating model 710 which may be previously trained through the process 600 in FIG. 6, is adopted in the process 700 for generating a recommendation reason for a recommended product.
  • a session record 702 may comprise historical questions provided by a chatbot in a session and historical answers provided by a user.
  • the user's historical answers may indicate product attributes 704 selected or desired by the user.
  • the attributes 704 of a recommended product selected by the user may be provided to the recommendation reason generating model 710 as input.
  • Attributes 706 of the recommended product may be obtained through, e.g., the process 600 in FIG. 6 and provided to the recommendation reason generating model 710 as input.
  • a description 708 of the recommended product may also be obtained through, e.g., the process 600 in FIG. 6 and provided to the recommendation reason generating model 710 as input.
  • the recommendation reason generating model 710 may generate a recommendation reason 720 for the recommended product according to at least one of the attributes 704 selected by the user, the attributes 706 of the recommended product, and the description 708 of the recommended product.
  • attributes selected by the user indicated by an answer to the question may be given a higher weight in the process of generating a recommendation reason.
  • FIG. 8 illustrates an exemplary chat window 800 according to an embodiment.
  • a chatbot may determine that the user wants to obtain product recommendations and the product category is "gift". Then, the chatbot may provide multiple questions to the user through multiple rounds of session and receive the user's answers. These questions may be dynamically determined sequentially according to the embodiments of the present disclosure discussed above. Each question is attached with options, and accordingly, the user's answers comprise explicit selections of options. Finally, the chatbot provides the user with a response "According to your selection of 'Riverside' for question 6, I recommend you to buy a fishing rod".
  • the response comprises product information "fishing rod” of the recommended product, wherein the recommended product may be determined according to the embodiments of the present disclosure discussed above.
  • the response also comprises a recommendation reason "according to your selection of 'Riverside' for question 6", wherein the recommendation reason may be generated according to the embodiments of the present disclosure discussed above, e.g., generated based on question 6, which results in the maximum expected probability rise for the recommended product, and its corresponding answer.
  • FIG. 9 illustrates an exemplary chat window 900 according to an embodiment.
  • a chatbot may determine that the user wants to obtain product recommendation and the product categories are “gift” and “electronic goods”. Then, the chatbot may provide multiple questions to the user through multiple rounds of session and receive the user's answers. Finally, the chatbot provides the user with a response "Considering that you like 'running', I recommend you to buy a smart bracelet”. The response comprises product information "smart bracelet” of the recommended product and a recommendation reason "considering that you like 'running' ".
  • the recommended product and the recommendation reason may be determined according to the embodiments of the present disclosure discussed above, wherein the recommendation reason may be generated based on question 5, which results in the maximum expected probability rise for the recommended product, and its corresponding answer.
  • the chatbot may dynamically determine the next question based at least on the user’s answer to each question.
  • FIG. 10 illustrates an exemplary chat window 1000 according to an embodiment.
  • a chatbot may provide a user with product recommendation involving hotel reservation services.
  • Question 1 and question 2 are attached with options for the user to select.
  • Question 3 to question 6 are not attached with options, and product attributes desired by the user may be determined through parsing the user's responses in natural language sentences via, e.g., the step 508 in FIG. 5. Further, relevant questions and corresponding answers that correspond to the user's answers may be determined through parsing, which are used for determining the next question.
  • FIG. 11 illustrates an exemplary chat window 1100 according to an embodiment.
  • a chatbot may provide a user with product recommendation involving gifts.
  • the first question and the second question are attached with options, and the user's answers take the form of natural language sentences.
  • the user's answers may be identified by, e.g., the step 506 in FIG. 5 to determine options selected by the user, or the user’s answers may be parsed by, e.g., the step 508 of FIG. 5 to determine product attributes desired by the user and to determine relevant questions and corresponding answers.
  • Two recommendation reasons are included in a response provided by the chatbot.
  • the first recommendation reason "I recommend it based on the keywords 'quiet' and 'yoga/reading'" may be generated based on the first question and the second question, which result in the maximum expected probability rise for the recommended product, and corresponding answers.
  • the second recommendation reason "I hope these gifts can help her enjoy life in a quieter environment" may be generated by a recommendation reason generating model.
  • FIG. 12 illustrates an exemplary chat window 1200 according to an embodiment.
  • a user proactively sends a message in a natural language sentence "I can tell you that she likes quiet places, music, and yoga".
  • the message may be parsed through, e.g., the step 508 in FIG. 5 to determine a set of relevant questions and corresponding answers, e.g., a question "Does she like being quiet?" with a corresponding answer "Like being quiet", a question "Her hobby?" with corresponding answers "Music" and "Yoga", etc.
  • the "quiet”, “music”, “yoga”, etc. described above may all be considered as the user's desired product attributes, and further be used for performing subsequent product ranking, question ranking, etc.
  • FIG. 13 illustrates a flowchart of an exemplary method 1300 for providing explainable product recommendation in a session according to an embodiment.
  • At 1310, at least one question associated with product recommendation may be provided.
  • At 1320, an answer to the at least one question may be received.
  • At 1330, it may be determined whether there exists at least one recommended product based at least on the at least one question and the answer.
  • At 1340, in response to determining that there exists the at least one recommended product, a recommendation reason of the at least one recommended product may be generated.
  • At 1350, a response including product information and the recommendation reason of the at least one recommended product may be provided.
  • the determining whether there exists at least one recommended product may comprise: performing product ranking to a plurality of candidate products based at least on the at least one question and the answer; and determining, based on a result of the product ranking, whether there exists at least one candidate product in the plurality of candidate products which meets with a predetermined condition.
  • the performing product ranking may comprise: updating a candidate product assessing state based at least on the at least one question and the answer, the candidate product assessing state comprising an expected probability of each candidate product in the plurality of candidate products.
  • the predetermined condition may comprise: an expected probability of a candidate product is above a threshold.
  • the method 1300 may further comprise: adding the at least one question and the answer into a session record of the session, the session record comprising historical questions and historical answers associated with the product recommendation in the session.
  • the generating a recommendation reason may comprise: determining an expected probability rise resulted by each historical question in the session record to the at least one recommended product; selecting a historical question resulting in the maximum expected probability rise; and generating the recommendation reason based at least on the selected historical question and corresponding answer.
  • the generating a recommendation reason may comprise: generating the recommendation reason based on at least one of attributes selected by the historical answers, attributes of the recommended product, and description of the recommended product, through a recommendation reason generating model.
  • the method 1300 may further comprise: in response to determining that there does not exist the at least one recommended product, performing question ranking to a plurality of candidate questions based at least on a result of the product ranking; and selecting a next question to be provided based on a result of the question ranking.
  • the question ranking may be performed through an entropy-based ranking or a policy-based reinforcement learning ranking.
  • the plurality of candidate questions may be previously determined based at least on attributes of the plurality of candidate products.
  • the at least one question may comprise one or more options, and the answer may comprise a selection from the one or more options.
  • the answer may comprise a natural language sentence.
  • the determining whether there exists at least one recommended product may comprise: determining one or more relevant questions and corresponding answers corresponding to the natural language sentence; and determining whether there exists the at least one recommended product based at least on the one or more relevant questions and corresponding answers.
  • the method 1300 may further comprise any step/process for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
  • FIG. 14 illustrates an exemplary apparatus 1400 for providing explainable product recommendation in a session according to an embodiment.
  • the apparatus 1400 may comprise: a question providing module 1410, for providing at least one question associated with product recommendation; an answer receiving module 1420, for receiving an answer to the at least one question; a recommended product determining module 1430, for determining whether there exists at least one recommended product based at least on the at least one question and the answer; a recommendation reason generating module 1440, for in response to determining that there exists the at least one recommended product, generating a recommendation reason of the at least one recommended product; and a response providing module 1450, for providing a response including product information and the recommendation reason of the at least one recommended product.
  • the recommended product determining module 1430 may be for: performing product ranking to a plurality of candidate products based at least on the at least one question and the answer; and determining, based on a result of the product ranking, whether there exists at least one candidate product in the plurality of candidate products which meets with a predetermined condition.
  • the performing product ranking may comprise: updating a candidate product assessing state based at least on the at least one question and the answer, the candidate product assessing state comprising an expected probability of each candidate product in the plurality of candidate products.
  • the apparatus 1400 may further comprise a session record adding module, for adding the at least one question and the answer into a session record of the session, the session record comprising historical questions and historical answers associated with the product recommendation in the session.
  • the recommendation reason generating module 1440 may be for: determining an expected probability rise resulted by each historical question in the session record to the at least one recommended product; selecting a historical question resulting in the maximum expected probability rise; and generating the recommendation reason based at least on the selected historical question and corresponding answer.
  • the recommendation reason generating module 1440 may be for: generating the recommendation reason based on at least one of attributes selected by the historical answers, attributes of the recommended product, and description of the recommended product, through a recommendation reason generating model.
  • the apparatus 1400 may further comprise a question selecting module for: in response to determining that there does not exist the at least one recommended product, performing question ranking to a plurality of candidate questions based at least on a result of the product ranking; and selecting a next question to be provided based on a result of the question ranking.
  • the apparatus 1400 may further comprise any other modules configured for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
  • FIG. 15 illustrates an exemplary apparatus 1500 for providing explainable product recommendation in a session according to an embodiment.
  • the apparatus 1500 may comprise at least one processor 1510 and a memory 1520 storing computer-executable instructions.
  • the processor 1510 may: provide at least one question associated with product recommendation; receive an answer to the at least one question; determine whether there exists at least one recommended product based at least on the at least one question and the answer; in response to determining that there exists the at least one recommended product, generate a recommendation reason of the at least one recommended product; and provide a response including product information and the recommendation reason of the at least one recommended product.
  • the processor 1510 may further perform any other processing for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
  • the embodiments of the present disclosure may be embodied in a non-transitory computer-readable medium.
  • the non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform any operations of the methods for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
  • modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.
  • processors are described in connection with various apparatuses and methods. These processors may be implemented using electronic hardware, computer software, or any combination thereof. Whether these processors are implemented as hardware or software will depend on the specific application and the overall design constraints imposed on the system.
  • a processor, any portion of a processor, or any combination of processors presented in this disclosure may be implemented as a microprocessor, a micro-controller, a digital signal processor (DSP), a field programmable gate array (FPGA), a programmable logic device (PLD), state machine, gate logic, discrete hardware circuitry, and other suitable processing components configured to perform the various functions described in this disclosure.
  • a processor, any portion of a processor, or any combination of processors presented in this disclosure may be implemented as software executed by a microprocessor, a micro-controller, a DSP, or other suitable platforms.
  • Software should be considered broadly to represent instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, running threads, processes, functions, etc. Software may reside on a computer-readable medium.
  • A computer-readable medium may include, e.g., a memory, which may be, e.g., a magnetic storage device (e.g., a hard disk, a floppy disk, a magnetic strip), an optical disk, a smart card, a flash memory device, a random access memory (RAM), a read only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), a register, or a removable disk.
  • Although a memory is shown as being separate from the processor in various aspects presented in this disclosure, a memory may also be internal to the processor (e.g., a cache or a register).


Abstract

The present disclosure provides methods and apparatuses for providing explainable product recommendation in a session. At least one question associated with product recommendation may be provided. An answer to the at least one question may be received. It may be determined whether there exists at least one recommended product based at least on the at least one question and the answer. In response to determining that there exists the at least one recommended product, a recommendation reason of the at least one recommended product may be generated. A response including product information and the recommendation reason of the at least one recommended product may be provided.

Description

PROVIDING EXPLAINABLE PRODUCT RECOMMENDATION IN A SESSION
BACKGROUND
[0001] Artificial intelligence (AI) chatbots are becoming more and more popular and are being applied in more and more scenarios. Chatbots are designed to simulate human utterances and may chat with users through text, voice, images, etc. Generally, the chatbots may identify language content in a message input by a user or apply natural language processing to the message, and then provide the user with a response to the message.
SUMMARY
[0002] This Summary is provided to introduce a selection of concepts that are further described below in the Detailed Description. It is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
[0003] Methods and apparatuses for providing explainable product recommendation in a session are proposed in embodiments of the present disclosure. At least one question associated with product recommendation may be provided. An answer to the at least one question may be received. It may be determined whether there exists at least one recommended product based at least on the at least one question and the answer. In response to determining that there exists the at least one recommended product, a recommendation reason of the at least one recommended product may be generated. A response including product information and the recommendation reason of the at least one recommended product may be provided.
[0004] It should be noted that the above one or more aspects comprise the features hereinafter fully described and particularly pointed out in the claims. The following description and the drawings set forth in detail certain illustrative features of the one or more aspects. These features are only indicative of the various ways in which the principles of various aspects may be employed, and this disclosure is intended to include all such aspects and their equivalents.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] The disclosed aspects will hereinafter be described in connection with the appended drawings that are provided to illustrate and not to limit the disclosed aspects. [0006] FIG. 1 illustrates an exemplary network architecture in which a chatbot is deployed according to an embodiment. [0007] FIG. 2 illustrates an exemplary chatbot system according to an embodiment.
[0008] FIG. 3 illustrates mapping between a candidate product set and a candidate question set according to an embodiment.
[0009] FIG. 4 illustrates an exemplary overall process for providing explainable product recommendation according to an embodiment.
[0010] FIG. 5 illustrates an exemplary specific process for providing explainable product recommendation according to an embodiment.
[0011] FIG. 6 illustrates an exemplary process for training a recommendation reason generating model according to an embodiment.
[0012] FIG. 7 illustrates an exemplary process for generating a recommendation reason according to an embodiment.
[0013] FIG. 8 to FIG. 12 illustrate exemplary chat windows according to the embodiments.
[0014] FIG. 13 illustrates a flowchart of an exemplary method for providing explainable product recommendation in a session according to an embodiment.
[0015] FIG. 14 illustrates an exemplary apparatus for providing explainable product recommendation in a session according to an embodiment.
[0016] FIG. 15 illustrates an exemplary apparatus for providing explainable product recommendation in a session according to an embodiment.
DETAILED DESCRIPTION
[0017] The present disclosure will now be discussed with reference to several example implementations. It should be appreciated that these implementations are discussed only for enabling those skilled in the art to better understand and thus implement the embodiments of the present disclosure, rather than suggesting any limitations on the scope of the present disclosure.
[0018] Generally, a chatbot may chat automatically in a session with a user. Herein, "session" may refer to a time continuous conversation between two chat participants, and may comprise messages and responses in the conversation. "Message" may refer to any information input by the user, e.g., a query from the user, an answer of the user to the chatbot's question, an opinion by the user, etc. The term "message" and the term "query" may also be used interchangeably. "Response" may refer to any information provided by the chatbot, e.g., an answer of the chatbot to the user's question, a comment by the chatbot, a question proposed by the chatbot, etc.
[0019] In some application scenarios, a chatbot may provide product recommendations to a user in a session with the user. Herein, products may comprise goods, service, etc. However, providing product recommendations by a chatbot will face many challenges. In an aspect, a large-scale labeled corpus needs to be prepared for training a machine learning model in order to capture a user's intention or demand expressed in natural language. The intention or demand of the user indicates the user's preference for product attributes of recommended products. The product attributes may comprise various parameters, configurations, characteristics, etc. of the products. In another aspect, there is information asymmetry between products that the chatbot can provide and requirements of the user. For example, the user does not know what products the chatbot can provide, and does not know how to find desired products. The user needs to include keywords describing product attributes in a message sent to the chatbot; however, the chatbot may be unable to use these keywords to efficiently find recommended products, or no product corresponding to these keywords exists at all. In still another aspect, since the user may chat with the chatbot in an inefficient and low-information approach, it will take a long time for the chatbot to gradually collect the user’s requirements.
[0020] Embodiments of the present disclosure propose to provide explainable product recommendation in an efficient and accurate approach during a session between a chatbot and a user. The chatbot may provide explainable product recommendation based on a learning-to-explain (LTE) architecture proposed by the embodiments of the present disclosure. The LTE architecture may dynamically provide a series of questions associated with product recommendation and collect the user’s answers to these questions in multiple rounds of session with the user, and may at least learn new knowledge from the user's answers in order to determine a recommended product and give a recommendation reason for explaining why the product is recommended. The LTE architecture may screen out the recommended product from many candidate products through a relatively short session. [0021] Questions provided to the user may direct to various product attributes. Optionally, options indicating different product attributes may be added to a question so that the user may directly select a desired option in an answer. Optionally, if the user answers a question or sends a message in a natural language sentence, the natural language sentence may be parsed to identify product attributes desired by the user.
[0022] Each time an answer to the current question is received from the user, the LTE architecture may perform product ranking to a plurality of candidate products. Those candidate products having attributes selected by the user will be ranked higher. Through the product ranking, each candidate product will have a corresponding expected probability, which indicates a likelihood that this candidate product is desired by the user after this round of session. Expected probabilities of the candidate products may be calculated for each round of session. Therefore, as the session proceeds, the expected probability of each candidate product is continuously updated. Optionally, a weight may be calculated for each candidate product in the product ranking, which may be mapped to an expected probability. Since there is a specific mapping relationship between a weight and an expected probability of a candidate product, they may be used interchangeably herein, or collectively referred to as probability weight.
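As a non-limiting illustration (not part of the original disclosure), the following Python sketch maintains the candidate product assessing state described above as one expected probability per candidate product and renormalizes it after each round of session; the names update_assessing_state and round_scores are hypothetical.

```python
import numpy as np

def update_assessing_state(weights: np.ndarray, round_scores: np.ndarray) -> np.ndarray:
    """Scale each candidate product's probability weight by the score it earned
    in the current round (e.g., how well it matches the attribute the user
    selected), then renormalize to keep a probability distribution."""
    weights = weights * round_scores
    return weights / weights.sum()

# Example: three candidate products with a uniform prior; the latest answer
# matches product 0 strongly and product 2 weakly.
state = np.full(3, 1.0 / 3.0)
state = update_assessing_state(state, np.array([0.9, 0.1, 0.3]))
print(state.round(3))  # [0.692 0.077 0.231] -- product 0 leads after this round
```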
[0023] After performing product ranking to the candidate products, if no candidate product meets with a predetermined condition, the LTE architecture may further perform question ranking on a plurality of candidate questions. The predetermined condition may comprise, e.g., an expected probability of a candidate product being above a threshold, etc. These candidate questions may be directly or indirectly predetermined based on attributes of the plurality of candidate products. For example, the plurality of candidate questions may be ranked by considering previously-provided questions and the user's answers, product ranking of the candidate products, etc. The LTE architecture may perform the question ranking through entropy-based ranking and/or policy-based reinforcement learning ranking. Through the question ranking, each candidate question will have a corresponding information gain. A candidate question having, e.g., the maximum information gain may be selected from the ranked candidate questions as a next question to be provided to the user. Here, the maximum information gain may indicate that the candidate question has the highest discrimination, so that the range of candidate products that will be considered later can be reduced as much as possible at the fastest speed. Optionally, a weight may also be calculated for each candidate question in the question ranking, which may be mapped to information gain. Since there is a specific mapping relationship between a weight and information gain of a candidate question, they may be used interchangeably herein.
[0024] After performing the product ranking to the candidate products, if there is at least one candidate product that meets with a predetermined condition, the LTE architecture may determine the candidate product as a recommended product and provide the user with product information about the recommended product. The product information may comprise any information about the product, e.g., name, number, price, etc. Further, a recommendation reason for the recommended product may be determined by the LTE architecture. In one aspect, the LTE architecture may select, from questions previously provided to the user, at least one question that contributes the most and corresponding answer of the user, for generating the recommendation reason. For example, a question resulting in the maximum expected probability rise for the recommended product may be selected from historical questions provided to the user, and the recommendation reason may be generated with reference to the selected question and an answer of the user. In another aspect, the LTE architecture may generate the recommendation reason through a pre-trained recommendation reason generating model. [0025] In an aspect, the embodiments of the present disclosure may use a product attribute list of candidate products to generate questions in advance, and may directly identify attributes selected by the user from the user's answer, or identify attributes selected by the user by comparing the user's answer with the product attribute list. Therefore, the embodiments of the present disclosure do not require a large-scale labeled corpus for capturing the user’s intentions or requirements. In another aspect, since the embodiments of the present disclosure may provide the user with a question attached with options, and the user may answer the question by simply selecting an option, the chatbot may effectively guide the session to avoid problems caused by information asymmetry between available products and user requirements and may improve the efficiency of product recommendation. In still another aspect, the LTE architecture according to the embodiments of the present disclosure may learn new knowledge in the session in real time and guide the next round of session accordingly, so that the user's intentions or requirements may be accurately understood and the user's participation, interest, etc. may be effectively improved. In yet another aspect, the embodiments of the present disclosure may provide the user with a recommendation reason of a recommended product, thereby enhancing the user's attention and trust in the recommended product, and further increasing the likelihood that the user will subscribe to the recommended product.
[0026] The explainable product recommendation according to the embodiments of the present disclosure may be applied in various scenarios. In some scenarios, the embodiments of the present disclosure may be used for goods recommendation, e.g., gift recommendation, etc. In some scenarios, the embodiments of the present disclosure may be used for service recommendation, e.g., task-oriented reservation recommendation of hotel and restaurant, etc.
[0027] FIG. 1 illustrates an exemplary network architecture 100 in which a chatbot is deployed according to an embodiment. [0028] In FIG. 1, a network 110 is applied for interconnecting between a terminal device 120 and a chatbot server 130.
[0029] The network 110 may be any type of network capable of interconnecting network entities. The network 110 may be a single network or a combination of various types of network. In terms of coverage, the network 110 may be a local area network (LAN), a wide area network (WAN), etc. In terms of bearing medium, the network 110 may be a wired network, a wireless network, etc. In terms of data exchanging technology, the network 110 may be a circuit switched network, a packet switched network, etc.
[0030] The terminal device 120 may be any type of electronic computing device capable of connecting to the network 110, accessing a server or website on the network 110, processing data or signals, etc. For example, the terminal device 120 may be a desktop computer, a notebook computer, a tablet computer, a smart phone, an AI terminal, etc. Although only one terminal device is shown in FIG. 1, it should be appreciated that a different number of terminal devices may be connected to the network 110.
[0031] In an implementation, the terminal device 120 may be used by a user. The terminal device 120 may comprise a chatbot client 122 which may provide an automated chatting service to the user. In some cases, the chatbot client 122 may interact with the chatbot server 130. For example, the chatbot client 122 may send a message input by the user to the chatbot server 130 and receive a response associated with the message from the chatbot server 130. However, it should be appreciated that, in other cases, the chatbot client 122 may also generate a response to a user-input message locally, rather than interacting with the chatbot server 130.
[0032] The chatbot server 130 may be connected to or comprise a chatbot database 140. The chatbot database 140 may comprise information that may be used by the chatbot server 130 to generate responses. In an implementation, the chatbot server 130 may also be connected to a product database 150. The product database 150 may comprise various product information about a plurality of candidate products, e.g., product names, product attributes, etc. The product information may be, e.g., provided in advance by product providers or crawled from the network. Although the product database 150 is shown as independent from the chatbot database 140, the product database 150 may also be included in the chatbot database 140. When one or more candidate products are determined as recommended products, product information associated with these recommended products may be provided to the user.
[0033] It should be appreciated that all the network entities shown in FIG. 1 are exemplary, and depending on specific application requirements, any other network entities may be involved in the network architecture 100.
[0034] FIG. 2 illustrates an exemplary chatbot system 200 according to an embodiment.
[0035] The chatbot system 200 may comprise a user interface (UI) 210 for presenting a chat window. The chat window may be used by a chatbot to interact with a user.
[0036] The chatbot system 200 may comprise a core processing module 220. The core processing module 220 is configured for providing processing capabilities during operation of the chatbot through cooperation with other modules in the chatbot system 200. [0037] The core processing module 220 may obtain messages input by the user in the chat window, which are stored in a message queue 232. The messages may adopt various multimedia forms, e.g., text, speech, image, video, etc.
[0038] The core processing module 220 may process the messages in the message queue 232 in a first-in first-out approach. The core processing module 220 may invoke processing units in an Application Programming Interface (API) module 240 to process messages in various forms. The API module 240 may comprise a text processing unit 242, a speech processing unit 244, an image processing unit 246, etc.
[0039] For a text message, the text processing unit 242 may perform text understanding on the text message, and the core processing module 220 may further determine a text response.
[0040] For a speech message, the speech processing unit 244 may perform speech-to-text conversion on the speech message to obtain a text sentence, the text processing unit 242 may perform text understanding on the obtained text sentence, and the core processing module 220 may further determine a text response. If it is determined to provide responses by speech, the speech processing unit 244 may perform text-to-speech conversion on the text response to generate a corresponding voice response.
[0041] For an image message, the image processing unit 246 may perform image recognition on the image message to generate a corresponding text, and the core processing module 220 may further determine a text response. In some cases, the image processing unit 246 may also be used for obtaining an image response based on the text response.
[0042] Moreover, although not shown in FIG. 2, the API module 240 may also comprise any other processing units. For example, the API module 240 may comprise a video processing unit which cooperates with the core processing module 220 to process video messages and determine responses.
[0043] The core processing module 220 may determine responses through database 250. The database 250 may comprise various information accessible by the core processing module 220 for determining responses.
[0044] The database 250 may comprise a pure chat index set 251. The pure chat index set 251 may comprise index entries that are prepared for free chat between the chatbot and the user, and may be established with data from, e.g., social networks.
[0045] The database 250 may comprise at least one candidate product set 252. The candidate product set 252 may comprise a list of candidate products, attributes of each candidate product, etc. In an implementation, a candidate product set may be established for each product category, which comprises multiple candidate products that belong to the category. Product categories may be set based on different levels or criteria, and a candidate product may be classified to one or more corresponding categories. For example, the "gift" category may comprise candidate products such as chocolate, jewelry, flowers, crafts, etc., the "electronic goods" category may comprise candidate products such as mobile phones, computers, televisions, earphones, etc., and so on. Moreover, the candidate products in the candidate product set 252 may also comprise, e.g., candidate products of a specific brand and a specific model, e.g., a mobile phone Z of the brand XX and the model YY, a hotel K, etc. Taking the candidate product "mobile phone Z" as an example, its attributes may comprise, e.g., 4G+5G mobile network, 6-inch screen, fingerprint recognition function, etc. Various information included in the candidate product set 252 may be, e.g., provided in advance by product providers or crawled from the network.
[0046] The database 250 may comprise a candidate question set 253. The candidate question set 253 may comprise a candidate question list determined based at least on attributes of the candidate products in the candidate product set 252. Each candidate question may direct to one or more product attributes, so that when a user provides an answer to the candidate question, attributes desired by the user may be determined based on the answer, and candidate products having the desired attributes may be identified. In an implementation, each candidate question may also be attached with one or more options that correspond to product attributes to which the candidate question directs. [0047] In an implementation, the candidate product set 252 and the candidate question set 253 may be linked together via product attributes. FIG. 3 illustrates mapping 300 between a candidate product set and a candidate question set according to an embodiment. A candidate product set $P_i$ comprises a plurality of candidate products $\{p^i_1, p^i_2, \ldots, p^i_M\}$, wherein $i$ represents a category of these candidate products, and $M$ is the number of the candidate products. The candidate product set $P_i$ also comprises a plurality of attributes of each candidate product. For example, attributes of the candidate product $p^i_m$ comprise $\{a^i_{m,1}, a^i_{m,2}, \ldots, a^i_{m,N}\}$, wherein $N$ is the number of the attributes. A candidate question set $Q_i$ comprises a plurality of candidate questions $\{q^i_1, q^i_2, \ldots, q^i_N\}$, wherein $i$ represents a product category to which the candidate question set $Q_i$ directs, and $N$ is the number of the candidate questions. The candidate question set $Q_i$ may also comprise reference answers by different candidate products for each candidate question. For example, for the candidate question $q^i_n$, a reference answer by the candidate product $p^i_1$ is $a^i_{1,n}$, a reference answer by the candidate product $p^i_2$ is $a^i_{2,n}$, etc. Therefore, a matrix composed of the attributes of the candidate products in FIG. 3 may also be referred to as a reference answer matrix $A^i$. The reference answer matrix $A^i$ links the candidate product set $P_i$ and the candidate question set $Q_i$ together. Each element in $A^i$ may be represented as $a^i_{m,n}$, wherein $m$ is a candidate product index and $n$ is a candidate question index. It should be appreciated that although the candidate question set is shown in FIG. 3 as having $N$ candidate questions, the candidate question set may also have any number of candidate questions. Although FIG. 3 shows that each candidate product has $N$ attributes, different candidate products may also have different numbers of attributes.
[0048] It should be appreciated that the embodiments of the present disclosure are not limited to any specific approaches of constructing candidate questions based on attributes of candidate products. As an example, assuming that a candidate product 1 is an earphone X which comprises an attribute "in-ear type", and a candidate product 2 is an earphone Y which comprises an attribute "over-ear type", a candidate question "Do you prefer an in-ear type earphone or an over-ear type earphone?" may be constructed. For this candidate question, a reference answer by the candidate product 1 is "in-ear type", and a reference answer by the candidate product 2 is "over-ear type". If the user's answer is "in-ear type", it may be determined that the candidate product 1 has the attribute selected by the user, and accordingly, the candidate product 1 may have a higher ranking than the candidate product 2.
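As a non-limiting sketch of the earphone example above, the snippet below encodes two candidate products, one candidate question with options, and the corresponding reference answers, and looks up which candidate products match the option selected by the user; the data layout is assumed for illustration only and is not prescribed by the disclosure.

```python
candidate_products = ["earphone X", "earphone Y"]
candidate_questions = [
    {
        "text": "Do you prefer an in-ear type earphone or an over-ear type earphone?",
        "options": ["in-ear type", "over-ear type"],
    },
]

# reference_answers[m][n] is the reference answer of product m to question n,
# i.e., one row of the reference answer matrix A^i from FIG. 3.
reference_answers = [
    ["in-ear type"],    # earphone X
    ["over-ear type"],  # earphone Y
]

def products_matching(question_index: int, user_answer: str) -> list:
    """Return candidate products whose reference answer to the given question
    equals the option selected by the user."""
    return [
        product
        for product, answers in zip(candidate_products, reference_answers)
        if answers[question_index] == user_answer
    ]

print(products_matching(0, "in-ear type"))  # ['earphone X']
```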
[0049] The database 250 may comprise a session record 254. The session record 254 may comprise historical questions associated with product recommendation provided by the chatbot and corresponding historical answers from the user in the session between the chatbot and the user.
[0050] The database 250 may comprise a candidate product assessing state 255. The candidate product assessing state 255 may comprise an expected probability or weight assessed for each candidate product after each round of session.
[0051] The chatbot system 200 may comprise a module set 260, which is a set of functional modules that may be operated by the core processing module 220 for generating or obtaining responses.
[0052] The module set 260 may comprise a product ranking module 261. Each time an answer to the current question is received from the user, the product ranking module 261 may recalculate an expected probability or weight of each candidate product, and rank candidate products accordingly. The calculated expected probabilities or weights may be used for updating the candidate product assessing state. When one or more candidate products meet with a predetermined condition, the one or more candidate products may be determined as recommended products to be provided to the user.
[0053] The module set 260 may comprise a question ranking module 262. Before providing the next question, the question ranking module 262 may calculate a weight of each candidate question and rank the candidate questions accordingly. Question ranking may be based at least on the results of the product ranking. For example, the question ranking may consider the current expected probability or weight of each candidate product included in the current candidate product assessing state. Moreover, the question ranking may also consider the session record, etc. The top-ranked candidate question may be selected, or a candidate question may be randomly selected from the multiple top-ranked candidate questions, as the next question to be provided to the user.
[0054] It should be appreciated that although the product ranking module 261 and the question ranking module 262 are shown as separate modules, these two modules may also be combined together so that both the product ranking and the question ranking may be implemented through performing a unified process.
[0055] The module set 260 may comprise a recommendation reason generating module 263. The recommendation reason generating module 263 may generate a recommendation reason for the determined recommended product through various approaches. In one approach, the recommendation reason generating module 263 may select, from the previously-provided questions, a question resulting in the maximum expected probability rise for the recommended product, and generate the recommendation reason with reference to the selected question and an answer of the user. In another approach, the recommendation reason generating module 263 may generate the recommendation reason through a pre-trained recommendation reason generating model. [0056] The module set 260 may comprise a sentence parsing module 264. When the user answers a question or sends a message in a natural language sentence, the sentence parsing module 264 may parse the natural language sentence in order to identify product attributes desired by the user.
[0057] The module set 260 may comprise a response providing module 265. The response providing module 265 may be configured for providing or delivering a response to a message of the user. In some implementations, the response provided by the response providing module 265 may comprise product information, a recommendation reason, etc. for the determined recommended product.
[0058] The core processing module 220 may provide the determined response to a response queue or response cache 234. For example, the response cache 234 may ensure that the response sequence may be displayed in an appropriate timing. Assuming that for a message, more than two responses are determined by the core processing module 220, a time delay setting for the responses may be necessary. For example, if a message input by the user is "Did you have breakfast?", two responses may be determined, e.g., a first response "Yes, I ate bread" and a second response "What about you? Are you still hungry?". In this case, through the response cache 234, the chatbot may ensure that the first response is provided to the user immediately. Furthermore, the chatbot may ensure that the second response is provided with a time delay of, e.g., 1 or 2 seconds, so that the second response will be provided to the user 1 or 2 seconds after the first response. Thus, the response cache 234 may manage responses to be sent and appropriate timing for each response.
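A minimal sketch of this timing behavior, assuming an in-memory queue where each queued response carries a delay relative to the previous one; response_cache and flush_responses are hypothetical names, not components of the disclosure.

```python
import time
from collections import deque

response_cache = deque([
    ("Yes, I ate bread", 0.0),                       # sent immediately
    ("What about you? Are you still hungry?", 2.0),  # sent 2 seconds later
])

def flush_responses(cache: deque) -> None:
    while cache:
        text, delay = cache.popleft()
        time.sleep(delay)  # wait the configured gap before delivering
        print(text)        # stand-in for displaying the response in the UI

flush_responses(response_cache)
```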
[0059] The responses in the response queue or response cache 234 may be further transmitted to the UI 210 so that the responses may be displayed to the user in the chat window.
[0060] It should be appreciated that all the units shown in the chatbot system 200 in FIG. 2 are exemplary, and according to specific application requirements, any of the units shown in the chatbot system 200 may be omitted and any other units may be involved. [0061] FIG. 4 illustrates an exemplary overall process 400 for providing explainable product recommendation according to an embodiment.
[0062] At 410, a message from a user may be received. The message may indicate the user's intention to obtain product recommendation. For example, the message may be "Recommend a gift for me."
[0063] At 420, a product category to which a product to be recommended belongs may be determined based at least on the message received at 410. For example, when the received message is "Recommend a gift for me", it may be determined that the product category is "gift". For example, when the received message is "Recommend an electronic goods for me", it may be determined that the product category is "electronic goods".
[0064] At 430, multiple rounds of session with the user may be conducted based on the LTE architecture proposed by the embodiments of the present disclosure. In the multiple rounds of session, questions associated with product recommendation under the determined product category may be dynamically provided, the user's answers to these questions may be collected, and a recommended product and a recommendation reason may be determined. This will be discussed in detail later in connection with FIG. 5.
[0065] At 440, product information and the recommendation reason of the recommended product may be provided to the user as a response.
[0066] FIG. 5 illustrates an exemplary specific process 500 for providing explainable product recommendation according to an embodiment. The process 500 shows an exemplary operating process of the LTE architecture according to the embodiments of the present disclosure. Before the process 500 is performed, a user's intention to obtain product recommendation has been determined, and a product category has been determined. Therefore, the process 500 will be performed for providing product recommendation under this product category. It should be appreciated that the process 500 may be performed iteratively until a recommended product is determined.
[0067] At 502, a question associated with product recommendation may be provided to the user. The question may direct to product attributes. Optionally, multiple options may be provided to the user along with the question so that the user may select an answer from these options.
[0068] At 504, an answer to the provided question may be received from the user.
[0069] In the case that the question is attached with options, the answer of the user may be a direct selection of one or more options in the attached options. Therefore, the selection made by the user among these options may be determined at 506 in order to determine product attributes selected by the user. For example, if three options are attached to the question, and the user's answer is the index "2" of the second option or comprises an expression associated with content of the second option, it may be determined that a product attribute indicated by the second option is desired by the user. [0070] In the case that no option is attached to the question, the user's answer may be in the form of a natural language sentence. Parsing may be performed to the natural language sentence at 508. The parsing may adopt any existing intent-slot parsing techniques to detect slots and corresponding values from the natural language sentence. These values may be used as keywords for retrieving relevant questions and corresponding answers from a candidate question set. The retrieved relevant questions may comprise the question provided to the user and other questions. For example, assuming that the question provided to the user is "Do you like to stay in a quiet place?", and the user's answer is a natural language sentence "I like quiet, but I also like running", at least the keywords "quiet" and "running" may be detected from the natural language sentence. The "quiet" may be an answer to a relevant question "Do you like to stay in a quiet place?", and the "running" may be a corresponding answer to another relevant question "What are your hobbies?". Thus, at least two answers by the user to two relevant questions are obtained from the natural language sentence.
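As a non-limiting sketch of this keyword-driven retrieval, the snippet below maps detected values to relevant questions and corresponding answers; the keyword table is hypothetical, and a real system would rely on a trained intent-slot parser rather than substring matching.

```python
keyword_to_question_answer = {
    "quiet": ("Do you like to stay in a quiet place?", "Like being quiet"),
    "running": ("What are your hobbies?", "Running"),
}

def parse_answer(sentence: str) -> list:
    """Return <question, answer> pairs for every known keyword detected in the
    user's natural language sentence."""
    sentence = sentence.lower()
    return [qa for keyword, qa in keyword_to_question_answer.items()
            if keyword in sentence]

print(parse_answer("I like quiet, but I also like running"))
# [('Do you like to stay in a quiet place?', 'Like being quiet'),
#  ('What are your hobbies?', 'Running')]
```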
[0071] At 510, a <question, answer> pair may be added to a session record 512. The question may be the provided question or relevant questions, and the answer may be content of options selected by the user or corresponding answers to the relevant questions. Questions and answers included in the session record 512 may be continuously updated as the session proceeds.
[0072] At 514, product ranking may be performed on a plurality of candidate products 516 based at least on the provided question and the user's answer. Through product ranking, an expected probability of each candidate product may be updated, and the calculated expected probability may be included in a candidate product assessing state 518. In an implementation, the candidate product assessing state 518 may be in the form of vector, wherein the vector’s dimension corresponds to the number of candidate products, and a value of each dimension corresponds to an expected probability of a candidate product. Moreover, as described above, an expected probability of a candidate product may be replaced by a weight of the candidate product, so that the candidate product assessing state 518 comprises a weight of each candidate product.
[0073] At 520, it may be determined, based on the result of the product ranking, whether there exists at least one candidate product in the plurality of candidate products which meets with a predetermined condition. The predetermined condition may indicate whether a candidate product reaches a condition of being determined as a recommended product. For example, the predetermined condition may be that an expected probability or weight of a candidate product is above a threshold, wherein the threshold may be set empirically in advance.
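A trivial, non-limiting sketch of this predetermined condition, assuming an example threshold value of 0.6:

```python
THRESHOLD = 0.6  # assumed example value; set empirically in practice

def recommended_products(expected_probabilities: list) -> list:
    """Return indices of candidate products whose expected probability is
    above the threshold; an empty list means a further question is needed."""
    return [m for m, p in enumerate(expected_probabilities) if p > THRESHOLD]

print(recommended_products([0.10, 0.72, 0.18]))  # [1]
```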
[0074] If it is determined at 520 that there is no candidate product that meets with the predetermined condition, the process 500 may provide a further question to the user. At 522, question ranking may be performed on a plurality of candidate questions 524 in order to determine a next question to be provided to the user. The question ranking may be based at least on the result of the product ranking, e.g., the current expected probability or weight of each candidate product included in the current candidate product assessing state. The question ranking may also be based on historical questions and historical answers in the session record 512. In the question ranking, a weight or information gain of each candidate question may be calculated, and the candidate questions may be ranked according to weights or information gain. The question ranking may be performed in various approaches, e.g., through entropy-based ranking 522-1, through policy-based reinforcement learning ranking 522-2, etc., which will be discussed in detail later.
[0075] At 526, the next question to be provided to the user may be selected based on the weights of the candidate questions. For example, the top-ranked candidate question may be selected as the next question, a candidate question may be randomly selected from the multiple top-ranked candidate questions as the next question, etc.
[0076] After the next question is selected, the process 500 iteratively returns to 502 to provide the selected next question to the user, and then performs the subsequent steps. [0077] If it is determined at 520 that there is at least one candidate product that meets with the predetermined condition, the at least one candidate product may be determined as a recommended product to be provided to the user at 528.
[0078] At 530, a recommendation reason for the recommended product may be determined. The recommendation reason may be determined in various approaches.
[0079] In an approach, at least one question that contributes the most among the questions previously provided to the user and a corresponding answer of the user may be selected to generate the recommendation reason. For example, a question resulting in the maximum expected probability rise for the recommended product may be selected from the historical questions provided to the user, and the recommendation reason may be generated with reference to the selected question and an answer of the user. The recommendation reason may be constructed based on the selected question and the answer according to various predefined rules. The recommendation reason may comprise a simple repetition of at least one of the selected question and the answer. The recommendation reason may comprise a transformed expression of at least one of the selected question and the answer. For example, the recommendation reason may comprise an expression "like fast food" transformed from an answer "like burgers of KFC". The recommendation reason may comprise a generalized expression of at least one of the selected question and the answer. For example, content of the question and answer may be semantically generalized through any natural language processing technique. The recommendation reason may comprise some words or phrases commonly used in a free chat to make a sentence’s expression more natural. For example, expressions such as "The reason why I gave the above recommendation is ...", "Considering ..., I decided to recommend ...", etc. may be added to the recommendation reason.
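A minimal sketch of this rule-based construction, assuming the expected probability rise of each historical question has already been computed; the helper names and the template wording (modeled on the FIG. 8 example) are illustrative only.

```python
def select_best_question(history: list) -> tuple:
    """history: (question_index, answer, probability_rise) tuples; pick the
    question resulting in the maximum expected probability rise."""
    return max(history, key=lambda item: item[2])

def build_reason(history: list, product_name: str) -> str:
    question_index, answer, _ = select_best_question(history)
    return (f"According to your selection of '{answer}' for question "
            f"{question_index}, I recommend you to buy a {product_name}.")

history = [(5, "Indoor", 0.02), (6, "Riverside", 0.31)]
print(build_reason(history, "fishing rod"))
# According to your selection of 'Riverside' for question 6,
# I recommend you to buy a fishing rod.
```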
[0080] In another approach, a recommendation reason generating model may be previously trained for generating a recommendation reason based on at least one of attributes selected by historical answers in the session record, attributes of a recommended product, description of the recommended product, etc., which will be discussed in detail later.
[0081] At 532, a response including product information and the recommendation reason of the recommended product may be provided to the user. The product information may be extracted from, e.g., the product database 150 in FIG. 1 or the candidate product set 252 in FIG. 2.
[0082] It should be appreciated that all the processing steps and their order included in the process 500 in FIG. 5 are exemplary, and any addition, deletion, or replacement may be made to the processing steps in the process 500 according to specific application requirements. For example, in an implementation, in the case that the chatbot receives a natural language message sent by the user and the message does not direct to any question associated with product recommendations provided by the chatbot, the natural language message may be directly parsed at 508 in order to determine corresponding relevant questions and corresponding answers, and then the subsequent processing may be performed. Moreover, in an implementation, other decision conditions for determining recommended products may be defined or added at 520. For example, a threshold of the number of questions provided to the user may be predefined. If it is determined that the number of questions having been provided to the user in the process 500 is above the threshold, a recommended product may be directly determined at 528, which may be, e.g., a candidate product currently having the highest expected probability or weight.
[0083] Methods for selecting a question through the entropy-based ranking will be discussed below, which may be used for ranking candidate questions and then selecting a next question to be provided to the user from the candidate questions.
[0084] The next question may be selected so that it may exclude as many candidate products with low possibility of being recommended products as possible, regardless of the user's answer to the question. The next question may be selected so that it may divide the candidate products into two subsets with similar sizes or similar weights. For example, when the answer to the question is a binary answer, e.g., "yes" or "no", the candidate products may be divided into a subset with a reference answer of "yes" and a subset with a reference answer of "no". The two subsets may have equal or approximate number of candidate products, or have candidate products with equal or approximate cumulative weights. For example, when the answer to the question has more than two options, e.g., "cheap", "medium" and "expensive", the candidate products may be divided into three subsets with reference answers of "cheap", "medium" and "expensive" respectively, and these subsets have similar sizes or similar weights. In an implementation, after determining product attributes currently selected by the user through a round of session, those candidate products that conform to the attributes currently selected by the user may be used as a candidate product set to be considered when determining a question for the next round of session. Through continuously performing the above process, the size of the candidate product set for determining a next question may be continuously reduced.
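As a non-limiting sketch of this balanced-split heuristic for binary questions, the snippet below prefers the question whose "yes"/"no" reference answers divide the cumulative weights of the candidate products most evenly; the weights and reference answers are hypothetical.

```python
def split_imbalance(weights: list, reference_answers: list, question: str) -> float:
    """Smaller is better: 0 means the question splits the cumulative weight of
    the candidate products exactly in half."""
    yes_mass = sum(w for w, answers in zip(weights, reference_answers)
                   if answers[question] == "yes")
    return abs(2 * yes_mass - sum(weights))

weights = [0.4, 0.3, 0.2, 0.1]
reference_answers = [
    {"q1": "yes", "q2": "yes"},
    {"q1": "yes", "q2": "no"},
    {"q1": "yes", "q2": "no"},
    {"q1": "no",  "q2": "yes"},
]
best = min(["q1", "q2"], key=lambda q: split_imbalance(weights, reference_answers, q))
print(best)  # 'q2': it splits the weight 0.5/0.5, while 'q1' splits it 0.9/0.1
```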
[0085] In the entropy-based ranking, for a product category $i$, each candidate product $p_m$ may be initially assigned a prior probability weight $w(p_m)$. The weight may be set with reference to search frequency for the candidate product on search engines, or review scores or order frequency for the candidate product on e-commerce websites. Then, $w(p_m)$ may be normalized as:

$$\tilde{w}(p_m) = \frac{w(p_m)}{\sum_{m'} w(p_{m'})} \qquad (1)$$
[0086] For a candidate product $p_m$, its contribution to the selection of a candidate question $q_n$ may be calculated as:

$$w(p_m, q_n, l) = \text{freq}(p_m, q_n, l) + \alpha \cdot I\big(A[m][n][l] = \text{yes}\big) \qquad (2)$$

wherein $\text{freq}(p_m, q_n, l)$ represents the frequency of finally selecting the candidate product $p_m$ after users select an option $l$ of the candidate question $q_n$ in historical data. The historical data may be obtained by collecting usage information of a large number of users, thereby $\text{freq}(p_m, q_n, l)$ reflects historical usage information of a large number of users. $I(\cdot)$ is an indicator function, which returns 1 when $A[m][n][l] = \text{yes}$ holds, otherwise returns 0, wherein $A[m][n][l] = \text{yes}$ indicates that the option $l$ is a reference answer by the candidate product $p_m$ to the candidate question $q_n$. The parameter $\alpha$ is used for balancing the historical usage information with the reference answers in a reference answer matrix $A$. For example, when $\alpha$ is 0, only the historical usage information is considered, and the reference answers are ignored. When $\alpha$ is set to an extremely large value, the reference answers are considered to a greater extent while the historical usage information is ignored. Moreover, $\alpha$ may also be extended to a time decay function $\alpha(t)$, wherein $t$ represents time.
[0087] In an implementation, a negative Shannon entropy may be used for a multivariate Bernoulli distribution of the options, and a parameter $M_{mn}$ is calculated as:

$$M_{mn} = \sum_{l} \tilde{w}(p_m, q_n, l) \log \tilde{w}(p_m, q_n, l) \qquad (3)$$

wherein $\tilde{w}(p_m, q_n, l)$ denotes $w(p_m, q_n, l)$ normalized over the options $l$ of the question $q_n$.

[0088] Then, a weight $w(q_n)$ of the candidate question $q_n$ may be calculated as:

$$w(q_n) = \sum_{m} \tilde{w}(p_m) \cdot M_{mn} \qquad (4)$$
[0089] The above process may be performed for each candidate question $q_n$ in a candidate question set in order to calculate a weight of each candidate question. The candidate questions may be ranked based on the weights $w(q_n)$, and a next question to be provided to the user may be selected from the ranked candidate questions.
[0090] Based on the user's answer $a$ to the question $q_n$, the weight of each candidate product $p_m$ may be updated as:

$$w(p_m) \leftarrow w(p_m) \cdot \big(\text{freq}(p_m, q_n, a) + \alpha \cdot I(A[m][n][a] = \text{yes})\big) \qquad (5)$$
[0091] In the case that the user's answer $a$ does not match any option of the question, $\text{freq}(p_m, q_n, a)$ will be 0. In this case, a return value of $I(\cdot)$ may be set to a very small value, e.g., 0.01, instead of 0, in order to avoid completely discarding this candidate product.

[0092] The calculated weight of each candidate product may be added to a candidate product assessing state. If a certain candidate product meets with a predetermined condition, e.g., the weight or expected probability of the candidate product is above a threshold, the candidate product may be determined as a recommended product.
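A minimal sketch of the Equation (5)-style weight update and the threshold check described in paragraphs [0091] and [0092]; the fallback value of 0.01 follows the text, while the function names and the `matched` flag are assumptions.

```python
import numpy as np

def update_product_weights(w, freq_a, ref_yes_a, alpha=1.0, matched=True, eps=0.01):
    """Update candidate product weights after the user gives answer a to
    question q_n, per reconstructed Equation (5).

    w:         (M,) current product weights.
    freq_a:    (M,) freq(p_m, q_n, a) for the given answer.
    ref_yes_a: (M,) indicator I(A[m][n][a] = yes) in {0, 1}.
    matched:   False when the answer matches no option of the question; then
               freq is 0 and I(.) falls back to eps (paragraph [0091]) so
               that no candidate product is discarded completely.
    """
    indicator = ref_yes_a if matched else np.full_like(w, eps)
    w_new = w * (freq_a + alpha * indicator)
    return w_new / w_new.sum()

def meets_condition(w_new, threshold=0.8):
    """Return the index of a candidate product whose expected probability
    is above the threshold, or None if no candidate qualifies yet."""
    m = int(np.argmax(w_new))
    return m if w_new[m] > threshold else None
```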
[0093] If no candidate product meets with the predetermined condition, the entropy-based ranking described above may be performed iteratively again. For example, $w(p_m)$ may be normalized as $\tilde{w}(p_m)$ through Equation (1) again, and the subsequent processing may be continued.
[0094] The subscripts $t$ and $t+1$ may be used for representing the current stage and the next stage, respectively, thus obtaining $w_t(p_m)$, representing a weight of a candidate product used in the current stage, and $w_{t+1}(p_m)$, representing a weight of a candidate product that will be used in the next stage after the current stage. Then, a weight rise or expected probability rise of the candidate product $p_m$, caused by the current question $q_t$ provided to the user and the user's answer $a$, may be calculated as:

$$\Delta w_t(p_m) = w_{t+1}(p_m) - w_t(p_m) \qquad (6)$$
[0095] In an implementation, after a candidate product is determined as a recommended product, the question resulting in the maximum weight rise or maximum expected probability rise of the recommended product, together with the user's corresponding answer, may be selected for generating a recommendation reason for the recommended product. For example, assume that the first question causes the expected probability of the recommended product to change from 0 to 0.2, the second question causes the expected probability to change from 0.2 to 0.6, and the third question causes the expected probability to change from 0.6 to 0.9, where the expected probability 0.9 is above a threshold of 0.8. Since the first question results in an expected probability rise of 0.2, the second question in a rise of 0.4, and the third question in a rise of 0.3, the second question, which results in the maximum expected probability rise (i.e., 0.4), and the user's corresponding answer may be selected for generating the recommendation reason. Moreover, as described above, the recommendation reason may also be generated by the recommendation reason generating model.
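The selection of the reason-bearing question follows directly from tracking the expected probability after each round, as in the worked example above; this small helper is hypothetical.

```python
def reason_question(prob_history):
    """Given the recommended product's expected probability after each
    round, return the 0-based index of the question that caused the
    largest rise; mirrors the 0 -> 0.2 -> 0.6 -> 0.9 example above.
    """
    rises = [b - a for a, b in zip(prob_history, prob_history[1:])]
    return max(range(len(rises)), key=rises.__getitem__)

# Example from the text: question 2 (index 1) gives the largest rise, 0.4.
assert reason_question([0.0, 0.2, 0.6, 0.9]) == 1
```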
[0096] Through performing the entropy-based ranking described above, in one aspect, weights of the candidate questions may be continuously updated during the session for selecting the next question to be provided to the user, and in another aspect, weights or expected probabilities of the candidate products may be continuously updated so that the recommended product may be determined.
[0097] It should be appreciated that all the above equations are exemplary, which are only used for illustrating exemplary processing procedures, and the embodiments of the present disclosure are not limited to any of the above specific equations. For example, instead of Equation (3) and Equation (4), the weight of the candidate question qn may also be calculated through the following process.
[0098] After calculating $w(p_m, q_n, l)$ through Equation (2), a value $c_{n,l}$ representing the importance of the option $l$ in the candidate question $q_n$ may then be calculated as:

$$c_{n,l} = \sum_{m} \tilde{w}(p_m) \cdot w(p_m, q_n, l) \qquad (7)$$

[0099] Then, the negative variance of $c_{n,l}$ is used for calculating the weight $w(q_n)$ of the question $q_n$:

$$w(q_n) = -\text{Var}_l\big(c_{n,l}\big) \qquad (8)$$
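The variance-based alternative of reconstructed Equations (7) and (8) may be sketched as follows; the names and shapes are assumptions, reusing the per-option contributions of Equation (2).

```python
import numpy as np

def variance_question_weights(prior_w, contrib):
    """Alternative question weighting per reconstructed Equations (7)-(8).

    prior_w: (M,)      normalized candidate product weights.
    contrib: (M, N, L) per-option contributions from Equation (2).
    """
    # Importance c_{n,l} of option l in question q_n, Equation (7).
    c = np.einsum('m,mnl->nl', prior_w, contrib)
    # Negative variance across options, Equation (8): a question whose
    # options are comparably important scores higher.
    return -c.var(axis=1)                                # shape (N,)
```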
[00100] Moreover, it should be appreciated that, in the above description of the entropy-based ranking, the category index $i$ is omitted from most variables for simplicity.
[00101] Methods for selecting a question through policy-based reinforcement learning ranking will be discussed below. These methods may be used for ranking candidate questions and then selecting, from the candidate questions, the next question to be provided to the user.
[00102] The policy-based reinforcement learning algorithm may be used for predicting a specific entity (e.g., a celebrity, etc.) under a constraint of allowing the user to answer single-attribute questions. A single-attribute question may refer to a question aiming at obtaining a binary answer (e.g., yes, no, etc.). The embodiments of the present disclosure adapt this algorithm to be suitable for multi-option questions for explainable product recommendation. For example, the algorithm may be adapted to be suitable for a task-oriented scenario in which questions are attached with a plurality of options. The user may select any subset of these options. This may be construed as a jump across several single-attribute questions. Further, when the user expresses requirements in a natural language sentence involving answers to a plurality of questions, a jump across multiple single-attribute questions or combined questions may be achieved in a single round of session with the user.
[00103] The question ranking may be summarized as a finite Markov decision process (MDP) represented by a 5-tuple $(S, A, P, R, \gamma)$, wherein $S$ is a continuous candidate product assessing state space, and each state $s$ in $S$ represents a vector storing an expected probability of each candidate product; $A = \{q_1, q_2, \ldots, q_N\}$ is a candidate question set; $P(S_{t+1} = s' \mid S_t = s, A_t = q)$ is a state transition probability matrix; $R(s, q)$ represents a reward function or reward network; and $\gamma \in [0, 1]$ is a decay factor to discount the long-term return value. In the policy-based reinforcement learning algorithm, at each time step $t$, the chatbot may provide a candidate question $q_t$ under the current candidate product assessing state $s$ according to a policy function or policy network $\pi(q \mid s)$. After providing the candidate question $q_t$ and receiving the user's answer to $q_t$, a reward score $r$ may be generated, and the candidate product assessing state $s$ may be updated to $s'$. A quadruple $(s, q_t, r, s')$ may be used as an episode in the reinforcement learning process. The long-term reward at the time step $t$ may be defined as $R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k}$.
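As an illustration of the long-term reward just defined, the standard backward accumulation computes $R_t$ for every step of a finite episode; this helper is a sketch and not part of the original disclosure.

```python
def discounted_returns(rewards, gamma=0.9):
    """Compute R_t = sum_k gamma^k * r_{t+k} for each step t of a finite
    episode, by accumulating backwards over the immediate rewards."""
    returns, R = [], 0.0
    for r in reversed(rewards):
        R = r + gamma * R
        returns.append(R)
    return returns[::-1]

# e.g., discounted_returns([0.1, 0.2, 1.0], gamma=0.9)
# -> [0.1 + 0.9*0.2 + 0.81*1.0, 0.2 + 0.9*1.0, 1.0]
```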
[00104] The candidate product assessing state $s_t$ may keep track of the confidence of each candidate product $p_m$ at the time step $t$, e.g., its expected probability. For example, $s_t = [s_{t,1}, s_{t,2}, \ldots, s_{t,M}]$, where $s_{t,m}$ represents the confidence that the product $p_m$ is desired by the user at the time step $t$. Initially, similar to the above entropy-based ranking, $s_0$ may take the prior expected probabilities of the candidate products.
[00105] Given a candidate product set $P = \{p_1, p_2, \ldots, p_M\}$ and a candidate question set $Q = \{q_1, q_2, \ldots, q_N\}$, a normalized confidence of the user's answer over the multiple options of each question $q_n$ may be calculated. That is, the transition of the candidate product assessing state may be defined as:

$$s_{t+1} = s_t \odot b \qquad (9)$$

Here, $\odot$ is a dot product operator, and $b$ depends on the user's answer $x_t$ to the question $q_t$ selected at the time step $t$, wherein $q_t$ has an index $n_t$ in the candidate question set $Q$. When the user selects an option $l$ for the answer to the current question, $b_m = \tilde{w}(p_m, q_{n_t}, l)$ may be defined, which adopts a definition similar to that in Equation (2). In this way, the confidence $s_{t,m}$ of the candidate product $p_m$ may be updated as $s_{t+1,m}$ based on the user's answer $x_t$ to the question $q_t$ at the time step $t$.
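The transition of reconstructed Equation (9) amounts to an elementwise reweighting of the assessing state by the answer-dependent vector $b$; the renormalization in this sketch is an added assumption so that the state remains a probability vector.

```python
import numpy as np

def transition(s_t, b):
    """Update the candidate product assessing state per reconstructed
    Equation (9): elementwise product with b, then renormalize (assumed).

    s_t: (M,) expected probabilities of the M candidate products.
    b:   (M,) b_m for the selected option, defined analogously to
         Equation (2) and normalized over options.
    """
    s_next = s_t * b
    return s_next / s_next.sum()
```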
[00106] In order to allow the policy-based reinforcement learning algorithm to take previously-provided questions as a precondition for determining the next question, and to utilize the user's historical selections for ranking candidate questions, the embodiments of the present disclosure propose to utilize a neural network-based LTE reward network, which takes a quadruple $(s_t, q_t, x_t, s_{t+1})$ as input and outputs a reward for the next step. The LTE reward network adopts a multi-layer perceptron (MLP) with a sigmoid output in order to learn appropriate immediate rewards during training. Table 1 below shows an exemplary training process.
[Table 1, an exemplary training process for the reward network R, the policy function, and the value network V, is rendered as an image in the original publication and is not recoverable from this text extraction; its numbered lines 1.1 to 1.21 are referenced in paragraph [00109] below.]

Table 1
[00107] In the process shown in Table 1, the question $q_t$ and the corresponding answer $x_t$ are embedded, and the resulting embedding vector is concatenated with $s_t$ for training the reward network R with, e.g., a squared difference loss function. The reward network is further used for training the policy function to rank the candidate questions and select the next question. The policy function may be trained by using a reinforcement learning algorithm under, e.g., a cross-entropy loss function.
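A minimal PyTorch sketch of such a reward network follows; beyond "an MLP with sigmoid output", the architecture is not specified in the text, so the embedding dimensions, layer sizes, and all names below are assumptions.

```python
import torch
import torch.nn as nn

class RewardNetwork(nn.Module):
    """Scores a quadruple (s_t, q_t, x_t, s_{t+1}): the question and answer
    are embedded, concatenated with the state vectors, and passed through
    an MLP with a sigmoid output, as described in the text."""

    def __init__(self, n_products, n_questions, n_options, emb_dim=32, hidden=128):
        super().__init__()
        self.q_emb = nn.Embedding(n_questions, emb_dim)
        self.x_emb = nn.Embedding(n_options, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * n_products + 2 * emb_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, s_t, q_t, x_t, s_next):
        # Concatenate state, question embedding, answer embedding, next state.
        z = torch.cat([s_t, self.q_emb(q_t), self.x_emb(x_t), s_next], dim=-1)
        return self.mlp(z).squeeze(-1)

# Training with a squared-difference loss against reference rewards:
# loss = nn.functional.mse_loss(reward_net(s, q, x, s2), target_r)
```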
[00108] The value network V may be used for scoring the goodness of the current state $s_t$. The value network may estimate how good the current state itself is to be selected in an episode. The value network may use, e.g., a squared difference loss function, and take a cumulative reward $R_t$ as a reference score. After updating, the newly estimated score $V(s_t)$ is subtracted from the cumulative reward $R_t$ for further updating $R$ and $\pi_q$, respectively.
[00109] Four loops are included in Table 1. The first loop is from line 1.3 to line 1.21, which controls the number of epochs to be within Z. The second loop is from line 1.5 to line 1.9, which applies the policy function to select a question and update the candidate product assessing state. In an implementation, a candidate question may be restricted to be selected and used only once during a session. The result obtained in the second loop is stored in the episode memory M for use in subsequent steps. The third loop is from line 1.11 to line 1.13, which applies the reward network to obtain the immediate reward. The fourth loop is from line 1.14 to line 1.21, which updates parameters in the policy function and the reward network by picking mini-batches from the episode memory M. In an implementation, the policy function and the reward network may adopt an MLP with, e.g., 3 hidden layers, and utilize an algorithm based on the Adam optimizer.
[00110] At the stage of applying the policy-based reinforcement learning ranking, an expected probability rise at the step $t$ for all candidate products, resulted by the current question $q_t$ and the user's answer $x_t$, is:

$$\Delta s_t = s_{t+1} - s_t \qquad (10)$$
[00111] Thus, the recommended product may be determined through comparing an expected probability of each candidate product with a predetermined condition.
[00112] It should be appreciated that all the equations, variables, parameters, etc. involved in the above discussion on the entropy-based ranking and the policy-based reinforcement learning ranking are exemplary. According to specific application requirements, these equations, variables, parameters, etc. may be deleted, replaced, or added in any approach. The embodiments of the present disclosure are not limited to any details discussed above.
[00113] FIG. 6 illustrates an exemplary process 600 for training a recommendation reason generating model according to an embodiment.
[00114] Training data for training the recommendation reason generating model may be collected from the network, e.g., an e-commerce website. A product set 610 for generating training data may be identified or specified first. For each product in the product set 610, category information 640 of the product may be further obtained from the website that provides the product. Generally, category information of a product may comprise a series of categories at different levels. For example, for the product "salmon", it may correspond to multiple categories at different levels, e.g., "food", "seafood", "fish", etc.
[00115] In some cases, after a user purchases a product from an e-commerce website, he may provide reviews for the product, which may contain explainable reasons for purchasing the product in natural language. Therefore, reviews 620 for the products in the product set 610 may be collected. Optionally, the process 600 may perform filtering on the reviews 620 to obtain filtered reviews 622. For example, sentiment analysis may be performed on reviews to filter out negative reviews while retaining positive reviews. Moreover, for example, a predefined expression pattern may be adopted for detecting validity of the reviews, and invalid reviews containing too few words or too many repeated characters in the expression may be filtered out. It should be appreciated that the term "review" involved below may broadly refer to either or both of the reviews 620 and the filtered reviews 622. The process 600 may extract attribute information 650 of products from the reviews 620 or the filtered reviews 622. For example, for a review "These shoes are great! Super soft, shock-absorbing, and very light", attributes of the product "shoes", e.g., "soft", "shock-absorbing", "light", etc., may be extracted.
[00116] Generally, the e-commerce website may also provide descriptions of products, e.g., products' characteristics, parameters, etc. These descriptions usually explicitly comprise various attributes of the products expressed in natural language. Therefore, descriptions 630 for the products in the product set 610 may be collected. Optionally, the process 600 may perform summarization on the product descriptions 630 to obtain product description summaries 632. In some cases, a product description may be long, so only the main content of the description may be used for subsequent training. Existing unsupervised text ranking algorithms may be used for performing summarization on the description. It should be appreciated that the term "product description" involved below may broadly refer to either or both of the product descriptions 630 and the product description summaries 632. The process 600 may extract attribute information 650 of products from the product descriptions 630 or the product description summaries 632.

[00117] Through the data collecting process described above, a training data set 660 in the form of <attribute + description, review> pairs may be formed. Each <attribute + description, review> data pair is associated with a specific product. The training data set 660 may be used for training a recommendation reason generating model 670. The "attribute + description" in the training data may be used as input of the model, and the "review" as output of the model. The recommendation reason generating model 670 may adopt a Transformer architecture in which both the encoding part and the decoding part may adopt a self-attention mechanism with positional encoding for sequential dependency learning. In the training, the encoding part may process attributes and description of a product, while the decoding part may process previous reviews for the product by different purchasers. The trained recommendation reason generating model 670 may generate recommendation reasons in natural language similar to reviews.
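Assembling the <attribute + description, review> pairs from the collected data might look like the sketch below; the field names and data layout are assumptions, and sentiment filtering and summarization are presumed to have been applied already.

```python
def build_training_pairs(products):
    """Form <attribute + description, review> pairs for training the
    recommendation reason generating model.

    products: list of dicts with hypothetical fields 'attributes' (list of
    strings), 'description' (string, possibly a summary), and 'reviews'
    (list of sentiment-filtered review strings).
    """
    pairs = []
    for p in products:
        source = ' '.join(p['attributes']) + ' ' + p['description']
        for review in p['reviews']:
            pairs.append((source, review))  # model input, model output
    return pairs
```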
[00118] It should be appreciated that all the steps in the process 600 are exemplary, and the process 600 may be changed in any form according to specific application requirements and designs. For example, when the number of reviews collected for a product from the network is small, the product description, instead of the reviews, may be used for constructing an explainable reason. For example, the training data may take the form of <attribute, description> pairs, so that when training the recommendation reason generating model, the "attribute" may be used as input of the model, and the "description" as output of the model.
[00119] FIG. 7 illustrates an exemplary process 700 for generating a recommendation reason according to an embodiment. As described above, after a recommended product is determined, a recommendation reason of the recommended product may be further determined in order to be provided to the user in a response. A recommendation reason generating model 710, which may be previously trained through the process 600 in FIG. 6, is adopted in the process 700 for generating a recommendation reason for a recommended product.
[00120] A session record 702 may comprise historical questions provided by a chatbot in a session and historical answers provided by a user. The user's historical answers may indicate product attributes 704 selected or desired by the user. The attributes 704 of a recommended product selected by the user may be provided to the recommendation reason generating model 710 as input.
[00121] Attributes 706 of the recommended product may be obtained through, e.g., the process 600 in FIG. 6 and provided to the recommendation reason generating model 710 as input. Moreover, a description 708 of the recommended product may also be obtained through, e.g., the process 600 in FIG. 6 and provided to the recommendation reason generating model 710 as input.
[00122] The recommendation reason generating model 710 may generate a recommendation reason 720 for the recommended product according to at least one of the attributes 704 selected by the user, the attributes 706 of the recommended product, and the description 708 of the recommended product. Optionally, for the question resulting in the maximum expected probability rise for the recommended product, the attributes selected by the user, as indicated by the answer to that question, may be given a higher weight in the process of generating a recommendation reason.
[00123] FIG. 8 illustrates an exemplary chat window 800 according to an embodiment.

[00124] After receiving a message "Help me find a gift" provided by a user, a chatbot may determine that the user wants to obtain product recommendations and that the product category is "gift". Then, the chatbot may provide multiple questions to the user through multiple rounds of session and receive the user's answers. These questions may be dynamically determined sequentially according to the embodiments of the present disclosure discussed above. Each question is attached with options, and accordingly, the user's answers comprise explicit selections of options. Finally, the chatbot provides the user with a response "According to your selection of 'Riverside' for question 6, I recommend you to buy a fishing rod". The response comprises the product information "fishing rod" of the recommended product, wherein the recommended product may be determined according to the embodiments of the present disclosure discussed above. The response also comprises a recommendation reason "according to your selection of 'Riverside' for question 6", wherein the recommendation reason may be generated according to the embodiments of the present disclosure discussed above, e.g., generated based on question 6, which results in the maximum expected probability rise for the recommended product, and its corresponding answer.
[00125] FIG. 9 illustrates an exemplary chat window 900 according to an embodiment.

[00126] After receiving a message "Recommend an electronic goods for me as a gift" provided by a user, a chatbot may determine that the user wants to obtain a product recommendation and that the product categories are "gift" and "electronic goods". Then, the chatbot may provide multiple questions to the user through multiple rounds of session and receive the user's answers. Finally, the chatbot provides the user with a response "Considering that you like 'running', I recommend you to buy a smart bracelet". The response comprises the product information "smart bracelet" of the recommended product and a recommendation reason "considering that you like 'running' ". The recommended product and the recommendation reason may be determined according to the embodiments of the present disclosure discussed above, wherein the recommendation reason may be generated based on question 5, which results in the maximum expected probability rise for the recommended product, and its corresponding answer.
[00127] Through the comparison between FIG. 8 and FIG. 9, it can be seen that the chatbot may dynamically determine the next question based at least on the user’s answer to each question.
[00128] FIG. 10 illustrates an exemplary chat window 1000 according to an embodiment. In the session of FIG. 10, a chatbot may provide a user with product recommendation involving hotel reservation services. Question 1 and question 2 are attached with options for the user to select. Question 3 to question 6 are not attached with options, and product attributes desired by the user may be determined through parsing the user's responses in natural language sentences via, e.g., the step 508 in FIG. 5. Further, relevant questions and corresponding answers that correspond to the user's answers may be determined through parsing, which are used for determining the next question.
[00129] FIG. 11 illustrates an exemplary chat window 1100 according to an embodiment. In the session of FIG. 11, a chatbot may provide a user with product recommendation involving gifts. The first question and the second question are attached with options, and the user's answers take the form of natural language sentences. The user's answers may be identified by, e.g., the step 506 in FIG. 5 to determine options selected by the user, or the user's answers may be parsed by, e.g., the step 508 of FIG. 5 to determine product attributes desired by the user and to determine relevant questions and corresponding answers. Two recommendation reasons are included in a response provided by the chatbot. The first recommendation reason "I recommend it based on the keywords 'quiet' and 'yoga/reading' " may be generated based on the first question and the second question, which result in the maximum expected probability rise for the recommended product, and their corresponding answers. The second recommendation reason "I hope these gifts can help her enjoy life in a quieter environment" may be generated by a recommendation reason generating model.
[00130] FIG. 12 illustrates an exemplary chat window 1200 according to an embodiment. In the session in FIG. 12, a user proactively sends a message in a natural language sentence "I can tell you that she likes quiet places, music, and yoga". The message may be parsed through, e.g., the step 508 in FIG. 5 to determine a set of relevant questions and corresponding answers, e.g., a question "Does she like being quiet?" and a corresponding answer "Like being quiet", a question "Her hobby?" and a corresponding answer "Music", "Yoga", etc. The "quiet", "music", "yoga", etc. described above may all be considered as the user's desired product attributes, and further be used for performing subsequent product ranking, question ranking, etc.
[00131] FIG. 13 illustrates a flowchart of an exemplary method 1300 for providing explainable product recommendation in a session according to an embodiment.
[00132] At 1310, at least one question associated with product recommendation may be provided.
[00133] At 1320, an answer to the at least one question may be received.
[00134] At 1330, it may be determined whether there exists at least one recommended product based at least on the at least one question and the answer.
[00135] At 1340, in response to determining that there exists the at least one recommended product, a recommendation reason of the at least one recommended product may be generated.
[00136] At 1350, a response including product information and the recommendation reason of the at least one recommended product may be provided.
[00137] In an implementation, the determining whether there exists at least one recommended product may comprise: performing product ranking to a plurality of candidate products based at least on the at least one question and the answer; and determining, based on a result of the product ranking, whether there exists at least one candidate product in the plurality of candidate products which meets with a predetermined condition.
[00138] In an implementation, the performing product ranking may comprise: updating a candidate product assessing state based at least on the at least one question and the answer, the candidate product assessing state comprising an expected probability of each candidate product in the plurality of candidate products.
[00139] In an implementation, the predetermined condition may comprise: an expected probability of a candidate product is above a threshold.
[00140] In an implementation, the method 1300 may further comprise: adding the at least one question and the answer into a session record of the session, the session record comprising historical questions and historical answers associated with the product recommendation in the session.
[00141] In an implementation, the generating a recommendation reason may comprise: determining an expected probability rise resulted by each historical question in the session record to the at least one recommended product; selecting a historical question resulting in the maximum expected probability rise; and generating the recommendation reason based at least on the selected historical question and corresponding answer.
[00142] In an implementation, the generating a recommendation reason may comprise: generating the recommendation reason based on at least one of attributes selected by the historical answers, attributes of the recommended product, and description of the recommended product, through a recommendation reason generating model.
[00143] In an implementation, the method 1300 may further comprise: in response to determining that there does not exist the at least one recommended product, performing question ranking to a plurality of candidate questions based at least on a result of the product ranking; and selecting a next question to be provided based on a result of the question ranking.
[00144] In an implementation, the question ranking may be performed through an entropy-based ranking or a policy-based reinforcement learning ranking.
[00145] In an implementation, the plurality of candidate questions may be previously determined based at least on attributes of the plurality of candidate products.
[00146] In an implementation, the at least one question may comprise one or more options, and the answer may comprise a selection from the one or more options.
[00147] In an implementation, the answer may comprise a natural language sentence. The determining whether there exists at least one recommended product may comprise: determining one or more relevant questions and corresponding answers corresponding to the natural language sentence; and determining whether there exists the at least one recommended product based at least on the one or more relevant questions and corresponding answers.
[00148] It should be appreciated that the method 1300 may further comprise any step/process for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
[00149] FIG. 14 illustrates an exemplary apparatus 1400 for providing explainable product recommendation in a session according to an embodiment.
[00150] The apparatus 1400 may comprise: a question providing module 1410, for providing at least one question associated with product recommendation; an answer receiving module 1420, for receiving an answer to the at least one question; a recommended product determining module 1430, for determining whether there exists at least one recommended product based at least on the at least one question and the answer; a recommendation reason generating module 1440, for in response to determining that there exists the at least one recommended product, generating a recommendation reason of the at least one recommended product; and a response providing module 1450, for providing a response including product information and the recommendation reason of the at least one recommended product.

[00151] In an implementation, the recommended product determining module 1430 may be for: performing product ranking to a plurality of candidate products based at least on the at least one question and the answer; and determining, based on a result of the product ranking, whether there exists at least one candidate product in the plurality of candidate products which meets with a predetermined condition. The performing product ranking may comprise: updating a candidate product assessing state based at least on the at least one question and the answer, the candidate product assessing state comprising an expected probability of each candidate product in the plurality of candidate products.

[00152] In an implementation, the apparatus 1400 may further comprise a session record adding module, for adding the at least one question and the answer into a session record of the session, the session record comprising historical questions and historical answers associated with the product recommendation in the session.
[00153] In an implementation, the recommendation reason generating module 1440 may be for: determining an expected probability rise resulted by each historical question in the session record to the at least one recommended product; selecting a historical question resulting in the maximum expected probability rise; and generating the recommendation reason based at least on the selected historical question and corresponding answer.
[00154] In an implementation, the recommendation reason generating module 1440 may be for: generating the recommendation reason based on at least one of attributes selected by the historical answers, attributes of the recommended product, and description of the recommended product, through a recommendation reason generating model.
[00155] In an implementation, the apparatus 1400 may further comprise a question selecting module for: in response to determining that there does not exist the at least one recommended product, performing question ranking to a plurality of candidate questions based at least on a result of the product ranking; and selecting a next question to be provided based on a result of the question ranking.
[00156] Moreover, the apparatus 1400 may further comprise any other modules configured for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
[00157] FIG. 15 illustrates an exemplary apparatus 1500 for providing explainable product recommendation in a session according to an embodiment.
[00158] The apparatus 1500 may comprise at least one processor 1510 and a memory 1520 storing computer-executable instructions. When executing the computer-executable instructions, the processor 1510 may: provide at least one question associated with product recommendation; receive an answer to the at least one question; determine whether there exists at least one recommended product based at least on the at least one question and the answer; in response to determining that there exists the at least one recommended product, generate a recommendation reason of the at least one recommended product; and provide a response including product information and the recommendation reason of the at least one recommended product. Moreover, the processor 1510 may further perform any other processing for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
[00159] The embodiments of the present disclosure may be embodied in a non- transitory computer-readable medium. The non-transitory computer-readable medium may comprise instructions that, when executed, cause one or more processors to perform any operations of the methods for providing explainable product recommendation in a session according to the above embodiments of the present disclosure.
[00160] It should be appreciated that all the operations in the methods described above are merely exemplary, and the present disclosure is not limited to any operations in the methods or sequence orders of these operations, and should cover all other equivalents under the same or similar concepts.
[00161] It should also be appreciated that all the modules in the apparatuses described above may be implemented in various approaches. These modules may be implemented as hardware, software, or a combination thereof. Moreover, any of these modules may be further functionally divided into sub-modules or combined together.
[00162] Processors are described in connection with various apparatuses and methods. These processors may be implemented using electronic hardware, computer software, or any combination thereof. Whether these processors are implemented as hardware or software will depend on the specific application and the overall design constraints imposed on the system. By way of example, a processor, any portion of a processor, or any combination of processors presented in this disclosure may be implemented as a microprocessor, a micro-controller, a digital signal processor (DSP), a field programmable gate array (FPGA), a programmable logic device (PLD), a state machine, gate logic, discrete hardware circuitry, and other suitable processing components configured to perform the various functions described in this disclosure. The functions of a processor, any portion of a processor, or any combination of processors presented in this disclosure may be implemented as software executed by a microprocessor, a micro-controller, a DSP, or other suitable platforms.

[00163] Software should be considered broadly to represent instructions, instruction sets, code, code segments, program code, programs, subroutines, software modules, applications, software applications, software packages, routines, subroutines, objects, running threads, processes, functions, etc. Software may reside on a computer readable medium. A computer readable medium may include, e.g., a memory, which may be, e.g., a magnetic storage device (e.g., a hard disk, a floppy disk, a magnetic strip), an optical disk, a smart card, a flash memory device, a random access memory (RAM), a read only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), a register, or a removable disk. Although a memory is shown as being separate from the processor in various aspects presented in this disclosure, a memory may also be internal to the processor (e.g., a cache or a register).

[00164] The previous description is provided to enable any person skilled in the art to practice the various aspects described herein. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects. Thus, the claims are not intended to be limited to the aspects shown herein. All structural and functional equivalent transformations to the elements of the various aspects of the present disclosure, which are known or to be apparent to those skilled in the art, will be included herein, and are intended to be covered by the claims.

Claims

1. A method for providing explainable product recommendation in a session, comprising: providing at least one question associated with product recommendation; receiving an answer to the at least one question; determining whether there exists at least one recommended product based at least on the at least one question and the answer; in response to determining that there exists the at least one recommended product, generating a recommendation reason of the at least one recommended product; and providing a response including product information and the recommendation reason of the at least one recommended product.
2. The method of claim 1, wherein the determining whether there exists at least one recommended product comprises: performing product ranking to a plurality of candidate products based at least on the at least one question and the answer; and determining, based on a result of the product ranking, whether there exists at least one candidate product in the plurality of candidate products which meets with a predetermined condition.
3. The method of claim 2, wherein the performing product ranking comprises: updating a candidate product assessing state based at least on the at least one question and the answer, the candidate product assessing state comprising an expected probability of each candidate product in the plurality of candidate products.
4. The method of claim 3, wherein the predetermined condition comprises: an expected probability of a candidate product is above a threshold.
5. The method of claim 1, further comprising: adding the at least one question and the answer into a session record of the session, the session record comprising historical questions and historical answers associated with the product recommendation in the session.
6. The method of claim 5, wherein the generating a recommendation reason comprises: determining an expected probability rise resulted by each historical question in the session record to the at least one recommended product; selecting a historical question resulting in the maximum expected probability rise; and generating the recommendation reason based at least on the selected historical question and corresponding answer.
7. The method of claim 5, wherein the generating a recommendation reason comprises: generating the recommendation reason based on at least one of attributes selected by the historical answers, attributes of the recommended product, and description of the recommended product, through a recommendation reason generating model.
8. The method of claim 2, further comprising: in response to determining that there does not exist the at least one recommended product, performing question ranking to a plurality of candidate questions based at least on a result of the product ranking; and selecting a next question to be provided based on a result of the question ranking.
9. The method of claim 8, wherein the question ranking is performed through an entropy-based ranking or a policy-based reinforcement learning ranking.
10. The method of claim 8, wherein the plurality of candidate questions are previously determined based at least on attributes of the plurality of candidate products.
11. The method of claim 1, wherein the at least one question comprises one or more options, and the answer comprises a selection from the one or more options.
12. The method of claim 1, wherein the answer comprises a natural language sentence, and the determining whether there exists at least one recommended product comprises: determining one or more relevant questions and corresponding answers corresponding to the natural language sentence; and determining whether there exists the at least one recommended product based at least on the one or more relevant questions and corresponding answers.
13. An apparatus for providing explainable product recommendation in a session, comprising: a question providing module, for providing at least one question associated with product recommendation; an answer receiving module, for receiving an answer to the at least one question; a recommended product determining module, for determining whether there exists at least one recommended product based at least on the at least one question and the answer; a recommendation reason generating module, for in response to determining that there exists the at least one recommended product, generating a recommendation reason of the at least one recommended product; and a response providing module, for providing a response including product information and the recommendation reason of the at least one recommended product.
14. The apparatus of claim 13, wherein the recommended product determining module is for: performing product ranking to a plurality of candidate products based at least on the at least one question and the answer; and determining, based on a result of the product ranking, whether there exists at least one candidate product in the plurality of candidate products which meets with a predetermined condition.
15. An apparatus for providing explainable product recommendation in a session, comprising: at least one processor; and a memory storing computer-executable instructions that, when executed, cause the at least one processor to: provide at least one question associated with product recommendation, receive an answer to the at least one question, determine whether there exists at least one recommended product based at least on the at least one question and the answer, in response to determining that there exists the at least one recommended product, generate a recommendation reason of the at least one recommended product, and provide a response including product information and the recommendation reason of the at least one recommended product.
PCT/US2020/038298 2019-09-30 2020-06-18 Providing explainable product recommendation in a session WO2021066903A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910941459.4 2019-09-30
CN201910941459.4A CN112581203A (en) 2019-09-30 2019-09-30 Providing explanatory product recommendations in a session

Publications (1)

Publication Number Publication Date
WO2021066903A1 true WO2021066903A1 (en) 2021-04-08

Family

ID=71465471

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/038298 WO2021066903A1 (en) 2019-09-30 2020-06-18 Providing explainable product recommendation in a session

Country Status (2)

Country Link
CN (1) CN112581203A (en)
WO (1) WO2021066903A1 (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10185983B2 (en) * 2015-12-31 2019-01-22 TCL Research America Inc. Least-ask: conversational recommender system with minimized user interaction
CN110175227B (en) * 2019-05-10 2021-03-02 神思电子技术股份有限公司 Dialogue auxiliary system based on team learning and hierarchical reasoning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090132905A1 (en) * 2005-04-01 2009-05-21 Masaaki Hoshino Information processing system, method, and program
US20180005293A1 (en) * 2016-06-30 2018-01-04 International Business Machines Corporation Platform for enabling personalized recommendations using intelligent dialog
WO2018214163A1 (en) * 2017-05-26 2018-11-29 Microsoft Technology Licensing, Llc Providing product recommendation in automated chatting
WO2019051845A1 (en) * 2017-09-18 2019-03-21 Microsoft Technology Licensing, Llc Fitness assistant chatbots

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113468420A (en) * 2021-06-29 2021-10-01 杭州摸象大数据科技有限公司 Method and system for recommending products
CN113468420B (en) * 2021-06-29 2024-04-05 杭州摸象大数据科技有限公司 Product recommendation method and system
CN113822742A (en) * 2021-09-18 2021-12-21 电子科技大学 Recommendation method based on self-attention mechanism
CN113822742B (en) * 2021-09-18 2023-05-12 电子科技大学 Recommendation method based on self-attention mechanism

Also Published As

Publication number Publication date
CN112581203A (en) 2021-03-30


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20736875

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20736875

Country of ref document: EP

Kind code of ref document: A1