TWI779887B - Dynamic homestay information recommendation device - Google Patents

Dynamic homestay information recommendation device

Info

Publication number
TWI779887B
Authority
TW
Taiwan
Prior art keywords
homestay
user
recommendation
information
encoder
Prior art date
Application number
TW110138760A
Other languages
Chinese (zh)
Other versions
TW202318328A (en
Inventor
林澤
Original Assignee
國立臺灣大學
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 國立臺灣大學
Priority to TW110138760A
Application granted
Publication of TWI779887B
Publication of TW202318328A

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A dynamic homestay information recommendation device comprises a homestay feature learning module, a user preference module, and a recommendation module. In use, the invention models only the user's browsing, click, and booking records together with homestay information. A denoising autoencoder extracts high-quality homestay feature vectors, the user's browsing, click, and booking records introduce a notion of time, and a bidirectional recurrent neural network combined with an attention mechanism predicts the user's preferences and generates a homestay recommendation list. Experiments show that the proposed device performs very well whether it is given a user's complete or partial browsing, click, and booking records: Top@5 rises to 56.8% and MAP@5 rises to 0.419. The results verify that the device can make recommendations even for new users, without requiring any personal information or order history, and can provide a high-quality homestay recommendation list. In an era when users increasingly value the privacy of their personal data, the device shows strong potential.

Description

Dynamic homestay information recommendation device

The present invention relates to a dynamic homestay information recommendation device, and in particular to one that, without requiring the user to log in or provide user data, is suitable for room-booking and e-commerce recommendation systems. More specifically, it provides a fast and accurate model for predicting the products a user prefers, which helps deliver a better user experience.

With the advancement of technology, consumption has shifted from physical stores to online platforms. Introducing a personalized recommendation system into a platform brings substantial economic benefit to e-commerce: it quickly matches products to users on highly competitive shopping platforms, which improves brand loyalty and prevents user churn.

In the traditional e-commerce sales model, most platforms provide a search engine or filters for users to pick products, or manually segment and analyze users in order to deliver advertisements or coupons to specific users. With a large number of users and transaction records, however, this approach consumes substantial company resources and incurs high costs. In the era of big data, data mining and deep learning make it possible to analyze user behavior and product information quickly and automatically, and to build an effective personalized recommendation system.

When building a recommendation system, the data collected falls into four main types: product data, user browsing data, user transaction data, and user rating data. In general, a good recommendation system is built on the large datasets of an e-commerce platform, but large volumes of product data and user browsing records often lead to sparse matrices, which make data mining and traditional machine learning difficult.

For example, a classic personalized recommendation approach such as association rules uses data on the products users purchased over a period of time to find associations between products. Association rules, however, become much more expensive to compute as the data grows, and they tend to produce too many or impractical product associations.

Another classic approach is collaborative filtering, which models mainly user transaction data and user rating data. These data form a rating matrix, from which the similarity between users or the correlation between products is computed. Users and products are then clustered to find the group of users similar to a given user, and the preferences of that group are used to infer the given user's preferences and recommend suitable products. Collaborative filtering, however, also faces the cold-start problem (that is, the new-user problem), the sparse-matrix problem, and popularity bias.

As the above shows, traditional methods analyze and classify users and then recommend from the lists of items purchased by each class of user, so they cannot make recommendations when a new product or a new user appears. In addition, earlier methods can recommend only after the user has entered relevant information. In other words, most current recommendation systems rely on users' personal information to generate personalized recommendations. As users pay increasing attention to the privacy of their personal data, they become unwilling to provide such information, and recommendations become seriously inaccurate.

Common recommendation systems to date only classify users by their behavior, or analyze the basic information of the product itself as the product representation, and most require the user to log in so that past records can serve as the basis for recommendation. With rising awareness of personal data protection, it is therefore necessary to remedy these shortcomings and develop an invention that resolves the drawbacks of the prior art and can make effective recommendations without using personal data.

The main purpose of the present invention is to overcome the above problems of the prior art and to provide a device that makes personalized recommendations according to user preferences without requiring user information. In testing, its recommendation accuracy reached a Top@20 of 75%, which can further improve on recommendation algorithms generally available on the market and substantially raise the accuracy of existing techniques, thereby helping to provide a better user experience.

To achieve the above purpose, the present invention is a dynamic homestay information recommendation device suitable for room-booking and e-commerce recommendation systems without requiring the user to log in or provide user data. The device includes the following modules.

A homestay feature learning module, which, based on the homestay information of a plurality of homestays, uses a denoising autoencoder to perform dimensionality reduction and feature vector extraction on the homestay information. Pre-training the denoising autoencoder produces a plurality of lower-dimensional homestay feature vectors. The user's current browsing, click, and booking records (click records) then introduce a notion of time: the homestay information corresponding to the browsed and booked homestays is extracted in order from the plurality of homestays and encoded by the denoising autoencoder, yielding the corresponding homestay feature vectors (B&B's embedding vectors) from among the lower-dimensional homestay feature vectors. These vectors are output to a prediction model built on a bidirectional recurrent neural network (BiRNN) with an attention mechanism and, at the same time, are decoded and reconstructed by the denoising autoencoder.

A user preference module, connected to the homestay feature learning module, which builds the prediction model by training the bidirectional recurrent neural network combined with the attention mechanism. It receives the homestay feature vectors, which are input to the prediction model in the chronological order of the user's current browsing, click, and booking records, and uses the prediction model to learn and analyze user preferences and produce a plurality of user preference vectors. Joint training is then applied: the errors produced by the denoising autoencoder and by the prediction model, obtained by comparing the homestay feature vectors with the user preference vectors, are combined and the joint error is minimized.

A recommendation module, connected to the user preference module. After joint training, the prediction model can predict, from the homestay feature vectors, the most representative homestay feature vectors that the user is most likely to book. The degree of correlation between homestays is measured by the cosine similarity between vectors: the cosine similarity between the most representative homestay feature vectors and all homestay feature vectors is computed and sorted from high to low, the homestays corresponding to the N top-ranked, most similar feature vectors are retrieved, and a recommendation process produces a homestay recommendation list suited to the user.

The homestay feature learning module, the user preference module, and the recommendation module are trained together so that they correct one another, which effectively improves recommendation accuracy.

In the above embodiment of the present invention, the homestay information includes the homestay's region, room types, facilities, equipment, services, prices, and other homestay-related data.

In the above embodiment of the present invention, the browsing, click, and booking records include the user's current number of clicks, click order, browsing data, and the homestay booked by the user.

In the above embodiment of the present invention, the denoising autoencoder consists of an encoder, a decoder, and a hidden layer between the encoder and the decoder. The denoising autoencoder encodes the lower-dimensional homestay feature vectors through the encoder, projects them into the hidden layer, and then decodes and reconstructs from the hidden layer through the decoder.

In the above embodiment of the present invention, before performing dimensionality reduction and feature vector extraction, the denoising autoencoder first applies one-hot encoding to the categorical features in the homestay information, which expands the dimensionality of the homestay information.

In the above embodiment of the present invention, the prediction model uses the attention mechanism to assign attention weights when predicting, and outputs the most representative homestay feature vectors that the user is most likely to book.

In the above embodiment of the present invention, the prediction model includes one or more gated recurrent units (GRUs) applied to the output of each channel of the bidirectional recurrent neural network.

Please refer to Figures 1 to 15, which are, respectively:

Figure 1: schematic diagram of the architecture of the dynamic homestay information recommendation device.
Figure 2: schematic diagram of the training flow of the device.
Figure 3: schematic diagram of the architecture of the denoising autoencoder.
Figure 4: schematic diagram of the architecture of the prediction model.
Figure 5: schematic diagram of the recommendation flow.
Figure 6: schematic diagram of Top@N.
Figure 7: schematic diagram of MAP@N.
Figure 8: bar chart of Top@N for different models.
Figure 9: bar chart of MAP@N for different models.
Figure 10: schematic diagram of dynamic recommendation.
Figure 11: dynamic recommendation performance of the proposed device model.
Figure 12: dynamic recommendation performance of the AE-RNN model, for comparison.
Figure 13: dynamic recommendation performance of the Wide&Deep model, for comparison.
Figure 14: dynamic recommendation performance of the modified Collaborative Filtering model, for comparison.
Figure 15: statistics on the length of user click sequences.

As shown in the figures, the present invention is a dynamic homestay information recommendation device comprising a homestay feature learning module 1, a user preference module 2, and a recommendation module 3.

The homestay feature learning module 1, based on the homestay information of a plurality of homestays, uses a denoising autoencoder 11 to perform dimensionality reduction and feature vector extraction on the homestay information. Pre-training the denoising autoencoder 11 produces a plurality of lower-dimensional homestay feature vectors. The user's current browsing, click, and booking records (click records) then introduce a notion of time: the homestay information corresponding to the browsed and booked homestays is extracted in order from the plurality of homestays and encoded by the denoising autoencoder 11, yielding the corresponding homestay feature vectors (B&B's embedding vectors) from among the lower-dimensional homestay feature vectors. These vectors are output to the prediction model 21, which is built on a bidirectional recurrent neural network (BiRNN) 211 with an attention mechanism 212, and are also decoded and reconstructed by the denoising autoencoder 11.

The user preference module 2 is connected to the homestay feature learning module 1 and builds the prediction model 21 by training the bidirectional recurrent neural network 211 combined with the attention mechanism 212. It receives the homestay feature vectors, which are input to the prediction model 21 in the chronological order of the user's current browsing, click, and booking records, and uses the prediction model 21 to learn and analyze user preferences and produce a plurality of user preference vectors. Joint training is then applied: the errors produced by the denoising autoencoder 11 and by the prediction model 21, obtained by comparing the homestay feature vectors with the user preference vectors, are combined and the joint error is minimized.

The recommendation module 3 is connected to the user preference module 2. After joint training, the prediction model 21 can predict, from the homestay feature vectors, the most representative homestay feature vectors that the user is most likely to book. The degree of correlation between homestays is measured by the cosine similarity between vectors: the cosine similarity between the most representative homestay feature vectors and all homestay feature vectors is computed and sorted from high to low, the homestays corresponding to the N top-ranked, most similar feature vectors are retrieved, and a recommendation process produces a homestay recommendation list 31 suited to the user. The homestay feature learning module 1, the user preference module 2, and the recommendation module 3 are trained together so that they correct one another, which effectively improves recommendation accuracy. In this way, the structure disclosed above constitutes a new dynamic homestay information recommendation device 100.

The homestay information includes the homestay's region, room types, facilities, equipment, services, prices, and other homestay-related data. The browsing, click, and booking records include the user's current number of clicks, click order, browsing data, and the homestay booked by the user.

The prediction model 21 includes one or more gated recurrent units (GRUs) 213 applied to the output of each channel of the bidirectional recurrent neural network 211.

The prediction model uses the attention mechanism 212 to assign attention weights when predicting, and outputs the most representative homestay feature vectors that the user is most likely to book.
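How attention weights can be assigned over a sequence of recurrent outputs may be easier to see in a small example. The following is a minimal sketch, not the patented model: it assumes dot-product scoring against a hypothetical query vector followed by a softmax, whereas the text does not specify the scoring function. The names `softmax`, `attention_pool`, `hidden_states`, and `query` are illustrative only.

```python
import math

def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Weight each time step by its dot product with a query (assumed scoring)."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # The context vector is the attention-weighted sum of the hidden states.
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context

# Hypothetical BiRNN outputs for three clicks, in click order.
hidden_states = [[0.2, 0.1], [0.9, 0.4], [0.5, 0.8]]
query = [1.0, 0.0]  # hypothetical learned query vector
weights, context = attention_pool(hidden_states, query)
```

In this sketch the click whose hidden state aligns best with the query receives the largest weight, so it contributes most to the pooled vector used for prediction.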

The training flow of the overall model is shown in Figure 2. The left part of the figure is a complete denoising autoencoder 11. The lower-dimensional homestay feature vectors it produces serve as input and are fed, in the order of the user's current browsing, click, and booking records, into the prediction model 21 on the right, a bidirectional recurrent neural network 211 built on the attention mechanism 212. During training, the errors produced on both sides are combined for joint training.

In use, before performing dimensionality reduction and feature vector extraction, the denoising autoencoder 11 first applies one-hot encoding to the categorical features in the homestay information, which expands its dimensionality. The denoising autoencoder 11 is then pre-trained on the expanded homestay information to obtain homestay feature vectors of relatively low dimension; its training is complete only after joint training with the prediction model on the right. The denoising autoencoder 11 consists mainly of an encoder 111, a decoder 112, and a hidden layer 113 between the encoder 111 and the decoder 112, as shown in Figure 3. In the figure, [Figure 02_image001] is the homestay information after one-hot encoding of the categorical features followed by min-max normalization, [Figure 02_image003] is the homestay information after random noise is added, [Figure 02_image005] is the homestay feature vector, and [Figure 02_image007] is the output of the denoising autoencoder, that is, the reconstruction of [Figure 02_image001]. The reconstruction error of the denoising autoencoder 11 is the mean squared error (MSE). The homestay feature vectors in the hidden layer that correspond to the relevant homestay information are then extracted in the order of the user's current browsing, click, and booking records to build the subsequent model.
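The preprocessing described above (one-hot encoding of categorical features, min-max normalization, then added random noise) can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the listings, the `regions` vocabulary, and the choice of masking noise (zeroing entries at random) are hypothetical, since the text says only that random noise is added.

```python
import random

def one_hot(value, categories):
    """Return a one-hot list marking the position of value in categories."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max(values):
    """Scale a list of numbers into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def add_masking_noise(vector, drop_prob, rng):
    """Corrupt a vector by zeroing each entry with probability drop_prob (assumed noise type)."""
    return [0.0 if rng.random() < drop_prob else v for v in vector]

# Hypothetical listings: (region, price per night).
listings = [("north", 1800.0), ("east", 3200.0), ("south", 2500.0)]
regions = ["north", "east", "south"]

prices = min_max([price for _, price in listings])
# Clean input: one-hot region followed by the normalized price.
clean = [one_hot(region, regions) + [p] for (region, _), p in zip(listings, prices)]

rng = random.Random(0)  # fixed seed so the corruption is repeatable
noisy = [add_masking_noise(x, drop_prob=0.2, rng=rng) for x in clean]
```

A denoising autoencoder would take `noisy` as its input and be trained to reconstruct `clean`, with the mean squared error as the reconstruction error.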

Based on the user's browsing, click, and booking records, the invention extracts from the plurality of homestays the homestay information corresponding to those browsed and ordered, encodes it with the encoder 111 of the denoising autoencoder 11 to obtain the corresponding homestay feature vectors, and inputs these to the bidirectional recurrent neural network 211 for subsequent training.

Figure 4 shows the details of the prediction model on the right side of Figure 2, a bidirectional recurrent neural network built on the attention mechanism. In the figure, [Figure 02_image009] is the output of the attention-based bidirectional recurrent neural network; [Figure 02_image009] and [Figure 02_image011] have the same dimensions; [Figure 02_image011] is the feature vector of the ordered homestay; [Figure 02_image013] is the feature vector of a homestay that was neither browsed nor ordered; [Figure 02_image015] is the homestay information after one-hot encoding and min-max normalization; and [Figure 02_image017] is the output of the denoising autoencoder, that is, the reconstruction of the homestay information.

As can be seen, the invention encodes with the encoder 111 of the denoising autoencoder and then feeds the result to both the decoder 112 of the denoising autoencoder and the bidirectional recurrent neural network 211 based on the attention mechanism 212, minimizing the joint error [Figure 02_image019] through joint training:

[Figure 02_image021]
[Figure 02_image023]

From the formula, [Figure 02_image019] is composed of two parts, [Figure 02_image025] and [Figure 02_image027]. This shows that the homestay feature vectors trained by the denoising autoencoder take part in both the information reconstruction task and the prediction task.
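Because the formula images are not reproduced in this text, the exact joint error is not recoverable here. The sketch below only illustrates the two-part structure the description states: a reconstruction term (which the text says is a mean squared error) plus a prediction term. The form of the prediction term (MSE between the predicted vector and the ordered homestay's feature vector) and the `weight` parameter are assumptions, and the sketch does not model the non-browsed homestay vectors that the figure description also mentions.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def joint_error(x, x_reconstructed, predicted, ordered_embedding, weight=1.0):
    """Two-part error: reconstruction term plus an assumed prediction term."""
    reconstruction = mse(x, x_reconstructed)      # autoencoder side
    prediction = mse(predicted, ordered_embedding)  # prediction-model side (assumed form)
    return reconstruction + weight * prediction

# Hypothetical values for one training example.
error = joint_error(
    x=[1.0, 0.0],
    x_reconstructed=[0.0, 0.0],
    predicted=[0.0, 0.0],
    ordered_embedding=[0.0, 2.0],
    weight=0.5,
)
```

Minimizing a sum of this kind updates the encoder through both terms, which is one way the feature vectors can serve reconstruction and prediction at the same time.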

In the final test/recommendation stage, the invention uses the attention-based bidirectional recurrent neural network to predict the feature vector [Figure 02_image009] of the homestay the user is likely to order in the end, and then computes the cosine similarity between [Figure 02_image009] and all homestay feature vectors. The higher the cosine similarity, the more closely a homestay resembles the one the user prefers. Homestays can therefore be ranked by cosine similarity, and a suitable recommendation list generated for the user from highest to lowest similarity. The simplified recommendation flow is shown in Figure 5.
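The ranking step can be sketched as follows: score every homestay feature vector against the predicted vector by cosine similarity, sort from high to low, and keep the first N. The embedding values and homestay identifiers are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend_top_n(predicted, embeddings, n):
    """Return the ids of the n homestays most similar to the predicted vector."""
    scored = [(cosine_similarity(predicted, vec), hid) for hid, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return [hid for _, hid in scored[:n]]

# Hypothetical two-dimensional homestay feature vectors.
embeddings = {"A": [1.0, 0.0], "B": [0.8, 0.6], "C": [0.0, 1.0], "D": [-1.0, 0.0]}
predicted = [0.9, 0.1]  # hypothetical output of the prediction model
top2 = recommend_top_n(predicted, embeddings, n=2)
```

With these values, homestay A points in nearly the same direction as the predicted vector and D points the opposite way, so A ranks first and D last.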

The lower-left corner of Figure 5 is a two-dimensional t-distributed stochastic neighbor embedding (t-SNE) visualization of the homestay feature vectors after joint training. Each point represents one homestay. The user's browsing, click, and booking records are input to the device 100 to generate [Figure 02_image029]. The black dot in the figure is the output of the prediction model, that is, the feature vector [Figure 02_image029] of the homestay the user is likely to order in the end, and the black circles are the candidate homestay feature vectors selected by the cosine similarity calculation. Finally, the invention takes the N homestays with the highest cosine similarity as the homestay recommendation list 31.

The following examples are given only to aid understanding of the details and substance of the present invention, and are not intended to limit the scope of its claims.

[Evaluation criteria] In this experiment, once a recommendation list has been generated, model performance is evaluated with Top@N and mean average precision (MAP@N). Top@N is defined as follows: when a user's browsing, click, and booking record is input to the device 100 and a recommendation list of N homestays is generated, the recommendation counts as successful if the homestay the user finally ordered appears in the list. The Top@N rate is the proportion of successfully recommended users among all recommended users (all users in the test set), as shown in Figure 6.
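The Top@N definition above can be written as a short function. This is a minimal sketch; the recommendation lists and ordered homestays are hypothetical.

```python
def top_at_n(recommendations, ordered, n):
    """Fraction of users whose ordered homestay is among their first n recommendations."""
    hits = sum(
        1 for recs, target in zip(recommendations, ordered) if target in recs[:n]
    )
    return hits / len(ordered)

# Hypothetical test set of three users.
recommendations = [["h3", "h1", "h7"], ["h2", "h5", "h9"], ["h4", "h8", "h6"]]
ordered = ["h1", "h9", "h0"]  # the homestay each user finally ordered

top2_rate = top_at_n(recommendations, ordered, n=2)  # only the first user is a hit
top3_rate = top_at_n(recommendations, ordered, n=3)  # the first two users are hits
```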

As Figure 6 shows, the recommendation list generated from the first browsing, click, and booking record is more successful than the list generated from the second. Because the ordered homestay should ideally appear near the top of the recommendation list, the invention also uses MAP@N to evaluate the model. MAP@N is defined as follows: when a user's browsing, click, and booking record is input to the device 100 and a recommendation list of N homestays is generated, and the homestay the user finally ordered appears in it, the position at which it appears is taken into account, and the average precision (AP) is the reciprocal of that position. The overall MAP@N is the mean of the AP over all records, as shown in Figure 7.

Figure 7 shows that when the first homestay in the generated recommendation list is the one the user finally ordered, AP@5 = 1 for that record, and when the second homestay in the list is the one finally ordered, AP@5 = 0.5. It follows that a MAP@5 between 0.5 and 1 over all evaluated data indicates that the device performs very well, with the finally ordered homestay falling between the first and second positions of the generated recommendation lists.
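The two metrics defined above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's evaluation code: the function names, the `(recommended_list, ordered_homestay)` record format, and the choice to score a record as AP = 0 when the ordered homestay is missing from the list are all assumptions made for the example.

```python
from typing import Hashable, List, Sequence, Tuple

# Each record pairs a generated recommendation list with the homestay
# the user finally ordered (hypothetical format chosen for this sketch).
Record = Tuple[Sequence[Hashable], Hashable]


def ap_at_n(recommended: Sequence[Hashable], ordered: Hashable, n: int) -> float:
    """Reciprocal of the 1-based position of the ordered homestay in the top n.

    Returning 0.0 when the homestay is absent is an assumption of this sketch.
    """
    for position, item in enumerate(recommended[:n], start=1):
        if item == ordered:
            return 1.0 / position
    return 0.0


def top_at_n(records: List[Record], n: int) -> float:
    """Share of records whose ordered homestay appears in the top n."""
    hits = sum(1 for recommended, ordered in records if ordered in recommended[:n])
    return hits / len(records)


def map_at_n(records: List[Record], n: int) -> float:
    """Mean of AP@n over all records."""
    return sum(ap_at_n(recommended, ordered, n) for recommended, ordered in records) / len(records)


records = [
    (["h1", "h2", "h3", "h4", "h5"], "h1"),  # ordered homestay ranked first: AP@5 = 1
    (["h1", "h2", "h3", "h4", "h5"], "h2"),  # ranked second: AP@5 = 0.5
    (["h1", "h2", "h3", "h4", "h5"], "h9"),  # not in the list: counted as a miss
]

print(top_at_n(records, 5))  # two of three records are hits
print(map_at_n(records, 5))  # (1 + 0.5 + 0) / 3
```

With these three toy records, Top@5 is 2/3 and MAP@5 is 0.5, which matches the AP@5 = 1 and AP@5 = 0.5 cases described for Figure 7.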

[Comparison with different recommendation methods]

The device is compared on Top@N and MAP@N against random recommendation (Random), Distance-Price recommendation, the Adaptive Embedding-based Recurrent Neural Network (AE-RNN) model, modified collaborative filtering (modified Collaborative Filtering), and the existing recommender-system model Wide & Deep. The comparison also examines how much each method improves recommendation over Company A's existing architecture, which serves as the evaluation baseline (AY sort), and analyzes the differences among the methods.

Table 1 below shows the recommendation performance of the different models. The values are computed over 2,004 user click-and-browse records from January 2019, evaluated with Top@N and MAP@N.

Table 1

| Model | Top@20 | Top@10 | Top@5 | MAP@20 | MAP@10 | MAP@5 |
| This device | 0.745 | 0.662 | 0.568 | 0.434 | 0.432 | 0.419 |
| AE-RNN | 0.532 | 0.453 | 0.375 | 0.286 | 0.281 | 0.270 |
| Wide&Deep | 0.490 | 0.391 | 0.309 | 0.225 | 0.218 | 0.207 |
| modified CF | 0.458 | 0.361 | 0.263 | 0.165 | 0.158 | 0.145 |
| Distance-Price | 0.306 | 0.198 | 0.129 | 0.081 | 0.074 | 0.064 |
| AY sort | 0.312 | 0.189 | 0.120 | 0.079 | 0.070 | 0.062 |
| Random | 0.089 | 0.047 | 0.024 | 0.017 | 0.014 | 0.011 |

Random recommendation takes the homestays the user has clicked and browsed so far and randomly recommends N homestays in the same city within a fixed distance range. Distance-Price recommendation filters homestays by their price and location features to build the recommendation list. Modified Collaborative Filtering uses only the user's record of clicked homestays, looking up similar users' click records to recommend homestays. The existing recommender-system model Wide&Deep maps the homestays the user has clicked to their homestay feature vectors and predicts with the model. The AE-RNN model uses only a denoising autoencoder for feature extraction, followed by a unidirectional recurrent neural network, to generate the recommendation list.

The results in Table 1 show that the device, the AE-RNN model, the Wide&Deep model, and modified Collaborative Filtering all outperform AY sort on both Top@N and MAP@N and far exceed random recommendation, while Distance-Price recommendation performs about the same as AY sort. Bar charts comparing the models are shown in Figures 8 and 9.

Comparing modified Collaborative Filtering, Distance-Price recommendation, and random recommendation in Figures 8 and 9 shows that the user's complete browsing and clicking record has a considerable influence on recommendation. Modified Collaborative Filtering learns the user's preferences from the click-and-browse record and then recommends the homestays finally ordered by users with similar click-and-browse records. Among all homestay features, longitude, latitude, and min_price are relatively important. Distance-Price recommendation derives the user's price and location preferences from the complete click-and-browse record, and in doing so achieves a recommendation effect similar to AY sort.

The comparison between modified Collaborative Filtering and the Wide&Deep model shows that using only the user's click record together with the corresponding homestay information helps the Wide&Deep model learn high-order and low-order combinations of homestay features, so that the model better understands the user's preferences and generates a good recommendation list.

The comparison between the Wide&Deep model and the AE-RNN model shows that adding time-series information to the user's click-and-browse record and the homestay information improves recommendation further. Well-trained homestay information vectors not only help the model understand the user's preferences; the cosine similarity between homestays is also what is used to generate the recommendation list when homestays are recommended to the user.
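As an illustration of ranking homestays by cosine similarity, the sketch below scores every homestay embedding against a predicted vector and keeps the N most similar. The embeddings, the predicted vector, and the function names are made-up toy values, not data or identifiers from the patent; treating a zero vector as similarity 0 is also an assumption of the sketch.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_n_similar(predicted: Sequence[float],
                  embeddings: Dict[str, Sequence[float]],
                  n: int) -> List[str]:
    """Homestay ids sorted by cosine similarity to `predicted`, highest first."""
    ranked = sorted(
        embeddings,
        key=lambda homestay_id: cosine_similarity(predicted, embeddings[homestay_id]),
        reverse=True,
    )
    return ranked[:n]


# Toy 2-dimensional homestay feature vectors (hypothetical values).
embeddings = {
    "A": [1.0, 0.0],
    "B": [0.8, 0.6],
    "C": [0.0, 1.0],
    "D": [-1.0, 0.0],
}
predicted = [1.0, 0.1]  # stand-in for the vector the prediction model outputs

print(top_n_similar(predicted, embeddings, 2))
```

Here homestay A points in almost the same direction as the predicted vector and B is the next closest, so the two-item list is A followed by B.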

Finally, comparing the AE-RNN model with the device shows that supplying more homestay information increases the distinction between homestays and yields better recommendation lists. The number of times a user clicks on each homestay also helps the model gauge the user's level of interest in different homestays. The bidirectional recurrent neural network, the attention mechanism, and joint training together effectively improve the recommendation results of the overall model.
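How an attention mechanism turns a click sequence into a single weighted summary can be sketched as follows. The text does not specify the scoring function, so this sketch assumes dot-product scoring against a hypothetical learned `context` vector followed by a softmax; the per-click `hidden_states` stand in for the outputs of the bidirectional recurrent neural network and are toy values.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states: Sequence[Sequence[float]],
                   context: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (attention weights, weighted sum of the hidden states).

    Scores are dot products with `context` (an assumption of this sketch).
    """
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled


# Three clicks: the first and third hit the same homestay, so their states match.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
context = [2.0, 0.0]  # hypothetical learned parameter

weights, pooled = attention_pool(hidden_states, context)
print(weights)
print(pooled)
```

In this toy run the repeated homestay receives most of the attention weight across its two clicks, which is one way repeated clicks could translate into a stronger influence on the predicted preference.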

[Dynamic recommendation] Table 1 compares the recommendation performance obtained when the user's complete click record is input into the model. On a real online platform, however, the user's complete click-and-browse record is not available from the start; it exists only once the user has completed an order. In practice, the present invention must provide recommendation lists dynamically according to the user's clicks so far, and aims to give a good recommendation list within a limited number of homestay clicks.

Because no A/B test has yet been run on the platform, the present invention simulates how users click through homestays on Company A's platform: the click sequences in the current test set are provided to the device step by step, a recommendation list is generated at each step, and Top@N and MAP@N are evaluated on each generated list. It must be stressed that, since dynamic recommendation cannot be evaluated on the platform itself, the records used were collected under the web page's static, fixed ranking mechanism and are simply provided to the device in order to generate the recommendation lists. In other words, the users still saw the original recommendation list, and none of their clicks was influenced by the lists the model generates. The present invention expects that dynamic recommendation in real use, which can actually adjust the recommendation list as the user clicks, will perform better and allow the model to predict the user's finally ordered homestay more quickly.

Suppose a user's browsing, clicking, and ordering record takes the following form:

Figure 02_image037

The present invention then inputs the browsing, clicking, and ordering records into the device 100 one at a time, in click order, to generate the homestay recommendation list 31, as shown in Figure 10.
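The step-by-step procedure can be sketched as a loop over prefixes of the click sequence, scoring the list generated after each click. Everything here is illustrative: `toy_recommend` is a hypothetical stand-in that ranks clicked homestays by click count (most recent first on ties) and has nothing to do with the device's actual model, and scoring a missing homestay as AP = 0 is an assumption.

```python
from collections import Counter
from typing import Callable, List, Sequence


def toy_recommend(prefix: Sequence[str], n: int) -> List[str]:
    """Hypothetical recommender: most-clicked homestays first, recent clicks break ties."""
    counts = Counter(prefix)
    last_seen = {homestay: index for index, homestay in enumerate(prefix)}
    ranked = sorted(counts, key=lambda h: (-counts[h], -last_seen[h]))
    return ranked[:n]


def dynamic_ap_curve(clicks: Sequence[str],
                     ordered: str,
                     recommend: Callable[[Sequence[str], int], List[str]],
                     n: int) -> List[float]:
    """AP@n of the list generated after each successive click."""
    curve = []
    for step in range(1, len(clicks) + 1):
        recommended = recommend(clicks[:step], n)  # only the clicks seen so far
        ap = 0.0  # assumption: a list without the ordered homestay scores 0
        for position, item in enumerate(recommended[:n], start=1):
            if item == ordered:
                ap = 1.0 / position
                break
        curve.append(ap)
    return curve


clicks = ["h1", "h2", "h3", "h2"]  # toy click sequence; the user finally orders h2
curve = dynamic_ap_curve(clicks, "h2", toy_recommend, n=2)
print(curve)
```

Averaging such per-step scores over all test users at each sequence length would give curves of the kind plotted against click-sequence length.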

Of the six models in Table 1, those that model the user's click sequence are the proposed device model, the AE-RNN model, the Wide&Deep model, and modified Collaborative Filtering. These four models are evaluated in turn under dynamic recommendation; the test results are shown in Figures 11, 12, 13, and 14.

The figures show that the device model, the AE-RNN model, and the Wide&Deep model all produce more accurate homestay recommendation lists as the click sequence grows longer. The device model and the AE-RNN model reach their best recommendation performance when the sequence length reaches fifty clicks, and then maintain the accuracy of the recommendation list as the number of clicks increases further. The Wide&Deep model ignores time-series information, so over long click sequences it is less able to identify the user's true preferences and performs worse than the device model and the AE-RNN model.

This shows that the device model, the AE-RNN model, and the Wide&Deep model are less prone to losing recommendation accuracy because of surplus clicks; that is, they are robust to noise. It also shows that the device model not only outperforms the AE-RNN model but is also more stable.

By contrast, the dynamic recommendation performance chart of modified Collaborative Filtering (see Figure 14) shows Top@N growing steadily as the click sequence lengthens, whereas MAP@N peaks when the sequence length reaches 12 clicks and then declines as the number of clicks increases. As the click sequence grows longer, the ability of modified Collaborative Filtering to judge the user's preferences therefore gradually declines: the generated recommendation list still contains the finally ordered homestay but places it lower in the list. This is why its Top@N result holds steady while its MAP@N result falls.

Dynamic recommendation testing of the four models shows that the device model, the AE-RNN model, and the Wide&Deep model tend to need the complete click sequence to reach their best recommendation performance. Figure 15 shows that sequences of up to fifty clicks account for 1,913 records, about 95% of the total, while overly long click sequences degrade the recommendations of modified Collaborative Filtering.

Accordingly, the present invention collects information on each homestay together with all users' complete browsing, clicking, and ordering records. Observation of these data shows that a user's repeated clicks on the same homestay indicate greater interest in that homestay, and that complete homestay information helps increase the distinction between homestays. The present invention uses a denoising autoencoder to extract features from the homestay information and obtain high-quality homestay feature vectors, then passes the user's complete browsing, clicking, and ordering record through a bidirectional recurrent neural network with an attention mechanism to predict the homestays the user prefers and recommend them. Finally, the proposed device model is compared with previously proposed models and with Company A's current recommender system. The tests show that, whether given the user's complete or partial browsing, clicking, and ordering record, the device model performs better and substantially raises the probability of a successful recommendation.

In summary, the present invention is a dynamic homestay information recommendation device that effectively remedies the shortcomings of the prior art. It makes personalized recommendations according to user preferences without requiring user information. In testing, its recommendation accuracy reached about 75% (compare the Top@20 of 0.745 in Table 1). It can further optimize the recommendation algorithms in general commercial use and substantially improve on the accuracy of earlier techniques, helping to provide a better user experience. The invention is thus more advanced, more practical, and better suited to users' needs; it meets the requirements for an invention patent application, and the application is filed in accordance with the law.

The foregoing describes only preferred embodiments of the present invention and shall not limit the scope of its implementation. All simple equivalent changes and modifications made in accordance with the claims and the description of the invention therefore remain within the scope covered by this patent.

100: dynamic homestay information recommendation device
1: homestay feature learning module
11: denoising autoencoder
111: encoder
112: decoder
113: hidden layer
2: user preference module
21: prediction model
211: bidirectional recurrent neural network
212: attention mechanism
213: gated structure
3: recommendation module
31: homestay recommendation list

Figure 1 is a schematic diagram of the architecture of the dynamic homestay information recommendation device of the present invention.
Figure 2 is a schematic diagram of the training flow of the dynamic homestay information recommendation device of the present invention.
Figure 3 is a schematic diagram of the architecture of the denoising autoencoder of the present invention.
Figure 4 is a schematic diagram of the architecture of the prediction model of the present invention.
Figure 5 is a schematic diagram of the recommendation flow of the present invention.
Figure 6 is a schematic diagram of Top@N in the present invention.
Figure 7 is a schematic diagram of MAP@N in the present invention.
Figure 8 is a bar chart of the Top@N results of the different models.
Figure 9 is a bar chart of the MAP@N results of the different models.
Figure 10 is a schematic diagram of dynamic recommendation in the present invention.
Figure 11 is the dynamic recommendation performance chart of the proposed device model.
Figure 12 is the dynamic recommendation performance chart of the AE-RNN model used for comparison.
Figure 13 is the dynamic recommendation performance chart of the Wide&Deep model used for comparison.
Figure 14 is the dynamic recommendation performance chart of the modified Collaborative Filtering model used for comparison.
Figure 15 is a chart of the length statistics of user click sequences in the present invention.


Claims (7)

1. A dynamic homestay information recommendation device, applicable to room-booking recommendation systems and e-commerce recommendation systems without requiring the user to log in and without requiring user data, the dynamic homestay information recommendation device comprising:
a homestay feature learning module, which, based on the homestay information of a plurality of homestays, uses a denoising autoencoder (Denoising Autoencoder) to perform dimensionality reduction and feature-vector extraction on the homestay information, generating a plurality of lower-dimensional homestay feature vectors through pre-training (pre-train) of the denoising autoencoder; then, introducing the concept of time according to the user's current browsing, clicking, and ordering record (click records), sequentially extracts from the plurality of homestays the homestay information corresponding to the homestays browsed and ordered, encodes it with the denoising autoencoder, and obtains from the lower-dimensional homestay feature vectors a plurality of corresponding homestay feature vectors (B&B's embedding vector), which are on the one hand output to a prediction model based on a bidirectional recurrent neural network (Bidirectional Recurrent Neural Network, BiRNN) with an attention mechanism (Attention Mechanism), and on the other hand subsequently decoded and restored by the denoising autoencoder;
a user preference module, connected to the homestay feature learning module, which establishes the prediction model by training a model that combines the bidirectional recurrent neural network with the attention mechanism, and obtains the input homestay feature vectors, the homestay feature vectors being input into the prediction model sequentially according to the times in the user's current browsing, clicking, and ordering record; the prediction model learns and analyzes user preferences to generate a plurality of user preference vectors; joint training (joint training) is then adopted, in which the denoising autoencoder and the prediction model are combined and trained jointly on the error between the homestay feature vectors and the user preference vectors so as to minimize the joint error; and
a recommendation module, connected to the user preference module, wherein, after the joint training, the prediction model can finally predict from the homestay feature vectors a plurality of most representative homestay feature vectors that the user is most likely to purchase; the degree of correlation between homestays is computed by minimizing the cosine similarity between vectors; the cosine similarity between the most representative homestay feature vectors and all homestay feature vectors is computed and sorted from high to low; the homestays corresponding to the N top-ranked, most similar homestay feature vectors are retrieved according to the ranking; and a recommendation process (Recommendation process) generates a homestay recommendation list suited to the user for recommendation;
wherein the homestay feature learning module, the user preference module, and the recommendation module are trained together so that they correct one another, thereby effectively improving recommendation accuracy.

2. The dynamic homestay information recommendation device of claim 1, wherein the homestay information includes the homestay's area, room types, facilities, equipment, services, prices, and other homestay-related data.

3. The dynamic homestay information recommendation device of claim 1, wherein the browsing, clicking, and ordering record includes the user's current number of clicks, click order, browsing data, and the homestay ordered by the user.

4. The dynamic homestay information recommendation device of claim 1, wherein the denoising autoencoder consists of an encoder (encoder), a decoder (decoder), and a hidden layer between the encoder and the decoder; the denoising autoencoder encodes the lower-dimensional homestay feature vectors with the encoder and projects them into the hidden layer, and the decoder then decodes and restores the hidden layer.

5. The dynamic homestay information recommendation device of claim 1, wherein, before performing dimensionality reduction and feature-vector extraction, the denoising autoencoder first applies one-hot (one-hot) encoding to the categorical features (Categorical feature) in the homestay information, the encoding expanding the dimensionality of the homestay information.

6. The dynamic homestay information recommendation device of claim 1, wherein the prediction model uses the attention mechanism to assign attention weights when predicting, so as to output the plurality of most representative homestay feature vectors that the user is most likely to purchase.

7. The dynamic homestay information recommendation device of claim 1, wherein the prediction model includes one or more gated structures (gated recurrent unit, GRU) applied to the output of each channel of the bidirectional recurrent neural network.
TW110138760A 2021-10-19 2021-10-19 Dynamic homestay information recommendation device TWI779887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110138760A TWI779887B (en) 2021-10-19 2021-10-19 Dynamic homestay information recommendation device


Publications (2)

Publication Number Publication Date
TWI779887B true TWI779887B (en) 2022-10-01
TW202318328A TW202318328A (en) 2023-05-01

Family

ID=85475803


Country Status (1)

Country Link
TW (1) TWI779887B (en)




Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent