CN115438871A - Ice and snow scenic spot recommendation method and system integrating preference and eliminating popularity deviation - Google Patents


Info

Publication number
CN115438871A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211161858.7A
Other languages
Chinese (zh)
Inventor
李鹏
苏忻洁
朱心如
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of Commerce
Original Assignee
Harbin University of Commerce
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/953 - Querying, e.g. by the use of web search engines
    • G06F16/9535 - Search customisation based on user profiles and personalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 - Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10 - Services
    • G06Q50/14 - Travel agencies

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An ice and snow scenic spot recommendation method and system integrating preference and eliminating popularity deviation, relating to the technical field of scenic spot recommendation and intended to solve the problem that existing recommendation methods over-recommend popular scenic spots because of popularity deviation (popularity bias). The technical points of the invention comprise: grouping the gender, age and occupation of tourists respectively and mapping them to numerical values in [0, 1]; calculating the multi-feature preference similarity of tourists from the mapped gender, age and occupation values; calculating tourist preference values from the multi-feature preference similarity and the tourists' historical rating data; constructing a tourist-preference value matrix from the standardized tourist preference values; training a matrix decomposition model based on tourist multi-feature preferences with the tourist-preference value matrix and the historical rating data to obtain a trained tourist-preference value matrix; and making predictions and recommendations for a new user according to the trained tourist-preference value matrix. The invention effectively alleviates the situation in which low-popularity ice and snow scenic spots are difficult to recommend to tourists.

Description

Ice and snow scenic spot recommendation method and system integrating preference and eliminating popularity deviation
Technical Field
The invention relates to the technical field of scenic spot recommendation, in particular to a method and a system for recommending ice and snow scenic spots, which integrate preferences to eliminate popularity deviation.
Background
Tourism strongly supports economic development, contributes substantially to income growth, and helps shape a distinctive city image. Among tourist attractions, ice and snow scenic spots are distinctive, and their recommendation plays a significant role in tourist attraction recommendation as a whole. However, current ice and snow scenic spot recommendation methods suffer from popularity deviation: most of them recommend hot, that is, high-popularity, ice and snow scenic spots to tourists. Although this further raises awareness of the recommended scenic spots, over time it produces a long-tail phenomenon. Hot ice and snow scenic spots are over-recommended and easily become overcrowded, while niche and low-popularity scenic spots remain comparatively unnoticed and lose revenue. The prior art therefore leaves the following problems unsolved: 1) it does not account for the long-tail phenomenon caused by over-recommending high-popularity ice and snow scenic spots, and does not give low-popularity ice and snow scenic spots appropriate recommendation strength; 2) it does not ensure that ice and snow scenic spot recommendation keeps a high accuracy while the results also satisfy recommendation fairness and serve the needs of tourists with niche or novelty-seeking tastes.
Disclosure of Invention
In view of the above problems, the invention provides an ice and snow scenic spot recommendation method and system integrating preference and eliminating popularity deviation, so as to solve the problem that the existing ice and snow scenic spot recommendation method causes over recommendation due to the popularity deviation.
According to one aspect of the invention, an ice and snow scenic spot recommendation method integrating preferences and eliminating popularity deviation is provided, and the method comprises the following steps:
step one, acquiring a training data set; the training data set comprises historical tourist feature data and historical tourist rating data; the historical tourist feature data comprises the tourist ID, gender, age and occupation of each tourist; the historical tourist rating data comprises the ratings given by tourists to the ice and snow scenic spots they have visited;
step two, grouping the gender, age and occupation of the tourists respectively, and mapping each of them to numerical values in [0, 1];
step three, calculating the multi-feature preference similarity of the tourists from the mapped gender, age and occupation values, wherein the multi-feature preference similarity represents the preference similarity within tourist groups that have similar feature preferences;
step four, calculating tourist preference values from the multi-feature preference similarity of the tourists and the historical tourist rating data;
step five, standardizing the tourist preference values, and placing the standardized tourist preference values in one-to-one correspondence with the tourist IDs and the ice and snow scenic spot IDs to construct a tourist-preference value matrix; the row and column indices of the tourist-preference value matrix correspond to the tourist IDs and the ice and snow scenic spot IDs respectively, and each entry of the matrix is the standardized tourist preference value;
step six, training a matrix decomposition model based on tourist multi-feature preferences with the tourist-preference value matrix and the historical tourist rating data to obtain a trained tourist-preference value matrix, which is defined as the tourist-multi-feature matrix;
step seven, inputting the ID of the tourist to be recommended to, or that tourist ID together with the ID of a target ice and snow scenic spot, and looking up the predicted rating of one or more ice and snow scenic spots in the tourist-multi-feature matrix.
Further, linear mapping is adopted for the sex and occupation of the tourists in the step two, trapezoidal fuzzy number mapping is adopted for the age of the tourists, and the trapezoidal fuzzy number formula is as follows:
Figure BDA0003860384100000021
wherein x represents the input value of the tourist's age; a, b, c, d ∈ R with a ≤ b < c ≤ d; μ_A represents the membership degree of the trapezoidal fuzzy number A, with μ_A ∈ [0, 1].
Further, the formula for calculating the similarity of the multiple characteristic preferences of the tourists in the third step is as follows:
Figure BDA0003860384100000022
wherein TP(u, v) represents the preference similarity of tourists u and v; u_{i,j} and v_{i,j} respectively represent the mapping values of tourists u and v when the input value of a given feature j is i; k_f = 1 when feature j is gender or occupation, and k_f is the number of fuzzy sets when feature j is age; l represents the total number of features.
Further, the formula for calculating the guest preference value in step four is:
Figure BDA0003860384100000023
wherein P_{u,t} represents the preference value of tourist u for scenic spot t; N represents the set of tourists with similar feature preferences; r_{v,t} represents the actual rating given by tourist v to scenic spot t;
Figure BDA0003860384100000024
represent the averages of all ratings given by tourists u and v, respectively, to the ice and snow scenic spots they have visited.
Further, in the sixth step, the original loss function of the matrix decomposition model is improved in the matrix decomposition model based on the guest multi-feature preference, and the improved loss function is as follows:
Figure BDA0003860384100000031
wherein k represents the dimension of the latent factor space; r_{u,t} represents the rating given by tourist u to ice and snow scenic spot t; p_{u,k} and q_{k,t} respectively represent the k-dimensional tourist latent factor matrix and the k-dimensional ice and snow scenic spot latent factor matrix; λ represents the regularization coefficient; e is the natural constant (the base of the natural logarithm); n_{u,t} represents the standardized tourist preference value; M_{u,k} represents the preference value of tourist u in dimension k; M_{k,t} represents the preference value of scenic spot t in dimension k.
Further, after obtaining the tourist-multi-feature matrix, carrying out grading compensation on the tourist-multi-feature matrix according to the tourist preference value after standardization processing; wherein, the scoring compensation calculation formula is as follows:
Figure BDA0003860384100000032
wherein newM represents the tourist-multi-feature matrix after score compensation; M_{u,t} represents the value in the tourist-preference value matrix corresponding to tourist u and scenic spot t.
According to another aspect of the present invention, there is provided an ice and snow attraction recommendation system fusing preferences to eliminate popularity bias, the system comprising:
a data acquisition module configured to acquire a training data set; the training data set comprises historical visitor feature data and visitor historical scoring data; the historical tourist feature data comprises tourist IDs, sexes, ages and occupations of tourists; the historical scoring data of the tourists comprises scoring scores of the tourists on the ice and snow scenic spots played once;
the grouping mapping module is configured to group the sex, the age and the occupation of the tourist respectively and map the sex, the age and the occupation of the tourist into numerical values between [0 and 1 ];
the preference similarity calculation module is configured to calculate and obtain the multi-feature preference similarity of the tourists according to the mapping values of the sex, the age and the occupation of the tourists, and the multi-feature preference similarity of the tourists is used for representing the preference similarity of the tourist group with the similar feature preference; the calculation formula of the multi-feature preference similarity of the tourists is as follows:
Figure BDA0003860384100000033
wherein TP(u, v) represents the preference similarity of tourists u and v; u_{i,j} and v_{i,j} respectively represent the mapping values of tourists u and v when the input value of a given feature j is i; k_f = 1 when feature j is gender or occupation, and k_f is the number of fuzzy sets when feature j is age; l represents the total number of features;
a preference value calculation module configured to calculate a guest preference value according to the guest multi-feature preference similarity and the guest history scoring data; the calculation formula of the tourist preference value is as follows:
Figure BDA0003860384100000041
wherein P_{u,t} represents the preference value of tourist u for scenic spot t; N represents the set of tourists with similar feature preferences; r_{v,t} represents the actual rating given by tourist v to scenic spot t;
Figure BDA0003860384100000042
represent the averages of all ratings given by tourists u and v, respectively, to the ice and snow scenic spots they have visited;
the preference value matrix construction module is configured to standardize the tourist preference values and place the standardized tourist preference values in one-to-one correspondence with the tourist IDs and the ice and snow scenic spot IDs to construct a tourist-preference value matrix; the row and column indices of the tourist-preference value matrix correspond to the tourist IDs and the ice and snow scenic spot IDs respectively, and each entry of the matrix is the standardized tourist preference value;
a preference value matrix training module configured to train a matrix decomposition model based on visitor multi-feature preference according to the visitor-preference value matrix and the visitor historical scoring data, obtain a trained visitor-preference value matrix, and define the matrix as a visitor-multi-feature matrix;
and the recommending module is configured to input the ID of the tourist to be recommended or input the ID of the tourist to be recommended and the ID of the target ice and snow scenery spot, and search and obtain the scoring predicted value of one or more ice and snow scenery spots according to the tourist-multi-feature matrix.
Further, linear mapping is adopted in the grouping mapping module for the sex and occupation of the tourists, trapezoidal fuzzy number mapping is adopted for the age of the tourists, and the trapezoidal fuzzy number formula is as follows:
Figure BDA0003860384100000043
wherein x represents the input value of the tourist's age; a, b, c, d ∈ R with a ≤ b < c ≤ d; μ_A represents the membership degree of the trapezoidal fuzzy number A, with μ_A ∈ [0, 1].
Further, the preference value matrix training module improves an original loss function of the matrix decomposition model in the matrix decomposition model based on the guest multi-feature preference, and the improved loss function is as follows:
Figure BDA0003860384100000044
wherein k represents the dimension of the latent factor space; r_{u,t} represents the rating given by tourist u to ice and snow scenic spot t; p_{u,k} and q_{k,t} respectively represent the k-dimensional tourist latent factor matrix and the k-dimensional ice and snow scenic spot latent factor matrix; λ represents the regularization coefficient; e is the natural constant (the base of the natural logarithm); n_{u,t} represents the standardized tourist preference value; M_{u,k} represents the preference value of tourist u in dimension k; M_{k,t} represents the preference value of scenic spot t in dimension k.
Further, after the preference value matrix training module obtains the visitor-multi-feature matrix, the visitor-multi-feature matrix is subjected to grading compensation according to the visitor preference value after standardization processing; wherein, the scoring compensation calculation formula is as follows:
Figure BDA0003860384100000051
wherein newM represents the tourist-multi-feature matrix after score compensation; M_{u,t} represents the value in the tourist-preference value matrix corresponding to tourist u and scenic spot t.
The beneficial technical effects of the invention are as follows:
According to the method, tourists are grouped by type according to their gender, age and occupation information, the three features are each mapped to numerical values in [0, 1], and a tourist-multi-feature matrix is then constructed, so that the tourists' real preferences are captured better, the recommendation result is less affected by popularity deviation, and tourists gain more control over the recommendation result. Moreover, using the tourists' historical rating information for ice and snow scenic spots, a matrix decomposition loss function based on tourist multi-feature preferences is constructed by fusing in the tourist-multi-feature matrix; integrating the multi-feature preferences into model training alleviates the popularity deviation inherent in matrix decomposition recommendation models. Furthermore, the standardized tourist preference value is used as a compensation term for the training result of the matrix decomposition model based on tourist multi-feature preferences, which greatly reduces popularity deviation, avoids the continual over-recommendation of high-popularity ice and snow scenic spots seen in conventional matrix decomposition recommendation models, and effectively alleviates the situation in which low-popularity ice and snow scenic spots rarely appear in a tourist's recommendation list.
Drawings
Fig. 1 is a schematic flow chart of a method for recommending ice and snow spots by fusing preferences and eliminating popularity deviation according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of an ice and snow attraction recommendation system that integrates preferences to eliminate popularity deviation according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the disclosure, exemplary embodiments or examples of the disclosure are described below with reference to the accompanying drawings. It is obvious that the described embodiments or examples are only some, but not all embodiments or examples of the invention. All other embodiments or examples, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments or examples in the present invention, shall fall within the protection scope of the present invention.
The method takes full account of the over-recommendation of popular ice and snow scenic spots caused by popularity deviation. Over-recommending hot ice and snow scenic spots gives the recommendations a long-tail distribution, so that visitor traffic concentrates in high-popularity ice and snow scenic spots and crowds gather there. Low-popularity ice and snow scenic spots, by contrast, rarely reach a tourist's recommendation list because of popularity deviation, which is unfair both to those scenic spots and to the tourists. To address these problems, the invention uses the tourists' multi-feature information to capture their real preferences, reducing the influence of popularity deviation on the ice and snow scenic spot recommendation result and increasing the influence of the tourists' real preferences on it. Different mapping methods are applied according to the nature of each kind of tourist feature information, which further improves how accurately tourist preferences are captured and effectively reduces the problems caused by popularity deviation. The tourists' multi-feature preferences are fused into the matrix decomposition model to reduce the popularity deviation inherent in the traditional matrix decomposition model, and the resulting initial recommendation result is then score-compensated with the tourist multi-feature preference values, which further reduces the popularity deviation that arises from the design of matrix-decomposition-based recommendation models and improves recommendation fairness.
An ice and snow scenic spot recommendation method for fusing preferences and eliminating popularity deviation is shown in fig. 1 and comprises the following steps:
step one, acquiring a training data set; the training data set comprises historical visitor feature data and visitor historical scoring data; the historical tourist feature data comprises tourist IDs, sexes, ages and occupations of the tourists; the historical scoring data of the tourists comprises scoring values of the tourists on the ice and snow scenic spots played once;
step two, respectively grouping the sex, the age and the occupation of the tourists, and mapping the sex, the age and the occupation of the tourists into numerical values between [0 and 1 ];
step three, calculating and obtaining the multi-feature preference similarity of the tourists according to the mapping values of the sex, the age and the occupation of the tourists, wherein the multi-feature preference similarity of the tourists is used for expressing the preference similarity of the tourist groups with similar feature preferences;
step four, calculating to obtain a tourist preference value according to the multi-feature preference similarity of the tourists and the historical scoring data of the tourists;
fifthly, standardizing the tourist preference values, and carrying out one-to-one correspondence on the standardized tourist preference values, the tourist IDs and the ice and snow scenic spot IDs to construct a tourist-preference value matrix; the serial numbers of the rows and the columns of the tourist-preference value matrix respectively correspond to tourist IDs and ice and snow scenic spots IDs, and the value of the tourist-preference value matrix is the tourist preference value after standardized processing;
step six, training a matrix decomposition model based on the multi-feature preference of the tourists according to the tourist-preference value matrix and the historical scoring data of the tourists to obtain a trained tourist-preference value matrix which is defined as a tourist-multi-feature matrix;
and seventhly, inputting the ID of the tourist to be recommended or inputting the ID of the tourist to be recommended and the ID of the target ice and snow scenery spot, and searching and obtaining the scoring predicted value of one or more ice and snow scenery spots according to the tourist-multi-feature matrix.
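Step seven amounts to a lookup in the trained matrix. The sketch below illustrates that lookup under stated assumptions: the trained tourist-multi-feature matrix is represented as a nested dictionary of predicted ratings, and the function name `recommend`, the parameter `top_n` and the sample values are hypothetical, not taken from the patent.

```python
def recommend(matrix, tourist_id, spot_id=None, top_n=3):
    """Look up predicted ratings for one tourist.

    matrix: dict mapping tourist ID -> {scenic spot ID: predicted rating}
            (an assumed representation of the trained matrix).
    If spot_id is given, return the prediction for that spot only;
    otherwise return the top_n spots ranked by predicted rating.
    """
    row = matrix[tourist_id]
    if spot_id is not None:
        return [(spot_id, row[spot_id])]
    ranked = sorted(row.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical predicted ratings for two tourists.
predicted = {
    1: {13: 4.2, 20: 3.0, 50: 4.8},
    2: {13: 3.5, 20: 4.1, 50: 2.9},
}
top_two = recommend(predicted, 1, top_n=2)
single = recommend(predicted, 2, spot_id=20)
```

Supplying only the tourist ID yields a ranked list, while supplying both IDs yields a single predicted rating, matching the two input modes described in step seven.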
The embodiment of the invention provides an ice and snow scenic spot recommendation method integrating preferences and eliminating popularity deviation, which comprises the following steps:
Step one: acquiring tourist feature information and historical tourist rating information; the tourist feature information comprises the gender, age and occupation of each tourist; the historical tourist rating information comprises the ratings given by tourists to the ice and snow scenic spots they have visited;
According to the embodiment of the invention, the tourists' historical ratings of ice and snow scenic spots and the gender, age and occupation given in their registration information are obtained, and a preliminary grouping is carried out by gender, age and occupation, in which tourists of the same gender, age or occupation are placed in one group.
Step two: according to the tourists' gender information, the tourists are grouped by gender into "male", "female" and "tourist who has not set a gender"; according to the tourists' age information, the tourists are grouped by age using trapezoidal fuzzy numbers into "young", "middle-aged" and "old"; according to the tourists' occupation information, the tourists are grouped by occupation into "other", "academic/educator", "artist", "clerk/manager", "college student/researcher", "customer service", "doctor/health", "executive/management", "farmer", "housewife", "young/high school", "attorney", "programmer", "retired", "sales/marketing", "scientist", "employer", "technician/engineer", "businessman/craftsman", "unemployed" and "writer";
According to the embodiment of the present invention, the gender input value is set to "male" = 1, "female" = 0 and "tourist who has not set a gender" = 0.5; the gender input value is mapped with y_1 = x_1, which leaves these values unchanged, where x_1 is the gender input value and y_1 is the gender mapping value.
According to an embodiment of the present invention, the age is input as an ordinary age value, for example 20 (years); the age input value is then mapped to a continuous value in [0, 1] by a trapezoidal fuzzy number; the trapezoidal fuzzy number formula is:
Figure BDA0003860384100000071
wherein a, b, c, d ∈ R with a ≤ b < c ≤ d; μ_A represents the membership degree of the trapezoidal fuzzy number A, with μ_A ∈ [0, 1].
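The formula image is not reproduced in this text, so the sketch below uses the standard textbook definition of a trapezoidal membership function, which is consistent with the stated constraint a ≤ b < c ≤ d. The breakpoints chosen for the three age groups and the names `trapezoid_membership`, `AGE_SETS` and `age_memberships` are hypothetical illustrations, not values from the patent.

```python
def trapezoid_membership(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy number (a, b, c, d)."""
    if not (a <= b < c <= d):
        raise ValueError("expected a <= b < c <= d")
    if b <= x <= c:      # plateau: full membership
        return 1.0
    if a < x < b:        # rising edge
        return (x - a) / (b - a)
    if c < x < d:        # falling edge
        return (d - x) / (d - c)
    return 0.0           # outside the support


# Hypothetical breakpoints for three overlapping age groups.
AGE_SETS = {
    "young": (0, 0, 25, 35),
    "middle-aged": (25, 35, 50, 60),
    "old": (50, 60, 120, 120),
}


def age_memberships(age):
    """Map one age to its membership degree in every age group."""
    return {name: trapezoid_membership(age, *params)
            for name, params in AGE_SETS.items()}


memberships_at_30 = age_memberships(30)
```

With these assumed breakpoints an age of 30 belongs partly to "young" and partly to "middle-aged", which is the kind of overlap the fuzzy mapping is meant to express.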
According to the embodiment of the invention, each occupation is input as the value in [0, 20] given by its position in the grouping order, for example "other" = 0 and "academic/educator" = 1; the occupation input value is mapped by
Figure BDA0003860384100000072
to a continuous value in [0, 1], where x_2 is the occupation input value and y_2 is the occupation mapping value.
The gender, age and occupation feature information of the tourists is mapped to the range [0, 1]. Tourist age is treated as ambiguous: a given age cannot be assigned to exactly one age range, but may belong to one or more age ranges, so a trapezoidal fuzzy number is used to map tourist ages. The fuzzy number is a concept from fuzzy mathematics that performs well when handling uncertain factors. Given the nature of tourist age, the trapezoidal fuzzy number is used to map the age to the range [0, 1], which effectively handles the ambiguity of age; the mapping result is a continuous value in [0, 1].
Tourist gender and tourist occupation are not ambiguous, so they are deterministic features. For tourist gender, the input value is therefore set to 1 for male, 0 for female, and 0.5 for a tourist who has not set a gender, and the mapping y_1 = x_1 is applied, giving "male" = 1, "female" = 0 and "tourist who has not set a gender" = 0.5, where x_1 is the gender input value and y_1 is the gender mapping value.
For tourist occupation, 21 occupations are considered: "other", "academic/educator", "artist", "clerk/manager", "college student/researcher", "customer service", "doctor/health", "executive/management", "farmer", "housewife", "young/high school", "attorney", "programmer", "retired", "sales/marketing", "scientist", "employer", "technician/engineer", "businessman/craftsman", "unemployed" and "writer". They are converted, in this order, into input values in [0, 20], that is, "other" = 0, "academic/educator" = 1, "artist" = 2, and so on. The mapping
Figure BDA0003860384100000081
is then applied to obtain a continuous value in [0, 1], where x_2 is the occupation input value and y_2 is the occupation mapping value.
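The two deterministic mappings can be sketched as follows. The gender mapping follows the stated y_1 = x_1. The occupation formula image is not reproduced in this text, so the code assumes the simplest linear scaling of the input range [0, 20] onto [0, 1], namely y_2 = x_2 / 20; that formula, the dictionary key "unset", and the names `map_gender` and `map_occupation` are assumptions for illustration.

```python
# Occupation labels in the grouping order given in the text (index = input value x_2).
OCCUPATIONS = [
    "other", "academic/educator", "artist", "clerk/manager",
    "college student/researcher", "customer service", "doctor/health",
    "executive/management", "farmer", "housewife", "young/high school",
    "attorney", "programmer", "retired", "sales/marketing", "scientist",
    "employer", "technician/engineer", "businessman/craftsman",
    "unemployed", "writer",
]

# Gender input values x_1 as stated in the text; "unset" is a hypothetical key.
GENDER_VALUES = {"male": 1.0, "female": 0.0, "unset": 0.5}


def map_gender(label):
    """Identity mapping y_1 = x_1 on the gender input value."""
    return GENDER_VALUES[label]


def map_occupation(label):
    """Assumed linear mapping y_2 = x_2 / 20 of the occupation index onto [0, 1]."""
    x2 = OCCUPATIONS.index(label)
    return x2 / (len(OCCUPATIONS) - 1)


artist_value = map_occupation("artist")  # index 2 in the list above
```

Because occupations are nominal categories, any such numeric mapping imposes an ordering that depends on list position, which is worth keeping in mind when the mapped values feed a similarity measure.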
Based on the differing nature of the tourist features, the invention applies a different mapping to each feature, which avoids the rigid partitioning used in traditional recommendation methods while still modelling the features that are fuzzy in nature. The detailed information is shown in the following table:
Figure BDA0003860384100000082
Figure BDA0003860384100000091
Step three: tourist groups with similar feature preferences are obtained using the tourist multi-feature preference similarity formula, which is as follows:
Figure BDA0003860384100000092
wherein TP(u, v) represents the preference similarity of tourists u and v; u_{i,j} and v_{i,j} respectively represent the mapping results of tourists u and v when the input value of a given feature j is i; k_f = 1 when feature j is gender or occupation, and k_f is the number of fuzzy sets when feature j is age; l represents the total number of features.
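The similarity formula itself is an image that is not reproduced in this text. The sketch below is therefore an assumed stand-in, not the patented formula: for each feature it takes one minus the mean absolute difference of the k_f mapped values, then averages over the l features. It only illustrates how per-feature mapped values (one value for gender and occupation, one per fuzzy set for age) could be combined into a single score in [0, 1]; the function name and sample profiles are hypothetical.

```python
def feature_similarity(u, v):
    """Assumed stand-in for TP(u, v).

    u, v: dict mapping feature name -> list of mapped values in [0, 1].
          Gender and occupation carry one value (k_f = 1); age carries
          one membership degree per fuzzy set (k_f = number of fuzzy sets).
    """
    total = 0.0
    for feature in u:
        diffs = [abs(a - b) for a, b in zip(u[feature], v[feature])]
        total += 1.0 - sum(diffs) / len(diffs)   # per-feature similarity
    return total / len(u)                        # average over the l features


# Hypothetical mapped profiles: same age and occupation, different gender.
tourist_u = {"gender": [1.0], "age": [0.5, 0.5, 0.0], "occupation": [0.1]}
tourist_v = {"gender": [0.0], "age": [0.5, 0.5, 0.0], "occupation": [0.1]}
similarity_uv = feature_similarity(tourist_u, tourist_v)
```

Tourists whose similarity exceeds some threshold would then form the set N of tourists with similar feature preferences used in step four; the threshold is not specified in the text.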
Step four: calculating by adopting a weighted average strategy calculation formula to obtain a tourist preference value based on the similarity of the multi-characteristic preferences of the tourists, wherein the weighted average strategy calculation formula is as follows:
Figure BDA0003860384100000093
wherein P_{u,t} represents the preference value of tourist u for scenic spot t; N represents the set of tourists with similar feature preferences;
Figure BDA0003860384100000094
Figure BDA0003860384100000095
denote the averages of all ratings that tourists $u$ and $v$, respectively, have given to the ice and snow scenic spots they have visited; $TP(u, v)$ denotes the preference similarity of tourists $u$ and $v$; $r_{v,t}$ denotes tourist $v$'s actual rating of scenic spot $t$.
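The weighted average strategy can be illustrated with a short sketch. Because the formula is only available as a figure, the code assumes the usual mean-centred form that matches the symbols defined above: tourist $u$'s own average rating plus the similarity-weighted average of the neighbours' deviations from their own averages. The function name and the tuple layout are hypothetical.

```python
def predict_preference(mean_u, neighbours):
    """Estimate P_{u,t} for one tourist u and one scenic spot t.

    mean_u     -- average rating given by tourist u
    neighbours -- iterable of (similarity, rating_v_t, mean_v) for each
                  tourist v in N who rated scenic spot t

    Assumed form (not confirmed by the source text):
        P = mean_u + sum(TP * (r_vt - mean_v)) / sum(|TP|)
    """
    neighbours = list(neighbours)
    weight_total = sum(abs(sim) for sim, _, _ in neighbours)
    if weight_total == 0:
        # No usable neighbours: fall back to the tourist's own average.
        return mean_u
    weighted_dev = sum(sim * (r_vt - mean_v) for sim, r_vt, mean_v in neighbours)
    return mean_u + weighted_dev / weight_total


# Two similar tourists: one rated the spot above their average, one below.
example = predict_preference(3.0, [(0.8, 5.0, 4.0), (0.2, 2.0, 3.0)])
```

Centring each neighbour's rating on that neighbour's own average compensates for tourists who rate systematically high or low.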
The guest preference calculation results are shown in the following table:
| Tourist ID | Ice and snow scenic spot ID | Preference value |
| --- | --- | --- |
| 1 | 20 | 3.01 |
| 2 | 13 | 3.48 |
| 4 | 50 | 5.20 |
Step five: The tourist preference values from step four are standardized using a standardization formula, and a tourist-preference value matrix M is constructed from the standardized values. Specifically, the standardized tourist preference values are matched one-to-one with the tourist IDs and the ice and snow scenic spot IDs to construct the tourist-preference value matrix; the row and column indices of the matrix correspond to tourist IDs and ice and snow scenic spot IDs, respectively, and its entries are the standardized tourist preference values.
According to an embodiment of the present invention, the standardization formula is as follows:
Figure BDA0003860384100000101
where $U$ denotes the total number of tourists; $C$ denotes the total number of ice and snow scenic spots; $P_{u,t}$ denotes the preference value of tourist $u$ for scenic spot $t$; $n_{u,t}$ denotes the standardized value of $P_{u,t}$.
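A minimal sketch of step five follows. The standardization formula is only available as a figure, so the code assumes min-max scaling over all tourist and scenic spot pairs; the patent's formula may differ. The sample values are taken from the preference table above, and all function names are illustrative.

```python
def standardize_preferences(prefs):
    """Min-max scale preference values to [0, 1] (assumed standardization).

    prefs -- dict mapping (tourist_id, spot_id) -> preference value P_{u,t}
    """
    lowest, highest = min(prefs.values()), max(prefs.values())
    if highest == lowest:
        return {key: 0.0 for key in prefs}
    return {key: (value - lowest) / (highest - lowest) for key, value in prefs.items()}


def build_preference_matrix(standardized):
    """Arrange standardized values as matrix[tourist_id][spot_id]."""
    matrix = {}
    for (tourist_id, spot_id), value in standardized.items():
        matrix.setdefault(tourist_id, {})[spot_id] = value
    return matrix


prefs = {(1, 20): 3.01, (2, 13): 3.48, (4, 50): 5.20}
M = build_preference_matrix(standardize_preferences(prefs))
```

A nested dictionary is used here because tourist and scenic spot IDs need not be contiguous; a dense array with ID-to-index maps would work equally well.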
Step six: The tourist-preference value matrix M is fused with the tourists' historical ratings of the ice and snow scenic spots to construct a matrix factorization loss function based on tourist multi-feature preferences. This loss function is used as the objective function of the matrix factorization model, and training the model yields the tourist-multi-feature matrix.
According to an embodiment of the present invention, matrix factorization plays an important role in recommendation models. It takes explicit tourist data (ratings, preference values, etc.) or implicit data (whether an item was clicked or purchased, etc.) as raw data to construct a co-occurrence matrix. Through iterative updates, this matrix is factorized into latent factors of the tourists and of the ice and snow scenic spots, which are then used to fill in the unknown entries of the matrix and thereby predict the tourists' preferences for the ice and snow scenic spots. The loss function of the original matrix factorization model is as follows:
Figure BDA0003860384100000102
The invention improves on this original loss function. The matrix factorization loss function based on tourist multi-feature preferences is as follows:
Figure BDA0003860384100000103
where $k$ denotes the dimension of the latent factor space; $r_{u,t}$ denotes tourist $u$'s rating of ice and snow scenic spot $t$; $p_{u,k}$ and $q_{k,t}$ denote the $k$-dimensional tourist latent factor matrix and the $k$-dimensional ice and snow scenic spot latent factor matrix, respectively; $\lambda$ denotes the regularization coefficient; $e$ is the natural constant (the base of the natural logarithm); $n_{u,t}$ denotes the standardized tourist preference value; $M_{u,k}$ denotes the preference value of tourist $u$ in dimension $k$; $M_{k,t}$ denotes the preference value of scenic spot $t$ in dimension $k$.
Experiments show that the matrix factorization model based on tourist multi-feature preferences outperforms the original matrix factorization model in both recommendation accuracy and reduction of popularity deviation.
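For orientation, the baseline model can be sketched in a few lines. This is not the improved loss of the invention, whose exact form is only available as a figure. It assumes the original loss is the common regularized squared error, $\sum (r_{u,t} - p_u \cdot q_t)^2 + \lambda(\lVert p \rVert^2 + \lVert q \rVert^2)$, trained by stochastic gradient descent; all names and hyperparameter values are illustrative.

```python
import random


def init_factors(num_tourists, num_spots, k, seed=0):
    """Randomly initialise tourist factors P and scenic spot factors Q."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(num_tourists)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(num_spots)]
    return P, Q


def mf_loss(ratings, P, Q, lam):
    """Regularized squared error over the observed (u, t, r) triples."""
    squared_error = sum(
        (r - sum(pf * qf for pf, qf in zip(P[u], Q[t]))) ** 2 for u, t, r in ratings
    )
    penalty = sum(v * v for row in P for v in row) + sum(v * v for row in Q for v in row)
    return squared_error + lam * penalty


def train_mf(ratings, P, Q, lam=0.02, lr=0.01, epochs=500):
    """Update P and Q in place with per-rating gradient steps."""
    k = len(P[0])
    for _ in range(epochs):
        for u, t, r in ratings:
            err = r - sum(P[u][f] * Q[t][f] for f in range(k))
            for f in range(k):
                p_uf, q_tf = P[u][f], Q[t][f]
                # The constant factor 2 of the gradient is folded into lr.
                P[u][f] += lr * (err * q_tf - lam * p_uf)
                Q[t][f] += lr * (err * p_uf - lam * q_tf)
    return P, Q


# Toy data: (tourist index, scenic spot index, rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P, Q = init_factors(3, 3, k=2)
loss_before = mf_loss(ratings, P, Q, lam=0.02)
train_mf(ratings, P, Q)
loss_after = mf_loss(ratings, P, Q, lam=0.02)
```

The predicted rating of tourist `u` for scenic spot `t` is the dot product of `P[u]` and `Q[t]`, which is how the unknown entries of the rating matrix are filled in.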
The model training process is as follows:
First, the data obtained in step one are divided into a training set, a validation set and a test set in the ratio 7:2:1.
Second, the training set is used to train the parameters of the matrix factorization model based on tourist multi-feature preferences; the validation set is used to tune the model's hyperparameters; and the test set is used to evaluate the model's prediction accuracy.
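The 7:2:1 split described above can be sketched as follows. The text does not say whether the records are shuffled before splitting, so the shuffle and the fixed seed are assumptions, and the function name is illustrative.

```python
import random


def split_7_2_1(records, seed=42):
    """Shuffle the records and split them 7:2:1 into train, validation and test."""
    items = list(records)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 7 // 10
    n_val = len(items) * 2 // 10
    train = items[:n_train]
    validation = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, validation, test


train_set, validation_set, test_set = split_7_2_1(range(100))
```

Integer arithmetic is used for the cut points so that the three parts always add up to the full data set, with any remainder going to the test set.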
Finally, the ID of the target tourist is input to obtain the preliminary recommendation prediction scores. Specifically, either the ID of the tourist to be recommended, or that ID together with the ID of a target ice and snow scenic spot, is input, and the predicted rating of one or more ice and snow scenic spots is looked up in the tourist-multi-feature matrix. When only the target tourist ID is input, the row for that tourist ID gives the tourist's predicted ratings for all ice and snow scenic spots; when both the target tourist ID and the target scenic spot ID are input, the result is the tourist's predicted rating for that scenic spot. The preliminary recommendation prediction scores are shown in the following table:
Figure BDA0003860384100000111
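The two lookup modes described above can be sketched as below. The matrix layout (a dictionary of rows keyed by tourist ID) and the scores in it are hypothetical illustration data, not values from the patent.

```python
def lookup_scores(matrix, tourist_id, spot_id=None):
    """Look up predicted ratings in a tourist-multi-feature matrix.

    matrix -- dict: tourist_id -> {spot_id: predicted rating}
    With only tourist_id, the whole row is returned; with both IDs,
    the single predicted rating is returned.
    """
    row = matrix[tourist_id]
    if spot_id is None:
        return dict(row)
    return row[spot_id]


# Hypothetical predicted ratings, for illustration only.
predicted = {
    1: {13: 3.9, 20: 3.1, 50: 4.4},
    2: {13: 3.5, 20: 2.8, 50: 4.0},
}
all_spots_for_tourist_1 = lookup_scores(predicted, 1)
single_score = lookup_scores(predicted, 2, 50)
```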
Step seven: Score compensation is applied to the tourist-multi-feature matrix from step six according to the standardized tourist preference values obtained in step five, yielding the tourists' final predicted ratings of the scenic spots.
According to an embodiment of the present invention, the score compensation formula is as follows:
Figure BDA0003860384100000112
where newM is the matrix after score compensation; $M_{u,t}$ denotes the entry of the tourist-preference value matrix for tourist $u$ and scenic spot $t$.
The scoring prediction value of the tourist for the ice and snow scenic spots is shown in the following table:
Figure BDA0003860384100000113
Step eight: The final predicted ratings are sorted in descending order and the Top-K scenic spots are recommended to the tourist, where K is the number of recommendations.
According to an embodiment of the present invention, once the above steps have produced the tourist's predicted ratings of the ice and snow scenic spots, K = 3 is set, meaning that the three scenic spots with the highest predicted ratings are recommended to the tourist. Examples are shown in the following table.
Figure BDA0003860384100000114
Figure BDA0003860384100000121
The final Top-3 recommendation list that the tourism system gives to the tourist is as follows:
Figure BDA0003860384100000122
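Top-K selection as described in step eight can be sketched as follows. The scenic spot IDs and scores are made-up illustration values, and the function name is hypothetical.

```python
import heapq


def top_k_spots(scores, k=3):
    """Return the k (spot_id, score) pairs with the highest predicted ratings."""
    return heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])


# Hypothetical final predicted ratings for one tourist.
final_scores = {20: 3.2, 13: 4.5, 50: 4.9, 7: 2.1}
recommendations = top_k_spots(final_scores, k=3)
```

`heapq.nlargest` avoids sorting the full list when K is small relative to the number of scenic spots; a plain descending sort followed by slicing gives the same result.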
Another embodiment of the present invention provides an ice and snow scenic spot recommendation system that integrates preferences to eliminate popularity deviation. As shown in Fig. 2, the system includes:
a data acquisition module 10 configured to acquire a training data set; the training data set comprises historical tourist feature data and historical tourist rating data; the historical tourist feature data comprises tourist ID, gender, age and occupation; the historical tourist rating data comprises the tourists' ratings of the ice and snow scenic spots they have visited;
a grouping mapping module 20 configured to group the tourists' gender, age and occupation separately and to map each of them to a numerical value in [0, 1];
a preference similarity calculation module 30 configured to calculate the tourist multi-feature preference similarity from the mapped values of the tourists' gender, age and occupation, wherein the tourist multi-feature preference similarity represents the preference similarity of tourist groups with similar feature preferences; the tourist multi-feature preference similarity is calculated as follows:
Figure BDA0003860384100000131
$TP(u, v)$ denotes the preference similarity of tourists $u$ and $v$; $u_{i,j}$ and $v_{i,j}$ denote the mapping values of tourists $u$ and $v$, respectively, when the input value of a given feature $j$ is $i$; $k_f = 1$ when feature $j$ is gender or occupation, and $k_f$ is the number of fuzzy sets when feature $j$ is age; $l$ denotes the total number of features;
a preference value calculation module 40 configured to calculate the tourist preference value from the tourist multi-feature preference similarity and the historical tourist rating data; the tourist preference value is calculated as follows:
Figure BDA0003860384100000132
where $P_{u,t}$ denotes the preference value of tourist $u$ for scenic spot $t$; $N$ denotes the set of tourists with similar feature preferences; $r_{v,t}$ denotes tourist $v$'s actual rating of scenic spot $t$;
Figure BDA0003860384100000133
denote the averages of all ratings that tourists $u$ and $v$, respectively, have given to the ice and snow scenic spots they have visited;
a preference value matrix construction module 50 configured to standardize the tourist preference values and to construct a tourist-preference value matrix by matching the standardized tourist preference values one-to-one with the tourist IDs and the ice and snow scenic spot IDs; the row and column indices of the tourist-preference value matrix correspond to tourist IDs and ice and snow scenic spot IDs, respectively, and its entries are the standardized tourist preference values;
a preference value matrix training module 60 configured to train the matrix factorization model based on tourist multi-feature preferences using the tourist-preference value matrix and the historical tourist rating data, and to obtain a trained tourist-preference value matrix, which is defined as the tourist-multi-feature matrix;
a recommendation module 70 configured to receive either the ID of the tourist to be recommended, or that ID together with the ID of a target ice and snow scenic spot, and to look up the predicted rating of one or more ice and snow scenic spots in the tourist-multi-feature matrix.
In this embodiment, preferably, the grouping mapping module 20 uses linear mapping for the tourist's gender and occupation and trapezoidal fuzzy number mapping for the tourist's age; the trapezoidal fuzzy number formula is:
Figure BDA0003860384100000141
where $x$ denotes the input value of the tourist's age; $a, b, c, d \in \mathbb{R}$ with $a \le b < c \le d$; $\mu_A$ denotes the membership degree of the trapezoidal fuzzy number, with $\mu_A \in [0, 1]$.
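A trapezoidal membership function consistent with the constraint $a \le b < c \le d$ can be sketched as follows. The formula itself is only available as a figure, so the code assumes the standard piecewise-linear definition: zero outside $(a, d)$, rising on $(a, b)$, equal to one on $[b, c]$, and falling on $(c, d)$. The age breakpoints in the example are hypothetical.

```python
def trapezoid_membership(x, a, b, c, d):
    """Membership degree of x in the trapezoidal fuzzy number (a, b, c, d)."""
    if not (a <= b < c <= d):
        raise ValueError("parameters must satisfy a <= b < c <= d")
    if b <= x <= c:
        return 1.0                  # plateau
    if a < x < b:
        return (x - a) / (b - a)    # rising edge
    if c < x < d:
        return (d - x) / (d - c)    # falling edge
    return 0.0                      # outside the support


# Hypothetical "young adult" age set with breakpoints 18, 25, 35, 45.
degree_at_30 = trapezoid_membership(30, 18, 25, 35, 45)
degree_at_40 = trapezoid_membership(40, 18, 25, 35, 45)
```

Because neighbouring age sets can overlap on their sloping edges, one age may belong to several fuzzy sets with different degrees, which is why the similarity formula uses $k_f$ fuzzy sets for the age feature.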
In this embodiment, preferably, in the preference value matrix training module 60, the matrix factorization model based on tourist multi-feature preferences improves on the original loss function of the matrix factorization model; the improved loss function is:
Figure BDA0003860384100000142
where $k$ denotes the dimension of the latent factor space; $r_{u,t}$ denotes tourist $u$'s rating of ice and snow scenic spot $t$; $p_{u,k}$ and $q_{k,t}$ denote the $k$-dimensional tourist latent factor matrix and the $k$-dimensional ice and snow scenic spot latent factor matrix, respectively; $\lambda$ denotes the regularization coefficient; $e$ is the natural constant (the base of the natural logarithm); $n_{u,t}$ denotes the standardized tourist preference value; $M_{u,k}$ denotes the preference value of tourist $u$ in dimension $k$; $M_{k,t}$ denotes the preference value of scenic spot $t$ in dimension $k$.
In this embodiment, preferably, after obtaining the tourist-multi-feature matrix, the preference value matrix training module 60 applies score compensation to the tourist-multi-feature matrix according to the standardized tourist preference values; the score compensation formula is as follows:
Figure BDA0003860384100000143
where newM denotes the tourist-multi-feature matrix after score compensation; $M_{u,t}$ denotes the entry of the tourist-preference value matrix for tourist $u$ and scenic spot $t$.
The functions of the ice and snow scenic spot recommendation system of this embodiment are explained by the recommendation method described above. For details not covered in the system embodiment, refer to the method embodiment; they are not repeated here.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. The present invention has been disclosed in an illustrative rather than a restrictive sense, and the scope of the present invention is defined by the appended claims.

Claims (10)

1. An ice and snow scenic spot recommendation method integrating preferences and eliminating popularity deviation, characterized by comprising the following steps:
step one, acquiring a training data set; the training data set comprises historical tourist feature data and historical tourist rating data; the historical tourist feature data comprises the tourist ID, gender, age and occupation of the tourists; the historical tourist rating data comprises the tourists' ratings of the ice and snow scenic spots they have visited;
step two, grouping the gender, age and occupation of the tourists separately, and mapping each of them to a numerical value in [0, 1];
step three, calculating the tourist multi-feature preference similarity from the mapped values of the gender, age and occupation of the tourists, wherein the tourist multi-feature preference similarity represents the preference similarity of tourist groups with similar feature preferences;
step four, calculating the tourist preference value from the tourist multi-feature preference similarity and the historical tourist rating data;
step five, standardizing the tourist preference values, and matching the standardized tourist preference values one-to-one with the tourist IDs and the ice and snow scenic spot IDs to construct a tourist-preference value matrix; the row and column indices of the tourist-preference value matrix correspond to tourist IDs and ice and snow scenic spot IDs, respectively, and its entries are the standardized tourist preference values;
step six, training a matrix factorization model based on tourist multi-feature preferences using the tourist-preference value matrix and the historical tourist rating data, to obtain a trained tourist-preference value matrix, which is defined as a tourist-multi-feature matrix;
step seven, inputting the ID of the tourist to be recommended, or the ID of the tourist to be recommended together with the ID of a target ice and snow scenic spot, and looking up the predicted rating of one or more ice and snow scenic spots in the tourist-multi-feature matrix.
2. The method for recommending ice and snow scenic spots by fusing preferences and eliminating popularity deviation as claimed in claim 1, wherein in the second step, linear mapping is adopted for the sex and occupation of the tourist, trapezoidal fuzzy number mapping is adopted for the age of the tourist, and a trapezoidal fuzzy number formula is as follows:
Figure FDA0003860384090000011
where $x$ denotes the input value of the tourist's age; $a, b, c, d \in \mathbb{R}$ with $a \le b < c \le d$; $\mu_A$ denotes the membership degree of the trapezoidal fuzzy number, with $\mu_A \in [0, 1]$.
3. The method for recommending ice and snow scenic spots by fusing preferences and eliminating popularity bias as claimed in claim 2, wherein the formula for calculating the similarity of the multi-feature preferences of the tourists in the third step is as follows:
Figure FDA0003860384090000012
where $TP(u, v)$ denotes the preference similarity of tourists $u$ and $v$; $u_{i,j}$ and $v_{i,j}$ denote the mapping values of tourists $u$ and $v$, respectively, when the input value of a given feature $j$ is $i$; $k_f = 1$ when feature $j$ is gender or occupation, and $k_f$ is the number of fuzzy sets when feature $j$ is age; $l$ denotes the total number of features.
4. The method for recommending ice and snow spots, which integrates preferences to eliminate popularity bias, according to claim 3, wherein the formula for calculating the preference value of the tourist in step four is as follows:
Figure FDA0003860384090000021
where $P_{u,t}$ denotes the preference value of tourist $u$ for scenic spot $t$; $N$ denotes the set of tourists with similar feature preferences; $r_{v,t}$ denotes tourist $v$'s actual rating of scenic spot $t$;
Figure FDA0003860384090000022
denote the averages of all ratings that tourists $u$ and $v$, respectively, have given to the ice and snow scenic spots they have visited.
5. The method as claimed in claim 4, wherein in step six, the matrix factorization model based on tourist multi-feature preferences improves on the original loss function of the matrix factorization model, and the improved loss function is:
Figure FDA0003860384090000023
where $k$ denotes the dimension of the latent factor space; $r_{u,t}$ denotes tourist $u$'s rating of ice and snow scenic spot $t$; $p_{u,k}$ and $q_{k,t}$ denote the $k$-dimensional tourist latent factor matrix and the $k$-dimensional ice and snow scenic spot latent factor matrix, respectively; $\lambda$ denotes the regularization coefficient; $e$ is the natural constant (the base of the natural logarithm); $n_{u,t}$ denotes the standardized tourist preference value; $M_{u,k}$ denotes the preference value of tourist $u$ in dimension $k$; $M_{k,t}$ denotes the preference value of scenic spot $t$ in dimension $k$.
6. The method for recommending ice and snow scenic spots by fusing the preference to eliminate the popularity deviation according to any one of claims 1-5, wherein after a tourist-multi-feature matrix is obtained in the sixth step, the tourist-multi-feature matrix is subjected to grading compensation according to the standardized tourist preference value; wherein, the scoring compensation calculation formula is as follows:
Figure FDA0003860384090000024
where newM denotes the tourist-multi-feature matrix after score compensation; $M_{u,t}$ denotes the entry of the tourist-preference value matrix for tourist $u$ and scenic spot $t$.
7. An ice and snow attraction recommendation system fusing preferences to eliminate popularity bias, comprising:
a data acquisition module configured to acquire a training data set; the training data set comprises historical tourist feature data and historical tourist rating data; the historical tourist feature data comprises the tourist ID, gender, age and occupation of the tourists; the historical tourist rating data comprises the tourists' ratings of the ice and snow scenic spots they have visited;
the grouping mapping module is configured to group the sex, the age and the occupation of the tourists respectively and map the sex, the age and the occupation of the tourists into numerical values between [ 0-1 ];
the preference similarity calculation module is configured to calculate and obtain the multi-feature preference similarity of the tourists according to the mapping values of the sex, the age and the occupation of the tourists, and the multi-feature preference similarity of the tourists is used for representing the preference similarity of the tourist group with the similar feature preference; the calculation formula of the multi-feature preference similarity of the tourists is as follows:
Figure FDA0003860384090000031
$TP(u, v)$ denotes the preference similarity of tourists $u$ and $v$; $u_{i,j}$ and $v_{i,j}$ denote the mapping values of tourists $u$ and $v$, respectively, when the input value of a given feature $j$ is $i$; $k_f = 1$ when feature $j$ is gender or occupation, and $k_f$ is the number of fuzzy sets when feature $j$ is age; $l$ denotes the total number of features;
a preference value calculation module configured to calculate a guest preference value according to the guest multi-feature preference similarity and the guest history scoring data; the calculation formula of the tourist preference value is as follows:
Figure FDA0003860384090000032
where $P_{u,t}$ denotes the preference value of tourist $u$ for scenic spot $t$; $N$ denotes the set of tourists with similar feature preferences; $r_{v,t}$ denotes tourist $v$'s actual rating of scenic spot $t$;
Figure FDA0003860384090000033
denote the averages of all ratings that tourists $u$ and $v$, respectively, have given to the ice and snow scenic spots they have visited;
a preference value matrix construction module configured to standardize the tourist preference values and to match the standardized tourist preference values one-to-one with the tourist IDs and the ice and snow scenic spot IDs to construct a tourist-preference value matrix; the row and column indices of the tourist-preference value matrix correspond to tourist IDs and ice and snow scenic spot IDs, respectively, and its entries are the standardized tourist preference values;
a preference value matrix training module configured to train a matrix decomposition model based on visitor multi-feature preference according to the visitor-preference value matrix and the visitor historical scoring data, obtain a trained visitor-preference value matrix, and define the matrix as a visitor-multi-feature matrix;
and the recommending module is configured to input the ID of the tourist to be recommended or input the ID of the tourist to be recommended and the ID of the target ice and snow scenery spot, and search and obtain the scoring predicted value of one or more ice and snow scenery spots according to the tourist-multi-feature matrix.
8. The system of claim 7, wherein the grouping mapping module uses linear mapping for tourist gender and tourist occupation, and trapezoidal fuzzy number mapping for tourist age, and the trapezoidal fuzzy number formula is:
Figure FDA0003860384090000041
where $x$ denotes the input value of the tourist's age; $a, b, c, d \in \mathbb{R}$ with $a \le b < c \le d$; $\mu_A$ denotes the membership degree of the trapezoidal fuzzy number, with $\mu_A \in [0, 1]$.
9. The system of claim 8, wherein in the preference value matrix training module, the matrix factorization model based on tourist multi-feature preferences improves on the original loss function of the matrix factorization model, and the improved loss function is as follows:
Figure FDA0003860384090000042
where $k$ denotes the dimension of the latent factor space; $r_{u,t}$ denotes tourist $u$'s rating of ice and snow scenic spot $t$; $p_{u,k}$ and $q_{k,t}$ denote the $k$-dimensional tourist latent factor matrix and the $k$-dimensional ice and snow scenic spot latent factor matrix, respectively; $\lambda$ denotes the regularization coefficient; $e$ is the natural constant (the base of the natural logarithm); $n_{u,t}$ denotes the standardized tourist preference value; $M_{u,k}$ denotes the preference value of tourist $u$ in dimension $k$; $M_{k,t}$ denotes the preference value of scenic spot $t$ in dimension $k$.
10. The system for recommending ice and snow scenery spots fusing preferences to eliminate popularity deviation according to any one of claims 7-9, characterized in that the preference value matrix training module performs score compensation on the visitor-multi-feature matrix according to the visitor preference value after standardization processing after obtaining the visitor-multi-feature matrix; wherein, the score compensation calculation formula is as follows:
Figure FDA0003860384090000043
where newM denotes the tourist-multi-feature matrix after score compensation; $M_{u,t}$ denotes the entry of the tourist-preference value matrix for tourist $u$ and scenic spot $t$.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211161858.7A CN115438871A (en) 2022-09-23 2022-09-23 Ice and snow scenic spot recommendation method and system integrating preference and eliminating popularity deviation

Publications (1)

Publication Number Publication Date
CN115438871A true CN115438871A (en) 2022-12-06

Family

ID=84248094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211161858.7A Pending CN115438871A (en) 2022-09-23 2022-09-23 Ice and snow scenic spot recommendation method and system integrating preference and eliminating popularity deviation

Country Status (1)

Country Link
CN (1) CN115438871A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115809374A (en) * 2023-02-13 2023-03-17 四川大学 Method, system, device and storage medium for correcting mainstream deviation of recommendation system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination