CN106846082A - Tourism cold-start user product recommendation system and method based on hardware information - Google Patents
Tourism cold-start user product recommendation system and method based on hardware information
- Publication number
- CN106846082A, CN201611134210.5A
- Authority
- CN
- China
- Prior art keywords
- data
- cold
- arithmetic elements
- tourism
- cold start
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
- G06Q30/0255—Targeted advertisements based on user history
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
- G06Q30/0269—Targeted advertisements based on user profile or attribute
- G06Q30/0271—Personalized advertisement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/14—Travel agencies
Abstract
The invention discloses a tourism product recommendation system for cold-start users based on hardware information, and a corresponding recommendation method. The system includes a data preprocessing module, an algorithm module, and a prediction module. The data preprocessing module includes a data extraction unit, a data vectorization unit, and a data serialization unit; the algorithm module includes a Canopy computation unit, a Kmeans computation unit, and an RF computation unit. The invention uses users' behavioral information and fuses several algorithms: hardware combinations are the samples, and product browsing behavior supplies the features that are extracted and modeled, so that recommendations are made for each hardware combination. New users can therefore receive personalized recommendations without product information, which improves the user experience. Because the latest output of each unit is obtained from the daily updated user preference data, and the hyperparameter values of each algorithm in the algorithm module are updated periodically, the system continuously optimizes itself and adapts to different data, ensuring accurate and reliable recommendations with higher efficiency.
Description
Technical field
The invention belongs to the field of computer data processing, and specifically relates to a product recommendation system and recommendation method for cold-start users based on hardware information.
Background
Personalized recommendation is now increasingly common across system platforms. Existing recommendation methods mainly analyze a user's preferences from that user's historical operation records and then recommend the products that match those preferences. This both raises the enterprise's sales and improves user satisfaction, achieving two goals at once. For cold-start users, that is, new users, no historical operation records exist, so personalized recommendation from historical data is impossible. At present, recommendations for cold-start users are therefore mostly made in the following ways:
1. Popularity-based recommendation, which gives every user the same results. This clearly cannot meet users' individual needs, and products outside the popular list receive very little exposure, so it is hard to raise site-wide transaction volume or the exposure of new products.
2. Inviting business staff or product managers to group new users manually and define recommendation results for each group. The groups and results must be updated manually at regular intervals, which consumes substantial labor, cannot cope with massive data, and cannot handle users with new characteristics in time. It is inefficient, and the recommended products are relatively uniform and few in number.
3. Recommending according to the user's registration information or answers to prompt questions. Because this information is incomplete, the recommended products are not clearly personalized.
Summary of the invention
To solve the above problems, the invention discloses a more flexible and reliable system and method for recommending to new users. New users are classified by clustering first and classifying afterwards, and the popular items of the corresponding category are recommended.
To achieve the above object, the invention provides the following technical solution:
A tourism product recommendation system for cold-start users based on hardware information, including a data preprocessing module, an algorithm module, and a prediction module.
The data preprocessing module includes a data extraction unit, a data vectorization unit, and a data serialization unit. The data extraction unit selects, based on a time dimension, the data of users who have historical behavior, and extracts the corresponding cold-start information combination, the products browsed, and the PV counts to obtain a user preference data table. The data vectorization unit applies a data matrixing method to the user preference data table: it takes each cold-start information combination as an analysis object and the route numbers as the features of that combination, and obtains the correspondence between cold-start information combinations and the full product list. The data serialization unit serializes the correspondence table between cold-start information combinations and the product list produced by the data vectorization unit.
The computing module includes a Canopy computation unit, a Kmeans computation unit, and an RF computation unit. The Canopy computation unit processes the serialized cold-start data matrix to obtain a center point file. The Kmeans computation unit further optimizes the center points obtained by the Canopy computation unit to obtain more accurate center points. The RF computation unit takes the clustering result built on the center points from the Kmeans computation unit, obtains RF model training data by random sampling, and uses cross-validation to obtain the optimal RF prediction model.
The Kmeans computation unit calculates class centers from the center point file obtained by Canopy. The Kmeans computation unit also includes a ClusterClassifier subunit, which performs its calculation on the matrixed item-browsing data obtained by the data preprocessing module and the center point file obtained by Kmeans. It traverses the matrix, computes the distance between each vector and each center point, and takes the minimum distance as the criterion for deciding which class the vector belongs to. The class label is assigned to the corresponding cold-start information combination, so the cold-start information is clustered, and the most popular items under each class are calculated at the same time.
The prediction module inputs online data into the RF prediction model, obtains the returned predicted category, and retrieves the popular item list.
As a further improvement of the invention, a suitable clusterFilter is preset in the Canopy computation unit to remove center points whose clusters contain few samples.
As a further improvement of the invention, after the ClusterClassifier subunit outputs the clustering result, the ratio of between-class distance to within-class distance is increased by adjusting the Canopy parameters.
As a further improvement of the invention, the prediction module selects items with particular attributes from the recommendation list according to configured filter conditions.
A tourism product recommendation method for cold-start users based on hardware information, comprising the following steps:
Step 1: Based on a time dimension, select the data of users who have historical behavior, and extract the corresponding cold-start information combination, the products browsed, and the PV counts to obtain a user preference data table.
Step 2: Applying a data matrixing method to the user preference data table, take each cold-start information combination as an analysis object and the route numbers associated with it as its features, and obtain the correspondence between cold-start information combinations and the full product list as matrix data.
Step 3: Serialize the matrix data from step 2.
Step 4: Obtain the initial cluster center point file with the Canopy algorithm, including the number of clusters and the positions of the class centers. As an improvement of this step, a suitable clusterFilter should be preset to remove isolated center points from the clustering result.
Step 5: Take the serialized item-browsing data from step 3 and the center point file from step 4, and obtain the center point data after Kmeans calculation through the Mahout platform.
Step 6: Take the matrixed item-browsing data from step 2 and the center point file from step 5, traverse the matrix and compute the distance between each vector and each center point, take the minimum distance as the criterion for deciding which class the vector belongs to, and assign the class label to the corresponding cold-start information combination. This clusters the cold-start information and generates the popular items in each class.
Step 7: Take the cold-start information combinations and their classes from step 6, obtain RF model training data by random sampling, and verify the accuracy of the RF model's output by cross-validation. Adjust the number of trees and the tree depth set in the RF model in light of platform resource limits so that accuracy falls within an acceptable range, then store the final RF model.
Step 8: Receive online cold-start data and forward it to the RF model through an interface. After the predicted category is returned, send a request to the module that stores the popular items of each category and retrieve the popular item list.
As a further improvement of the invention, in step 4 a suitable clusterFilter is preset to remove center points whose clusters contain few samples.
As an improvement of the invention, in step 6 the ratio of between-class distance to within-class distance is also increased by adjusting parameters.
As an improvement of the invention, in step 8 items with particular attributes are also selected from the recommendation list according to configured filter conditions.
Compared with the prior art, the invention has the following advantages and beneficial effects:
The invention uses users' behavioral information and fuses several algorithms: hardware combinations are the samples, and product browsing behavior supplies the features that are extracted and modeled, so recommendations are made for each hardware combination. New users can receive personalized recommendations without product information, which improves the user experience and in turn raises the purchase conversion rate. Because the latest output of each unit of the computing module is obtained from the daily updated user preference data, and the hyperparameter values of each algorithm in the algorithm module are updated periodically, the system continuously optimizes itself and adapts to different data, ensuring accurate and reliable recommendations with higher efficiency. In addition, operators can choose which dimensions of user cold-start information to use, as well as the size and time range of the sample data, which meets system configuration needs.
Brief description of the drawings
Fig. 1 is the architecture diagram of the tourism product recommendation system for cold-start users based on hardware information provided by the invention.
Fig. 2 is the user preference data table obtained by the data extraction unit.
Fig. 3 is the correspondence matrix between cold-start information combinations and the full product list produced by the data vectorization unit.
Fig. 4 shows the initial cluster center points obtained by the Canopy computation unit.
Fig. 5 shows the class centers obtained by the Kmeans computation unit.
Fig. 6 shows the clustering result obtained by the ClusterClassifier subunit.
Fig. 7 is the flow chart of the tourism product recommendation method for cold-start users based on hardware information provided by the invention.
Detailed description of the embodiments
The technical solution provided by the invention is described in detail below with reference to specific embodiments. It should be understood that the following embodiments only illustrate the invention and do not limit its scope.
A tourism product recommendation system for cold-start users based on hardware information, as shown in Fig. 1, includes a data preprocessing module, a computing module, and a prediction module. The data preprocessing module extracts and preprocesses the data of users who have historical behavior to obtain a serialized user preference data matrix that contains cold-start information. The computing module performs clustering on the data matrix obtained by the data preprocessing module, thereby classifying the cold-start information and obtaining the popular items in each category, and obtains a prediction model that classifies new users. The prediction module obtains online cold-start data, obtains the predicted category by calling the prediction model produced by the computing module, and then retrieves the popular item list.
The data preprocessing module includes a data extraction unit, a data vectorization unit, and a data serialization unit. The data extraction unit selects, based on a time dimension, the data of users who have historical behavior, and extracts the corresponding cold-start information combination (such as terminal hardware information, App version number, and city), the products browsed, and the PV counts (traffic counts) to obtain a user preference data table. The structure of the table is shown in Fig. 2, where hwinfo is the cold-start information combination, dest_id is the product (main route) number, and num is the traffic count for that product from users of that hardware. The data vectorization unit preprocesses the data: applying a data matrixing method to the user preference data table in Fig. 2, it takes each cold-start information combination as an analysis object and the route numbers associated with it in Fig. 2 as its features. The resulting correspondence between cold-start information combinations and the full product list is shown in Fig. 3, where the value under each route-number feature is the traffic count of that cold-start information combination for that route. The data serialization unit serializes the correspondence table between cold-start information combinations and the product list produced by the data vectorization unit. Serialization is implemented with Mahout, specifically with Mahout's own vector type org.apache.mahout.math.SequentialAccessSparseVector. Object data in memory is saved to disk, which avoids the heavy cost of converting raw data (on disk) into Java objects (in memory) on every read and so improves computing efficiency on large, high-dimensional data.
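The matrixing step can be illustrated with a minimal Python sketch. It assumes the preference table arrives as (hwinfo, dest_id, num) records, as the column names in Fig. 2 suggest; the function name, the sample values, and the use of plain lists instead of Mahout sparse vectors are assumptions made only for illustration.

```python
from collections import defaultdict


def build_preference_matrix(rows):
    """Pivot (hwinfo, dest_id, num) records into one row vector per combination."""
    # Sum page views per (cold-start combination, route) pair.
    counts = defaultdict(lambda: defaultdict(int))
    for hwinfo, dest_id, num in rows:
        counts[hwinfo][dest_id] += num
    # Fix one column order so that every row vector has the same layout.
    routes = sorted({dest for per_hw in counts.values() for dest in per_hw})
    combos = sorted(counts)
    matrix = [[counts[hw].get(dest, 0) for dest in routes] for hw in combos]
    return combos, routes, matrix


# Hypothetical sample records; real values would come from the table in Fig. 2.
rows = [
    ("iPhone7|v5.1|Nanjing", 101, 12),
    ("iPhone7|v5.1|Nanjing", 205, 3),
    ("MI5|v5.0|Shanghai", 205, 7),
]
combos, routes, matrix = build_preference_matrix(rows)
```

Each row of matrix corresponds to one cold-start information combination and each column to one route number, which matches the layout described for Fig. 3. Routes a combination never browsed are filled with 0.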
The computing module includes a Canopy computation unit, a Kmeans computation unit, and an RF computation unit. The Canopy computation unit processes the serialized cold-start data matrix to obtain a center point file. The Kmeans computation unit further optimizes the center points obtained by the Canopy computation unit to obtain more accurate center points. The RF computation unit obtains the RF model from the clustering result built on the center points from the Kmeans computation unit.
The Canopy computation unit takes the serialized item-browsing data matrix from the preprocessing module (Fig. 3) as input. An initial value of T2 is obtained by calculating the distances between broadly similar items in the matrix, and on this basis the initial settings of the Canopy model parameters T1, T2, and clusterFilter are obtained. Specifically, the Canopy computation unit calculates the distances between all points and draws a three-dimensional scatter plot to analyze the distribution of all points (one point represents one row of the matrix). Suitable T1 and T2 are then chosen empirically: T1 generally does not exceed the maximum distance between any two points, and T2 is initially set to half the average distance between all points and then fine-tuned according to experimental results so that the number of clusters and the size of each class are acceptable. Finally, the Canopy computation unit produces the initial cluster center point file shown in Fig. 4, which fixes the number of clusters and the positions of the class centers required by the Kmeans algorithm. As an improvement, a suitable clusterFilter is preferably preset in the Canopy computation unit (set empirically; in this example, classes with fewer than 50 points are considered of little help to recommendation and are filtered out). This removes center points whose clusters contain few samples, avoids empty classes during the subsequent Kmeans clustering, and improves the reliability of the clustering result.
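As a rough illustration of the Canopy step, the following Python sketch implements a simplified single-machine canopy pass. It is not the Mahout implementation: the function names, the Euclidean distance, the use of the member centroid as the center, and the exact meaning of cluster_filter (keep canopies with more members than the threshold) are assumptions.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def initial_t2(points):
    """Half of the average pairwise distance, the starting value described above."""
    pairs = list(combinations(points, 2))
    return 0.5 * sum(euclidean(a, b) for a, b in pairs) / len(pairs)


def canopy_centers(points, t1, t2, cluster_filter=0):
    """Return centroids of canopies that hold more than cluster_filter points."""
    assert t1 > t2 > 0
    remaining = list(points)
    centers = []
    while remaining:
        seed = remaining[0]
        # The loose threshold T1 decides which points belong to this canopy.
        members = [p for p in remaining if euclidean(seed, p) <= t1]
        # The tight threshold T2 removes points from further seeding.
        remaining = [p for p in remaining if euclidean(seed, p) > t2]
        if len(members) > cluster_filter:
            dim = len(seed)
            centers.append(
                tuple(sum(p[i] for p in members) / len(members) for i in range(dim))
            )
    return centers


# Hypothetical points: two small groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
centers = canopy_centers(points, t1=3.0, t2=1.5, cluster_filter=1)
```

With cluster_filter=1 the isolated point at (50, 50) forms a canopy of size one and is dropped, so only two centers remain. In practice initial_t2 would supply the first guess for T2, which is then tuned by hand as the text describes.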
In the Kmeans computation unit, unlike the common practice of using the Kmeans algorithm to cluster every point and compute the class centers, the invention takes the serialized item-browsing data (Fig. 3) and the center point file obtained by Canopy (Fig. 4) as inputs. The Kmeans computation unit calculates the class centers from the Canopy center point file (Fig. 4); the resulting class centers are shown in Fig. 5 and are stored. The ClusterClassifier subunit of the Kmeans computation unit performs its calculation on the matrixed item-browsing data (Fig. 3) and the center point file obtained by Kmeans (Fig. 5). It traverses the matrix, computes the distance between each vector and each center point, takes the minimum distance as the criterion for deciding which class the vector belongs to, and assigns the class label to the corresponding cold-start information combination. It also calculates the most popular items under each class, where popularity is judged by user traffic and purchase volume. The ClusterClassifier subunit thus clusters the cold-start information and generates the popular items in each class; the clustering result and the popular items in each category are stored. The clustering result is shown in Fig. 6 (one point in the figure corresponds to one row of data in Fig. 3). Using the distributed ClusterClassifier method to classify large numbers of points in parallel improves efficiency and can be applied to real-time scenarios. As a further improvement of the invention, after the ClusterClassifier subunit outputs the clustering result, the clustering quality is judged by calculating the ratio of between-class distance to within-class distance. The Canopy parameters are then adjusted (mainly T1 and T2, with the outlier threshold as a secondary parameter) to keep increasing this ratio, so that classes are better separated and samples within a class are more tightly grouped. By using this distance ratio, we raised the RF self-test accuracy from 70% to above 90%.
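The nearest-center assignment, the per-class popular list, and the distance ratio can be sketched in a few lines of Python. This is an illustrative single-machine version, not the distributed ClusterClassifier: the function names are hypothetical, popularity is ranked by page views only (the text also uses purchase volume), and the ratio is taken here as mean center-to-center distance over mean point-to-own-center distance, which is one plausible reading of the source.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_classes(matrix, centers):
    """Label each row vector with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: euclidean(row, centers[k]))
        for row in matrix
    ]


def popular_routes(matrix, routes, labels, top_n=3):
    """Rank routes inside each class by total page views."""
    totals = defaultdict(lambda: [0] * len(routes))
    for row, label in zip(matrix, labels):
        for j, views in enumerate(row):
            totals[label][j] += views
    return {
        label: [routes[j] for j in sorted(range(len(routes)), key=lambda j: -sums[j])[:top_n]]
        for label, sums in totals.items()
    }


def separation_ratio(matrix, centers, labels):
    """Mean distance between centers divided by mean point-to-own-center distance."""
    within = sum(euclidean(row, centers[l]) for row, l in zip(matrix, labels)) / len(matrix)
    pairs = [(a, b) for i, a in enumerate(centers) for b in centers[i + 1:]]
    between = sum(euclidean(a, b) for a, b in pairs) / len(pairs)
    return between / within


# Hypothetical data: four cold-start combinations, three routes, two centers.
routes = [101, 205, 309]
matrix = [[9, 1, 0], [8, 2, 0], [0, 1, 7], [1, 0, 9]]
centers = [(8.5, 1.5, 0.0), (0.5, 0.5, 8.0)]
labels = assign_classes(matrix, centers)
popular = popular_routes(matrix, routes, labels, top_n=2)
ratio = separation_ratio(matrix, centers, labels)
```

A larger ratio means the classes are further apart relative to their spread, which is the direction the Canopy parameters are tuned toward. The sketch needs at least two centers and at least one point that does not coincide with its center.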
After the RF computation unit obtains the cold-start information combinations and their classes from ClusterClassifier, it obtains RF model training data by random sampling and verifies the accuracy of the RF model's output by cross-validation. The number of trees and the tree depth set in the RF model are adjusted in light of platform resource limits and the model's accuracy requirement. The RF model is implemented with Mahout in three main stages: generating the data description file, RF modeling, and cross-validation. Data preprocessing (converting the data into the input format required by the RF algorithm) and a model self-test step are added along the way to make the model more reliable. First, 70% of the ClusterClassifier output is randomly selected as model training data, and the remaining data is used for cross-validation. The data description file is obtained by calling Mahout and is part of the input to RF modeling. RF modeling places certain demands on physical memory, and out-of-memory errors are common during development. In this experiment, the parameter nbtrees sets the number of trees and the parameter ms adjusts the node-splitting threshold, which indirectly controls tree depth; self-tests on the model data then quickly yield several acceptable parameter combinations. The best modeling parameters are finally obtained by cross-validation, giving the optimal RF prediction model, which is stored. Because the clustering result from the ClusterClassifier subunit serves as the input to classification, training and test data are easier to obtain, the approach suits real-time online processing, and it generalizes well to users for whom some information is missing.
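The 70/30 split and the parameter search can be sketched as follows. This Python sketch does not train a random forest: train_fn stands for whatever trainer is used (Mahout in the source), and toy_trainer is a hypothetical stand-in that ignores its parameters so the example stays self-contained. The names holdout_split, accuracy, and select_parameters are likewise assumptions.

```python
import random


def holdout_split(samples, train_fraction=0.7, seed=0):
    """Shuffle labelled samples and split them into training and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, validation):
    """Fraction of validation samples whose predicted class matches the label."""
    hits = sum(1 for features, label in validation if predict(features) == label)
    return hits / len(validation)


def select_parameters(train_fn, samples, grid):
    """Try each (n_trees, min_split) pair and keep the most accurate on held-out data."""
    train, validation = holdout_split(samples)
    scored = []
    for n_trees, min_split in grid:
        predict = train_fn(train, n_trees, min_split)
        scored.append((accuracy(predict, validation), n_trees, min_split))
    return max(scored)


# Hypothetical stand-in trainer: ignores its parameters and thresholds one feature.
def toy_trainer(train, n_trees, min_split):
    return lambda features: 0 if features[0] < 5 else 1


# Each sample is (feature vector, class label from the clustering step).
samples = [((x,), 0 if x < 5 else 1) for x in range(10)]
best = select_parameters(toy_trainer, samples, [(10, 2), (50, 5)])
```

The first element of best is the held-out accuracy and the other two are the chosen parameter pair. A real run would also bound the grid by available memory, since the text notes that RF modeling often runs out of memory.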
The prediction module inputs online data into the RF prediction model through an interface. After the predicted category is returned, it sends a request to the module that stores the popular items of each category and retrieves the popular item list. The prediction module can also select items with particular attributes from the recommendation list according to configured filter conditions.
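The online path of the prediction module reduces to a lookup, shown here as a minimal Python sketch. The function recommend, the toy_model classifier, the item fields, and the in-memory dictionary standing in for the popular-item store are all hypothetical.

```python
def recommend(cold_start_info, predict_class, popular_by_class, keep=None, limit=10):
    """Predict the class of a new user and return that class's popular items."""
    label = predict_class(cold_start_info)
    items = popular_by_class.get(label, [])
    if keep is not None:
        # Optional filter condition: keep only items with the wanted attributes.
        items = [item for item in items if keep(item)]
    return items[:limit]


# Hypothetical popular-item store, keyed by predicted class.
popular_by_class = {
    0: [{"dest_id": 101, "days": 3}, {"dest_id": 205, "days": 7}],
    1: [{"dest_id": 309, "days": 5}],
}


# Hypothetical stand-in for the stored RF prediction model.
def toy_model(info):
    return 0 if info["hw"].startswith("iPhone") else 1


short_trips = recommend(
    {"hw": "iPhone7", "city": "Nanjing"},
    toy_model,
    popular_by_class,
    keep=lambda item: item["days"] <= 5,
)
```

Passing the classifier and the filter as callables keeps the lookup independent of how the model is served, which fits the interface-based design described above.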
Based on the above-mentioned tourism cold start-up consumer products commending system based on hardware information, present invention also offers based on hard
The tourism cold start-up consumer products of part information recommend method, as shown in fig. 7, comprises following steps:
Step 1:Select the user data of historical behavior based on time dimension, extract and corresponding cold open information combination(Such as terminal
Hardware information, App version numbers, place city etc.), product and PV numbers that correspondence is browsed(Flow number), obtain user preference number
According to table.
Step 2:By data matrix method, based on user preference data table, made using the cold information combination that opens
It is analysis object, by the cold circuit number for opening information combination as the cold feature for opening information combination, obtains cold opening information combination
With the corresponding relation of all product lists as matrix data.
Step 3:Matrix data in step 2 is serialized.
Step 4:Initial cluster center dot file is obtained by Canopy algorithms, including number of clusters and class center position
Put.As the improvement of the step, suitable clusterFilter, the isolated central point in removal cluster result should be preset.
Step 5:The center dot file that the serialized data and step 4 that the article that step 3 is obtained is browsed are obtained, passes through
Mahout platforms obtain the center point data after Kmeans is calculated, and storage center point data.
Step 6: Take the matrixed item-browsing data from Step 2 and the center file from Step 5, traverse the matrix and compute the distance from each vector to each center point, and use the minimum distance to determine the category of each vector. Assign the category label to the corresponding cold-start information combination, thereby clustering the cold-start information, and generate the popular items in each category. Both the clustering result and the popular items of each category are stored. As an improvement of this step, the ratio of between-class distance to within-class distance can also be calculated and used to judge and save the final clustering result, while the popular items under each category are collected and saved.
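Step 6 can be sketched as three small functions: nearest-center labelling, per-category popular items, and a between-class to within-class distance ratio. The patent does not give an exact formula for the ratio, so the definition used here (mean center-to-center distance divided by mean row-to-own-center distance) is an assumption, as are all function names and the sample data.

```python
import math
from collections import defaultdict

def assign_labels(matrix, centers):
    """Label each row with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: math.dist(row, centers[i]))
            for row in matrix]

def popular_products(matrix, labels, products, top_n=1):
    """Sum PV per product inside each category and return the top_n products."""
    totals = defaultdict(lambda: [0.0] * len(products))
    for row, label in zip(matrix, labels):
        totals[label] = [t + v for t, v in zip(totals[label], row)]
    return {
        label: [p for _, p in sorted(zip(sums, products), key=lambda sp: -sp[0])[:top_n]]
        for label, sums in totals.items()
    }

def separation_ratio(matrix, labels, centers):
    """Assumed ratio: mean center-to-center distance over mean row-to-center distance.

    Requires at least two centers and a nonzero within-class spread.
    """
    intra = sum(math.dist(r, centers[l]) for r, l in zip(matrix, labels)) / len(matrix)
    pairs = [(i, j) for i in range(len(centers)) for j in range(i + 1, len(centers))]
    inter = sum(math.dist(centers[i], centers[j]) for i, j in pairs) / len(pairs)
    return inter / intra

# Made-up PV matrix: rows are cold-start combinations, columns are products.
products = ["R1", "R2"]
matrix = [[5.0, 0.0], [4.0, 1.0], [0.0, 6.0], [1.0, 5.0]]
centers = [[4.5, 0.5], [0.5, 5.5]]
labels = assign_labels(matrix, centers)
popular = popular_products(matrix, labels, products)
ratio = separation_ratio(matrix, labels, centers)
```

A larger ratio indicates categories that are compact and well separated, which is one way to compare clustering results obtained with different parameters.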
Step 7: Take the cold-start information combinations and their categories from Step 6 and obtain RF (random forest) model training data by random sampling. Verify the accuracy of the RF model's output using cross-validation, adjust the number of trees and the tree depth set in the RF model in light of platform resource limits so that the accuracy falls within an acceptable range, and finally obtain the RF model and store it.
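The cross-validation loop in Step 7 can be sketched without any machine-learning library. In the sketch below, `k_fold_accuracy` is a hypothetical harness that accepts any `fit` callable, and `fit_nearest` is a deliberately simple stand-in classifier (1-nearest neighbour under Hamming distance), not a random forest. In practice the `fit` callable would wrap an RF implementation, and the harness would be called once per candidate setting of tree count and tree depth.

```python
import random

def k_fold_accuracy(fit, X, y, k=5, seed=0):
    """Average held-out accuracy over k folds; fit(X, y) must return a predict(x) callable."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)       # random sampling of the fold membership
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k

def fit_nearest(train_X, train_y):
    """Stand-in model (not a random forest): 1-nearest neighbour under Hamming distance."""
    def predict(x):
        best = min(range(len(train_X)),
                   key=lambda i: sum(a != b for a, b in zip(x, train_X[i])))
        return train_y[best]
    return predict

# Made-up training data: cold-start combinations and their category labels.
cities = ["Nanjing", "Suzhou", "Wuxi", "Shanghai", "Hangzhou"]
X = [("iPhone7", "5.1", c) for c in cities] + [("Mi5", "5.0", c) for c in cities]
y = [0] * 5 + [1] * 5
accuracy = k_fold_accuracy(fit_nearest, X, y, k=5)
```

Keeping the model behind a `fit` callable lets the same harness compare settings against the acceptable accuracy range described in the step.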
Step 8: Receive online cold-start data and forward it to the RF model through an interface. After the predicted category is returned, send a request to the module that stores the popular items of each category and retrieve the popular item list. As an improvement of this step, items with particular attributes can also be selected from the recommendation list according to preset filter conditions and pushed to the user's front-end interface.
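The online flow of Step 8 can be sketched as a single function. Everything here is hypothetical: `recommend`, the `predict_cluster` callable (standing in for the stored RF model behind its interface), the `popular_by_cluster` store, and the item fields are illustrative names, not part of the patent.

```python
def recommend(cold_start_info, predict_cluster, popular_by_cluster, keep=None, top_n=3):
    """Predict the category for a new user, fetch its popular items, optionally filter them."""
    cluster = predict_cluster(cold_start_info)
    items = popular_by_cluster.get(cluster, [])
    if keep is not None:
        # Optional improvement: keep only items that satisfy the preset filter condition.
        items = [item for item in items if keep(item)]
    return items[:top_n]

# Made-up popular-item store keyed by category.
popular_by_cluster = {
    0: [{"id": "R1", "type": "domestic"}, {"id": "R3", "type": "overseas"}],
    1: [{"id": "R2", "type": "domestic"}],
}

def predict_cluster(info):
    """Stand-in for the stored RF model; a real deployment would call the trained model."""
    return 0 if info["device"].startswith("iPhone") else 1

new_user = {"device": "iPhone7", "app_version": "5.1", "city": "Nanjing"}
result = recommend(new_user, predict_cluster, popular_by_cluster,
                   keep=lambda item: item["type"] == "domestic")
```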
The technical means disclosed in the present invention are not limited to those disclosed in the above embodiments, but also include technical solutions formed by any combination of the above technical features. It should be noted that those of ordinary skill in the art may make improvements and modifications without departing from the principles of the invention, and such improvements and modifications are also regarded as falling within the protection scope of the invention.
Claims (8)
1. A tourism cold-start user product recommendation system based on hardware information, characterized by comprising a data preprocessing module, an algorithm module, and a prediction module, wherein:
the data preprocessing module comprises a data extraction unit, a data vectorization unit, and a data serialization unit; the data extraction unit is used to select users' historical behavior data along the time dimension and extract the corresponding cold-start information combinations, the products browsed, and the PV counts, to obtain a user preference data table; the data vectorization unit is used to apply a data matrixing method to the user preference data table, taking each cold-start information combination as the object of analysis and its route numbers as its features, to obtain the correspondence between cold-start information combinations and the full product list; the data serialization unit is used to serialize the mapping table of cold-start information combinations and product lists produced by the data vectorization unit;
the algorithm module comprises a Canopy operation unit, a Kmeans operation unit, and an RF operation unit; the Canopy operation unit is used to compute on the serialized cold-start matrix data to obtain a center point file; the Kmeans operation unit further optimizes the center points obtained by the Canopy operation unit to obtain more accurate center points; the RF operation unit is used to obtain RF model training data by random sampling from the clustering result obtained by the Kmeans operation unit and, using cross-validation, to obtain the optimal RF prediction model;
the Kmeans operation unit is used to calculate class centers from the center point file obtained by Canopy, and further comprises a ClusterClassifier subunit; the ClusterClassifier subunit is used to take the matrixed item-browsing data obtained by the data preprocessing module and the center point file calculated by Kmeans, traverse the matrix and compute the distance from each vector to each center point, use the minimum distance to determine the category of each vector, assign the category label to the corresponding cold-start information combination so as to cluster the cold-start information, and at the same time calculate the most popular items under each category;
the prediction module is used to input online data into the RF prediction model, obtain the returned predicted category, and retrieve the popular item list.
2. The tourism cold-start user product recommendation system based on hardware information according to claim 1, characterized in that: in the Canopy operation unit, a suitable clusterFilter is preset to remove from the clustering result center points that contain few samples.
3. The tourism cold-start user product recommendation system based on hardware information according to claim 1, characterized in that: after the ClusterClassifier subunit outputs the clustering result, the ratio of between-class distance to within-class distance is further increased by adjusting the Canopy parameters.
4. The tourism cold-start user product recommendation system based on hardware information according to claim 1, characterized in that: the prediction module selects items with particular attributes from the recommendation list according to preset filter conditions.
5. A tourism cold-start user product recommendation method based on hardware information, characterized by comprising the following steps:
Step 1: select users' historical behavior data along the time dimension and extract the corresponding cold-start information combinations, the products browsed, and the PV counts, to obtain a user preference data table;
Step 2: using a data matrixing method on the user preference data table, take each cold-start information combination as the object of analysis and the route numbers of that combination as its features, and obtain the correspondence between cold-start information combinations and the full product list as matrix data;
Step 3: serialize the matrix data from Step 2;
Step 4: obtain the initial cluster center file, including the number of clusters and the cluster center positions, with the Canopy algorithm;
Step 5: take the serialized item-browsing data from Step 3 and the center file from Step 4, and run K-means on the Mahout platform to obtain the refined center point data;
Step 6: take the matrixed item-browsing data from Step 2 and the center file from Step 5, traverse the matrix and compute the distance from each vector to each center point, use the minimum distance to determine the category of each vector, assign the category label to the corresponding cold-start information combination, thereby clustering the cold-start information, and generate the popular items in each category;
Step 7: take the cold-start information combinations and their categories from Step 6, obtain RF model training data by random sampling, verify the accuracy of the RF model's output using cross-validation, adjust the number of trees and the tree depth set in the RF model in light of platform resource limits so that the accuracy falls within an acceptable range, and finally obtain the RF model and store it;
Step 8: receive online cold-start data and forward it to the RF model through an interface; after the predicted category is returned, send a request to the module that stores the popular items of each category and retrieve the popular item list.
6. The tourism cold-start user product recommendation method based on hardware information according to claim 5, characterized in that: in Step 4, a suitable clusterFilter is preset to remove from the clustering result center points that contain few samples.
7. The tourism cold-start user product recommendation method based on hardware information according to claim 5, characterized in that: in Step 6, the ratio of between-class distance to within-class distance is further increased by adjusting parameters.
8. The tourism cold-start user product recommendation method based on hardware information according to claim 5, characterized in that: in Step 8, items with particular attributes are further selected from the recommendation list according to preset filter conditions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611134210.5A CN106846082B (en) | 2016-12-10 | 2016-12-10 | Travel cold start user product recommendation system and method based on hardware information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106846082A true CN106846082A (en) | 2017-06-13 |
CN106846082B CN106846082B (en) | 2021-07-30 |
Family
ID=59140727
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611134210.5A Active CN106846082B (en) | 2016-12-10 | 2016-12-10 | Travel cold start user product recommendation system and method based on hardware information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106846082B (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009877A (en) * | 2017-11-24 | 2018-05-08 | 阿里巴巴集团控股有限公司 | Information mining method and device |
CN108629665A (en) * | 2018-05-08 | 2018-10-09 | 北京邮电大学 | A kind of individual commodity recommendation method and system |
CN109102903A (en) * | 2018-07-09 | 2018-12-28 | 康美药业股份有限公司 | A kind of topic prediction technique and system for health consultation platform |
CN112508512A (en) * | 2020-11-26 | 2021-03-16 | 国网河北省电力有限公司经济技术研究院 | Power grid engineering cost data management method and device and terminal equipment |
CN113538110A (en) * | 2021-08-13 | 2021-10-22 | 苏州工业职业技术学院 | Similar article recommendation method based on browsing sequence |
CN113744021A (en) * | 2021-02-08 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Recommendation method, recommendation device, computer storage medium and recommendation system |
US20220012601A1 (en) * | 2019-03-26 | 2022-01-13 | Huawei Technologies Co., Ltd. | Apparatus and method for hyperparameter optimization of a machine learning model in a federated learning system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010013009A1 (en) * | 1997-05-20 | 2001-08-09 | Daniel R. Greening | System and method for computer-based marketing |
CN103455555A (en) * | 2013-08-06 | 2013-12-18 | 北京大学深圳研究生院 | Recommendation method and device based on mobile terminal similarity |
CN103559252A (en) * | 2013-11-01 | 2014-02-05 | 桂林电子科技大学 | Method for recommending scenery spots probably browsed by tourists |
CN104616221A (en) * | 2014-07-30 | 2015-05-13 | 江苏物泰信息科技有限公司 | Intelligent tour recommendation system |
CN106033589A (en) * | 2015-03-10 | 2016-10-19 | 上海昕鼎网络科技有限公司 | Personalized service method and system for tour route |
Non-Patent Citations (7)
Title |
---|
HAMID PARVIN等: "Nearest Cluster Classifier", 《HYBRID ARTIFICIAL INTELLIGENT SYSTEMS》 * |
冯跃飞等: "《形势与政策》", 31 August 2016 * |
吴喜之: "《统计学:从数据到结论》", 31 March 2013 * |
张影等: "《预测与评价》", 31 May 2015 * |
朱蔷蔷等: "基于Hadoop平台上面向电影数据集Kmeans算法的改进", 《哈尔滨师范大学自然科学学报》 * |
郑丹等: "基于weighted_slope_one用户聚类的林产品推荐算法", 《森林工程》 * |
郑非等: "《体育统计学》", 31 July 2010 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009877A (en) * | 2017-11-24 | 2018-05-08 | 阿里巴巴集团控股有限公司 | Information mining method and device |
CN108009877B (en) * | 2017-11-24 | 2021-10-15 | 创新先进技术有限公司 | Information mining method and device |
CN108629665A (en) * | 2018-05-08 | 2018-10-09 | 北京邮电大学 | A kind of individual commodity recommendation method and system |
CN108629665B (en) * | 2018-05-08 | 2021-07-16 | 北京邮电大学 | Personalized commodity recommendation method and system |
CN109102903A (en) * | 2018-07-09 | 2018-12-28 | 康美药业股份有限公司 | A kind of topic prediction technique and system for health consultation platform |
US20220012601A1 (en) * | 2019-03-26 | 2022-01-13 | Huawei Technologies Co., Ltd. | Apparatus and method for hyperparameter optimization of a machine learning model in a federated learning system |
CN112508512A (en) * | 2020-11-26 | 2021-03-16 | 国网河北省电力有限公司经济技术研究院 | Power grid engineering cost data management method and device and terminal equipment |
CN112508512B (en) * | 2020-11-26 | 2022-09-09 | 国网河北省电力有限公司经济技术研究院 | Power grid engineering cost data management method and device and terminal equipment |
CN113744021A (en) * | 2021-02-08 | 2021-12-03 | 北京沃东天骏信息技术有限公司 | Recommendation method, recommendation device, computer storage medium and recommendation system |
CN113538110A (en) * | 2021-08-13 | 2021-10-22 | 苏州工业职业技术学院 | Similar article recommendation method based on browsing sequence |
CN113538110B (en) * | 2021-08-13 | 2023-08-11 | 苏州工业职业技术学院 | Similar article recommending method based on browsing sequence |
Also Published As
Publication number | Publication date |
---|---|
CN106846082B (en) | 2021-07-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |