CN106547919A - A kind of distributed recommendation method of massive digital information - Google Patents
A kind of distributed recommendation method of massive digital information
- Publication number
- CN106547919A
- Authority
- CN
- China
- Prior art keywords
- digital information
- similarity
- information
- similarity matrix
- distributed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a distributed recommendation method for massive digital information, comprising the following steps. S1: build a peer-to-peer distributed local area network comprising at least 20 computers, any two of which can communicate with each other. S2: deploy a Hadoop cluster in the peer-to-peer distributed local area network. S3: collect the set of digital information related to the current user; a two-stage Map and Reduce pipeline of Hadoop computes statistics on this set, where the digital information related to the current user is the input data source of the Map stage and the output of the Map stage is the input data source of the Reduce stage. With massive digital information, the distributed recommendation method of the invention recommends digital information to users faster, and the digital information it recommends is more accurate.
Description
Technical field
The present invention relates to the technical field of massive information processing, and more particularly to a distributed recommendation method for massive digital information.
Background
Science and information technology have developed rapidly in the 21st century. With the development and spread of Internet technology in particular, network information resources have grown quickly, and we have entered an era of digital information explosion. Digital information here means content published on the Internet, such as articles, pictures, sound, and images. As Web 2.0 has replaced Web 1.0, it has become a platform for sharing digital information. Because Web 2.0 emphasizes user interaction, users are both viewers and creators of website content, so finding exactly the information one needs among massive digital information is becoming increasingly difficult.
There are three common ways to obtain digital information. The first is conventional web links, such as popular-post recommendations and news links on portal sites. The second is searching for the desired information with a search engine. The third is an introduction from friends, who recommend information by sending the user a link or keywords. Of the three, the search engine is the preferred way to find target information quickly: when users know fairly clearly what they need, they can easily find it by keyword search. Search engines, however, cannot fully satisfy users' need to discover information, so recommender systems emerged; by analogy with search engines, they are commonly called recommendation engines. Some recommendation engine algorithms already exist, but existing recommendation engines recommend digital information with low accuracy and respond slowly when the volume of user history data is large.
Summary of the invention
To address the technical problems described in the background, the present invention proposes a distributed recommendation method for massive digital information.
The distributed recommendation method for massive digital information proposed by the present invention comprises the following steps:
S1: Build a peer-to-peer distributed local area network comprising at least 20 computers, any two of which can communicate with each other.
S2: Deploy a Hadoop cluster in the peer-to-peer distributed local area network.
S3: Collect the set of digital information related to the current user. A two-stage Map and Reduce pipeline of Hadoop computes statistics on this set: the digital information related to the current user is the input data source of the Map stage, and the output of the Map stage is the input data source of the Reduce stage.
S4: Run parallel Map-stage operations whose input data source is the Reduce-stage output of step S3, thereby building the similarity matrix between items of digital information.
S5: According to the degree of correlation, divide the similarity matrix obtained in step S4 into a completely consistent similarity matrix, a higher-similarity matrix, and a lower-similarity matrix.
S6: From the completely consistent similarity matrix obtained in step S5, directly extract the digital information in that matrix as the optimal digital information; or, from the higher-similarity matrix obtained in step S5, extract the digital information that occurs most often in that matrix as the optimal digital information; or, from the lower-similarity matrix obtained in step S5, first reject the digital information that occurs rarely in that matrix, then extract the digital information that occurs more often as the optimal digital information.
S7: Combine the optimal digital information obtained in step S6 into a set of digital information items, look up this set in the MongoDB digital information library to obtain the detailed content of the digital information serving as the recommendation results, and finally return the detailed content of the acquired digital information to the current user.
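Steps S3 and S4 can be illustrated with a minimal single-process sketch. The patent does not specify the record format or the similarity measure, so the following are assumptions made only for illustration: events are `(user, item)` pairs, similarity is cosine similarity over item co-occurrence counts, and the helper `run_job` is a hypothetical in-memory stand-in for a Hadoop job rather than a Hadoop API.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def run_job(records, mapper, reducer):
    """Hypothetical in-memory stand-in for one Map -> shuffle -> Reduce job."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [out for key in sorted(groups) for out in reducer(key, groups[key])]

# Job 1 (step S3): gather the items related to each user.
def map_events(event):
    user, item = event
    yield user, item

def reduce_user_items(user, items):
    yield user, sorted(set(items))

# Job 2 (step S4): its input is the Reduce output of job 1.
def map_pairs(user_items):
    _user, items = user_items
    for item in items:
        yield (item, item), 1          # per-item occurrence count
    for a, b in combinations(items, 2):
        yield (a, b), 1                # co-occurrence count for the pair

def reduce_sum(key, ones):
    yield key, sum(ones)

def similarity_matrix(events):
    """Assumed measure: cosine similarity over co-occurrence counts."""
    per_user = run_job(events, map_events, reduce_user_items)
    counts = dict(run_job(per_user, map_pairs, reduce_sum))
    sims = {}
    for (a, b), co in counts.items():
        if a != b:
            score = co / sqrt(counts[(a, a)] * counts[(b, b)])
            sims[(a, b)] = sims[(b, a)] = score
    return sims

events = [("u1", "n1"), ("u1", "n2"), ("u2", "n1"),
          ("u2", "n2"), ("u2", "n3"), ("u3", "n3")]
sims = similarity_matrix(events)
```

In this toy data, `n1` and `n2` are always seen together, so their score is the maximum; on a real cluster the two jobs would be distributed across the computers of step S1.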
Preferably, the distributed local area network is built on the TCP/IP protocol.
Preferably, Hadoop comprises two stages, Map and Reduce: the Map stage refers to partitioning the data in Hadoop's MapReduce model, and the Reduce stage refers to merging the data in Hadoop's MapReduce model.
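The "partition, then merge" reading of the two stages can be sketched as follows. This is a simplified illustration, not Hadoop's actual key-based shuffle: the function names `split`, `map_phase`, and `reduce_phase` and the `(user, item)` record format are assumptions introduced here.

```python
from collections import Counter

def split(records, n_parts):
    """Partition the input into n_parts chunks, one per worker."""
    return [records[i::n_parts] for i in range(n_parts)]

def map_phase(chunk):
    """Each worker counts item views within its own chunk only."""
    return Counter(item for _user, item in chunk)

def reduce_phase(partials):
    """Merge the partial counts produced by all workers."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

views = [("u1", "n1"), ("u2", "n1"), ("u1", "n2"), ("u3", "n1")]
total = reduce_phase(map_phase(chunk) for chunk in split(views, 2))
```

Because each chunk is processed independently, the Map work can run on different computers at the same time; only the small partial results need to be brought together in the Reduce stage.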
Preferably, the digital information related to the current user refers to news the user has read on news websites or information about goods the user has purchased.
Beneficial effects of the present invention:
1. Building on existing collaborative filtering recommendation algorithms, the distributed recommendation method uses multiple computers operating in parallel, so it can recommend digital information to users more quickly.
2. The data store classifies the digital information of user behavior according to the similarity in the similarity matrix, and the optimal digital information is extracted from the similarity matrix. The extracted optimal digital information is combined into a set of digital information items, and this set is looked up in the MongoDB digital information library to obtain the detailed content of the digital information serving as the recommendation results, so the digital information recommended to users is more accurate.
With massive digital information, the distributed recommendation method of the present invention therefore recommends digital information to users faster and more accurately.
Detailed description
The present invention is further explained below with reference to a specific embodiment.
Embodiment
This embodiment proposes a distributed recommendation method for massive digital information, comprising the following steps:
S1: Build a peer-to-peer distributed local area network comprising at least 20 computers, any two of which can communicate with each other.
S2: Deploy a Hadoop cluster in the peer-to-peer distributed local area network.
S3: Collect the set of digital information related to the current user. A two-stage Map and Reduce pipeline of Hadoop computes statistics on this set: the digital information related to the current user is the input data source of the Map stage, and the output of the Map stage is the input data source of the Reduce stage.
S4: Run parallel Map-stage operations whose input data source is the Reduce-stage output of step S3, thereby building the similarity matrix between items of digital information.
S5: According to the degree of correlation, divide the similarity matrix obtained in step S4 into a completely consistent similarity matrix, a higher-similarity matrix, and a lower-similarity matrix.
S6: From the completely consistent similarity matrix obtained in step S5, directly extract the digital information in that matrix as the optimal digital information; or, from the higher-similarity matrix obtained in step S5, extract the digital information that occurs most often in that matrix as the optimal digital information; or, from the lower-similarity matrix obtained in step S5, first reject the digital information that occurs rarely in that matrix, then extract the digital information that occurs more often as the optimal digital information.
S7: Combine the optimal digital information obtained in step S6 into a set of digital information items, look up this set in the MongoDB digital information library to obtain the detailed content of the digital information serving as the recommendation results, and finally return the detailed content of the acquired digital information to the current user.
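Steps S5 and S6 leave the tier boundaries and the meaning of "occurs rarely" unspecified. The sketch below is one possible reading under explicit assumptions: a score of 1.0 counts as completely consistent, a hypothetical threshold `high` separates the higher and lower tiers, and a hypothetical `min_count` decides which low-tier items are rejected. The input format `(seed_item, candidate_item, similarity)` is also an assumption.

```python
from collections import Counter

def split_by_similarity(pairs, high=0.5):
    """Step S5 (assumed thresholds): sort candidates into three tiers."""
    tiers = {"identical": [], "high": [], "low": []}
    for _seed, candidate, sim in pairs:
        if sim >= 1.0 - 1e-9:
            tiers["identical"].append(candidate)
        elif sim >= high:
            tiers["high"].append(candidate)
        else:
            tiers["low"].append(candidate)
    return tiers

def extract_optimal(tier, candidates, min_count=2):
    """Step S6: pick the optimal items of one tier."""
    counts = Counter(candidates)
    if not counts:
        return []
    if tier == "identical":
        # Completely consistent: take every item directly.
        return sorted(counts)
    if tier == "high":
        # Higher similarity: keep the item(s) that occur most often.
        top = max(counts.values())
        return sorted(item for item, n in counts.items() if n == top)
    # Lower similarity: reject rare items, keep the more frequent ones.
    return sorted(item for item, n in counts.items() if n >= min_count)

pairs = [("a", "x", 1.0), ("a", "y", 0.8), ("b", "y", 0.7), ("b", "z", 0.6),
         ("a", "p", 0.1), ("b", "p", 0.2), ("a", "q", 0.05)]
tiers = split_by_similarity(pairs)
optimal = {name: extract_optimal(name, items) for name, items in tiers.items()}
```

How the three per-tier results are combined into the final item set of step S7 is not stated in the text, so the sketch returns them separately.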
In this embodiment, the distributed local area network is built on the TCP/IP protocol. Hadoop comprises two stages, Map and Reduce: the Map stage refers to partitioning the data in Hadoop's MapReduce model, and the Reduce stage refers to merging the data in Hadoop's MapReduce model. The digital information related to the current user refers to news the user has read on news websites or information about goods the user has purchased.
Building on existing collaborative filtering recommendation algorithms, the distributed recommendation method uses multiple computers operating in parallel, so it can recommend digital information to users more quickly. The data store classifies the digital information of user behavior according to the similarity in the similarity matrix, and the optimal digital information is extracted from the similarity matrix. The extracted optimal digital information is combined into a set of digital information items, and this set is looked up in the MongoDB digital information library to obtain the detailed content of the digital information serving as the recommendation results, so the digital information recommended to users is more accurate. With massive digital information, the distributed recommendation method of the present invention therefore recommends digital information to users faster and more accurately.
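The final lookup of step S7 can be sketched as below. To keep the example runnable without a database server, a plain dictionary stands in for the MongoDB digital information library; the function name `fetch_details` and the document fields are hypothetical. Against a real MongoDB collection, a comparable lookup would typically be a query filtered with the `$in` operator on the item identifiers.

```python
def fetch_details(item_ids, library):
    """Return the detail record of each recommended item, in order.

    `library` is an in-memory stand-in (assumption) for the MongoDB
    digital information library, keyed by item identifier.
    """
    seen = set()
    results = []
    for item_id in item_ids:
        # Skip duplicates and identifiers missing from the library.
        if item_id in seen or item_id not in library:
            continue
        seen.add(item_id)
        results.append(library[item_id])
    return results

library = {
    "n1": {"title": "Headline A", "kind": "news"},
    "n2": {"title": "Product B", "kind": "goods"},
}
recommendations = fetch_details(["n2", "n9", "n2", "n1"], library)
```

Keeping the item set small and resolving details only at the end means the similarity computation never has to carry full article or product records through the cluster.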
The above is only a preferred embodiment of the present invention, and the scope of protection of the present invention is not limited to it. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.
Claims (4)
1. A distributed recommendation method for massive digital information, characterized by comprising the following steps:
S1: Build a peer-to-peer distributed local area network comprising at least 20 computers, any two of which can communicate with each other;
S2: Deploy a Hadoop cluster in the peer-to-peer distributed local area network;
S3: Collect the set of digital information related to the current user; a two-stage Map and Reduce pipeline of Hadoop computes statistics on this set, the digital information related to the current user being the input data source of the Map stage and the output of the Map stage being the input data source of the Reduce stage;
S4: Run parallel Map-stage operations whose input data source is the Reduce-stage output of step S3, thereby building the similarity matrix between items of digital information;
S5: According to the degree of correlation, divide the similarity matrix obtained in step S4 into a completely consistent similarity matrix, a higher-similarity matrix, and a lower-similarity matrix;
S6: From the completely consistent similarity matrix obtained in step S5, directly extract the digital information in that matrix as the optimal digital information; or, from the higher-similarity matrix obtained in step S5, extract the digital information that occurs most often in that matrix as the optimal digital information; or, from the lower-similarity matrix obtained in step S5, first reject the digital information that occurs rarely in that matrix, then extract the digital information that occurs more often as the optimal digital information;
S7: Combine the optimal digital information obtained in step S6 into a set of digital information items, look up this set in the MongoDB digital information library to obtain the detailed content of the digital information serving as the recommendation results, and finally return the detailed content of the acquired digital information to the current user.
2. The distributed recommendation method for massive digital information according to claim 1, characterized in that the distributed local area network is built on the TCP/IP protocol.
3. The distributed recommendation method for massive digital information according to claim 1, characterized in that Hadoop comprises two stages, Map and Reduce, the Map stage referring to partitioning the data in Hadoop's MapReduce model and the Reduce stage referring to merging the data in Hadoop's MapReduce model.
4. The distributed recommendation method for massive digital information according to claim 1, characterized in that the digital information related to the current user refers to news the user has read on news websites or information about goods the user has purchased.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611110429.1A CN106547919B (en) | 2016-12-06 | 2016-12-06 | A kind of distributed recommendation method of massive digital information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611110429.1A CN106547919B (en) | 2016-12-06 | 2016-12-06 | A kind of distributed recommendation method of massive digital information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106547919A (en) | 2017-03-29 |
CN106547919B (en) | 2018-07-24 |
Family
ID=58397079
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611110429.1A Active CN106547919B (en) | 2016-12-06 | 2016-12-06 | A kind of distributed recommendation method of massive digital information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106547919B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020078084A1 (en) * | 2018-10-19 | 2020-04-23 | 深圳点猫科技有限公司 | Education resource platform-based consumption data collection method and device |
WO2023093783A1 (en) * | 2021-11-25 | 2023-06-01 | 苏州凉白开网络科技有限公司 | Distributed recommendation method for mass digital information |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102298650A (en) * | 2011-10-18 | 2011-12-28 | 东莞市巨细信息科技有限公司 | Distributed recommendation method of massive digital information |
US20130254196A1 (en) * | 2012-03-26 | 2013-09-26 | Duke University | Cost-based optimization of configuration parameters and cluster sizing for hadoop |
CN103605718A (en) * | 2013-11-15 | 2014-02-26 | 南京大学 | Hadoop improvement based goods recommendation method |
CN103886487A (en) * | 2014-03-28 | 2014-06-25 | 焦点科技股份有限公司 | Individualized recommendation method and system based on distributed B2B platform |
CN104572855A (en) * | 2014-12-17 | 2015-04-29 | 深圳先进技术研究院 | News recommendation method and device |
CN105930469A (en) * | 2016-04-23 | 2016-09-07 | 北京工业大学 | Hadoop-based individualized tourism recommendation system and method |
-
2016
- 2016-12-06 CN CN201611110429.1A patent/CN106547919B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN106547919B (en) | 2018-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Garimella et al. | Quantifying controversy on social media | |
Zhao | A study on e-commerce recommender system based on big data | |
CN102298650B (en) | Distributed recommendation method of massive digital information | |
CN105335519B (en) | Model generation method and device and recommendation method and device | |
CN108665333B (en) | Commodity recommendation method and device, electronic equipment and storage medium | |
CN103886047B (en) | Towards the online recommendation method of distribution of stream data | |
TWI508011B (en) | Category information providing method and device | |
US9208223B1 (en) | Method and apparatus for indexing and querying knowledge models | |
CN103870507B (en) | Method and device of searching based on category | |
CN106777051A (en) | A kind of many feedback collaborative filtering recommending methods based on user's group | |
CN104599160A (en) | Commodity recommendation method and commodity recommendation device | |
CN106407349A (en) | Product recommendation method and device | |
CN109034981A (en) | A kind of electric business collaborative filtering recommending method | |
Cho et al. | Latent space model for multi-modal social data | |
CN108876508A (en) | A kind of electric business collaborative filtering recommending method | |
CN107169821B (en) | Big data query recommendation method and system | |
Gonzalez et al. | Net2vec: Deep learning for the network | |
Viriyavisuthisakul et al. | A comparison of similarity measures for online social media Thai text classification | |
Zhao et al. | Socialtransfer: Transferring social knowledge for cold-start cowdsourcing | |
JP6434954B2 (en) | Information processing apparatus, information processing method, and program | |
Jiang et al. | A unified neural network approach to e-commerce relevance learning | |
CN106547919A (en) | A kind of distributed recommendation method of massive digital information | |
Al-Dhelaan et al. | Graph summarization for hashtag recommendation | |
CN108932248A (en) | A kind of search realization method and system | |
Kumar et al. | Cuisine prediction based on ingredients using tree boosting algorithms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||