CN106547919A - Distributed recommendation method for massive digital information - Google Patents

Distributed recommendation method for massive digital information

Info

Publication number
CN106547919A
Authority
CN
China
Prior art keywords
digital information
similarity
information
similarity matrix
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611110429.1A
Other languages
Chinese (zh)
Other versions
CN106547919B (en)
Inventor
王勇
王瑛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Dongguan South China Design and Innovation Institute
Original Assignee
Guangdong University of Technology
Dongguan South China Design and Innovation Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology and Dongguan South China Design and Innovation Institute
Priority to CN201611110429.1A
Publication of CN106547919A
Application granted
Publication of CN106547919B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a distributed recommendation method for massive digital information, comprising the following steps. S1: build a peer-to-peer distributed local area network containing at least 20 computers, any two of which can communicate with each other. S2: deploy a Hadoop cluster on the peer-to-peer distributed local area network. S3: collect the set of digital information related to the current user and process this set through Hadoop's two-stage Map and Reduce pipeline to count the digital information related to the current user; the digital information related to the current user is the input data source of the Map stage, and the output of the Map stage is the input data source of the Reduce stage. With this method, digital information is recommended to users faster when the volume of digital information is massive, and the recommended digital information is more accurate.

Description

Distributed recommendation method for massive digital information
Technical field
The present invention relates to the technical field of massive information processing, and in particular to a distributed recommendation method for massive digital information.
Background
Science and information technology have developed rapidly in the 21st century. In particular, with the development and spread of Internet technology, network information resources have grown quickly, and we have entered an era of digital information explosion. Digital information here means content published on the Internet, such as articles, pictures, sound and video. As Web 2.0 replaced Web 1.0, it became a platform for sharing digital information. Because Web 2.0 puts more emphasis on user interaction, users are both viewers and producers of website content, so finding exactly the information one needs within massive digital information becomes increasingly difficult. There are three common ways to obtain digital information. The first is conventional web page links, such as popular-post recommendations and news links on portal websites. The second is searching for the desired information with a search engine. The third is introduction by friends, who recommend information to the user by sending links or keywords. Of these three, the search engine is the preferred way to find target information quickly: when users know fairly clearly what they need, they can easily find it by keyword search. However, search engines cannot fully satisfy users' need to discover information, and so recommender systems emerged; by analogy with search engines, they are commonly called recommendation engines. Some recommendation engine algorithms already exist, but existing recommendation engines recommend digital information with limited accuracy, and their response is slow when the volume of user history data is large.
Summary of the invention
To address the technical problems described in the background, the present invention proposes a distributed recommendation method for massive digital information.
The distributed recommendation method for massive digital information proposed by the present invention comprises the following steps:
S1: Build a peer-to-peer distributed local area network that contains at least 20 computers, any two of which can communicate with each other;
S2: Deploy a Hadoop cluster on the peer-to-peer distributed local area network;
S3: Collect the set of digital information related to the current user and process this set through Hadoop's two-stage Map and Reduce pipeline to count the digital information related to the current user, where the digital information related to the current user is the input data source of the Map stage and the output of the Map stage is the input data source of the Reduce stage;
S4: Run parallel Map-stage operations whose input data source is the Reduce-stage output of step S3, thereby building a similarity matrix between items of digital information;
S5: Divide the similarity matrix obtained in step S4, according to degree of correlation, into a similarity matrix of completely identical similarity, a similarity matrix of larger similarity and a similarity matrix of smaller similarity;
S6: From the similarity matrix of completely identical similarity obtained in step S5, directly extract the digital information in that matrix as optimal digital information; or, from the similarity matrix of larger similarity obtained in step S5, extract the digital information that occurs most often in that matrix as optimal digital information; or, from the similarity matrix of smaller similarity obtained in step S5, first discard the digital information that occurs rarely in that matrix and then extract the digital information that occurs more often as optimal digital information;
S7: Combine the optimal digital information obtained in step S6 into a digital information item set, use this item set to look up, in a MongoDB digital information library, the detailed content of the digital information serving as the recommendation result, and finally return the detailed content of the retrieved digital information to the current user.
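As an illustration of step S4 only, the following minimal Python sketch builds an item-to-item similarity map from per-user item sets. The patent does not state which similarity measure is used; cosine similarity over co-occurrence is an assumption made here, and the names `item_similarity` and `user_items` are hypothetical. It runs on a single machine and does not show the parallel Map-stage execution.

```python
from itertools import combinations
from math import sqrt


def item_similarity(user_items):
    """Build a sparse item-item cosine similarity map from user item sets.

    Assumption: cosine similarity over co-occurrence; the source text does
    not name a similarity measure.
    """
    # Invert the input: for each item, the set of users who touched it.
    item_users = {}
    for user, items in user_items.items():
        for item in items:
            item_users.setdefault(item, set()).add(user)

    sim = {}
    for a, b in combinations(sorted(item_users), 2):
        shared = len(item_users[a] & item_users[b])
        if shared:  # pairs with no common user are left out (similarity 0)
            score = shared / sqrt(len(item_users[a]) * len(item_users[b]))
            sim[(a, b)] = sim[(b, a)] = score
    return sim


# Hypothetical input: the items each user has viewed or bought.
user_items = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"a", "c"}}
sim = item_similarity(user_items)
```

Storing only non-zero pairs keeps the structure sparse, which matters when the number of items is large.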
Preferably, the distributed local area network is established over the TCP/IP protocol.
Preferably, Hadoop comprises a Map stage and a Reduce stage: the Map stage partitions the data in Hadoop's MapReduce model, and the Reduce stage merges the data in Hadoop's MapReduce model.
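The two stages can be sketched in plain Python, without a Hadoop cluster, to show how the counting in step S3 might work. This is a single-process simulation under assumed record shapes (`(user_id, item_id)` tuples); the function names are hypothetical and none of this is Hadoop's API.

```python
from collections import defaultdict


def map_phase(records):
    """Map: emit one (item_id, 1) pair per user interaction record."""
    for _user_id, item_id in records:
        yield item_id, 1


def shuffle(pairs):
    """Group mapped values by key, as the framework does between stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: merge the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical interaction records: (user_id, item_id).
records = [("u1", "news_a"), ("u1", "news_b"), ("u2", "news_a")]
counts = reduce_phase(shuffle(map_phase(records)))
```

On a real cluster the map and reduce functions would run on different machines, with the framework handling the grouping step between them.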
Preferably, the digital information related to the current user refers to news the user has read on news websites or information about goods the user has purchased.
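One possible way to represent this user-related digital information is a log of interaction records. The sketch below is an assumption for illustration: the `Interaction` fields and the `"news_view"` / `"purchase"` labels are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    user_id: str
    item_id: str
    kind: str  # hypothetical labels: "news_view" or "purchase"


def related_items(log, user_id):
    """Return the set of item ids the given user has viewed or bought."""
    return {entry.item_id for entry in log if entry.user_id == user_id}


log = [
    Interaction("u1", "news_17", "news_view"),
    Interaction("u1", "sku_42", "purchase"),
    Interaction("u2", "news_17", "news_view"),
]
current_user_items = related_items(log, "u1")
```

The resulting set is the kind of input that step S3 would feed into the Map stage.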
Beneficial effects of the present invention:
1. Building on existing collaborative filtering recommendation algorithms, the distributed recommendation method uses multiple computers to compute in parallel, so it can recommend digital information to users more quickly;
2. The stored data on user behavior is classified according to the similarity values in the similarity matrix, optimal digital information is extracted from the similarity matrix and combined into a digital information item set, and the item set is used to look up, in the MongoDB digital information library, the detailed content of the digital information serving as the recommendation result, so that the digital information recommended to users is more accurate.
With the distributed recommendation method of the present invention, digital information is recommended to users faster when the volume of digital information is massive, and the recommended digital information is more accurate.
Detailed description of the embodiments
The present invention is further explained below with reference to a specific embodiment.
Embodiment
This embodiment proposes a distributed recommendation method for massive digital information, comprising the following steps:
S1: Build a peer-to-peer distributed local area network that contains at least 20 computers, any two of which can communicate with each other;
S2: Deploy a Hadoop cluster on the peer-to-peer distributed local area network;
S3: Collect the set of digital information related to the current user and process this set through Hadoop's two-stage Map and Reduce pipeline to count the digital information related to the current user, where the digital information related to the current user is the input data source of the Map stage and the output of the Map stage is the input data source of the Reduce stage;
S4: Run parallel Map-stage operations whose input data source is the Reduce-stage output of step S3, thereby building a similarity matrix between items of digital information;
S5: Divide the similarity matrix obtained in step S4, according to degree of correlation, into a similarity matrix of completely identical similarity, a similarity matrix of larger similarity and a similarity matrix of smaller similarity;
S6: From the similarity matrix of completely identical similarity obtained in step S5, directly extract the digital information in that matrix as optimal digital information; or, from the similarity matrix of larger similarity obtained in step S5, extract the digital information that occurs most often in that matrix as optimal digital information; or, from the similarity matrix of smaller similarity obtained in step S5, first discard the digital information that occurs rarely in that matrix and then extract the digital information that occurs more often as optimal digital information;
S7: Combine the optimal digital information obtained in step S6 into a digital information item set, use this item set to look up, in a MongoDB digital information library, the detailed content of the digital information serving as the recommendation result, and finally return the detailed content of the retrieved digital information to the current user.
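Steps S5 and S6 can be read as a tiering rule followed by a frequency-based selection. The sketch below is one hedged interpretation: the threshold `high`, the cutoff `min_count` and the tolerance `eps` are hypothetical parameters that the patent does not specify, and "occurs most often" is taken to mean the item appearing in the most pairs of a tier.

```python
from collections import Counter


def split_by_similarity(sim, high=0.5, eps=1e-9):
    """Split item pairs into identical, larger and smaller similarity tiers.

    `high` and `eps` are assumed values, not taken from the source.
    """
    tiers = {"identical": [], "larger": [], "smaller": []}
    for pair, score in sim.items():
        if score >= 1.0 - eps:
            tiers["identical"].append(pair)
        elif score >= high:
            tiers["larger"].append(pair)
        else:
            tiers["smaller"].append(pair)
    return tiers


def pick_items(tiers, min_count=2):
    """Select candidate items from each tier following the S6 rules."""
    # Identical tier: take every item directly.
    picked = {item for pair in tiers["identical"] for item in pair}
    # Larger tier: take the item(s) that occur most often.
    larger = Counter(item for pair in tiers["larger"] for item in pair)
    if larger:
        top = max(larger.values())
        picked |= {item for item, n in larger.items() if n == top}
    # Smaller tier: drop rare items, keep those seen at least min_count times.
    smaller = Counter(item for pair in tiers["smaller"] for item in pair)
    picked |= {item for item, n in smaller.items() if n >= min_count}
    return picked


# Hypothetical pairwise similarities between items a..g.
sim = {("a", "b"): 1.0, ("a", "c"): 0.8, ("c", "d"): 0.7,
       ("d", "e"): 0.2, ("e", "f"): 0.1, ("f", "g"): 0.3}
tiers = split_by_similarity(sim)
optimal = pick_items(tiers)
```

The patent presents the three extraction rules as alternatives; this sketch applies all three and takes the union, which is a design choice made here for brevity.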
In this embodiment, the distributed local area network is established over the TCP/IP protocol. Hadoop comprises a Map stage and a Reduce stage: the Map stage partitions the data in Hadoop's MapReduce model, and the Reduce stage merges the data in Hadoop's MapReduce model. The digital information related to the current user refers to news the user has read on news websites or information about goods the user has purchased. Building on existing collaborative filtering recommendation algorithms, the method uses multiple computers to compute in parallel, so it can recommend digital information to users more quickly. The stored data on user behavior is classified according to the similarity values in the similarity matrix, optimal digital information is extracted from the similarity matrix and combined into a digital information item set, and the item set is used to look up, in the MongoDB digital information library, the detailed content of the digital information serving as the recommendation result, so that the digital information recommended to users is more accurate. As a result, digital information is recommended to users faster when the volume of digital information is massive, and the recommended digital information is more accurate.
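The final lookup in step S7 can be sketched as a query by item id. To keep the example runnable without a database server, it uses a small in-memory stand-in (`InMemoryLibrary`, hypothetical) that understands only the `{"_id": {"$in": [...]}}` filter shape; the document fields are also assumptions. With PyMongo, a `Collection` object could be passed instead, since its `find` method accepts the same filter.

```python
class InMemoryLibrary:
    """Tiny stand-in for a MongoDB collection.

    Supports only filters of the form {"_id": {"$in": [...]}}.
    """

    def __init__(self, docs):
        self._docs = {doc["_id"]: doc for doc in docs}

    def find(self, query):
        wanted = query["_id"]["$in"]
        return [self._docs[i] for i in wanted if i in self._docs]


def fetch_details(collection, item_ids):
    """Look up the full record of each recommended item id."""
    cursor = collection.find({"_id": {"$in": sorted(item_ids)}})
    return {doc["_id"]: doc for doc in cursor}


# Hypothetical library contents and recommended item set.
library = InMemoryLibrary([
    {"_id": "a", "title": "Article A"},
    {"_id": "b", "title": "Article B"},
])
details = fetch_details(library, {"a", "zzz"})
```

Ids with no matching document are simply absent from the result, so the caller can return whatever detail records were found.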
The above is only a preferred specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the present invention and according to its technical solution and inventive concept, shall fall within the scope of protection of the present invention.

Claims (4)

1. A distributed recommendation method for massive digital information, characterized by comprising the following steps:
S1: Build a peer-to-peer distributed local area network that contains at least 20 computers, any two of which can communicate with each other;
S2: Deploy a Hadoop cluster on the peer-to-peer distributed local area network;
S3: Collect the set of digital information related to the current user and process this set through Hadoop's two-stage Map and Reduce pipeline to count the digital information related to the current user, where the digital information related to the current user is the input data source of the Map stage and the output of the Map stage is the input data source of the Reduce stage;
S4: Run parallel Map-stage operations whose input data source is the Reduce-stage output of step S3, thereby building a similarity matrix between items of digital information;
S5: Divide the similarity matrix obtained in step S4, according to degree of correlation, into a similarity matrix of completely identical similarity, a similarity matrix of larger similarity and a similarity matrix of smaller similarity;
S6: From the similarity matrix of completely identical similarity obtained in step S5, directly extract the digital information in that matrix as optimal digital information; or, from the similarity matrix of larger similarity obtained in step S5, extract the digital information that occurs most often in that matrix as optimal digital information; or, from the similarity matrix of smaller similarity obtained in step S5, first discard the digital information that occurs rarely in that matrix and then extract the digital information that occurs more often as optimal digital information;
S7: Combine the optimal digital information obtained in step S6 into a digital information item set, use this item set to look up, in a MongoDB digital information library, the detailed content of the digital information serving as the recommendation result, and finally return the detailed content of the retrieved digital information to the current user.
2. The distributed recommendation method for massive digital information according to claim 1, characterized in that the distributed local area network is established over the TCP/IP protocol.
3. The distributed recommendation method for massive digital information according to claim 1, characterized in that Hadoop comprises a Map stage and a Reduce stage, the Map stage partitioning the data in Hadoop's MapReduce model and the Reduce stage merging the data in Hadoop's MapReduce model.
4. The distributed recommendation method for massive digital information according to claim 1, characterized in that the digital information related to the current user refers to news the user has read on news websites or information about goods the user has purchased.
CN201611110429.1A 2016-12-06 2016-12-06 A kind of distributed recommendation method of massive digital information Active CN106547919B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611110429.1A CN106547919B (en) 2016-12-06 2016-12-06 A kind of distributed recommendation method of massive digital information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611110429.1A CN106547919B (en) 2016-12-06 2016-12-06 A kind of distributed recommendation method of massive digital information

Publications (2)

Publication Number Publication Date
CN106547919A (en) 2017-03-29
CN106547919B (en) 2018-07-24

Family

ID=58397079

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611110429.1A Active CN106547919B (en) 2016-12-06 2016-12-06 A kind of distributed recommendation method of massive digital information

Country Status (1)

Country Link
CN (1) CN106547919B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020078084A1 (en) * 2018-10-19 2020-04-23 深圳点猫科技有限公司 Education resource platform-based consumption data collection method and device
WO2023093783A1 (en) * 2021-11-25 2023-06-01 苏州凉白开网络科技有限公司 Distributed recommendation method for mass digital information

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102298650A (en) * 2011-10-18 2011-12-28 东莞市巨细信息科技有限公司 Distributed recommendation method of massive digital information
US20130254196A1 (en) * 2012-03-26 2013-09-26 Duke University Cost-based optimization of configuration parameters and cluster sizing for hadoop
CN103605718A (en) * 2013-11-15 2014-02-26 南京大学 Hadoop improvement based goods recommendation method
CN103886487A (en) * 2014-03-28 2014-06-25 焦点科技股份有限公司 Individualized recommendation method and system based on distributed B2B platform
CN104572855A (en) * 2014-12-17 2015-04-29 深圳先进技术研究院 News recommendation method and device
CN105930469A (en) * 2016-04-23 2016-09-07 北京工业大学 Hadoop-based individualized tourism recommendation system and method

Also Published As

Publication number Publication date
CN106547919B (en) 2018-07-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant