CN110110180A - A kind of job hunter's recruitment information searching method based on collaborative filtering - Google Patents

A kind of job hunter's recruitment information searching method based on collaborative filtering

Info

Publication number
CN110110180A
Authority
CN
China
Prior art keywords
user
post
information
collaborative filtering
job hunter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910336862.4A
Other languages
Chinese (zh)
Inventor
佟帅辰
周犇
杨佳林
杨泽群
肖娜
王彦芳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hohai University HHU
Priority to CN201910336862.4A
Publication of CN110110180A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources
    • G06Q10/1053 Employment or hiring

Abstract

The invention discloses a job hunter's recruitment information search method based on collaborative filtering, comprising the following steps: crawling job posting information from the major existing recruitment websites with WebMagic web crawler technology and storing it in a database; building a search engine platform with a mobile app client and a PC web client, and defining the query conditions in the platform; after opening the app or the PC client and entering the platform, the user registers, logs in, and selects query conditions; the back end of the search platform returns matching job postings according to the query conditions and displays them in ranked order; the user clicks an entry in the ranked list and is redirected to the corresponding recruitment website to view the posting; the system collects the log information of user clicks and cleans the data; the cleaned data are processed with a job-hunter-based collaborative filtering algorithm to generate a real-time recommendation list. The invention provides users with a convenient and accurate job search service and helps job hunters find suitable positions quickly.

Description

A kind of job hunter's recruitment information searching method based on collaborative filtering
Technical field
The present invention relates to the technical field of internet search engines, and in particular to a job hunter's recruitment information search method based on collaborative filtering.
Background art
Internet recruitment websites keep growing in number and have thoroughly changed both how people look for work and how companies recruit. Websites currently on the market include 51job, Liepin, Lagou, and Zhaopin. The user bases of these websites are very large, but their data are not shared across sites, so job hunters often struggle to find postings that suit them. In addition, although existing recruitment websites generally offer a recommendation function, the results are usually just filtered by the job hunter's search conditions, and postings recommended in this way may not actually fit the job hunter. Solving these two problems, namely that the job information of the major recruitment websites is not connected and that the returned search results do not fit the job hunter, is therefore an urgent step toward easing the employment difficulties of university graduates.
Summary of the invention
The technical problem to be solved by the present invention is to provide a job hunter's recruitment information search method based on collaborative filtering that offers users a convenient and accurate job search service and helps job hunters find suitable positions quickly.
To solve the above technical problem, the present invention provides a job hunter's recruitment information search method based on collaborative filtering, comprising the following steps:
(1) crawling job posting information from the major existing recruitment websites with WebMagic web crawler technology and storing it in a database;
(2) building a search engine platform with a mobile app client and a PC web client, and defining the query conditions in the platform;
(3) after opening the app or the PC client and entering the platform, the user registers, logs in, and selects query conditions;
(4) the back end of the search platform returns matching job postings according to the query conditions and displays them in ranked order;
(5) the user clicks an entry in the ranked list and is redirected to the corresponding recruitment website to view the posting;
(6) the system collects the log information of user clicks and cleans the data;
(7) the cleaned data are processed with a job-hunter-based collaborative filtering algorithm to generate a real-time recommendation list.
Preferably, in step (7), processing the cleaned data with the job-hunter-based collaborative filtering algorithm to generate the real-time recommendation list specifically comprises the following steps:
(71) event-tracking code written in JavaScript is embedded in the system front end, and user behavior information is stored in .log files on the nginx server;
(72) the collected user log information is stored in HDFS using the Flume framework;
(73) the data are cleaned with HiveQL and placed in an HBase database;
(74) an edge-computing big data platform is built with the latest Spark cluster computing engine; the cleaned data are processed with the job-hunter-based collaborative filtering algorithm, and the results are stored in the Hive warehouse and imported into a MySQL database through Sqoop;
(75) a recommendation list is generated from the computed results and returned to the user.
Preferably, in step (74), the collaborative filtering algorithm specifically comprises the following steps:
(741) computing the user-based similarity matrix as shown in Formula 1
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed. The numerator measures how much the two sets of postings overlap: the higher the overlap, the more similar the browsed postings. The denominator normalizes the score, which reduces the similarity between users with an excessive number of actions and other users;
(742) completing the job posting recommendation according to the users' click behavior
Where: P_ui is the recommendation score of posting i for user u; S_uv is the similarity score between user u and user v; k indicates that user v is among the k users most similar to user u, and only postings that user v has browsed and user u has not browsed are considered; r_vi denotes the behavior score of user v for posting i, with different behaviors (for example browsing, submitting a resume, attending an interview, being hired) assigned different scores.
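The formulas referenced in steps (741) and (742) do not appear in this text. Forms consistent with the definitions above, assumed here rather than taken from the filing, are the cosine-style overlap measure and the weighted neighbour sum commonly used in user-based collaborative filtering. The symbol S(u,k), the set of the k users most similar to u, and U(i), the set of users who have acted on posting i, are notation introduced for this sketch.

```latex
S_{uv} = \frac{|N(u) \cap N(v)|}{\sqrt{|N(u)|\,|N(v)|}}
\qquad
P_{ui} = \sum_{v \,\in\, S(u,k) \,\cap\, U(i)} S_{uv}\, r_{vi}
```

As a worked instance under this assumption, if N(u) = {a, b, c} and N(v) = {b, c, d, e}, the overlap is 2 and S_uv = 2 / sqrt(3 × 4) ≈ 0.577.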
Preferably, in step (741), the user-based similarity matrix is rewritten as:
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed; U(i) denotes the set of users who have acted on posting i. If a posting has been browsed by many job hunters, its contribution to the overlap becomes lower.
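The rewritten formula itself is not reproduced in this text. One form that matches the description, in which each shared posting contributes less as |U(i)| grows, is sketched below; the logarithmic damping is an assumption borrowed from common practice, not a detail confirmed by the filing.

```latex
S_{uv} = \frac{\displaystyle\sum_{i \in N(u) \cap N(v)} \frac{1}{\log\bigl(1 + |U(i)|\bigr)}}{\sqrt{|N(u)|\,|N(v)|}}
```

Under this form, a posting shared by only the two users contributes 1/log 3 ≈ 0.91 to the numerator, while a posting browsed by 1000 users contributes 1/log 1001 ≈ 0.14 (natural logarithm).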
Preferably, in step (742), the job posting recommendation based on the users' click behavior is rewritten as:
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed. The contribution score of each overlapping posting differs and is determined by the function f(Δt_i), which is defined as follows:
Where: |t_ui − t_vi| denotes the difference between the times at which the two users acted on the same posting; the shorter the interval, the higher the function value, and the higher the corresponding weight in the overlap contribution.
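Neither the rewritten formula nor the definition of f appears in this text. A form that satisfies the stated property, with f largest when the two actions coincide and decreasing as the gap widens, is sketched below. Both the reciprocal shape of f and the decay parameter α are assumptions made for illustration.

```latex
S_{uv} = \frac{\displaystyle\sum_{i \in N(u) \cap N(v)} f(\Delta t_i)}{\sqrt{|N(u)|\,|N(v)|}},
\qquad
f(\Delta t_i) = \frac{1}{1 + \alpha\,|t_{ui} - t_{vi}|}
```

With this choice, f equals 1 when both users act at the same moment and falls to 1/2 when α|t_ui − t_vi| = 1, so α sets the time scale over which shared interest is treated as stale.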
The beneficial effects of the invention are as follows: acting as a search engine, the invention aggregates the data of the major existing recruitment websites through web crawler technology, which makes job hunting more convenient; combined with big data technology and the job-hunter-based collaborative filtering algorithm, it provides intelligent position recommendation, greatly improving job-hunting efficiency and enabling users to find suitable work more efficiently.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method of the invention.
Fig. 2 is a schematic flowchart of the crawler of the invention.
Fig. 3 is a schematic diagram of the system architecture of the invention.
Detailed description of the embodiments
As shown in Fig. 3, the job hunter's recruitment information search engine based on collaborative filtering of the invention comprises three parts: a data acquisition module 1, a data analysis module 2, and a data display module 3.
As shown in Fig. 1 and Fig. 2, a job hunter's recruitment information search method based on collaborative filtering comprises the following steps:
Step 1: crawl job posting information from the major existing recruitment websites with WebMagic web crawler technology and store it in a database.
Step 2: build a search engine platform with a mobile app client and a PC web client, and define the query conditions in the platform.
Step 3: after opening the app or the PC client and entering the platform, the user registers, logs in, and selects query conditions.
Step 4: the back end of the search platform returns matching job postings according to the query conditions and displays them in ranked order.
Step 5: the user clicks an entry in the ranked list and is redirected to the corresponding recruitment website to view the posting.
Step 6: the system collects the log information of user clicks and cleans the data.
Step 7: the cleaned data are processed with the job-hunter-based collaborative filtering algorithm to generate a real-time recommendation list.
In step 7, the recommendation list is generated with the collaborative filtering algorithm as follows: find the set of users whose interests are similar to the target user's, then find the job postings that users in this set have browsed but that have not yet been recommended to the target user, and generate an offline recommendation list. Because user-based collaborative filtering (UserCF) has to process a large amount of data, we run the computation on the Hadoop framework.
Hadoop is a distributed system infrastructure developed by the Apache Foundation. It has three major components: the MapReduce distributed computing framework, the YARN task scheduling platform, and the HDFS distributed file system. In this project we store the user log information in HDFS. Hive is a data warehouse tool built on Hadoop; it maps structured data files to database tables and provides a simple SQL query capability by converting SQL statements into MapReduce jobs.
Step 7 specifically comprises:
S7.1 Event-tracking code written in JavaScript is embedded in the system front end, and user behavior information is stored in .log files on the nginx server.
S7.2 The collected user log information is stored in HDFS (Hadoop Distributed File System) using the Flume framework.
S7.3 The data are cleaned with HiveQL and placed in an HBase database.
S7.4 An edge-computing big data platform is built with the latest Spark cluster computing engine; the cleaned data are processed with the job-hunter-based collaborative filtering algorithm, and the results are stored in the Hive warehouse and imported into a MySQL database through Sqoop.
S7.5 A recommendation list is generated from the computed results and returned to the user.
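The cleaning rules applied in S7.3 are not spelled out in the text. The sketch below illustrates the kind of filtering such a step might perform, written in plain Python rather than HiveQL so that it runs on its own. The tab-separated record layout (timestamp, user id, posting id, action), the action names, and the function `clean_events` are all hypothetical.

```python
from typing import Iterable, List, NamedTuple

# Hypothetical action vocabulary; the source names these behaviours
# only in prose (browsing, submitting a resume, interview, being hired).
VALID_ACTIONS = {"browse", "apply", "interview", "hired"}


class Event(NamedTuple):
    timestamp: int  # Unix seconds
    user_id: str
    post_id: str
    action: str


def clean_events(lines: Iterable[str]) -> List[Event]:
    """Drop malformed, unknown-action, and exact-duplicate log records."""
    seen = set()
    events = []
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 4:
            continue  # wrong number of fields
        ts, user_id, post_id, action = parts
        if not ts.isdigit() or not user_id or not post_id:
            continue  # unparseable timestamp or missing identifier
        if action not in VALID_ACTIONS:
            continue  # behaviour the recommender does not score
        event = Event(int(ts), user_id, post_id, action)
        if event in seen:
            continue  # repeated record, e.g. a double-fired tracker
        seen.add(event)
        events.append(event)
    return events
```

The output keeps the original record order, so later stages can still rely on timestamps when weighting behaviours.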
The collaborative filtering algorithm in step S7.4 comprises:
S7.4.1 Computing the user-based similarity matrix as shown in Formula 1
Where: N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed. The numerator measures how much the two sets of postings overlap: the higher the overlap, the more similar the browsed postings. The denominator normalizes the score, which reduces the similarity between users with an excessive number of actions and other users.
S7.4.2 Completing the job posting recommendation according to the users' click behavior
Where: r_vi denotes the behavior score of user v for posting i, with different behaviors (for example browsing, submitting a resume, attending an interview, being hired) assigned different scores. S_uv is the similarity score between user u and user v; user v is among the k users most similar to user u, and only postings that user v has browsed and user u has not browsed are considered. In this way we obtain the recommendation score of posting i for user u.
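Steps S7.4.1 and S7.4.2 can be sketched in a few lines of Python. This is a minimal single-machine illustration, not the Spark job described above; it assumes the cosine-style similarity |N(u) ∩ N(v)| / sqrt(|N(u)| |N(v)|), and the behaviour scores in the sample data are hypothetical values chosen for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def user_similarity(browsed):
    """Return S[u][v] = |N(u) & N(v)| / sqrt(|N(u)| * |N(v)|).

    browsed maps each user to the set N(u) of postings that user browsed.
    """
    # Invert to posting -> users so only co-browsing pairs are visited.
    audience = defaultdict(set)
    for user, posts in browsed.items():
        for post in posts:
            audience[post].add(user)

    overlap = defaultdict(int)
    for users in audience.values():
        for u, v in combinations(sorted(users), 2):
            overlap[(u, v)] += 1

    sim = defaultdict(dict)
    for (u, v), shared in overlap.items():
        score = shared / sqrt(len(browsed[u]) * len(browsed[v]))
        sim[u][v] = sim[v][u] = score
    return dict(sim)


def recommend(target, sim, scores, k=2):
    """Rank postings unseen by target using P_ui = sum of S_uv * r_vi."""
    seen = set(scores.get(target, {}))
    # Keep only the k most similar users.
    neighbours = sorted(sim.get(target, {}).items(),
                        key=lambda item: item[1], reverse=True)[:k]
    totals = defaultdict(float)
    for v, s_uv in neighbours:
        for post, r_vi in scores.get(v, {}).items():
            if post not in seen:
                totals[post] += s_uv * r_vi
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical behaviour scores r_vi: 1 = browsed, 2 = resume submitted.
scores = {
    "u": {"a": 1, "b": 1, "c": 1},
    "v": {"b": 1, "c": 2, "d": 2, "e": 1},
    "w": {"a": 1, "f": 1},
}
browsed = {user: set(posts) for user, posts in scores.items()}
sim = user_similarity(browsed)
ranking = recommend("u", sim, scores, k=2)
```

In this toy data, posting d ranks first for user u because the most similar user, v, submitted a resume for it. Building the posting-to-users index first avoids comparing pairs of users who share no postings, which matters when the behaviour matrix is sparse.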
To make the recommendations more accurate, we can also improve the formula in two ways at the same time:
1. Reduce the contribution of unusually popular positions to user similarity
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed; U(i) denotes the set of users who have acted on posting i. If a posting has been browsed by many job hunters, its contribution to the overlap becomes lower.
2. Reduce the weight of a shared posting appropriately when different job hunters acted on it at different times
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed. The contribution score of each overlapping posting differs and is determined by the function f(Δt_i). The function is defined as follows:
Where: |t_ui − t_vi| denotes the difference between the times at which the two users acted on the same posting; the shorter the interval, the higher the function value, and the higher the corresponding weight in the overlap contribution.
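The two improvements can be illustrated together in a short Python sketch. The exact formulas are not reproduced in this text, so the weighting functions below are assumptions: a logarithmic popularity penalty 1 / log(1 + |U(i)|) and a reciprocal time decay 1 / (1 + α|t_ui − t_vi|). Multiplying the two weights, the parameter `alpha`, and all function names are likewise choices made for this example.

```python
from math import log, sqrt


def popularity_weight(audience_size):
    """Contribution of one shared posting; shrinks as more users act on it."""
    return 1.0 / log(1.0 + audience_size)


def time_weight(t_ui, t_vi, alpha=1.0 / 86400):
    """f(dt): 1 for simultaneous actions, decaying as |t_ui - t_vi| grows.

    The default alpha makes a one-day gap roughly halve the weight.
    """
    return 1.0 / (1.0 + alpha * abs(t_ui - t_vi))


def improved_similarity(times_u, times_v, audience_size, alpha=1.0 / 86400):
    """Similarity with both corrections applied to every shared posting.

    times_u / times_v map posting -> Unix time of that user's action;
    audience_size maps posting -> |U(i)|, the number of users who acted on it.
    """
    if not times_u or not times_v:
        return 0.0
    shared = times_u.keys() & times_v.keys()
    total = sum(
        popularity_weight(audience_size[i])
        * time_weight(times_u[i], times_v[i], alpha)
        for i in shared
    )
    return total / sqrt(len(times_u) * len(times_v))
```

For example, `improved_similarity({"a": 0}, {"a": 60}, {"a": 2})` is larger than the same call with the second timestamp ten days later, reflecting the rule that closer actions on the same posting count for more.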
In conclusion the invention discloses a kind of job hunter's recruitment information searching method based on collaborative filtering, incorporates The data of existing major recruitment website, solve the problems, such as data not intercommunication, user are allow to inquire the whole network by this system Recruitment information;Simultaneously in order to realize intelligent recommendation, collaborative filtering is applied in recruitment industry, the improvement to formula is passed through Upgrading, the post information being more suitable is provided for user.

Claims (5)

1. A job hunter's recruitment information search method based on collaborative filtering, characterized by comprising the following steps:
(1) crawling job posting information from the major existing recruitment websites with WebMagic web crawler technology and storing it in a database;
(2) building a search engine platform with a mobile app client and a PC web client, and defining the query conditions in the platform;
(3) after opening the app or the PC client and entering the platform, the user registers, logs in, and selects query conditions;
(4) the back end of the search platform returns matching job postings according to the query conditions and displays them in ranked order;
(5) the user clicks an entry in the ranked list and is redirected to the corresponding recruitment website to view the posting;
(6) the system collects the log information of user clicks and cleans the data;
(7) the cleaned data are processed with a job-hunter-based collaborative filtering algorithm to generate a real-time recommendation list.
2. The job hunter's recruitment information search method based on collaborative filtering according to claim 1, characterized in that, in step (7), processing the cleaned data with the job-hunter-based collaborative filtering algorithm to generate the real-time recommendation list specifically comprises the following steps:
(71) event-tracking code written in JavaScript is embedded in the system front end, and user behavior information is stored in .log files on the nginx server;
(72) the collected user log information is stored in HDFS using the Flume framework;
(73) the data are cleaned with HiveQL and placed in an HBase database;
(74) an edge-computing big data platform is built with the latest Spark cluster computing engine; the cleaned data are processed with the job-hunter-based collaborative filtering algorithm, and the results are stored in the Hive warehouse and imported into a MySQL database through Sqoop;
(75) a recommendation list is generated from the computed results and returned to the user.
3. The job hunter's recruitment information search method based on collaborative filtering according to claim 2, characterized in that, in step (74), the collaborative filtering algorithm specifically comprises the following steps:
(741) computing the user-based similarity matrix as shown in Formula 1
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed; the numerator measures how much the two sets of postings overlap, and the higher the overlap, the more similar the browsed postings; the denominator normalizes the score, which reduces the similarity between users with an excessive number of actions and other users;
(742) completing the job posting recommendation according to the users' click behavior
Where: P_ui is the recommendation score of posting i for user u; S_uv is the similarity score between user u and user v; k indicates that user v is among the k users most similar to user u, and only postings that user v has browsed and user u has not browsed are considered; r_vi denotes the behavior score of user v for posting i, with different behaviors assigned different scores.
4. The job hunter's recruitment information search method based on collaborative filtering according to claim 3, characterized in that, in step (741), the user-based similarity matrix is rewritten as:
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed; U(i) denotes the set of users who have acted on posting i; if a posting has been browsed by many job hunters, its contribution to the overlap becomes lower.
5. The job hunter's recruitment information search method based on collaborative filtering according to claim 3, characterized in that, in step (742), the job posting recommendation based on the users' click behavior is rewritten as:
Where: S_uv is the similarity score between user u and user v; N(u) denotes the set of job postings that user u has browsed; N(v) denotes the set of job postings that user v has browsed; the contribution score of each overlapping posting differs and is determined by the function f(Δt_i), which is defined as follows:
Where: |t_ui − t_vi| denotes the difference between the times at which the two users acted on the same posting; the shorter the interval, the higher the function value, and the higher the corresponding weight in the overlap contribution.
CN201910336862.4A 2019-04-25 2019-04-25 A kind of job hunter's recruitment information searching method based on collaborative filtering Pending CN110110180A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910336862.4A CN110110180A (en) 2019-04-25 2019-04-25 A kind of job hunter's recruitment information searching method based on collaborative filtering

Publications (1)

Publication Number Publication Date
CN110110180A true CN110110180A (en) 2019-08-09

Family

ID=67486520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910336862.4A Pending CN110110180A (en) 2019-04-25 2019-04-25 A kind of job hunter's recruitment information searching method based on collaborative filtering

Country Status (1)

Country Link
CN (1) CN110110180A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111737557A (en) * 2020-08-21 2020-10-02 信度信息技术(苏州)有限公司 Method and system for recruitment through Internet
CN111861361A (en) * 2020-04-09 2020-10-30 河北利至人力资源服务有限公司 Intelligent resume pushing system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933239A (en) * 2015-06-09 2015-09-23 江苏大学 Hybrid model based personalized position information recommendation system and realization method therefor
CN105005880A (en) * 2015-08-19 2015-10-28 郭文峰 Internet based job application and recruitment system
CN105893641A (en) * 2016-07-01 2016-08-24 中国传媒大学 Job recommending method
CN107562818A (en) * 2017-08-16 2018-01-09 中国工商银行股份有限公司 Information recommendation system and method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
文哥的学习日记: "推荐系统理论(二)-利用用户行为数据进行推荐(协同过滤)", 《简书》 *
欧阳裕洁等: "一种基于评分时间差的协同过滤算法", 《信息与电脑》 *
蔡立志等: "《大数据测评》", 30 January 2015 *
路小瑞: "基于Hadoop平台的职位推荐系统的设计与实现", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111861361A (en) * 2020-04-09 2020-10-30 河北利至人力资源服务有限公司 Intelligent resume pushing system and method
CN111861361B (en) * 2020-04-09 2021-07-27 河北利至人力资源服务有限公司 Intelligent resume pushing system and method
CN111737557A (en) * 2020-08-21 2020-10-02 信度信息技术(苏州)有限公司 Method and system for recruitment through Internet

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190809