CN104462431A

CN104462431A - Method for crawling web page recruitment information

Info

Publication number: CN104462431A
Application number: CN201410774571.0A
Authority: CN
Inventors: 邱继钊; 于治楼; 范莹
Original assignee: Inspur Software Group Co Ltd
Current assignee: Inspur Software Group Co Ltd
Priority date: 2014-12-16
Filing date: 2014-12-16
Publication date: 2015-03-25

Abstract

The invention relates to a method for crawling recruitment information from web pages, which addresses the difficulty of collecting such information and storing it in a database. A certain number of recruitment websites now exist on the internet. Enterprises recruit mainly by posting recruitment information on these websites, and applicants likewise find work mainly through the recruitment information that enterprises post there. This information reflects, to a certain extent, the demands of and changes in the current social and economic structure; if it is processed and analyzed scientifically, more targeted policy adjustment and talent cultivation become possible.

Description

A method for crawling web page recruitment information

Technical field

The present invention relates to a computer application, specifically a method for crawling web page recruitment information.

Background art

With the spread of the internet, the carrier of recruitment information has gradually shifted from printed newspapers and periodicals to the various recruitment websites on the internet. Recruitment websites have now become the main channel through which enterprises publish, and applicants obtain, recruitment information. To recruit high-end talent, enterprises post recruitment information on different recruitment websites; to find satisfactory work, applicants likewise search different websites for it. As the number of recruitment websites keeps growing, the volume of recruitment information keeps growing too, and its content varies with the position and the enterprise. This creates a number of difficulties for collection, as follows:

1. Pages are irregular, so the extraction rules vary;

2. As the data volume keeps growing, page addresses keep changing;

3. Site information is updated quickly.

Summary of the invention

The object of the present invention is to provide a method for crawling web page recruitment information.

The invention aims to collect the various kinds of recruitment information on recruitment websites, chiefly because these websites have become the main channel through which enterprises publish, and applicants obtain, recruitment information. The various kinds of recruitment information on a recruitment website are collected according to the rules for collecting internet data. The object of the invention is achieved in the following manner; the specific steps are as follows:

1) install the collection software and a packet capture tool;

2) analyze the recruitment website addresses and find the address of each category of recruitment information;

3) obtain the paging information with the packet capture tool, and configure the relevant tools to carry out data collection;

4) find on the internet the mainstream recruitment websites to be collected;

5) use the packet capture tool to obtain the page addresses of each category of recruitment information;

6) analyze the pages and find the page rules of the recruitment information to be captured;

7) collect the information by configuring the rules obtained from the analysis;

8) store the collected data in a database.
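As an illustration only, steps 6) to 8) can be sketched in Python. The patent does not name any collection software, page markup, or database, so everything here is an assumption: the `RULES` regular expressions, the HTML in `SAMPLE_PAGE`, the `postings` table, and the choice of SQLite are all hypothetical stand-ins for whatever rules and storage a real deployment would configure.

```python
import re
import sqlite3

# Hypothetical page rules (step 6): one regular expression per field.
# The markup they match is invented; each real site needs its own rules.
RULES = {
    "title": re.compile(r'<h2 class="job-title">(.*?)</h2>', re.S),
    "company": re.compile(r'<span class="company">(.*?)</span>', re.S),
    "salary": re.compile(r'<span class="salary">(.*?)</span>', re.S),
}

# Invented sample listing page, used in place of a live download.
SAMPLE_PAGE = """
<div class="job"><h2 class="job-title">Java Developer</h2>
  <span class="company">Example Co</span><span class="salary">10k-15k</span></div>
<div class="job"><h2 class="job-title">Data Analyst</h2>
  <span class="company">Sample Ltd</span><span class="salary">8k-12k</span></div>
"""


def extract_postings(html, rules=RULES):
    """Step 7: apply every field rule and combine the matches into records.

    Assumes each posting contains every field, so the match lists line up.
    """
    columns = [[m.strip() for m in rule.findall(html)] for rule in rules.values()]
    return [dict(zip(rules.keys(), row)) for row in zip(*columns)]


def store_postings(conn, postings):
    """Step 8: store the records, skipping postings already in the table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS postings ("
        "title TEXT, company TEXT, salary TEXT, UNIQUE(title, company))"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO postings (title, company, salary) "
        "VALUES (:title, :company, :salary)",
        postings,
    )
    conn.commit()
```

Keeping the rules in a plain dictionary means that when a site changes its layout only the configuration has to change, not the extraction or storage code. The uniqueness constraint is one simple way to tolerate repeated crawls of a fast-changing site; it is a design choice of this sketch, not something the patent specifies.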

The beneficial effect of the present invention is that it solves the problem that web page recruitment information is difficult to collect and difficult to store in a database. A certain number of recruitment websites now exist on the internet. Enterprises recruit mainly by posting recruitment information on these websites, and applicants likewise find work mainly through the recruitment information that enterprises post there. This information reflects, to a certain extent, the demands of and changes in the social and economic structure; if it is processed and analyzed scientifically, more targeted policy adjustment and talent cultivation become possible.

Brief description of the drawings

Fig. 1 is a flowchart of crawling web page recruitment information.

Detailed description of embodiments

The method of the present invention is described in detail below with reference to the accompanying drawing.

Because different recruitment websites have different addresses, and different categories of recruitment information in particular have different addresses, data collection of recruitment information is carried out in the following steps:

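1) install the collection software and a packet capture tool;

2) analyze the recruitment website addresses and find the address of each category of recruitment information;

3) obtain the paging information with the packet capture tool, and configure the relevant tools to carry out data collection;

4) find on the internet the mainstream recruitment websites to be collected;

5) use the packet capture tool to obtain the page addresses of each category of recruitment information;

6) analyze the pages and find the page rules of the recruitment information to be captured;

7) collect the information by configuring the rules obtained from the analysis;

8) store the collected data in a database.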
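The address-related steps can be illustrated with a small Python sketch. It assumes, purely for illustration, that packet capture showed the listing pages of one category differ only in a query parameter; the host `jobs.example.com`, the `category` value, and the parameter name `page` are hypothetical, and real sites may page through form posts or path segments instead.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_urls(list_url, page_param, first_page, last_page):
    """Yield one listing-page address per page number.

    list_url   -- a category listing address observed during packet capture
    page_param -- the query parameter that carries the page number (assumed)
    """
    parts = urlsplit(list_url)
    # Keep every query parameter except the page number itself.
    base_query = [(k, v) for k, v in parse_qsl(parts.query) if k != page_param]
    for page in range(first_page, last_page + 1):
        query = urlencode(base_query + [(page_param, str(page))])
        yield urlunsplit(
            (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
        )
```

For example, `list(page_urls("https://jobs.example.com/list?category=it&page=1", "page", 1, 3))` produces the three addresses for pages 1 to 3 of that hypothetical category. Generating the addresses from one observed pattern avoids hard-coding every page, which matters when page addresses keep changing as the data volume grows.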

Apart from the technical features described in this specification, everything else is known to those skilled in the art.

Claims

1. A method for crawling web page recruitment information, characterized in that the specific steps are as follows:

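1) install the collection software and a packet capture tool;

2) analyze the recruitment website addresses and find the address of each category of recruitment information;

3) obtain the paging information with the packet capture tool, and configure the relevant tools to carry out data collection;

4) find on the internet the mainstream recruitment websites to be collected;

5) use the packet capture tool to obtain the page addresses of each category of recruitment information;

6) analyze the pages and find the page rules of the recruitment information to be captured;

7) collect the information by configuring the rules obtained from the analysis;

8) store the collected data in a database.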