CN109165295B - Intelligent resume evaluation method - Google Patents


Info

Publication number
CN109165295B
Authority
CN
China
Prior art keywords
data
resume
recruitment
text
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811131459.XA
Other languages
Chinese (zh)
Other versions
CN109165295A (en)
Inventor
吴毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianya Community Network Technology Co ltd
Original Assignee
Tianya Community Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianya Community Network Technology Co ltd
Priority to CN201811131459.XA
Publication of CN109165295A
Application granted
Publication of CN109165295B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources
    • G06Q10/1053 Employment or hiring

Abstract

The invention discloses an intelligent resume evaluation method comprising the following steps: acquiring a recruitment data set from a database, wherein the recruitment data set comprises at least enterprise recruitment information; extracting data from the recruitment data set, the data comprising one or more attributes corresponding to the recruitment requirements of a position, the attributes being parameters that characterize the position requirements in the enterprise recruitment information; acquiring resume text data from the database item by item; extracting data from the resume text data, the data comprising one or more attributes characterizing an applicant; and matching the data extracted from the resume text data against the data extracted from the recruitment data set, and writing the matched data into the database.

Description

Intelligent resume evaluation method
Technical Field
The invention relates to the technical field of data processing, in particular to an intelligent resume evaluation method.
Background
At present, HR staff at many enterprises identify, judge and screen the resumes submitted by applicants by hand. These methods depend heavily on personal experience, and over long screening and evaluation sessions evaluators tire of repeatedly browsing similar content, which affects both recruitment efficiency and the quality of subjective judgment. On the other hand, in the existing recruitment process enterprises tend to look for talent through recruitment websites. Most such websites build a full profile of each applicant from social networks, behavior data and the like, comprehensively evaluate the applicant's interests, character and abilities, and thereby help enterprises find suitable talent. This approach has problems, however: it places high demands on the data required for evaluation, its accuracy is limited, and it is difficult to carry out. No effective solution has appeared so far.
Disclosure of Invention
Accordingly, the present invention is directed to an intelligent resume evaluation method to solve at least the above problems.
An intelligent resume evaluation method, comprising:
acquiring a recruitment data set from a database, wherein the recruitment data set at least comprises enterprise recruitment information;
extracting data from the recruitment data set, wherein the data comprises: one or more attributes corresponding to the recruitment requirements of the position, the attributes being parameters used for representing the position requirements in the enterprise recruitment information;
acquiring resume text data from a database one by one;
extracting data from the resume text data, wherein the data comprises: one or more attributes characterizing an applicant;
and matching the data extracted from the resume text data with the data extracted from the recruitment data set, and writing the matched resume text data into a database.
Further, the obtaining resume text data item by item from the database includes screening the resume text data, where the screening includes: removing the resume text which does not meet the condition from the resume text data; and acquiring the screened resume text data item by item.
Further, the resume text which does not meet the condition is the resume text which does not adopt a semi-structured data form.
Further, extracting data from the resume text data includes:
dividing the resume text into a basic information class and a complex information class set;
extracting data from the basic information class;
classifying the complex information class set;
and extracting the target information from the complex information class.
Further, when the resume text is divided into a basic information class and a complex information class set, a matching strategy based on regular expressions is first used to identify keywords and thereby locate the division point; if no recognizable keywords exist, the first 5-10 lines of the resume text are taken as a fuzzy basic information class from which data is extracted.
Further, extracting data from the basic information class includes:
identifying the content of strong identification elements;
determining the element type based on the element's context position.
Further, when the complex information class set is classified, a key character matching strategy based on regular expressions is first used to classify the complex information class set; if no matching keywords can be found, the complex information class set is classified by analyzing the format and font of the text, or by an automatic classification algorithm based on simple vectors.
Further, when extracting target information from the complex information class, extracting the target information by adopting a key character matching strategy based on a regular expression, wherein the target information is information used for representing the professional skills and the technical level of an applicant in resume text data.
Compared with the prior art, the invention has the beneficial effects that:
According to the intelligent resume evaluation method provided by the invention, specific information is extracted from the recruitment information and from the resume text data, and the two are matched automatically. This simplifies the resume screening process, is more efficient than the traditional manual screening mode, and reduces the use of human resources. In addition, the requirement on the source of the screened data is lower: the required data can be extracted automatically from the resume submitted by an applicant, the screening basis can be adjusted according to the requirements of the recruiter, and the target information is extracted with higher accuracy.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is apparent that the drawings in the following description are only preferred embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without inventive efforts.
Fig. 1 is a schematic flow chart of an intelligent resume evaluation method according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a resume text data extraction process according to an embodiment of the present invention.
Detailed Description
The principles and features of the invention are described below with reference to the accompanying drawings. The illustrated embodiments are provided to explain the invention, not to limit its scope.
The following embodiments can be applied to a general terminal such as a computer. They may also be applied to a server, which can be understood as a device composed of one or more computers, so the computer structure described below also applies to a server. As the computing power of mobile terminals gradually increases, the following embodiments may also be implemented on mobile terminals. The steps or modules in the following embodiments may also be performed on different servers, terminals or mobile terminals respectively, with the necessary data exchanged between them.
Referring to Fig. 1, the present invention provides an intelligent resume evaluation method, which specifically includes:
Step S1, acquiring a recruitment data set from the database, wherein the recruitment data set at least comprises enterprise recruitment information.
In the above step, as an optional implementation, the recruitment data set stored in the database is the latest enterprise recruitment information of the recruiting enterprise. Because an enterprise's recruitment standards, and therefore its requirements for candidates, may change over time, using the latest enterprise recruitment information as the source of the recruitment data set keeps the recruitment target accurate.
Step S2, extracting data from the recruitment data set, wherein the data comprises one or more attributes corresponding to the recruitment requirements of the position, the attributes being parameters used for representing the position requirements in the enterprise recruitment information.
In the above step, the attributes characterizing the position requirements may be parameters such as the educational background, major, skills and work experience required by the position. The extracted data is the specific requirement for each attribute; for example, the enterprise recruitment information may require that an applicant for the position have an educational background of junior college level or above and a major in software engineering.
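Step S2 can be illustrated with a minimal sketch. The patent does not say how requirements are written in a posting, so the labelled-line format, the label names and the function `extract_requirements` below are all assumptions made for illustration only.

```python
import re

# Hypothetical label patterns: the source does not specify how a posting
# phrases its requirements, so these labels are assumptions.
REQUIREMENT_PATTERNS = {
    "education": re.compile(r"^education\s*:\s*(.+)$", re.IGNORECASE),
    "major": re.compile(r"^major\s*:\s*(.+)$", re.IGNORECASE),
    "skills": re.compile(r"^skills\s*:\s*(.+)$", re.IGNORECASE),
    "experience": re.compile(r"^experience\s*:\s*(.+)$", re.IGNORECASE),
}


def extract_requirements(posting: str) -> dict:
    """Return {attribute: required value} for each labelled line in the posting."""
    requirements = {}
    for line in posting.splitlines():
        line = line.strip()
        for attribute, pattern in REQUIREMENT_PATTERNS.items():
            match = pattern.match(line)
            if match:
                requirements[attribute] = match.group(1).strip()
    return requirements


posting = """Position: Software Engineer
Education: junior college or above
Major: software engineering"""
print(extract_requirements(posting))
```

Lines whose label is not in the pattern table (here, `Position`) are simply ignored, so the set of extracted attributes can be adjusted by editing the table.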
Step S3, acquiring resume text data item by item from the database.
In the above step, the resume text data is the resume text submitted by applicants. In an embodiment of the present invention, this step further includes screening the resume text data, where the screening includes: removing resume texts that do not meet the condition from the resume text data, and acquiring the screened resume text data item by item. A resume text that does not meet the condition is one that does not adopt a semi-structured data form.
According to its characteristics, text data can be divided into three categories. Structured data is text data generated strictly according to a fixed format, such as bills and score sheets. Unstructured data is text data written in the way people customarily communicate and conforming to natural grammatical rules, such as news reports, novels and prose. Semi-structured data lies between the two: viewed as a whole, the text has certain format constraints and does not fully conform to natural grammatical rules, but locally it organizes language according to natural grammatical rules; notices, announcements and most resumes are semi-structured text. To make it easier for a computer to recognize the resume text data and extract information from it, resume texts that are not in semi-structured form, that is, texts not written in a conventional resume format, need to be removed when the resume text data submitted by applicants is acquired.
Step S4, extracting data from the resume text data, wherein the data includes one or more attributes characterizing the applicant.
In the above step, the attributes characterizing the applicant may be information such as name, gender, school, educational background, major, skills and work experience.
Step S5, matching the data extracted from the resume text data with the data extracted from the recruitment data set, and writing the matched data into the database.
In step S5, the data characterizing the applicant, extracted from the resume text data, is matched against the data characterizing the recruitment requirements of the recruiting enterprise, extracted from the recruitment data set. If the match passes, that is, if the applicant's characteristics in a given respect meet the recruitment requirements of the recruiting enterprise, the matched resume text data is written into the database. The human resources department can then arrange interviews for applicants according to the resume text data in the database, which improves recruitment efficiency.
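The matching in step S5 might look roughly like the sketch below. The source does not define the matching rule in detail, so several things here are assumptions: the education ranking table, the attribute names `min_education` and `major`, the choice to require every stated requirement to be met, and the use of a plain list in place of the database.

```python
# Hypothetical ordering of education levels, lowest to highest (an assumption).
EDUCATION_RANK = {
    "high school": 0,
    "junior college": 1,
    "bachelor": 2,
    "master": 3,
    "doctorate": 4,
}


def meets_requirements(applicant: dict, requirements: dict) -> bool:
    """Check an applicant's extracted attributes against the position requirements."""
    min_education = requirements.get("min_education")
    if min_education is not None:
        level = applicant.get("education", "").lower()
        # Unknown or missing education levels rank below every known level.
        if EDUCATION_RANK.get(level, -1) < EDUCATION_RANK[min_education]:
            return False
    major = requirements.get("major")
    if major is not None and applicant.get("major", "").lower() != major.lower():
        return False
    return True


def select_matching(resumes: list, requirements: dict) -> list:
    """Return the resumes that pass; a real system would write these to the database."""
    return [r for r in resumes if meets_requirements(r["attributes"], requirements)]
```

Because the requirements are passed in as data, the screening basis can be changed per posting without touching the matching code.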
Referring to Fig. 2, on the basis of the above embodiment, extracting data from the resume text data in step S4 includes:
Step S41, dividing the resume text into a basic information class and a complex information class set;
Step S42, extracting data from the basic information class;
Step S43, classifying the complex information class set;
Step S44, extracting target information from the complex information classes.
In the embodiment of the invention, the resume text is divided into a basic information class and complex information classes, where a class is a piece of text whose content shares certain common characteristics. The basic information class is the class formed by the applicant's basic information. It characterizes the applicant's basic situation and may contain several basic information items, such as name, date of birth, school, educational background, major, place of origin and contact details. Basic information items are generally separated by spaces, line breaks and the like. A complex information class is a class formed by the applicant's complex information and characterizes the applicant's extended information. A resume text may contain several complex information classes, such as education experience, work experience, project experience and training experience, and together these form the complex information class set.
Between the basic information class and the complex information classes, and between one complex information class and another, there are obvious segmentation marks, such as keywords, fonts and formats, that differ from the content of each class. When the resume text is divided into the basic information class and the complex information class set in step S41, as an optional embodiment, a keyword matching strategy based on regular expressions is first used to find the division marks. Since each type of text information in a resume generally has a title, the titles that may appear in a resume text, together with the category each belongs to, can first be enumerated exhaustively and stored in a keyword library; a regular expression is then designed to retrieve matching text from the resume, and the matches serve as division marks. If no such keywords are detected in the text, then, because an applicant's basic information is usually located at the beginning of a resume, the first 5-10 lines of the resume text are taken as a fuzzy basic information class from which data is extracted. The extent of the fuzzy basic information class can of course be set flexibly according to actual requirements.
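The division in step S41 can be sketched as follows. The keyword library contents, the rule that a title occupies a line of its own, and the default of 8 fallback lines (a value inside the 5-10 range given in the text) are assumptions chosen for illustration.

```python
import re

# Hypothetical keyword library: section titles and the category each belongs to.
TITLE_KEYWORDS = {
    "education": "education experience",
    "work experience": "work experience",
    "project experience": "project experience",
    "training": "training experience",
}

# A title is assumed to stand alone on its line, optionally followed by a colon.
TITLE_RE = re.compile(
    r"^\s*(" + "|".join(re.escape(k) for k in TITLE_KEYWORDS) + r")\b\s*:?\s*$",
    re.IGNORECASE,
)


def split_resume(text: str, fallback_lines: int = 8):
    """Split resume lines into (basic information class, complex information class set)."""
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if TITLE_RE.match(line):
            # The first section title found is taken as the division point.
            return lines[:index], lines[index:]
    # No recognizable keyword: treat the leading lines as a fuzzy basic class.
    return lines[:fallback_lines], lines[fallback_lines:]
```

Requiring the title to fill the whole line keeps an inline item such as `Education: Bachelor` inside the basic information class instead of being mistaken for a section heading.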
In step S42, extracting the information required by the recruiting enterprise from the basic information class specifically includes:
identifying the content of strong identification elements;
determining the element type based on the element's context position.
The basic information class is composed of several basic information items, and a basic information item generally comprises a title element and a content element. For example, the label "Name" in the resume text is a title element, and the actual name that follows it is the content element. Elements of the resume text differ in how strongly their content identifies them; a title element is a strong identification element, because its category can be judged from its own text content. By designing a regular expression, the strong identification elements in the resume text are retrieved, and the types of the remaining elements can then be judged from their context positions. For example, in the basic information class, if a weak identification element or an element with no identifying features lies between two strong identification elements, it is considered the content element corresponding to the preceding strong identification element. After the element types in the basic information class have been identified, the required information is extracted using a regular expression designed according to the key character matching strategy.
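The context-position rule in step S42 can be sketched as below: title elements are found with a regular expression, and whatever text lies between one title and the next is assigned to the earlier title as its content. The list of titles and the function name `extract_basic_items` are hypothetical.

```python
import re

# Hypothetical list of title elements treated as strong identification elements.
STRONG_TITLES = ["name", "gender", "school", "major", "phone"]

# A title may be followed by an ASCII or full-width colon.
STRONG_RE = re.compile(
    r"\b(" + "|".join(STRONG_TITLES) + r")\b\s*[:：]?",
    re.IGNORECASE,
)


def extract_basic_items(text: str) -> dict:
    """Pair each title element with the content found before the next title."""
    matches = list(STRONG_RE.finditer(text))
    items = {}
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following is not None else len(text)
        # Text between two strong elements belongs to the preceding one.
        items[current.group(1).lower()] = text[current.end():end].strip()
    return items
```

This sketch would misfire if a content value itself contained one of the title words, so a fuller version would need extra checks on position or punctuation.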
In step S43, because a resume text usually includes several complex information classes, such as educational background, work experience, skills, hobbies and social practice, which together form the complex information class set, the complex information class set needs to be classified further once the resume text has been divided into the basic information class and the complex information class set. A key character matching strategy based on regular expressions is used first. Most resume texts contain keywords such as "educational background" and "work experience", so this method classifies complex information classes quickly, accurately and with good results. If no matching keywords can be found, then, because the title and the content of a complex information class generally use different fonts, sizes and formats, the complex information class set is classified by analyzing the format and font of the text, or by an automatic classification algorithm based on simple vectors.
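The keyword-based part of step S43 can be sketched as follows. The category names and title patterns are assumptions, and the format and font analysis mentioned in the text is not covered, since plain text carries no font information.

```python
import re

# Hypothetical title patterns for each complex information class.
CATEGORY_PATTERNS = {
    "education": re.compile(r"education(al)?(\s+(background|experience))?", re.IGNORECASE),
    "work experience": re.compile(r"work(ing)?\s+experience", re.IGNORECASE),
    "skills": re.compile(r"(professional\s+)?skills?", re.IGNORECASE),
}


def classify_complex_set(lines: list) -> dict:
    """Group the lines of the complex information class set under their titles."""
    classes = {}
    current = "unclassified"  # lines seen before any recognized title
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        title = next(
            (name for name, pattern in CATEGORY_PATTERNS.items()
             if pattern.fullmatch(stripped)),
            None,
        )
        if title is not None:
            current = title
            classes.setdefault(current, [])
        else:
            classes.setdefault(current, []).append(stripped)
    return classes
```

Lines that land in the `unclassified` bucket are the natural candidates for the fallback classification described in the text.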
The classification principle of the automatic classification algorithm based on simple vectors is as follows: a center vector is generated for each class of text by taking the arithmetic mean of the vectors in that class; when a new text arrives, its vector is determined and the distance, that is, the similarity, between the new text vector and the center vector of each class is calculated; finally, the new text is assigned to the class whose center vector is closest to it.
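The center-vector principle can be sketched in a few lines. The source does not specify how texts are turned into vectors or which distance measure is used, so the word-count vectors and cosine similarity below are assumptions; any other vectorization or distance could be substituted.

```python
import math
from collections import Counter


def to_vector(text: str, vocabulary: list) -> list:
    """Represent a text as word counts over a fixed vocabulary (an assumed encoding)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a: list, b: list) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def train_centroids(labelled_texts: dict):
    """Build the vocabulary and one arithmetic-mean center vector per class."""
    vocabulary = sorted(
        {word for texts in labelled_texts.values()
         for text in texts for word in text.lower().split()}
    )
    centroids = {}
    for label, texts in labelled_texts.items():
        vectors = [to_vector(text, vocabulary) for text in texts]
        centroids[label] = [sum(column) / len(vectors) for column in zip(*vectors)]
    return vocabulary, centroids


def classify(text: str, vocabulary: list, centroids: dict) -> str:
    """Assign the text to the class whose center vector is most similar."""
    vector = to_vector(text, vocabulary)
    return max(centroids, key=lambda label: cosine_similarity(vector, centroids[label]))
```

Training reduces to one averaging pass per class, and classifying a new text costs one similarity computation per class.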
In step S44, after the complex information classes have been classified, a regular expression is designed according to the keyword matching strategy to extract target information. The target information is the information in the resume text data that characterizes the applicant's professional skills and technical level, and it is extracted so that it can be matched against the information characterizing the position requirements extracted from the recruiting enterprise's recruitment information. Matching proceeds from the position requirement information. For example, if the recruitment information requires the applicant to hold a senior software engineer professional certification, a regular expression is designed from that requirement and used to screen the target information extracted from the complex information classes. If the corresponding information is retrieved, the resume text is stored in the database; otherwise, the resume text is regarded as not meeting the position requirements.
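The screening described for step S44 can be sketched as building a regular expression from a requirement phrase and searching the extracted target information with it. The helper names and the whitespace-tolerant, case-insensitive matching are assumptions; the source only states that a regular expression is designed from the requirement.

```python
import re


def build_requirement_pattern(requirement: str):
    """Compile a case-insensitive pattern that tolerates variable spacing between words."""
    words = [re.escape(word) for word in requirement.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


def screen_resume(target_lines: list, requirement: str) -> bool:
    """Return True if any extracted target line mentions the required qualification."""
    pattern = build_requirement_pattern(requirement)
    return any(pattern.search(line) for line in target_lines)


target_information = ["Certified Senior Software Engineer, 2016", "Python, Java"]
if screen_resume(target_information, "senior software engineer"):
    print("store resume in database")  # placeholder for the database write
else:
    print("resume does not meet the position requirements")
```

Literal phrase matching is strict: a resume that expresses the same qualification in different words would be rejected, which is one reason a keyword library with synonyms may be needed in practice.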
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (5)

1. An intelligent resume evaluation method, comprising:
acquiring a recruitment data set from a database, wherein the recruitment data set at least comprises enterprise recruitment information;
extracting data from the recruitment data set, wherein the data comprises: one or more attributes corresponding to the recruitment requirements of the position, the attributes being parameters used for representing the position requirements in the enterprise recruitment information;
obtaining resume text data from a database one by one, and screening the resume text data, wherein the screening comprises the following steps: removing the resume text which does not meet the condition from the resume text data; acquiring the screened resume text data one by one, wherein the resume text which does not meet the condition is the resume text which does not adopt a semi-structured data form;
extracting data from the resume text data, including:
dividing the resume text into a basic information class and a complex information class set;
extracting data from the basic information class;
classifying the complex information class set;
extracting target information from a complex information class, wherein the data comprises: one or more attributes characterizing an applicant;
and matching the data extracted from the resume text data with the data extracted from the recruitment data set, and writing the matched resume text data into a database.
2. The intelligent resume evaluation method according to claim 1, wherein when the resume text is divided into a basic information class and a complex information class set, keywords are identified by adopting a matching strategy based on regular expressions to find the division point; if no recognizable keywords exist, the first 5-10 lines of the resume text are taken as a fuzzy basic information class from which data is extracted.
3. The intelligent resume evaluation method of claim 1, wherein extracting data from the basic information class comprises:
identifying the content of strong identification elements;
determining the element type based on the element's context position.
4. The intelligent resume evaluation method according to claim 1, wherein when the complex information class set is classified, a key character matching strategy based on regular expressions is first adopted to classify the complex information class set; if no matching keywords can be found, the complex information class set is classified by analyzing the format and font of the text, or by an automatic classification algorithm based on simple vectors.
5. The intelligent resume evaluation method according to claim 1, wherein when extracting target information from the complex information class, a regular expression-based key character matching strategy is adopted to extract the target information, and the target information is information used for representing professional skills and technical levels of an applicant in resume text data.
CN201811131459.XA 2018-09-27 2018-09-27 Intelligent resume evaluation method Active CN109165295B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811131459.XA CN109165295B (en) 2018-09-27 2018-09-27 Intelligent resume evaluation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811131459.XA CN109165295B (en) 2018-09-27 2018-09-27 Intelligent resume evaluation method

Publications (2)

Publication Number Publication Date
CN109165295A CN109165295A (en) 2019-01-08
CN109165295B CN109165295B (en) 2021-07-27

Family

ID=64892619

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811131459.XA Active CN109165295B (en) 2018-09-27 2018-09-27 Intelligent resume evaluation method

Country Status (1)

Country Link
CN (1) CN109165295B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110263148A (en) * 2019-06-27 2019-09-20 中国工商银行股份有限公司 Intelligent resume selection method and device
CN111598462B (en) * 2020-05-19 2022-07-12 厦门大学 Resume screening method for campus recruitment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104599031A (en) * 2014-11-06 2015-05-06 河南智业科技发展有限公司 Resume model matching system and method
CN105183742A (en) * 2015-06-12 2015-12-23 南京富士通南大软件技术有限公司 Resume identification method
CN105117863A (en) * 2015-09-28 2015-12-02 北京橙鑫数据科技有限公司 Resume position matching method and device
CN107590133A (en) * 2017-10-24 2018-01-16 武汉理工大学 The method and system that position vacant based on semanteme matches with job seeker resume
CN107729532A (en) * 2017-10-30 2018-02-23 北京拉勾科技有限公司 A kind of resume matching process and computing device
CN107808016A (en) * 2017-11-29 2018-03-16 四川九鼎智远知识产权运营有限公司 A kind of online resume matching process
CN107862079A (en) * 2017-11-29 2018-03-30 四川九鼎智远知识产权运营有限公司 A kind of online resume Matching Platform

Also Published As

Publication number Publication date
CN109165295A (en) 2019-01-08


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant