WO2017167069A1 - Resume evaluation method and apparatus - Google Patents

Publication number
WO2017167069A1
Authority
WO
WIPO (PCT)
Prior art keywords
resume
data
name
school
recruitment
Prior art date
Application number
PCT/CN2017/077496
Other languages
English (en)
French (fr)
Inventor
王海英
刘军宁
郭鼎
陈奥
宋华青
张晓莹
石志伟
金凯民
袁泉
刘长江
Original Assignee
阿里巴巴集团控股有限公司
王海英
刘军宁
郭鼎
陈奥
宋华青
张晓莹
石志伟
金凯民
袁泉
刘长江
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 王海英, 刘军宁, 郭鼎, 陈奥, 宋华青, 张晓莹, 石志伟, 金凯民, 袁泉, 刘长江 filed Critical 阿里巴巴集团控股有限公司
Publication of WO2017167069A1 publication Critical patent/WO2017167069A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/105Human resources
    • G06Q10/1053Employment or hiring
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03Data mining

Definitions

  • the present invention relates to the field of data processing, and in particular to a resume evaluation method and apparatus.
  • the search network expands a flat snapshot of a person's past into a movie-like story of that person, forming a multi-dimensional, three-dimensional profile of the job seeker.
  • the search network uses the career development paths of 20 million people as its data source and, through analysis, forms a job promotion map covering, for example, promotion between positions and the relationships between jobs.
  • the embodiments of the present invention provide a resume evaluation method and apparatus, so as to at least solve the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, and, because such data is complex, variable, and difficult to acquire, the evaluation itself is difficult.
  • a resume evaluation method includes: acquiring a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data; extracting data from the historical recruitment data set, wherein the extracted data includes at least a recruitment result corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the candidate, and the recruitment result includes at least the number of times the one or more attributes occur for the position and/or the number of times candidates with the one or more attributes were hired for the position; and constructing a resume evaluation model by training on the extracted data.
  • a resume evaluation method includes: inputting a resume to be evaluated; and obtaining a resume evaluation result for the resume to be evaluated, wherein the resume evaluation result is obtained from a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
  • a resume evaluation apparatus comprises: an acquisition module, configured to acquire a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data; an extraction module, configured to extract data from the historical recruitment data set, wherein the extracted data includes at least a recruitment result corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the candidate, and the recruitment result includes at least the number of times the one or more attributes occur for the position and/or the number of times candidates with the one or more attributes were hired for the position; a construction module, configured to construct a resume evaluation model by training on the extracted data; and an evaluation module, configured to use the resume evaluation model to evaluate a received resume.
  • a resume evaluation apparatus includes: a first input module, configured to input a resume to be evaluated; and an acquisition module, configured to obtain a resume evaluation result for the resume to be evaluated, wherein the resume evaluation result is obtained from a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
  • the above solution provided by the present invention solves the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, and, because such data is complex and difficult to acquire, the evaluation itself is difficult.
  • FIG. 1 is a block diagram showing the hardware structure of a computer terminal of a resume evaluation method according to Embodiment 1 of the present invention
  • FIG. 2 is a flow chart of an optional resume evaluation method according to Embodiment 1 of the present invention.
  • FIG. 3 is a flow chart of an optional resume evaluation method according to Embodiment 1 of the present invention.
  • FIG. 5 is a schematic diagram of an optional resume evaluation method according to Embodiment 2 of the present invention.
  • FIG. 6 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 3 of the present invention.
  • FIG. 7 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 3 of the present invention.
  • FIG. 8 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 3 of the present invention.
  • FIG. 9 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 3 of the present invention.
  • FIG. 10 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 3 of the present invention.
  • FIG. 11 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 3 of the present invention.
  • FIG. 12 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 3 of the present invention.
  • FIG. 13 is a schematic structural diagram of a resume evaluation apparatus according to Embodiment 4 of the present invention.
  • FIG. 14 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 4 of the present invention.
  • FIG. 15 is a schematic structural diagram of an optional resume evaluation apparatus according to Embodiment 4 of the present invention.
  • FIG. 16 is a structural block diagram of a computer terminal according to an embodiment of the present invention.
  • the following embodiments can be implemented as applications that run on ordinary terminals, such as computers.
  • the following embodiments can also be applied to a server, which can also be understood as a device consisting of one or more computers. Therefore, the structure of the computer shown below is also applicable to the server.
  • the following embodiments can also be implemented in a mobile terminal as the computing power of the mobile terminal is gradually enhanced.
  • the steps or modules in the following embodiments may be performed in different servers or terminals or mobile terminals, respectively, and the necessary data interaction may be performed between these servers or terminals or mobile terminals.
  • FIG. 1 is a block diagram showing the hardware structure of a computer terminal of a resume evaluation method according to Embodiment 1 of the present invention.
  • computer terminal 10 may include one or more (only one is shown) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication functions.
  • computer terminal 10 may also include more or fewer components than those shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
  • the memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the resume evaluation method in the embodiment of the present invention. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, that is, it implements the resume evaluation method described above.
  • Memory 104 may include high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • memory 104 can further include memory located remotely from the processor 102, and such remote memory can be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • Transmission device 106 is for receiving or transmitting data via a network.
  • specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10.
  • the transmission device 106 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 can be a Radio Frequency (RF) module for communicating with the Internet wirelessly.
  • FIG. 2 is a flow chart of a resume evaluation method according to Embodiment 1 of the present invention.
  • the method according to the above embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium (e.g., ROM/RAM, disk, CD-ROM) and includes a number of instructions for causing a terminal device (which may be a cell phone, a computer, a server, a network device, etc.) to perform the methods described in the various embodiments of the present invention.
  • the following steps refer to a data set; one or more data items that are processed in the same or a similar manner, or that together serve as the basis for a given action or step, can be regarded as a data set.
  • Step S21: acquire a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data.
  • the source of the historical recruitment data set may be the target recruiter's own historical recruitment data, for example, information about the candidates who applied to the target recruiter within a preset period (such as a recruitment season), as well as information about the candidates who applied and obtained a position with the target recruiter.
  • the historical recruitment data set may be obtained from the website's own database.
  • for example, the target recruiter's historical recruitment data for the past five years is obtained from the database. Because the recruiter's criteria may change over time (for example, its academic requirements may rise, or it may place more weight on a candidate's work experience), all historical recruitment data from the past two years, together with part of the recruitment data from three to five years ago, is used as the data set.
  • Step S23: extract data from the historical recruitment data set, wherein the extracted data includes at least a recruitment result corresponding to one or more attributes for a position, and the one or more attributes are parameters in the resume text data that characterize the candidate.
  • the recruitment result may include: the number of times the one or more attributes occur for the position, and/or the number of times candidates with the one or more attributes were hired for the position.
  • the recruitment result may also include content other than these two items; any information that is useful for recruiting can be counted in the recruitment result.
  • the attributes, that is, the parameters used to characterize the candidate, may be the candidate's academic qualifications, major, work experience, and the like.
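  • The counting described in step S23 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record layout (`position`, `hired`, `attributes`) and the function name `extract_recruitment_results` are assumptions made for the example.

```python
from collections import defaultdict

def extract_recruitment_results(resumes):
    """Count, per (position, attribute, value), how often the value occurs
    and how often a candidate with that value was hired."""
    counts = defaultdict(lambda: {"occurrences": 0, "hires": 0})
    for resume in resumes:
        for attribute, value in resume["attributes"].items():
            key = (resume["position"], attribute, value)
            counts[key]["occurrences"] += 1
            if resume["hired"]:
                counts[key]["hires"] += 1
    return dict(counts)

# Hypothetical historical records; the field names are assumptions.
history = [
    {"position": "B position", "hired": True,
     "attributes": {"school": "A school", "major": "A major"}},
    {"position": "B position", "hired": False,
     "attributes": {"school": "A school", "major": "B major"}},
]
results = extract_recruitment_results(history)
```

  • Keying the counts by position keeps the statistics for different positions separate, which matches the description of the recruitment result as being tied to attributes "in the position".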
  • Step S25: construct the resume evaluation model by training on the extracted data.
  • the GBDT algorithm may be used to train on the extracted data to obtain the resume evaluation model.
  • the historical recruitment data set from the above steps may first be used to construct decision trees in one or more dimensions, and the output of the final model is the cumulative value of the results obtained from the multiple decision trees.
  • the algorithm for training on the extracted data may be the above GBDT algorithm, but is not limited thereto.
  • the purpose of training on the extracted data is for the constructed resume evaluation model to learn from that data, so that when the model later receives the same or similar data it produces the same or a similar output. Training therefore requires a large amount of data, so that a resume being evaluated is likely to resemble data that was used for training. The amount of data used for training is not specifically limited; however, in a preferred case, the larger the amount of data extracted for training and the wider its coverage, the higher the accuracy of the constructed resume evaluation model.
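  • The idea that the final output is the cumulative value of many decision trees can be illustrated with a deliberately simplified gradient-boosting sketch. The source does not specify the loss function, tree depth, or learning rate; the squared-error loss, one-split trees (stumps), function names, and toy data below are all assumptions for illustration.

```python
def fit_stump(rows, residuals):
    """Find the single feature/threshold split that best fits the residuals."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[feature] <= threshold]
            right = [r for row, r in zip(rows, residuals) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            error = (sum((r - left_mean) ** 2 for r in left)
                     + sum((r - right_mean) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, feature, threshold, left_mean, right_mean)
    return None if best is None else best[1:]

def predict(model, row):
    """Model output = base value + accumulated (scaled) outputs of all stumps."""
    base, learning_rate, stumps = model
    total = base
    for feature, threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if row[feature] <= threshold else right_value)
    return total

def train_gbdt(rows, labels, n_trees=20, learning_rate=0.5):
    """Each new stump is fitted to the residuals left by the previous ones."""
    model = (sum(labels) / len(labels), learning_rate, [])
    for _ in range(n_trees):
        residuals = [y - predict(model, row) for row, y in zip(rows, labels)]
        stump = fit_stump(rows, residuals)
        if stump is None:
            break
        model[2].append(stump)
    return model

# Toy data (assumed): two numeric attribute scores per resume, label 1 = hired.
rows = [[0.5, 0.5], [0.5, 0.0], [0.17, 0.5], [0.17, 0.0]]
labels = [1, 0, 1, 0]
model = train_gbdt(rows, labels)
```

  • In practice a library implementation of gradient-boosted trees would normally be used; the sketch only shows how successive trees accumulate into one score.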
  • Step S27: optionally, the resume evaluation method provided by the foregoing embodiment of the present application may further include the following step: performing resume evaluation on a received resume using the resume evaluation model.
  • the received resume is input into the constructed resume evaluation model to obtain the output of the model.
  • the output of the model can be matched against the position the candidate expects, and if the match is successful, a prompt is given.
  • different colors for the candidate's name can be used to indicate different degrees of match.
  • the data extracted in step S23 includes a recruitment result corresponding to one or more attributes for a position, and the one or more attributes are parameters in the resume text data that characterize the candidate. It follows that when the resume evaluation model is constructed in step S25 by training on the extracted data, the candidate's behavioral and social data are not used, which avoids the loss of evaluation efficiency caused by such data being complex, variable, and difficult to obtain. It should be noted that the foregoing solution in the embodiment of the present application uses the data in the resume, which is easy to obtain, together with the correspondence between the candidate's parameters (attributes) and the recruitment result, and the data in a resume is highly reliable. Therefore, not using behavioral and social data does not lower the accuracy of the evaluation; on the contrary, using the relationship between candidate parameters and recruitment results in the historical recruitment data improves the accuracy of the resume evaluation.
  • the solution provided by the foregoing embodiment of the present invention thus solves the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, and, because such data is complex, variable, and difficult to obtain, the evaluation itself is difficult.
  • when data is extracted from the historical recruitment data set in step S23, it may be extracted from the entire historical recruitment data set. Alternatively, the data may first be filtered or cleaned to remove data that is considered likely to distort the result.
  • the above step S23 may include the following steps:
  • Step S231: clean the historical recruitment data set, wherein the data cleaning is mainly used to mask resumes that did not pass evaluation from the historical recruitment data set.
  • the purpose of cleaning the historical recruitment data set is to discover and clear the noise data included in the historical recruitment data set.
  • Step S233: extract data from the cleaned historical recruitment data set.
  • in this way, the extracted data is more accurate, and the resulting resume evaluation model better meets the requirements.
  • a resume that fails to pass assessment includes at least one of the following:
  • the head count characterizes the number of employees planned for a position and/or the number to be recruited for it, as set by the recruiter's human resources department based on the position's current needs, its future development needs, and overall corporate planning.
  • Applicant A meets the recruitment criteria of the recruiter, but because the HR department has a planned head count for the position, hiring candidate A could cause redundancy, so candidate A is not hired.
  • the resume of candidate A is then a resume that fails evaluation because of the head count.
  • the resume of applicant B was not evaluated; applicant B went directly to an interview with the recruiter for a certain position but did not pass it. The resume of applicant B is then also considered to be a resume that fails evaluation.
  • Applicant C repeatedly submits his or her resume for the same position in different ways, for example, by submitting it multiple times to the same position of the same recruiter through different recruitment websites. Such a resume is also considered to be a resume that fails evaluation.
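  • The three cleaning cases above (head count, interview failure without resume evaluation, repeated submission) might be filtered as in the sketch below. All field names are hypothetical, and keeping the first copy of a repeated submission is an assumption; the source only says that such resumes are treated as failing evaluation.

```python
def clean_history(records):
    """Drop the records that the cleaning step treats as noise."""
    seen = set()
    cleaned = []
    for rec in records:
        if rec.get("rejected_for_head_count"):
            continue  # met the criteria but was rejected only because of head count
        if not rec.get("resume_evaluated", True) and not rec.get("hired"):
            continue  # resume was never evaluated and the interview was failed
        key = (rec["candidate_id"], rec["recruiter"], rec["position"])
        if key in seen:
            continue  # repeated submission to the same position (first copy kept)
        seen.add(key)
        cleaned.append(rec)
    return cleaned

# Hypothetical records illustrating applicants A, B, C and one ordinary applicant.
records = [
    {"candidate_id": "A", "recruiter": "R", "position": "P", "hired": False,
     "rejected_for_head_count": True},
    {"candidate_id": "B", "recruiter": "R", "position": "P", "hired": False,
     "resume_evaluated": False},
    {"candidate_id": "C", "recruiter": "R", "position": "P", "hired": False},
    {"candidate_id": "C", "recruiter": "R", "position": "P", "hired": False},
    {"candidate_id": "D", "recruiter": "R", "position": "P", "hired": True},
]
cleaned = clean_history(records)
```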
  • in step S25, the resume evaluation model is constructed by training on the extracted data.
  • all of the extracted data can be used to build the resume evaluation model, and the model can then be tested using resumes that are actually received.
  • this approach requires testing with real candidates' resumes, so suitable candidates may be wrongly screened out.
  • alternatively, the extracted data may be divided into two parts, one part used to generate the resume evaluation model and the other part used to test the generated resume evaluation model.
  • step S25 may include the following steps:
  • Step S251: divide the extracted data into training sample data and test sample data.
  • the training sample data includes a plurality of data for constructing a resume evaluation model by training, and the test sample data also includes a plurality of data for verifying whether the resume evaluation model is accurate.
  • the training sample data and the test sample data may be data in a plurality of dimensions in a historical recruitment data set.
  • the portion of the extracted data assigned to the training sample data should be diverse in every dimension, so that after training the resume evaluation model has learned a variety of data in every dimension.
  • the training sample data is used to build the resume evaluation model, and the test sample data is used to verify the accuracy of the resume evaluation model; both the training sample data and the test sample data come from historical recruitment data whose recruitment results are already known.
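  • A minimal sketch of the split in step S251 follows. The 80/20 ratio, the random shuffle, and the names `split_samples` and `labelled` are assumptions; the source does not state how the data is divided.

```python
import random

def split_samples(records, test_ratio=0.2, seed=0):
    """Shuffle labelled records and split them into (training, test) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labelled history: every record already has a known outcome.
labelled = [{"id": i, "hired": i % 2 == 0} for i in range(10)]
train, test = split_samples(labelled)
```

  • A plain random split does not by itself guarantee that the training part is diverse in every dimension; a stratified split per attribute would be one way to approach that requirement.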
  • Step S253: train on the training sample data to generate a resume evaluation model to be tested.
  • the following is described in an alternative embodiment.
  • Personnel 1 A school, A major, B company, B position;
  • Personnel 3 A school, B major, A company, A position;
  • Personnel 5 A school, A major, C company, B position;
  • Major: major A appears 3 times, a proportion of 0.5; major B appears 3 times, a proportion of 0.5; other majors do not appear, a proportion of 0.
  • Company: company A appears once, a proportion of 0.17; company B appears twice, a proportion of 0.33; company C appears 3 times, a proportion of 0.5; other companies do not appear, a proportion of 0.
  • Position: position A appears 3 times, a proportion of 0.5; position B appears 3 times, a proportion of 0.5; other positions have a proportion of 0.
  • Personnel 1 A school, A major, B company, B position;
  • the scores of the personnel 1 on each attribute are 0.5*0.2, 0.3*0.5, 0.2*0.33, 0.3*0.5, namely, 0.1, 0.15, 0.066 and 0.15;
  • the scores of the personnel 2 on each attribute are 0.2*0.17, 0.3*0.5, 0.2*0.5 and 0.3*0.5, respectively, namely 0.034, 0.15, 0.1 and 0.15;
  • Personnel 3 A school, B major, A company, A position;
  • the scores of the personnel 3 on each attribute are 0.2*0.5, 0.3*0.5, 0.2*0.17 and 0.3*0.5, respectively, namely 0.1, 0.15, 0.034 and 0.15;
  • the scores of the personnel 4 on each attribute are 0.2*0.33, 0.3*0.5, 0.2*0.33 and 0.3*0.5, respectively, namely 0.066, 0.15, 0.066 and 0.15;
  • the scores of the personnel 5 on each attribute are 0.2*0.5, 0.3*0.5, 0.2*0.5, and 0.3*0.5, namely, 0.1, 0.15, 0.1, and 0.15;
  • the scores of the personnel 6 on each attribute are 0.2*0.33, 0.3*0.5, 0.2*0.5, and 0.3*0.5, respectively, namely 0.066, 0.15, 0.1, and 0.15.
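  • The per-attribute scores above are each a weight multiplied by the proportion of the attribute value in the training data. The sketch below reproduces the calculation for personnel 1. The weights (0.2 for school, 0.3 for major, 0.2 for company, 0.3 for position) and the 0.5 proportion for school A are inferred from the products shown in the example rather than stated explicitly, so treat them as assumptions.

```python
# Weights inferred from the products in the example (assumption).
WEIGHTS = {"school": 0.2, "major": 0.3, "company": 0.2, "position": 0.3}

# Proportions quoted in the example; the school proportion is inferred.
PROPORTIONS = {
    "school": {"A": 0.5},
    "major": {"A": 0.5, "B": 0.5},
    "company": {"A": 0.17, "B": 0.33, "C": 0.5},
    "position": {"A": 0.5, "B": 0.5},
}

def attribute_scores(person):
    """Score each attribute as weight * proportion; unseen values score 0."""
    return {
        attr: round(WEIGHTS[attr] * PROPORTIONS[attr].get(person[attr], 0.0), 3)
        for attr in WEIGHTS
    }

# Personnel 1: A school, A major, B company, B position.
personnel_1 = {"school": "A", "major": "A", "company": "B", "position": "B"}
scores = attribute_scores(personnel_1)
total = round(sum(scores.values()), 3)
```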
  • the method provided by the foregoing embodiment may be used to train on the training sample data to obtain a resume evaluation model, but the method for obtaining the resume evaluation model is not limited thereto; any algorithm capable of obtaining a resume evaluation model from training sample data can be applied to the above steps, for example, the GBDT algorithm and the like.
  • Step S255: test the resume evaluation model to be tested using the test sample data, and, if it passes the test, confirm that the resume evaluation model to be tested is an accurate resume evaluation model.
  • the accuracy of the resume evaluation model to be tested is not necessarily high, because the training sample data is not necessarily comprehensive, or because noise data exists in the training sample data. Therefore, it is necessary to verify whether the resume evaluation model to be tested is accurate by inputting the test sample data into it.
  • the more data the training sample data contains, the higher the accuracy of the resume evaluation model.
  • in this way, the test sample data can be used to verify the generated resume evaluation model, which avoids the problem that suitable candidates may be wrongly screened out when real incoming resumes are used for verification.
  • the above data can be vectorized. That is, according to the above embodiment of the present application, the training sample data for generating the resume evaluation model to be tested may be data obtained by performing vector extraction and then feature organization on the vector-extracted data; and/or the test sample data for testing the resume evaluation model to be tested may be data obtained by performing vector extraction and then feature organization on the vector-extracted data.
  • the vector extraction of the training sample data and/or the test sample data may be the extraction of the data of the training sample data and/or the test sample data in one or more dimensions.
  • performing feature organization after vector extraction of the training sample data and/or the test sample data may mean uniformly processing data in the training sample data and/or test sample data that differs in form, format, presentation, and the like, but has the same meaning.
  • the purpose of the above steps is to unify the form of the data and solve the technical problem that the same data takes multiple forms because of data diversification and is therefore not easy for the algorithm to train on.
  • in the above step S255, the resume evaluation model to be tested is tested using the test sample data, and, if it passes the test, it is confirmed as an accurate resume evaluation model.
  • the model can be considered accurate if its outputs are consistent with the actual results.
  • step S255 may include:
  • Step S2551: input the test sample data into the resume evaluation model to be tested for verification, and output the test result.
  • Step S2553: if the error between the test result and the recruitment result corresponding to the test sample data is within a predetermined range, confirm that the resume evaluation model to be tested is an accurate resume evaluation model.
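  • Steps S2551 and S2553 can be sketched as a comparison between model outputs and known outcomes. The source does not define how the error is measured; the error rate, the 0.1 tolerance, the 0.4 cutoff, and the sample values below are all assumptions chosen for illustration.

```python
def passes_test(model, test_samples, max_error_rate=0.1):
    """S2551: run every test sample through the model.
    S2553: accept the model if the share of wrong outcomes is within tolerance."""
    wrong = sum(1 for features, hired in test_samples if model(features) != hired)
    error_rate = wrong / len(test_samples)
    return error_rate <= max_error_rate, error_rate

def threshold_model(features, cutoff=0.4):
    """Hypothetical model: hire if the summed attribute scores reach the cutoff."""
    return sum(features) >= cutoff

# Illustrative (features, known outcome) pairs; the values are assumptions.
samples = [
    ([0.1, 0.15, 0.1, 0.15], True),
    ([0.0, 0.15, 0.066, 0.15], False),
]
accepted, error_rate = passes_test(threshold_model, samples)
```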
  • the plurality of data in the test sample data are separately input into the resume evaluation model to be tested. The above-mentioned position being recruited for is a database maintenance position, and three people from the test sample data whose application results are already known are tested:
  • Tester 1 A school, C major, C company, B position, not accepted;
  • Tester 2 D school, A major, B company, A position, not accepted;
  • Tester 3 B school, A major, A company, A position, successfully accepted.
  • test results of the test sample data are calculated:
  • Tester 1 A school, C major, C company, B position;
  • the tester 1 scores 0.5*0.2, 0.3*0, 0.2*0.5, 0.3*0.5 on each attribute, that is, 0.1, 0, 0.1, and 0.15.
  • Tester 2 D school, A major, B company, A position;
  • the tester 2 scores 0*0.2, 0.3*0.5, 0.2*0.33, 0.3*0.5 on each attribute, namely 0, 0.15, 0.066 and 0.15.
  • Tester 3 B school, A major, D company, A position;
  • the tester 3 scores 0.2*0.17, 0.3*0.5, 0.2*0, 0.3*0.5 on each attribute, that is, 0.034, 0.15, 0, and 0.15.
  • the scores of the above three test samples in the different dimensions are summed: the tester 1 score is 0.35, the tester 2 score is 0.366, and the tester 3 score is 0.434. Only the score of tester 3 falls within the score range that qualifies for the position. Therefore, the result obtained by the above resume evaluation model is that tester 1 is not accepted, tester 2 is not accepted, and tester 3 is accepted, which is the same as the actual result, so the above resume evaluation model can be considered to have a high degree of accuracy.
  • the test sample data may also be used as training sample data, and the resume evaluation model trained further until it produces the same results as the known results of the test sample data.
  • performing vector extraction includes: Step S2555, performing vector extraction on one or more attributes of the candidate, wherein the one or more attributes include at least one of the following: a company name, a job title, a school name, and a major name;
  • performing feature organization on the vector-extracted data includes: Step S2557, normalizing the one or more attributes.
  • there are many ways to perform normalization. In an alternative embodiment, several different normalization methods are provided according to the nature of the company name, job title, school name, and major name. The normalization methods can be used separately or in combination. Normalization is not limited to these methods, and other normalization methods can achieve the same effect.
  • normalizing the company name includes: constructing an industry vocabulary and a place-name table; and extracting from the company name according to the industry words and place names in the industry vocabulary and the place-name table, to obtain the normalized company name.
  • a company name is roughly composed of a company location, a company trade name, a company industry, and a generic term, for example: Taobao (China) Software Co., Ltd. Based on this, an industry vocabulary and a place-name table are constructed and used to extract the trade name of the company.
  • in some cases the company name consists only of a place name and an industry word, for example: China Construction Engineering Corporation. In that case the place name and the company industry are extracted to normalize the company name.
  • a mapping table from company English names to company Chinese names, and a mapping table from subsidiaries to the parent companies to which they belong, are also required. When a company is given by its English name, the corresponding Chinese name is looked up in the English-to-Chinese mapping table to obtain the normalized company name.
  • when the company is a subsidiary, the parent company to which it belongs is looked up in the subsidiary-to-parent mapping table to obtain the normalized company name; for example, the company name Taobao is normalized to Alibaba.
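  • The company-name normalization described above might look like the sketch below, shown on English-language names for readability. The contents of the place-name table, industry vocabulary, generic-term list, and subsidiary-to-parent table are tiny hypothetical samples, and the tokenization is an assumption; an English-to-Chinese name lookup would be one more dictionary lookup and is omitted.

```python
import re

PLACE_NAMES = {"China", "Hangzhou"}                          # hypothetical place-name table
INDUSTRY_WORDS = {"Software", "Construction", "Engineering"}  # hypothetical industry vocabulary
GENERIC_TERMS = {"Co.", "Ltd.", "Corporation", "Inc."}        # hypothetical generic terms
PARENT_OF = {"Taobao": "Alibaba"}                             # subsidiary -> parent company

STOP_WORDS = PLACE_NAMES | INDUSTRY_WORDS | GENERIC_TERMS

def normalize_company(name):
    """Strip place, industry, and generic words; then map a subsidiary to its parent."""
    tokens = re.findall(r"[A-Za-z]+\.?", name)
    core = [t for t in tokens if t not in STOP_WORDS]
    if core:
        trade_name = " ".join(core)
    else:
        # The name consists only of place and industry words: keep those.
        trade_name = " ".join(t for t in tokens if t not in GENERIC_TERMS)
    return PARENT_OF.get(trade_name, trade_name)

taobao = normalize_company("Taobao (China) Software Co., Ltd.")
construction = normalize_company("China Construction Engineering Corporation")
```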
  • Normalizing the job title includes: confirming that in the historical recruitment data set, the job number whose occurrence number is greater than the preset number is described as the correct job title; constructing the job description in the resume text data and the correct job title by editing the distance The mapping vocabulary between the two; the regular description of the job description in the resume text data is matched in the mapping vocabulary to obtain the normalized result of the job title.
  • For example, a job description with more than 2000 occurrences is confirmed as an accurate job description. Among applications for the software development engineer position, the job description "software development engineer" occurs more than 2000 times, while descriptions of the same job such as "software design engineer" and "software engineer" occur no more than 2000 times. The job title is therefore confirmed as software development engineer, and all job titles such as software design engineer and software engineer are normalized to software development engineer. If both software development engineer and software design engineer occur more than 2000 times, the name that appears most frequently is taken as the job title.
  • To normalize all job titles such as software design engineer and software engineer to software development engineer, a mapping vocabulary between software design engineer, software engineer, and other names for the position on the one hand and software development engineer on the other is first constructed by edit distance, that is, all names used to characterize the software development engineer position are mapped to software development engineer; a regular expression then matches the job description in the resume text data against the mapping vocabulary to obtain the normalized result.
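  • A rough sketch of the job-title mapping follows. The counts, the distance cut-off `max_distance`, and the function names are assumptions made for illustration; the text only fixes the occurrence threshold (2000) and the use of edit distance. A plain dictionary lookup stands in for the regular-expression matching step.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def build_title_map(counts, min_count=2000, max_distance=12):
    """Map every observed title to a frequent ("correct") title close to it.

    counts: dict of job description -> number of occurrences.
    max_distance is a hypothetical cut-off that would need tuning.
    """
    canonical = [t for t, n in counts.items() if n > min_count]
    mapping = {}
    for title in counts:
        near = [c for c in canonical if edit_distance(title, c) <= max_distance]
        if near:
            # If several frequent titles are close, the most frequent one wins.
            mapping[title] = max(near, key=lambda c: counts[c])
    return mapping


counts = {
    "software development engineer": 2500,
    "software design engineer": 2100,
    "software engineer": 900,
}
for raw, normalized in build_title_map(counts).items():
    print(raw, "->", normalized)
```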
  • Normalizing the name of the school includes: ranking the school names in the resume text data in descending order of number of occurrences, and taking the school names within a preset ranking to obtain a basic dictionary; denoising the school name, and matching the denoised school name against the school names in the basic dictionary by regular-expression matching to obtain the normalized school name; using a synonym table to construct the short name corresponding to the school name according to preset rules, and recording the short name as the school name; normalizing a school name that contains a preset first suffix word to the corresponding first suffix word, wherein the preset first suffix words include at least: vocational and technical college, network college, adult education, self-study, and junior college; removing a second suffix word contained in the school name to generate a normalized school name, wherein the second suffix word is used to characterize a branch of the school; and removing the prefix in the school name to generate a normalized school name.
  • For example, all the school names appearing in the resume text data are ranked according to the number of occurrences, and the top 1000 school names are used to form the basic dictionary.
  • Denoising the school name is done by regular-expression matching, discarding the noise data in the school name; for example, Karlsruhl University (Germany) is processed as Karlsruhl University, and a noisy variant of Xiamen Ocean College is processed as Xiamen Ocean College, to eliminate the impact of noise on the normalization of the school name.
  • A school name that contains a preset first suffix word is normalized to the corresponding first suffix word; for example, Xiamen Xingcai Vocational and Technical College is normalized to Vocational and Technical College. The second suffix word, which characterizes a branch of the school, is removed from the school name; for example, Jiangsu University of Science and Technology Economic Management College is normalized to Jiangsu University of Science and Technology. The prefix is removed from the school name; for example, Jinling College of Nanjing University, Jiangsu province, is normalized to Jinling College of Nanjing University.
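  • The school-name rules can be sketched as below, using the English renderings of the examples in the text. The regular expressions, the suffix list, and the way a branch is detected (a basic-dictionary entry followed by more words) are assumptions for illustration; the synonym and short-name step is omitted. The province qualifier is stripped whether it appears before or after the name.

```python
import re
from collections import Counter

# English renderings of some of the first suffix words listed in the text.
FIRST_SUFFIXES = ["Vocational and Technical College", "Network College", "Adult Education"]


def build_base_dictionary(raw_names, top_n=1000):
    """Keep the top_n most frequent school names as the basic dictionary."""
    return [name for name, _ in Counter(raw_names).most_common(top_n)]


def denoise(name):
    """Drop parenthetical notes and stray digits, then collapse whitespace."""
    name = re.sub(r"\([^)]*\)", " ", name)
    name = re.sub(r"\d+", " ", name)
    return re.sub(r"\s+", " ", name).strip()


def normalize_school(name, base_dictionary):
    name = denoise(name)
    for suffix in FIRST_SUFFIXES:                 # first suffix word rule
        if name.endswith(suffix):
            return suffix
    name = re.sub(r"^\w+ Province\s+", "", name)  # leading province qualifier
    name = re.sub(r",\s*\w+ province$", "", name, flags=re.IGNORECASE)  # trailing one
    for school in sorted(base_dictionary, key=len, reverse=True):
        if name.startswith(school + " "):         # branch: keep the parent school
            return school
    return name


base = build_base_dictionary(
    ["Jiangsu University of Science and Technology"] * 3 + ["Nanjing University"] * 2
)
print(normalize_school("Xiamen Xingcai Vocational and Technical College", base))
print(normalize_school("Jiangsu University of Science and Technology Economic Management College", base))
print(normalize_school("Jinling College of Nanjing University, Jiangsu province", base))
```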
  • The normalization of the professional (major) name includes: constructing a professional classification table, and training a Bayesian model on the professional classification table; and classifying the majors in the resume text data according to the trained Bayesian model.
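  • As one possible reading of this step, the sketch below trains a multinomial naive Bayes classifier with add-one smoothing on a small classification table and uses it to assign a category to a major name. The table entries, the word-level tokenization, and the function names are hypothetical; the text does not specify which Bayesian model or features are used.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(classification_table):
    """classification_table: list of (major name, category) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for name, category in classification_table:
        tokens = name.lower().split()
        class_counts[category] += 1
        word_counts[category].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify_major(name, model):
    """Return the category with the highest posterior log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for category, n in class_counts.items():
        denom = sum(word_counts[category].values()) + len(vocab)
        score = math.log(n / total)                      # class prior
        for token in name.lower().split():               # add-one smoothed likelihood
            score += math.log((word_counts[category][token] + 1) / denom)
        if score > best_score:
            best, best_score = category, score
    return best


# Hypothetical classification table.
table = [
    ("computer science", "engineering"),
    ("software engineering", "engineering"),
    ("electrical engineering", "engineering"),
    ("business administration", "management"),
    ("financial management", "management"),
    ("human resource management", "management"),
]
model = train_naive_bayes(table)
print(classify_major("computer engineering", model))
```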
  • In the above embodiment, the matching between the JD (job description) text of the corresponding employer and the resume text in the resume text data may quantify the similarity of the texts based on the tf-idf and word2vec algorithms, or on variants derived from these two algorithms.
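  • To illustrate the tf-idf side of this matching, the sketch below builds tf-idf vectors for a job description and two resumes and compares them by cosine similarity. The smoothed idf formula, the whitespace tokenization, and the sample texts are assumptions; word2vec is not covered here.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {token: weight} dict per document, using a smoothed idf."""
    tokenized = [doc.lower().split() for doc in documents]
    n = len(tokenized)
    df = Counter(token for tokens in tokenized for token in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


jd = "java backend engineer database"
resumes = ["java engineer with database experience", "pastry chef with bakery experience"]
vectors = tfidf_vectors([jd] + resumes)
for text, vec in zip(resumes, vectors[1:]):
    print(round(cosine(vectors[0], vec), 3), text)
```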
  • FIG. 3 is a flow chart of an optional resume evaluation method according to Embodiment 1 of the present invention, showing the steps in one application scenario. With reference to FIG. 3, an example of the foregoing embodiment of the present application in an application scenario is described in detail below:
  • The data cleaning of the historical recruitment data set may mask resumes that did not pass evaluation from the historical recruitment data set, wherein such a resume may be a resume that did not pass because of the staffing headcount, a resume that was interviewed directly without resume evaluation and did not pass the interview, or a resume that did not pass evaluation because it was submitted repeatedly.
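  • A minimal sketch of this cleaning step, assuming each historical record carries a pass flag and a failure reason. The record layout and the reason codes are hypothetical; only the three categories of masked resumes come from the text.

```python
from dataclasses import dataclass

# Hypothetical reason codes for the three kinds of resume masked during cleaning.
MASKED_REASONS = {"headcount_full", "direct_interview_failed", "duplicate_submission"}


@dataclass
class ResumeRecord:
    resume_id: str
    passed: bool
    fail_reason: str = ""


def clean(records):
    """Drop records that failed for one of the masked reasons; keep the rest."""
    return [r for r in records if r.passed or r.fail_reason not in MASKED_REASONS]


history = [
    ResumeRecord("r1", True),
    ResumeRecord("r2", False, "headcount_full"),
    ResumeRecord("r3", False, "duplicate_submission"),
    ResumeRecord("r4", False, "skills_mismatch"),
]
print([r.resume_id for r in clean(history)])
```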
  • S33 The historical recruitment data set is divided into training sample data and test sample data.
  • Performing vector extraction on the training sample data may consist of extracting one or more attributes of the candidate from the training sample data, where the one or more attributes include at least one of the following: company name, job title, school name, professional name.
  • S35 Perform feature sorting on the extracted vector.
  • Feature sorting of the extracted vector may consist of normalizing the one or more attributes of the candidate.
  • The training sample data may be trained by using the GBDT algorithm to obtain a resume evaluation model to be tested, but the training algorithm for obtaining the resume evaluation model is not limited thereto.
  • the test sample data is input to the resume evaluation model to be tested. If the output result is the same as the recruitment result of the test sample data, the resume evaluation model to be tested may be considered as an accurate resume evaluation model.
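  • The split-and-test flow of the steps above can be sketched as follows. The split ratio, the fixed seed, the `max_error_rate` parameter, and the representation of a model as a plain callable are assumptions; any training algorithm (for example GBDT) could produce the callable.

```python
import random


def split_samples(samples, test_fraction=0.2, seed=0):
    """Shuffle and split (features, outcome) pairs into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def model_is_accurate(model, test_samples, max_error_rate=0.0):
    """Compare model output with the known recruitment result of each test sample."""
    if not test_samples:
        raise ValueError("no test samples")
    errors = sum(1 for features, outcome in test_samples if model(features) != outcome)
    return errors / len(test_samples) <= max_error_rate


# Toy data: the outcome is fully determined by one feature.
samples = [({"x": i}, i % 2 == 0) for i in range(10)]
train, test = split_samples(samples)
print(len(train), len(test))
print(model_is_accurate(lambda f: f["x"] % 2 == 0, test))
```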
  • The embodiment of the present application also provides a resume evaluation method as shown in FIG. 4.
  • FIG. 4 is a flow chart of a resume evaluation method according to Embodiment 2 of the present invention.
  • step S41 the resume to be evaluated is input.
  • the preset resume evaluation model may be any one of the resume evaluation models in Embodiment 1 of the present application.
  • Step S43 Acquire a resume evaluation result of the resume to be evaluated, wherein the resume evaluation result is made according to the resume evaluation model, and the resume evaluation model is established according to the data extracted from the historical recruitment data set.
  • The preset resume evaluation model in the above embodiment of the present application may be any one of the resume evaluation models in Embodiment 1, or may be a resume evaluation model other than those of Embodiment 1; any resume evaluation model obtained from historical recruitment data rather than social data can be applied to the present embodiment.
  • the method further includes:
  • Step S45: inputting a historical recruitment data set, wherein the data in the historical recruitment data set includes at least: a recruitment result corresponding to one or more attributes in the position, the one or more attributes being parameters in the resume text data used to characterize the candidate, and the recruitment result including at least: the number of occurrences of the one or more attributes in the position, and/or the number of times the one or more attributes are admitted to the position.
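  • The two counts that make up the recruitment result (occurrences and admissions of an attribute value for a position) could be accumulated as sketched below. The record layout and names are hypothetical.

```python
from collections import defaultdict


def recruitment_results(records):
    """Count, per (position, attribute, value), occurrences and admissions.

    records: iterable of (position, attributes dict, admitted flag).
    """
    stats = defaultdict(lambda: {"occurrences": 0, "admissions": 0})
    for position, attributes, admitted in records:
        for attribute, value in attributes.items():
            entry = stats[(position, attribute, value)]
            entry["occurrences"] += 1
            entry["admissions"] += int(admitted)
    return dict(stats)


records = [
    ("database maintenance", {"school": "A", "major": "A"}, True),
    ("database maintenance", {"school": "A", "major": "B"}, False),
]
for key, value in recruitment_results(records).items():
    print(key, value)
```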
  • After step S43, that is, after obtaining the resume evaluation result of the resume to be evaluated, the method further includes:
  • Step S451 displaying the resume evaluation result of the object to be evaluated in the preset area.
  • The evaluation result may be displayed according to preset display content.
  • In an optional embodiment, taking the resume evaluation model proposed in Embodiment 1 as an example, the user submits a resume to the preset resume evaluation system (where the resume evaluation system uses the preset resume evaluation model). The resume evaluation system uses the resume evaluation model to evaluate the user's resume, obtains the evaluation result for the user, and displays the result on the user's display terminal. At the same time, by evaluating the user's resume, the resume evaluation system can also determine positions suitable for the user, so in addition to displaying the user's resume evaluation result, it can also display suitable positions recommended for the user.
  • In another optional embodiment, still taking the resume evaluation model proposed in Embodiment 1 as an example, human resources personnel input the resumes of one or more job seekers into the preset resume evaluation system.
  • The resume evaluation system evaluates the resumes of the job seekers in sequence or in a preset order, obtains the evaluation results, and displays them on the display terminal of the human resources personnel. For example, the lists of job seekers who fit the position and who do not fit the position may be displayed in different colors or locations, and the human resources personnel can also view a job seeker's resume by clicking on the job seeker's name or by other operations.
  • The resume evaluation system can also search the resume database for job seekers who fit the position and recommend them to the human resources personnel.
  • the display manner of displaying the resume evaluation result according to the preset display content is not limited to any one of the display modes of the above embodiments.
  • FIG. 5 is a schematic diagram of an optional resume evaluation method according to Embodiment 2 of the present application. With reference to the example shown in the figure, the method provided in the foregoing Embodiment 2, which builds on the resume evaluation method provided in Embodiment 1 of the present application, is further explained below.
  • The method may include two stages. The first stage is a preparation stage, in which the server obtains historical recruitment data and trains on it using a machine learning method to obtain a resume evaluation model; the first stage is performed only once if the historical recruitment data does not change, or the resume evaluation model is updated at a fixed period. The second stage is the working stage, in which the user uses the resume evaluation model to perform evaluations; this stage is repeated many times.
  • The user inputs historical recruitment data to the server through a preset resume evaluation system. After receiving the historical recruitment data input by the user, the server uses a machine learning method to train on the historical recruitment data and obtain the resume evaluation model; this process may be the first stage of the resume evaluation method of the present application.
  • After the server generates the resume evaluation model, the user can use the resume evaluation model to evaluate resumes; this process may be the second stage of the resume evaluation method of the present application. The user inputs a new resume to be evaluated into the resume evaluation model, the server evaluates the new resume and obtains its evaluation result, and the user receives the evaluation result from the server and can display it, among other operations.
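  • The two stages can be pictured with the small sketch below, in which training happens when historical data is submitted and evaluation can then be called repeatedly. The class name, method names, and the toy training routine are hypothetical stand-ins for whatever machine learning method is used.

```python
class ResumeEvaluationServer:
    """Holds one trained model; stage 1 trains it, stage 2 reuses it."""

    def __init__(self, train_fn):
        self._train_fn = train_fn      # hypothetical training routine
        self._model = None

    def submit_history(self, historical_data):
        """Stage 1: train (or retrain) the model from historical recruitment data."""
        self._model = self._train_fn(historical_data)

    def evaluate(self, resume):
        """Stage 2: evaluate a new resume; may be called many times."""
        if self._model is None:
            raise RuntimeError("no resume evaluation model has been trained yet")
        return self._model(resume)


def toy_train(history):
    """Toy stand-in for training: accept schools seen among admitted resumes."""
    accepted = {resume["school"] for resume, admitted in history if admitted}
    return lambda resume: resume["school"] in accepted


server = ResumeEvaluationServer(toy_train)
server.submit_history([({"school": "A"}, True), ({"school": "B"}, False)])
print(server.evaluate({"school": "A"}), server.evaluate({"school": "B"}))
```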
  • the above-mentioned machine learning method in this embodiment may be any method for constructing a resume evaluation model in Embodiment 1 of the present application.
  • a resume evaluation apparatus for implementing the above-described resume evaluation method is further provided.
  • the apparatus includes: an acquisition module 60, an extraction module 62, a construction module 64, and an evaluation module 66.
  • the obtaining module 60 is configured to obtain a historical recruitment data set, wherein the historical recruitment data set includes at least: resume text data.
  • the source of the historical recruitment data set may be a historical recruitment data collection of the target recruiter, for example, the target recruiter is within a preset time (such as a recruitment season, or the last two recruitment seasons) Information about the people who participated in the recruitment, as well as information about the people who participated in the recruitment and obtained the position of the target recruitment unit.
  • the method of obtaining the historical recruitment data collection may be to obtain a historical recruitment data collection through the website's own database.
  • For example, the historical recruitment data set of the target recruiter for the past five years is obtained through the database. Because the recruiter's recruitment criteria may change over time (for example, the recruiter's requirements for academic qualifications may rise, or more attention may be paid to the candidate's work experience), all historical recruitment data for the past two years and part of the recruitment data from five years ago to three years ago are collected.
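  • One way to picture this selection is sketched below: keep everything from the most recent two years and a sample of the three years before that. The sampling fraction, the use of calendar years, and the record layout are assumptions; the text does not say how the older portion is chosen.

```python
import random


def select_history(records, current_year, older_fraction=0.5, seed=0):
    """records: list of (year, resume) pairs; returns the subset used for training."""
    rng = random.Random(seed)
    selected = []
    for year, resume in records:
        years_ago = current_year - year
        if 0 <= years_ago < 2:
            selected.append((year, resume))       # keep all recent data
        elif 2 <= years_ago < 5 and rng.random() < older_fraction:
            selected.append((year, resume))       # keep a sample of older data
    return selected


records = [(2017, "r1"), (2016, "r2"), (2014, "r3"), (2010, "r4")]
print(select_history(records, current_year=2017, older_fraction=1.0))
```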
  • The extracting module 62 is configured to extract data from the historical recruitment data set, wherein the extracted data includes at least: a recruitment result corresponding to one or more attributes in the position, the one or more attributes being parameters in the resume text data used to characterize the candidate.
  • As an optional implementation, the recruitment result may include: the number of occurrences of the one or more attributes in the position, and/or the number of times the one or more attributes are admitted to the position.
  • The recruitment result can also include content other than these two items; any information that is useful for recruiting personnel can be counted in the recruitment result.
  • the building module 64 is configured to construct a resume evaluation model by training the extracted data.
  • the evaluation module 66 is configured to perform resume evaluation on the received resume using the resume evaluation model.
  • the received resume is entered into the constructed resume model to obtain the output of the model.
  • The output of the model can be matched against the position expected by the candidate, and if the match is successful, a prompt is given.
  • For example, different colors for the candidate's name can be used to indicate different degrees of matching.
  • The data extracted from the historical recruitment data set is used to establish a resume evaluation model that builds prior knowledge of a given position, so that the analysis for the corresponding employer can focus on how well the candidate's overall strength matches the position, that is, on finding suitable resumes for different positions at different companies. This removes the need to analyze each candidate's behavior and social data, reduces the complexity of recruitment, and eliminates the tedious work of collecting candidates' behavior data on various social platforms, thereby reducing cost and improving efficiency and cost-effectiveness in the recruitment process.
  • the extraction module 62 may include:
  • the cleaning module 70 is configured to clean the historical recruitment data set, wherein the data cleaning is mainly used to shield the unsuccessful resume from the historical recruitment data set.
  • the first extraction sub-module 72 is configured to extract data from the cleaned historical recruitment data set.
  • The resume that did not pass evaluation includes at least one of the following: a resume that did not pass evaluation because of the staffing headcount, a resume that was interviewed directly without resume evaluation and did not pass the interview, and a resume that did not pass evaluation because it was submitted repeatedly.
  • the resume that has not passed the evaluation has been described in Embodiment 1, and will not be described again here.
  • the above building module 64 constructs a resume evaluation model by training the extracted data.
  • All the extracted data can be used to establish the resume evaluation model, and the resume evaluation model can then be verified using actually received resumes.
  • This method of processing requires real candidates' resumes for testing, and suitable candidates may be screened out by mistake.
  • As another option, the extracted data may be divided into two parts: one part is used to generate the resume evaluation model, and the other part is used to test the generated resume evaluation model.
  • In this case, the above building module 64 can include:
  • the classification module 80 is configured to divide the extracted data into training sample data and test sample data.
  • the generating module 82 is configured to perform training using the training sample data to generate a resume evaluation model to be tested.
  • The specific example is basically the same as that of Embodiment 1. Assume that a database maintenance position is to be filled and that 6 people have previously applied successfully for this position. The resumes of these 6 people are used for training:
  • Personnel 1 A school, A major, B company, B position;
  • Personnel 3 A school, B major, A company, A position;
  • Personnel 5 A school, A major, C company, B position;
  • Personnel 1 A school, A major, B company, B position;
  • the scores of the personnel 1 on each attribute are 0.5*0.2, 0.3*0.5, 0.2*0.33, 0.3*0.5, namely, 0.1, 0.15, 0.066 and 0.15;
  • the scores of the personnel 2 on each attribute are 0.2*0.17, 0.3*0.5, 0.2*0.5 and 0.3*0.5, respectively, namely 0.034, 0.15, 0.1 and 0.15;
  • Personnel 3 A school, B major, A company, A position;
  • the scores of the personnel 3 on each attribute are 0.2*0.5, 0.3*0.5, 0.2*0.17 and 0.3*0.5, respectively, namely 0.1, 0.15, 0.034 and 0.15;
  • the scores of the personnel 4 on each attribute are 0.2*0.33, 0.3*0.5, 0.2*0.33 and 0.3*0.5, respectively, namely 0.066, 0.15, 0.066 and 0.15;
  • Personnel 5 A school, A major, C company, B position;
  • the scores of the personnel 5 on each attribute are 0.2*0.5, 0.3*0.5, 0.2*0.5, and 0.3*0.5, namely, 0.1, 0.15, 0.1, and 0.15;
  • the scores of the persons 6 on the respective attributes are 0.2*0.33, 0.3*0.5, 0.2*0.5, and 0.3*0.5, respectively, which are 0.066, 0.15, 0.1, and 0.15.
  • The resulting score range is [0.433, 0.5]; a resume scoring lower than 0.433 is considered not to fit this position.
  • The age of the person can also be considered when recruiting for this position; if the age is within a predetermined range, the score can be increased.
  • For example, this position is a very important position, so a candidate over 40 years old can have the score increased by 0.07 points.
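  • The scoring in this worked example can be reproduced with the short sketch below. The weights (0.2 for school and company, 0.3 for major and position) and the frequency values are read off the products listed above; the function names and the way the age bonus is applied are assumptions.

```python
WEIGHTS = {"school": 0.2, "major": 0.3, "company": 0.2, "position": 0.3}

# Share of the 6 successful applicants holding each value, as used in the example.
FREQUENCIES = {
    "school": {"A": 0.5, "B": 0.17, "C": 0.33},
    "major": {"A": 0.5, "B": 0.5},
    "company": {"A": 0.17, "B": 0.33, "C": 0.5},
    "position": {"A": 0.5, "B": 0.5},
}


def score(resume, bonus=0.0):
    """Weighted sum of attribute frequencies; unseen values contribute 0."""
    total = sum(WEIGHTS[a] * FREQUENCIES[a].get(resume[a], 0.0) for a in WEIGHTS)
    return round(total + bonus, 3)


def fits_position(resume, age=None, low=0.433):
    """A resume fits if its score reaches the lower bound of the score range."""
    bonus = 0.07 if age is not None and age > 40 else 0.0
    return score(resume, bonus) >= low


personnel_1 = {"school": "A", "major": "A", "company": "B", "position": "B"}
personnel_3 = {"school": "A", "major": "B", "company": "A", "position": "A"}
personnel_5 = {"school": "A", "major": "A", "company": "C", "position": "B"}
for person in (personnel_1, personnel_3, personnel_5):
    print(score(person), fits_position(person))
```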
  • The method provided by the foregoing embodiment may be used to train on the training sample data to obtain a resume evaluation model, but the method for obtaining the resume evaluation model is not limited thereto; any algorithm capable of obtaining a resume evaluation model by training on sample data can be applied to the above steps, for example, the GBDT algorithm.
  • The confirmation module 84 is configured to test the resume evaluation model to be tested using the test sample data, and to confirm that the resume evaluation model to be tested is an accurate resume evaluation model when the test passes.
  • the classification module 80 includes:
  • the second extraction sub-module 90 is configured to perform vector extraction.
  • The arranging module 92 is configured to perform feature sorting on the vector-extracted data to obtain training sample data and/or test sample data.
  • the vector extraction of the training sample data and/or the test sample data may be the extraction of data of the training sample data and/or the test sample data in one or more dimensions.
  • The feature sorting may uniformly process data in the training sample data and/or the test sample data that have different forms, formats, display modes, and the like. The purpose of the above steps is to unify the data form and solve the technical problem that the same data takes various forms due to data diversity and is therefore difficult to train on algorithmically.
  • The above confirmation module 84 is used for confirmation. There are several ways to confirm whether the resume evaluation model is accurate. For example, the model can be considered accurate if the actual results are consistent with the results output by the resume evaluation model. As another optional embodiment, the above confirmation module 84 may include:
  • the verification module 100 is configured to input the test sample data into the resume evaluation model to be tested for verification, and output the test result.
  • The confirmation sub-module 102 is configured to confirm that the resume evaluation model to be tested is an accurate resume evaluation model if the error between the test result and the recruitment result corresponding to the test sample data is within a predetermined range.
  • In an optional embodiment, several items of the test sample data are separately input into the resume evaluation model to be tested. The position being recruited is again the database maintenance position, and three testers from the test sample data whose application results are already known are used for testing:
  • Tester 1 A school, C major, C company, B position, not accepted;
  • Tester 2 D school, A major, B company, A position, not accepted;
  • Tester 3 B school, A major, A company, A position, successfully admitted.
  • The weights of the school, company, major, and position obtained in the above embodiment are 0.2, 0.2, 0.3, and 0.3, respectively. The test results of the test sample data are:
  • Tester 1 A school, C major, C company, B position;
  • the tester 1 scores 0.5*0.2, 0.3*0, 0.2*0.5, 0.3*0.5 on each attribute, that is, 0.1, 0, 0.1, and 0.15.
  • Tester 2 D school, A major, B company, A position;
  • the tester 2 scores 0*0.2, 0.3*0.5, 0.2*0.33, 0.3*0.5 on each attribute, namely 0, 0.15, 0.066 and 0.15.
  • Tester 3 B school, A major, D company, A position;
  • the tester 3 scores 0.2*0.17, 0.3*0.5, 0.2*0, 0.3*0.5 on each attribute, that is, 0.034, 0.15, 0, and 0.15.
  • The scores of the above three test samples in the different dimensions were summed: tester 1 scored 0.35, tester 2 scored 0.366, and tester 3 scored 0.434. Only tester 3 fell within the score range that fits the position. Therefore, the result obtained by the above resume evaluation model is that tester 1 is not accepted, tester 2 is not accepted, and tester 3 is accepted, which is the same as the actual result, so the above resume evaluation model can be considered to have a high degree of accuracy.
  • the age problem was not considered.
  • If the test does not pass, the test sample data is used as training sample data, and the resume evaluation model is trained until it produces the same results as the recruitment results of the test sample data.
  • The second extraction sub-module 90 is configured to perform vector extraction on one or more attributes corresponding to the candidate, wherein the one or more attributes include at least one of the following: a company name, a job title, a school name, and a professional name; the collation module 92 includes a normalization module 112 for normalizing the one or more attributes.
  • the normalization module 112 may include at least one of the following:
  • the first sub-normalization module 120 is configured to construct an industry vocabulary and a local noun table; extract the company name according to industry nouns and place name nouns in the industry vocabulary and the local noun table, and obtain the normalized result of the company name, The company name is normalized.
  • The second sub-normalization module 122 is configured to: confirm that, in the historical recruitment data set, a job description whose number of occurrences is greater than a preset number is the correct job title; construct, by edit distance, a mapping vocabulary between the job descriptions in the resume text data and the correct job titles; and match the job description in the resume text data against the mapping vocabulary using regular expressions to obtain the normalized result of the job title, so as to normalize the job title.
  • The third sub-normalization module 124 is configured to: rank the school names in the resume text data in descending order of number of occurrences and take the school names within a preset ranking to obtain a basic dictionary; denoise the school name, and match the denoised school name against the school names in the basic dictionary by regular-expression matching to obtain the normalized school name; use a synonym table to construct the short name corresponding to the school name according to preset rules, and record the short name as the school name; normalize a school name that contains a preset first suffix word to the corresponding first suffix word, wherein the preset first suffix words include at least: vocational and technical college, network college, adult education, self-study, and junior college; remove a second suffix word contained in the school name to generate a normalized school name, wherein the second suffix word is used to characterize a branch of the school; and remove the prefix in the school name to generate a normalized school name, so as to normalize the school name.
  • the fourth sub-normalization module 126 is configured to construct a professional classification table and perform Bayesian model training through a professional classification table; classify the professional in the resume text data according to the trained Bayesian model to the professional name Perform normalization.
  • a resume evaluation apparatus for implementing the resume evaluation method in Embodiment 2 is further provided. As shown in FIG. 13, the apparatus includes: a first input module 130 and an acquisition module 132.
  • a first input module 130 configured to input a resume to be evaluated
  • an obtaining module 132, configured to acquire the resume evaluation result of the resume to be evaluated, wherein the resume evaluation result is made according to the resume evaluation model, and the resume evaluation model is established based on the data extracted from the historical recruitment data set.
  • The first input module 130 and the obtaining module 132 correspond to steps S41 to S43 in Embodiment 2; the modules share the same examples and application scenarios as the corresponding steps, but are not limited to the content disclosed in the above embodiment. It should be noted that the above modules can be operated as part of the device in the computer terminal 10 provided in the first embodiment.
  • the apparatus further includes: a second input module 140.
  • a second input module 140 configured to input a historical recruitment data set, where the data in the historical recruitment set includes at least: a recruitment result corresponding to one or more attributes in the position, the one or more attributes being the a parameter in the resume text data for characterizing the candidate characteristics, the recruitment result including at least: the number of occurrences of the one or more attributes on the position, and/or the one or more attributes in the position The number of admissions.
  • the foregoing second input module 140 corresponds to the step S45 in the embodiment 2, and the module is the same as the example and the application scenario implemented by the corresponding steps, but is not limited to the content disclosed in the first embodiment. It should be noted that the above module can be operated as part of the device in the computer terminal 10 provided in the first embodiment.
  • the device further includes: a display module 150, configured to display a resume evaluation result of the object to be evaluated in a preset area.
  • The above display module 150 corresponds to step S451 in Embodiment 2. This module shares the same examples and application scenarios as the corresponding step, but is not limited to the content disclosed in the first embodiment. It should be noted that the above module can be operated as part of the device in the computer terminal 10 provided in the first embodiment.
  • Embodiments of the present invention may provide a computer terminal, which may be any one of computer terminal groups.
  • the foregoing computer terminal may also be replaced with a terminal device such as a mobile terminal.
  • the foregoing computer terminal may be located in at least one network device of the plurality of network devices of the computer network.
  • The computer terminal may execute the program code of the following steps in the resume evaluation method: acquiring a historical recruitment data set, wherein the historical recruitment data set includes at least: resume text data; extracting data from the historical recruitment data set, wherein the extracted data includes at least: a recruitment result corresponding to one or more attributes in the position, the one or more attributes being parameters in the resume text data used to characterize the candidate, and the recruitment result including at least: the number of occurrences of the one or more attributes in the position, and/or the number of times the one or more attributes are admitted to the position; constructing the resume evaluation model by training on the extracted data; and performing resume evaluation on the received resume using the resume evaluation model.
  • FIG. 16 is a structural block diagram of a computer terminal according to an embodiment of the present invention.
  • the computer terminal A may include one or more (only one shown in the figure) processor 1601, memory 1603, and transmission device 1605.
  • The memory can be used to store software programs and modules, such as the program instructions/modules corresponding to the resume evaluation method and device in the embodiment of the present invention; the processor executes various functional applications and data processing, that is, implements the above-mentioned resume evaluation method, by running the software programs and modules stored in the memory.
  • the memory may include a high speed random access memory, and may also include non-volatile memory such as one or more magnetic storage devices, flash memory, or other non-volatile solid state memory.
  • the memory can further include memory remotely located relative to the processor, which can be connected to terminal A via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • The processor may invoke, through the transmission device, the information and the application program stored in the memory to perform the following steps: acquiring a historical recruitment data set, wherein the historical recruitment data set includes at least: resume text data; extracting data from the historical recruitment data set, wherein the extracted data includes at least: a recruitment result corresponding to one or more attributes in the position, the one or more attributes being parameters in the resume text data used to characterize the candidate, and the recruitment result including at least: the number of occurrences of the one or more attributes in the position, and/or the number of times the one or more attributes are admitted to the position; constructing the resume evaluation model by training on the extracted data; and performing resume evaluation on the received resume using the resume evaluation model.
  • the processor may further execute the following program code: cleaning the historical recruitment data set, wherein the data cleaning comprises: evaluating the failed resume from the historical recruitment data collection: from the post-cleaning history recruitment Extract data from the data set.
  • the foregoing processor may further execute the following program code: the staffing head count causes the unsuccessful resume to be evaluated, and the resume evaluation is performed without performing the resume evaluation. And the different resumes and resumes of the interviews are repeated, resulting in a resume that fails to pass the assessment.
  • the foregoing processor may further execute the following program code: divide the extracted data into training sample data and test sample data; use the training sample data to perform training to generate a resume evaluation model to be tested; and use the test sample data to treat The resume evaluation model of the test is tested, and the resume evaluation model to be tested is confirmed as an accurate resume evaluation model when the test is passed.
  • the foregoing processor may further execute the following program code: the training sample data for generating the resume evaluation model to be tested is: data obtained by performing vector extraction and performing feature extraction on the vector extracted data; and/or The test sample data for detecting the resume evaluation model to be tested is: data obtained by performing vector extraction and characterizing the data after vector extraction.
  • the processor may further execute the following program code: input test sample data into the resume evaluation model to be tested for testing, and output the test result; if the test result corresponds to the test sample data, the error of the recruitment result is Within the predetermined range, the resume evaluation model to be tested is confirmed to be an accurate resume evaluation model.
  • the foregoing processor may further execute the following program code: performing vector extraction comprises: performing vector extraction on one or more attributes of the candidate, wherein the one or more attributes include at least one of the following: a company name, Job title, school name, professional name; characterizing the data after vector extraction includes: normalizing one or more attributes.
  • the foregoing processor may further execute the following program code: normalizing the company name includes: constructing an industry vocabulary and a local noun table; according to the industry vocabulary and the local noun table The industry nouns and place names are used to extract the company name and get the normalized result of the company name; normalizing the job title includes: confirming that in the historical recruitment data set, the job descriptions with the number of occurrences greater than the preset number are correct.
  • Job title construct a mapping vocabulary between the job description in the resume text data and the correct job title by editing the distance; match the job description in the resume text data in the mapping vocabulary by the regular expression to obtain the job title Normalization result; normalization of the school name includes: arranging the school names in the resume text data according to the order of occurrence here, and obtaining the school name of the preset ranking to obtain the basic dictionary; Denoise the school name, and match the school name obtained by denoising with the school name in the basic dictionary by regular matching to obtain the normalized school name; use the synonym table to construct the school name according to the preset rules.
  • the short name of the school name will be recorded as the school
  • the name of the school containing the default first suffix in the school name is normalized to the corresponding first suffix word, wherein the preset first suffix words include at least: vocational and technical college, network college, adult education, self-test and Upgrade from the second suffix included in the school name to generate a normalized school name, where the second suffix is used to characterize the branch of the school; the prefix in the school name is removed to generate a normalized The name of the school; normalization of the professional name includes: constructing a professional classification table, and training the Bayesian model through the professional classification table; classifying the majors in the resume text data according to the trained Bayesian model.
  • the computer terminal can also be a smartphone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), a PAD, or the like.
  • the structure shown in FIG. 16 does not limit the structure of the above electronic device.
  • computer terminal A may also include more or fewer components (such as a network interface, display device, etc.) than shown in FIG. 16, or have a different configuration than that shown in FIG.
  • Embodiments of the present invention also provide a storage medium.
  • the foregoing storage medium may be used to save the program code executed by the resume evaluation method provided in Embodiment 1 above.
  • the foregoing storage medium may be located in any one of the computer terminal groups in the computer network, or in any one of the mobile terminal groups.
  • the storage medium is configured to store program code for performing the following steps: acquiring a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data; extracting data from the historical recruitment data set, wherein the data includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the candidate, and the recruitment results include at least the number of occurrences of the one or more attributes for the position and/or the number of admissions of the one or more attributes for the position; constructing a resume evaluation model by training on the extracted data; and evaluating a received resume using the resume evaluation model.
  • the foregoing storage medium is further configured to store program code for performing the following steps: cleaning the historical recruitment data set, wherein the data cleaning comprises masking resumes that failed evaluation from the historical recruitment data set; and extracting data from the cleaned historical recruitment data set.
  • the foregoing storage medium is further configured to store program code under which the resumes that failed evaluation include at least one of: resumes that failed evaluation because of staffing head count, resumes that went directly to interview without resume evaluation and failed the interview, and resumes that failed evaluation because they were submitted repeatedly.
  • the foregoing storage medium is further configured to store program code for performing the following steps: dividing the extracted data into training sample data and test sample data; training on the training sample data to generate a resume evaluation model to be tested; and testing the resume evaluation model to be tested with the test sample data, and confirming it as an accurate resume evaluation model when the test is passed.
  • the foregoing storage medium is further configured to store program code under which the training sample data used to generate the resume evaluation model to be tested is data obtained by performing vector extraction and then feature organization on the extracted vectors; and/or the test sample data used to test the resume evaluation model to be tested is data obtained by performing vector extraction and then feature organization on the extracted vectors.
  • the foregoing storage medium is further configured to store program code for performing the following steps: inputting the test sample data into the resume evaluation model to be tested and outputting a test result; and, if the error between the test result and the recruitment result corresponding to the test sample data is within a predetermined range, confirming the resume evaluation model to be tested as an accurate resume evaluation model.
  • the foregoing storage medium is further configured to store program code under which performing vector extraction comprises performing vector extraction on one or more attributes of the candidate, wherein the one or more attributes include at least one of: company name, job title, school name, and major name; and performing feature organization on the extracted vectors comprises normalizing the one or more attributes.
  • the foregoing storage medium is further configured to store program code under which normalizing the company name includes: constructing an industry vocabulary and a place-name vocabulary; and extracting the company name according to the industry terms and place names in those vocabularies to obtain the normalized company name; and normalizing the job title includes: confirming that job descriptions occurring more than a preset number of times in the historical recruitment data set are correct job titles; constructing, by edit distance, a mapping vocabulary between the job descriptions in the resume text data and the correct job titles; and matching the job descriptions in the resume text data against the mapping vocabulary by regular expression to obtain the normalized job title.
  • normalizing the school name includes: arranging the school names in the resume text data in descending order of number of occurrences and taking the school names within a preset ranking to obtain a base dictionary; denoising the school names and matching the denoised school names against the school names in the base dictionary by regular-expression matching to obtain normalized school names; using a synonym table to construct, according to preset rules, the abbreviation corresponding to each school name, and recording any name in which such an abbreviation appears as that school name; normalizing school names that contain a preset first suffix word to that first suffix word, wherein the preset first suffix words include at least: vocational and technical college, network college, adult education, self-study examination, and college-to-bachelor upgrade; removing second suffix words, which denote a branch of the school, from school names to generate normalized school names; and removing prefixes from school names to generate normalized school names.
  • normalizing the major name includes: constructing a major classification table and training a Bayesian model with it; and classifying the majors in the resume text data according to the trained Bayesian model.
  • the disclosed technical contents may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • the division of the unit is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • the software product includes a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the various embodiments of the present invention.
  • the foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Human Resources & Organizations (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

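The present invention discloses a resume evaluation method and device. The method includes: acquiring a historical recruitment data set; extracting data from the historical recruitment data set, wherein the data includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the candidate, and the recruitment results include at least the number of occurrences of the one or more attributes for the position and/or the number of admissions of the one or more attributes for the position; constructing a resume evaluation model by training on the extracted data; and evaluating received resumes using the resume evaluation model. The invention solves the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, which is complex, changeable, and hard to obtain, so that evaluation is difficult.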

Description

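Resume Evaluation Method and Device
Technical Field
The present invention relates to the field of data processing, and in particular to a resume evaluation method and device.
Background
In prior-art recruitment, many recruitment websites build a complete picture of a candidate from social networks, available behavioral data, and other sources, and try to understand the candidate from every angle in the hope of helping the recruiter find suitable talent. For example, "寻英网", a big-data recruitment platform, uses big-data algorithms to match talent precisely with company positions and offers practical functions such as "one-click synchronization of job requirements to mainstream recruitment websites". Its features include: (1) Full data, forming a multi-dimensional job-seeker profile. 寻英网 uses technology to collect a job seeker's social information, forum posts, published papers, and other data, extending a person from a flat picture into a film-like story and thereby forming a multi-dimensional job-seeker profile. (2) Personalization: dynamic, multi-dimensional analysis of how talent and companies develop, used to optimize a two-way matching engine. 寻英网 takes the career paths of twenty million people as its data source and, through analysis, builds a promotion graph covering promotion paths, relationships between positions, and so on.
A similar application is "人才雷达", which takes the large amount of data each person leaves on the network, such as life trajectories, social speech and behavior, and other personal information, and derives from it an interest graph, a personality portrait, and an ability assessment.
Existing solutions therefore mainly use social data, behavioral data, and the like to match job seekers with positions. Matching in this way has the following problems:
(1) The application goal is overly complex
Most prior-art application scenarios focus on matching the right person, with the emphasis on the attributes of the person. Achieving this goal requires analyzing a person's behavioral and social data and evaluating and profiling the person comprehensively, which greatly raises the requirements on data completeness and diversity.
(2) Data availability and completeness are limited
To find suitable employees, the prior art must collect data from many sources, so it places high demands on data availability and completeness, which in turn greatly limits the accuracy of the technique.
No effective solution has yet been proposed for the problem in the prior art that evaluating a target candidate through a comprehensive analysis of the target person's behavioral and social data makes evaluation difficult.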
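Summary of the Invention
Embodiments of the present invention provide a resume evaluation method and device, to at least solve the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, which is complex, changeable, and hard to obtain, so that evaluation is difficult.
According to one aspect of the embodiments of the present invention, a resume evaluation method is provided, including: acquiring a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data; extracting data from the historical recruitment data set, wherein the data includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the candidate, and the recruitment results include at least the number of occurrences of the one or more attributes for the position and/or the number of admissions of the one or more attributes for the position; and constructing a resume evaluation model by training on the extracted data.
According to one aspect of the embodiments of the present invention, a resume evaluation method is further provided, including: inputting a resume to be evaluated; and obtaining a resume evaluation result for the resume to be evaluated, wherein the resume evaluation result is produced by a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
According to another aspect of the embodiments of the present invention, a resume evaluation device is further provided, including: an acquisition module, configured to acquire a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data; an extraction module, configured to extract data from the historical recruitment data set, wherein the data includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the candidate, and the recruitment results include at least the number of occurrences of the one or more attributes for the position and/or the number of admissions of the one or more attributes for the position; a construction module, configured to construct a resume evaluation model by training on the extracted data; and an evaluation module, configured to evaluate received resumes using the resume evaluation model.
According to one aspect of the embodiments of the present invention, a resume evaluation device is further provided, including: a first input module, configured to input a resume to be evaluated; and an acquisition module, configured to obtain a resume evaluation result for the resume to be evaluated, wherein the resume evaluation result is produced by a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
It is easy to see that building the resume evaluation model from data extracted from the historical recruitment data set establishes prior knowledge about a given position, so that the analysis of a candidate can focus on how well the candidate's overall strength matches the position, that is, on finding suitable resumes for different positions at different companies. This removes the need to analyze each candidate's behavioral and social data and reduces the complexity of recruitment. It also avoids the tedious work of collecting candidates' behavioral data from various social platforms, further reducing the cost of the recruitment process and giving a better ratio of effect to cost.
The above solution provided by the present invention thus solves the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, which is complex, changeable, and hard to obtain, so that evaluation is difficult.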
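Brief Description of the Drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the present invention and their descriptions explain the invention and do not unduly limit it. In the drawings:
FIG. 1 is a block diagram of the hardware structure of a computer terminal for a resume evaluation method according to Embodiment 1 of the present invention;
FIG. 2 is a flowchart of an optional resume evaluation method according to Embodiment 1 of the present invention;
FIG. 3 is a flowchart of an optional resume evaluation method according to Embodiment 1 of the present invention;
FIG. 4 is a flowchart of a resume evaluation method according to Embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of an optional resume evaluation method according to Embodiment 2 of the present invention;
FIG. 6 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 3 of the present invention;
FIG. 7 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 3 of the present invention;
FIG. 8 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 3 of the present invention;
FIG. 9 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 3 of the present invention;
FIG. 10 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 3 of the present invention;
FIG. 11 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 3 of the present invention;
FIG. 12 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 3 of the present invention;
FIG. 13 is a schematic structural diagram of a resume evaluation device according to Embodiment 4 of the present invention;
FIG. 14 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 4 of the present invention;
FIG. 15 is a schematic structural diagram of an optional resume evaluation device according to Embodiment 4 of the present invention; and
FIG. 16 is a structural block diagram of a computer terminal according to an embodiment of the present invention.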
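Detailed Description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and the like in the specification, claims, and drawings of the present invention distinguish similar objects and do not necessarily describe a particular order or sequence. Data so used are interchangeable where appropriate, so that the embodiments described here can be carried out in orders other than those illustrated or described. In addition, the terms "include" and "have" and any variants of them are intended to cover non-exclusive inclusion: a process, method, system, product, or device that contains a series of steps or units is not limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
The following embodiments can be applied to an ordinary terminal such as a computer. They can also be applied to a server, which can be understood as a device made up of one or more computers, so the computer structure shown below also applies to a server. As the computing power of mobile terminals grows, the following embodiments can also be implemented on mobile terminals. The steps or modules in the following embodiments may also be carried out on different servers, terminals, or mobile terminals, provided that these exchange the necessary data with one another.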
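Embodiment 1
According to an embodiment of the present invention, an embodiment of a resume evaluation method is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
The method embodiment provided by Embodiment 1 of this application can be executed on a mobile terminal, a computer terminal, or a similar computing device. Taking a computer terminal as an example, FIG. 1 is a block diagram of the hardware structure of a computer terminal for a resume evaluation method according to Embodiment 1 of the present invention. As shown in FIG. 1, the computer terminal 10 may include one or more processors 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication. A person of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
The memory 104 can be used to store the software programs and modules of application software, such as the program instructions/modules corresponding to the resume evaluation method in the embodiments of the present invention. The processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, thereby implementing the above resume evaluation method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which can be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations of these.
The transmission device 106 is used to receive or send data over a network. A specific example of such a network is a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station and thereby communicate with the Internet. In another example, the transmission device 106 may be a radio frequency (RF) module used to communicate with the Internet wirelessly.
In the above operating environment, this application provides the resume evaluation method shown in FIG. 2. FIG. 2 is a flowchart of the resume evaluation method according to Embodiment 1 of the present invention.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes a number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The following steps of this embodiment use data sets. One or more pieces of data that undergo the same or similar processing, or that serve as the basis for some action or step, can be regarded as a data set.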
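Step S21: acquire a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data.
In this step, as an optional implementation, the historical recruitment data set may come from the target recruiter's own recruitment history, for example information on the people who took part in the target recruiter's recruitment within a preset period (such as the previous recruitment season), together with information on those who took part and obtained a position with the target recruiter. The historical recruitment data set may be obtained from the website's own database.
In an optional embodiment, the target recruiter's historical recruitment data for the last five years is obtained from the database as the data set. Because a recruiter's hiring criteria change over time (for example, its education requirements may rise, or it may place more weight on work experience), all of the historical recruitment data from the most recent two years and only part of the recruitment data from the third to fifth most recent years are taken into the data set.
Step S23: extract data from the historical recruitment data set, wherein the extracted data includes at least the recruitment results corresponding to one or more attributes for a position, and the one or more attributes are parameters in the resume text data that characterize the candidate. As an optional implementation, the recruitment results may include the number of occurrences of the one or more attributes for the position and/or the number of admissions of the one or more attributes for the position. The recruitment results may of course also include other content; any information that is useful for recruitment can be counted in the recruitment results.
In this step, the attributes that characterize the candidate may be parameters such as the candidate's education, major, and work experience.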
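The extraction in step S23 can be sketched as a simple tally. The following is a minimal illustration, not the patent's implementation: the record layout (`position`, `admitted`, `attributes`) and the function name `count_attribute_results` are assumptions introduced here.

```python
from collections import Counter

def count_attribute_results(records):
    """Tally, per (position, attribute, value), occurrences and admissions.

    The record layout is hypothetical; the source only says that the
    recruitment results include occurrence and admission counts.
    """
    occurrences = Counter()
    admissions = Counter()
    for record in records:
        for attribute, value in record["attributes"].items():
            key = (record["position"], attribute, value)
            occurrences[key] += 1
            if record["admitted"]:
                admissions[key] += 1
    return occurrences, admissions

records = [
    {"position": "database maintenance", "admitted": True,
     "attributes": {"education": "master", "major": "computer science"}},
    {"position": "database maintenance", "admitted": False,
     "attributes": {"education": "bachelor", "major": "computer science"}},
    {"position": "database maintenance", "admitted": True,
     "attributes": {"education": "master", "major": "statistics"}},
]
occurrences, admissions = count_attribute_results(records)
key = ("database maintenance", "major", "computer science")
print(occurrences[key], admissions[key])
```

Keeping the two counters separate leaves it open whether a later model uses raw counts, admission rates, or both.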
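Step S25: construct a resume evaluation model by training on the extracted data.
In an optional embodiment, the GBDT algorithm can be used to train on the extracted data and obtain the resume evaluation model. When training with GBDT, the data extracted from the historical recruitment data set in the preceding step is first used to build decision trees in one or more dimensions; the final model output is the accumulated value of the results of the multiple decision trees.
It should be noted here that the training algorithm may be GBDT but is not limited to it. The purpose of training is for the resume evaluation model to learn the extracted data, so that when it later receives the same or similar data it produces the same or a similar output. Training therefore needs a large amount of data, so that when the model is used to evaluate resumes, a candidate's resume falls within the data used for training. This application places no specific limit on the amount of training data, but preferably, the larger the amount and the wider the coverage of the extracted training data, the more accurate the resulting resume evaluation model.
Step S27: optionally, the resume evaluation method provided by the above embodiment of this application may further include the following step: evaluating received resumes using the resume evaluation model.
In an optional embodiment, a received resume is input into the constructed resume evaluation model to obtain the model's output. In another optional embodiment, the model's output can be matched against the position the candidate expects, and a prompt is given if the match succeeds. The prompt can take many forms; for example, candidates' names can be shown in different colors to indicate different degrees of match.
It should be noted here that the data extracted in step S23 includes the recruitment results corresponding to one or more attributes for a position, and the one or more attributes are parameters in the resume text data that characterize the candidate. It follows that when the extracted data is trained on in step S25 to construct the resume evaluation model, the candidate's behavioral and social data is not used, which avoids the loss of evaluation efficiency caused by behavioral and social data being complex, changeable, and hard to obtain. It should also be noted that the above solution uses the data in resumes, which is easy to obtain, and additionally uses the correspondence between candidate parameters (attributes) and recruitment results. Because resume data is fairly reliable, leaving out behavioral and social data does not lower evaluation accuracy; on the contrary, using the relationship between candidate parameters and recruitment results in the historical data improves the accuracy of resume evaluation.
It is easy to see that, through steps S21 to S27 above, building the resume evaluation model from data extracted from the historical recruitment data set establishes prior knowledge about a given position, so that the analysis of a candidate can focus on how well the candidate's overall strength matches the position, that is, on finding suitable resumes for different positions at different companies. This removes the need to analyze each candidate's behavioral and social data and reduces the complexity of recruitment. It also avoids the tedious work of collecting candidates' behavioral data from various social platforms, further reducing the cost of the recruitment process and giving a better ratio of effect to cost.
The solution provided by the above embodiment of the present invention thus solves the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, which is complex, changeable, and hard to obtain, so that evaluation is difficult.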
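According to the above embodiment of this application, step S23 extracts data from the historical recruitment data set. The data may be extracted from the entire set, but as an optional embodiment the data can first be filtered or cleaned to remove data that is considered likely to distort the result. For example, step S23 may include the following steps:
Step S231: clean the historical recruitment data set, wherein the data cleaning mainly serves to mask resumes that failed evaluation from the historical recruitment data set.
In this step, the purpose of cleaning the historical recruitment data set is to find and remove the noise data it contains.
Step S233: extract data from the cleaned historical recruitment data set.
These two steps make the extracted data more accurate and the resulting resume evaluation model better suited to its purpose.
Resumes that failed evaluation can be of many types. For example, they include at least one of the following:
resumes that failed evaluation because of staffing head count, resumes that went directly to interview without resume evaluation and failed the interview, and resumes that failed evaluation because they were submitted repeatedly.
Here, head count denotes the number of employees planned for a position, and/or the number of people to be recruited, as set by the recruiter's human resources department in view of the current needs of the position, its future development needs, and the company's overall plan.
In an optional embodiment, candidate A meets the recruiter's requirements, but the human resources department has a planned staffing level for the position A applied for, and hiring A might lead to redundant staff. Candidate A is therefore not hired, and A's resume is a resume that failed evaluation because of head count.
In another optional embodiment, candidate B's resume was for some reason never evaluated, and B went directly to the recruiter's interview for a position but did not pass it. Candidate B's resume is also regarded as a resume that failed evaluation.
In yet another optional embodiment, candidate C repeatedly submits a resume to the same position at the recruiter in different ways, for example through different recruitment websites. Candidate C's resume is also regarded as a resume that failed evaluation.
Which types of resume count as having failed evaluation can of course also be set according to the recruiter's actual needs.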
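The cleaning step S231 can be sketched as a filter over the historical records. This is a minimal illustration under stated assumptions: the field `rejection_reason` and its values are hypothetical labels for the three failure types named above, since the source does not say how these cases are recorded.

```python
# Hypothetical reason codes for the three failure types described in the text.
EXCLUDED_REASONS = {"head_count", "direct_interview_failed", "duplicate_submission"}

def clean_history(records):
    """Mask resumes whose failure says little about the resume itself."""
    return [r for r in records if r.get("rejection_reason") not in EXCLUDED_REASONS]

history = [
    {"candidate": "A", "admitted": False, "rejection_reason": "head_count"},
    {"candidate": "B", "admitted": False, "rejection_reason": "direct_interview_failed"},
    {"candidate": "C", "admitted": False, "rejection_reason": "duplicate_submission"},
    {"candidate": "D", "admitted": False, "rejection_reason": "resume_screening"},
    {"candidate": "E", "admitted": True, "rejection_reason": None},
]
cleaned = clean_history(history)
print([r["candidate"] for r in cleaned])
```

Holding the excluded reasons in a set makes it straightforward for a recruiter to adjust which types are masked, as the text allows.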
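According to the above embodiment of this application, step S25 constructs the resume evaluation model by training on the extracted data. In this step all of the extracted data can be used to build the model, and the model can then be checked against resumes actually received. That approach tests the model on real candidates' resumes and may wrongly screen out suitable candidates. As another optional implementation, the extracted data can be divided into two parts, one used to generate the resume evaluation model and one used to test the generated model. In such an embodiment, step S25 may include the following steps:
Step S251: divide the extracted data into training sample data and test sample data.
In this step, the training sample data contains multiple pieces of data and is used to construct the resume evaluation model by training; the test sample data likewise contains multiple pieces of data and is used to verify whether the model is accurate.
In an optional embodiment, the training sample data and the test sample data may be data in multiple dimensions of the historical recruitment data set.
It should be noted here that, to ensure the accuracy of the resume evaluation model, the data assigned to the training samples should be diverse in every dimension, so that a variety of data in every dimension is trained on and the model can learn it.
It should also be noted that, because the training sample data is used to build the resume evaluation model and the test sample data is used to check its accuracy, both are historical recruitment data for which the recruitment result is already known.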
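Step S251 can be sketched as a shuffled split of labelled records. The split ratio, the fixed seed, and the name `split_samples` are assumptions made for this illustration; the source does not specify how the division is made.

```python
import random

def split_samples(samples, test_ratio=0.25, seed=0):
    """Shuffle a copy of the labelled samples and cut off a test portion."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    test_size = max(1, int(len(shuffled) * test_ratio))
    return shuffled[test_size:], shuffled[:test_size]

# Every sample carries a known recruitment result, as the text requires.
samples = [{"id": i, "admitted": i % 2 == 0} for i in range(8)]
train, test = split_samples(samples)
print(len(train), len(test))
```

A plain random split does not by itself guarantee the per-dimension diversity the text asks for in the training data; a stratified split would be one way to enforce that.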
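Step S253: train on the training sample data to generate a resume evaluation model to be tested. An optional embodiment is described below.
In this optional embodiment, suppose the position to be filled is a database maintenance post and that 6 people have already been hired into it. The resumes of these 6 people are first used for training:
Person 1: School A, Major A, Company B, Position B;
Person 2: School B, Major A, Company C, Position B;
Person 3: School A, Major B, Company A, Position A;
Person 4: School C, Major B, Company B, Position A;
Person 5: School A, Major A, Company C, Position B;
Person 6: School C, Major B, Company C, Position A.
From this, the characteristics of people in the database maintenance post can be obtained:
School: School A appears 3 times in the 6 records, a share of 0.5; School B appears once, a share of 0.17; School C appears twice, a share of 0.33; other schools do not appear, a share of 0.
Major: Major A appears 3 times, a share of 0.5; Major B appears 3 times, a share of 0.5; other majors do not appear, a share of 0.
Company: Company A appears once, a share of 0.17; Company B appears twice, a share of 0.33; Company C appears 3 times, a share of 0.5; other companies do not appear, a share of 0.
Position: Position A appears 3 times, a share of 0.5; Position B appears 3 times, a share of 0.5; other positions have a share of 0.
Among the resumes of the 6 people already hired into the post there are 3 different schools and 3 different companies but only 2 different majors and 2 different positions. From the point of view of resume screening, major and position are therefore more important than school and company, by a factor of 1.5. During training, the weights of school, company, major, and position are accordingly 0.2, 0.2, 0.3, and 0.3.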
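These values are used to calculate each person's score:
Person 1: School A, Major A, Company B, Position B;
Person 1's scores on the individual attributes are 0.2*0.5, 0.3*0.5, 0.2*0.33, and 0.3*0.5, that is, 0.1, 0.15, 0.066, and 0.15;
Person 2: School B, Major A, Company C, Position B;
Person 2's scores on the individual attributes are 0.2*0.17, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.034, 0.15, 0.1, and 0.15;
Person 3: School A, Major B, Company A, Position A;
Person 3's scores on the individual attributes are 0.2*0.5, 0.3*0.5, 0.2*0.17, and 0.3*0.5, that is, 0.1, 0.15, 0.034, and 0.15;
Person 4: School C, Major B, Company B, Position A;
Person 4's scores on the individual attributes are 0.2*0.33, 0.3*0.5, 0.2*0.33, and 0.3*0.5, that is, 0.066, 0.15, 0.066, and 0.15;
Person 5: School A, Major A, Company C, Position B;
Person 5's scores on the individual attributes are 0.2*0.5, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.1, 0.15, 0.1, and 0.15;
Person 6: School C, Major B, Company C, Position A;
Person 6's scores on the individual attributes are 0.2*0.33, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.066, 0.15, 0.1, and 0.15.
This yields 6 vectors. With a fairly simple algorithm, the components can be summed for each person (other ways of combining the vectors can of course also be used): person 1 scores 0.466, person 2 scores 0.434, person 3 scores 0.434, person 4 scores 0.433, person 5 scores 0.5, and person 6 scores 0.466. The scores therefore range over [0.433, 0.5]; a resume scoring below 0.433 is considered not to fit the position.
It should be noted here that the above embodiment is for illustration only and uses just 6 training records; with broader training data, the score range becomes more reasonable.
The above gives the score for only one position. Scores for multiple positions, or for multiple different positions at multiple companies, can be obtained in the same way. The attributes in a job seeker's resume are then scored, and the person is considered to meet the requirements of whichever position's range the score falls into.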
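The worked example above can be reproduced with a short script. This is a sketch of the illustrative frequency-weighted scoring only, not of GBDT; the names `ATTRIBUTES`, `WEIGHTS`, `attribute_shares`, and `score` are introduced here for illustration. It uses unrounded shares (for example 1/6 instead of 0.17), so individual scores can differ from the figures in the text in the third decimal place.

```python
from collections import Counter

ATTRIBUTES = ("school", "major", "company", "position")
# Weights from the example: major and position count 1.5x school and company.
WEIGHTS = {"school": 0.2, "major": 0.3, "company": 0.2, "position": 0.3}

# The six hired resumes, as (school, major, company, position).
HIRED = [
    ("A", "A", "B", "B"),
    ("B", "A", "C", "B"),
    ("A", "B", "A", "A"),
    ("C", "B", "B", "A"),
    ("A", "A", "C", "B"),
    ("C", "B", "C", "A"),
]

def attribute_shares(resumes):
    """Share of hired resumes in which each attribute value appears."""
    shares = {}
    for index, attribute in enumerate(ATTRIBUTES):
        counts = Counter(resume[index] for resume in resumes)
        shares[attribute] = {value: n / len(resumes) for value, n in counts.items()}
    return shares

def score(resume, shares):
    """Weighted sum of the shares of the resume's attribute values."""
    return sum(
        WEIGHTS[attribute] * shares[attribute].get(resume[index], 0.0)
        for index, attribute in enumerate(ATTRIBUTES)
    )

shares = attribute_shares(HIRED)
scores = [score(resume, shares) for resume in HIRED]
low, high = min(scores), max(scores)
print([round(s, 3) for s in scores], round(low, 3), round(high, 3))
```

The resulting range is roughly [0.433, 0.5], matching the range stated in the example; unseen attribute values contribute a share of 0.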
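It should be noted here that the method of the above embodiment can be used to train on the training sample data and obtain a resume evaluation model, but it is not the only way. Any algorithm that can obtain a resume evaluation model from training sample data can be applied in this step, for example the GBDT algorithm.
Step S255: test the resume evaluation model to be tested with the test sample data, and confirm it as an accurate resume evaluation model if the test is passed.
Once the resume evaluation model to be tested has been obtained, its accuracy is not necessarily high, because the training sample data may not be comprehensive or may contain noise. Test sample data therefore needs to be input into the resume evaluation model to be tested to verify whether it is accurate.
It should also be noted here that the accuracy of the resume evaluation model increases with the amount of data contained in the training sample data.
With these steps, the generated resume evaluation model can be verified against existing data, avoiding the risk of wrongly screening out resumes that comes with verifying against live recruitment data.
To simplify calculation, the above data can be vectorized. That is, according to the above embodiment of this application, the training sample data used to generate the resume evaluation model to be tested may be data obtained by performing vector extraction and then feature organization on the extracted vectors; and/or the test sample data used to test the resume evaluation model to be tested may be data obtained by performing vector extraction and then feature organization on the extracted vectors.
Performing vector extraction on the training sample data and/or test sample data may mean extracting the data in one or more dimensions. Performing feature organization after vector extraction may mean unifying data that differs in form, format, display, and so on but has the same meaning. The purpose is to unify the form of the data, solving the technical problem that diverse data gives the same item many forms and makes it hard to train on with an algorithm.
According to the above embodiment of this application, step S255 tests the resume evaluation model to be tested with the test sample data and confirms it as an accurate resume evaluation model. There are several ways to decide whether the model is accurate. For example, the model may be considered accurate only if the actual results agree completely with the results the model outputs. As another optional embodiment, step S255 may include:
Step S2551: input the test sample data into the resume evaluation model to be tested and output a test result.
Step S2553: if the error between the test result and the recruitment result corresponding to the test sample data is within a predetermined range, confirm the resume evaluation model to be tested as an accurate resume evaluation model.
These steps allow a certain error. Such an error may arise because too little data was extracted, but it does not affect the judgment of the recruitment result and is within the acceptable range.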
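In an optional embodiment, the individual records in the test sample data are input into the resume evaluation model to be tested. Continuing the example of the database maintenance post, 3 people in the test sample data whose application results are already known are used for testing:
Test candidate 1: School A, Major C, Company C, Position B, not admitted;
Test candidate 2: School D, Major A, Company B, Position A, not admitted;
Test candidate 3: School B, Major A, Company C, Position A, admitted.
Using the weights for school, company, major, and position obtained in the above embodiment (0.2, 0.2, 0.3, and 0.3), the test results for the test sample data are calculated:
Test candidate 1: School A, Major C, Company C, Position B;
Test candidate 1's scores on the individual attributes are 0.2*0.5, 0.3*0, 0.2*0.5, and 0.3*0.5, that is, 0.1, 0, 0.1, and 0.15.
Test candidate 2: School D, Major A, Company B, Position A;
Test candidate 2's scores on the individual attributes are 0.2*0, 0.3*0.5, 0.2*0.33, and 0.3*0.5, that is, 0, 0.15, 0.066, and 0.15.
Test candidate 3: School B, Major A, Company C, Position A;
Test candidate 3's scores on the individual attributes are 0.2*0.17, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.034, 0.15, 0.1, and 0.15.
Summing the scores of the 3 test samples over the dimensions gives 0.35 for test candidate 1, 0.366 for test candidate 2, and 0.434 for test candidate 3. Only test candidate 3 falls within the score range that fits the position. The resume evaluation model therefore predicts that test candidates 1 and 2 are not admitted and test candidate 3 is admitted, which agrees with the actual results, so the model can be considered to have high accuracy.
It should be noted here that when a test result differs from the known result for a test sample, that test sample is used as training sample data and the resume evaluation model is trained further, until the test result agrees with the known result for the test sample.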
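The check in steps S2551 and S2553 can be sketched against the example's figures. This illustration takes the rounded shares and the score range [0.433, 0.5] from the worked example as given. Test candidate 3 is taken to have worked at Company C, an assumption chosen because it is the value that reproduces the stated score of 0.434. The function names and the use of an error rate as the "predetermined range" are likewise assumptions.

```python
# Rounded shares and weights as stated in the worked example.
SHARES = {
    "school": {"A": 0.5, "B": 0.17, "C": 0.33},
    "major": {"A": 0.5, "B": 0.5},
    "company": {"A": 0.17, "B": 0.33, "C": 0.5},
    "position": {"A": 0.5, "B": 0.5},
}
WEIGHTS = {"school": 0.2, "major": 0.3, "company": 0.2, "position": 0.3}
SCORE_RANGE = (0.433, 0.5)

# (resume, known recruitment result) pairs from the test sample data.
TEST_SAMPLES = [
    ({"school": "A", "major": "C", "company": "C", "position": "B"}, False),
    ({"school": "D", "major": "A", "company": "B", "position": "A"}, False),
    ({"school": "B", "major": "A", "company": "C", "position": "A"}, True),
]

def score(resume):
    """Weighted sum of shares; unseen values contribute 0."""
    return sum(WEIGHTS[a] * SHARES[a].get(resume[a], 0.0) for a in WEIGHTS)

def predict(resume, tolerance=1e-9):
    """Predict admission when the score falls inside the learned range."""
    low, high = SCORE_RANGE
    return low - tolerance <= score(resume) <= high + tolerance

def model_is_accurate(samples, max_error_rate=0.0):
    """Accept the model if its error rate on the samples is within the bound."""
    errors = sum(predict(resume) != admitted for resume, admitted in samples)
    return errors / len(samples) <= max_error_rate

print([round(score(resume), 3) for resume, _ in TEST_SAMPLES])
print(model_is_accurate(TEST_SAMPLES))
```

Raising `max_error_rate` above 0 corresponds to the embodiment that tolerates a bounded error instead of requiring exact agreement.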
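According to the above embodiment of this application, performing vector extraction includes: step S2555, performing vector extraction on one or more attributes of the candidate, wherein the one or more attributes include at least one of: company name, job title, school name, and major name;
performing feature organization on the extracted vectors includes: step S2557, normalizing the one or more attributes.
Normalization can be done in many ways. In an optional embodiment, several different normalization methods are provided according to the differing nature of company names, job titles, school names, and major names. These methods can be used separately or in combination. Normalization is not limited to them; other normalization methods can achieve the same effect.
Normalizing company names
Normalizing the company name includes: constructing an industry vocabulary and a place-name vocabulary; and extracting the company name according to the industry terms and place names in those vocabularies to obtain the normalized company name.
In an optional embodiment, a company name is taken to consist roughly of four parts: company location, company name proper, company industry, and generic words, as in 淘宝(中国)软件有限公司 (Taobao (China) Software Co., Ltd.). On this basis, an industry vocabulary and a place-name vocabulary are constructed to extract the company name.
In another optional embodiment, the company name consists only of a place name and industry words, as in 中国建筑工程总公司 (China State Construction Engineering Corporation). For such companies, from which no distinct company name can be extracted, the place name and the company industry are extracted to normalize the company name.
In yet another optional embodiment, a mapping vocabulary between English and Chinese company names, and a mapping vocabulary between subsidiaries and their parent companies, also need to be constructed. When a company appears under an English name, the corresponding Chinese name is looked up in the English-Chinese mapping vocabulary to obtain the normalized company name; for example, the company name alibaba is normalized to 阿里巴巴. When the company name is that of a subsidiary, its parent company is looked up in the subsidiary-parent mapping vocabulary to obtain the normalized company name; for example, the company name 淘宝 is normalized to 阿里巴巴.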
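The company-name normalization described above can be sketched with small lookup tables. The vocabularies below are tiny hypothetical samples, and the function `normalize_company` and its fallback order are assumptions for illustration; the source does not give the actual tables or matching procedure.

```python
import re

# Tiny illustrative vocabularies; real tables would be far larger.
PLACES = ["中国", "北京", "杭州"]
INDUSTRIES = ["软件", "建筑工程", "网络技术"]
GENERIC_WORDS = ["有限公司", "总公司", "公司", "集团"]  # longer words first
ENGLISH_TO_CHINESE = {"alibaba": "阿里巴巴"}
SUBSIDIARY_TO_PARENT = {"淘宝": "阿里巴巴"}

def normalize_company(raw):
    """Strip place, industry, and generic words; then map aliases to parents."""
    name = raw.strip()
    name = ENGLISH_TO_CHINESE.get(name.lower(), name)
    core = re.sub(r"[（(）)]", "", name)  # drop half- and full-width brackets
    found_places = [p for p in PLACES if p in core]
    found_industries = [i for i in INDUSTRIES if i in core]
    for word in PLACES + INDUSTRIES + GENERIC_WORDS:
        core = core.replace(word, "")
    if not core:
        # No distinct company name: fall back to place plus industry.
        core = "".join(found_places + found_industries)
    return SUBSIDIARY_TO_PARENT.get(core, core)

print(normalize_company("淘宝(中国)软件有限公司"))
print(normalize_company("中国建筑工程总公司"))
print(normalize_company("Alibaba"))
```

Simple substring removal is enough for these examples; a production version would need to guard against vocabulary words that also occur inside company names proper.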
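Normalizing job titles
Normalizing the job title includes: confirming that job descriptions occurring more than a preset number of times in the historical recruitment data set are correct job titles; constructing, by edit distance, a mapping vocabulary between the job descriptions in the resume text data and the correct job titles; and matching the job descriptions in the resume text data against the mapping vocabulary by regular expression to obtain the normalized job title.
In an optional embodiment, taking the preset number to be 2000, job descriptions that occur more than 2000 times are confirmed as accurate job descriptions. For example, among resumes applying for the post of software development engineer, the post is described as 软件开发工程师 (software development engineer) more than 2000 times, while descriptions of the same post such as 软件设计工程师 (software design engineer) and 软件工程师 (software engineer) each occur no more than 2000 times. The accurate name of the post is then confirmed to be 软件开发工程师, and all posts named 软件设计工程师 or 软件工程师 are normalized to 软件开发工程师. If both 软件开发工程师 and 软件设计工程师 occur more than 2000 times and are confirmed to be the same post, the name that occurs most often is taken as the name of the post.
In another optional embodiment, to normalize all posts named 软件设计工程师 or 软件工程师 to 软件开发工程师, a mapping vocabulary is first constructed by edit distance between 软件开发工程师 and the other names in the resume text data that denote the same post, so that every name denoting the post of software development engineer is mapped to 软件开发工程师. The job descriptions in the resume text data are then matched against the mapping vocabulary by regular expression to obtain the normalized result.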
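The three parts of job-title normalization (frequency threshold, edit-distance mapping, regular-expression matching) can be sketched as follows. The thresholds `min_count` and `max_distance`, the nearest-title rule, and the function names are assumptions for illustration; the source states only that edit distance and regular expressions are used.

```python
import re
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance by dynamic programming over two rows."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def build_title_mapping(titles, min_count, max_distance):
    """Map rare titles to the nearest frequent ("correct") title."""
    counts = Counter(titles)
    canonical = [t for t, n in counts.items() if n > min_count]
    mapping = {t: t for t in canonical}
    for title in counts:
        if title in mapping or not canonical:
            continue
        nearest = min(canonical, key=lambda c: edit_distance(title, c))
        if edit_distance(title, nearest) <= max_distance:
            mapping[title] = nearest
    return mapping

def normalize_title(text, mapping):
    """Find a known title in free text and return its canonical form."""
    if not mapping:
        return None
    pattern = "|".join(re.escape(t) for t in sorted(mapping, key=len, reverse=True))
    match = re.search(pattern, text)
    return mapping[match.group(0)] if match else None

titles = ["软件开发工程师"] * 3 + ["软件设计工程师", "软件工程师"] + ["会计"] * 3
mapping = build_title_mapping(titles, min_count=2, max_distance=2)
print(normalize_title("应聘软件工程师一职", mapping))
```

With the text's threshold, `min_count` would be 2000; the small value here only keeps the example short.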
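Normalizing school names
Normalizing the school name includes: arranging the school names in the resume text data in descending order of number of occurrences and taking the school names within a preset ranking to obtain a base dictionary; denoising the school names and matching the denoised school names against the school names in the base dictionary by regular-expression matching to obtain normalized school names; using a synonym table to construct, according to preset rules, the abbreviation corresponding to each school name, and recording any name in which such an abbreviation appears as that school name; normalizing school names that contain a preset first suffix word to that first suffix word, wherein the preset first suffix words include at least: 职业技术学院 (vocational and technical college), 网络学院 (network college), 成教 (adult education), 自考 (self-study examination), and 专升本 (college-to-bachelor upgrade); removing second suffix words, which denote a branch of the school, from school names to generate normalized school names; and removing prefixes from school names to generate normalized school names.
In an optional embodiment, all school names appearing in the resume text data are ranked from most to least frequent, and the 1000 most frequent school names form the base dictionary. Once the base dictionary is built, the school names are denoised and then matched against the base dictionary to obtain the normalized school names.
In another optional embodiment, denoising the school name means discarding the noise in it by regular-expression matching. For example, 卡尔斯鲁尔大学(德国) is processed into 卡尔斯鲁尔大学, and 厦门海洋学院(招统否是y招统否是) is processed into 厦门海洋学院, removing the effect of noise on school-name normalization.
In yet another optional embodiment, school names containing a preset first suffix word are normalized to that first suffix word; for example, 厦门市兴才职业技术学院 is normalized to 职业技术学院. Second suffix words, which denote a branch of the school, are removed from school names; for example, 江苏科技大学经济管理学院 is normalized to 江苏科技大学. Prefixes are removed from school names; for example, 江苏省南京大学金陵学院 is normalized to 南京大学金陵学院.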
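The school-name rules above can be sketched as separate functions, one per rule. The regular expressions are assumptions chosen to reproduce the examples in the text; the source does not give the actual patterns, and the synonym-table abbreviation step is not shown.

```python
import re
from collections import Counter

FIRST_SUFFIX_WORDS = ["职业技术学院", "网络学院", "成教", "自考", "专升本"]

def build_base_dictionary(names, top_n):
    """Keep the top_n most frequent school names as the base dictionary."""
    return [name for name, _ in Counter(names).most_common(top_n)]

def denoise(name):
    """Drop bracketed noise, in half- or full-width brackets."""
    return re.sub(r"[（(].*?[）)]", "", name).strip()

def to_first_suffix(name):
    """Collapse names containing a preset first suffix word to that word."""
    for word in FIRST_SUFFIX_WORDS:
        if word in name:
            return word
    return name

def strip_branch(name):
    """Remove a trailing college (branch) after the university name."""
    match = re.match(r"^(.+?大学).+学院$", name)
    return match.group(1) if match else name

def strip_prefix(name):
    """Remove a leading province prefix such as 江苏省."""
    return re.sub(r"^.{2,3}省", "", name)

print(denoise("卡尔斯鲁尔大学(德国)"))
print(to_first_suffix("厦门市兴才职业技术学院"))
print(strip_branch("江苏科技大学经济管理学院"))
print(strip_prefix("江苏省南京大学金陵学院"))
```

The rules are kept separate because they interact: applying `strip_branch` after `strip_prefix` would reduce 南京大学金陵学院 further to 南京大学, which differs from the example in the text, so the order and combination of rules would need to be decided by the implementer.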
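It should be noted here that a table of world-famous universities can also be built. If a school name occurs very rarely, it is first necessary to check whether the school is a world-famous university, which can be done by looking the name up in that table.
Normalizing major names
Normalizing the major name includes: constructing a major classification table and training a Bayesian model with it; and classifying the majors in the resume text data according to the trained Bayesian model.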
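Major classification with a Bayesian model can be sketched as a small multinomial naive Bayes classifier. The class below, its word-level tokenization, the add-one smoothing, and the sample classification table are all assumptions for illustration; the source says only that a Bayesian model is trained from a major classification table.

```python
import math
from collections import Counter, defaultdict

class NaiveBayesMajorClassifier:
    """Multinomial naive Bayes over word tokens with add-one smoothing."""

    def fit(self, labelled_majors):
        self.class_counts = Counter()
        self.token_counts = defaultdict(Counter)
        self.vocabulary = set()
        for name, category in labelled_majors:
            tokens = name.lower().split()
            self.class_counts[category] += 1
            self.token_counts[category].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, name):
        # Tokens never seen in training carry no evidence and are skipped.
        tokens = [t for t in name.lower().split() if t in self.vocabulary]
        total = sum(self.class_counts.values())
        best_category, best_score = None, -math.inf
        for category, count in self.class_counts.items():
            denominator = sum(self.token_counts[category].values()) + len(self.vocabulary)
            log_prob = math.log(count / total)
            for token in tokens:
                log_prob += math.log((self.token_counts[category][token] + 1) / denominator)
            if log_prob > best_score:
                best_category, best_score = category, log_prob
        return best_category

# A hypothetical major classification table: (major name, category).
classification_table = [
    ("computer science", "engineering"),
    ("software engineering", "engineering"),
    ("accounting", "business"),
    ("financial management", "business"),
]
classifier = NaiveBayesMajorClassifier().fit(classification_table)
print(classifier.predict("computer engineering"))
print(classifier.predict("financial accounting"))
```

For Chinese major names, character or character n-gram tokens would be a natural replacement for whitespace splitting.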
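It should be noted here that, for matching job description (JD) text against candidates' resume text in the above embodiment, text similarity can be quantified with the tfidf and word2vec algorithms or with methods derived from them.
FIG. 3 shows the steps in one application scenario; FIG. 3 is a flowchart of an optional resume evaluation method according to Embodiment 1 of the present invention. As shown in FIG. 3, an example of the above embodiment of this application in one application scenario is described in detail below:
S31: import the historical recruitment data set.
The historical recruitment data sets of one or more recruiters are obtained; the historical data includes at least resume text data.
S32: clean the historical recruitment data set.
In this step, cleaning the historical recruitment data set may mean masking resumes that failed evaluation from the set, where such resumes may be resumes that failed evaluation because of staffing head count, resumes that went directly to interview without resume evaluation and failed the interview, and resumes that failed evaluation because they were submitted repeatedly.
S33: divide the historical recruitment data set into training sample data and test sample data.
S34: perform vector extraction on the training sample data.
In this step, vector extraction on the training sample data may mean extracting one or more attributes of the candidates in the training sample data, the one or more attributes including at least one of: company name, job title, school name, and major name.
S35: perform feature organization on the extracted vectors.
In this step, feature organization on the extracted vectors may mean normalizing one or more attributes of the candidate.
S36: train the model on the training sample data.
In this step, the GBDT algorithm can be used to train on the training sample data and obtain the resume evaluation model to be tested, but the training algorithm is not limited to GBDT.
S37: perform feature organization on the test sample data.
S38: obtain the resume evaluation model to be verified through model training on the training sample data.
S39: output the test result.
The test sample data is input into the resume evaluation model to be tested. If the output agrees with the recruitment results of the test sample data, the resume evaluation model to be tested can be considered an accurate resume evaluation model.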
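Embodiment 2
An embodiment of this application further provides the resume evaluation method shown in FIG. 4. FIG. 4 is a flowchart of the resume evaluation method according to Embodiment 2 of the present invention.
Step S41: input a resume to be evaluated.
In this step, the preset resume evaluation model may be any of the resume evaluation models in Embodiment 1 of this application.
Step S43: obtain the resume evaluation result for the resume to be evaluated, wherein the resume evaluation result is produced by a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
It should be noted here that the preset resume evaluation model in the above embodiment of this application may be any of the resume evaluation models in Embodiment 1, or a resume evaluation model other than those of Embodiment 1. Any resume evaluation model obtained from historical recruitment data instead of social data can be applied in this embodiment.
It is easy to see that building the resume evaluation model from data extracted from the historical recruitment data set establishes prior knowledge about a given position, so that the analysis of a candidate can focus on how well the candidate's overall strength matches the position, that is, on finding suitable resumes for different positions at different companies. This removes the need to analyze each candidate's behavioral and social data and reduces the complexity of recruitment. It also avoids the tedious work of collecting candidates' behavioral data from various social platforms, further reducing the cost of the recruitment process and giving a better ratio of effect to cost. Evaluating resumes with the above resume evaluation model therefore solves the technical problem in the prior art that candidates are evaluated through a comprehensive analysis of their behavioral and social data, which is complex, changeable, and hard to obtain, so that evaluation is difficult.
In the above embodiment of this application, before step S43, the method further includes:
Step S45: input a historical recruitment data set, wherein the data in the historical recruitment data set includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the candidate, and the recruitment results include at least the number of occurrences of the one or more attributes for the position and/or the number of admissions of the one or more attributes for the position.
In the above embodiment of this application, after the resume evaluation result for the resume to be evaluated is obtained in step S43, the method further includes:
Step S451: display the resume evaluation result of the object to be evaluated in a preset area.
In this step, once the resume evaluation result has been obtained, it can be displayed according to preset display content.
In an optional embodiment, taking the resume evaluation model proposed in Embodiment 1 as an example, a user inputs their resume into a preset resume evaluation system (which uses a preset resume evaluation model), or fills it in using a form the system provides. The system evaluates the user's resume with the resume evaluation model, obtains the user's evaluation result, and displays it on the display terminal the user is using. Because evaluating the resume also tells the system which positions suit the user, the system can display recommended positions alongside the evaluation result.
In another optional embodiment, again taking the resume evaluation model proposed in Embodiment 1 as an example, a human resources manager inputs one or more job seekers' resumes into the preset resume evaluation system. The system evaluates the resumes one by one or in a preset order, obtains the evaluation results, and displays them on the manager's display terminal. For example, the lists of job seekers who fit and do not fit the position can be shown in different colors or positions, and the manager can view a job seeker's full resume by clicking the name or through another operation. The system can also search the resume database for job seekers who fit the position and recommend them to the manager.
It should be noted here that the way the resume evaluation result is displayed according to preset display content is not limited to any of the display methods in the above embodiments.
FIG. 5 is a schematic diagram of an optional resume evaluation method according to Embodiment 2 of this application. The method provided by Embodiment 2 is further described below with reference to the example in FIG. 5, building on the resume evaluation method provided by Embodiment 1 of this application.
First, it should be noted that the method may include two phases. The first is the preparation phase, in which the server obtains historical recruitment data and trains on it with a learning method to obtain the resume evaluation model. This phase is carried out only once as long as the historical recruitment data does not change, or the model is updated at a fixed interval. The second is the working phase, in which users use the resume evaluation model to evaluate resumes; this phase is repeated many times.
In an optional embodiment, the user inputs historical recruitment data to the server through a preset resume evaluation system. After receiving it, the server trains on the historical recruitment data with a machine learning method to obtain the resume evaluation model; this can be the first phase of the resume evaluation method of this application. Once the server has generated the resume evaluation model, the user can use it to evaluate resumes; this can be the second phase. The user inputs a new resume to be evaluated into the resume evaluation model, the server evaluates it and obtains the evaluation result, and the user receives the result and can display it or perform other operations on it.
It should be noted here that the machine learning method in this embodiment may be any of the methods for constructing a resume evaluation model in Embodiment 1 of this application.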
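Embodiment 3
According to an embodiment of the present invention, a resume evaluation device for implementing the above resume evaluation method is further provided. As shown in FIG. 6, the device includes an acquisition module 60, an extraction module 62, a construction module 64, and an evaluation module 66.
The acquisition module 60 is configured to acquire a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data.
The historical recruitment data set handled by this module may come from the target recruiter's own recruitment history, for example information on the people who took part in the target recruiter's recruitment within a preset period (such as the previous recruitment season or the previous two), together with information on those who took part and obtained a position with the target recruiter. The historical recruitment data set may be obtained from the website's own database.
In an optional embodiment, the target recruiter's historical recruitment data set for the last five years is obtained from the database. Because a recruiter's hiring criteria change over time (for example, its education requirements may rise, or it may place more weight on work experience), all of the historical recruitment data from the most recent two years and only part of the recruitment data from the third to fifth most recent years are obtained.
The extraction module 62 is configured to extract data from the historical recruitment data set, wherein the extracted data includes at least the recruitment results corresponding to one or more attributes for a position, and the one or more attributes are parameters in the resume text data that characterize the candidate. As an optional implementation, the recruitment results may include the number of occurrences of the one or more attributes for the position and/or the number of admissions of the one or more attributes for the position. The recruitment results may of course also include other content; any information that is useful for recruitment can be counted in the recruitment results.
The construction module 64 is configured to construct a resume evaluation model by training on the extracted data.
The evaluation module 66 is configured to evaluate received resumes using the resume evaluation model.
In an optional embodiment, a received resume is input into the constructed resume evaluation model to obtain the model's output. In another optional embodiment, the model's output can be matched against the position the candidate expects, and a prompt is given if the match succeeds. The prompt can take many forms; for example, candidates' names can be shown in different colors to indicate different degrees of match.
It should be noted here that, through the above modules, building the resume evaluation model from data extracted from the historical recruitment data set establishes prior knowledge about a given position, so that the analysis of a candidate can focus on how well the candidate's overall strength matches the position, that is, on finding suitable resumes for different positions at different companies. This removes the need to analyze each candidate's behavioral and social data and reduces the complexity of recruitment. It also avoids the tedious work of collecting candidates' behavioral data from various social platforms, reducing the cost of the recruitment process and giving a better ratio of effect to cost.
According to the above embodiment of this application, and with reference to FIG. 7, the extraction module 62 may include:
a cleaning module 70, configured to clean the historical recruitment data set, wherein the data cleaning mainly serves to mask resumes that failed evaluation from the historical recruitment data set;
a first extraction submodule 72, configured to extract data from the cleaned historical recruitment data set.
According to the above embodiment of this application, resumes that failed evaluation include at least one of: resumes that failed evaluation because of staffing head count, resumes that went directly to interview without resume evaluation and failed the interview, and resumes that failed evaluation because they were submitted repeatedly. Resumes that failed evaluation have already been described in Embodiment 1 and are not described again here.
According to the above embodiment of this application, and with reference to FIG. 8, the construction module 64 constructs the resume evaluation model by training on the extracted data. In this module all of the extracted data can be used to build the model, and the model can then be checked against resumes actually received. That approach tests the model on real candidates' resumes and may wrongly screen out suitable candidates. As another optional implementation, the extracted data can be divided into two parts, one used to generate the resume evaluation model and one used to test the generated model. In such an embodiment, the construction module 64 may include: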
分类模块80,用于将抽取到的数据分为训练样本数据和测试样本数据。
生成模块82,用于使用训练样本数据进行训练生成待检验的简历评估模型。
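The specific example is basically the same as in Embodiment 1. Suppose the position to be filled is a database maintenance post and six people have already been successfully hired for it. The resumes of these six people are first used for training:
Person 1: School A, Major A, Company B, Position B;
Person 2: School B, Major A, Company C, Position B;
Person 3: School A, Major B, Company A, Position A;
Person 4: School C, Major B, Company B, Position A;
Person 5: School A, Major A, Company C, Position B;
Person 6: School C, Major B, Company C, Position A.
From this, the characteristics of database maintenance personnel can be obtained as the share of the six hires having each attribute value: School A 0.5, School B 0.17, School C 0.33; Major A 0.5, Major B 0.5; Company A 0.17, Company B 0.33, Company C 0.5; Position A 0.5, Position B 0.5. The attribute weights are 0.2 for school, 0.3 for major, 0.2 for company, and 0.3 for position.
Each person's score is calculated from the above values: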
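Person 1: School A, Major A, Company B, Position B;
Person 1's scores on the respective attributes are 0.2*0.5, 0.3*0.5, 0.2*0.33, and 0.3*0.5, that is, 0.1, 0.15, 0.066, and 0.15;
Person 2: School B, Major A, Company C, Position B;
Person 2's scores on the respective attributes are 0.2*0.17, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.034, 0.15, 0.1, and 0.15;
Person 3: School A, Major B, Company A, Position A;
Person 3's scores on the respective attributes are 0.2*0.5, 0.3*0.5, 0.2*0.17, and 0.3*0.5, that is, 0.1, 0.15, 0.034, and 0.15;
Person 4: School C, Major B, Company B, Position A;
Person 4's scores on the respective attributes are 0.2*0.33, 0.3*0.5, 0.2*0.33, and 0.3*0.5, that is, 0.066, 0.15, 0.066, and 0.15;
Person 5: School A, Major A, Company C, Position B;
Person 5's scores on the respective attributes are 0.2*0.5, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.1, 0.15, 0.1, and 0.15;
Person 6: School C, Major B, Company C, Position A;
Person 6's scores on the respective attributes are 0.2*0.33, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.066, 0.15, 0.1, and 0.15.
This yields six vectors. With a relatively simple algorithm, the values for each person can be summed: Person 1 scores 0.466, Person 2 scores 0.434, Person 3 scores 0.434, Person 4 scores 0.433, Person 5 scores 0.5, and Person 6 scores 0.466. The scores range over [0.433, 0.5]; a resume scoring below 0.433 is considered unsuitable for this position.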
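The worked example above can be reproduced with a short script. It is a sketch of the "relatively simple algorithm" only: each attribute value is scored as its share among the six hires multiplied by the attribute weight, and the scores are summed. The names `fit_frequencies` and `score` are hypothetical. Because the script uses exact fractions (1/6, 1/3) rather than the rounded 0.17 and 0.33, its totals can differ from the figures in the text by about 0.001.

```python
from collections import Counter

ATTRS = ("school", "major", "company", "position")
WEIGHTS = {"school": 0.2, "major": 0.3, "company": 0.2, "position": 0.3}

# The six hired people from the example: (school, major, company, position).
hired = [
    ("A", "A", "B", "B"),
    ("B", "A", "C", "B"),
    ("A", "B", "A", "A"),
    ("C", "B", "B", "A"),
    ("A", "A", "C", "B"),
    ("C", "B", "C", "A"),
]

def fit_frequencies(rows):
    """Share of hired people having each value, per attribute."""
    freqs = {}
    for i, attr in enumerate(ATTRS):
        counts = Counter(row[i] for row in rows)
        freqs[attr] = {value: n / len(rows) for value, n in counts.items()}
    return freqs

def score(row, freqs):
    """Weighted sum of value frequencies; unseen values contribute 0."""
    return sum(WEIGHTS[a] * freqs[a].get(v, 0.0) for a, v in zip(ATTRS, row))

freqs = fit_frequencies(hired)
scores = [score(row, freqs) for row in hired]
threshold = min(scores)  # lower end of the acceptable score range
```

A new resume would then be scored with `score(new_row, freqs)` and compared against `threshold`.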
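Unlike the example in Embodiment 1, the applicant's age may also be considered when recruiting for this post: if the age falls within a predetermined range, the score may be increased. For example, if experience is very important for this post, 0.07 points may be added for applicants over 40 years old.
It should be noted here that the method provided in the above embodiment can be used to train on the training sample data to obtain the resume evaluation model, but the way of obtaining the model is not limited to this. Any algorithm that can obtain a resume evaluation model from training sample data can be applied in the above step, for example, the GBDT algorithm.
A confirmation module 84, configured to verify the resume evaluation model to be verified using the test sample data and, if the verification passes, to confirm that the model to be verified is an accurate resume evaluation model.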
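The division of extracted data into training and test samples could look like the sketch below. The 30% test share, the fixed seed, and the name `split_samples` are assumptions for illustration; the source does not state how the split is made.

```python
import random

def split_samples(samples, test_fraction=0.3, seed=0):
    """Shuffle a copy of the samples and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

samples = list(range(10))  # stand-ins for extracted resume records
train, test = split_samples(samples)
```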
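According to the above embodiment of the present application, and with reference to Figure 9, the classification module 80 includes:
A second extraction submodule 90, configured to perform vector extraction.
An organizing module 92, configured to perform feature organization on the vector-extracted data to obtain the training sample data and/or the test sample data.
Vector extraction on the training sample data and/or the test sample data may mean extracting their data in one or more dimensions. Feature organization after vector extraction may mean unifying data that differs in form, format, or presentation but has the same meaning. The purpose of these steps is to unify the data form, which solves the technical problem that, because of data diversity, the same data takes multiple forms and is hard to train on algorithmically.
According to the above embodiment of the present application, and with reference to Figure 10, the confirmation module 84 is configured to perform the confirmation. There are many ways to confirm whether a resume evaluation model is accurate; for example, the model may be considered accurate only when its output is completely consistent with the actual results. As another optional embodiment, the confirmation module 84 may include:
A verification module 100, configured to input the test sample data into the resume evaluation model to be verified and to output a verification result.
A confirmation submodule 102, configured to confirm that the resume evaluation model to be verified is an accurate resume evaluation model if the error between the verification result and the recruitment result corresponding to the test sample data is within a predetermined range.
In the above steps, a certain error is allowed. Such error may be caused by an insufficient amount of extracted data, but it does not affect the judgment of the recruitment result and is within an acceptable range.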
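One way to read "error within a predetermined range" is as a cap on the share of test samples the model gets wrong. The sketch below takes that reading; the 20% cap and the names `validate` and `predict` are hypothetical, and the source does not define the error measure.

```python
def validate(predict, test_samples, max_error_rate=0.2):
    """Return (accepted, error_rate) for a model on labelled test samples.

    `predict` maps a sample's features to True (admit) or False (reject);
    each test sample is a (features, actually_admitted) pair.
    """
    errors = sum(1 for features, admitted in test_samples
                 if predict(features) != admitted)
    error_rate = errors / len(test_samples)
    return error_rate <= max_error_rate, error_rate

# Toy model: admit when the summed score reaches the lower bound 0.433.
predict = lambda total_score: total_score >= 0.433
test_samples = [(0.35, False), (0.366, False), (0.434, True)]
accepted, error_rate = validate(predict, test_samples)
```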
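In an optional embodiment, multiple items of the test sample data are input one by one into the resume evaluation model to be verified. Continuing the example of the database maintenance post, three people in the test sample data whose application results are already known are tested:
Test person 1: School A, Major C, Company C, Position B; not admitted;
Test person 2: School D, Major A, Company B, Position A; not admitted;
Test person 3: School B, Major A, Company C, Position A; admitted.
Using the weights obtained in the above embodiment for school, company, major, and position (0.2, 0.2, 0.3, and 0.3), the test results for the test sample data are calculated:
Test person 1: School A, Major C, Company C, Position B;
Test person 1's scores on the respective attributes are 0.2*0.5, 0.3*0, 0.2*0.5, and 0.3*0.5, that is, 0.1, 0, 0.1, and 0.15.
Test person 2: School D, Major A, Company B, Position A;
Test person 2's scores on the respective attributes are 0.2*0, 0.3*0.5, 0.2*0.33, and 0.3*0.5, that is, 0, 0.15, 0.066, and 0.15.
Test person 3: School B, Major A, Company C, Position A;
Test person 3's scores on the respective attributes are 0.2*0.17, 0.3*0.5, 0.2*0.5, and 0.3*0.5, that is, 0.034, 0.15, 0.1, and 0.15.
Summing the scores of the three test samples over the different dimensions gives 0.35 for test person 1, 0.366 for test person 2, and 0.434 for test person 3. Only test person 3 falls within the score range suitable for this position. The resume evaluation model therefore predicts that test persons 1 and 2 are not admitted and test person 3 is admitted, which matches the actual results, so the model can be considered to have relatively high accuracy.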
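As another optional implementation, suppose age was not considered at the time of recruitment but is considered now. Test person 2 can then receive the age bonus, giving a score of 0.366 + 0.07 = 0.436. By this measure test person 2 should have been admitted, but in fact was not. This error is caused by a change in recruitment needs and is within the acceptable range.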
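The age adjustment amounts to a conditional bonus on top of the base score. In the sketch below the age of 45 for test person 2 is a made-up value, since the source gives no ages; the function name is likewise hypothetical.

```python
def score_with_age_bonus(base_score, age, min_age=40, bonus=0.07):
    """Add a fixed bonus when the applicant is older than `min_age`."""
    return base_score + bonus if age > min_age else base_score

LOWER_BOUND = 0.433  # lower end of the acceptable range from the training example

adjusted = score_with_age_bonus(0.366, age=45)  # hypothetical age
predicted_admit = adjusted >= LOWER_BOUND       # model says admit
actually_admitted = False                       # recorded outcome for test person 2
mismatch = predicted_admit != actually_admitted
```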
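It should be noted here that when a test result differs from the result of the test sample data, that test sample data is used as training sample data to train the resume evaluation model until a test result identical to the result of the test sample data is obtained.
According to the above embodiment of the present application, and with reference to Figure 11, the second extraction submodule 90 is configured to perform vector extraction on one or more attributes of the applicant, where the one or more attributes include at least one of the following: company name, position name, school name, and major name. The organizing module 92 includes a normalization module 112, configured to normalize the one or more attributes.
According to the above embodiment of the present application, and with reference to Figure 12, the normalization module 112 may include at least one of the following: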
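A first normalization submodule 120, configured to build an industry vocabulary and a place-name vocabulary, and to extract the company name according to the industry terms and place names in those vocabularies to obtain the normalization result of the company name, thereby normalizing the company name.
A second normalization submodule 122, configured to: confirm that position descriptions appearing more than a preset number of times in the historical recruitment data set are correct position names; build, by edit distance, a mapping vocabulary between the position descriptions in the resume text data and the correct position names; and match the position descriptions in the resume text data against the mapping vocabulary by regular expression to obtain the normalization result of the position name, thereby normalizing the position name.
A third normalization submodule 124, configured to: sort the school names in the resume text data by number of occurrences in descending order and take the school names at preset ranks to obtain a base dictionary; denoise the school names and match the denoised school names against the school names in the base dictionary by regular-expression matching to obtain normalized school names; use a synonym table to construct abbreviations for school names according to preset rules, and record names in which such an abbreviation appears as the corresponding school name; normalize school names containing a preset first suffix word to the corresponding first suffix word, where the preset first suffix words include at least: vocational and technical college (职业技术学院), online college (网络学院), adult education (成教), self-taught examination (自考), and junior-college-to-bachelor upgrade (专升本); remove second suffix words contained in school names to generate normalized school names, where a second suffix word denotes a branch of the school; and remove prefixes from school names to generate normalized school names, thereby normalizing the school name.
A fourth normalization submodule 126, configured to build a major classification table and train a Bayesian model with it, and to classify the majors in the resume text data according to the trained Bayesian model, thereby normalizing the major name.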
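The Bayesian classification of major names can be illustrated with a tiny multinomial naive Bayes classifier over word tokens with add-one smoothing. This is a sketch only: the source does not say which Bayesian model or features are used, and the classification table below is invented for the example.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial naive Bayes over whitespace tokens, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total = sum(self.class_counts.values())
        best_label, best_logp = None, -math.inf
        for label, n in self.class_counts.items():
            logp = math.log(n / total)  # class prior
            denom = sum(self.token_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                logp += math.log((self.token_counts[label][tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label

# Hypothetical major classification table: raw major name -> category.
table = [
    ("computer science and technology", "Computer Science"),
    ("software engineering", "Computer Science"),
    ("computer software", "Computer Science"),
    ("applied mathematics", "Mathematics"),
    ("mathematics and statistics", "Mathematics"),
    ("pure mathematics", "Mathematics"),
]
model = NaiveBayes().fit([t for t, _ in table], [c for _, c in table])
category = model.predict("computer engineering")
```

For Chinese major names, character n-grams or a word segmenter would replace the whitespace tokenizer.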
It should be noted that the above normalization processing has been described in detail in Embodiment 1 and is not described again here.
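As an illustration of the position-name normalization, the sketch below treats descriptions that occur more than a preset number of times as correct names and maps each remaining description to the nearest correct name by Levenshtein edit distance. The thresholds and function names are hypothetical, and a plain dictionary lookup stands in for the regular-expression matching mentioned in the text.

```python
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def build_title_mapping(descriptions, min_count=2, max_distance=2):
    """Map each description to a frequent ("correct") title within range."""
    counts = Counter(descriptions)
    canonical = [t for t, n in counts.items() if n > min_count]
    mapping = {}
    for desc in counts:
        if desc in canonical:
            mapping[desc] = desc
            continue
        best = min(canonical, key=lambda t: edit_distance(desc, t), default=None)
        if best is not None and edit_distance(desc, best) <= max_distance:
            mapping[desc] = best
    return mapping

descriptions = (["database administrator"] * 3
                + ["databse administrator", "database adminstrator", "chef"])
mapping = build_title_mapping(descriptions)
```

Descriptions with no close canonical name, such as "chef" here, are left unmapped rather than forced onto a wrong title.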
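Embodiment 4
According to an embodiment of the present invention, a resume evaluation apparatus for implementing the resume evaluation method of Embodiment 2 is further provided. As shown in Figure 13, the apparatus includes a first input module 130 and an acquisition module 132.
The first input module 130 is configured to input a resume to be evaluated. The acquisition module 132 is configured to obtain the resume evaluation result of the resume to be evaluated, where the resume evaluation result is produced according to a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
It should be noted here that the first input module 130 and the acquisition module 132 correspond to steps S41 to S43 in Embodiment 2. The examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above modules, as part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1.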
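In the above embodiment of the present application, and with reference to Figure 14, the apparatus further includes a second input module 140.
The second input module 140 is configured to input a historical recruitment data set, where the data in the historical recruitment data set includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more attributes were admitted for the position.
It should be noted here that the second input module 140 corresponds to step S45 in Embodiment 2. The examples and application scenarios implemented by this module are the same as those of the corresponding step, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above module, as part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1.
In the above embodiment of the present application, and with reference to Figure 15, the apparatus further includes a display module 150, configured to display the resume evaluation result of the resume to be evaluated in a preset area.
It should be noted here that the display module 150 corresponds to step S451 in Embodiment 2. The examples and application scenarios implemented by this module are the same as those of the corresponding step, but are not limited to the content disclosed in Embodiment 1. It should be noted that the above module, as part of the apparatus, may run in the computer terminal 10 provided in Embodiment 1.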
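Embodiment 5
An embodiment of the present invention may provide a computer terminal, which may be any computer terminal device in a group of computer terminals. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one of multiple network devices of a computer network.
In this embodiment, the computer terminal may execute program code for the following steps of the resume evaluation method: acquiring a historical recruitment data set, where the historical recruitment data set includes at least resume text data; extracting data from the historical recruitment data set, where the data includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more attributes were admitted for the position; building a resume evaluation model by training on the extracted data; and evaluating received resumes with the resume evaluation model.
Optionally, Figure 16 is a structural block diagram of a computer terminal according to an embodiment of the present invention. As shown in Figure 16, the computer terminal A may include one or more processors 1601 (only one is shown in the figure), a memory 1603, and a transmission device 1605.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the resume evaluation method and apparatus in the embodiments of the present invention. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, that is, implements the above resume evaluation method. The memory may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, which may be connected to terminal A through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call, through the transmission device, the information and application programs stored in the memory to perform the following steps: acquiring a historical recruitment data set, where the historical recruitment data set includes at least resume text data; extracting data from the historical recruitment data set, where the data includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more attributes were admitted for the position; building a resume evaluation model by training on the extracted data; and evaluating received resumes with the resume evaluation model.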
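Optionally, the processor may further execute program code for the following steps: cleaning the historical recruitment data set, where the data cleaning includes masking resumes that failed evaluation out of the historical recruitment data set; and extracting data from the cleaned historical recruitment data set.
Optionally, in the program code executed by the processor, resumes that failed evaluation include at least one of the following: resumes that failed evaluation because of head count, resumes that went directly to interview without resume evaluation and did not pass the interview, and resumes that failed evaluation because they were submitted repeatedly.
Optionally, the processor may further execute program code for the following steps: dividing the extracted data into training sample data and test sample data; training on the training sample data to generate a resume evaluation model to be verified; and verifying the resume evaluation model to be verified using the test sample data and, if the verification passes, confirming that the model to be verified is an accurate resume evaluation model.
Optionally, the processor may further execute program code for the following: the training sample data for generating the resume evaluation model to be verified is data obtained by performing vector extraction and then feature organization on the vector-extracted data; and/or the test sample data for testing the resume evaluation model to be verified is data obtained by performing vector extraction and then feature organization on the vector-extracted data.
Optionally, the processor may further execute program code for the following steps: inputting the test sample data into the resume evaluation model to be verified for verification, and outputting a verification result; and, if the error between the verification result and the recruitment result corresponding to the test sample data is within a predetermined range, confirming that the resume evaluation model to be verified is an accurate resume evaluation model.
Optionally, the processor may further execute program code for the following steps: performing vector extraction includes performing vector extraction on one or more attributes of the applicant, where the one or more attributes include at least one of the following: company name, position name, school name, and major name; and performing feature organization on the vector-extracted data includes normalizing the one or more attributes.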
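Optionally, the processor may further execute program code for the following steps. Normalizing the company name includes: building an industry vocabulary and a place-name vocabulary, and extracting the company name according to the industry terms and place names in those vocabularies to obtain the normalization result of the company name. Normalizing the position name includes: confirming that position descriptions appearing more than a preset number of times in the historical recruitment data set are correct position names; building, by edit distance, a mapping vocabulary between the position descriptions in the resume text data and the correct position names; and matching the position descriptions in the resume text data against the mapping vocabulary by regular expression to obtain the normalization result of the position name. Normalizing the school name includes: sorting the school names in the resume text data by number of occurrences in descending order and taking the school names at preset ranks to obtain a base dictionary; denoising the school names and matching the denoised school names against the school names in the base dictionary by regular-expression matching to obtain normalized school names; using a synonym table to construct abbreviations for school names according to preset rules, and recording names in which such an abbreviation appears as the corresponding school name; normalizing school names containing a preset first suffix word to the corresponding first suffix word, where the preset first suffix words include at least vocational and technical college, online college, adult education, self-taught examination, and junior-college-to-bachelor upgrade; removing second suffix words contained in school names to generate normalized school names, where a second suffix word denotes a branch of the school; and removing prefixes from school names to generate normalized school names. Normalizing the major name includes: building a major classification table and training a Bayesian model with it; and classifying the majors in the resume text data according to the trained Bayesian model.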
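Those of ordinary skill in the art will understand that the computer terminal may also be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), or a PAD. Figure 16 does not limit the structure of the above electronic apparatus. For example, the computer terminal A may include more or fewer components than shown in Figure 16 (such as a network interface or a display device), or have a configuration different from that shown in Figure 16.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, which may include a flash drive, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.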
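Embodiment 6
An embodiment of the present invention further provides a storage medium. Optionally, in this embodiment, the storage medium may be used to store the program code executed by the resume evaluation method provided in Embodiment 1.
Optionally, in this embodiment, the storage medium may be located in any computer terminal of a group of computer terminals in a computer network, or in any mobile terminal of a group of mobile terminals.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring a historical recruitment data set, where the historical recruitment data set includes at least resume text data; extracting data from the historical recruitment data set, where the data includes at least the recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize the applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more attributes were admitted for the position; building a resume evaluation model by training on the extracted data; and evaluating received resumes with the resume evaluation model.
Optionally, the storage medium is further configured to store program code for performing the following steps: cleaning the historical recruitment data set, where the data cleaning includes masking resumes that failed evaluation out of the historical recruitment data set; and extracting data from the cleaned historical recruitment data set.
Optionally, in the program code stored on the storage medium, resumes that failed evaluation include at least one of the following: resumes that failed evaluation because of head count, resumes that went directly to interview without resume evaluation and did not pass the interview, and resumes that failed evaluation because they were submitted repeatedly.
Optionally, the storage medium is further configured to store program code for performing the following steps: dividing the extracted data into training sample data and test sample data; training on the training sample data to generate a resume evaluation model to be verified; and verifying the resume evaluation model to be verified using the test sample data and, if the verification passes, confirming that the model to be verified is an accurate resume evaluation model.
Optionally, the storage medium is further configured to store program code for the following: the training sample data for generating the resume evaluation model to be verified is data obtained by performing vector extraction and then feature organization on the vector-extracted data; and/or the test sample data for testing the resume evaluation model to be verified is data obtained by performing vector extraction and then feature organization on the vector-extracted data.
Optionally, the storage medium is further configured to store program code for performing the following steps: inputting the test sample data into the resume evaluation model to be verified for verification, and outputting a verification result; and, if the error between the verification result and the recruitment result corresponding to the test sample data is within a predetermined range, confirming that the resume evaluation model to be verified is an accurate resume evaluation model.
Optionally, the storage medium is further configured to store program code for performing the following steps: performing vector extraction includes performing vector extraction on one or more attributes of the applicant, where the one or more attributes include at least one of the following: company name, position name, school name, and major name; and performing feature organization on the vector-extracted data includes normalizing the one or more attributes.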
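Optionally, the storage medium is further configured to store program code for performing the following steps. Normalizing the company name includes: building an industry vocabulary and a place-name vocabulary, and extracting the company name according to the industry terms and place names in those vocabularies to obtain the normalization result of the company name. Normalizing the position name includes: confirming that position descriptions appearing more than a preset number of times in the historical recruitment data set are correct position names; building, by edit distance, a mapping vocabulary between the position descriptions in the resume text data and the correct position names; and matching the position descriptions in the resume text data against the mapping vocabulary by regular expression to obtain the normalization result of the position name. Normalizing the school name includes: sorting the school names in the resume text data by number of occurrences in descending order and taking the school names at preset ranks to obtain a base dictionary; denoising the school names and matching the denoised school names against the school names in the base dictionary by regular-expression matching to obtain normalized school names; using a synonym table to construct abbreviations for school names according to preset rules, and recording names in which such an abbreviation appears as the corresponding school name; normalizing school names containing a preset first suffix word to the corresponding first suffix word, where the preset first suffix words include at least vocational and technical college, online college, adult education, self-taught examination, and junior-college-to-bachelor upgrade; removing second suffix words contained in school names to generate normalized school names, where a second suffix word denotes a branch of the school; and removing prefixes from school names to generate normalized school names. Normalizing the major name includes: building a major classification table and training a Bayesian model with it; and classifying the majors in the resume text data according to the trained Bayesian model.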
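The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not detailed in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other divisions, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred implementations of the present invention. It should be pointed out that those of ordinary skill in the art may make several improvements and refinements without departing from the principle of the present invention, and these improvements and refinements shall also be regarded as falling within the protection scope of the present invention.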

Claims (22)

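  1. A resume evaluation method, characterized by comprising:
    acquiring a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data;
    extracting data from the historical recruitment data set, wherein the data includes at least recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize an applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more attributes were admitted for the position;
    building a resume evaluation model by training on the extracted data.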
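  2. The method according to claim 1, characterized in that extracting the data from the historical recruitment data set comprises:
    cleaning the historical recruitment data set, wherein the data cleaning includes masking resumes that failed evaluation out of the historical recruitment data set;
    extracting the data from the cleaned historical recruitment data set.
  3. The method according to claim 2, characterized in that the resumes that failed evaluation include at least one of the following:
    resumes that failed evaluation because of head count, resumes that went directly to interview without resume evaluation and did not pass the interview, and resumes that failed evaluation because they were submitted repeatedly.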
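  4. The method according to claim 1, characterized in that building a resume evaluation model by training on the extracted data comprises:
    dividing the extracted data into training sample data and test sample data;
    training on the training sample data to generate a resume evaluation model to be verified;
    verifying the resume evaluation model to be verified using the test sample data and, if the verification passes, confirming that the resume evaluation model to be verified is an accurate resume evaluation model.
  5. The method according to claim 4, characterized in that:
    the training sample data for generating the resume evaluation model to be verified is data obtained by performing vector extraction and then performing feature organization on the vector-extracted data; and/or
    the test sample data for testing the resume evaluation model to be verified is data obtained by performing vector extraction and then performing feature organization on the vector-extracted data.
  6. The method according to claim 4 or 5, characterized in that verifying the resume evaluation model to be verified using the test sample data and confirming that the resume evaluation model to be verified is an accurate resume evaluation model comprises:
    inputting the test sample data into the resume evaluation model to be verified for verification, and outputting a verification result;
    if the error between the verification result and the recruitment result corresponding to the test sample data is within a predetermined range, confirming that the resume evaluation model to be verified is an accurate resume evaluation model.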
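  7. The method according to claim 4, characterized in that:
    performing vector extraction comprises performing vector extraction on one or more attributes of the applicant, wherein the one or more attributes include at least one of the following: company name, position name, school name, and major name;
    performing feature organization on the vector-extracted data comprises normalizing the one or more attributes.
  8. The method according to claim 7, characterized in that normalizing the one or more attributes comprises at least one of the following:
    normalizing the company name comprises: building an industry vocabulary and a place-name vocabulary; and extracting the company name according to the industry terms and place names in the industry vocabulary and the place-name vocabulary to obtain the normalization result of the company name;
    normalizing the position name comprises: confirming that position descriptions appearing more than a preset number of times in the historical recruitment data set are correct position names; building, by edit distance, a mapping vocabulary between the position descriptions in the resume text data and the correct position names; and matching the position descriptions in the resume text data against the mapping vocabulary by regular expression to obtain the normalization result of the position name;
    normalizing the school name comprises: sorting the school names in the resume text data by number of occurrences in descending order and taking the school names at preset ranks to obtain a base dictionary; denoising the school names and matching the denoised school names against the school names in the base dictionary by regular-expression matching to obtain normalized school names; using a synonym table to construct abbreviations for school names according to preset rules, and recording names in which such an abbreviation appears as the corresponding school name; normalizing school names containing a preset first suffix word to the corresponding first suffix word, wherein the preset first suffix words include at least: vocational and technical college (职业技术学院), online college (网络学院), adult education (成教), self-taught examination (自考), and junior-college-to-bachelor upgrade (专升本); removing second suffix words contained in the school names to generate normalized school names, wherein a second suffix word denotes a branch of the school; and removing prefixes from the school names to generate normalized school names;
    normalizing the major name comprises: building a major classification table and training a Bayesian model with the major classification table; and classifying the majors in the resume text data according to the trained Bayesian model.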
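  9. A resume evaluation method, characterized by comprising:
    inputting a resume to be evaluated;
    obtaining a resume evaluation result of the resume to be evaluated, wherein the resume evaluation result is produced according to a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
  10. The method according to claim 9, characterized in that the method further comprises:
    inputting a historical recruitment data set, wherein the data in the historical recruitment data set includes at least recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize an applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more attributes were admitted for the position.
  11. The method according to claim 10, characterized in that, after obtaining the resume evaluation result of the resume to be evaluated, the method further comprises:
    displaying the resume evaluation result of the object to be evaluated in a preset area.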
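  12. A resume evaluation apparatus, characterized by comprising:
    an acquisition module, configured to acquire a historical recruitment data set, wherein the historical recruitment data set includes at least resume text data;
    an extraction module, configured to extract data from the historical recruitment data set, wherein the data includes at least recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize an applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more parameters were admitted for the position;
    a construction module, configured to build a resume evaluation model by training on the extracted data.
  13. The apparatus according to claim 12, characterized in that the extraction module comprises:
    a cleaning module, configured to clean the historical recruitment data set, wherein the data cleaning includes masking resumes that failed evaluation out of the historical recruitment data set;
    a first extraction submodule, configured to extract the data from the cleaned historical recruitment data set.
  14. The apparatus according to claim 13, characterized in that the resumes that failed evaluation include at least one of the following:
    resumes that failed evaluation because of head count, resumes that went directly to interview without resume evaluation and did not pass the interview, and resumes that failed evaluation because they were submitted repeatedly.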
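  15. The apparatus according to claim 12, characterized in that the construction module comprises:
    a classification module, configured to divide the extracted data into training sample data and test sample data;
    a generation module, configured to train on the training sample data to generate a resume evaluation model to be verified;
    a confirmation module, configured to verify the resume evaluation model to be verified using the test sample data and, if the verification passes, to confirm that the resume evaluation model to be verified is an accurate resume evaluation model.
  16. The apparatus according to claim 15, characterized in that the classification module comprises:
    a second extraction submodule, configured to perform vector extraction;
    an organizing module, configured to perform feature organization on the vector-extracted data to obtain the training sample data and/or the test sample data.
  17. The apparatus according to claim 15 or 16, characterized in that the confirmation module comprises:
    a verification module, configured to input the test sample data into the resume evaluation model to be verified for verification, and to output a verification result;
    a confirmation submodule, configured to confirm that the resume evaluation model to be verified is an accurate resume evaluation model if the error between the verification result and the recruitment result corresponding to the test sample data is within a predetermined range.
  18. The apparatus according to claim 16, characterized in that:
    the second extraction submodule is configured to perform vector extraction on one or more attributes of the applicant, wherein the one or more attributes include at least one of the following: company name, position name, school name, and major name;
    the organizing module comprises a normalization module, configured to normalize the one or more attributes.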
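  19. The apparatus according to claim 18, characterized in that the normalization module comprises at least one of the following:
    a first normalization submodule, configured to build an industry vocabulary and a place-name vocabulary, and to extract the company name according to the industry terms and place names in the industry vocabulary and the place-name vocabulary to obtain the normalization result of the company name, thereby normalizing the company name;
    a second normalization submodule, configured to: confirm that position descriptions appearing more than a preset number of times in the historical recruitment data set are correct position names; build, by edit distance, a mapping vocabulary between the position descriptions in the resume text data and the correct position names; and match the position descriptions in the resume text data against the mapping vocabulary by regular expression to obtain the normalization result of the position name, thereby normalizing the position name;
    a third normalization submodule, configured to: sort the school names in the resume text data by number of occurrences in descending order and take the school names at preset ranks to obtain a base dictionary; denoise the school names and match the denoised school names against the school names in the base dictionary by regular-expression matching to obtain normalized school names; use a synonym table to construct abbreviations for school names according to preset rules, and record names in which such an abbreviation appears as the corresponding school name; normalize school names containing a preset first suffix word to the corresponding first suffix word, wherein the preset first suffix words include at least: vocational and technical college, online college, adult education, self-taught examination, and junior-college-to-bachelor upgrade; remove second suffix words contained in the school names to generate normalized school names, wherein a second suffix word denotes a branch of the school; and remove prefixes from the school names to generate normalized school names, thereby normalizing the school name;
    a fourth normalization submodule, configured to build a major classification table and train a Bayesian model with the major classification table, and to classify the majors in the resume text data according to the trained Bayesian model, thereby normalizing the major name.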
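  20. A resume evaluation apparatus, characterized by comprising:
    a first input module, configured to input a resume to be evaluated;
    an acquisition module, configured to obtain a resume evaluation result of the resume to be evaluated, wherein the resume evaluation result is produced according to a resume evaluation model, and the resume evaluation model is built from data extracted from a historical recruitment data set.
  21. The apparatus according to claim 20, characterized in that the apparatus further comprises:
    a second input module, configured to input a historical recruitment data set, wherein the data in the historical recruitment data set includes at least recruitment results corresponding to one or more attributes for a position, the one or more attributes are parameters in the resume text data that characterize an applicant, and the recruitment result includes at least the number of times the one or more attributes appear for the position and/or the number of times the one or more attributes were admitted for the position.
  22. The apparatus according to claim 21, characterized in that the apparatus further comprises:
    a display module, configured to display the resume evaluation result of the object to be evaluated in a preset area.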
PCT/CN2017/077496 2016-03-30 2017-03-21 Resume evaluation method and apparatus WO2017167069A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610192971.XA CN107291715A (zh) 2016-03-30 2016-03-30 Resume evaluation method and apparatus
CN201610192971.X 2016-03-30

Publications (1)

Publication Number Publication Date
WO2017167069A1 WO2017167069A1 (zh) 2017-10-05

Family

ID=59963497

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/077496 WO2017167069A1 (zh) 2016-03-30 2017-03-21 Resume evaluation method and apparatus

Country Status (3)

Country Link
CN (1) CN107291715A (zh)
TW (1) TW201741948A (zh)
WO (1) WO2017167069A1 (zh)

Also Published As

Publication number Publication date
CN107291715A (zh) 2017-10-24
TW201741948A (zh) 2017-12-01

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17773099

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 17773099

Country of ref document: EP

Kind code of ref document: A1