CN113570348A - Resume screening method - Google Patents
- Publication number
- CN113570348A (CN202111126096.2A)
- Authority
- CN
- China
- Prior art keywords
- job
- work
- resume
- experience
- characteristic information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/105—Human resources
- G06Q10/1053—Employment or hiring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
Abstract
The invention relates to a resume screening method comprising the following steps: for a recruitment work currently to be processed, acquiring the preliminarily screened job hunting resumes corresponding to that recruitment work; acquiring multi-dimensional first characteristic information from the description of the recruitment work and multi-dimensional second characteristic information from the description of each job hunting resume; for each shared dimension, analyzing the first characteristic information and the second characteristic information with a pre-trained calculation strategy to obtain a condition score for that dimension; fusing the condition scores of all dimensions of the same job hunting resume to obtain a matching score; and selecting the job hunting resumes whose matching score is greater than a preset threshold as the screened resumes matching the recruitment work. The method can screen out the most suitable candidates for enterprises, provide users with recruitment information that meets their needs, and thereby deliver a bidirectional personalized service.
Description
Technical Field
The invention relates to the technical field of information, in particular to a resume screening method.
Background
With the rapid development of artificial intelligence technology, accurate matching and personalized recommendation systems are greatly developed and applied in various fields. How to accurately complete matching and recommendation from big data such as user data, recruitment data and the like is an important research direction in the field of artificial intelligence at present.
Recruitment is a very important link in human resource management and one of the most important activities in any company. Recruiting suitable talent is the foundation and guarantee for maintaining a company's operation and development. Resume screening is the first step in recruitment. At present, most human resource managers use an online recruitment platform to acquire and screen resumes. The platform generally collects users' resumes and then, by matching resumes against positions, recommends suitable resumes to the company's human resource managers. In practice, the platform performs only simple filtering and matching according to conditions roughly set by the company: for example, a human resource manager sets conditions such as target position, educational requirement, work place, technical requirement, years of experience, and salary range according to the needs of the post, and the platform then applies these as rigid filters and pushes the resumes that satisfy them to the human resource manager.
However, such hard-condition filtering matches poorly: the screened resumes often fail to meet the company's requirements and must be screened again by human resource managers. In addition, because the number of resumes is large, manual selection becomes ineffective on massive data and is likely to overlook or misjudge suitable candidates.
Thus, in a traditional recruitment system, headhunters or the human resources department actively contact candidates and offer corresponding positions. The more recent online recruitment platforms go only as far as letting users upload their own resumes, and the resumes are still mainly screened manually by headhunters or by the recruiter's human resources department. Therefore, a resume screening method capable of providing bidirectional personalized recommendations for users and recruiting organizations is needed.
Disclosure of Invention
(I) Technical problem to be solved
In view of the above disadvantages and shortcomings of the prior art, the present invention provides a resume screening method capable of providing bidirectional personalized recommendations for users and recruiters.
(II) technical scheme
In order to achieve the purpose, the invention adopts the main technical scheme that:
In a first aspect, an embodiment of the present invention provides a resume screening method, including:
A1, for each recruitment work, acquiring first characteristic information of each dimension in the description of the recruitment work and second characteristic information of each dimension in the descriptions of all job hunting resumes corresponding to the recruitment work; each job hunting resume is a resume selected by at least one round of screening; the dimensions include one or more of the following: job title, job skill, job unit, job site, years of experience, and salary level; the dimensions of the recruitment work and of the job hunting resume are consistent;
A2, acquiring a condition score for each dimension based on the first characteristic information and the second characteristic information under the same dimension, and fusing the condition scores of all dimensions to obtain a matching score;
when the dimension is the job title, acquiring a first coefficient value corresponding to the first characteristic information and a second coefficient value corresponding to the second characteristic information according to a mapping table between job titles and coefficient values; and calculating the condition score of the job title from the first coefficient value and the second coefficient value using formula (1):
[formula (1)]
where the symbols of formula (1) denote, respectively: the set of all job titles in the work experience of the job hunting resume; s, each job title in that work experience; the job title in the recruitment work; a pre-trained function model that calculates the similarity between two work experience strings; and the coefficient corresponding to each job title in the work experience of the job hunting resume, which differs from title to title;
A3, selecting the job hunting resumes whose matching score is greater than the preset threshold as the screened resumes matching the recruitment work.
Optionally, the mapping table includes the correspondence among job titles, coefficient values, and job end times. In addition, the coefficient is given by formula (2):
[formula (2)]
where, in formula (2), f is the attenuation factor and the time term is the interval from the end of each job to the current time. Within the set of all job titles in the job hunting resume, the headline job title is treated as an independent job title, with the current time taken as its job end time; if the job title is invalid or empty, the corresponding value is set to 0.5.
(III) advantageous effects
The resume screening method handles massive numbers of resumes and, based on a precise-matching recommendation algorithm, recommends personalized information with a higher matching degree to enterprises and users.
When matching recruitment work against job hunting resumes, the resume and the position are analyzed along dimensions such as job title, job skill, job unit, job site, years of experience, and salary level. This achieves precise matching between resumes and recruitment information, removes the need to select resumes manually, screens out the most suitable candidates for an enterprise without overlooking any suitable candidate, provides users with recruitment information that meets their needs, and delivers a bidirectional personalized service.
Drawings
Fig. 1 is a schematic flow chart of a resume screening method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of maintenance and division of recruitment work and job hunting resumes according to an embodiment of the invention;
Fig. 3 is a schematic flow chart of training a first model according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of an application scenario of the resume screening method of the present invention;
Fig. 5 and Fig. 6 are flow framework diagrams of a resume screening method according to an embodiment of the present invention;
Fig. 7 is a diagram illustrating a normalization process in the work place dimension according to one embodiment of the present invention;
Fig. 8 is a diagram illustrating a pre-established training data set according to an embodiment of the present invention;
Fig. 9 is a schematic illustration of labels being assigned to data in a training data set during the training process of the present invention;
Fig. 10 is a schematic illustration of matching job hunting resumes with recruitment work in an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a resume matching system according to an embodiment of the present invention.
Detailed Description
For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.
At present, information matching and its effective use affect many aspects of people's work and study. Resume information is one part of such massive data. Recruitment is one of the important tasks in human resource management, resume screening is its first step, and acquiring resumes through an online recruitment platform is a common practice among human resource managers. In the prior art, a recruitment website generally performs simple screening and matching according to conditions roughly set by an enterprise, for example target position, work place, and educational background or major, and sends the resumes that satisfy these simple conditions to the enterprise. In use, however, this kind of matching performs poorly: the resumes provided often fail to meet the enterprise's requirements, and recruiters frequently have to screen them again by hand. Moreover, because a resume contains a great deal of information, recruiters lack an effective quantitative way to evaluate it, and resumes of needed candidates may be missed.
In short, resume screening in the prior art relies mainly on manual work, the match between resumes and requirements is poor, and the large volume of resume information makes screening costly in labor.
The method of the embodiment of the invention aims to automate the matching process by extracting the information in job hunting resumes and the requirements in recruitment advertisements, and thereby to provide a resume screening method with bidirectional personalized recommendation for users and recruiters. In this application the dimensions of the job hunting resume and of the recruitment work are consistent, and include one or more of the following: job title, job skill, job unit, job site, years of experience, salary level, etc.
For the headline job title of the user's resume: within the set of all job titles in the user's resume (that is, the job hunting resume), the headline job title is treated as an independent job title, with the current time taken as its job end time; if the job title is invalid or missing, the final value is set to the lowest matching value.
For the work skill score in the embodiments described below, a pre-trained FastText model calculates the similarity between two work skill strings, one taken from the work skills of the target position and one from the skill set in the user's resume.
In addition, this application proposes a new FastText model to standardize all location names. To train the model, training data is collected from the work experience in user profiles; specifically, the location names associated with each work experience are used. The data set is then further refined into one location name array per user profile, each entry consisting of a city name, a province name, and a country/region name, and these arrays are used as input to the FastText model.
The first model described below calculates the similarity of two words as the cosine similarity between their vector representations. If a word is in the vocabulary, it is represented by its embedding vector. If it is not, its representation is assembled from a set of embedding vectors of its substrings. By using substrings in this way, the FastText model handles values that are missing from the vocabulary well.
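The similarity computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the model of the embodiment: the function names, the toy two-dimensional vectors, and the choice of character trigrams with `<` and `>` boundary markers are all assumptions made for the example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

def embed(word, vocab_vectors, subword_vectors, n=3):
    """Return the stored vector of an in-vocabulary word; otherwise
    assemble a vector by summing the vectors of its character n-grams."""
    if word in vocab_vectors:
        return list(vocab_vectors[word])
    padded = "<" + word + ">"
    grams = [padded[i:i + n] for i in range(len(padded) - n + 1)]
    dim = len(next(iter(subword_vectors.values())))
    total = [0.0] * dim
    for gram in grams:
        vec = subword_vectors.get(gram)
        if vec is not None:  # n-grams never seen in training contribute nothing
            total = [t + x for t, x in zip(total, vec)]
    return total

# Toy, hand-written vectors (hypothetical values, for illustration only).
vocab_vectors = {"beijing": [1.0, 0.0]}
subword_vectors = {"<be": [0.5, 0.0], "bei": [0.4, 0.1], "jin": [0.3, 0.0]}

# "beijin" is not in the vocabulary, so its vector is built from its trigrams.
similarity = cosine_similarity(
    embed("beijin", vocab_vectors, subword_vectors),
    embed("beijing", vocab_vectors, subword_vectors),
)
```

In this toy setup the misspelled word still lands close to the in-vocabulary word, because the two share most of their substrings.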
The underlying algorithm mentioned in step S4 below is a logistic regression algorithm that combines the 6 condition scores mentioned above (job title score, job skill score, job unit score, job site score, job experience score, and salary score) into a final matching score. The parameters of the model are obtained by training on the collected data.
Example one
Referring to fig. 1, fig. 2, fig. 4, fig. 5, fig. 6, and fig. 10, the resume screening method of the present embodiment may be implemented on any computer, and the execution subject of the resume screening method may be any electronic device, and the method may include the following steps:
S1, for the recruitment work currently to be processed, establishing a first data set of all recruitment works and a second data set of job hunting resumes; performing preliminary screening of the job hunting resumes in the second data set against the recruitment works in the first data set, and acquiring the preliminarily screened job hunting resumes corresponding to each recruitment work;
S2, for each recruitment work, acquiring first characteristic information of each dimension in the description of the recruitment work and second characteristic information of each dimension in the description of each job hunting resume, where the job hunting resumes are the preliminarily screened resumes corresponding to that recruitment work; the description of the recruitment work and the description of the resume each comprise a plurality of dimensions, and the two use the same dimensions.
For example, the multiple dimensions in this embodiment may include: job title, job skill, job unit, job site, years of experience, salary level, etc. In the embodiment, the multi-dimensional information is not limited, and the division is performed according to the resume and the specific information of the recruitment work.
S3, for each dimension, analyzing the first characteristic information and the second characteristic information of that dimension with the calculation strategy for that dimension in the pre-trained first model to obtain a condition score for the dimension; and fusing the condition scores of all dimensions of the job hunting resume to obtain a matching score.
For example, the weight coefficients used in the fusion may be obtained in advance. As shown in fig. 2, feature information of users and positions is extracted from the database; the feature data sets are converted into feature matrices through manual feature analysis and a feature framework (Featuretools); model prediction is performed with a feature selection model (e.g., Boruta); and the model is finally trained with an automated framework (e.g., TPOT, AutoML), with model optimization (including feature selection, parameter tuning, etc.) supplying the tuned parameters to that framework.
With the training shown in fig. 2, the weight coefficients of each dimension's condition score in the fusion are obtained, as shown in Table 1 below:
[Table 1]
S4, comparing the matching scores of all job hunting resumes of each recruitment work with a preset threshold, and selecting the job hunting resumes whose matching score is greater than the preset threshold as the screened resumes matching the recruitment work.
Whether a job hunting resume matches is decided against a preset threshold, typically 0.75: any job hunting resume with a score greater than 0.75 is marked as a good match. The value 0.75 in this embodiment was found during training of the matching algorithm to give a relatively good balance between matching accuracy and algorithm speed.
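The threshold comparison of step S4 can be sketched as follows. The function name, the resume identifiers, and the scores are hypothetical, and sorting the shortlist by score is a convenience added here rather than something the embodiment specifies.

```python
def select_matching_resumes(match_scores, threshold=0.75):
    """Keep resumes whose matching score is strictly greater than the
    threshold, returning their identifiers with the best match first."""
    selected = [(rid, score) for rid, score in match_scores.items()
                if score > threshold]
    selected.sort(key=lambda pair: pair[1], reverse=True)
    return [rid for rid, _ in selected]

# Hypothetical matching scores for one recruitment work.
scores = {"resume_a": 0.91, "resume_b": 0.75, "resume_c": 0.42, "resume_d": 0.80}
shortlisted = select_matching_resumes(scores)
```

Note that a score exactly equal to the threshold is not selected, which follows the "greater than" wording of the step.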
Fig. 4 schematically illustrates the application scenario of the method, whose purpose is intelligent matching between recruitment positions and job seekers. Decisions are made by an artificial intelligence algorithm, so that the method ultimately supports recruitment decisions within a recruitment platform.
As shown in fig. 3 and 5, in specific use, the post characteristics and the job seeker characteristics are obtained, and then final decision is realized through the first model, namely the machine learning model, so that the intelligent recruitment problem is simplified into a classification problem (such as binary classification) in machine learning, the screening and matching process in the prior art is optimized, and the matching accuracy is improved.
The pre-trained first model is trained on a training data set and a test data set together with the calculation strategy of each dimension; the training data set and the test data set each contain many kinds of matching and non-matching pairs of recruitment work and job hunting resume. Fig. 7 and fig. 8 illustrate part of the process of acquiring the training data set and the test data set. In particular, labels must be added to the resumes in the training data set and the test data set; these may be implicit labels or explicit labels, as shown in fig. 9.
The first model trained in this embodiment may employ a logistic regression algorithm that includes the aforementioned 6 different condition scores (job title score, job skill score, job unit score, job site score, job experience score, and salary score), and finally obtains a final matching score. In a specific application, the parameters of the model are obtained by training the collected data, as shown in fig. 10.
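A logistic regression fusion of this kind can be sketched as a weighted sum of the six condition scores passed through a sigmoid. The weights and bias below are hypothetical placeholders chosen for illustration; the trained values of the embodiment are not reproduced in this text, and the dictionary keys are names invented for the example.

```python
import math

DIMENSIONS = ("job_title", "job_skill", "job_unit",
              "job_site", "experience", "salary")

# Hypothetical parameters; in practice they would be learned from labeled data.
WEIGHTS = {"job_title": 1.5, "job_skill": 1.2, "job_unit": 0.6,
           "job_site": 0.8, "experience": 1.0, "salary": 0.9}
BIAS = -3.0

def fuse(condition_scores, weights=WEIGHTS, bias=BIAS):
    """Combine six condition scores into one matching score in (0, 1)."""
    z = bias + sum(weights[d] * condition_scores[d] for d in DIMENSIONS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic (sigmoid) function

strong = fuse({d: 1.0 for d in DIMENSIONS})  # every dimension matches well
weak = fuse({d: 0.0 for d in DIMENSIONS})    # no dimension matches
```

Because the sigmoid maps any real number into the open interval (0, 1), the output can be compared directly with a preset threshold.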
Typically, a batch of records (around 3000) may be collected, each labeled with one of two outcomes: "1" indicates that the user's resume meets the target work requirement and "0" indicates that it does not, as shown in fig. 9. All data are divided into training data and test data in proportions of 2/3 and 1/3, respectively.
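The 2/3 and 1/3 division can be sketched as below. Shuffling before the split, the fixed seed, and the record layout are assumptions made for this example; the text only states the proportions.

```python
import random

def split_dataset(records, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of the records and cut it into training and test parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical labeled records: (record id, label), where label 1 means the
# resume meets the target work requirement and 0 means it does not.
records = [(i, i % 2) for i in range(3000)]
train, test = split_dataset(records)
```

With 3000 records this yields 2000 training records and 1000 test records.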
Before model training, the training data must of course be labeled. Training data can generally be divided into two categories according to the kind of feedback: explicit feedback and implicit feedback. As shown in fig. 9, explicit feedback labels come mainly from scores given by users, whereas implicit feedback is extracted from user behavior, such as being interviewed, being hired, or being finally hired.
Table 2 below lists the features in the training data set and the test data set together with their descriptions. It is the table of data features used to train the first model, and represents the parameters of each dimension's calculation formula obtained from the data used in that training. Table 2:
[Table 2]
In an alternative implementation, acquiring the first characteristic information of each dimension in the description of the recruitment work may include acquiring information for some of the following dimensions:
acquiring first characteristic information of job titles in the description of the recruitment work; acquiring first characteristic information of the work skills in the description of the recruitment work; acquiring first characteristic information of a work unit in the description of the recruitment work; acquiring first characteristic information of a work place in the description of the recruitment work; acquiring first characteristic information of experience years in the description of the recruitment work; first characteristic information of salary level in the description of the recruitment work is obtained.
Accordingly, obtaining the second feature information of each dimension in the description of each job resume may include some information of the following dimensions:
acquiring second characteristic information of the job title in the description of the job hunting resume; acquiring second characteristic information of the work skills in the description of the job hunting resume; acquiring second characteristic information of a working unit in the description of the job hunting resume; acquiring second characteristic information of a work place in the description of the job hunting resume; acquiring second characteristic information of work experience and/or age limit in the description of the job hunting resume; and acquiring second characteristic information of salary level in the description of the resume of job hunting.
In the embodiment, resume and position matching analysis is performed through dimensions such as job title scores, work skills, work units, work places, experience years, salary levels and the like, accurate matching of the resume and the recruitment information is further achieved, the process of manually selecting the resume is omitted, meanwhile, the most appropriate talents are screened out for an enterprise, any talent is not omitted, recruitment information meeting the requirements of a user is provided for the user, and bidirectional personalized service is achieved.
The resume screening method of the embodiment is used for mass resumes, and recommends personalized information with higher matching degree for enterprises and users based on a recommendation algorithm of accurate matching, as shown in fig. 10.
In this embodiment, fig. 6 shows the flow of the whole method: the set of features selected in fig. 10 (job title, job skill, job experience, job site, etc.) is determined by an unsupervised learning method such as the FastText algorithm together with evaluation methods such as "pearson" and "correlation/AUC". In fig. 6, feature engineering refers to the process of converting raw data into training data for the model; its purpose is to obtain better training features so that the machine learning model can approach its performance upper limit. Unsupervised learning refers to extracting label features from information about positions and job seekers. Feature evaluation refers to evaluating each dimension of the features so that the relatively more important features are chosen during feature selection. Supervised learning is a machine learning task that infers a function from labeled training data.
Fig. 7 shows how the post features or job seeker features of fig. 4 are extracted comprehensively and accurately and how irregular information is removed, so that an n-gram dictionary can be built for the subsequent feature similarity calculation.
The process of acquiring the training data set and the specific contents of the training data set are shown in fig. 8. Fig. 9 shows some of the labels added to the training samples, indicating that training can be performed on these data. If training is performed with the FastText model, then given a word w, the set of n-grams appearing in w is recorded, and a vector representation z_g is associated with each n-gram g. A word is then represented by the sum of the vector representations of its n-grams.
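Written as an equation, the subword representation described above takes the following form. The symbol for the n-gram set is not reproduced in this text, so the name $\mathcal{G}_w$ is an assumption; $\mathbf{v}_w$ is likewise a name introduced here for the resulting word vector.

```latex
\mathbf{v}_w \;=\; \sum_{g \in \mathcal{G}_w} \mathbf{z}_g
```

For example, with n = 3 and the boundary markers < and > that FastText adds around a word, the word "where" yields the n-grams <wh, whe, her, ere, and re>, and its vector is the sum of the five corresponding vectors.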
To better understand the information in the method steps shown in fig. 1, the following provides a calculation formula and a description of each dimension.
The various formulas and models in this embodiment are intended to predict whether a resume "matches" the target position. The result of the model is a floating point number between 0-1, with a value close to 1, indicating that the resume "matches" the target position; the value is close to 0, indicating that the resume "does not match" the target position.
The pre-trained first model of the present embodiment may contain 6 different dimensions, i.e., conditional scores for computing these six dimensions, the score for each dimension representing a different aspect of the resume matching the target position, such as job title, job skill, job unit, job site, year of experience, and payroll level. Each condition score is finally combined to obtain a final matching score.
That is, the following description will specifically be made with respect to step S3 in fig. 1.
First, job title score
For the job title dimension, a first coefficient value corresponding to the first characteristic information and a second coefficient value corresponding to the second characteristic information are acquired according to a mapping table between job titles and coefficient values; the mapping table contains the correspondence among job titles, coefficient values, and job end times.
The condition score of the job title is calculated from the first coefficient value and the second coefficient value using the following formula (1):
[formula (1)]
In formula (1), one symbol denotes the set of all job titles in the work experience of the job hunting resume, s is each job title in that work experience, and another symbol denotes the job title of the target position.
The function model calculates the similarity between two work experience strings using a pre-trained FastText model.
The coefficient term assigns a different coefficient to each job title in the work experience of the job hunting resume: the title of the most recent work experience receives a larger coefficient value, and titles of more distant work experience receive smaller ones.
f is the attenuation factor (set to 1 in this model), and the time term is the interval (in years) from the end of each job to the current time.
The model also considers the headline job title of the job hunting resume: within the set of all job titles in the resume, the headline job title is treated as a separate job title, with the current time taken as its job end time.
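A time-weighted job title score of this kind can be sketched as follows. Several parts are assumptions, because the formulas themselves are not reproduced in this text: the exponential form of the decay, the use of the maximum over all titles as the aggregate, and the reading that 0.5 is returned when no valid title is available. The token-overlap similarity is only a stand-in for the pre-trained FastText similarity model.

```python
import math

def time_coefficient(years_since_end, f=1.0):
    """ASSUMED decay form exp(-f * t): recent jobs weigh more than old ones."""
    return math.exp(-f * max(years_since_end, 0.0))

def token_jaccard(a, b):
    """Stand-in similarity (NOT FastText): overlap of lower-cased tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0

def job_title_score(resume_titles, target_title, similarity,
                    f=1.0, default=0.5):
    """resume_titles is a list of (title, years since that job ended).
    The headline title is passed like any other title with 0.0 years."""
    valid = [(t, y) for t, y in resume_titles if t and t.strip()]
    if not valid or not target_title:
        return default  # ASSUMED reading of the "set to 0.5" rule
    # ASSUMED aggregate: best time-weighted similarity over all titles.
    return max(time_coefficient(y, f) * similarity(t, target_title)
               for t, y in valid)

# Hypothetical resume: headline title (ends "now") plus one job that ended 3 years ago.
titles = [("Data Engineer", 0.0), ("Software Engineer", 3.0)]
score = job_title_score(titles, "Senior Data Engineer", token_jaccard)
```

Here the headline title dominates the result because its coefficient is 1, while the older title is discounted by the decay.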
Second, work skill score
When the current dimension is the work skill dimension, the condition score of the work skill is calculated from the first characteristic information and the second characteristic information using the following formula (3):
[formula (3)]
In formula (3), the function model calculates the similarity between two work skill strings using a pre-trained FastText model.
Third, work unit score
When the current dimension is the work unit dimension, the condition score of the work unit is calculated from the first characteristic information and the second characteristic information using the following formula (4):
[formula (4)]
In formula (4), the function model calculates the similarity between two work unit name strings using a pre-trained FastText model.
Fourth, job site score
When the current dimension is a work place dimension, acquiring first characteristic information of the work place in the description of the recruitment work and acquiring second characteristic information of the work place in the description of the job hunting resume; the method comprises the following steps:
based on character strings in the recruitment work and job hunting resume, adopting a trained FastText model to carry out standardization processing on non-standardized position names, and acquiring a set of all work places in the job hunting resume and information of the work places in the recruitment work;
correspondingly, analyzing the first characteristic information and the second characteristic information by adopting a calculation strategy of the dimension in a pre-trained first model to obtain a condition score of the dimension, wherein the method comprises the following steps:
acquiring a first coefficient value corresponding to the first characteristic information and a second coefficient value corresponding to the second characteristic information according to the mapping relation table of each working place and the coefficient value; the mapping relation table comprises corresponding relations of work places, coefficient values and work cut-off time points;
calculating the first coefficient value and the second coefficient value by using the following formula (5) to obtain the condition score of the work place:
The work place score consists of two parts: the FastText score and the distance score.
FastText score:
The formula ranges over the set of all work place (city) names in the work experience of the job hunting resume, where l denotes each work place in that set, and compares each of them with the work place (city) name of the target position.
The function model computes the similarity between two work place name strings using a pre-trained FastText model.
Each work place name in the work experience of the job hunting resume has its own coefficient: the work place of a more recent work experience has a larger coefficient, and the work place of an older work experience has a smaller one.
f is the attenuation factor (set to 1 in this model), and the time interval (in years) is measured from the end of each job to the current time.
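The recency weighting described above can be sketched as follows. Formula (5) itself did not survive in this text, so the sketch rests on two assumptions that the source does not confirm: each coefficient is taken to be exp(-f·Δt), and the weighted similarities are combined by taking their maximum. The names `recency_weight` and `fasttext_place_score` are hypothetical, and the `similarity` argument stands in for the pre-trained FastText model.

```python
import math
from typing import Callable, Iterable, Tuple


def recency_weight(years_since_end: float, f: float = 1.0) -> float:
    """ASSUMED coefficient: exponential decay in the years since the job ended."""
    return math.exp(-f * max(0.0, years_since_end))


def fasttext_place_score(
    places: Iterable[Tuple[str, float]],  # (work place name, years since the job ended)
    target_place: str,
    similarity: Callable[[str, str], float],  # stand-in for the FastText similarity
    f: float = 1.0,
) -> float:
    """ASSUMED aggregation: the best recency-weighted similarity over all work places."""
    weighted = [
        similarity(name, target_place) * recency_weight(dt, f)
        for name, dt in places
    ]
    # An empty history yields 0.0 here; the 0.5 default is applied to the final score.
    return max(weighted, default=0.0)
```

With f = 1, a matching work place that ended two years ago contributes exp(-2), roughly 0.14, under these assumptions, while a current matching work place contributes 1.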
In this embodiment, location names need to be standardized, so the standardization process for work places is described in detail below.
Because FastText takes into account both word co-occurrence and the substrings within words, it can handle non-standardized location data easily. However, non-standardized location names also cause a problem: the quality of the FastText model can vary unpredictably when the location format changes.
Thus, embodiments of the present invention provide a new FastText model (referred to as the new model) to standardize all location names.
To train the new model, training data is collected from the work experience in user profiles; specifically, the location name associated with each work experience is used. The data set is then refined into one location name array per user profile, which serves as input to the FastText model. Each entry of the array consists of a city name, a province name and a country/region name, in the following format: city name | province name | country name. An example of such an array for one particular user profile is as follows:
Input data for training the new model. Some of the hyperparameters selected for training the model are:
epoch = 10: training is iterated 10 times;
min_count = 10: all location names with a frequency of at least 10 are included in the vocabulary of the new FastText model;
window = 2: window size of the context words in the sentence (2 words before the current word and 2 words after the current word);
min_n = 3: a substring must contain at least 3 characters to be assigned to a bucket of embedding vectors;
size = 50: the dimension of the embedding vector is 50;
bucket = 100000: the number of buckets. Each bucket has its own embedding vector, and the hash value of a substring determines which bucket it falls into.
With min_count = 10, the current vocabulary contains 13722 location names.
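As an illustration of the data preparation described above, the following sketch builds one token per location in the `city|province|country` format, groups the tokens into one sequence per user profile, and applies the `min_count` cut-off. The function names and the `HYPERPARAMS` dictionary are hypothetical, and no FastText library is called; the sequences would be handed to whichever FastText implementation is used.

```python
from collections import Counter
from typing import Dict, List, Tuple

Location = Tuple[str, str, str]  # (city name, province name, country name)

# Hyperparameters as listed in the text, gathered in one place for reference.
HYPERPARAMS = {"epoch": 10, "min_count": 10, "window": 2,
               "min_n": 3, "size": 50, "bucket": 100000}


def to_token(loc: Location) -> str:
    """Join the three parts into a single 'city|province|country' token."""
    return "|".join(part.strip() for part in loc)


def build_sequences(profiles: List[List[Location]]) -> List[List[str]]:
    """One training sequence per user profile: its work locations in order."""
    return [[to_token(loc) for loc in profile] for profile in profiles]


def build_vocab(sequences: List[List[str]], min_count: int) -> Dict[str, int]:
    """Keep only the tokens that occur at least min_count times."""
    counts = Counter(tok for seq in sequences for tok in seq)
    return {tok: n for tok, n in counts.items() if n >= min_count}
```

Tokens that fall below the cut-off are not lost entirely: they can still be represented through their substrings, which is the out-of-vocabulary path the text describes.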
The new model yields the following results:
each word in the vocabulary (city name | province name | country name) has its own vector representation;
each bucket has its own embedding vector (multiple substrings with the same hash value share the same bucket).
Given two words (two locations), the new model computes their similarity as the cosine similarity between their vector representations. If a word is in the vocabulary, it is represented by its embedding vector. If a word is out of vocabulary (OOV), it is represented by a set of embedding vectors, one for each of its substrings. The FastText model can therefore handle the OOV problem well by using substrings.
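The out-of-vocabulary path can be sketched as follows: split the word into character n-grams, hash each n-gram into a bucket, average the bucket vectors, and compare two words by cosine similarity. This is a simplified illustration, not the trained model. The bucket vectors below are seeded random numbers standing in for learned embeddings, `MAX_N = 6` is an assumption (the text only gives `min_n = 3`), and the FNV-1a hash is used as a plausible stand-in for the library's own hashing.

```python
import math
import random
from typing import List

DIM, BUCKETS, MIN_N, MAX_N = 50, 100000, 3, 6  # MAX_N is an assumption


def char_ngrams(word: str, min_n: int = MIN_N, max_n: int = MAX_N) -> List[str]:
    """Character n-grams of the word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    return [wrapped[i:i + n]
            for n in range(min_n, max_n + 1)
            for i in range(len(wrapped) - n + 1)]


def fnv1a(text: str) -> int:
    """32-bit FNV-1a hash of the UTF-8 bytes."""
    h = 2166136261
    for byte in text.encode("utf-8"):
        h = ((h ^ byte) * 16777619) & 0xFFFFFFFF
    return h


def bucket_vector(bucket: int, dim: int = DIM) -> List[float]:
    """Placeholder for a learned bucket embedding: deterministic random values."""
    rng = random.Random(bucket)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def oov_vector(word: str) -> List[float]:
    """Average of the bucket vectors of all n-grams of the word."""
    vecs = [bucket_vector(fnv1a(g) % BUCKETS) for g in char_ngrams(word)]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0
```

Because two spellings of the same place share most of their n-grams, they share most of their bucket vectors, which is why substring-based representations tolerate formatting differences.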
The new model also considers the headline work place of the job hunting resume. Among all the work places (city names) in the resume, the headline work place is treated as an independent work place whose end time is the current time.
Distance score: this score can be interpreted as the feasibility of commuting between the user's location and the work location. If the distance is within 50 kilometers, the distance score is 1; if the distance exceeds 50 kilometers, the portion beyond 50 kilometers is handled by a penalty factor. The distance between two locations is their geographical (Haversine) distance. The haversine formula computes the great-circle distance between two points on a sphere from their longitudes and latitudes and is important in navigation; it is a special case of the law of haversines in spherical trigonometry, which relates the sides and angles of a spherical triangle. The specific formula for the distance score is as follows:
d is the distance between the most recent work place in the work experience of the user profile and the target work place. It is calculated with the Haversine formula, taking the latitude and longitude of the two locations as inputs.
The final location score is calculated as the average of the FastText score and the distance score:
If the work location is invalid or the user's set of previous work locations is empty, the location score is set to a default value of 0.5.
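A minimal sketch of the distance part and of the final averaging is given below. The haversine function is the standard great-circle formula. The penalty applied beyond 50 km is not recoverable from this text, so `distance_score` uses an exponential penalty with a hypothetical rate `penalty_per_km` purely as a placeholder; the function names and the use of `None` to signal an invalid input are also assumptions.

```python
import math
from typing import Optional

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def distance_score(d_km: float, free_km: float = 50.0, penalty_per_km: float = 0.01) -> float:
    """1.0 within free_km; beyond it, an ASSUMED exponential penalty on the excess."""
    if d_km <= free_km:
        return 1.0
    return math.exp(-penalty_per_km * (d_km - free_km))


def location_score(fasttext_score: Optional[float], dist_score: Optional[float]) -> float:
    """Average of the two parts; 0.5 when either part is unavailable."""
    if fasttext_score is None or dist_score is None:
        return 0.5
    return (fasttext_score + dist_score) / 2.0
```

For example, `haversine_km(39.9042, 116.4074, 31.2304, 121.4737)` gives the Beijing to Shanghai distance, on the order of a thousand kilometers, which lies far outside the 50 km commuting radius.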
Fifth, work experience score
When the current dimension is the work experience dimension, the first characteristic information and the second characteristic information are calculated using the following formula (8) to obtain the condition score of the work experience:
The formula of the work experience score is:
In the first case of the above formula, the user does not satisfy the work experience requirement of the target position, or exceeds it by less than one year. In this case, an attenuation function is used. Thus, when the user's years of work experience fall short of the requirement of the target job, the work experience score decays exponentially with the gap in years.
In the second case, the user fully meets the work experience requirement of the target position: the candidate has at least one year more work experience than the target position requires. In this case, the work experience score is set to 1.
If the user has no work experience or the job title is invalid, the work experience score is set to a default value of 0.5.
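The two cases can be sketched as below. The exact attenuation function is missing from this text, so the sketch assumes exp(-(required + 1 - user)), chosen only because it reaches exactly 1 at the boundary between the two cases; the function name, the `margin` parameter, and the use of `None` for missing experience are hypothetical.

```python
import math
from typing import Optional


def experience_score(user_years: Optional[float],
                     required_years: float,
                     margin: float = 1.0) -> float:
    """Work experience score under an ASSUMED exponential attenuation."""
    if user_years is None:
        return 0.5  # default when experience is missing or the title is invalid
    surplus = user_years - required_years
    if surplus >= margin:
        return 1.0  # second case: requirement exceeded by at least one year
    # First case: the score shrinks as the shortfall to (required + margin) grows.
    return math.exp(-(margin - surplus))
```

Under this assumption a candidate with exactly the required years scores exp(-1), about 0.37, and each further missing year multiplies the score by the same factor.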
Sixth, years of work experience score
6-1) Relative years of work experience of the user:
The relative years of work experience of the user can be expressed as:
One function model computes the length of time the user worked under job title s in the user's work experience.
Another function model computes the relevance between each job title and the job title of the target job:
In the above formula, the years of a work experience are counted only if its job title is correlated with the target job title (in this example, the correlation threshold is set to 0.65).
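The thresholded sum described above can be sketched as follows. The `relevance` argument stands in for the function model that compares job titles, and whether the comparison with 0.65 is strict or inclusive is not stated in the text, so the inclusive form here is an assumption; the function name is hypothetical.

```python
from typing import Callable, Iterable, Tuple


def relative_experience_years(
    experiences: Iterable[Tuple[str, float]],  # (job title, years worked under it)
    target_title: str,
    relevance: Callable[[str, str], float],  # stand-in for the title relevance model
    threshold: float = 0.65,
) -> float:
    """Sum the years of only those jobs whose title is relevant to the target title."""
    return sum(
        years
        for title, years in experiences
        if relevance(title, target_title) >= threshold
    )
```

A candidate with three years as a data engineer and four years as a chef would therefore be credited with three relevant years when applying for a data engineering position, provided the relevance model scores the chef title below the threshold.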
6-2) Years of work experience required by the target job
The years of work experience required by the target job can be expressed as:
The required years of work experience are calculated from the seniority information of the job title (e.g., senior, manager, director, etc.). The mapping between the seniority information of a job title and the years of work experience is given in Table 3 below. Table 3:
Seventh, salary level score
When the current dimension is the salary level dimension, the first characteristic information and the second characteristic information are calculated using the following formula (12) to obtain the condition score of the salary level:
The salary score uses the salary estimates produced by the hierarchical model.
Introduction to the hierarchical model: the hierarchical model uses a training data set to build a salary dictionary (hash table) and performs a hierarchical search at test time. The layers are searched in the following order: (work unit name + job title → job title + normalized job title + seniority information → job title → normalized job title + seniority information → normalized job title).
That is, the dictionary is first searched for an exact match on (work unit name + job title); if enough samples are found in that layer, the median salary estimate is output; otherwise, the search falls back to the next layer (job title) and repeats the same lookup.
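The layered lookup can be sketched with a plain dictionary. The sketch below uses only the two layers named in the preceding sentence, (work unit name + job title) and then (job title); the field names `company` and `title`, the `min_samples` threshold that defines "enough samples", and the function names are all hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional, Sequence, Tuple

Key = Tuple[str, ...]
LAYERS: Sequence[Tuple[str, ...]] = (("company", "title"), ("title",))


def layer_key(record: Dict[str, object], fields: Tuple[str, ...]) -> Key:
    """Dictionary key for one layer: the layer name followed by the field values."""
    return (",".join(fields),) + tuple(str(record[f]) for f in fields)


def build_salary_dict(samples: Sequence[Dict[str, object]]) -> Dict[Key, List[float]]:
    """Index every training sample under the key of every layer."""
    table: Dict[Key, List[float]] = {}
    for sample in samples:
        for fields in LAYERS:
            table.setdefault(layer_key(sample, fields), []).append(float(sample["salary"]))
    return table


def estimate_salary(query: Dict[str, object],
                    table: Dict[Key, List[float]],
                    min_samples: int = 3) -> Optional[float]:
    """Return the median of the first layer holding enough samples, else None."""
    for fields in LAYERS:
        values = table.get(layer_key(query, fields), [])
        if len(values) >= min_samples:
            return median(values)
    return None
```

Returning `None` when no layer has enough samples leaves it to the caller to apply the default score for an invalid salary estimate.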
The salary matching score for each (target job, user) pair is computed as follows:
salary value of the user: the predicted median of the user's current salary;
salary value of the target job: the predicted median of the salary of the target job;
Salary score algorithm:
If the salary value of the user is not defined or the salary value of the target job is not defined:
    salary_score returns 0.5 directly
Else:
    salary_score is defined as the salary value of the user / the salary value of the target job
    If salary_score is greater than 0.8 and less than 1.2:
        salary_score returns 1.0
    Else if salary_score is less than 0.8:
        salary_score returns the greater of 0.0 and 2.0 - 4^(0.8 - salary_score)
    Else:
        salary_score returns the greater of 0.0 and 2.0-Salary_score-1.2.
In addition, the salary dictionary (hash table) is searched with keys in the specified order to find the median salary estimates of the user and of the job:
Specifically, when the salary estimates of the target job and the user differ by no more than 20%, the match is considered perfect and the score is 1.0. When the user's salary is more than 20% higher or more than 20% lower than the job's salary, the salary matching score decreases smoothly. If the estimated salary of the position or the estimated salary of the user is invalid, the salary score is set to a default value of 0.5.
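The salary rule can be sketched in Python as follows. The expressions for the two out-of-range branches are garbled in this text, so the sketch makes two labeled assumptions: the lower branch is read as 2.0 - 4^(0.8 - ratio), and the upper branch is taken to be its mirror image 2.0 - 4^(ratio - 1.2). Both give exactly 1.0 at the edge of the 20% band, which matches the smooth decrease described above. The band is treated as inclusive, following "no more than 20%". The function name and the `base` parameter are hypothetical.

```python
from typing import Optional


def salary_score(user_salary: Optional[float],
                 job_salary: Optional[float],
                 base: float = 4.0) -> float:
    """Salary matching score; the branch formulas are ASSUMED reconstructions."""
    if not user_salary or not job_salary:
        return 0.5  # undefined (None) or zero estimates fall back to the default
    ratio = user_salary / job_salary
    if 0.8 <= ratio <= 1.2:
        return 1.0  # within 20% of the job's salary: perfect match
    if ratio < 0.8:
        return max(0.0, 2.0 - base ** (0.8 - ratio))
    return max(0.0, 2.0 - base ** (ratio - 1.2))
```

With base 4, the score reaches 0 once the ratio is 0.5 beyond either edge of the band, that is at ratios of about 0.3 and 1.7.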
According to another aspect of the present invention, an electronic device is further provided, which includes a memory and a processor; the memory stores a computer program, and the processor executes the computer program stored in the memory to perform the steps of the resume screening method according to any of the embodiments of the first aspect.
The resume screening method in this embodiment may be implemented as a program stored on a computer storage medium and executed on a computer, and can be applied to various scenarios such as intelligent recruitment and the internet; this embodiment places no limitation on this.
In addition, in practical applications, an embodiment of the present invention further provides a matching system for resume matching, as shown in fig. 11, which may include a computer program algorithm branch, a data acquisition branch, and a scenario application branch. In a specific application scenario, the system can be applied to any intelligent recruitment platform, and job hunting resume information and recruitment post information can be acquired in a targeted manner during data acquisition; in the algorithm branch, training can be performed in advance based on the training data and the knowledge graph, and both the training data and the data finally used are divided into features along six or more dimensions.
The matching system of this embodiment can handle massive numbers of resumes and, based on an accurate matching recommendation algorithm, recommends personalized information with a higher matching degree to enterprises and users.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, the claims should be construed to include preferred embodiments and all changes and modifications that fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention should also include such modifications and variations.
Claims (9)
1. A resume screening method is characterized by comprising the following steps:
a1, acquiring first characteristic information of each dimension in the description of the recruitment work and second characteristic information of each dimension in the description of all job resumes corresponding to the recruitment work aiming at each recruitment work; the job-seeking resume is a selected resume after at least one screening; the dimensions include one or more of the following: job title, job skill, job unit, job site, years of experience, and salary level; the dimensions of the recruitment work and the job hunting resume are consistent;
a2, acquiring condition scores of all dimensions based on first feature information and second feature information under the same dimension, and fusing the condition scores of all dimensions to acquire matching scores;
when the dimension is a job title, acquiring a first coefficient value corresponding to the first characteristic information and a second coefficient value corresponding to the second characteristic information according to the mapping relation table of each job title and the coefficient value; calculating the first coefficient value and the second coefficient value by adopting a formula (1) to obtain the condition score of the job title:
wherein the formula ranges over the set of all job titles in the work experience of the job hunting resume; s is each job title in the work experience of the job hunting resume; the job title in the recruitment work is the comparison target; a pre-trained function model calculates the similarity between two work experience strings; and each job title in the work experience of the job hunting resume has its own coefficient;
and A3, selecting the job resume to which the matching score larger than the preset threshold value belongs as the screened resume matching the recruitment.
2. The method of claim 1,
the mapping relation table comprises corresponding relations of job titles, coefficient values and job cut-off time points;
in formula (2), an attenuation factor is applied to the time interval from the end of each job to the current time; in the set of all job titles of the job hunting resume, the headline job title is treated as an independent job title, with the current time as its end time; if the job title is invalid or empty, the condition score of the job title is set to 0.5.
3. The method of claim 1, wherein the A2, based on the first feature information and the second feature information in the same dimension, obtaining the condition score of each dimension comprises:
when the dimensionality is the work skill, calculating the first characteristic information and the second characteristic information by adopting the following formula (3) to obtain a condition score of the work skill:
4. The method of claim 1, wherein the A2, based on the first feature information and the second feature information in the same dimension, obtaining the condition score of each dimension comprises:
when the dimension is a working unit, calculating the first characteristic information and the second characteristic information by adopting the following formula (4) to obtain the condition score of the working unit:
the function model is used for calculating the similarity between the character strings of the names of the two working units;
5. The method of claim 1,
when the dimension is a work place, acquiring first characteristic information of the work place in the description of the recruitment work and acquiring second characteristic information of the work place in the description of the job hunting resume; the method comprises the following steps:
based on character strings in the recruitment work and job hunting resume, adopting a trained FastText model to carry out standardization processing on non-standardized position names, and acquiring a set of all work places in the job hunting resume and information of the work places in the recruitment work;
correspondingly, a2, obtaining a condition score of each dimension based on the first feature information and the second feature information in the same dimension, includes:
acquiring a first coefficient value corresponding to the first characteristic information and a second coefficient value corresponding to the second characteristic information according to the mapping relation table of each working place and the coefficient value; the mapping relation table comprises corresponding relations of work places, coefficient values and work cut-off time points;
calculating the first coefficient value and the second coefficient value using the following formula (5) to obtain the condition score of the work place:
wherein the formula ranges over the set of the names of all work places, namely cities, in the work experience of the job hunting resume;
the function model is used for calculating the similarity between the character strings of the two work place names;
6. The method of claim 1,
a2, acquiring condition scores of all dimensions based on the first characteristic information and the second characteristic information under the same dimension, wherein the condition scores comprise:
when the dimensionality is the working experience and the age limit, the following formula (7) is adopted to calculate the first characteristic information and the second characteristic information to obtain the condition scores of the working experience and the age limit:
in formula (7), the first case indicates that the user does not meet the work experience requirement of the target position, or exceeds it by less than one year; in this case, an attenuation function is used;
in the second case, the user fully meets the work experience requirement of the target position, and the work experience score is set to 1;
7. The method of claim 6,
before the first characteristic information and the second characteristic information are calculated using formula (8), the method further comprises: obtaining the degree of correlation between the job title s of each job and the job title of the target job;
counting the years of a work experience under formula (8) only when its job title is correlated with the target job title;
acquiring the relative years of work experience of the user in the job hunting resume according to formula (8);
wherein one function model calculates the length of time worked under the job title s in the work experience of the job hunting resume;
another function model calculates the degree of correlation between the job title s of each job and the job title of the target job;
the degree of correlation with the target job title is acquired based on formula (9);
the years of work experience required by the target job are calculated according to formula (10),
wherein the required years of work experience are calculated from the seniority information of the job title; the mapping between the seniority information of a job title and the years of work experience is obtained according to a preset rule;
8. The method of claim 1,
a2, acquiring condition scores of all dimensions based on the first characteristic information and the second characteristic information under the same dimension, wherein the condition scores comprise:
when the dimension is the salary level, the following formula (10) is adopted to calculate the first characteristic information and the second characteristic information to obtain the condition score of the salary level:
salary value of the user: the predicted median of the user's current salary;
salary value of the target job: the predicted median of the salary of the target job.
9. The method according to any one of claims 1 to 8, wherein fusing the condition scores of all dimensions of the same job hunting resume to obtain the matching score comprises:
and adopting a logistic regression algorithm to perform fusion processing on the condition scores of the six dimensions to obtain matching scores.
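The fusion and screening steps can be illustrated with a minimal sketch, assuming the six condition scores are combined by a standard logistic regression (a weighted sum passed through the sigmoid function). The weights, bias and threshold would come from training and configuration that this text does not specify, so they are parameters here; the dimension keys and function names are hypothetical.

```python
import math
from typing import Dict, List

DIMENSIONS = ("job_title", "skill", "work_unit", "work_place", "experience_years", "salary")


def match_score(scores: Dict[str, float], weights: Dict[str, float], bias: float) -> float:
    """Logistic regression: sigmoid of the weighted sum of the condition scores."""
    z = bias + sum(weights[d] * scores[d] for d in DIMENSIONS)
    return 1.0 / (1.0 + math.exp(-z))


def screen(resumes: Dict[str, Dict[str, float]],
           weights: Dict[str, float],
           bias: float,
           threshold: float) -> List[str]:
    """Keep the resumes whose matching score is larger than the preset threshold."""
    return [rid for rid, scores in resumes.items()
            if match_score(scores, weights, bias) > threshold]
```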
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111126096.2A CN113570348A (en) | 2021-09-26 | 2021-09-26 | Resume screening method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113570348A true CN113570348A (en) | 2021-10-29 |
Family
ID=78174496
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111126096.2A Pending CN113570348A (en) | 2021-09-26 | 2021-09-26 | Resume screening method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113570348A (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110267651A1 (en) * | 2010-04-30 | 2011-11-03 | Canon Kabushiki Kaisha | Information processing apparatus, information processing method and storage medium |
US20120123955A1 (en) * | 2010-11-12 | 2012-05-17 | Chen Ke Kelly | Calculation engine for compensation planning |
US20170177708A1 (en) * | 2015-12-17 | 2017-06-22 | Linkedin Corporation | Term weight optimization for content-based recommender systems |
CN107590133A (en) * | 2017-10-24 | 2018-01-16 | 武汉理工大学 | The method and system that position vacant based on semanteme matches with job seeker resume |
CN107729532A (en) * | 2017-10-30 | 2018-02-23 | 北京拉勾科技有限公司 | A kind of resume matching process and computing device |
US20180197146A1 (en) * | 2017-01-06 | 2018-07-12 | International Business Machines Corporation | System, method and computer program product for remarketing an advertised resume within groups |
CN109492164A (en) * | 2018-11-26 | 2019-03-19 | 北京网聘咨询有限公司 | A kind of recommended method of resume, device, electronic equipment and storage medium |
CN109558429A (en) * | 2018-11-16 | 2019-04-02 | 广东百城人才网络股份有限公司 | The two-way recommendation method and system of talent service based on internet big data |
CN110032681A (en) * | 2019-04-17 | 2019-07-19 | 北京网聘咨询有限公司 | Position recommended method based on resume content |
CN111078971A (en) * | 2019-11-19 | 2020-04-28 | 平安金融管理学院(中国·深圳) | Resume file screening method and device, terminal and storage medium |
CN111125343A (en) * | 2019-12-17 | 2020-05-08 | 领猎网络科技(上海)有限公司 | Text analysis method and device suitable for human-sentry matching recommendation system |
CN111833019A (en) * | 2020-07-15 | 2020-10-27 | 北京亮马手信息咨询有限公司 | Cloud customization recruitment method and system |
CN112100999A (en) * | 2020-09-11 | 2020-12-18 | 河北冀联人力资源服务集团有限公司 | Resume text similarity matching method and system |
CN113298642A (en) * | 2021-05-26 | 2021-08-24 | 上海晓途网络科技有限公司 | Order detection method and device, electronic equipment and storage medium |
Non-Patent Citations (4)
Title |
---|
KANYANUT HOMSAPAYA ET AL: "Machine Learning for Older Jobseeker and Employment Matching", 《2020 17TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY (ECTI-CON)》 * |
刘兴林等: "基于向量相似度的招聘就业双向推荐模型", 《中国科技信息》 * |
张潇艺: "招聘信息与简历智能匹配系统的研究与实现", 《中国优秀博硕士学位论文全文数据库(硕士)社会科学Ⅱ辑》 * |
徐锦阳等: "招聘网站职位与简历的双向匹配相似度算法", 《信息技术》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115795002A (en) * | 2022-10-18 | 2023-03-14 | 上海自然智动网络科技有限公司 | Intelligent interaction method and system |
CN115795002B (en) * | 2022-10-18 | 2023-11-03 | 上海自然智动网络科技有限公司 | Intelligent interaction method and system |
CN116452163A (en) * | 2023-02-14 | 2023-07-18 | 广东尊一互动科技有限公司 | Talent recruitment management system and method based on big data |
CN116452163B (en) * | 2023-02-14 | 2023-11-28 | 广东尊一互动科技有限公司 | Talent recruitment management system and method based on big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20211029 |