CN111353014B

CN111353014B - Position keyword extraction and position demand updating method and device

Info

Publication number: CN111353014B
Application number: CN201811563936.XA
Authority: CN
Inventors: 李越川; 林方全; 杨超; 张京桥; 杨程; 周涛明; 戈伟; 蒋澄宇; 吴超; 周恒�; 颜文龙; 夏宇; 张磊; 汪琳
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2018-12-20
Filing date: 2018-12-20
Publication date: 2023-05-02
Anticipated expiration: 2038-12-20
Also published as: CN111353014A

Abstract

The invention discloses a method and a device for extracting position keywords and updating position requirements. The method comprises the following steps: acquiring a job class keyword list according to pre-acquired job description history data; acquiring the keywords of each job description in the job description history data according to the job class keyword list to obtain a keyword set of the job description data set; obtaining a keyword set of a candidate resume data set and a keyword set of an interview record data set from the candidate resume data set and the interview record data set, respectively, in combination with the job class keyword list; and obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set. The invention solves the technical problem in the related art that, because the text from which keywords are extracted has limited content, keywords are easily missed, so that keyword extraction accuracy is low, position requirements are not updated in time, and recruitment efficiency and enterprise operation are affected.

Description

Position keyword extraction and position demand updating method and device

Technical Field

The invention relates to the recruitment field, in particular to a method and a device for extracting position keywords and updating position requirements.

Background

When recruiting, enterprises usually publish job descriptions for job seekers to consult and choose from. A job description typically contains at least two parts: work content and job requirements. A large enterprise may publish thousands of positions every year, spread across many departments and teams and covering many categories, such as technical, product, and operations. Recruiters often write job requirements from the common requirements of the job class, so the requirement texts that different departments and teams write for the same class of job are very similar.

However, each position has its own emphasis in what it requires of candidates. Some recruiters are not aware of this emphasis when writing the job description, and some do not express it accurately in words. In addition, new technologies for some technical positions spread quickly while the position requirements are not updated in time. As a result, job seekers and recommenders (employee referrals, headhunters, etc.) cannot obtain the position requirements fully and accurately, which in turn reduces recruitment efficiency.

In the related art, position information is analyzed according to a preset position information training library to obtain the position keywords of the position information, which can be realized through the following process: (1) performing word segmentation on the position information according to the preset position information training library to obtain a position word set; (2) looking up, in the preset position information training library, the weight and relevance corresponding to each word in the position word set; (3) generating a comprehensive result for each word in the position word set according to the weight and relevance found, and sorting the words in the position word set by comprehensive result from high to low; (4) determining a third preset number of words in the ranking as the position keywords of the position information. However, the related art has the problem that extraction of job keywords is limited to the job description (JD) text: if a word does not appear in the JD, it cannot be extracted as a keyword.

No effective solution has yet been proposed for the problem in the related art that, because the text from which keywords are extracted has limited content, keywords are easily missed, so that keyword extraction accuracy is low, position requirements are not updated in time, and recruitment efficiency and enterprise operation are affected.

Disclosure of Invention

The embodiment of the invention provides a method and a device for extracting position keywords and updating position requirements, which at least solve the technical problem in the related art that, because the text from which keywords are extracted has limited content, keywords are easily missed, so that keyword extraction accuracy is low, position requirements are not updated in time, and recruitment efficiency and enterprise operation are affected.

According to an aspect of the embodiment of the present invention, there is provided a method for extracting position keywords, including: acquiring a job keyword list according to the pre-acquired job description history data; acquiring keywords of each position description in position description historical data according to a position type keyword list to obtain a keyword set of a position description data set; respectively obtaining a keyword set of the candidate resume data set and a keyword set of the interview record data set according to the candidate resume data set and the interview record data set and combining the job keyword list; and obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set and the keyword set of the job description data set.

Optionally, according to the pre-acquired job description history data, acquiring the job keyword list includes: dividing the position description data in the position description history data into words to obtain position description words; counting the occurrence probability of each job descriptor in the job description history data to obtain a first numerical value; counting the occurrence probability of each job descriptor in the job description data under each job class to obtain a second value; calculating the importance value of each job descriptor under the job class according to the first numerical value and the second numerical value; and ordering the position descriptors of the importance values according to a preset sequence to obtain a position keyword list.

Further, optionally, the ranking of the job descriptors of the importance values according to a preset sequence, and obtaining the job keyword list includes: acquiring first N job descriptors with importance values larger than a preset threshold value; and obtaining a job keyword list according to the first N job descriptors.

Optionally, obtaining keywords of each job description in the job description history data according to the job type keyword list, and obtaining the keyword set of the job description data set includes: dividing each position description in the position description history data into words to obtain a position description word set; and acquiring an intersection set according to the job description word set and the job type keyword list to obtain a keyword set of the job description data set.

Optionally, obtaining the keyword set of the candidate resume data set and the keyword set of the interview record data set from the pre-acquired candidate resume data set and interview record data set, respectively, in combination with the job keyword list includes: in the case that the candidate resume data set comprises a candidate resume number and a resume text, and the interview record data set comprises a job description number, a candidate resume number, an interview result and an interview evaluation text, finding the job class to which the job description belongs according to the job description number; taking the intersection of the job keyword list and the word segmentation result of the resume text to obtain the keyword set of the resume data set; and taking the intersection of the job keyword list and the word segmentation result of the interview evaluation text to obtain the keyword set of the interview record data set.

Optionally, obtaining the target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set, includes: for any one job description, collecting the interview records of that job description; taking the union of the keyword set of the interview evaluation in each interview record and the keyword set of the corresponding candidate resume to obtain a first set; acquiring the keyword set under the job class corresponding to the job description; marking a keyword in that keyword set with a first identifier if it appears in the first set, and with a second identifier if it does not; obtaining a data vector of the interview record according to the first identifier and/or the second identifier; and calculating according to the data vector to obtain the target keywords.

Further, optionally, calculating according to the data vector to obtain the target keywords includes: calculating the Pearson correlation coefficient between each keyword vector and the interview result column vector in the interview records, and sorting the results to obtain a Pearson correlation coefficient set, wherein a keyword vector indicates whether the keyword appears in the resume or the interview record; determining whether there is a keyword whose Pearson correlation coefficient in the set is greater than a preset threshold and which is not in the keyword set of the job description data set; and if so, taking that keyword as a target keyword.

Optionally, calculating according to the data vector to obtain the target keywords includes: calculating the cosine similarity between each keyword vector and the interview result column vector in the interview records, wherein a keyword vector indicates whether the keyword appears in the resume or the interview record; and obtaining the target keywords according to the cosine similarity.

Optionally, the method further comprises: and acquiring recruitment requirements according to the target keywords.

According to another aspect of the embodiment of the present invention, there is also provided a method for post demand update, including: acquiring a job keyword list according to the pre-acquired job description history data; acquiring keywords of each position description in position description historical data according to a position type keyword list to obtain a keyword set of a position description data set; respectively obtaining a keyword set of the candidate resume data set and a keyword set of the interview record data set according to the candidate resume data set and the interview record data set and combining the job keyword list; obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set and the keyword set of the job description data set; and updating recruitment requirements according to the target keywords.

According to still another aspect of the embodiment of the present invention, there is further provided a position keyword extraction apparatus, including: the first acquisition module is used for acquiring a job keyword list according to the pre-acquired job description historical data; the second acquisition module is used for acquiring keywords of each position description in the position description history data according to the position type keyword list to obtain a keyword set of a position description data set; the first extraction module is used for obtaining a keyword set of the candidate resume data set and a keyword set of the interview record data set according to the candidate resume data set and the interview record data set respectively and combining the job keyword list; and the second extraction module is used for obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set and combining the keyword set of the job description data set.

Optionally, the first obtaining module includes: the word segmentation unit is used for segmenting the position description data in the position description history data to obtain position description words; the first statistics unit is used for counting the occurrence probability of each job descriptor in the job description historical data to obtain a first numerical value; the second statistics unit is used for counting the occurrence probability of each job descriptor in the job description data under each job class to obtain a second numerical value; the computing unit is used for computing the importance value of each job descriptor under the job class according to the first numerical value and the second numerical value; the acquisition unit is used for sorting the position descriptors of the importance values according to a preset sequence to obtain a position keyword list.

According to still another aspect of the embodiment of the present invention, there is also provided a position requirement updating apparatus, including: a first acquisition module, used for acquiring a job keyword list according to the pre-acquired job description history data; a second acquisition module, used for acquiring keywords of each job description in the job description history data according to the job class keyword list to obtain a keyword set of the job description data set; a first extraction module, used for obtaining a keyword set of the candidate resume data set and a keyword set of the interview record data set from the candidate resume data set and the interview record data set, respectively, in combination with the job keyword list; a second extraction module, used for obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set; and an updating module, used for updating the recruitment requirement according to the target keywords.

According to an aspect of another embodiment of the present invention, there is further provided a storage medium that includes a stored program, wherein, when the program runs, the device in which the storage medium is located is controlled to execute the above method for extracting position keywords or the above method for updating position requirements.

The embodiment of the invention considers the text of the JD while also using the interview evaluations and resume information related to the JD. A job keyword list is acquired according to pre-acquired job description history data; the keywords of each job description in the job description history data are acquired according to the job class keyword list to obtain a keyword set of the job description data set; a keyword set of the candidate resume data set and a keyword set of the interview record data set are obtained from the pre-acquired candidate resume data set and interview record data set, respectively, in combination with the job keyword list; and the target keywords are obtained according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set. This effectively ensures that the extracted keywords are relevant to the job description without requiring them to appear in it, which achieves the technical effect of improving keyword extraction accuracy. It thereby solves the technical problem in the related art that, because the text from which keywords are extracted has limited content, keywords are easily missed, so that keyword extraction accuracy is low, position requirements are not updated in time, and recruitment efficiency and enterprise operation are affected.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:

FIG. 1 is a hardware block diagram of a computer terminal of a method for extracting position keywords according to an embodiment of the present invention;

FIG. 2 is a flow chart of a method of job keyword extraction according to a first embodiment of the present invention;

FIG. 3 is a flow chart of a method of job keyword extraction according to a first embodiment of the present invention;

FIG. 4 is a block diagram of an apparatus for extracting position keywords according to a third embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The technical terms referred to in this application are:

JD: Job Description.

CV: Curriculum Vitae, i.e., candidate resume.

Word segmentation: in full, Chinese word segmentation, which refers to splitting a sequence of Chinese characters into individual words. Word segmentation is the process of recombining a continuous character sequence into a word sequence according to a certain specification. In English text, spaces serve as natural delimiters between words. In Chinese, characters, sentences and paragraphs can be delimited simply by explicit delimiters, but words have no formal delimiter. English also has the problem of dividing phrases, but at the word level Chinese is more complex and more difficult than English.

Pearson correlation coefficient: in statistics, the Pearson correlation coefficient, also called the Pearson product-moment correlation coefficient (abbreviated PPMCC or PCC), measures the linear correlation between two variables X and Y; its value lies between -1 and 1.

Cosine similarity: a measure that evaluates the similarity of two vectors by calculating the cosine of the angle between them.
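As an illustration of the cosine similarity defined above, the following is a minimal sketch applied to a 0/1 keyword column and a 0/1 interview result column. The function name `cosine_similarity`, the sample data, and the choice to return 0.0 for an all-zero vector are assumptions made for this example; the source does not specify them.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # The angle is undefined for a zero vector; returning 0.0 is an
        # assumption of this sketch, not something the source prescribes.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical data: one entry per interview record.
keyword_column = [1, 1, 0, 0]      # 1 if the keyword appears in the CV or evaluation
interview_results = [1, 1, 1, 0]   # 1 if the interview was passed
similarity = cosine_similarity(keyword_column, interview_results)
```

Unlike the Pearson coefficient, this measure does not center the vectors, so for 0/1 data it always falls between 0 and 1.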

Example 1

In accordance with an embodiment of the present invention, there is also provided a method embodiment of job keyword extraction, it being noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer executable instructions, and although a logical order is illustrated in the flowchart, in some cases, the steps illustrated or described may be performed in an order other than that illustrated herein.

The method embodiment provided in the first embodiment of the present application may be executed in a mobile terminal, a computer terminal or a similar computing device. Taking a computer terminal as an example, fig. 1 is a hardware block diagram of a computer terminal of a method for extracting position keywords according to an embodiment of the present invention. As shown in fig. 1, the computer terminal 10 may include one or more (only one is shown in the figure) processors 102 (the processors 102 may include, but are not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the electronic device described above. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.

The memory 104 may be used to store software programs and modules of application software, such as program instructions/modules corresponding to the method for extracting position keywords in the embodiment of the present invention, and the processor 102 executes the software programs and modules stored in the memory 104, thereby executing various functional applications and data processing, that is, implementing the method for extracting position keywords of the application program. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission module 106 is used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission module 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station so as to communicate with the internet. In one example, the transmission module 106 may be a Radio Frequency (RF) module used to communicate with the internet wirelessly.

In the above-mentioned operating environment, the present application provides a method for extracting position keywords as shown in fig. 2. Fig. 2 is a flowchart of a method of job keyword extraction according to a first embodiment of the present invention.

Step S202, acquiring a job keyword list according to pre-acquired job description historical data;

Step S204, obtaining keywords of each position description in the position description history data according to a position type keyword list to obtain a keyword set of a position description data set;

Step S206, respectively obtaining a keyword set of the candidate resume data set and a keyword set of the interview record data set according to the candidate resume data set and the interview record data set and combining the job keyword list;

Step S208, obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set and combining the keyword set of the job description data set.

Specifically, with reference to steps S202 to S208, the method for extracting position keywords provided by the embodiment of the present application is suitable for scenarios in which position keywords are extracted from job application resumes. It helps the administrative and human resources departments of an enterprise manage resumes and publish recruitment postings effectively, and avoids the situation in which, because different positions have different emphases or the wording of a posting is inaccurate, job seekers and recommenders (employee referrals, headhunters, etc.) cannot obtain the position requirements fully and accurately, which would reduce recruitment efficiency.

It should be noted that, in the embodiment of the present application, job description is abbreviated as JD and candidate resume is abbreviated as CV. FIG. 3 is a flowchart of a method for extracting position keywords according to a first embodiment of the present invention. As shown in FIG. 3, the method for extracting position keywords provided by the embodiment of the application specifically comprises the following steps:

S11, a JD detailed information data table covering the last three years is imported from a database. Each record of the table comprises a JD number, a job class and a job description text, where the job class is the category of the position, such as technical, operations or product.

S12, in the process of acquiring the job keyword table (i.e. step S202 in the embodiment of the present application), word segmentation is performed on the JD history data. Then the probability P(w_i) that each segmented word w_i occurs in all JD history data and the probability P(w_i|c_j) that it occurs in the JD history data under each job class c_j are counted, and the importance of each segmented word w_i under job class c_j is calculated as lift(w_i, c_j) = P(w_i|c_j) / P(w_i).

For each job class c_j, all segmented words w_i are sorted by lift(w_i, c_j) from large to small, and the top N words with the largest lift, each with lift greater than 1, are taken as the keyword table of that job class.
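The lift calculation in S12 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the JDs are already segmented into tokens, the probabilities are estimated as document frequencies (the text says only "probability of occurrence"), and the function name `build_keyword_tables`, its parameters and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def build_keyword_tables(jds, top_n=3, min_lift=1.0):
    """Build one keyword table per job class from (job_class, tokens) pairs.

    P(w) and P(w|c) are estimated as the fraction of JDs (overall, and within
    class c) that contain word w. This estimator is an assumption.
    """
    total_docs = len(jds)
    doc_freq = Counter()                   # number of JDs containing w
    class_doc_freq = defaultdict(Counter)  # per class: number of JDs containing w
    class_docs = Counter()                 # number of JDs in each class

    for job_class, tokens in jds:
        class_docs[job_class] += 1
        for w in set(tokens):
            doc_freq[w] += 1
            class_doc_freq[job_class][w] += 1

    tables = {}
    for c, freqs in class_doc_freq.items():
        # lift(w, c) = P(w|c) / P(w)
        lifts = {
            w: (n / class_docs[c]) / (doc_freq[w] / total_docs)
            for w, n in freqs.items()
        }
        # Keep words with lift above the threshold, largest lift first;
        # ties are broken alphabetically so the output is deterministic.
        ranked = sorted(
            (w for w, v in lifts.items() if v > min_lift),
            key=lambda w: (-lifts[w], w),
        )
        tables[c] = ranked[:top_n]
    return tables


# Hypothetical, already-segmented job descriptions.
jds = [
    ("technical", ["java", "distributed", "communication"]),
    ("technical", ["java", "spark", "communication"]),
    ("product", ["roadmap", "communication", "user"]),
    ("product", ["roadmap", "data", "communication"]),
]
tables = build_keyword_tables(jds, top_n=2)
```

In this toy data "communication" occurs in every JD, so its lift is exactly 1 in both classes and it is excluded, which matches the intent of keeping only words that are over-represented in a job class.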

S13, extracting the keyword set of the job description data set (i.e. step S204 in the embodiment of the application), that is, extracting JD keywords: first, word segmentation is performed on the JD history data to obtain a word set for each job description JD; the intersection of this word set and the keyword table of the job class to which the JD belongs is then taken to obtain the keyword set of the JD.
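The intersection in S13 amounts to a single set operation. A minimal sketch follows, assuming the JD text has already been segmented into tokens; the function name `jd_keyword_set` and the sample values are hypothetical.

```python
def jd_keyword_set(jd_tokens, class_keyword_table):
    """Keywords of one JD: its word set intersected with its job class keyword table."""
    return set(jd_tokens) & set(class_keyword_table)


# Hypothetical keyword table for the job class this JD belongs to.
technical_table = ["java", "distributed", "spark"]
# Hypothetical segmented JD text; repeated words collapse in the set.
jd_tokens = ["java", "communication", "teamwork", "java"]
jd_keywords = jd_keyword_set(jd_tokens, technical_table)
```

Here only "java" survives, which shows the limitation described in the Background: a class keyword such as "spark" that the recruiter did not write into the JD cannot be recovered from the JD text alone.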

S14, a CV detailed information data table covering the last three years is imported from the database. Each record of the table comprises a CV number and a resume text.

S15, an interview record detailed information data table covering the last three years is imported from the database. Each record of the table comprises a JD number, a CV number, an interview result and an interview evaluation text.

S16, CV and interview evaluation keyword extraction: the CV text, interview evaluation, JD number and interview result are associated through the interview records. The job class to which the JD belongs is found through the JD number, and the intersection of that job class's keyword table and the word segmentation result of the CV text is taken to obtain the CV keyword set. The keyword set of the interview evaluation text is obtained in the same way.
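The association described in S14 to S16 can be sketched with plain dictionaries standing in for the three data tables. The record layout, the field names (`jd_id`, `cv_id`, `result`, `evaluation_tokens`) and the function name `keywords_for_interview` are assumptions for illustration; texts are assumed to be segmented already.

```python
def keywords_for_interview(record, jd_class, cv_tokens, keyword_tables):
    """Return (CV keyword set, interview evaluation keyword set) for one record."""
    # Find the job class of the JD via the JD number, then its keyword table.
    table = set(keyword_tables[jd_class[record["jd_id"]]])
    # Intersect the table with the segmented CV text and evaluation text.
    cv_keywords = table & set(cv_tokens[record["cv_id"]])
    evaluation_keywords = table & set(record["evaluation_tokens"])
    return cv_keywords, evaluation_keywords


# Hypothetical stand-ins for the keyword tables and the imported data tables.
keyword_tables = {"technical": ["java", "distributed", "spark"]}
jd_class = {"JD1": "technical"}                  # JD number -> job class
cv_tokens = {"CV7": ["java", "mysql", "spark"]}  # CV number -> segmented resume
record = {
    "jd_id": "JD1",
    "cv_id": "CV7",
    "result": 1,
    "evaluation_tokens": ["solid", "distributed", "experience"],
}
cv_kw, eval_kw = keywords_for_interview(record, jd_class, cv_tokens, keyword_tables)
```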

S17, obtaining target keywords.

Specifically, for a given JD, the related interview records are collected, and their number is denoted M. Each record has an interview result indicating whether the interview was passed (1 indicates passing, i.e. the first identifier in the embodiment of the application; 0 indicates not passing, i.e. the second identifier in the embodiment of the application). For each interview record, the union of the keyword set of its interview evaluation and the keyword set of the corresponding CV is taken.

For each of the N keywords in the keyword table of the job class corresponding to the JD, a 1 is recorded if the keyword appears in the union and a 0 if it does not, so that each interview record is represented as an N-dimensional vector of 0s and 1s. For each of the N keyword dimensions, the Pearson correlation coefficient between that column over the M records and the interview result column is calculated, and the keywords are sorted by coefficient from large to small. The keywords whose correlation coefficient exceeds a preset threshold and which are not in the JD are the potential keywords obtained by mining.
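The mining step in S17 can be sketched as below. This is a sketch under stated assumptions rather than the patented implementation: the threshold value of 0.5, the handling of constant columns (treated as correlation 0), and the names `pearson` and `mine_potential_keywords` are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Undefined for a constant column; treating it as 0 is an assumption.
        return 0.0
    return cov / sqrt(var_x * var_y)


def mine_potential_keywords(class_keywords, records, jd_keywords, threshold=0.5):
    """records: list of (keyword_union, result), with result 1 = passed, 0 = not.

    keyword_union is the union of the CV and interview evaluation keyword sets.
    """
    results = [result for _, result in records]
    scored = []
    for kw in class_keywords:
        # One 0/1 column per keyword across the M interview records.
        column = [1 if kw in union else 0 for union, _ in records]
        scored.append((kw, pearson(column, results)))
    scored.sort(key=lambda item: -item[1])  # largest coefficient first
    return [(kw, r) for kw, r in scored if r > threshold and kw not in jd_keywords]


# Hypothetical data for one JD whose text mentions only "java".
class_keywords = ["java", "spark", "flink"]
records = [
    ({"java", "flink"}, 1),
    ({"java", "flink"}, 1),
    ({"java", "spark"}, 0),
    ({"java"}, 0),
]
potential = mine_potential_keywords(class_keywords, records, jd_keywords={"java"})
```

In this toy data "flink" never appears in the JD but coincides exactly with passed interviews, so it is the only keyword returned; "java" appears in every record, which makes its column constant and uninformative.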

Optionally, in step S202, according to the pre-acquired job description history data, acquiring the job keyword table includes:

Step S2021, dividing the position description data in the position description history data into words to obtain position description words;

Step S2022, counting the occurrence probability of each job descriptor in the job description history data to obtain a first numerical value;

Step S2023, counting the occurrence probability of each job descriptor in the job description data under each job class to obtain a second value;

Step S2024, calculating the importance value of each job descriptor under the job class according to the first numerical value and the second numerical value;

Step S2025, sorting the job descriptors of the importance values according to a preset sequence to obtain a job keyword list.

Further, optionally, in step S2025, the job descriptors of the importance values are ordered according to a preset order, and the obtaining a job keyword table includes:

Step S20251, obtaining the first N job descriptors with importance values larger than a preset threshold;

Step S20252, obtaining the job keyword list according to the first N job descriptors.

Specifically, in combination with step S2021 to step S2025, the job keyword table is obtained as follows:

A JD detailed information data table covering the last three years is imported from the database. Each record of the table comprises a JD number, a job class and a job description text.

In the process of acquiring the job keyword table (i.e., step S202 in the embodiment of the present application), word segmentation is performed on the JD history data. Then the probability P(w_i) that each segmented word w_i occurs in all JD history data and the probability P(w_i|c_j) that it occurs in the JD history data under each job class c_j are counted, and the importance of each segmented word w_i under job class c_j is calculated as lift(w_i, c_j) = P(w_i|c_j) / P(w_i).

For each job class c_j, all segmented words w_i are sorted by lift(w_i, c_j) from large to small, and the top N words with the largest lift (i.e. the first N job descriptors in the embodiment of the application), each with lift greater than 1 (i.e. the preset threshold in the embodiment of the application), are taken as the keyword table of that job class.

Optionally, in step S204, obtaining keywords of each job description in the job description history data according to the job type keyword table, where obtaining a keyword set of the job description data set includes:

Step S2041, performing word segmentation on each position description in the position description history data to obtain a position description word set;

Step S2042, the keyword set of the job description data set is obtained by acquiring an intersection set according to the job description word set and the job type keyword list.

Specifically, in combination with step S2041 to step S2042, obtaining keywords of each job description in the job description history data according to the job type keyword table, the keyword set of the job description data set includes:

firstly, word segmentation is carried out on JD historical data, a word set is obtained by describing the JD for each position, and an intersection set is obtained by using the set and a job keyword list of a job in which the JD is located, so as to obtain a keyword set of the JD.
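The intersection step can be illustrated with a short sketch. It assumes the JD has already been segmented into tokens and that the job class keyword list is available as a plain collection of strings; the function name `jd_keyword_set` is hypothetical.

```python
def jd_keyword_set(jd_tokens, class_keywords):
    """Return the keywords of one JD: the intersection of its segmented
    words with the keyword list of the job class the JD belongs to."""
    return set(jd_tokens) & set(class_keywords)
```

Because the result is a set, a keyword that appears several times in the JD is counted once.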

Optionally, in step S206, obtaining the keyword set of the candidate resume data set and the keyword set of the interview record data set, according to the pre-obtained candidate resume data set and interview record data set respectively and in combination with the job keyword list, includes:

step S2061, in the case that the candidate resume data set comprises a candidate resume number and resume text, and the interview record data set comprises a position description number, a candidate resume number, interview results and interview evaluation text, finding the job class to which the position description belongs according to the position description number;

step S2062, taking the intersection of the job keyword list and the word segmentation result of the resume text to obtain the keyword set of the resume data set;

And step S2063, taking the intersection of the job keyword list and the word segmentation result of the interview evaluation text to obtain the keyword set of the interview record data set.

Specifically, in combination with step S2061 to step S2063, the keyword set of the candidate resume data set and the keyword set of the interview record data set are obtained as follows:

The CV text, interview evaluation, JD number and interview result are associated with one another through the interview records. The job class to which the JD belongs is found through the JD number, and the intersection of the extracted job class keyword list and the word segmentation result of the CV text is taken to obtain the CV keyword set. The keyword set of the interview evaluation text is obtained in the same way.
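A minimal sketch of this lookup-and-intersect step is given below. The record layout (the keys `jd_id`, `cv_id` and `evaluation_tokens`) and the lookup dictionaries are assumptions made for illustration; the text only states which fields the data sets contain, not how they are stored.

```python
def keyword_sets_for_interview(record, jd_to_class, class_keyword_table, cv_tokens_by_id):
    """Return (cv_keywords, evaluation_keywords) for one interview record.

    record              -- dict with hypothetical keys 'jd_id', 'cv_id', 'evaluation_tokens'
    jd_to_class         -- maps a JD number to its job class
    class_keyword_table -- maps a job class to its keyword list
    cv_tokens_by_id     -- maps a resume number to the segmented resume text
    """
    job_class = jd_to_class[record["jd_id"]]         # find the job class via the JD number
    vocabulary = set(class_keyword_table[job_class])
    cv_keywords = vocabulary & set(cv_tokens_by_id[record["cv_id"]])
    evaluation_keywords = vocabulary & set(record["evaluation_tokens"])
    return cv_keywords, evaluation_keywords
```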

Optionally, in step S208, obtaining the target keyword according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set, includes:

step S2081, for any one position description, collecting the interview records of that position description;

step S2082, taking the union of the keyword set of the interview evaluation in each interview record and the keyword set of the corresponding candidate resume to obtain a first set;

Step S2083, acquiring the keyword set under the job class corresponding to the job description;

step S2084, if a keyword in the keyword set appears in the first set, marking it with a first identifier;

step S2085, if a keyword in the keyword set does not appear in the first set, marking it with a second identifier;

step S2086, obtaining the data vector of the interview record according to the first identifier and/or the second identifier;

step S2087, calculating according to the data vector to obtain the target keyword.

Specifically, obtaining the target keyword includes:

For a given JD, the associated interview records are collected and denoted M. Each record has an interview result (1 indicates pass, i.e., the first identifier in the embodiment of the present application, and 0 indicates fail, i.e., the second identifier in the embodiment of the present application), and the keyword set of the interview evaluation in each interview record is merged, by union, with the keyword set of the corresponding CV.
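One way to turn these records into vectors is sketched below. It assumes each interview record is a dict with the hypothetical keys `cv_keywords`, `eval_keywords` and `passed`, and it encodes presence or pass as 1 and absence or fail as 0, following the 1/0 convention stated above.

```python
def build_vectors(interviews, class_keywords):
    """Build one 0/1 vector per job class keyword plus the interview result vector.

    interviews     -- list of dicts with hypothetical keys
                      'cv_keywords', 'eval_keywords', 'passed'
    class_keywords -- keyword set of the job class the JD belongs to
    """
    keywords = sorted(class_keywords)
    keyword_vectors = {kw: [] for kw in keywords}
    result_vector = []
    for rec in interviews:
        # First set: union of interview-evaluation keywords and resume keywords.
        first_set = set(rec["cv_keywords"]) | set(rec["eval_keywords"])
        for kw in keywords:
            keyword_vectors[kw].append(1 if kw in first_set else 0)
        result_vector.append(1 if rec["passed"] else 0)
    return keyword_vectors, result_vector
```

Each keyword vector has one entry per interview record, so it lines up position by position with the result vector.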

In this embodiment of the present application, the calculation according to the data vector includes two implementation manners, which are specifically as follows:

Mode one: calculating the Pearson correlation coefficient:

further optionally, in step S2087, calculating according to the data vector to obtain the target keyword includes:

step S20871, calculating the Pearson correlation coefficient between each keyword vector and the interview result vector in the interview records, and arranging the calculation results to obtain a Pearson correlation coefficient set, wherein each keyword vector indicates whether the keyword appears in the resume or the interview record;

step S20872, judging whether the Pearson correlation coefficient set contains a keyword whose Pearson correlation coefficient is larger than a preset threshold and which is not in the keyword set of the job description data set;

step S20873, if the determination result is yes, taking that keyword as the target keyword.

The Pearson correlation coefficient is the quotient of the covariance of two variables and the product of their standard deviations. For a given position, the data are tabulated so that the three middle columns indicate whether the interview record and the corresponding resume text contain the corresponding position keyword, forming a matrix, and an additional column indicates whether the interview was passed.

In that table, the vector of the keyword Java is a = (1, 1, 0, 1, 0), the vector of whether the interview passes is b = (1, 0, 0, 1, 0), and the Pearson correlation coefficient is 2/3.
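The Pearson calculation and the threshold check of mode one can be sketched as follows. The function names and the default threshold of 0.5 are hypothetical, and returning 0.0 for a constant vector (where the coefficient is undefined) is an assumption of this sketch rather than something the text specifies.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return cov / sqrt(var_x * var_y)


def select_target_keywords(keyword_vectors, result_vector, jd_keywords, threshold=0.5):
    """Keywords whose correlation with the interview result exceeds the
    threshold and which are not already in the JD's own keyword set."""
    scores = {kw: pearson(vec, result_vector) for kw, vec in keyword_vectors.items()}
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [kw for kw, r in ranked if r > threshold and kw not in jd_keywords]
```

For the vectors a and b given above, `pearson(a, b)` evaluates to 2/3 up to floating-point rounding, matching the value in the text.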

Mode two: calculating cosine similarity:

Optionally, in step S2087, calculating according to the data vector to obtain the target keyword includes:

step S20871', calculating the cosine similarity between each keyword vector and the interview result vector in the interview records, wherein each keyword vector indicates whether the keyword appears in the resume or the interview record;

step S20872', obtaining the target keyword according to the cosine similarity.

For example, with a = (1, 1, 0, 1, 0) and b = (1, 0, 0, 1, 0), the cosine similarity is approximately 0.816.
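The cosine similarity of mode two can be checked with a short sketch. The function name is hypothetical, and returning 0.0 when either vector is all zeros (where the similarity is undefined) is an assumption of this sketch.

```python
from math import sqrt


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: an all-zero vector has no defined direction
    return dot / (norm_x * norm_y)
```

For a = (1, 1, 0, 1, 0) and b = (1, 0, 0, 1, 0) the dot product is 2 and the norms are the square roots of 3 and 2, so the result is 2 divided by the square root of 6, which rounds to 0.816.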

Optionally, the method for extracting position keywords provided in the embodiment of the present application further includes: step S210, acquiring recruitment requirements according to the target keywords.

The job keyword extraction method provided by the embodiment of the present application can be applied to a JD editing and management page to help JD managers describe recruitment requirements comprehensively and accurately. It brings at least two benefits. First, job seekers, headhunters, internal staff and others can understand the recruitment requirements of a position more fully, so that the first step of matching, during resume submission and recommendation, is done better; this saves recruitment cost for the enterprise and improves efficiency for job seekers. Second, enterprises generally have an artificial-intelligence-based post matching model, and the JD text is one of its main inputs; the more accurate the JD text, the more value the post matching model can deliver, so that matching is performed better and the recruitment cost of the enterprise is further reduced.

Example 2

It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.

From the above description of the embodiments, it will be clear to those skilled in the art that the method for extracting position keywords or the method for updating position requirements according to the above embodiments may be implemented by means of software plus a necessary general hardware platform, or may be implemented by hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.

Example 3

According to an embodiment of the present invention, there is also provided an apparatus for implementing the method for extracting position keywords, as shown in fig. 4, and fig. 4 is a block diagram of an apparatus for extracting position keywords according to a third embodiment of the present invention, where the apparatus includes:

a first obtaining module 42, configured to obtain a job keyword table according to the job description history data obtained in advance; a second obtaining module 44, configured to obtain keywords of each job description in the job description history data according to the job keyword list, so as to obtain a keyword set of the job description data set; the first extraction module 46 is configured to obtain a keyword set of the candidate resume data set and a keyword set of the interview record data set according to the pre-acquired candidate resume data set and the interview record data set, and by combining the job keyword list; the second extraction module 48 is configured to obtain the target keyword according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, and the keyword set of the job description data set.

Optionally, the first obtaining module 42 includes: the word segmentation unit is used for segmenting the position description data in the position description history data to obtain position description words; the first statistics unit is used for counting the occurrence probability of each job descriptor in the job description historical data to obtain a first numerical value; the second statistics unit is used for counting the occurrence probability of each job descriptor in the job description data under each job class to obtain a second numerical value; the computing unit is used for computing the importance value of each job descriptor under the job class according to the first numerical value and the second numerical value; the acquisition unit is used for sorting the position descriptors of the importance values according to a preset sequence to obtain a position keyword list.

Example 4

Example 5

According to still another aspect of the embodiment of the present invention, there is further provided a storage medium, where the storage medium includes a stored program, and when the program runs, the device in which the storage medium is located is controlled to execute the above method for extracting a job keyword or the method for updating a job requirement.

Example 6

Embodiments of the present invention may provide a computer terminal, which may be any one of a group of computer terminals. Alternatively, in the present embodiment, the above-described computer terminal may be replaced with a terminal device such as a mobile terminal.

Alternatively, in this embodiment, the above-mentioned computer terminal may be located in at least one network device among a plurality of network devices of the computer network.

In this embodiment, the computer terminal may execute the program code of the following steps in the method for extracting the position keyword of the application program: acquiring a job keyword list according to the pre-acquired job description history data; acquiring keywords of each position description in position description historical data according to a position type keyword list to obtain a keyword set of a position description data set; respectively obtaining a keyword set of the candidate resume data set and a keyword set of the interview record data set according to the candidate resume data set and the interview record data set and combining the job keyword list; and obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set and the keyword set of the job description data set.

The memory may be used to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for extracting position keywords in the embodiment of the present invention, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, thereby implementing the method for extracting position keywords described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located with respect to the processor, which may be connected to terminal A through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The processor may call the information and the application program stored in the memory through the transmission device to perform the following steps:

optionally, the above processor may further execute program code for: according to the pre-acquired post description history data, acquiring a post keyword list comprises the following steps: dividing the position description data in the position description history data into words to obtain position description words; counting the occurrence probability of each job descriptor in the job description history data to obtain a first numerical value; counting the occurrence probability of each job descriptor in the job description data under each job class to obtain a second value; calculating the importance value of each job descriptor under the job class according to the first numerical value and the second numerical value; and ordering the position descriptors of the importance values according to a preset sequence to obtain a position keyword list.

Further, optionally, the above processor may further execute program code for: the job descriptors of the importance values are ordered according to a preset sequence, and the obtained job keyword list comprises: acquiring first N job descriptors with importance values larger than a preset threshold value; and obtaining a job keyword list according to the first N job descriptors.

Optionally, the above processor may further execute program code for: obtaining keywords of each position description in the position description history data according to the position type keyword list, wherein obtaining the keyword set of the position description data set comprises the following steps: dividing each position description in the position description history data into words to obtain a position description word set; and acquiring an intersection set according to the job description word set and the job type keyword list to obtain a keyword set of the job description data set.

Optionally, the above processor may further execute program code for the following steps. Obtaining the keyword set of the candidate resume data set and the keyword set of the interview record data set, according to the pre-obtained candidate resume data set and interview record data set respectively and in combination with the job keyword list, comprises: in the case that the candidate resume data set comprises a candidate resume number and resume text, and the interview record data set comprises a position description number, a candidate resume number, interview results and interview evaluation text, finding the job class to which the position description belongs according to the position description number; taking the intersection of the job keyword list and the word segmentation result of the resume text to obtain the keyword set of the resume data set; and taking the intersection of the job keyword list and the word segmentation result of the interview evaluation text to obtain the keyword set of the interview record data set.

Optionally, the above processor may further execute program code for the following steps. Obtaining the target keyword according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set, comprises: for any one position description, collecting the interview records of that position description; taking the union of the keyword set of the interview evaluation in each interview record and the keyword set of the corresponding candidate resume to obtain a first set; acquiring the keyword set under the job class corresponding to the job description; if a keyword in the keyword set appears in the first set, marking it with a first identifier; if a keyword in the keyword set does not appear in the first set, marking it with a second identifier; obtaining a data vector of the interview record according to the first identifier and/or the second identifier; and calculating according to the data vector to obtain the target keyword.

Further, optionally, the above processor may further execute program code for the following steps. Calculating according to the data vector to obtain the target keyword comprises: calculating the Pearson correlation coefficient between each keyword vector and the interview result vector in the interview records, and arranging the calculation results to obtain a Pearson correlation coefficient set, wherein each keyword vector indicates whether the keyword appears in the resume or the interview record; judging whether the Pearson correlation coefficient set contains a keyword whose Pearson correlation coefficient is larger than a preset threshold and which is not in the keyword set of the job description data set; and if the judgment result is yes, taking that keyword as the target keyword.

Optionally, the above processor may further execute program code for: calculating according to the data vector, wherein the obtaining of the target keyword comprises the following steps: calculating cosine similarity between each keyword vector and each column of interview result vector in the interview record, wherein the keyword vector is whether keywords appear in a resume or the interview record; and obtaining the target keywords according to the cosine similarity.

The embodiment of the invention provides a scheme for extracting position keywords. A job keyword list is acquired according to pre-acquired job description history data; keywords of each position description in the position description history data are acquired according to the job keyword list to obtain a keyword set of the position description data set; a keyword set of the candidate resume data set and a keyword set of the interview record data set are obtained according to the pre-obtained candidate resume data set and interview record data set, respectively, in combination with the job keyword list; and target keywords are obtained according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set. In this way, the relevance of the extracted keywords to the job description is effectively ensured while the extracted keywords are not required to appear in the job description itself, which achieves the technical effect of improving keyword extraction accuracy. This solves the technical problems in the related art that keywords are easily missed because the text from which they are extracted has limited content, so that low extraction accuracy delays the updating of position requirements and affects recruitment efficiency and enterprise operation.

Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing the hardware associated with a terminal device, the program may be stored in a computer readable storage medium, and the storage medium may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk, and the like.

Example 7

The embodiment of the invention also provides a storage medium. Alternatively, in this embodiment, the storage medium may be used to store program code executed by the method for extracting position keywords provided in the first embodiment.

Alternatively, in this embodiment, the storage medium may be located in any one of the computer terminals in the computer terminal group in the computer network, or in any one of the mobile terminals in the mobile terminal group.

Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: acquiring a job keyword list according to the pre-acquired job description history data; acquiring keywords of each position description in position description historical data according to a position type keyword list to obtain a keyword set of a position description data set; respectively obtaining a keyword set of the candidate resume data set and a keyword set of the interview record data set according to the candidate resume data set and the interview record data set and combining the job keyword list; and obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set and the keyword set of the job description data set.

Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: according to the pre-acquired post description history data, acquiring a post keyword list comprises the following steps: dividing the position description data in the position description history data into words to obtain position description words; counting the occurrence probability of each job descriptor in the job description history data to obtain a first numerical value; counting the occurrence probability of each job descriptor in the job description data under each job class to obtain a second value; calculating the importance value of each job descriptor under the job class according to the first numerical value and the second numerical value; and ordering the position descriptors of the importance values according to a preset sequence to obtain a position keyword list.

Further optionally, in the present embodiment, the storage medium is configured to store program code for performing the steps of: the job descriptors of the importance values are ordered according to a preset sequence, and the obtained job keyword list comprises: acquiring first N job descriptors with importance values larger than a preset threshold value; and obtaining a job keyword list according to the first N job descriptors.

Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: obtaining keywords of each position description in the position description history data according to the position type keyword list, wherein obtaining the keyword set of the position description data set comprises the following steps: dividing each position description in the position description history data into words to obtain a position description word set; and acquiring an intersection set according to the job description word set and the job type keyword list to obtain a keyword set of the job description data set.

Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the following steps. Obtaining the keyword set of the candidate resume data set and the keyword set of the interview record data set, according to the pre-obtained candidate resume data set and interview record data set respectively and in combination with the job keyword list, comprises: in the case that the candidate resume data set comprises a candidate resume number and resume text, and the interview record data set comprises a position description number, a candidate resume number, interview results and interview evaluation text, finding the job class to which the position description belongs according to the position description number; taking the intersection of the job keyword list and the word segmentation result of the resume text to obtain the keyword set of the resume data set; and taking the intersection of the job keyword list and the word segmentation result of the interview evaluation text to obtain the keyword set of the interview record data set.

Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the following steps. Obtaining the target keyword according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set, comprises: for any one position description, collecting the interview records of that position description; taking the union of the keyword set of the interview evaluation in each interview record and the keyword set of the corresponding candidate resume to obtain a first set; acquiring the keyword set under the job class corresponding to the job description; if a keyword in the keyword set appears in the first set, marking it with a first identifier; if a keyword in the keyword set does not appear in the first set, marking it with a second identifier; obtaining a data vector of the interview record according to the first identifier and/or the second identifier; and calculating according to the data vector to obtain the target keyword.

Further optionally, in the present embodiment, the storage medium is configured to store program code for performing the following steps. Calculating according to the data vector to obtain the target keyword comprises: calculating the Pearson correlation coefficient between each keyword vector and the interview result vector in the interview records, and arranging the calculation results to obtain a Pearson correlation coefficient set, wherein each keyword vector indicates whether the keyword appears in the resume or the interview record; judging whether the Pearson correlation coefficient set contains a keyword whose Pearson correlation coefficient is larger than a preset threshold and which is not in the keyword set of the job description data set; and if the judgment result is yes, taking that keyword as the target keyword.

Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: calculating according to the data vector, wherein the obtaining of the target keyword comprises the following steps: calculating cosine similarity between each keyword vector and each column of interview result vector in the interview record, wherein the keyword vector is whether keywords appear in a resume or the interview record; and obtaining the target keywords according to the cosine similarity.

The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.

In the foregoing embodiments of the present invention, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary; for example, the division of the units is merely a logical function division, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling or communication connection shown or discussed between components may be through some interfaces, units or modules, and may be in electrical or other forms.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash disk, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other various media capable of storing program code.

The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.

Claims

1. A method for extracting position keywords comprises the following steps:

acquiring a job keyword list according to the pre-acquired job description history data;

acquiring keywords of each position description in the position description historical data according to the position description keyword list to obtain a keyword set of a position description data set;

combining the job keyword list according to the candidate resume data set and the interview record data set respectively to obtain a keyword set of the candidate resume data set and a keyword set of the interview record data set;

obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set and combining the keyword set of the job description data set;

according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, combining the keyword set of the job description data set to obtain target keywords comprises:

for any one job description, collecting the interview records of that job description;

obtaining a first set according to the interview evaluation obtained by the interview record and the keyword set of the candidate resume data set;

acquiring a keyword set under the job class corresponding to the job description;

if the keyword set appears in the first set, marking a first identifier;

if the keyword set does not appear in the first set, marking a second identifier;

obtaining a data vector of the interview record according to the first identifier and/or the second identifier;

and calculating the similarity between each keyword vector and each column of interview result vector in the interview record, and obtaining the target keywords according to the similarity.
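
The marking steps of claim 1 can be illustrated with a minimal sketch. It assumes that the first identifier is 1 and the second identifier is 0, and that the check is made for each keyword of the job class keyword set individually; the function name `build_data_vector` and the sample data are hypothetical and do not come from the claims.

```python
def build_data_vector(class_keywords, first_set):
    """Return one 0/1 entry per job class keyword.

    1 (assumed first identifier) means the keyword appears in the first set;
    0 (assumed second identifier) means it does not.
    """
    return [1 if kw in first_set else 0 for kw in class_keywords]


# Hypothetical data: the keywords of one job class, and the first set built
# from one candidate's resume keywords plus the interview evaluation keywords.
class_keywords = ["java", "spark", "sql", "communication"]
first_set = {"java", "sql", "teamwork"}

data_vector = build_data_vector(class_keywords, first_set)
print(data_vector)  # [1, 0, 1, 0]
```

Stacking one such vector per interview record gives a 0/1 matrix whose columns are the keyword vectors that are compared with the interview result vector.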

2. The method of claim 1, wherein acquiring the job class keyword list according to the pre-acquired job description history data comprises:

performing word segmentation on the job description data in the job description history data to obtain job description words;

counting the occurrence probability of each job description word in the job description history data to obtain a first value;

counting the occurrence probability of each job description word in the job description data under each job class to obtain a second value;

calculating the importance value of each job description word under the job class according to the first value and the second value;

and sorting the job description words by importance value in a preset order to obtain the job class keyword list.
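
As an illustration of claim 2, the following sketch computes both probabilities and one possible importance value. The claim does not fix a formula, so the ratio of the second value to the first value used here is an assumption, as are the function names and the sample data. Job descriptions are assumed to be already segmented into words, and the class documents are assumed to be a subset of the full history.

```python
from collections import Counter


def occurrence_probability(docs):
    """Fraction of documents in which each word occurs at least once."""
    counts = Counter(word for doc in docs for word in set(doc))
    return {word: n / len(docs) for word, n in counts.items()}


def importance_values(all_docs, class_docs):
    first = occurrence_probability(all_docs)     # first value: whole history
    second = occurrence_probability(class_docs)  # second value: one job class
    # Assumed scoring: in-class probability divided by overall probability.
    # Requires every class word to occur in all_docs as well.
    return {word: second[word] / first[word] for word in second}


# Hypothetical, already-segmented job descriptions.
all_docs = [
    ["java", "sql"],
    ["java", "design"],
    ["sql", "marketing"],
    ["design", "marketing"],
]
class_docs = all_docs[:2]  # descriptions of a hypothetical technical class

scores = importance_values(all_docs, class_docs)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], scores[ranked[0]])  # java 2.0
```

A word that is common inside one job class but rare across all job classes receives a high score under this assumed ratio, which matches the intent of ranking class-specific words first.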

3. The method of claim 2, wherein sorting the job description words by importance value in a preset order to obtain the job class keyword list comprises:

acquiring the first N job description words whose importance values are greater than a preset threshold;

and obtaining the job class keyword list according to the first N job description words.
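
The selection in claim 3 can be sketched as a filter followed by a truncation. The sketch assumes that the preset order is descending importance; `top_n_keywords`, the threshold, and the scores are hypothetical.

```python
def top_n_keywords(importance, n, threshold):
    """Keep words scoring above the threshold, then take the best n."""
    above = [(word, score) for word, score in importance.items() if score > threshold]
    above.sort(key=lambda pair: pair[1], reverse=True)  # assumed: descending order
    return [word for word, _ in above[:n]]


# Hypothetical importance values for one job class.
importance = {"java": 2.0, "sql": 1.4, "design": 1.1, "office": 0.6}

keyword_list = top_n_keywords(importance, n=2, threshold=1.0)
print(keyword_list)  # ['java', 'sql']
```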

4. The method of claim 1, wherein acquiring the keywords of each job description in the job description history data according to the job class keyword list to obtain the keyword set of the job description data set comprises:

performing word segmentation on each job description in the job description history data to obtain a job description word set;

and taking the intersection of the job description word set and the job class keyword list to obtain the keyword set of the job description data set.
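
The intersection in claim 4 maps directly onto a set operation. In the sketch below, `segment` is a stand-in that lowercases and splits on whitespace; a real system would need a proper word segmenter, especially for Chinese text. All names and data are hypothetical.

```python
def segment(text):
    """Stand-in word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


def description_keywords(description, class_keyword_list):
    """Keywords of one job description: its words that are in the keyword list."""
    return set(segment(description)) & set(class_keyword_list)


# Hypothetical job class keyword list and job descriptions.
class_keyword_list = ["java", "sql", "spark"]
descriptions = {
    "jd-1": "Build Java services backed by SQL storage",
    "jd-2": "Maintain Spark pipelines",
}

keyword_sets = {
    jd_id: description_keywords(text, class_keyword_list)
    for jd_id, text in descriptions.items()
}
print(sorted(keyword_sets["jd-1"]))  # ['java', 'sql']
```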

5. The method of claim 1, wherein obtaining the keyword set of the candidate resume data set and the keyword set of the interview record data set from the pre-acquired candidate resume data set and interview record data set, respectively, in combination with the job class keyword list, comprises:

under the condition that the candidate resume data set comprises a candidate resume number and a resume text, and the interview record data set comprises a job description number, a candidate resume number, an interview result and an interview evaluation text, finding the job class to which the job description belongs according to the job description number;

taking the intersection of the job class keyword list and the word segmentation result of the resume text to obtain the keyword set of the candidate resume data set;

and taking the intersection of the job class keyword list and the word segmentation result of the interview evaluation text to obtain the keyword set of the interview record data set.
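
A minimal sketch of the lookup chain in claim 5 follows: the job description number selects the job class, and the job class selects the keyword list that is intersected with the resume text and the interview evaluation text. The dictionary layout, the field names, and the whitespace-based `segment` helper are assumptions made for illustration only.

```python
def segment(text):
    """Stand-in word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


# Hypothetical lookup tables and records.
jd_to_class = {"jd-1": "technical"}
class_keywords = {"technical": {"java", "sql", "spark"}}
resumes = {"cv-7": "Five years of Java and Spark development"}
interviews = [
    {
        "jd": "jd-1",
        "cv": "cv-7",
        "result": 1,
        "evaluation": "Strong SQL skills and solid Java basics",
    }
]


def extract_keyword_sets(record):
    """Return (resume keywords, interview keywords) for one interview record."""
    keywords = class_keywords[jd_to_class[record["jd"]]]
    resume_kw = keywords & set(segment(resumes[record["cv"]]))
    interview_kw = keywords & set(segment(record["evaluation"]))
    return resume_kw, interview_kw


resume_kw, interview_kw = extract_keyword_sets(interviews[0])
print(sorted(resume_kw), sorted(interview_kw))  # ['java', 'spark'] ['java', 'sql']
```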

6. The method of claim 1, wherein calculating a similarity between each keyword vector and each column of interview result vectors in the interview record, and deriving the target keyword based on the similarity comprises:

calculating the Pearson correlation coefficient between each keyword vector and each column of interview result vector in the interview record, and collecting the calculation results to obtain a Pearson correlation coefficient set, wherein each keyword vector indicates whether the keyword appears in a resume or an interview record;

judging whether the Pearson correlation coefficient set contains a keyword whose Pearson correlation coefficient is greater than a preset threshold and which is not in the keyword set of the job description data set;

and if the judgment result is yes, taking that keyword as the target keyword.
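
The Pearson-based selection of claim 6 can be sketched as follows. The sketch assumes a single 0/1 interview result vector, treats a constant vector (for which the coefficient is undefined) as having correlation 0, and uses hypothetical names, data, and threshold.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # assumption: undefined correlation is treated as 0
    return cov / (std_x * std_y)


def target_keywords(keyword_vectors, results, jd_keywords, threshold):
    """Keywords correlated with the results that the job description lacks."""
    return [
        kw
        for kw, vec in keyword_vectors.items()
        if kw not in jd_keywords and pearson(vec, results) > threshold
    ]


# Hypothetical data: one entry per interview record.
keyword_vectors = {
    "java": [1, 1, 0, 1],
    "spark": [1, 1, 0, 0],
    "office": [0, 1, 1, 0],
}
results = [1, 1, 0, 0]   # 1 = passed, 0 = not passed (assumed encoding)
jd_keywords = {"java"}   # already present in the job description

targets = target_keywords(keyword_vectors, results, jd_keywords, threshold=0.5)
print(targets)  # ['spark']
```

Here "java" also correlates with the results but is excluded because the job description already contains it, so only "spark" is reported as a candidate for updating the requirement.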

7. The method of claim 1, wherein calculating a similarity between each keyword vector and each column of interview result vectors in the interview record, and deriving the target keyword based on the similarity comprises:

calculating the cosine similarity between each keyword vector and each column of interview result vector in the interview record, wherein each keyword vector indicates whether the keyword appears in a resume or an interview record;

and obtaining the target keywords according to the cosine similarity.
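
For claim 7, a comparable sketch uses cosine similarity instead. The claim does not state how the target keywords are derived from the similarity, so ranking the keywords by similarity is an assumption; names and data are hypothetical.

```python
from math import sqrt


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: an all-zero vector is treated as dissimilar
    return dot / (norm_x * norm_y)


# Hypothetical data: one entry per interview record.
keyword_vectors = {"spark": [1, 1, 0, 0], "office": [0, 0, 1, 1]}
results = [1, 1, 0, 0]

# Assumed selection rule: rank keywords by similarity, highest first.
ranked = sorted(
    keyword_vectors,
    key=lambda kw: cosine_similarity(keyword_vectors[kw], results),
    reverse=True,
)
print(ranked)  # ['spark', 'office']
```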

8. The method of claim 1, wherein the method further comprises:

and acquiring recruitment requirements according to the target keywords.

9. A method of post demand update, comprising:

updating recruitment requirements according to the target keywords;

if the keyword set appears in the first set, marking a first identifier;

10. A position keyword extraction device, comprising:

the first acquisition module is used for acquiring a job class keyword list according to pre-acquired job description history data;

the second acquisition module is used for acquiring keywords of each job description in the job description history data according to the job class keyword list to obtain a keyword set of a job description data set;

the first extraction module is used for obtaining a keyword set of a candidate resume data set and a keyword set of an interview record data set from the candidate resume data set and the interview record data set, respectively, in combination with the job class keyword list;

the second extraction module is used for obtaining target keywords according to the keyword set of the candidate resume data set and the keyword set of the interview record data set, in combination with the keyword set of the job description data set;

the second extraction module further includes: counting any one job description to obtain an interview record of the job description; obtaining a first set according to the interview evaluation obtained by the interview record and the keyword set of the candidate resume data set; acquiring a keyword set under the job class corresponding to the job description; if the keyword set appears in the first set, marking a first identifier; if the keyword set does not appear in the first set, marking a second identifier; obtaining a data vector of the interview record according to the first identifier and/or the second identifier; and calculating the similarity between each keyword vector and each column of interview result vector in the interview record, and obtaining the target keywords according to the similarity.

11. The apparatus of claim 10, wherein the first acquisition module comprises:

the word segmentation unit is used for performing word segmentation on the job description data in the job description history data to obtain job description words;

the first statistics unit is used for counting the occurrence probability of each job description word in the job description history data to obtain a first value;

the second statistics unit is used for counting the occurrence probability of each job description word in the job description data under each job class to obtain a second value;

the computing unit is used for calculating the importance value of each job description word under the job class according to the first value and the second value;

and the acquisition unit is used for sorting the job description words by importance value in a preset order to obtain the job class keyword list.

12. An apparatus for post demand update, comprising:

the updating module is used for updating recruitment requirements according to the target keywords;

wherein the second extraction module further comprises: counting any one job description to obtain an interview record of the job description; obtaining a first set according to the interview evaluation obtained by the interview record and the keyword set of the candidate resume data set; acquiring a keyword set under the job class corresponding to the job description; if the keyword set appears in the first set, marking a first identifier; if the keyword set does not appear in the first set, marking a second identifier; obtaining a data vector of the interview record according to the first identifier and/or the second identifier; and calculating the similarity between each keyword vector and each column of interview result vector in the interview record, and obtaining the target keywords according to the similarity.

13. A storage medium comprising a stored program, wherein the program, when run, controls a device in which the storage medium resides to perform: the method for extracting position keywords of claim 1, or the post demand update method of claim 9.