CN112100492A - Batch delivery method and system for resumes of different versions - Google Patents


Info

Publication number
CN112100492A
CN112100492A (application CN202010954388.4A)
Authority
CN
China
Prior art keywords
correlation vector
words
word
position information
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010954388.4A
Other languages
Chinese (zh)
Inventor
吴晓军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hebei Jilian Human Resources Service Group Co ltd
Original Assignee
Hebei Jilian Human Resources Service Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hebei Jilian Human Resources Service Group Co ltd filed Critical Hebei Jilian Human Resources Service Group Co ltd
Priority to CN202010954388.4A
Publication of CN112100492A
Legal status: Pending

Classifications

    • G06F 16/9535: Physics; computing; electric digital data processing; information retrieval; retrieval from the web; querying, e.g. by the use of web search engines; search customisation based on user profiles and personalisation
    • G06F 18/23213: Pattern recognition; clustering techniques; non-hierarchical techniques using statistics or function optimisation with a fixed number of clusters, e.g. k-means clustering
    • G06F 18/24: Pattern recognition; classification techniques
    • G06F 40/216: Handling natural language data; natural language analysis; parsing using statistical methods
    • G06F 40/289: Handling natural language data; recognition of textual entities; phrasal analysis, e.g. finite state techniques or chunking
    • G06N 20/00: Machine learning
    • G06Q 10/1053: Administration; office automation; human resources; employment or hiring


Abstract

The present disclosure provides a batch delivery method and system for resumes of different versions, including: acquiring position information downloaded from a plurality of sites to form a local position database; generating topics of the local position information from the position information in the local position database; calculating a first correlation vector between each piece of position information in the local position database and the generated topics; calculating a second correlation vector between the user-selected version of the resume and the generated topics; calculating the similarity between the first correlation vector and the second correlation vector; and delivering the user-selected version of the resume to one or more associated positions in descending order of similarity.

Description

Batch delivery method and system for resumes of different versions
Technical Field
The present disclosure relates to the field of information technologies, and in particular, to a batch delivery method and system for resumes of different versions, an electronic device, and a computer-readable storage medium.
Background
In existing websites that provide Internet recruitment services, the conventional flow is that a recruiter publishes a position to be filled, and job seekers interested in the position deliver their resumes to it. Some recruitment websites can automatically match the relevance between a job seeker and a position and push the position to job seekers with high relevance, improving the recruitment effect.
However, delivering resumes is troublesome: a job seeker must not only visit each large recruitment platform, such as Zhaopin (Zhilian) and BOSS Zhipin, but may also have to visit each large company's own official website, search for and screen positions that meet his or her needs, and then deliver resumes to them one by one. If the job seeker prepares several versions of the resume and delivers the corresponding version in a targeted manner for different positions, the process is very cumbersome, and the wrong resume may be delivered carelessly.
Therefore, a one-stop method is urgently needed that manages resumes of different versions, automatically matches suitable positions, and delivers resumes in batches, automatically realizing the functions of selecting positions, storing them in a database, managing resumes of different versions, matching resumes to positions, and one-click resume delivery.
Disclosure of Invention
In view of this, an object of the embodiments of the present disclosure is to provide a batch delivery method and system for resumes of different versions, which generate topics for resumes and positions and calculate their similarity with an LDA-based machine learning algorithm over those topics, thereby achieving efficient and accurate matching between resumes and positions and realizing automatic batch delivery of resumes of different versions.
According to a first aspect of the present disclosure, there is provided a batch delivery method of resumes of different versions, including:
acquiring position information downloaded from a plurality of sites to form a local position database;
generating topics of the local position information from the position information in the local position database;
calculating a first correlation vector between each piece of position information in the local position database and the generated topics;
calculating a second correlation vector between the user-selected version of the resume and the generated topics;
calculating the similarity between the first correlation vector and the second correlation vector;
and delivering the user-selected version of the resume to one or more associated positions in descending order of similarity.
In one possible embodiment, the method for generating the topics of the local position information comprises:
segmenting all position information in the local position database according to the existing dictionary, including sentence breaking, word segmentation, and stop-word removal, to obtain first segmented words;
extracting 2-grams and 3-grams from the obtained first segmented words, calculating a mutual information value for each 2-gram and 3-gram, sorting the 2-grams and 3-grams in descending order of mutual information value, and selecting the top-ranked 2-grams and 3-grams to update the first segmented words and the existing dictionary;
calculating the information entropy of the characters adjacent to the left and right of each segmented word, and merging first segmented words based on the information entropy to further update the first segmented words and the existing dictionary;
filtering the second segmented words, obtained after updating the first segmented words, by the TF-IDF method to obtain third segmented words;
classifying words according to the sites' position categories, counting the probability of each word appearing in the local position information, and filtering the third segmented words according to the probability to obtain fourth segmented words;
and converting the fourth segmented words into word vectors and clustering the word vectors to obtain a plurality of word clusters serving as the topics of the local position information.
In one possible embodiment, calculating the first correlation vector or the second correlation vector comprises: calculating the first correlation vector or the second correlation vector with a machine learning model based on an LDA topic model.
In one possible embodiment, calculating the similarity between the first correlation vector and the second correlation vector comprises: calculating the cosine distance, Euclidean distance, or Manhattan distance between the first correlation vector and the second correlation vector.
In one possible embodiment, after delivering resumes to the associated positions in descending order of similarity, the method further comprises: automatically downloading the position information from the preset sites again at a preset time interval to obtain updated local position information.
In one possible embodiment, after obtaining the updated local position information, the method further comprises: automatically reminding the user of newly found positions whose similarity meets a preset value, and automatically delivering the resume of a preset version to those positions.
In one possible embodiment, the training data of the machine learning model based on the LDA topic model is obtained by intersecting the words obtained by segmenting the updated local position information with the existing dictionary and the words included in the topics.
According to a second aspect of the present disclosure, there is provided a batch delivery system for resumes of different versions, comprising:
a position acquisition unit, configured to acquire position information downloaded from a plurality of sites and form a local position database;
a topic generating unit, configured to generate topics of the local position information from the position information in the local position database;
a first correlation vector unit, configured to calculate a first correlation vector between each local position and the generated topics;
a second correlation vector unit, configured to calculate a second correlation vector between the user-selected version of the resume and the generated topics;
a similarity calculation unit, configured to calculate the similarity between the first correlation vector and the second correlation vector;
and a resume delivery unit, configured to deliver the user-selected version of the resume to one or more associated positions in descending order of similarity.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method according to the first aspect when executing the program.
According to a fourth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of the first aspect.
Drawings
In order to more clearly illustrate the embodiments of the present application and the technical solutions of the prior art, the drawings needed in the embodiments are briefly described below. The drawings in the following description are only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort. The foregoing and other objects, features, and advantages of the application will be apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not necessarily drawn to scale; emphasis is instead placed on illustrating the subject matter of the present application.
Fig. 1 illustrates a schematic view of a typical search and presentation interface for job information for a recruitment platform, according to an embodiment of the present disclosure.
Fig. 2 is a diagram illustrating exemplary job information downloaded from a website according to an embodiment of the present disclosure.
FIG. 3 shows a schematic diagram of a typical method of batch delivery of different versions of resumes according to an embodiment of the present disclosure.
FIG. 4 is a diagram illustrating an exemplary method of building training data for a machine learning model according to an embodiment of the present disclosure.
FIG. 5 shows a schematic diagram of a system for batch delivery of exemplary different versions of resumes according to an embodiment of the present disclosure.
Fig. 6 shows a schematic structural diagram of an electronic device for implementing an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. The singular forms "a", "an", and "the" used herein are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, the terms "comprises", "comprising", and the like, as used herein, specify the presence of the stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
In existing websites that provide Internet recruitment services, the conventional flow is that a recruiter publishes a position to be filled, and job seekers interested in the position deliver their resumes to it. Some recruitment websites can automatically match the relevance between a job seeker and a position and push the position to job seekers with high relevance, improving the recruitment effect.
However, delivering resumes is troublesome: a job seeker must not only visit each large recruitment platform, such as Zhaopin (Zhilian) and BOSS Zhipin, but may also have to visit each large company's own official website, search for and screen positions that meet his or her needs, and then deliver resumes to them one by one. If the job seeker prepares several versions of the resume and delivers the corresponding version in a targeted manner for different positions, the process is very cumbersome, and the wrong resume may be delivered carelessly.
In view of the above, the present disclosure provides a one-stop method for resumes of different versions that automatically matches suitable positions and delivers resumes in batches, automatically realizing the functions of selecting positions, saving them in a database, managing resumes of different versions, matching resumes to positions, and one-click resume delivery.
The present disclosure is described in detail below with reference to the attached drawings.
Fig. 1 illustrates a schematic view of a typical search and presentation interface for job information for a recruitment platform, according to an embodiment of the present disclosure.
Taking the information of a certain large recruitment platform as an example: to establish a local position database, a user can log in to the recruitment platform with an account, search for positions of interest, and download them, or authorize the present disclosure to obtain them by automatic crawling.
The content required for establishing the local position database comprises the following:
Job name, for example: C/C++ engineer
Company name, for example: XXX Ltd
Posting time, for example: 19 hours ago
Description of responsibilities, for example: 1. responsible for image recognition algorithm development and optimization and for hardware driver development and debugging; 2. performing compilation, optimization, and API interface development of the underlying algorithm model according to the results of the algorithm engineers; 3. cooperating with hardware engineers on hardware interface driver development, debugging, and optimization; and other content such as salary range, work experience, educational background, etc., which the disclosure does not limit.
Fig. 2 is a diagram illustrating exemplary job information downloaded from a website according to an embodiment of the present disclosure. The job information may be saved to Excel, any suitable database table, or another database management system; the disclosure is not limited in this respect.
Similarly, positions can be obtained from 51job, BOSS Zhipin, Tencent Careers, Alibaba Careers, company official websites, and so on. In summary, the sources of job information may include all sites of interest; all positions downloaded or crawled locally are used to build the local position database.
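As an illustrative sketch only (the disclosure does not specify a storage engine), the local position database could be held in SQLite. The table name, field names, and sample row below are invented for illustration:

```python
import sqlite3

def build_job_db(path=":memory:"):
    """Create a minimal local position database with the fields named above."""
    conn = sqlite3.connect(path)
    conn.execute("""CREATE TABLE IF NOT EXISTS jobs (
        id INTEGER PRIMARY KEY,
        site TEXT, job_name TEXT, company TEXT,
        posted TEXT, description TEXT, url TEXT)""")
    return conn

def save_jobs(conn, rows):
    """rows: iterable of (site, job_name, company, posted, description, url)."""
    conn.executemany(
        "INSERT INTO jobs (site, job_name, company, posted, description, url) "
        "VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()

conn = build_job_db()
save_jobs(conn, [
    ("example-site", "C/C++ engineer", "XXX Ltd", "19 hours ago",
     "image recognition algorithm development", "https://example.com/job/1"),
])
count = conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0]
```

An in-memory database is used here for illustration; a real system would persist to a file so the weekly refresh described later can diff against it.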
FIG. 3 shows a schematic diagram of a typical method of batch delivery of different versions of resumes according to an embodiment of the present disclosure.
In step 301: position information downloaded from the plurality of sites is acquired to form a local position database.
Topics of the local position information are then generated from the position information in the local position database. The method for generating the topics of the local position information comprises the following steps:
Step 302: all position information in the local position database is segmented according to the existing dictionary, including sentence breaking, word segmentation, and stop-word removal, to obtain first segmented words.
Step 303: 2-grams and 3-grams are extracted from the obtained first segmented words, a mutual information value is calculated for each 2-gram and 3-gram, the 2-grams and 3-grams are sorted in descending order of mutual information value, and the top-ranked ones are selected to update the first segmented words and the existing dictionary.
In natural language processing, the N-gram is a common language model; for Chinese, matching information between adjacent words in the context can be exploited to improve processing quality. The basic idea is to slide a window of size N over the text, forming a sequence of fragments of length N, each of which is called a gram. In this disclosure, a 1-gram is a single word obtained by segmentation. A 2-gram is two consecutive words: for example, connecting "algorithm" with "engineer" yields "algorithm engineer". A 3-gram is three consecutive words: for example, connecting "natural", "language", and "processing" yields "natural language processing".
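The sliding-window extraction described above can be written directly:

```python
def ngrams(words, n):
    """Slide a window of size n over the word list; each window is one gram."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

words = ["natural", "language", "processing", "algorithm", "engineer"]
bigrams = ngrams(words, 2)   # includes ("algorithm", "engineer")
trigrams = ngrams(words, 3)  # includes ("natural", "language", "processing")
```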
The formula for calculating the mutual information value is:

MI(X, Y) = log( P(X, Y) / ( P(X) · P(Y) ) )
The mutual information value represents the degree of interdependence between two variables. Binary mutual information measures the correlation between two events: the higher the mutual information value, the higher the correlation between X and Y and the more likely that X and Y form a phrase; conversely, the lower the mutual information value, the lower the correlation between X and Y and the more likely there is a phrase boundary between them. In the formula, X and Y are two adjacent words, and each P value is the corresponding probability of occurrence.
For example, in one corpus, "algorithm engineer" is the 2-gram formed by connecting "algorithm" with "engineer"; it occurs 3 times in total, while the total number of 2-grams is 252, so P(X, Y) in the above formula is 3/252. P(X) and P(Y) are obtained in the same way from the word counts.
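A sketch of the mutual information calculation over adjacent word pairs, with probabilities estimated from corpus counts as in the 3/252 example above (the toy corpus is invented):

```python
import math
from collections import Counter

def pmi_scores(words):
    """Pointwise mutual information for each adjacent word pair,
    with P(X,Y), P(X), P(Y) estimated from corpus counts."""
    unigrams = Counter(words)
    bigrams = Counter(zip(words, words[1:]))
    n_uni, n_bi = len(words), max(len(words) - 1, 1)
    scores = {}
    for (x, y), c in bigrams.items():
        p_xy = c / n_bi
        p_x, p_y = unigrams[x] / n_uni, unigrams[y] / n_uni
        scores[(x, y)] = math.log(p_xy / (p_x * p_y))
    return scores

corpus = ["algorithm", "engineer", "algorithm", "engineer",
          "page", "algorithm", "engineer"]
scores = pmi_scores(corpus)
# the frequently co-occurring pair ("algorithm", "engineer") scores high
```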
Step 304: the information entropy of the characters adjacent to the left and right of each segmented word is calculated, and first segmented words are merged based on the information entropy to further update the first segmented words and the existing dictionary.
The purpose of calculating the entropy of a word's left and right adjacent characters is to measure how random the sets of characters on each side of a text fragment are: a reasonable threshold is set on the entropy, and segmented words within the threshold range are retained, since this indicates they are likely fixed phrases; otherwise, the fragment and its neighbors are more likely random combinations and need not be retained.
For example, for "text / analysis / naming", it can be calculated that the left entropy of the word "analysis" is low, so "text" and "analysis" should be merged, while the right entropy of "analysis" is high, so "analysis" and "naming" should remain separate.
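The left/right adjacent-character entropy of step 304 can be sketched as follows; the toy corpus of Chinese fragments is invented for illustration:

```python
import math
from collections import Counter

def entropy(counter):
    """Shannon entropy of a character-count distribution."""
    total = sum(counter.values())
    return -sum(c / total * math.log(c / total) for c in counter.values())

def boundary_entropies(word, corpus):
    """Entropy of the characters seen immediately to the left and right
    of `word` across all its occurrences in the corpus strings."""
    left, right = Counter(), Counter()
    for text in corpus:
        start = 0
        while (i := text.find(word, start)) != -1:
            if i > 0:
                left[text[i - 1]] += 1
            j = i + len(word)
            if j < len(text):
                right[text[j]] += 1
            start = i + 1
    return entropy(left), entropy(right)

corpus = ["文本分析", "数据分析", "文本分析是", "文本分析和"]
l, r = boundary_entropies("分析", corpus)
# low left entropy (mostly preceded by 本) suggests merging with the left
# neighbor; higher right entropy suggests a word boundary on the right
```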
Step 305: the second segmented words, obtained after updating the first segmented words, are filtered by the TF-IDF method to obtain third segmented words.
The reason for filtering the second segmented words is that, even with the new-word dictionary, segmentation still produces a jumble of miscellaneous words, e.g.: H5, vue, front end, page, social insurance and housing fund, team building, employee benefits, growth, responsibility, skill, learning, priority, experience, understanding. The first four are keywords, while the remaining words carry too little value and should be deleted. By setting a reasonable threshold range with the TF-IDF method, common words in job descriptions, such as "priority", "experience", "proficient", and "understanding", can be filtered out.
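A minimal TF-IDF sketch for step 305; the token lists are invented, and a real system would run this over the segmented position texts:

```python
import math
from collections import Counter

def tf_idf(docs):
    """TF-IDF score per word per document; docs is a list of token lists."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))  # document frequency
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({w: (tf[w] / len(doc)) * math.log(n / df[w]) for w in tf})
    return scores

docs = [
    ["vue", "front-end", "page", "experience", "priority"],
    ["java", "back-end", "experience", "priority"],
    ["python", "algorithm", "experience", "priority"],
]
scores = tf_idf(docs)
# words that appear in every posting get idf = log(1) = 0, so a low
# TF-IDF threshold filters out boilerplate like "experience" and "priority"
```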
Step 306: words are classified according to the sites' position categories, the probability of each word appearing in the local position information is counted, and the third segmented words are filtered according to the probability to obtain fourth segmented words.
For example, among the words obtained from the crawled recruitment websites are "responsible", "proficient", and "growth". Statistically, such words appear in about 99% of the texts; since they occur in almost all resumes and positions, they carry almost no information and are deleted. This further strengthens the filtering of step 305.
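The document-frequency filtering of step 306 can be sketched as follows, with the 90% cutoff as an assumed preset value:

```python
from collections import Counter

def filter_by_df(docs, max_ratio=0.9):
    """Drop words that occur in more than max_ratio of the job postings."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    keep = {w for w, c in df.items() if c / n <= max_ratio}
    return [[w for w in doc if w in keep] for doc in docs]

docs = [["responsible", "vue"], ["responsible", "java"], ["responsible", "python"]]
filtered = filter_by_df(docs, max_ratio=0.9)
# "responsible" appears in 100% of postings and is removed everywhere
```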
Step 307: the fourth segmented words are converted into word vectors, and the word vectors are clustered to obtain a plurality of word clusters serving as the topics of the local position information. Word2vec or another method may be used to convert the fourth segmented words into word vectors, and the word vectors may be clustered by k-means or another clustering method.
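A deterministic k-means sketch for the clustering in step 307. Word2vec training itself is out of scope here, so small 2-D vectors stand in for word vectors, and farthest-point initialization is a simplifying choice rather than part of the disclosure:

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [vectors[0]]
    while len(centers) < k:  # pick the point farthest from current centers
        centers.append(max(vectors, key=lambda v: min(dist2(v, c) for c in centers)))
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(v, centers[c])) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

# toy stand-ins for word vectors: two visibly separated groups
vecs = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]]
labels = kmeans(vecs, 2)
# each group of nearby vectors ends up in its own cluster (topic)
```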
Step 308: a first correlation vector between each piece of position information in the local position database and the generated topics is calculated.
Step 309: a second correlation vector between the user-selected version of the resume and the generated topics is calculated.
The calculation of the first correlation vector or the second correlation vector comprises: calculating the first correlation vector or the second correlation vector with a machine learning model based on an LDA topic model.
FIG. 4 is a diagram illustrating an exemplary method of building training data for a machine learning model according to an embodiment of the present disclosure. The training data of the machine learning model based on the LDA topic model is obtained by intersecting the words obtained by segmenting the updated local position information with the existing dictionary and the words included in the topics. Other methods of establishing training data may also be used; the present disclosure is not limited in this respect.
As an example of calculating the first and second correlation vectors, suppose that for a front-end engineer position the extracted fourth segmented words are: H5, html, css, vue, node, js, page, and beautification.
After clustering, the generated topics are topic 1, topic 2, topic 3, and topic 4. The topic-based LDA machine learning model then yields:
P(belonging to topic 1) = 0.1;
P(belonging to topic 2) = 0.3;
P(belonging to topic 3) = 0.2;
P(belonging to topic 4) = 0.8;
where P denotes probability.
The first correlation vector is then: v1 = (0.1, 0.3, 0.2, 0.8).
Similarly, for the version of the resume selected by the user, a second correlation vector is calculated, for example v2 = (0.2, 0.3, 0.2, 0.7).
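The probabilities above come from LDA inference. As a rough stand-in, not the LDA model of the disclosure, topic membership can be illustrated by the overlap between a document's words and each topic's word cluster; the topic word sets below are invented:

```python
def correlation_vector(doc_words, topics):
    """Illustrative stand-in for LDA inference: score each topic by the
    share of the document's words that fall in that topic's word cluster."""
    doc = set(doc_words)
    return [round(len(doc & set(cluster)) / len(doc), 2) for cluster in topics]

topics = [
    {"java", "spring"},                          # topic 1 (invented)
    {"mysql", "redis", "css"},                   # topic 2 (invented)
    {"linux", "docker"},                         # topic 3 (invented)
    {"h5", "html", "css", "vue", "node", "js"},  # topic 4 (invented)
]
words = ["h5", "html", "css", "vue", "node", "js", "page", "beautification"]
v = correlation_vector(words, topics)
# the front-end cluster (topic 4) dominates, as in the example above
```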
Step 310: the similarity between the first correlation vector and the second correlation vector is calculated. Since the degree of match between two texts can be represented by the distance between their vectors, the degree of match between a position and a resume can be reflected by calculating the cosine distance, Euclidean distance, or Manhattan distance between the first correlation vector and the second correlation vector.
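The three distance measures named in step 310 can be computed directly on the example vectors v1 and v2 (cosine is shown as a similarity; the cosine distance is 1 minus this value):

```python
import math

def cosine_sim(a, b):
    """Cosine similarity; cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

v1 = [0.1, 0.3, 0.2, 0.8]  # the position's topic-correlation vector
v2 = [0.2, 0.3, 0.2, 0.7]  # the resume's topic-correlation vector
sim = cosine_sim(v1, v2)   # close to 1 for a well-matched pair
```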
In step 311, the user-selected version of the resume is delivered to one or more associated positions in descending order of similarity. The user can preset the resume version to deliver.
By matching resumes and positions through generated topics in this way, the inaccurate matching caused by an excessive number of words, polysemy, and word meanings that are too narrow or too broad is overcome.
In one embodiment, after delivering resumes to the associated positions in descending order of similarity, the method further comprises: automatically downloading the position information from the preset sites again at a preset time interval to obtain updated local position information.
With a preset time period, for example once a week at a fixed time, the positions on each site can be automatically downloaded again, expired positions deleted, and newly added positions supplemented, yielding updated local position information.
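The refresh can be sketched as a set difference between the local database and the freshly downloaded postings; the job identifiers are invented:

```python
def refresh(local_jobs, downloaded_jobs):
    """Diff freshly downloaded postings against the local database:
    keep the fresh set, and report expired and newly added positions."""
    local, fresh = set(local_jobs), set(downloaded_jobs)
    expired = local - fresh  # in the database but no longer online
    new = fresh - local      # online but not yet in the database
    return fresh, expired, new

local = {"job-1", "job-2", "job-3"}
updated, expired, new = refresh(local, {"job-2", "job-3", "job-4"})
# job-1 has expired and is dropped; job-4 is newly added
```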
In one embodiment, after obtaining the updated local position information, the method further comprises: automatically reminding the user of newly found positions whose similarity meets a preset value, and automatically delivering the resume of a preset version to those positions.
For example, if the preset value is 90%, the user is automatically reminded of newly found positions whose similarity is greater than or equal to 90%, and the preset version of the resume is automatically delivered to them.
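The reminder step can be sketched as a threshold filter over newly computed similarities; the identifiers and scores are invented:

```python
def jobs_to_remind(similarities, threshold=0.9):
    """Newly found positions whose similarity meets the preset value,
    sorted in descending order of similarity."""
    return sorted((job for job, s in similarities.items() if s >= threshold),
                  key=lambda j: -similarities[j])

sims = {"job-A": 0.95, "job-B": 0.85, "job-C": 0.92}
hits = jobs_to_remind(sims)
# job-A and job-C meet the 90% threshold; job-B does not
```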
In one embodiment, the user selects a version of the resume, and the corresponding positions are automatically matched from the local position database. The user can then directly batch-select the desired positions and deliver resumes to them in batches with one click.
Since the matched positions are searched from the currently selected resume, delivering the wrong resume is also avoided.
At delivery time, the URLs of the selected positions are opened in sequence; the corresponding recruitment website may require logging in, and the present disclosure can also be authorized to complete the delivery automatically.
In this way, job seekers can be effectively helped to find newly added positions, the burden of delivering resumes is reduced, more resumes are delivered, and the chance of job-hunting success increases.
FIG. 5 shows a schematic diagram of a system for batch delivery of exemplary different versions of resumes according to an embodiment of the present disclosure. The system 500, comprising:
a job position obtaining unit 501, configured to obtain job position information downloaded from multiple websites, and form a local job position database;
a theme generating unit 502, configured to generate a theme of the local position information according to the position information in the local position database;
a first correlation vector unit 503, configured to calculate a first correlation vector between each local position and the generated topic;
a second correlation vector unit 504, configured to calculate a second correlation vector between the user-selected version of the resume and the generated topic;
a similarity calculation unit 505, configured to calculate a similarity between the first correlation vector and the second correlation vector;
a resume delivery unit 506, configured to deliver the user-selected version of the resume to the associated one or more positions based on the descending order of similarity.
In one embodiment, the system 500 further comprises: an update job unit 507, configured to, after the resumes are delivered to the associated positions in descending order of similarity, automatically download the position information again from the preset sites within a preset time period to obtain updated local position information.
In one embodiment, the system 500 further comprises: an automatic reminding unit 508, configured to automatically remind the user of newly found positions whose similarity meets the preset value, and to automatically deliver the preset version of the resume to those positions.
In one embodiment, the system 500 further comprises: a one-touch delivery unit 509, configured to allow the user to select a certain version of the resume and automatically match the corresponding positions from the local position database; the desired positions can then be selected directly in batches, and resumes delivered in batches with one click.
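A structural sketch of how units 501 through 506 might be wired into a pipeline follows. The unit internals are deliberately trivial stand-ins (word-overlap "topics" and Jaccard similarity rather than a trained topic model), and every name is hypothetical:

```python
# Structural sketch of system 500: each method stands in for one unit.
class BatchDeliverySystem:
    def acquire_jobs(self, sites):                 # job position obtaining unit 501
        return [job for site in sites for job in site]

    def generate_topics(self, jobs):               # theme generating unit 502
        return set(word for job in jobs for word in job.split())

    def correlate(self, text, topics):             # correlation vector units 503/504
        return set(text.split()) & topics

    def similarity(self, vec_a, vec_b):            # similarity calculation unit 505
        union = vec_a | vec_b
        return len(vec_a & vec_b) / len(union) if union else 0.0

    def rank_and_deliver(self, sites, resume):     # resume delivery unit 506
        jobs = self.acquire_jobs(sites)
        topics = self.generate_topics(jobs)
        resume_vec = self.correlate(resume, topics)
        return sorted(jobs, reverse=True,
                      key=lambda j: self.similarity(self.correlate(j, topics),
                                                    resume_vec))

system = BatchDeliverySystem()
sites = [["python data engineer", "sales manager"]]
print(system.rank_and_deliver(sites, "python engineer resume"))
```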
Fig. 6 shows a schematic structural diagram of an electronic device for implementing an embodiment of the present disclosure. As shown in fig. 6, the electronic device 600 includes a central processing unit (CPU) 601 that can perform various appropriate actions and processes in accordance with a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the electronic device 600. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer-readable medium bearing instructions; in such embodiments, the instructions may be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the instructions are executed by the central processing unit (CPU) 601, the various method steps described in this disclosure are performed.
Although example embodiments have been described, it will be apparent to those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the disclosed concept. Accordingly, it should be understood that the above-described exemplary embodiments are not limiting, but illustrative.

Claims (10)

1. A batch delivery method of resumes of different versions comprises the following steps:
acquiring position information downloaded from a plurality of sites to form a local position database;
generating a topic of the local position information according to the position information in the local position database;
calculating a first correlation vector between each piece of position information in the local position database and the generated topic;
calculating a second correlation vector between the user-selected version of the resume and the generated topic;
calculating the similarity of the first correlation vector and the second correlation vector;
delivering the resume in the user-selected version to the associated one or more positions based on the descending order of similarity.
2. The method of claim 1, the method of generating the topic of the local job information comprising:
segmenting all position information in the local position database according to an existing dictionary, the segmentation comprising sentence breaking, word segmentation, and stop-word removal, to obtain first segmented words;
extracting 2-grams and 3-grams from the obtained first segmented words, calculating a mutual information value for each 2-gram and 3-gram, arranging the 2-grams and 3-grams in descending order of mutual information value, and selecting the top-ranked 2-grams and 3-grams to update the first segmented words and the existing dictionary;
calculating information entropies of the characters adjacent to the left and right of the segmented words, merging the first segmented words based on the information entropies, and further updating the first segmented words and the existing dictionary;
filtering the second segmented words, obtained after the first segmented words are updated, by using a TF-IDF method to obtain third segmented words;
classifying words according to the positions of the sites, counting the probability of the words appearing in the local position information, and filtering the third segmented words according to the probability to obtain fourth segmented words;
and converting the fourth segmented words into word vectors, and clustering the word vectors to obtain a plurality of word clusters serving as the topics of the local position information.
3. The method of claim 1, wherein the calculation of the first correlation vector or the second correlation vector comprises: calculating the first correlation vector or the second correlation vector based on a machine learning model of an LDA topic model.
4. The method of claim 1, wherein said calculating the similarity of the first correlation vector and the second correlation vector comprises: calculating a cosine distance, a Euclidean distance, or a Manhattan distance between the first correlation vector and the second correlation vector.
5. The method of claim 1, further comprising, after the resumes are delivered to the associated positions in descending order of similarity: automatically downloading the position information again from the preset sites within a preset time period to obtain updated local position information.
6. The method of claim 5, further comprising, after the obtaining of the updated local position information: automatically reminding the user of newly found positions whose similarity meets the preset value, and automatically delivering the preset version of the resume to those positions.
7. The method of claim 3, wherein training data for the machine learning model based on the LDA topic model is derived from the intersection of the words obtained by segmenting the local position information with the updated existing dictionary and the words comprised by the topics.
8. A batch delivery system of different versions of resumes comprising:
the system comprises a position acquisition unit, a position database and a position database, wherein the position acquisition unit is used for acquiring position information downloaded from a plurality of sites and forming the local position database;
the theme generating unit is used for generating a theme of the local position information according to the position information in the local position database;
the first correlation vector unit is used for calculating a first correlation vector of each local position and the generated theme;
the second correlation vector unit is used for calculating a second correlation vector of the user-selected version of the resume and the generated theme;
a similarity calculation unit configured to calculate a similarity between the first correlation vector and the second correlation vector;
and the resume delivery unit is used for delivering the resume of the version selected by the user to the associated one or more positions based on the descending order of the similarity.
9. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-7.
10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method of any one of claims 1 to 7.
CN202010954388.4A 2020-09-11 2020-09-11 Batch delivery method and system for resumes of different versions Pending CN112100492A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010954388.4A CN112100492A (en) 2020-09-11 2020-09-11 Batch delivery method and system for resumes of different versions


Publications (1)

Publication Number Publication Date
CN112100492A true CN112100492A (en) 2020-12-18

Family

ID=73750934

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010954388.4A Pending CN112100492A (en) 2020-09-11 2020-09-11 Batch delivery method and system for resumes of different versions

Country Status (1)

Country Link
CN (1) CN112100492A (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105786781A (en) * 2016-03-14 2016-07-20 裴克铭管理咨询(上海)有限公司 Job description text similarity calculation method based on topic model
JP5965557B1 (en) * 2016-01-29 2016-08-10 株式会社リクルートホールディングス Similarity learning system and similarity learning method
CN107341233A (en) * 2017-07-03 2017-11-10 北京拉勾科技有限公司 A kind of position recommends method and computing device
CN108021558A (en) * 2017-12-27 2018-05-11 北京金山安全软件有限公司 Keyword recognition method and device, electronic equipment and storage medium
CN108090231A (en) * 2018-01-12 2018-05-29 北京理工大学 A kind of topic model optimization method based on comentropy
CN108763213A (en) * 2018-05-25 2018-11-06 西南电子技术研究所(中国电子科技集团公司第十研究所) Theme feature text key word extracting method
CN108920544A (en) * 2018-06-13 2018-11-30 桂林电子科技大学 A kind of personalized position recommended method of knowledge based map
CN109710947A (en) * 2019-01-22 2019-05-03 福建亿榕信息技术有限公司 Power specialty word stock generating method and device
CN109783636A (en) * 2018-12-12 2019-05-21 重庆邮电大学 A kind of car review subject distillation method based on classifier chains
CN109978510A (en) * 2019-04-02 2019-07-05 北京网聘咨询有限公司 Campus recruiting management system and method
CN110134847A (en) * 2019-05-06 2019-08-16 北京科技大学 A kind of hot spot method for digging and system based on internet Financial Information
CN111061877A (en) * 2019-12-10 2020-04-24 厦门市美亚柏科信息股份有限公司 Text theme extraction method and device
US20200193382A1 (en) * 2018-12-17 2020-06-18 Robert P. Michaels Employment resource system, method and apparatus
CN111353050A (en) * 2019-12-27 2020-06-30 北京合力亿捷科技股份有限公司 Word stock construction method and tool in vertical field of telecommunication customer service
CN111461637A (en) * 2020-02-28 2020-07-28 平安国际智慧城市科技股份有限公司 Resume screening method and device, computer equipment and storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LIU Jian, LIU Wuying, CHENG Xueqi: "New Progress in Terminology Research" (《术语学研究新进展》), 31 March 2015, pages 146-150 *
WANG Xiaohua, XU Ning, CHEN Zhiqun: "Topic word clustering and topic discovery in texts based on co-word analysis" (基于共词分析的文本主题词聚类与主题发现), Information Science (《情报科学》), 5 November 2011, pages 1621-1624 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination