CN112559726A - Resume information filtering method, model training method, device, equipment and medium - Google Patents

Publication number
CN112559726A
Authority
CN
China
Prior art keywords
information
resume
credibility
experience
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011534809.4A
Other languages
Chinese (zh)
Inventor
焦学宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Eebochina Technology Co ltd
Original Assignee
Shenzhen Eebochina Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Eebochina Technology Co ltd
Priority to CN202011534809.4A
Publication of CN112559726A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources
    • G06Q10/1053 Employment or hiring

Abstract

The application provides a method for filtering resume information, a method and a device for model training, electronic equipment and a storage medium, wherein the filtering method comprises the following steps: carrying out field replacement on the resume information to be filtered to obtain target resume information; performing information matching on the target resume information and preset post information based on a pre-trained information matching model to obtain a comprehensive matching degree; and determining the filtering result of the resume information according to the comprehensive matching degree. According to the method, the description terms of the resume information are the same as those of the recruitment post information through field replacement, so that the matching degree of the resume content and the post content is improved, and the accuracy of the filtering result is improved.

Description

Resume information filtering method, model training method, device, equipment and medium
Technical Field
The application relates to the technical field of computers, in particular to a method for filtering resume information, a method and a device for training models, electronic equipment and a storage medium.
Background
At present, enterprises generally receive resumes submitted by applicants through a recruitment system. Because an enterprise receives a large number of resumes every day, the recruitment system is required to automatically screen for the resumes that are relevant to the open posts in order to improve recruitment efficiency.
In the related art, resume filtering is usually implemented by keyword matching. Specifically, resume keywords and post keywords are extracted from the resume information and the post information respectively, and the two sets of keywords are matched. However, because applicants and recruiters have different wording habits, the terms used in a resume often differ from those used in the post description, so the measured matching degree between resume content and post content is low. The current resume filtering method therefore suffers from low accuracy of filtering results.
Disclosure of Invention
An embodiment of the application aims to provide a method for filtering resume information, a method and a device for training a model, electronic equipment and a storage medium, and aims to solve the problem that the accuracy of a filtering result is low in the conventional resume filtering method.
In a first aspect, an embodiment of the present application provides a method for filtering resume information, including:
segmenting words of the resume information to be filtered to obtain a plurality of resume keywords;
determining resume replacement words corresponding to the plurality of resume keywords based on a preset replacement relation between the keywords and the replacement words;
updating the resume information according to the plurality of resume replacement words to obtain target resume information;
performing information matching on the target resume information and preset post information based on a pre-trained information matching model to obtain a comprehensive matching degree;
and determining the filtering result of the resume information according to the comprehensive matching degree.
In the above implementation, to address the low accuracy of filtering results caused by the different wording habits of applicants and recruiters, the word segmentation and updating operations make the description style of the target resume information consistent with that of the preset post information, which improves the matching degree between resume content and post content and thus the accuracy of the filtering result. Information matching is then performed on the target resume information and the preset post information based on a pre-trained information matching model to obtain a comprehensive matching degree, and the filtering result of the resume information is determined according to the comprehensive matching degree. The method provides more efficient and accurate matching and filtering for initial resume screening, the first stage of the internet recruitment process, reduces the time recruiters spend on screening resumes, and helps select higher-quality resumes.
Further, performing information matching on the target resume information and the preset post information based on the pre-trained information matching model to obtain a comprehensive matching degree includes:
determining the credibility of the target resume information according to various structured information in the target resume information by using an information matching model;
if the credibility accords with the preset credibility, determining the basic matching degree of the target resume information and the preset post information;
and determining the comprehensive matching degree of the target resume information based on the credibility and the basic matching degree.
In the implementation process, authenticity verification is carried out on various structured information of the target resume information, so that the filtered resume has higher quality, and recruitment efficiency of a recruiter is improved.
Further, before determining the credibility of the target resume information according to various kinds of structured information in the target resume information by using the information matching model, the method comprises the following steps:
and structuring the target resume information to obtain various structured information of the target resume information, wherein the structured information comprises personal basic information, job hunting intention information, work experience information, education experience information, project experience information and/or skill information.
In the implementation process, the target resume information is structured, so that authenticity verification is performed on each structured information in the target resume information, and verification efficiency and accuracy are improved.
Further, determining the credibility of the target resume information according to various structured information in the target resume information by using the information matching model includes:
and determining the personal information credibility, the work experience credibility, the project experience credibility and the education experience credibility of the target resume information by using the information matching model according to the personal basic information, the work experience information, the project experience information and the education experience information in the target resume information respectively.
In the implementation process, the authenticity of each piece of structured information in the target resume information is verified, so that the verification accuracy of the target resume information is improved.
Further, determining the personal information credibility, the work experience credibility, the project experience credibility and the education experience credibility of the target resume information by using the information matching model according to the personal basic information, the work experience information, the project experience information and the education experience information in the target resume information respectively includes:
inputting the personal basic information into the information matching model, outputting the basic information completeness and the information change frequency, and taking the basic information completeness and the information change frequency as the personal information credibility;
inputting the work experience information into the information matching model, outputting the work unit credibility and the work time overlap ratio, and taking the work unit credibility and the work time overlap ratio as the work experience credibility;
inputting the project experience information into the information matching model, outputting the project unit credibility and the project time overlap ratio, and taking the project unit credibility and the project time overlap ratio as the project experience credibility;
inputting the education experience information into the information matching model, outputting the education unit credibility and the academic qualification credibility, and taking the education unit credibility and the academic qualification credibility as the education experience credibility.
In the above implementation, authenticity verification is performed on each piece of structured information through multiple dimensions of the model, which improves the verification accuracy of the structured information.
In a second aspect, an embodiment of the present application provides a model training method, which is applied to the information matching model in the first aspect, and the method includes:
acquiring resume data and post data, and deduplicating the resume data;
structuring the deduplicated resume data and the post data, and taking the structured resume data and post data as training samples;
and inputting the training samples into a pre-established graph neural network model for training until the graph neural network model reaches a preset convergence condition, so as to obtain the information matching model.
In the above implementation, deduplication reduces repeated resume data, which avoids overfitting of the trained model caused by highly similar training samples and improves the matching accuracy of the information matching model.
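The deduplication step can be illustrated with a minimal sketch. The application does not say how duplicates are detected, so exact matching on a hash of whitespace-normalized, lowercased text is an assumption made here for illustration, and the function names `normalize` and `deduplicate` are hypothetical.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivially different copies compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(resumes: list[str]) -> list[str]:
    """Keep the first occurrence of each resume, comparing hashes of the normalized text.

    Assumption: duplicates are exact copies after normalization. Near-duplicate
    detection would need a different technique.
    """
    seen: set[str] = set()
    unique: list[str] = []
    for text in resumes:
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique
```

Hashing keeps the memory cost per resume constant, which may matter when the training corpus is large.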
In a third aspect, an embodiment of the present application provides a device for filtering resume information, including:
the word segmentation module is used for segmenting words of the resume information to be filtered to obtain a plurality of resume keywords;
the first determining module is used for determining resume replacement words corresponding to the plurality of resume keywords based on a preset replacement relation between keywords and replacement words;
the updating module is used for updating the resume information according to the resume replacement words to obtain the target resume information;
the matching module is used for performing information matching on the target resume information and the preset post information based on a pre-trained information matching model to obtain comprehensive matching degree;
and the second determining module is used for determining the filtering result of the resume information according to the comprehensive matching degree.
In a fourth aspect, an embodiment of the present application provides a model training apparatus, including:
the acquisition module is used for acquiring resume data and post data and deduplicating the resume data;
the structuring module is used for structuring the deduplicated resume data and the post data and taking the structured resume data and post data as training samples;
and the training module is used for inputting the training sample into a pre-established graph neural network model for training until the graph neural network model reaches a preset convergence condition, so as to obtain an information matching model.
In a fifth aspect, an embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is used to store a computer program, and the processor runs the computer program to make the electronic device execute the method for filtering resume information in the first aspect or the method for training models in the second aspect.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium, which is characterized by storing a computer program, and when the computer program is executed by a processor, the computer program implements the method for filtering resume information according to the first aspect, or the method for training a model according to the second aspect.
It is to be understood that, for the beneficial effects of the third aspect to the sixth aspect, reference may be made to the description of the first aspect or the second aspect, and details are not described herein again.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a flowchart illustrating an implementation of a method for filtering resume information according to an embodiment of the present disclosure;
Fig. 2 is a flowchart illustrating an implementation of a model training method according to an embodiment of the present disclosure;
Fig. 3 is a schematic structural diagram of a filtering apparatus for resume information according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a model training apparatus according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 6 is a schematic data structure diagram of resume data provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
As described in the background, because applicants and recruiters have different wording habits, the terms used in a resume differ from those used in the post description, so the matching degree between resume content and post content is low and the accuracy of the filtering result is low.
In order to solve the problems in the prior art, the application provides a method for filtering resume information, in which target resume information is obtained by performing field replacement on the resume information to be filtered, so that the descriptive terms of the resume information are the same as those of the recruitment post information, which improves the matching degree between resume content and post content and thus the accuracy of the filtering result. Information matching is then performed on the target resume information and preset post information based on a pre-trained information matching model to obtain a comprehensive matching degree, and the filtering result of the resume information is determined according to the comprehensive matching degree. The method provides more efficient and accurate matching and filtering for initial resume screening, the first stage of the internet recruitment process, reduces the time recruiters spend on screening resumes, and helps select higher-quality resumes.
Example one
Referring to fig. 1, fig. 1 is a flowchart illustrating an implementation of a method for filtering resume information according to an embodiment of the present application. The method for filtering resume information described in the embodiments of the present application can be applied to electronic devices, including but not limited to computer devices such as smart phones, tablet computers, desktop computers, supercomputers, personal digital assistants, physical servers, and cloud servers. The method for filtering resume information in the embodiment of the application includes steps S101 to S105, which are detailed as follows:
Step S101, performing word segmentation on the resume information to be filtered to obtain a plurality of resume keywords.
In this embodiment, the resume information is the resume information submitted by an applicant for an open post, and may include basic information and detailed information. Basic information may include, but is not limited to, name, gender, ethnicity, contact information, marital status, and home address; detailed information may include, but is not limited to, education experience, work experience, project experience, and hobbies. The target resume information is resume information obtained through field replacement whose description style is the same as that of the post information. For example, the resume information is "studied at Zhongguancun College of Applied Literature" and the target resume information is "studied at Beijing University". It is understood that the target resume information otherwise carries the same information content as the resume information, such as the basic information and the detailed information.
Step S102, determining resume replacement words corresponding to the plurality of resume keywords based on a preset replacement relation between keywords and replacement words.
In this embodiment, the replacement relation between keywords and replacement words is set in advance. Optionally, the resume information is first cleaned into a preset format according to its information composition and then split into a plurality of data modules, including but not limited to personal basic information, job hunting intention information, work experience information, education experience information, project experience information, and skill information. Next, the word segmentation function provided by the Elasticsearch search engine is used to segment the resume information of each data module, and the occurrences of the resulting keywords are counted. When the number of occurrences of a keyword in the global keyword count of a data module reaches a set threshold, the keyword is recorded in the keyword thesaurus of that module. The thesaurus is maintained manually at regular intervals: the recorded keywords are classified and the conversion relation for each is specified.
Taking the education experience keyword thesaurus as an example: at an early stage, the school name field contains the keywords "Beijing University" and "Beida", and both are specified to convert to "Beijing University". After further analysis of resume information and post information, the keywords "Yuanmingyuan Vocational and Technical College" and "Zhongguancun College of Applied Literature" are also added to the thesaurus, and their conversion relation is likewise specified as "Beijing University".
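The thesaurus-building procedure can be sketched as follows. This is an illustration only: the `tokenize` function is a comma-splitting stand-in for a real word segmenter (it does not call Elasticsearch), and the names `tokenize`, `collect_candidates` and `EDUCATION_THESAURUS` are hypothetical.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Stand-in segmenter: splits on commas. A real system would use a word segmenter."""
    return [token.strip() for token in text.split(",") if token.strip()]


def collect_candidates(module_texts: list[str], threshold: int) -> list[str]:
    """Return keywords whose count across one data module reaches the threshold."""
    counts: Counter[str] = Counter()
    for text in module_texts:
        counts.update(tokenize(text))
    return sorted(keyword for keyword, n in counts.items() if n >= threshold)


# Conversion relations are specified by hand during periodic maintenance.
# The entries mirror the education experience example in the text.
EDUCATION_THESAURUS: dict[str, str] = {
    "Beijing University": "Beijing University",
    "Beida": "Beijing University",
    "Yuanmingyuan Vocational and Technical College": "Beijing University",
    "Zhongguancun College of Applied Literature": "Beijing University",
}
```

Only the candidate collection is automated in this sketch; deciding which canonical term each candidate maps to remains a manual step, as the text describes.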
Further, the data in each split data module is classified as unique data or non-unique data. Unique data is data for which only one valid value should exist at the application logic level (such as personal information, basic information and job hunting intention); when such data is modified, all of its history records are retained as a linked list, with fields added for the start and end time of each record's life cycle. For non-unique data (such as education and training experience and work experience), all identifiable records are retained.
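One way to picture the history of a unique field is a singly linked list of versions, each carrying its life-cycle start and end time. The sketch below is an assumption about how such a structure might look; the class and attribute names (`Version`, `UniqueField`, `valid_from`, `valid_to`) are hypothetical and not taken from the application.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Version:
    value: str
    valid_from: date                       # life-cycle start time
    valid_to: Optional[date] = None        # life-cycle end time; None while still valid
    previous: Optional["Version"] = None   # link to the version this one superseded


class UniqueField:
    """A field with one valid value at a time and a retained modification history."""

    def __init__(self, value: str, start: date) -> None:
        self.head = Version(value, start)

    def update(self, value: str, when: date) -> None:
        """Close the current version and push the new one onto the list."""
        self.head.valid_to = when
        self.head = Version(value, when, previous=self.head)

    @property
    def current(self) -> str:
        return self.head.value

    def history(self) -> list[str]:
        """All values, newest first."""
        values: list[str] = []
        node: Optional[Version] = self.head
        while node is not None:
            values.append(node.value)
            node = node.previous
        return values
```

Keeping the superseded versions makes it possible to count how often a field has changed, which is the kind of signal an information change frequency could be derived from.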
Step S103, updating the resume information according to the plurality of resume replacement words to obtain the target resume information.
In this embodiment, updating the resume information means replacing keywords in the resume information with the preset replacement words. Optionally, a replacement relation between fields and replacement words is set in advance, and each field in the resume information is checked against it for a replacement word. If a field has a replacement word, the field is replaced with it; for example, if the resume information is "studied at Zhongguancun College of Applied Literature", the target resume information after field replacement is "studied at Beijing University". Alternatively, the replacement word is recorded in the resume information as an alternate term; for example, if the resume information is "studied at Zhongguancun College of Applied Literature", the target resume information after field replacement is "studied at Zhongguancun College of Applied Literature (Beijing University)". It should be understood that this notation for the alternate term is merely exemplary, and other characters may be used in other embodiments, such as "studied at Zhongguancun College of Applied Literature-Beijing University", which will not be described in detail herein.
It can be understood that the above manner of updating the resume information may be to replace the keyword with a replacement word, or to record the replacement word in the resume information as an explanatory term of the keyword.
According to the embodiment, the target resume information is obtained by updating the resume information to be filtered, so that the descriptive terms of the resume information are the same as those of the recruitment post information, the matching degree of resume contents and post contents is improved, and the accuracy of the filtering result is improved.
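The two update modes described for step S103 (replacing the keyword outright, or recording the replacement word beside it) can be sketched in a few lines. The function name `apply_replacements`, the `annotate` flag and the sample mapping are hypothetical, and plain substring matching is an assumption; the application does not specify how fields are located.

```python
import re

# Illustrative mapping only; in practice it would come from the keyword thesaurus.
REPLACEMENTS = {
    "Zhongguancun College of Applied Literature": "Beijing University",
    "Beida": "Beijing University",
}


def apply_replacements(text: str, replacements: dict[str, str], annotate: bool = False) -> str:
    """Replace each known keyword, or append its replacement word in parentheses.

    A single regex pass is used so that text produced by one replacement is
    never rewritten by another. Longer keywords are tried first.
    """
    if not replacements:
        return text
    keywords = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keywords))

    def substitute(match: re.Match) -> str:
        keyword = match.group(0)
        target = replacements[keyword]
        return f"{keyword} ({target})" if annotate else target

    return pattern.sub(substitute, text)
```

For example, `apply_replacements("studied at Beida", REPLACEMENTS)` yields the replaced form, while passing `annotate=True` keeps the applicant's original wording and adds the canonical term after it.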
Step S104, performing information matching on the target resume information and the preset post information based on a pre-trained information matching model to obtain a comprehensive matching degree.
In this embodiment, the comprehensive matching degree is the final matching degree of the target resume information and the preset post information, that is, a matching degree parameter for evaluating whether the resume information meets the post requirement. The preset post information is the post information used for filtering resume information; to keep the descriptive terms uniform, it is likewise obtained after field replacement.
The information matching model is an artificial intelligence model that verifies the authenticity of each item of information content in the target resume information and computes the matching degree between the target resume information and the preset post information. The model can be obtained by training based on a graph neural network model and a convolutional neural network model. It can be understood that the information matching model may be trained in advance by the electronic device, or the model file and the algorithm file corresponding to the information matching model may be transplanted to the electronic device after being trained in advance by other devices. That is, the entity that trains the information matching model and the entity that uses it may be the same or different.
The information matching model comprises a plurality of model dimensions. Illustratively, according to the information content of the target resume information, the model dimensions may include basic information completeness, information change frequency, work unit credibility, work time overlap ratio, project unit credibility, project time overlap ratio, education unit credibility, academic qualification credibility, skill matching degree, comprehensive matching degree, and the like. Depending on its input, the information matching model applies the corresponding model dimension and outputs the corresponding result. For example, if an academic certificate in the education experience is the input, the model dimension applied is the academic qualification credibility, and the model outputs the value of the academic qualification credibility.
In a possible implementation mode, the target resume information and the preset position information are input into an information matching model, the information matching model carries out authenticity verification on each information content in the target resume information, if the target resume information meets the authenticity requirement, whether the target resume information meets the position requirement corresponding to the preset position information or not is verified, and the comprehensive matching degree is output.
Step S105, determining the filtering result of the resume information according to the comprehensive matching degree.
In this embodiment, the comprehensive matching degree is compared with a preset matching degree, and if the comprehensive matching degree is not less than the preset matching degree, it is indicated that the resume information meets the post requirement corresponding to the preset post information, so that it is determined that the filtering result of the resume information is that the resume passes the filtering; if the comprehensive matching degree is smaller than the preset matching degree, the resume information is not in accordance with the post requirement corresponding to the preset post information, and therefore the filtering result of the resume information is determined to be that the resume does not pass the filtering.
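The comparison in step S105 reduces to a threshold test. A minimal sketch, assuming the comprehensive matching degree is a number and using hypothetical names (`passes_filter`, `partition`):

```python
def passes_filter(comprehensive_match: float, preset_match: float) -> bool:
    """A resume passes when its comprehensive matching degree is not less than the preset one."""
    return comprehensive_match >= preset_match


def partition(scores: dict[str, float], preset_match: float) -> tuple[list[str], list[str]]:
    """Split resume identifiers into those that pass the filter and those that do not."""
    passed = [rid for rid, score in scores.items() if passes_filter(score, preset_match)]
    rejected = [rid for rid, score in scores.items() if not passes_filter(score, preset_match)]
    return passed, rejected
```

Note that a score exactly equal to the preset matching degree passes, matching the "not less than" wording of the text.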
As an embodiment of the present application, on the basis of the embodiment in fig. 1, step S104 of performing information matching on the target resume information and the preset post information based on the pre-trained information matching model to obtain a comprehensive matching degree includes:
determining the credibility of the target resume information according to various structured information in the target resume information by using an information matching model; if the credibility accords with the preset credibility, determining the basic matching degree of the target resume information and the preset post information; and determining the comprehensive matching degree of the target resume information based on the credibility and the basic matching degree.
In the above implementation, the structured information is information obtained by structuring the target resume information, and includes personal basic information, job hunting intention information, work experience information, education experience information, project experience information, and/or skill information. The authenticity of the structured information is verified along multiple model dimensions of the information matching model; that is, each kind of structured information is input into the information matching model, and the information matching model outputs the result for the corresponding model dimension. When the structured information is authentic (the credibility corresponding to each kind of structured information meets the preset credibility), the basic matching degree between the target resume information and the preset post information is calculated. Finally, the credibility corresponding to each kind of structured information and the basic matching degree are taken as the input of the information matching model, which outputs the comprehensive matching degree.
Further, each credibility indicator of the structured information may be assigned a set of value ranges, so that the processing result is determined by the range into which the indicator falls. For example, when the work experience credibility is less than 70%, the resume is filtered out directly (0 points); when it is between 70% and 85%, the resume is pending (6 points); and when it is greater than 85%, the work experience is credible (10 points). It will be appreciated that the weight of each credibility indicator is configurable, as is the weight of each model dimension within a credibility indicator.
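The range-based handling in the example above can be sketched as follows. The band boundaries and the scores of 0, 6, and 10 come from the example; the function names, the indicator names, and the use of a weighted average to combine indicators are assumptions made for illustration, since the application only states that the weights are configurable.

```python
from typing import Dict, Tuple


def band_score(credibility: float) -> Tuple[str, int]:
    """Map a credibility value in [0, 1] to a processing result and a score."""
    if credibility < 0.70:
        return ("filtered", 0)
    if credibility <= 0.85:
        return ("pending", 6)
    return ("credible", 10)


def weighted_score(scores: Dict[str, int], weights: Dict[str, float]) -> float:
    """Combine per-indicator scores with configurable weights.

    A weighted average is an assumption; the application does not specify
    how the indicators are combined.
    """
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


if __name__ == "__main__":
    # Hypothetical indicator values and weights.
    scores = {
        "work_experience": band_score(0.90)[1],
        "education_experience": band_score(0.75)[1],
    }
    weights = {"work_experience": 0.5, "education_experience": 0.5}
    print(weighted_score(scores, weights))
```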
As an embodiment of the present application, on the basis of the embodiment in fig. 1, the determining, by using the information matching model, the reliability of the target resume information according to the various kinds of structured information in the target resume information includes:
and determining the personal information credibility, the work experience credibility, the project experience credibility and the education experience credibility of the target resume information by using the information matching model according to the personal basic information, the work experience information, the project experience information and the education experience information in the target resume information respectively.
Optionally, the personal basic information is input into the information matching model, which outputs the basic information integrity and the information change frequency, and these are taken as the personal information credibility; the work experience information is input into the information matching model, which outputs the work unit credibility and the work time overlap ratio, and these are taken as the work experience credibility; the project experience information is input into the information matching model, which outputs the project unit credibility and the project time overlap ratio, and these are taken as the project experience credibility; the education experience information is input into the information matching model, which outputs the education unit credibility and the academic record credibility, and these are taken as the education experience credibility.
In the above implementation, the basic information integrity represents how complete the personal information is; the information change frequency represents how often information that normally does not change, such as the mobile phone number, WeChat ID, nationality, birthday, and gender, has been changed; the work unit credibility is the degree to which the company names in all of the person's work experiences, once matched against third-party enterprise business registration information to obtain the companies' background information, are consistent with the descriptions in the resume; the work time overlap ratio is based on the fact that the same person usually cannot work at different companies in the same time period: the time ranges covered by all historical work experiences are checked, in particular by comparing data across different resume sources, to obtain the credibility of the work experience in the time dimension; the project unit credibility is the degree to which the company to which a project belongs matches the company in the work experience for the same time period; the project time overlap ratio is the degree to which the time spent on a project matches the time at the company in the work experience for the same time period; the education unit credibility is obtained by comparison with industry data or data from authoritative third-party partner institutions, checking the credibility of the school (institution), major (training course), qualification certificates, start time, and the like; the academic record credibility is the degree to which the certification documents that the school (institution) can issue match the corresponding content.
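The time-dimension check on work experience can be illustrated with a small sketch that finds pairs of work experiences whose date ranges overlap. The application does not give a formula for the overlap ratio, so this sketch only detects conflicts; the `Period` layout, the function name, and the treatment of end dates as inclusive are assumptions.

```python
from datetime import date
from typing import List, Tuple

# (company name, start date, end date); the end date is treated as inclusive.
Period = Tuple[str, date, date]


def overlapping_pairs(periods: List[Period]) -> List[Tuple[str, str]]:
    """Return pairs of companies whose employment periods overlap."""
    ordered = sorted(periods, key=lambda p: p[1])
    conflicts = []
    for i, (name_a, _start_a, end_a) in enumerate(ordered):
        for name_b, start_b, _end_b in ordered[i + 1:]:
            if start_b > end_a:
                # Periods are sorted by start date, so no later period
                # can overlap with the current one.
                break
            conflicts.append((name_a, name_b))
    return conflicts


if __name__ == "__main__":
    # Hypothetical work history.
    history = [
        ("Company A", date(2018, 1, 1), date(2019, 12, 31)),
        ("Company B", date(2019, 6, 1), date(2020, 5, 31)),
        ("Company C", date(2020, 6, 1), date(2021, 1, 1)),
    ]
    print(overlapping_pairs(history))  # [('Company A', 'Company B')]
```

A credibility value could then be derived from the number or length of the conflicts, for example by comparing the same person's history across resume sources, but that mapping is left open here.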
Optionally, before determining the reliability of the target resume information according to the various kinds of structured information in the target resume information by using the information matching model, the method includes: and structuring the target resume information to obtain various structured information of the target resume information.
Example two
Referring to fig. 2, fig. 2 is a flowchart illustrating an implementation of a model training method provided in an embodiment of the present application. The model training method described below in the embodiments of the present application may be applied to electronic devices including, but not limited to, computer devices such as smart phones, tablet computers, desktop computers, supercomputers, personal digital assistants, physical servers, and cloud servers. The model training method of the embodiment of the application comprises steps S201 to S203, which are detailed as follows:
Step S201, acquiring resume data and post data, and deduplicating the resume data.
In this embodiment, resume data and post data are obtained from multiple channels, such as resume data and post data of each large recruitment website. In order to ensure that the formats of the resume data and the post data acquired from multiple channels are uniform, the resume data and the post data are structured and de-duplicated.
Fig. 6 is a schematic diagram illustrating the structure of the post data according to the embodiment of the present application. Illustratively, the deduplication process may include the following. The combination of name + mobile phone number + life cycle end time of the record is used as an identifier that uniquely identifies a natural person, and a zipper table (a table design that retains all historical update operations while using relatively little storage space) is created to hold the basic information related to that person. The basic information includes, but is not limited to, name, mobile phone number, gender, nationality, political affiliation, other contact details (WeChat, QQ, microblog, etc.), creation time, last modification time, life cycle start time, and life cycle end time. When records share the same mobile phone number but have different names, they are treated as different natural persons. When records share the same name but have different mobile phone numbers, the time-related information corresponding to the mobile phone numbers is looked up, with the most recent 6 months of data used as the basis for judgment: newer information within the most recent 6 months is taken as the latest basic information (the latest data is inserted), and information older than 6 months is treated as belonging to a different natural person. When natural person information is looked up, the record matching name + mobile phone number with the latest creation time is returned as the result. After the basic information of the natural person in the system is confirmed, the resume data acquired from each channel can be cleaned and then associated with the natural person.
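The lookup rule at the end of the paragraph above (name + mobile phone number, most recent creation time) can be sketched as follows. The `BasicInfo` type and the function name are hypothetical, and the 6-month rule for records with the same name but different phone numbers is deliberately not modeled, because the application does not state it precisely enough to implement without guessing.

```python
from datetime import datetime
from typing import Dict, List, NamedTuple, Tuple


class BasicInfo(NamedTuple):
    """A reduced basic-information record (hypothetical field selection)."""
    name: str
    phone: str
    created_at: datetime


def latest_by_person(records: List[BasicInfo]) -> Dict[Tuple[str, str], BasicInfo]:
    """Keep, for each (name, phone) key, the record with the latest creation time.

    Records with the same phone number but different names get different
    keys, so they are treated as different natural persons.
    """
    latest: Dict[Tuple[str, str], BasicInfo] = {}
    for record in records:
        key = (record.name, record.phone)
        if key not in latest or record.created_at > latest[key].created_at:
            latest[key] = record
    return latest


if __name__ == "__main__":
    # Hypothetical records collected from different channels.
    records = [
        BasicInfo("Zhang San", "13800000000", datetime(2020, 1, 5)),
        BasicInfo("Zhang San", "13800000000", datetime(2020, 9, 1)),
        BasicInfo("Li Si", "13800000000", datetime(2020, 3, 2)),
    ]
    for key, record in latest_by_person(records).items():
        print(key, record.created_at.date())
```

A production zipper table would additionally keep the superseded records with their life cycle start and end times rather than discarding them.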
Step S202, structuring the resume data and the post data after the duplication removal, and taking the resume data and the post data after the structuring as training samples.
In this embodiment, after all collected resume data are cleaned, the resume data can be split, according to a preset format, into seven major data modules: basic information, job-seeking intention, work experience, education/training experience, project experience, skills and specialties, and other information.
Step S203, inputting the training sample into a pre-established graph neural network model for training until the graph neural network model reaches a preset convergence condition, and obtaining the information matching model.
In this embodiment, the preset convergence condition is a condition indicating that the network training is complete; for example, a loss value from the loss function that is smaller than a preset loss threshold indicates convergence. Illustratively, the training sample is input into the graph neural network model for processing to obtain the comprehensive matching degree; a loss value between the comprehensive matching degree and the actual matching degree is calculated; when the loss value is greater than or equal to the preset loss threshold, the network parameters of the graph neural network model are adjusted, and the process returns to the step of inputting the training sample into the graph neural network model for processing to obtain the comprehensive matching degree; when the loss value is smaller than the preset loss threshold, training of the graph neural network model is complete, yielding the trained information matching model. Put simply, a smaller loss value indicates that the feature vectors extracted by the graph neural network are more accurate.
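The stopping rule described above (keep adjusting parameters while the loss is greater than or equal to the preset loss threshold) can be sketched with a deliberately simple stand-in model. This is not a graph neural network: a single-weight linear model trained by gradient descent on mean squared error is assumed purely to show the control flow, and the `max_steps` safeguard, the learning rate, and all names are assumptions.

```python
from typing import List, Tuple


def train_until_converged(
    samples: List[Tuple[float, float]],
    loss_threshold: float = 1e-4,
    learning_rate: float = 0.1,
    max_steps: int = 10_000,
) -> Tuple[float, float]:
    """Train a one-weight model y = weight * x until the loss is below the threshold.

    Each sample is (input feature, actual matching degree).
    Returns the trained weight and the final loss value.
    """
    weight = 0.0
    for _ in range(max_steps):
        # Forward pass: predicted matching degree for every sample.
        predictions = [weight * x for x, _ in samples]
        # Mean squared error between predicted and actual matching degree.
        loss = sum((p - y) ** 2 for p, (_, y) in zip(predictions, samples)) / len(samples)
        if loss < loss_threshold:
            return weight, loss  # preset convergence condition reached
        # Loss is still too large: adjust the parameter and repeat.
        gradient = sum(2 * (p - y) * x for p, (x, y) in zip(predictions, samples)) / len(samples)
        weight -= learning_rate * gradient
    raise RuntimeError("did not converge within max_steps")


if __name__ == "__main__":
    # Hypothetical samples generated by y = 0.5 * x.
    weight, loss = train_until_converged([(1.0, 0.5), (2.0, 1.0), (3.0, 1.5)])
    print(weight, loss)
```

In a real system the forward pass and parameter update would be carried out by the graph neural network and its optimizer; only the loop structure is meant to carry over.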
Example three
In order to implement the corresponding method of the above embodiments to achieve the corresponding functions and technical effects, the following provides a filtering apparatus for resume information. Referring to fig. 3, fig. 3 is a block diagram of a filtering apparatus for resume information according to an embodiment of the present application. The modules included in the apparatus in this embodiment are used to execute the steps in the embodiment corresponding to fig. 1, and refer to fig. 1 and the related description in the embodiment corresponding to fig. 1 specifically. For convenience of explanation, only the part related to the present embodiment is shown, and the filtering apparatus for resume information provided in the embodiment of the present application includes:
the word segmentation module 301 is configured to segment words of the resume information to be filtered to obtain a plurality of resume keywords;
a first determining module 302, configured to determine resume replacement words corresponding to a plurality of resume keywords based on a preset replacement relationship between the keywords and the replacement words;
the updating module 303 is configured to update resume information according to the plurality of resume replacement words to obtain target resume information;
the matching module 304 is used for performing information matching on the target resume information and the preset post information based on a pre-trained information matching model to obtain a comprehensive matching degree;
and a second determining module 305, configured to determine a filtering result of the resume information according to the comprehensive matching degree.
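The first three modules listed above (word segmentation, replacement-word lookup, and resume update) can be sketched as follows. Whitespace splitting is assumed as a stand-in for word segmentation, which would not be adequate for Chinese text, and the replacement table and function names are hypothetical.

```python
from typing import Dict, List


def segment(text: str) -> List[str]:
    """Split resume text into keywords (whitespace splitting as a stand-in)."""
    return text.split()


def normalize_resume(text: str, replacements: Dict[str, str]) -> str:
    """Replace each keyword that has a preset replacement word.

    Keywords without an entry in the replacement table are kept unchanged.
    """
    keywords = segment(text)
    return " ".join(replacements.get(word, word) for word in keywords)


if __name__ == "__main__":
    # Hypothetical replacement relation between keywords and replacement words.
    table = {"JS": "JavaScript"}
    print(normalize_resume("JS developer at BigCo", table))
    # JavaScript developer at BigCo
```

The resulting target resume information is what the matching module would then compare with the preset post information.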
As an embodiment of the present application, the matching module 304 specifically includes:
the second determining unit is used for determining the credibility of the target resume information according to various kinds of structured information in the target resume information by using the information matching model;
the third determining unit is used for determining the basic matching degree of the target resume information and the preset post information if the credibility accords with the preset credibility;
and the fourth determining unit is used for determining the comprehensive matching degree of the target resume information based on the credibility and the basic matching degree.
As an embodiment of the present application, the matching module 304 further includes:
and the structuring unit is used for structuring the target resume information to obtain various kinds of structured information of the target resume information, wherein the structured information comprises personal basic information, job hunting intention information, work experience information, education experience information, project experience information and/or skill information.
As an embodiment of the present application, the second determining unit is specifically configured to:
and determining the personal information credibility, the work experience credibility, the project experience credibility and the education experience credibility of the target resume information by using the information matching model according to the personal basic information, the work experience information, the project experience information and the education experience information in the target resume information respectively.
As an embodiment of the present application, the second determining unit specifically includes:
the first output subunit is used for inputting the personal basic information into the information matching model, outputting the integrity and the change frequency of the basic information and taking the integrity and the change frequency of the basic information as the reliability of the personal information;
the second output subunit is used for inputting the work experience information into the information matching model, outputting the work unit credibility and the work time overlap ratio, and taking the work unit credibility and the work time overlap ratio as the work experience credibility;
the third output subunit is used for inputting the project experience information into the information matching model, outputting the project unit credibility and the project time overlap ratio, and taking the project unit credibility and the project time overlap ratio as the project experience credibility;
and the fourth output subunit is used for inputting the education experience information into the information matching model, outputting the education unit credibility and the academic record credibility, and taking the education unit credibility and the academic record credibility as the education experience credibility.
The aforementioned filtering apparatus for resume information can implement the method for filtering resume information in the first embodiment. The alternatives in the first embodiment are also applicable to the present embodiment, and are not described in detail here. The rest of the embodiments of the present application may refer to the contents of the first embodiment, and in this embodiment, details are not repeated.
Example four
In order to implement the corresponding methods of the above embodiments to achieve the corresponding functions and technical effects, a model training apparatus is provided below. Referring to fig. 4, fig. 4 is a block diagram of a model training apparatus according to an embodiment of the present disclosure. The modules included in the apparatus in this embodiment are configured to execute the steps in the embodiment corresponding to fig. 2, and refer to fig. 2 and the related description in the embodiment corresponding to fig. 2 specifically. For convenience of explanation, only the part related to the present embodiment is shown, and the model training apparatus provided in the embodiment of the present application includes:
an obtaining module 401, configured to obtain resume data and post data, and perform deduplication on the resume data;
the structuring module 402 is configured to structure the resume data and the post data after the duplication removal, and use the structured resume data and the post data as training samples;
the training module 403 is configured to input a training sample into a pre-established graph neural network model for training until the graph neural network model reaches a preset convergence condition, so as to obtain an information matching model.
The model training apparatus can implement the model training method of the second embodiment. The options in the second embodiment are also applicable to the present embodiment, and are not described in detail here. The rest of the embodiments of the present application may refer to the contents of the second embodiment, and in this embodiment, details are not repeated.
Example five
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic apparatus 5 of this embodiment includes: at least one processor 50 (only one shown in fig. 5), a memory 51, and a computer program 52 stored in the memory 51 and executable on the at least one processor 50, the processor 50 implementing the steps of any of the above-described method embodiments when executing the computer program 52.
The electronic device 5 may be a computing device such as a smart phone, a tablet computer, a desktop computer, a supercomputer, a personal digital assistant, a physical server, and a cloud server. The electronic device may include, but is not limited to, a processor 50, a memory 51. Those skilled in the art will appreciate that fig. 5 is merely an example of the electronic device 5, and does not constitute a limitation of the electronic device 5, and may include more or less components than those shown, or combine some of the components, or different components, such as an input-output device, a network access device, etc.
The processor 50 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 51 may in some embodiments be an internal storage unit of the electronic device 5, such as a hard disk or a memory of the electronic device 5. The memory 51 may also be an external storage device of the electronic device 5 in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 5. Further, the memory 51 may also include both an internal storage unit and an external storage device of the electronic device 5. The memory 51 is used for storing an operating system, an application program, a BootLoader (BootLoader), data, and other programs, such as program codes of the computer program. The memory 51 may also be used to temporarily store data that has been output or is to be output.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the computer program implements the steps in any of the method embodiments described above.
The embodiments of the present application provide a computer program product which, when run on an electronic device, causes the electronic device to implement the steps in the above method embodiments.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and which, when executed by a processor, implements the steps of the method embodiments described above. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, computer-readable media may not include electrical carrier signals or telecommunications signals.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application. It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A method for filtering resume information, comprising:
segmenting words of the resume information to be filtered to obtain a plurality of resume keywords;
determining resume replacement words corresponding to a plurality of resume keywords based on a preset replacement relation between the keywords and the replacement words;
updating the resume information according to the plurality of resume replacement words to obtain the target resume information;
performing information matching on the target resume information and preset post information based on a pre-trained information matching model to obtain a comprehensive matching degree;
and determining the filtering result of the resume information according to the comprehensive matching degree.
2. The method for filtering resume information according to claim 1, wherein the performing information matching on the target resume information and the preset post information based on the pre-trained information matching model to obtain the comprehensive matching degree comprises:
determining the credibility of the target resume information according to various structured information in the target resume information by using the information matching model;
if the credibility accords with the preset credibility, determining the basic matching degree of the target resume information and the preset post information;
and determining the comprehensive matching degree of the target resume information based on the credibility and the basic matching degree.
3. The method for filtering resume information according to claim 2, wherein before determining the credibility of the target resume information according to a plurality of kinds of structured information in the target resume information by using the information matching model, the method comprises:
and structuring the target resume information to obtain various structured information of the target resume information, wherein the structured information comprises personal basic information, job hunting intention information, work experience information, education experience information, project experience information and/or skill information.
4. The method for filtering resume information according to claim 3, wherein the determining the credibility of the target resume information according to a plurality of kinds of structured information in the target resume information by using the information matching model comprises:
and determining the personal information credibility, the work experience credibility, the project experience credibility and the education experience credibility of the target resume information by utilizing the information matching model according to the personal basic information, the work experience information, the project experience information and the education experience information in the target resume information respectively.
5. The method for filtering resume information according to claim 4, wherein the determining, by using the information matching model, the personal information credibility, the work experience credibility, the project experience credibility and the education experience credibility of the target resume information according to the personal basic information, the work experience information, the project experience information and the education experience information in the target resume information respectively comprises:
inputting the personal basic information into the information matching model, outputting basic information integrity and information change frequency, and taking the basic information integrity and the information change frequency as personal information credibility;
inputting the work experience information into the information matching model, outputting work unit credibility and work time overlap ratio, and taking the work unit credibility and the work time overlap ratio as the work experience credibility;
inputting the project experience information into the information matching model, outputting project unit credibility and project time overlap ratio, and taking the project unit credibility and the project time overlap ratio as the project experience credibility;
inputting the education experience information into the information matching model, outputting education unit credibility and academic record credibility, and taking the education unit credibility and the academic record credibility as the education experience credibility.
6. A model training method applied to the information matching model of claim 1, the method comprising:
acquiring resume data and post data, and deduplicating the resume data;
structuring the resume data and the post data after the duplication removal, and taking the resume data and the post data after the structuring as training samples;
and inputting the training sample into a pre-established graph neural network model for training until the graph neural network model reaches a preset convergence condition, so as to obtain the information matching model.
7. A device for filtering resume information, comprising:
the word segmentation module is used for segmenting words of the resume information to be filtered to obtain a plurality of resume keywords;
the first determining module is used for determining resume replacement words corresponding to the plurality of resume keywords based on a preset replacement relation between the keywords and the replacement words;
the updating module is used for updating the resume information according to the resume replacement words to obtain the target resume information;
the matching module is used for performing information matching on the target resume information and the preset post information based on a pre-trained information matching model to obtain comprehensive matching degree;
and the second determining module is used for determining the filtering result of the resume information according to the comprehensive matching degree.
8. A model training apparatus, comprising:
the acquisition module is used for acquiring resume data and post data and carrying out duplication removal on the resume data;
the structuring module is used for structuring the resume data and the post data after the duplication is removed and taking the structured resume data and the post data as training samples;
and the training module is used for inputting the training sample into a pre-established graph neural network model for training until the graph neural network model reaches a preset convergence condition, so as to obtain the information matching model.
9. An electronic device comprising a memory for storing a computer program and a processor for executing the computer program to cause the electronic device to perform the method for filtering resume information according to any one of claims 1 to 5 or the method for model training according to claim 6.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a processor, implements the method of filtering resume information according to any one of claims 1 to 5, or the method of model training according to claim 6.
CN202011534809.4A 2020-12-22 2020-12-22 Resume information filtering method, model training method, device, equipment and medium Pending CN112559726A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011534809.4A CN112559726A (en) 2020-12-22 2020-12-22 Resume information filtering method, model training method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011534809.4A CN112559726A (en) 2020-12-22 2020-12-22 Resume information filtering method, model training method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN112559726A (en) 2021-03-26

Family

ID=75030947

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011534809.4A Pending CN112559726A (en) 2020-12-22 2020-12-22 Resume information filtering method, model training method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN112559726A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112905805A (en) * 2021-03-05 2021-06-04 北京中经惠众科技有限公司 Knowledge graph construction method and device, computer equipment and storage medium
CN113610503A (en) * 2021-08-11 2021-11-05 中国平安人寿保险股份有限公司 Resume information processing method, device, equipment and medium
CN114168819A (en) * 2022-02-14 2022-03-11 北京大学 Post matching method and device based on graph neural network
CN115187022A (en) * 2022-06-29 2022-10-14 广州市南方人力资源评价中心有限公司 Talent comprehensive capacity analysis method and device, storage medium and terminal equipment
CN116166717A (en) * 2023-04-25 2023-05-26 贵州自由客网络技术有限公司 Artificial intelligence information extraction method applied to resume
CN116862277A (en) * 2023-05-09 2023-10-10 北京新东方迅程网络科技有限公司 User's selection information processing method, device, electronic equipment and readable storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787639A (en) * 2016-02-03 2016-07-20 北京云太科技有限公司 Artificial-intelligence-based talent big data quantization precise matching method and apparatus
CN107870976A (en) * 2017-09-25 2018-04-03 平安科技(深圳)有限公司 Resume identification device, method and computer-readable recording medium
CN109087003A (en) * 2018-08-03 2018-12-25 四川民工加网络科技有限公司 The archives generation method and device of mobility worker
CN109582704A (en) * 2018-10-17 2019-04-05 龙马智芯(珠海横琴)科技有限公司 Recruitment information and the matched method of job seeker resume
CN110187938A (en) * 2019-05-24 2019-08-30 北京神州泰岳软件股份有限公司 A kind of assemble method and device of page workflow
CN110378544A (en) * 2018-04-12 2019-10-25 百度在线网络技术(北京)有限公司 A kind of personnel and post matching analysis method, device, equipment and medium
CN110909120A (en) * 2018-09-14 2020-03-24 阿里巴巴集团控股有限公司 Resume searching/delivering method, device and system and electronic equipment

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787639A (en) * 2016-02-03 2016-07-20 北京云太科技有限公司 Artificial-intelligence-based talent big data quantization precise matching method and apparatus
CN107870976A (en) * 2017-09-25 2018-04-03 平安科技(深圳)有限公司 Resume identification device, method and computer-readable recording medium
CN110378544A (en) * 2018-04-12 2019-10-25 百度在线网络技术(北京)有限公司 A kind of personnel and post matching analysis method, device, equipment and medium
CN109087003A (en) * 2018-08-03 2018-12-25 四川民工加网络科技有限公司 The archives generation method and device of mobility worker
CN110909120A (en) * 2018-09-14 2020-03-24 阿里巴巴集团控股有限公司 Resume searching/delivering method, device and system and electronic equipment
CN109582704A (en) * 2018-10-17 2019-04-05 龙马智芯(珠海横琴)科技有限公司 Recruitment information and the matched method of job seeker resume
CN110187938A (en) * 2019-05-24 2019-08-30 北京神州泰岳软件股份有限公司 A kind of assemble method and device of page workflow

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112905805A (en) * 2021-03-05 2021-06-04 北京中经惠众科技有限公司 Knowledge graph construction method and device, computer equipment and storage medium
CN113610503A (en) * 2021-08-11 2021-11-05 中国平安人寿保险股份有限公司 Resume information processing method, device, equipment and medium
CN114168819A (en) * 2022-02-14 2022-03-11 北京大学 Post matching method and device based on graph neural network
CN114168819B (en) * 2022-02-14 2022-07-12 北京大学 Post matching method and device based on graph neural network
CN115187022A (en) * 2022-06-29 2022-10-14 广州市南方人力资源评价中心有限公司 Talent comprehensive capacity analysis method and device, storage medium and terminal equipment
CN116166717A (en) * 2023-04-25 2023-05-26 贵州自由客网络技术有限公司 Artificial intelligence information extraction method applied to resume
CN116166717B (en) * 2023-04-25 2023-06-23 贵州自由客网络技术有限公司 Artificial intelligence information extraction method applied to resume
CN116862277A (en) * 2023-05-09 2023-10-10 北京新东方迅程网络科技有限公司 User's selection information processing method, device, electronic equipment and readable storage medium

Similar Documents

Publication Publication Date Title
CN112559726A (en) Resume information filtering method, model training method, device, equipment and medium
He et al. A database linking Chinese patents to China’s census firms
US9390176B2 (en) System and method for recursively traversing the internet and other sources to identify, gather, curate, adjudicate, and qualify business identity and related data
CN111737499B (en) Data searching method based on natural language processing and related equipment
CN111639066A (en) Data cleaning method and device
CN107688645B (en) Policy data processing method and terminal equipment
CN112613917A (en) Information pushing method, device and equipment based on user portrait and storage medium
CN110674360B (en) Tracing method and system for data
CN112182246A (en) Method, system, medium, and application for creating an enterprise representation through big data analysis
CN112241458B (en) Text knowledge structuring processing method, device, equipment and readable storage medium
CN109325042B (en) Processing template acquisition method, form processing method, device, equipment and medium
CN111652658A (en) Portrait fusion method, apparatus, electronic device and computer readable storage medium
CN110502529B (en) Data processing method, device, server and storage medium
CN112948429B (en) Data reporting method, device and equipment
CN115203435A (en) Entity relation generation method and data query method based on knowledge graph
CN110851431B (en) Data processing method and device for data center station
CN112860722A (en) Data checking method and device, electronic equipment and readable storage medium
CN114722819B (en) Entity type classification and identification method, device, equipment and medium
CN111125102B (en) Data query method and device based on index data
CN114943234B (en) Enterprise name linking method, enterprise name linking device, computer equipment and storage medium
CN112882699B (en) Service processing method, device, equipment and medium based on flow configuration engine
CN113220801B (en) Structured data classification method, device, equipment and medium
US20230325371A1 (en) System and method for entity timeslicing for disambiguating entity profiles
CN112286926B (en) Method for combing data quality rules based on affair handling data supply and demand maps
US20230325366A1 (en) System and method for entity disambiguation for customer relationship management

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination