CN111625618B - Data matching method and device - Google Patents

Info

Publication number
CN111625618B
Authority
CN
China
Prior art keywords
data set
word
vector
converting
text vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910110232.5A
Other languages
Chinese (zh)
Other versions
CN111625618A (en
Inventor
李越川
林方全
杨超
张京桥
杨程
周涛明
蒋澄宇
吴超
颜文龙
夏宇
汪琳
周恒�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910110232.5A
Publication of CN111625618A
Application granted
Publication of CN111625618B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data matching method and device. The method comprises the following steps: acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set within a preset period; selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; and determining a degree of matching between the first data set and the second data set according to the first data set, the second data set, and at least one of the first expansion set and the second expansion set. The method and the device solve the technical problem that resume and position matching in existing recruitment systems has low accuracy.

Description

Data matching method and device
Technical Field
The present application relates to the field of computers, and in particular, to a data matching method and apparatus.
Background
With the development of computer technology, the internet has provided convenience to people's life and work. For example, an enterprise may recruit employees via the internet, and a job seeker may select an appropriate enterprise via various recruitment platforms. Large companies often have recruitment systems to provide the necessary data storage and platform support. In order to make recruitment more efficient, some companies match the resume and job descriptions through an intelligent algorithm to recommend proper posts for job seekers and recommend proper talents for each department of the company so as to improve recruitment and job seeking efficiency. In the matching process, the resume and the position texts play a key role, and the quality of the texts influences the algorithm effect.
In recruitment, recruiters typically write post requirements based on the common requirements of the type of position to which the post belongs. However, such requirements do not reflect the characteristics of the specific post. In addition, each post places different emphasis on what it requires of a candidate, and the person writing the post requirements may not be aware of these emphases or may not express them accurately using current terminology. Because the technologies relevant to technical posts are updated quickly, if the post requirements cannot be updated in time, job seekers and recommenders (internal referrers, headhunters and the like) cannot accurately understand the post requirements, which affects recruitment efficiency.
When writing a resume, a job seeker usually introduces his or her work experience and project experience, thereby showing the skills he or she has mastered. Because the talent pool is huge, many job seekers have similar experience and skills. On the other hand, each person has a unique job-seeking intention that takes many factors into account, such as interests, region and preferred department, but these are not well reflected in resume text. As a result, the accuracy of position recommendation for candidates is reduced, which affects the user experience of job seekers as well as the recruitment quality and efficiency of the company.
The prior art only extracts and normalizes school, major, company and position information from the resume, and does not utilize unstructured data (e.g., work experience and project experience), even though such text is typically rich in information. In addition, the prior art can help job seekers and recruiters improve resumes and position descriptions by establishing a phrase database, which addresses non-standard wording in resumes and position descriptions. However, that approach processes resumes and position descriptions separately, does not link them, and cannot process resumes and positions that already exist in the database.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiment of the application provides a data matching method and device, which at least solve the technical problem of low accuracy of resume and job matching in the existing recruitment system.
According to an aspect of the embodiments of the present application, there is provided a data matching method, including: acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set within a preset period; selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; and determining a degree of matching between the first data set and the second data set according to the first data set, the second data set, and at least one of the first expansion set and the second expansion set.
According to another aspect of the embodiments of the present application, there is also provided a data matching apparatus, including: the acquisition module is used for acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period; the selection module is used for selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; and the determining module is used for determining the matching degree between the first data set and the second data set according to at least one of the first expansion set and the second expansion set, the first data set and the second data set.
According to another aspect of the embodiments of the present application, there is also provided a storage medium including a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the data matching method.
According to another aspect of the embodiments of the present application, there is also provided a processor for running a program, wherein the program executes a data matching method when running.
According to another aspect of the embodiments of the present application, there is also provided a computer apparatus including: a processor; and a memory, coupled to the processor, for providing the processor with instructions for the following processing steps: acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set within a preset period; selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; and determining a degree of matching between the first data set and the second data set according to the first data set, the second data set, and at least one of the first expansion set and the second expansion set.
In the embodiment of the application, a mode of expanding job description data and candidate resume data by recruitment record data is adopted, after a job description data set, a candidate resume data set and a recruitment record data set are obtained, a first expansion set of the job description data set is obtained by screening the candidate resume data set and the recruitment record data set according to a first preset screening condition, a second expansion set of the candidate resume data set is obtained by screening the job description data set and the recruitment record data set according to a second preset screening condition, and finally the matching degree between the job description data set and the candidate resume data set is determined according to at least one of the first expansion set and the second expansion set, the job description data set and the candidate resume data set.
In the process, the job description data set can be expanded through the candidate resume data set and the recruitment record data set, the candidate resume data set can be expanded through the job description data set and the recruitment record data set, the job description data and the candidate resume data are matched based on the expanded job description data and the candidate resume data, and the matching degree of the job and the resume can be improved.
From the above, the scheme provided by the application achieves the purpose of matching the resume and the position, thereby realizing the technical effect of improving the accuracy of resume and position matching, and further solving the technical problem of low accuracy of resume and position matching in the existing recruitment system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a block diagram of the hardware architecture of an alternative computer terminal according to an embodiment of the present application;
FIG. 2 is a flow chart of a data matching method according to an embodiment of the present application;
FIG. 3 is a flow chart of an alternative data matching method according to an embodiment of the present application;
FIG. 4 is a schematic illustration of an alternative matching result according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a data matching device according to an embodiment of the present application; and
FIG. 6 is a block diagram of a computer terminal according to an embodiment of the present application.
Detailed Description
In order to make the present application solution better understood by those skilled in the art, the following description will be made in detail and with reference to the accompanying drawings in the embodiments of the present application, it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some of the terms appearing in the description of the embodiments of the present application are explained as follows:
JD: Job Description.
CV: Curriculum Vitae, i.e., a candidate resume.
Word segmentation: in this application, refers to Chinese word segmentation, i.e., the process of segmenting a continuous sequence of Chinese characters into a sequence of individual words according to a certain specification.
Cosine similarity: evaluates the similarity of two vectors by calculating the cosine of the angle between them.
Embedding: in the field of text analysis, refers to finding a function or mapping that generates a new representation of words in a low-dimensional space.
DNN: Deep Neural Network, a feed-forward neural network with multiple hidden layers.
CNN: Convolutional Neural Network, a feed-forward neural network that includes convolution or correlation calculations and has a deep structure.
RNN: Recurrent Neural Network, an artificial neural network whose nodes are connected in a directed cycle.
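As an illustration of the cosine similarity term defined above, the following is a minimal sketch in plain Python. The function name `cosine_similarity` and the convention of returning 0.0 for a zero vector are assumptions made for this example and are not taken from the application.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumed convention: a zero vector is treated as matching nothing.
        return 0.0
    return dot / (norm_u * norm_v)
```

Values close to 1 indicate vectors pointing in nearly the same direction, and values close to 0 indicate nearly orthogonal vectors.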
Example 1
In accordance with the embodiments of the present application, there is also provided a data matching method embodiment, it being noted that the steps shown in the flowchart of the figures may be performed in a computer system, such as a set of computer executable instructions, and, although a logical order is shown in the flowchart, in some cases, the steps shown or described may be performed in an order other than that shown or described herein.
The method embodiment provided in the first embodiment of the present application may be executed in a mobile terminal, a computer terminal or a similar computing device. Fig. 1 shows a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing a data matching method. As shown in fig. 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, ..., 102n; the processors 102 may include, but are not limited to, processing means such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission means 106 for communication functions. In addition, the computer terminal 10 may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. It will be appreciated by those of ordinary skill in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the electronic device described above. For example, the computer terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
It should be noted that the one or more processors 102 and/or other data processing circuits described above may be referred to generally herein as "data processing circuits". The data processing circuit may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Furthermore, the data processing circuitry may be a single stand-alone processing module, or incorporated, in whole or in part, into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the present application, the data processing circuit acts as a processor control (e.g., selection of the path of the variable resistor termination connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the data matching method in the embodiments of the present application, and the processor 102 executes the software programs and modules stored in the memory 104, thereby performing various functional applications and data processing, that is, implementing the data matching method described above. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. The specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module for communicating with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 10 (or mobile device).
It should be noted that, in some alternative embodiments, the computer device (or mobile device) illustrated in fig. 1 may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be noted that fig. 1 is only one example of a specific example, and is intended to illustrate the types of components that may be present in the computer device (or mobile device) described above.
In the above-described operating environment, the present application provides a data matching method as shown in fig. 2. Fig. 2 is a flowchart of a data matching method according to an embodiment of the present application, where the data matching method provided in the present application is applied to at least one of the following scenarios: providing a recruiter with matched candidates, and providing a job seeker with matched candidate positions.
As shown in fig. 2, the data matching method provided in the present application includes the following steps:
step S202, a first data set, a second data set and a third data set are obtained, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period.
It should be noted that the job description data set includes data related to job descriptions, including but not limited to job responsibilities and job requirements, where the job requirements include but are not limited to major requirements, academic degree requirements and skill requirements. For example, in the job description data of an image recognition algorithm engineer, the job responsibilities may be "responsible for the design and study of image algorithms related to video recognition; responsible for the design of one or more related algorithms such as road traffic identification, image obstacle identification, air traffic identification, image stitching, tracking algorithms and driver behavior identification; completing the design documents for the developed software", and the post requirements may be "computer science, software engineering, information and communication, mathematics or another related major, with a master's degree or above; proficiency in a programming language".
Optionally, the recruitment system may execute the data matching method provided in this embodiment. The recruitment system may import, from a first data table in the database, a job description data set within a preset period, where each piece of data in the first data table includes the number of the job description, the job description text and the job requirement text. In addition, the recruitment system may import a candidate resume data set from a second data table of the database, where each piece of data in the second data table includes a number of the candidate resume and a post description text. The recruitment system may further import a recruitment record data set within a preset period from a third data table of the database, where each piece of data in the third data table includes the number of a job description, the number of a candidate resume, and information on whether the interview was passed (e.g., whether the first interview was passed).
In addition, the first data table, the second data table, and the third data table may be different data tables or may be the same data table in the database. In addition, the preset period may be set by the user using the recruitment system, for example, the preset period is set to one year.
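The three data tables described above can be pictured with a small data-model sketch. All class and field names below are hypothetical, and the `interview_date` field is an assumption introduced only so that the preset period can be applied; the application does not specify how the period is stored or checked.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class JobDescription:  # one row of the first data table (hypothetical layout)
    jd_id: str
    responsibilities: str
    requirements: str


@dataclass(frozen=True)
class CandidateResume:  # one row of the second data table (hypothetical layout)
    cv_id: str
    text: str


@dataclass(frozen=True)
class RecruitmentRecord:  # one row of the third data table (hypothetical layout)
    jd_id: str
    cv_id: str
    passed_first_interview: bool
    interview_date: date  # assumed field, used to apply the preset period


def records_in_period(records, today, days=365):
    """Keep the records whose interview date falls within the last `days` days."""
    start = today - timedelta(days=days)
    return [r for r in records if start <= r.interview_date <= today]
```

With `days=365` this corresponds to the one-year preset period used as an example in the text.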
Step S204, selecting a first extension set of the first data set from the second data set and the third data set according to the first preset screening condition, and selecting a second extension set of the second data set from the first data set and the third data set according to the second preset screening condition.
It should be noted that the first preset screening condition and the second preset screening condition may be set by the user, and the two conditions may be the same or different. After the second data set and the third data set are screened according to the first preset screening condition, the candidate resumes matched with the position description data are obtained. For example, if the first preset screening condition is set to passing the first interview within a preset time period, the resulting first expansion set consists of all candidate resumes that are matched with the position description data and passed the first interview. Similarly, after the first data set and the third data set are screened according to the second preset screening condition, the positions matched with the candidate resume data are obtained. For example, if the second preset screening condition is set to positions for which the candidate passed the first interview within a preset time period, the resulting second expansion set consists of all positions that are matched with the candidate resume and for which the first interview was passed.
Step S206, determining the matching degree between the first data set and the second data set according to at least one of the first expansion set and the second expansion set, the first data set and the second data set.
Alternatively, as known from step S206, the method for calculating the matching degree between the first data set and the second data set may include three methods, where the first method is to determine the matching degree between the first data set and the second data set by using the first extension set, the first data set, and the second data set; the second is to determine the matching degree between the first data set and the second data set through the second expansion set, the first data set and the second data set; the third is to determine the degree of matching between the first data set and the second data set by the first extension set, the second extension set, the first data set, and the second data set.
Based on the above-mentioned scheme defined in step S202 to step S206, it may be known that, by adopting the recruitment record data to expand the job description data and the candidate resume data, after the job description data set, the candidate resume data set and the recruitment record data set are obtained, a first expansion set of the job description data set is obtained by screening the candidate resume data set and the recruitment record data set according to a first preset screening condition, and a second expansion set of the candidate resume data set is obtained by screening the job description data set and the recruitment record data set according to a second preset screening condition, and finally, a matching degree between the job description data set and the candidate resume data set is determined according to at least one of the first expansion set and the second expansion set, the job description data set and the candidate resume data set.
It is easy to note that the job description data set can be expanded through the candidate resume data set and the recruitment record data set, the candidate resume data set can be expanded through the job description data set and the recruitment record data set, the job description data and the candidate resume data are matched based on the expanded job description data and the candidate resume data, and the matching degree of the job and the resume can be improved.
From the above, the scheme provided by the application achieves the purpose of matching the resume and the position, thereby realizing the technical effect of improving the accuracy of resume and position matching, and further solving the technical problem of low accuracy of resume and position matching in the existing recruitment system.
It should be noted that, after the first data set, the second data set and the third data set are obtained, the matching degree of the resume and the position is improved by expanding the position description data and the candidate resume data.
Optionally, the recruitment system may select the first extension set from the second data set and the third data set according to a first preset screening condition. Specifically, for each job description in the first dataset, the recruitment system selects a candidate meeting the first preset screening condition from the third dataset, then obtains a candidate resume corresponding to the candidate meeting the first preset screening condition from the second dataset, and splices the obtained candidate resume into the first expansion set. For example, the recruitment system obtains candidates from the recruitment record dataset that passed the first interview for a preset period of time (e.g., one year), then obtains candidate resumes from the candidate resume dataset, and splices the candidate resumes to obtain the first expansion set. Wherein the first extension set is set to null if there are no candidates in the recruitment record data set that meet the first preset screening condition, e.g., there are no candidates in the recruitment record data set that pass the first interview within one year.
Optionally, the recruitment system may select a second extension set from the first data set and the third data set according to a second preset screening condition. Specifically, for each candidate resume in the second data set, the recruitment system selects a position meeting the second preset screening condition from the third data set, then acquires a position description corresponding to the position meeting the second preset screening condition from the first data set, and splices the acquired position descriptions into a second expansion set. For example, the recruitment system obtains the position of the candidate passing the first interview from the recruitment record data set after obtaining the candidate resume from the candidate resume data set, and then obtains the position description of the position of the candidate passing the first interview from the position description data set, and splices the position descriptions, thereby obtaining the second expansion set. Wherein the second extension set is set to null if there is no position in the recruitment record data set that meets the second preset screening condition, e.g., there is no position in the recruitment record data set where the candidate passed the first interview in one year.
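The construction of the two expansion sets described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_expansion_sets`, the dictionary and tuple layouts, splicing texts with a single space, and representing a null expansion set as an empty string are all choices made for this example rather than details given in the application.

```python
from collections import defaultdict


def build_expansion_sets(jds, cvs, records, first_condition, second_condition):
    """Build per-position and per-resume expansion texts.

    jds: {jd_id: job description text}
    cvs: {cv_id: resume text}
    records: iterable of (jd_id, cv_id, outcome) recruitment records
    first_condition / second_condition: predicates over `outcome`
    """
    jd_parts = defaultdict(list)
    cv_parts = defaultdict(list)
    for jd_id, cv_id, outcome in records:
        if jd_id not in jds or cv_id not in cvs:
            continue  # skip records that point at unknown positions or resumes
        if first_condition(outcome):
            jd_parts[jd_id].append(cvs[cv_id])
        if second_condition(outcome):
            cv_parts[cv_id].append(jds[jd_id])
    # An empty string stands for a null expansion set (assumed representation).
    first = {jd_id: " ".join(jd_parts[jd_id]) for jd_id in jds}
    second = {cv_id: " ".join(cv_parts[cv_id]) for cv_id in cvs}
    return first, second
```

Passing two separate predicates reflects the statement that the first and second preset screening conditions may be the same or different.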
Further, after obtaining the first and second extension sets, the recruitment system can determine the degree of matching between the first data set and the second data set based on the first extension set, the second extension set, the first data set and the second data set. Specifically, the recruitment system first converts the first data set into a first word vector sequence, the second data set into a second word vector sequence, the first expansion set into a third word vector sequence, and the second expansion set into a fourth word vector sequence. It then converts the first word vector sequence into a first text vector, the second word vector sequence into a second text vector, the third word vector sequence into a third text vector, and the fourth word vector sequence into a fourth text vector. Next, it calculates the matching degree of the first text vector and the second text vector to obtain a first calculation result, calculates the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, and calculates the matching degree of the second text vector and the third text vector to obtain a fourth calculation result. Finally, it performs a linear transformation on the first calculation result, the second calculation result, the third calculation result and the fourth calculation result to determine the matching degree between the first data set and the second data set.
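The final scoring step can be sketched as a weighted sum of pairwise similarities. Several points here are assumptions rather than statements from the application: cosine similarity is used as the matching degree between text vectors, the weights and bias of the linear transformation are hypothetical placeholders, and the pairing of the third and fourth text vectors is assumed, since the text above explicitly names only the (first, second), (first, fourth) and (second, third) pairs.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_degree(jd_vec, cv_vec, jd_ext_vec, cv_ext_vec, weights, bias=0.0):
    """Combine four pairwise similarities with a linear transformation."""
    pairs = [
        (jd_vec, cv_vec),          # first vs second text vector (named in the text)
        (jd_vec, cv_ext_vec),      # first vs fourth text vector (named in the text)
        (cv_vec, jd_ext_vec),      # second vs third text vector (named in the text)
        (jd_ext_vec, cv_ext_vec),  # third vs fourth text vector (assumed pairing)
    ]
    scores = [cosine(a, b) for a, b in pairs]
    return sum(w * s for w, s in zip(weights, scores)) + bias
```

In practice the weights would presumably be learned together with the text encoder; fixed values are used here only to keep the sketch self-contained.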
Optionally, the recruitment system may perform word vectorization on each data set after word segmentation, either in a dimension reduction manner or by using preset word vectors, thereby converting the data set into a word vector sequence. Specifically, word segmentation is performed on the first data set to obtain a first word segmentation result, and word vectorization is performed on the first word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the first word segmentation result into a first word vector sequence; word segmentation is performed on the second data set to obtain a second word segmentation result, and word vectorization is performed on the second word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the second word segmentation result into a second word vector sequence; word segmentation is performed on the first expansion set to obtain a third word segmentation result, and word vectorization is performed on the third word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the third word segmentation result into a third word vector sequence; and word segmentation is performed on the second expansion set to obtain a fourth word segmentation result, and word vectorization is performed on the fourth word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the fourth word segmentation result into a fourth word vector sequence.
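As an illustration of the segmentation and preset-word-vector path, the following sketch converts a text into a word vector sequence by table lookup. The whitespace tokenizer is only a stand-in (Chinese text would need a real word segmenter), the two-dimensional `PRESET` table is made-up data, and mapping unknown words to a zero vector is an assumption not stated in the application.

```python
# Hypothetical preset word vectors; in practice these would be pre-trained.
PRESET = {
    "java": [1.0, 0.0],
    "engineer": [0.0, 1.0],
}


def segment(text):
    """Stand-in word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


def to_word_vector_sequence(text, preset, dim):
    """Look up each segmented word; unknown words map to a zero vector (assumption)."""
    zero = [0.0] * dim
    return [list(preset.get(word, zero)) for word in segment(text)]
```

The same function would be applied to the first data set, the second data set, and both expansion sets to obtain the four word vector sequences.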
It should be noted that the preset word vector may be obtained through pre-training or calculation.
Further, the recruitment system may convert the first word vector sequence into a first text vector, the second word vector sequence into a second text vector, the third word vector sequence into a third text vector, and the fourth word vector sequence into a fourth text vector through a preset neural network, wherein the preset neural network comprises one of: a Deep Neural Network (DNN), a Convolutional Neural Network (CNN), and a Recurrent Neural Network (RNN).
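To make the sequence-to-text-vector step concrete, the sketch below uses the simplest possible feed-forward encoder: mean pooling over the word vectors followed by one dense layer with a tanh activation. The application does not specify the network architecture, so the pooling choice, the activation, and the function names are assumptions, and the weights would be learned during model training rather than set by hand.

```python
import math


def mean_pool(sequence, dim):
    """Average the word vectors; an empty sequence yields a zero vector."""
    if not sequence:
        return [0.0] * dim
    return [sum(vec[i] for vec in sequence) / len(sequence) for i in range(dim)]


def dense_tanh(x, weights, bias):
    """One fully connected layer: tanh(W x + b), with W given as a list of rows."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def text_vector(sequence, weights, bias):
    """Encode a word vector sequence as a single fixed-length text vector."""
    input_dim = len(weights[0])
    return dense_tanh(mean_pool(sequence, input_dim), weights, bias)
```

A CNN or RNN encoder would replace `mean_pool` and `dense_tanh` but keep the same interface: a word vector sequence in, one text vector out.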
Further, after the first text vector, the second text vector, the third text vector and the fourth text vector are obtained, the recruitment system calculates the cosine similarity of the first text vector and the second text vector to obtain a first calculation result, calculates the cosine similarity of the first text vector and the fourth text vector to obtain a second calculation result, calculates the cosine similarity of the second text vector and the third text vector to obtain a third calculation result, and calculates the cosine similarity of the third text vector and the fourth text vector to obtain a fourth calculation result.
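The four cosine similarities can be sketched as below. The pairing of vectors follows the paragraph above; returning 0.0 when either vector has zero norm (for example, for a null expansion set) is an assumption introduced here, and the function names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def four_results(first, second, third, fourth):
    """Return the first to fourth calculation results for the four text vectors.

    first: job description, second: resume,
    third: expanded job description, fourth: expanded resume.
    """
    return (
        cosine(first, second),   # first calculation result
        cosine(first, fourth),   # second calculation result
        cosine(second, third),   # third calculation result
        cosine(third, fourth),   # fourth calculation result
    )
```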
In an alternative, fig. 3 shows a flow chart of the above process, which is described below with reference to fig. 3. Specifically, the recruitment system first imports a JD text data set (i.e., the first data set), a CV text data set (i.e., the second data set), and a recruitment record data set (i.e., the third data set) from the database, as shown in steps S1 to S3 of fig. 3. The recruitment system then expands each JD text in the JD text data set to obtain expanded JD text (i.e., the first expansion set), and expands each CV text in the CV text data set to obtain expanded CV text (i.e., the second expansion set), as shown in steps S4 to S5 of fig. 3. Further, the recruitment system performs word segmentation on the JD text, the CV text, the JD expanded text and the CV expanded text, and performs word vectorization on them by using an embedding technique or pre-trained word vectors, so as to obtain the word vector sequence corresponding to each type of text, as shown in steps S6 to S9 of fig. 3. The recruitment system then processes the word vector sequences obtained in steps S6 to S9 through a preset neural network (for example, DNN, RNN or CNN) to obtain vector representations of the JD text, the CV text, the JD expanded text and the CV expanded text, namely the first text vector, the second text vector, the third text vector and the fourth text vector, as shown in steps S10 to S13 of fig. 3.
After the first text vector, the second text vector, the third text vector and the fourth text vector are obtained, the recruitment system calculates the cosine similarity of the JD text vector and the CV text vector to obtain a first calculation result; calculates the cosine similarity of the JD text vector and the CV expanded text vector to obtain a second calculation result; calculates the cosine similarity of the JD expanded text vector and the CV text vector to obtain a third calculation result; and calculates the cosine similarity of the JD expanded text vector and the CV expanded text vector to obtain a fourth calculation result, as shown in steps S14 to S17 of fig. 3. Finally, the recruitment system performs linear transformation on the calculation results obtained in steps S14 to S17 to obtain the matching degree of the JD text and the CV text, which is used for matching model training and prediction, as shown in step S18 of fig. 3.
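The final linear transformation of step S18 can be sketched as a weighted sum plus a bias. The application does not give the coefficients, so the weights and bias used here are placeholders; in a trained matching model they would be learned parameters. The function name `match_degree` is hypothetical.

```python
def match_degree(results, weights, bias=0.0):
    """Linear transformation of the calculation results: sum(w_i * r_i) + b."""
    if len(results) != len(weights):
        raise ValueError("results and weights must have the same length")
    return sum(w * r for w, r in zip(weights, results)) + bias
```

Because the function only depends on the number of results passed in, the same sketch also covers the variants that use a single expansion set, where only two calculation results are combined.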
It should be noted that the recruitment system may also perform text expansion on only the first data set or the second data set.
In an alternative, the recruitment system determines a degree of match between the first data set and the second data set based on the first extension set, the first data set, and the second data set. Specifically, the recruitment system converts the first data set into a first word vector sequence, converts the second data set into a second word vector sequence, converts the first expansion set into a third word vector sequence, converts the first word vector sequence into a first text vector, converts the second word vector sequence into a second text vector, and converts the third word vector sequence into a third text vector, then calculates the matching degree of the first text vector and the second text vector to obtain a first calculation result, calculates the matching degree of the second text vector and the third text vector to obtain a third calculation result, and finally performs linear transformation on the first calculation result and the third calculation result to determine the matching degree between the first data set and the second data set.
In another alternative, the recruitment system determines a degree of match between the first data set and the second data set based on the second extension set, the first data set, and the second data set. Specifically, the recruitment system converts the first data set into a first word vector sequence, converts the second data set into a second word vector sequence, converts the second expansion set into a fourth word vector sequence, converts the first word vector sequence into a first text vector, converts the second word vector sequence into a second text vector, and converts the fourth word vector sequence into a fourth text vector, then calculates the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, calculates the matching degree of the second text vector and the fourth text vector to obtain a fourth calculation result, and finally performs linear transformation on the second calculation result and the fourth calculation result to determine the matching degree between the first data set and the second data set.
It should be noted that, when the recruitment system performs text expansion on only the first data set or only the second data set, the steps for calculating the matching degree are the same as those for determining the matching degree between the first data set and the second data set according to the first expansion set, the second expansion set, the first data set and the second data set, and are not described again here.
Optionally, positions and resumes can be matched through the above process, and the recruitment system can display position descriptions or resumes according to the matching degree. For example, fig. 4 is a schematic diagram of a matching result, which shows three positions that match the resume of Zhang San.
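Displaying results by matching degree amounts to ranking. The sketch below assumes a hypothetical mapping from position title to matching degree for one resume and returns the top entries; the names and the default of three results (mirroring the three positions in fig. 4) are illustrative only.

```python
def top_positions(scores, k=3):
    """Return the k (position, matching degree) pairs with the highest matching degree."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]
```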
From the above, the scheme provided by the application can be applied to person-position matching: it can recommend candidates to recruiters and recommend positions to job seekers. Compared with existing models without text expansion, the scheme provided by the application can improve the accuracy of person-position matching and thereby improve recruitment efficiency.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
From the above description of the embodiments, it will be clear to those skilled in the art that the data matching method according to the above embodiments may be implemented by means of software plus necessary general hardware platform, but of course may also be implemented by hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk), comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method described in the embodiments of the present application.
Example 2
According to an embodiment of the present application, there is further provided a data matching device for implementing the above data matching method, as shown in fig. 5, the device 50 includes: an acquisition module 501, a selection module 503 and a determination module 505.
The acquiring module 501 is configured to acquire a first data set, a second data set and a third data set, where the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period; a selecting module 503, configured to select a first extension set of the first data set from the second data set and the third data set according to a first preset screening condition, and select a second extension set of the second data set from the first data set and the third data set according to a second preset screening condition; a determining module 505 is configured to determine a degree of matching between the first data set and the second data set according to at least one of the first extension set and the second extension set, the first data set, and the second data set.
It should be noted that the acquisition module 501, the selection module 503 and the determination module 505 correspond to steps S202 to S206 in embodiment 1. The examples and application scenarios implemented by the three modules are the same as those of the corresponding steps, but are not limited to the disclosure of embodiment 1. It should also be noted that the above modules may run as part of the device in the computer terminal 10 provided in embodiment 1.
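The three-module structure of the device can be sketched as a small container that wires acquisition, selection and determination together. This is only an illustration of the data flow between modules 501, 503 and 505; the class name, the callable signatures, and the dictionary-based data shapes are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DataMatchingDevice:
    # Acquisition module: returns (first data set, second data set, third data set).
    acquire: Callable[[], tuple]
    # Selection module: returns (first expansion set, second expansion set).
    select: Callable[[dict, dict, list], tuple]
    # Determination module: returns the matching degrees.
    determine: Callable[[dict, dict, dict, dict], dict]

    def run(self):
        """Run the three modules in order and return the matching degrees."""
        first, second, third = self.acquire()
        first_expansion, second_expansion = self.select(first, second, third)
        return self.determine(first, second, first_expansion, second_expansion)
```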
In an alternative solution, the selecting module includes: the second selecting module and the first splicing module. The second selecting module is used for selecting candidates meeting the first preset screening conditions from the third data set for each job description in the first data set; the first splicing module is used for acquiring the candidate resume corresponding to the candidate meeting the first preset screening condition from the second data set, and splicing the acquired candidate resume into the first expansion set.
In an alternative solution, the selecting module includes: a third selecting module and a second splicing module. The third selecting module is used for selecting positions meeting the second preset screening condition from the third data set for each candidate resume in the second data set; the second splicing module is used for acquiring the position descriptions corresponding to the positions meeting the second preset screening condition from the first data set and splicing the acquired position descriptions into the second expansion set.
In an alternative, the determining module includes: a first conversion module, a second conversion module, a first processing module and a first determination module. The first conversion module is used for converting the first data set into a first word vector sequence, converting the second data set into a second word vector sequence, converting the first expansion set into a third word vector sequence and converting the second expansion set into a fourth word vector sequence; the second conversion module is used for converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, converting the third word vector sequence into a third text vector and converting the fourth word vector sequence into a fourth text vector; the first processing module is used for calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result, and calculating the matching degree of the third text vector and the fourth text vector to obtain a fourth calculation result; the first determination module is used for performing linear transformation on the first calculation result, the second calculation result, the third calculation result and the fourth calculation result to determine the matching degree between the first data set and the second data set.
In an alternative, the first conversion module includes: a second processing module, a third processing module, a fourth processing module and a fifth processing module. The second processing module is used for performing word segmentation on the first data set to obtain a first word segmentation result, and performing word vectorization on the first word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the first word segmentation result into a first word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; the third processing module is used for performing word segmentation on the second data set to obtain a second word segmentation result, and performing word vectorization on the second word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the second word segmentation result into a second word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; the fourth processing module is used for performing word segmentation on the first expansion set to obtain a third word segmentation result, and performing word vectorization on the third word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the third word segmentation result into a third word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; and the fifth processing module is used for performing word segmentation on the second expansion set to obtain a fourth word segmentation result, and performing word vectorization on the fourth word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the fourth word segmentation result into a fourth word vector sequence, wherein the preset word vector is obtained through pre-training or calculation.
In an alternative, the second conversion module includes a third conversion module, which is configured to convert, through a preset neural network, the first word vector sequence into a first text vector, the second word vector sequence into a second text vector, the third word vector sequence into a third text vector, and the fourth word vector sequence into a fourth text vector, where the preset neural network includes one of: a deep neural network, a convolutional neural network, and a recurrent neural network.
In an alternative, the first processing module includes a sixth processing module, which is configured to calculate the cosine similarity between the first text vector and the second text vector to obtain a first calculation result, calculate the cosine similarity between the first text vector and the fourth text vector to obtain a second calculation result, calculate the cosine similarity between the second text vector and the third text vector to obtain a third calculation result, and calculate the cosine similarity between the third text vector and the fourth text vector to obtain a fourth calculation result.
In an alternative, the determining module includes: a fourth conversion module, a fifth conversion module, a seventh processing module, and a second determination module. The fourth conversion module is used for converting the first data set into a first word vector sequence, converting the second data set into a second word vector sequence and converting the first expansion set into a third word vector sequence; a fifth conversion module for converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the third word vector sequence into a third text vector; the seventh processing module is used for calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, and calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result; and the second determining module is used for carrying out linear transformation on the first calculation result and the third calculation result and determining the matching degree between the first data set and the second data set.
In an alternative, the determining module includes: a sixth conversion module, a seventh conversion module, an eighth processing module, and a third determination module. The sixth conversion module is used for converting the first data set into a first word vector sequence, converting the second data set into a second word vector sequence and converting the second expansion set into a fourth word vector sequence; a seventh conversion module for converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the fourth word vector sequence into a fourth text vector; the eighth processing module is used for calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result and calculating the matching degree of the second text vector and the fourth text vector to obtain a fourth calculation result; and the third determining module is used for carrying out linear transformation on the second calculation result and the fourth calculation result and determining the matching degree between the first data set and the second data set.
Optionally, the data matching method provided by the application is applied to at least one of the following scenarios: providing a recruiter with matched candidates, and providing a job seeker with matched candidate positions.
Example 3
According to an embodiment of the present application, there is also provided a computer device for implementing the above data matching method, the computer device including: a processor and a memory.
The memory is connected with the processor and is used for providing the processor with instructions for the following processing steps: acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period; selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; and determining a degree of matching between the first data set and the second data set according to at least one of the first expansion set and the second expansion set, the first data set, and the second data set.
According to the method, the recruitment record data are adopted to expand the position description data and the candidate resume data, after the position description data set, the candidate resume data set and the recruitment record data set are obtained, a first expansion set of the position description data set is obtained by screening the candidate resume data set and the recruitment record data set according to a first preset screening condition, a second expansion set of the candidate resume data set is obtained by screening the position description data set and the recruitment record data set according to a second preset screening condition, and finally the matching degree between the position description data set and the candidate resume data set is determined according to at least one of the first expansion set and the second expansion set, the position description data set and the candidate resume data set.
It is easy to note that the job description data set can be expanded through the candidate resume data set and the recruitment record data set, the candidate resume data set can be expanded through the job description data set and the recruitment record data set, the job description data and the candidate resume data are matched based on the expanded job description data and the candidate resume data, and the matching degree of the job and the resume can be improved.
From the above, the scheme provided by the application achieves the purpose of matching the resume and the position, thereby realizing the technical effect of improving the accuracy of resume and position matching, and further solving the technical problem of low accuracy of resume and position matching in the existing recruitment system.
It should be noted that, the computer device provided in this embodiment may execute the data matching method in embodiment 1, and the related content is described in embodiment 1 and is not described herein.
Example 4
Embodiments of the present application may provide a computer terminal, which may be any one of a group of computer terminals. Alternatively, in the present embodiment, the above-described computer terminal may be replaced with a terminal device such as a mobile terminal.
Alternatively, in this embodiment, the above-mentioned computer terminal may be located in at least one network device among a plurality of network devices of the computer network.
In this embodiment, the above-mentioned computer terminal may execute the program code for the following steps in the data matching method: acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period; selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; a degree of matching between the first data set and the second data set is determined from at least one of the first and second extended sets, the first data set, and the second data set.
Alternatively, fig. 6 is a block diagram of a computer terminal according to an embodiment of the present application. As shown in fig. 6, the computer terminal 10 may include: one or more (only one is shown) processors 602, memory 604, and a transmission 606.
The memory may be used to store software programs and modules, such as the program instructions/modules corresponding to the data matching method and device in the embodiments of the present application. The processor executes the software programs and modules stored in the memory, thereby executing various functional applications and data processing, that is, implementing the data matching method described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located with respect to the processor, which may be connected to the computer terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor may call the information and the application program stored in the memory through the transmission device to perform the following steps: acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period; selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; a degree of matching between the first data set and the second data set is determined from at least one of the first and second extended sets, the first data set, and the second data set.
Optionally, the above processor may further execute program code for: selecting candidates meeting a first preset screening condition from the third data set for each job description in the first data set; and obtaining the candidate resume corresponding to the candidate meeting the first preset screening condition from the second data set, and splicing the obtained candidate resume into the first expansion set.
Optionally, the above processor may further execute program code for: selecting positions meeting a second preset screening condition from the third data set for each candidate resume in the second data set; and acquiring position descriptions corresponding to positions meeting second preset screening conditions from the first data set, and splicing the acquired position descriptions into a second expansion set.
Optionally, the above processor may further execute program code for: converting the first data set into a first word vector sequence, converting the second data set into a second word vector sequence, converting the first expansion set into a third word vector sequence, and converting the second expansion set into a fourth word vector sequence; converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, converting the third word vector sequence into a third text vector, and converting the fourth word vector sequence into a fourth text vector; calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result, and calculating the matching degree of the third text vector and the fourth text vector to obtain a fourth calculation result; and performing linear transformation on the first calculation result, the second calculation result, the third calculation result and the fourth calculation result to determine the matching degree between the first data set and the second data set.
Optionally, the above processor may further execute program code for: performing word segmentation on the first data set to obtain a first word segmentation result, and performing word vectorization on the first word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the first word segmentation result into a first word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; performing word segmentation on the second data set to obtain a second word segmentation result, and performing word vectorization on the second word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the second word segmentation result into a second word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; performing word segmentation on the first expansion set to obtain a third word segmentation result, and performing word vectorization on the third word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the third word segmentation result into a third word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; and performing word segmentation on the second expansion set to obtain a fourth word segmentation result, and performing word vectorization on the fourth word segmentation result by adopting a dimension reduction mode or a preset word vector, so as to convert the fourth word segmentation result into a fourth word vector sequence, wherein the preset word vector is obtained through pre-training or calculation.
Optionally, the above processor may further execute program code for: converting a first word vector sequence into a first text vector, converting a second word vector sequence into a second text vector, converting a third word vector sequence into a third text vector, and converting a fourth word vector sequence into a fourth text vector through a preset neural network, wherein the preset neural network comprises one of the following: deep neural networks, convolutional neural networks, recurrent neural networks.
Optionally, the above processor may further execute program code for: the cosine similarity of the first text vector and the second text vector is calculated to obtain a first calculation result, the cosine similarity of the first text vector and the fourth text vector is calculated to obtain a second calculation result, the cosine similarity of the second text vector and the third text vector is calculated to obtain a third calculation result, and the cosine similarity of the third text vector and the fourth text vector is calculated to obtain a fourth calculation result.
Optionally, the above processor may further execute program code for: converting the first data set into a first sequence of word vectors, converting the second data set into a second sequence of word vectors, and converting the first expanded set into a third sequence of word vectors; converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the third word vector sequence into a third text vector; calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, and calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result; and performing linear transformation on the first calculation result and the third calculation result, and determining the matching degree between the first data set and the second data set.
Optionally, the above processor may further execute program code for: converting the first data set into a first word vector sequence, converting the second data set into a second word vector sequence, and converting the second expansion set into a fourth word vector sequence; converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the fourth word vector sequence into a fourth text vector; calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, and calculating the matching degree of the second text vector and the fourth text vector to obtain a fourth calculation result; and performing linear transformation on the second calculation result and the fourth calculation result, and determining the matching degree between the first data set and the second data set.
It will be appreciated by those skilled in the art that the configuration shown in FIG. 6 is only illustrative, and the computer terminal may be a smart phone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile internet device (MID), a PAD, etc. FIG. 6 does not limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components (e.g., network interfaces, display devices, etc.) than shown in FIG. 6, or have a different configuration than shown in FIG. 6.
Those of ordinary skill in the art will appreciate that all or part of the steps in the various methods of the above embodiments may be implemented by a program instructing hardware associated with a terminal device, the program may be stored in a computer readable storage medium, and the storage medium may include: a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
Example 5
Embodiments of the present application also provide a storage medium. Alternatively, in this embodiment, the storage medium may be used to store the program code executed by the data matching method provided in the first embodiment.
Alternatively, in this embodiment, the storage medium may be located in any one of the computer terminals in the computer terminal group in the computer network, or in any one of the mobile terminals in the mobile terminal group.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period; selecting a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition; a degree of matching between the first data set and the second data set is determined from at least one of the first and second extended sets, the first data set, and the second data set.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: selecting candidates meeting a first preset screening condition from the third data set for each job description in the first data set; and obtaining the candidate resume corresponding to the candidate meeting the first preset screening condition from the second data set, and splicing the obtained candidate resume into the first expansion set.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: selecting positions meeting a second preset screening condition from the third data set for each candidate resume in the second data set; and acquiring position descriptions corresponding to positions meeting second preset screening conditions from the first data set, and splicing the acquired position descriptions into a second expansion set.
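The construction of the two expansion sets can be sketched as below. The record layout (`job_id`, `candidate_id`, `outcome`), the dictionary-based data sets, and the predicate-style screening conditions are assumptions made for illustration; the application leaves the form of the recruitment records and of the preset screening conditions open.

```python
def build_expansion_sets(job_descriptions, resumes, records, job_filter, resume_filter):
    """Build the first and second expansion sets from recruitment records.

    job_descriptions: dict mapping job_id -> job description text (first data set)
    resumes:          dict mapping candidate_id -> resume text (second data set)
    records:          list of dicts from the recruitment record data set (third data set)
    job_filter:       predicate standing in for the first preset screening condition
    resume_filter:    predicate standing in for the second preset screening condition
    """
    first_expansion = {job_id: [] for job_id in job_descriptions}
    second_expansion = {cand_id: [] for cand_id in resumes}
    for record in records:
        job_id, cand_id = record["job_id"], record["candidate_id"]
        if job_id not in job_descriptions or cand_id not in resumes:
            continue  # skip records that refer to unknown jobs or candidates
        if job_filter(record):
            # Expand the job description with the resume of a screened candidate.
            first_expansion[job_id].append(resumes[cand_id])
        if resume_filter(record):
            # Expand the resume with the description of a screened position.
            second_expansion[cand_id].append(job_descriptions[job_id])
    # Splice the collected texts into one expansion text per job / per candidate.
    return (
        {key: " ".join(texts) for key, texts in first_expansion.items()},
        {key: " ".join(texts) for key, texts in second_expansion.items()},
    )


jobs = {"j1": "backend engineer python", "j2": "data analyst sql"}
cvs = {
    "c1": "python developer five years",
    "c2": "sql reporting analyst",
    "c3": "java developer",
}
history = [
    {"job_id": "j1", "candidate_id": "c1", "outcome": "hired"},
    {"job_id": "j1", "candidate_id": "c3", "outcome": "rejected"},
    {"job_id": "j2", "candidate_id": "c2", "outcome": "interviewed"},
    {"job_id": "j1", "candidate_id": "c2", "outcome": "hired"},
]


def passed_screening(record):
    """Hypothetical screening condition: the candidate reached interview or was hired."""
    return record["outcome"] in {"hired", "interviewed"}


first_set, second_set = build_expansion_sets(
    jobs, cvs, history, passed_screening, passed_screening
)
```

Using one predicate for both directions is only a convenience here; the two preset screening conditions may differ in practice.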
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: converting the first data set into a first word vector sequence, converting the second data set into a second word vector sequence, converting the first expansion set into a third word vector sequence, and converting the second expansion set into a fourth word vector sequence; converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, converting the third word vector sequence into a third text vector, and converting the fourth word vector sequence into a fourth text vector; calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result, and calculating the matching degree of the third text vector and the fourth text vector to obtain a fourth calculation result; and performing linear transformation on the first calculation result, the second calculation result, the third calculation result and the fourth calculation result to determine the matching degree between the first data set and the second data set.
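The final linear transformation can be sketched as a weighted sum of the calculation results plus a bias. The function name `match_degree` and the weight values are hypothetical; in practice the coefficients would presumably be learned together with the rest of the model, which the application does not detail.

```python
def match_degree(results, weights, bias=0.0):
    """Linearly combine the calculation results into one matching degree."""
    if len(results) != len(weights):
        raise ValueError("results and weights must have the same length")
    return sum(w * r for w, r in zip(weights, results)) + bias


# First to fourth calculation results, e.g. cosine similarities (illustrative values).
four_results = (0.8, 0.6, 0.4, 0.2)
# Hypothetical coefficients of the linear transformation.
four_weights = (0.4, 0.2, 0.2, 0.2)
score = match_degree(four_results, four_weights)
```

Because the function only assumes that results and weights have equal length, it also covers the variants that use a single expansion set and therefore combine only two calculation results.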
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: performing word segmentation processing on the first data set to obtain a first word segmentation result, performing word vectorization processing on the first word segmentation result by adopting a dimension reduction method or a preset word vector, and converting the first word segmentation result into a first word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; performing word segmentation processing on the second data set to obtain a second word segmentation result, performing word vectorization processing on the second word segmentation result by adopting a dimension reduction method or a preset word vector, and converting the second word segmentation result into a second word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; performing word segmentation processing on the first expansion set to obtain a third word segmentation result, performing word vectorization processing on the third word segmentation result by adopting a dimension reduction method or a preset word vector, and converting the third word segmentation result into a third word vector sequence, wherein the preset word vector is obtained through pre-training or calculation; performing word segmentation processing on the second expansion set to obtain a fourth word segmentation result, performing word vectorization processing on the fourth word segmentation result by adopting a dimension reduction method or a preset word vector, and converting the fourth word segmentation result into a fourth word vector sequence, wherein the preset word vector is obtained through pre-training or calculation.
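The word segmentation and word vectorization steps can be sketched with a lookup into a table of preset word vectors. The table `PRESET_VECTORS`, its two-dimensional values, and the whitespace-based segmentation are assumptions for this example: a real system would use pre-trained vectors of much higher dimension, and Chinese text would need a dedicated word segmenter passed in place of `str.split`.

```python
def to_word_vector_sequence(text, preset_vectors, dim, segment=str.split):
    """Segment a text into words and map each word to its preset word vector.

    Words missing from the table are mapped to a zero vector, which is an
    assumption of this sketch; the application does not say how unknown
    words are handled.
    """
    unknown = [0.0] * dim
    tokens = segment(text.lower())  # word segmentation result
    return [list(preset_vectors.get(token, unknown)) for token in tokens]


# Hypothetical preset word vectors (toy 2-dimensional values).
PRESET_VECTORS = {
    "python": [1.0, 0.0],
    "engineer": [0.0, 1.0],
}

word_vector_sequence = to_word_vector_sequence(
    "Python engineer remote", PRESET_VECTORS, dim=2
)
```

The same function would be applied to the job description, the candidate resume, and both expansion sets to obtain the first to fourth word vector sequences.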
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: converting a first word vector sequence into a first text vector, converting a second word vector sequence into a second text vector, converting a third word vector sequence into a third text vector, and converting a fourth word vector sequence into a fourth text vector through a preset neural network, wherein the preset neural network comprises one of the following: deep neural networks, convolutional neural networks, recurrent neural networks.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: calculating the cosine similarity of the first text vector and the second text vector to obtain a first calculation result, calculating the cosine similarity of the first text vector and the fourth text vector to obtain a second calculation result, calculating the cosine similarity of the second text vector and the third text vector to obtain a third calculation result, and calculating the cosine similarity of the third text vector and the fourth text vector to obtain a fourth calculation result.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: converting the first data set into a first sequence of word vectors, converting the second data set into a second sequence of word vectors, and converting the first expanded set into a third sequence of word vectors; converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the third word vector sequence into a third text vector; calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, and calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result; and performing linear transformation on the first calculation result and the third calculation result, and determining the matching degree between the first data set and the second data set.
Alternatively, in the present embodiment, the storage medium is configured to store program code for performing the steps of: converting the first data set into a first word vector sequence, converting the second data set into a second word vector sequence, and converting the second expansion set into a fourth word vector sequence; converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the fourth word vector sequence into a fourth text vector; calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, and calculating the matching degree of the second text vector and the fourth text vector to obtain a fourth calculation result; and performing linear transformation on the second calculation result and the fourth calculation result, and determining the matching degree between the first data set and the second data set.
The foregoing embodiment numbers of the present application are merely for description and do not represent the relative merits of the embodiments.
In the foregoing embodiments of the present application, the description of each embodiment has its own emphasis; for a portion that is not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The above-described apparatus embodiments are merely exemplary; for example, the division of the units is merely a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between the parts may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other various media capable of storing program code.
The foregoing is merely a preferred embodiment of the present application. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are also intended to fall within the scope of the present application.

Claims (11)

1. A method of data matching, comprising:
acquiring a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period;
selecting a first extension set of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second extension set of the second data set from the first data set and the third data set according to a second preset screening condition;
determining a degree of matching between the first data set and the second data set according to at least one of the first and second extended sets, the first data set and the second data set;
wherein selecting a first set of extensions of the first data set from the second data set and the third data set according to a first preset screening condition, and selecting a second set of extensions of the second data set from the first data set and the third data set according to a second preset screening condition comprises:
selecting a target object meeting a target screening condition from the third data set, wherein the target screening condition is the first preset screening condition or the second preset screening condition, and the target object is a candidate or a position;
and acquiring an object to be expanded from a target data set based on the target object, and splicing the object to be expanded into a target expansion set, wherein the target data set is the first data set or the second data set, the target expansion set is the first expansion set or the second expansion set, and the object to be expanded is a candidate resume or a job description.
2. The method of claim 1, wherein when the target screening condition is the first preset screening condition, the target data set is the second data set, the target extension set is the first extension set, the target object is the candidate, the object to be extended is the candidate resume, and selecting the first extension set from the second data set and the third data set according to the first preset screening condition comprises:
selecting, for each of the job descriptions in the first dataset, the candidate satisfying the first preset screening condition from the third dataset;
and acquiring the candidate resume corresponding to the candidate meeting the first preset screening condition from the second data set, and splicing the acquired candidate resume into the first expansion set.
3. The method of claim 1, wherein when the target screening condition is the second preset screening condition, the target data set is the first data set, the target extension set is the second extension set, the target object is the position, and the object to be extended is the job description, selecting the second extension set from the first data set and the third data set according to the second preset screening condition comprises:
selecting, for each candidate resume in the second data set, the position meeting the second preset screening condition from the third data set;
and acquiring the job description corresponding to the position meeting the second preset screening condition from the first data set, and splicing the acquired job description into the second expansion set.
4. The method of claim 1, wherein determining a degree of matching between the first data set and the second data set from the first extension set, the second extension set, the first data set, and the second data set comprises:
converting the first dataset into a first sequence of word vectors, converting the second dataset into a second sequence of word vectors, converting the first expanded set into a third sequence of word vectors, and converting the second expanded set into a fourth sequence of word vectors;
converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, converting the third word vector sequence into a third text vector, and converting the fourth word vector sequence into a fourth text vector;
calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result, and calculating the matching degree of the third text vector and the fourth text vector to obtain a fourth calculation result;
and performing linear transformation on the first calculation result, the second calculation result, the third calculation result and the fourth calculation result to determine the matching degree between the first data set and the second data set.
5. The method of claim 4, wherein converting the first dataset into the first sequence of word vectors, the second dataset into the second sequence of word vectors, the first expanded set into the third sequence of word vectors, and the second expanded set into the fourth sequence of word vectors comprises:
performing word segmentation on the first data set to obtain a first word segmentation result, performing word vectorization on the first word segmentation result by adopting a dimension reduction mode or a preset word vector, and converting the first word segmentation result into a first word vector sequence, wherein the preset word vector is obtained through pre-training or calculation;
performing word segmentation on the second data set to obtain a second word segmentation result, performing word vectorization on the second word segmentation result by adopting a dimension reduction mode or a preset word vector, and converting the second word segmentation result into a second word vector sequence, wherein the preset word vector is obtained through pre-training or calculation;
performing word segmentation on the first expansion set to obtain a third word segmentation result, performing word vectorization on the third word segmentation result by adopting a dimension reduction mode or a preset word vector, and converting the third word segmentation result into a third word vector sequence, wherein the preset word vector is obtained through pre-training or calculation;
performing word segmentation on the second expansion set to obtain a fourth word segmentation result, performing word vectorization on the fourth word segmentation result by adopting a dimension reduction mode or a preset word vector, and converting the fourth word segmentation result into a fourth word vector sequence, wherein the preset word vector is obtained through pre-training or calculation.
6. The method of claim 4, wherein converting the first sequence of word vectors into a first text vector, converting the second sequence of word vectors into a second text vector, converting the third sequence of word vectors into the third text vector, and converting the fourth sequence of word vectors into a fourth text vector is performed by a pre-set neural network, wherein the pre-set neural network comprises one of: deep neural networks, convolutional neural networks, recurrent neural networks.
7. The method of claim 4, wherein calculating cosine similarity of the first text vector and the second text vector yields the first calculation result, calculating cosine similarity of the first text vector and the fourth text vector yields the second calculation result, calculating cosine similarity of the second text vector and the third text vector yields the third calculation result, and calculating cosine similarity of the third text vector and the fourth text vector yields the fourth calculation result.
8. The method of claim 1, wherein determining a degree of matching between the first data set and the second data set from the first extension set, the first data set, and the second data set comprises:
converting the first dataset into a first sequence of word vectors, converting the second dataset into a second sequence of word vectors, and converting the first expanded set into a third sequence of word vectors;
converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the third word vector sequence into a third text vector;
calculating the matching degree of the first text vector and the second text vector to obtain a first calculation result, and calculating the matching degree of the second text vector and the third text vector to obtain a third calculation result;
and performing linear transformation on the first calculation result and the third calculation result, and determining the matching degree between the first data set and the second data set.
9. The method of claim 1, wherein determining a degree of matching between the first data set and the second data set from the second extension set, the first data set, and the second data set comprises:
converting the first dataset into a first sequence of word vectors, converting the second dataset into a second sequence of word vectors, and converting the second expanded set into a fourth sequence of word vectors;
converting the first word vector sequence into a first text vector, converting the second word vector sequence into a second text vector, and converting the fourth word vector sequence into a fourth text vector;
calculating the matching degree of the first text vector and the fourth text vector to obtain a second calculation result, and calculating the matching degree of the second text vector and the fourth text vector to obtain a fourth calculation result;
and performing linear transformation on the second calculation result and the fourth calculation result, and determining the matching degree between the first data set and the second data set.
10. The method according to any one of claims 1 to 9, characterized in that the method is applied to at least one of the following scenarios: providing a recruiter with matched candidates, and providing a candidate with matched positions.
11. A data matching apparatus, comprising:
an acquisition module, configured to acquire a first data set, a second data set and a third data set, wherein the first data set is a job description data set, the second data set is a candidate resume data set, and the third data set is a recruitment record data set in a preset period;
a selection module, configured to select a first expansion set of the first data set from the second data set and the third data set according to a first preset screening condition, and select a second expansion set of the second data set from the first data set and the third data set according to a second preset screening condition;
a determining module, configured to determine a degree of matching between the first data set and the second data set according to at least one of the first extension set and the second extension set, the first data set, and the second data set;
wherein the selection module is further configured to:
selecting a target object meeting a target screening condition from the third data set, wherein the target screening condition is the first preset screening condition or the second preset screening condition, and the target object is a candidate or a position;
and acquiring an object to be expanded from a target data set based on the target object, and splicing the object to be expanded into a target expansion set, wherein the target data set is the first data set or the second data set, the target expansion set is the first expansion set or the second expansion set, and the object to be expanded is a candidate resume or a job description.
CN201910110232.5A 2019-02-11 2019-02-11 Data matching method and device Active CN111625618B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910110232.5A CN111625618B (en) 2019-02-11 2019-02-11 Data matching method and device

Publications (2)

Publication Number Publication Date
CN111625618A CN111625618A (en) 2020-09-04
CN111625618B CN111625618B (en) 2023-05-02

Family

ID=72258719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910110232.5A Active CN111625618B (en) 2019-02-11 2019-02-11 Data matching method and device

Country Status (1)

Country Link
CN (1) CN111625618B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106408249A (en) * 2016-08-31 2017-02-15 五八同城信息技术有限公司 Resume and position matching method and device
CA2961157A1 (en) * 2016-03-18 2017-09-18 Mark Meier Job posting, resume creation/management and applicant tracking system and method
CN107291715A (en) * 2016-03-30 2017-10-24 阿里巴巴集团控股有限公司 Resume appraisal procedure and device
CN108717619A (en) * 2018-03-27 2018-10-30 谭振江 A kind of practising method employment recruitment system of bidirectional pushing matched data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140122355A1 (en) * 2012-10-26 2014-05-01 Bright Media Corporation Identifying candidates for job openings using a scoring function based on features in resumes and job descriptions
US20170357945A1 (en) * 2016-06-14 2017-12-14 Recruiter.AI, Inc. Automated matching of job candidates and job listings for recruitment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
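徐锦阳; 张高煜; 王曼曦; 楼焕钰; 薛伟程; 毛髌裕. Bidirectional matching similarity algorithm for positions and resumes on recruitment websites. Information Technology (信息技术). 2016, (08), full text. *
谷楠楠; 冯筠; 孙霞; 赵妍; 张蕾. Automatic parsing and recommendation algorithm for Chinese resumes. Computer Engineering and Applications (计算机工程与应用). 2017, (18), full text. *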

Also Published As

Publication number Publication date
CN111625618A (en) 2020-09-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant