WO2019056793A1 - Resume identification device, method, and computer-readable storage medium - Google Patents

Resume identification device, method, and computer-readable storage medium

Info

Publication number
WO2019056793A1
Authority
WO
WIPO (PCT)
Prior art keywords
resume
preset
identified
recognized
entry
Prior art date
Application number
PCT/CN2018/089188
Other languages
English (en)
French (fr)
Inventor
秦欢欢
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2019056793A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G06F40/216: Parsing using statistical methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00: Administration; Management
    • G06Q10/10: Office automation; Time management
    • G06Q10/105: Human resources
    • G06Q10/1053: Employment or hiring

Definitions

  • The present application relates to the field of computer technology and, in particular, to a resume identification device, a resume identification method, and a computer-readable storage medium.
  • When a supplier uploads the resumes of outsourced personnel, an outsourced worker may have falsified the resume content, for example the work experience or project experience. At present, such fraud can be identified only when a department auditor reviews the resume or when an interviewer conducts the interview, which wastes interview resources.
  • The present application provides a resume identification device, method, and computer-readable storage medium whose main purpose is to identify resume fraud effectively and reduce the waste of interview resources.
  • The present application provides a resume identification device including a memory and a processor, wherein the memory stores a resume identification program executable on the processor, and the following steps are implemented when the resume identification program is executed by the processor:
  • The present application further provides a resume identification method, the method comprising:
  • The present application further provides a computer-readable storage medium on which a resume identification program is stored, and the resume identification program can be executed by one or more processors to implement the steps of the resume identification method described above.
  • The resume identification device, method, and computer-readable storage medium provided by the application obtain the entry content belonging to a preset entry from the resume to be identified, and perform keyword matching between the obtained entry content and the entry content under the same preset entry in a reference resume to calculate the similarity between the resume to be identified and the reference resume. If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold, the resume to be identified is determined to be a real resume; otherwise, it is determined to be a fraudulent resume. In this way the application identifies resume fraud before the interview, filters out fraudulent resumes, and adds resumes determined to be real to the interview list, which improves interview efficiency, reduces the waste of interview resources, and reduces hiring risk.
  • FIG. 1 is a schematic diagram of a preferred embodiment of the resume identification device of the present application.
  • FIG. 2 is a schematic diagram of the program modules of the resume identification program in an embodiment of the resume identification device of the present application.
  • FIG. 3 is a flowchart of a preferred embodiment of the resume identification method of the present application.
  • The application provides a resume identification device.
  • Referring to FIG. 1, a schematic diagram of a preferred embodiment of the resume identification device of the present application is shown.
  • In this embodiment, the resume identification device may be a PC (Personal Computer), or may be a portable terminal device having a display function, such as a smartphone or a tablet computer.
  • The resume identification device includes at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.
  • The memory 11 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (for example, an SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like.
  • In some embodiments, the memory 11 may be an internal storage unit of the resume identification device, such as the hard disk of the resume identification device.
  • In other embodiments, the memory 11 may be an external storage device of the resume identification device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the resume identification device.
  • The memory 11 may also include both an internal storage unit of the resume identification device and an external storage device.
  • The memory 11 can be used not only to store application software installed in the resume identification device and various types of data, such as the code of the resume identification program, but also to temporarily store data that has been output or is to be output.
  • In some embodiments, the processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run program code or process data stored in the memory 11, for example to execute the resume identification program.
  • The communication bus 13 is used to implement connection and communication between these components.
  • The network interface 14 can optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is typically used to establish a communication connection between the device and other electronic devices.
  • FIG. 1 shows only the resume identification device with components 11-14 and the resume identification program, but it should be understood that not all of the illustrated components are required and that more or fewer components may be implemented instead.
  • Optionally, the device may further include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally further include a standard wired interface and a wireless interface.
  • In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch sensor, or the like.
  • The display may also be referred to as a display screen or display unit; it is used to display information processed in the resume identification device and to display a visual user interface.
  • Optionally, the device may also include a touch sensor.
  • The area provided by the touch sensor for the user to perform a touch operation is referred to as a touch area.
  • The touch sensor described herein may be a resistive touch sensor, a capacitive touch sensor, or the like.
  • The touch sensor includes not only a contact-type touch sensor but also a proximity-type touch sensor or the like.
  • The touch sensor may be a single sensor or a plurality of sensors arranged, for example, in an array.
  • The area of the display of the device may be the same as or different from the area of the touch sensor.
  • Optionally, the display is stacked with the touch sensor to form a touch display screen, and the device detects user-triggered touch operations through the touch display screen.
  • Optionally, the device may further include a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a WiFi module, and the like.
  • The sensors include, for example, light sensors, motion sensors, and other sensors.
  • Specifically, the light sensor may include an ambient light sensor and a proximity sensor. If the device is a mobile terminal, the ambient light sensor may adjust the brightness of the display screen according to the brightness of the ambient light, and the proximity sensor may turn off the display and/or backlight when the mobile terminal is moved to the ear.
  • As one kind of motion sensor, the gravity acceleration sensor can detect the magnitude of acceleration in each direction (usually three axes) and, when stationary, the magnitude and direction of gravity. It can be used to identify the posture of the mobile terminal (for example horizontal and vertical screen switching, related games, and magnetometer attitude calibration) and for vibration-recognition functions (for example a pedometer or tap detection). The mobile terminal can also be equipped with other sensors such as a gyroscope, barometer, hygrometer, thermometer, or infrared sensor, which are not described further here.
  • A resume identification program is stored in the memory 11; when the processor 12 executes the resume identification program stored in the memory 11, the following steps are implemented:
  • Entry content belonging to a preset entry is obtained from the resume to be identified.
  • The obtained entry content is keyword-matched against the entry content under the same preset entry of each reference resume in the resume library, and the similarity between the resume to be identified and the reference resume is calculated.
  • The device proposed in this embodiment runs the resume identification program, and the program can be connected to an outsourcing management system.
  • The outsourcing management system sends the resumes of outsourced personnel to the device of this embodiment according to personnel demand information. After receiving a resume, the device treats it as the resume to be identified and determines whether it is fraudulent; resumes identified as fraudulent are filtered out, and resumes identified as real are retained.
  • In this embodiment, a resume library is stored in the device. The resume library holds multiple reference resumes, where a reference resume may be derived from the resume of an incumbent employee.
  • The device can process existing resumes to produce reference resumes, which are stored in the resume library as the benchmark for judging whether a resume uploaded through the outsourcing management system is fraudulent.
  • Specifically, a resume template is set up that includes a plurality of preset entries, such as work experience, project experience, internship experience, and training experience. Existing resumes are entered according to the resume template, and they are entered by keyword rather than by entering all of the content under each entry.
  • The processor 12 is further configured to execute the resume identification program to implement the following steps:
  • Keywords are extracted from each preset entry of an uploaded resume as reference keywords; after the reference keywords are associated with the corresponding preset entries, the reference resume is generated and stored in the resume library.
  • For example, the entry content is displayed to the user, who manually selects the keywords; the keywords selected by the user are received as reference keywords and stored in association with the preset entry.
  • In addition, the resume template can be provided to the outsourcing management system, so that the supplier enters the resume of an outsourced worker through the resume template and uploads it.
  • When a resume to be identified is received, entry content belonging to a preset entry is obtained from it; for example, for the preset entry of work experience, the entry content under that entry is obtained from the resume. The resume is then matched against the reference resumes one by one, in the order in which they are stored in the resume library, and the similarity between each reference resume and the resume to be identified is calculated.
  • The number of preset entries in the resume of an outsourced worker is not fixed. For example, some resumes may include multiple preset entries such as work experience, project experience, internship experience, and training experience.
  • Other resumes may include only a single preset entry, such as work experience or internship experience. The similarity may therefore be calculated in different ways depending on the number of preset entries in the resume to be identified.
  • Optionally, in one embodiment, when the resume to be identified includes a single preset entry, the step of keyword-matching the obtained entry content against the entry content under the same preset entry of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, includes: obtaining the reference keywords included under that preset entry in the reference resume and their total number; and calculating the proportion of that total number accounted for by the reference keywords that appear in the entry content of the preset entry of the resume to be identified, the proportion being used as the similarity between the resume to be identified and the reference resume.
  • For example, if the resume to be identified includes only the preset entry of work experience, the reference keywords corresponding to that preset entry in the reference resume are obtained and their total number N is determined. The number of those reference keywords that appear in the work experience of the resume to be identified is then determined, and the proportion of that number in the total number N is used as the similarity between the resume to be identified and the reference resume.
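  • The single-entry calculation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `entry_similarity`, the case-insensitive substring match, and the sample keywords are all assumptions introduced here, since the text does not specify how a keyword is tested for presence.

```python
from __future__ import annotations


def entry_similarity(entry_text: str, reference_keywords: set[str]) -> float:
    """Share of the reference keywords (total number N) found in the entry text.

    Matching is a case-insensitive substring test, which is an assumption;
    the source only says that keywords are matched.
    """
    if not reference_keywords:
        return 0.0  # no reference keywords, so nothing can match
    text = entry_text.lower()
    matched = sum(1 for kw in reference_keywords if kw.lower() in text)
    return matched / len(reference_keywords)


# Hypothetical reference keywords for the "work experience" entry (N = 4).
reference_keywords = {"Java", "payment gateway", "Oracle", "team lead"}
entry_text = "Led a Java team building a payment gateway on MySQL."

# Two of the four reference keywords occur in the text.
similarity = entry_similarity(entry_text, reference_keywords)
```

  • A substring test is the simplest possible choice; a production system would more likely tokenize the text first so that short keywords do not match inside longer words.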
  • Alternatively, in another embodiment, the step of keyword-matching the obtained entry content against the entry content under the same preset entry of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, includes: when the resume to be identified includes a plurality of preset entries, obtaining the number of reference keywords included under each preset entry in the reference resume; calculating, for each preset entry, the proportion of the reference keywords under that entry that appear in the corresponding entry content of the resume to be identified; obtaining the weight corresponding to each preset entry, wherein the weights of the plurality of preset entries sum to 1; and calculating the similarity between the resume to be identified and the reference resume from the proportions and the corresponding weights according to a weighting algorithm.
  • The user can set the weight of each preset entry in advance according to its importance. After the proportion corresponding to each preset entry has been calculated, the similarity between the resume to be identified and the reference resume is calculated using the preset weights.
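  • A minimal sketch of the weighted case follows. It assumes the "weighting algorithm" is a plain weighted sum of the per-entry proportions, which the text implies but does not spell out; the function name `weighted_similarity` and the sample proportions and weights are hypothetical.

```python
from __future__ import annotations


def weighted_similarity(ratios: dict[str, float], weights: dict[str, float]) -> float:
    """Combine per-entry keyword proportions into one similarity score.

    `ratios` maps each preset entry to its matched-keyword proportion.
    `weights` maps the same entries to user-chosen weights that sum to 1.
    A weighted sum is assumed here; the source only names a weighting algorithm.
    """
    total_weight = sum(weights[entry] for entry in ratios)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights of the preset entries must sum to 1")
    return sum(ratios[entry] * weights[entry] for entry in ratios)


# Hypothetical proportions and weights for three preset entries.
ratios = {"work experience": 0.5, "project experience": 0.75, "training experience": 0.0}
weights = {"work experience": 0.5, "project experience": 0.3, "training experience": 0.2}

# 0.5 * 0.5 + 0.75 * 0.3 + 0.0 * 0.2
score = weighted_similarity(ratios, weights)
```

  • Rejecting weights that do not sum to 1 keeps the score on the same 0 to 1 scale as the single-entry proportion, so the same thresholds can be applied in both cases.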
  • It is determined whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold.
  • If so, the resume to be identified is determined to be a real resume and is added to the interview list.
  • For example, the first preset threshold is set to 70%. If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to 70%, the resume to be identified is considered a real resume. If the similarity between any reference resume in the resume library and the resume to be identified is greater than 70%, the resume to be identified is considered a fraudulent resume, and each reference resume whose similarity to it exceeds 70% is treated as a similar resume.
  • Further, when the resume to be identified is determined to be fraudulent, prompt information may be sent to a second preset node. The prompt information includes the resume to be identified and the reference resumes whose similarity to it is greater than the first preset threshold.
  • That is, after fraud is identified, the resume and its similar resumes can be sent to the second preset node.
  • The second preset node may be the outsourcing management system, so that the outsourcing management system can pursue accountability according to the prompt information.
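  • The first-threshold decision and the collection of similar resumes for the prompt information might look like the sketch below. The function name `screen_resume`, the string verdicts, and the reference-resume identifiers are illustrative assumptions; only the 70% example value comes from the text.

```python
from __future__ import annotations

FIRST_PRESET_THRESHOLD = 0.70  # example value from the text; meant to be configurable


def screen_resume(
    similarities: dict[str, float],
    first_threshold: float = FIRST_PRESET_THRESHOLD,
) -> tuple[str, dict[str, float]]:
    """Return a verdict plus the reference resumes that exceed the threshold.

    `similarities` maps a reference-resume id to its similarity with the
    resume to be identified. The returned dict is what would accompany the
    prompt information sent to the second preset node.
    """
    similar = {ref_id: s for ref_id, s in similarities.items() if s > first_threshold}
    if similar:
        return "fraudulent", similar
    return "real", {}


# Hypothetical similarities against two reference resumes.
verdict, similar = screen_resume({"ref-001": 0.40, "ref-002": 0.82})
```

  • A similarity exactly equal to the threshold counts as real, matching the "less than or equal to" wording of the decision step.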
  • Further, in other embodiments, the processor 12 is further configured to execute the resume identification program to implement the following steps before the step of determining that the resume to be identified is a real resume:
  • If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, it is determined whether there is any reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold.
  • If not, the resume to be identified is determined to be a real resume and is added to the interview list.
  • If so, the resume to be identified, together with the reference resumes whose similarity to it is greater than the second preset threshold and less than or equal to the first preset threshold, is sent to a first preset node so that the first preset node can determine the authenticity of the resume to be identified. When confirmation information sent by the first preset node is received, the resume to be identified is determined to be a real resume, and the step of adding it to the interview list is performed.
  • In this embodiment, a second preset threshold is introduced, and the second preset threshold is smaller than the first preset threshold.
  • For example, the second preset threshold may be 50%. If the similarity between a reference resume in the resume library and the resume to be identified is greater than 50% and less than or equal to 70%, the resume to be identified and the reference resumes whose similarity to it falls in that range are sent to the first preset node for further judgment.
  • The first preset node may be an interviewer node. In addition, the reference keywords that the resume to be identified shares with the reference resume may be marked in the resume to be identified before it is sent to the interviewer, to help the interviewer further judge the authenticity of the resume.
  • After the interviewer confirms that the resume is real, confirmation information may be sent to the device of this embodiment; on receiving the confirmation information sent by the first preset node, the device determines that the resume to be identified is a real resume.
  • The values of the first preset threshold and the second preset threshold given above are only examples. In other embodiments, the user can set other values as needed.
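  • With the second preset threshold added, the decision becomes three-way. The sketch below is one possible reading of that logic; the `Verdict` enum, the function name `classify`, and the use of the highest similarity as the deciding value are assumptions made for illustration, and the 70% and 50% defaults are the example values from the text.

```python
from __future__ import annotations

from enum import Enum


class Verdict(Enum):
    REAL = "real"                  # add to the interview list
    NEEDS_REVIEW = "needs_review"  # send to the first preset node (e.g. interviewer)
    FRAUDULENT = "fraudulent"      # notify the second preset node


def classify(
    similarities: dict[str, float],
    first_threshold: float = 0.70,
    second_threshold: float = 0.50,
) -> Verdict:
    """Classify a resume from its similarities to all reference resumes."""
    if not second_threshold < first_threshold:
        raise ValueError("the first threshold must be greater than the second")
    # Checking the highest similarity is equivalent to checking whether
    # any reference resume exceeds a given threshold.
    highest = max(similarities.values(), default=0.0)
    if highest > first_threshold:
        return Verdict.FRAUDULENT
    if highest > second_threshold:
        return Verdict.NEEDS_REVIEW
    return Verdict.REAL


# Hypothetical case: the closest reference resume is 60% similar.
result = classify({"ref-001": 0.30, "ref-002": 0.60})
```

  • In the `NEEDS_REVIEW` case the resume would be treated as real only after confirmation information arrives from the first preset node; that hand-off is outside this sketch.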
  • The resume identification device proposed in this embodiment obtains the entry content belonging to a preset entry from the resume to be identified, and keyword-matches the obtained entry content against the entry content under the same preset entry in a reference resume to calculate the similarity between the resume to be identified and the reference resume. If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, the resume to be identified is determined to be a real resume; otherwise, it is determined to be a fraudulent resume.
  • With this scheme, resume fraud is identified before the interview, fraudulent resumes are filtered out, and resumes determined to be real are added to the interview list, which improves interview efficiency, reduces the waste of interview resources, and reduces hiring risk.
  • Optionally, in other embodiments, the resume identification program may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to complete the application.
  • A module in the present application refers to a series of computer program instruction segments capable of performing a specific function, and is used to describe the execution process of the resume identification program in the resume identification device.
  • Referring to FIG. 2, a schematic diagram of the functional modules of the resume identification program in an embodiment of the resume identification device of the present application is shown.
  • In this embodiment, the resume identification program can be divided into an acquisition module 10, a calculation module 20, an identification module 30, and a determination module 40. For example:
  • The acquisition module 10 is configured to obtain entry content belonging to a preset entry from the resume to be identified;
  • The calculation module 20 is configured to perform keyword matching between the obtained entry content and the entry content under the same preset entry of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume;
  • The identification module 30 is configured to determine whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold;
  • The determination module 40 is configured to, if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determine that the resume to be identified is a real resume and add it to the interview list.
  • The present application also provides a resume identification method.
  • Referring to FIG. 3, a flowchart of the first embodiment of the resume identification method of the present application is shown.
  • In this embodiment, the resume identification method includes:
  • Step S10: entry content belonging to a preset entry is obtained from the resume to be identified.
  • Step S20: the obtained entry content is keyword-matched against the entry content under the same preset entry of each reference resume in the resume library, and the similarity between the resume to be identified and the reference resume is calculated.
  • The method proposed in this embodiment can be performed by a device, which can be implemented in software and/or hardware.
  • The device interfaces with an outsourcing management system, and the outsourcing management system sends the resumes of outsourced personnel to the device according to personnel demand information.
  • After receiving a resume, the device treats it as the resume to be identified and determines whether it is fraudulent; resumes identified as fraudulent are filtered out, and resumes identified as real are retained.
  • In this embodiment, a resume library is stored. The resume library holds a plurality of reference resumes, where a reference resume may be derived from the resume of an incumbent employee.
  • The device can process existing resumes to produce reference resumes, which are stored in the resume library as the benchmark for judging whether a resume uploaded through the outsourcing management system is fraudulent.
  • Specifically, a resume template is set up that includes a plurality of preset entries, such as work experience, project experience, internship experience, and training experience. Existing resumes are entered according to the resume template, and they are entered by keyword rather than by entering all of the content under each entry.
  • The method further includes the following steps:
  • Keywords are extracted from each preset entry of an uploaded resume as reference keywords; after the reference keywords are associated with the corresponding preset entries, the reference resume is generated and stored in the resume library.
  • For example, the entry content is displayed to the user, who manually selects the keywords; the keywords selected by the user are received as reference keywords and stored in association with the preset entry.
  • In addition, the resume template can be provided to the outsourcing management system, so that the supplier enters the resume of an outsourced worker through the resume template and uploads it.
  • When a resume to be identified is received, entry content belonging to a preset entry is obtained from it; for example, for the preset entry of work experience, the entry content under that entry is obtained from the resume. The resume is then matched against the reference resumes one by one, in the order in which they are stored in the resume library, and the similarity between each reference resume and the resume to be identified is calculated.
  • The number of preset entries in the resume of an outsourced worker is not fixed. For example, some resumes may include multiple preset entries such as work experience, project experience, internship experience, and training experience.
  • Other resumes may include only a single preset entry, such as work experience or internship experience. The similarity may therefore be calculated in different ways depending on the number of preset entries in the resume to be identified.
  • Optionally, in one embodiment, when the resume to be identified includes a single preset entry, step S20 includes: obtaining the reference keywords included under that preset entry in the reference resume and their total number; and calculating the proportion of that total number accounted for by the reference keywords that appear in the entry content of the preset entry of the resume to be identified, the proportion being used as the similarity between the resume to be identified and the reference resume.
  • For example, if the resume to be identified includes only the preset entry of work experience, the reference keywords corresponding to that preset entry in the reference resume are obtained and their total number N is determined. The number of those reference keywords that appear in the work experience of the resume to be identified is then determined, and the proportion of that number in the total number N is used as the similarity between the resume to be identified and the reference resume.
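  • In symbols, the single-entry case can be written as below. Apart from N, which the text uses, the notation is introduced here for illustration: R is the resume to be identified, F a reference resume, K_F the set of reference keywords under the preset entry in F, T_R the entry content of that entry in R, and n the number of matched keywords.

```latex
\[
  \operatorname{sim}(R, F) = \frac{n}{N},
  \qquad
  N = \lvert K_F \rvert,
  \qquad
  n = \bigl\lvert \{\, k \in K_F : k \text{ occurs in } T_R \,\} \bigr\rvert .
\]
```

  • As a worked example under these assumptions, a reference resume with N = 8 reference keywords, of which n = 6 appear in the work experience of the resume to be identified, gives a similarity of 6/8 = 0.75.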
  • Alternatively, in another embodiment, step S20 includes: when the resume to be identified includes a plurality of preset entries, obtaining the number of reference keywords included under each preset entry in the reference resume; calculating, for each preset entry, the proportion of the reference keywords under that entry that appear in the corresponding entry content of the resume to be identified; obtaining the weight corresponding to each preset entry, wherein the weights of the plurality of preset entries sum to 1; and calculating the similarity between the resume to be identified and the reference resume from the proportions and the corresponding weights according to a weighting algorithm.
  • The user can set the weight of each preset entry in advance according to its importance. After the proportion corresponding to each preset entry has been calculated, the similarity between the resume to be identified and the reference resume is calculated using the preset weights.
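  • Assuming the weighting algorithm is a weighted sum, which the text implies but does not state explicitly, the multi-entry similarity over m preset entries can be written as follows. The symbols w_i, n_i, and N_i (weight, matched keyword count, and reference keyword count of entry i) are notation introduced here.

```latex
\[
  \operatorname{sim}(R, F) = \sum_{i=1}^{m} w_i \, \frac{n_i}{N_i},
  \qquad
  \sum_{i=1}^{m} w_i = 1,
  \qquad
  w_i \ge 0 .
\]
```

  • For instance, with hypothetical weights 0.5, 0.3, and 0.2 and per-entry proportions 0.8, 0.5, and 0.25, the similarity is 0.5(0.8) + 0.3(0.5) + 0.2(0.25) = 0.40 + 0.15 + 0.05 = 0.60.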
  • Step S30: it is determined whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold.
  • Step S40: if yes, the resume to be identified is determined to be a real resume and is added to the interview list.
  • Step S50: if no, the resume to be identified is determined to be a fraudulent resume.
  • For example, the first preset threshold is set to 70%. If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to 70%, the resume to be identified is considered a real resume. If the similarity between any reference resume in the resume library and the resume to be identified is greater than 70%, the resume to be identified is considered a fraudulent resume, and each reference resume whose similarity to it exceeds 70% is treated as a similar resume.
  • Further, when the resume to be identified is determined to be fraudulent, prompt information may be sent to a second preset node. The prompt information includes the resume to be identified and the reference resumes whose similarity to it is greater than the first preset threshold.
  • That is, after fraud is identified, the resume and its similar resumes can be sent to the second preset node.
  • The second preset node may be the outsourcing management system, so that the outsourcing management system can pursue accountability according to the prompt information.
  • Further, in other embodiments, before the step of determining that the resume to be identified is a real resume, the method further includes the following steps:
  • If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, it is determined whether there is any reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold.
  • If not, the resume to be identified is determined to be a real resume and is added to the interview list. If so, the resume to be identified, together with the reference resumes whose similarity to it is greater than the second preset threshold and less than or equal to the first preset threshold, is sent to the first preset node.
  • In this embodiment, a second preset threshold is introduced, and the second preset threshold is smaller than the first preset threshold.
  • For example, the second preset threshold may be 50%. If the similarity between a reference resume in the resume library and the resume to be identified is greater than 50% and less than or equal to 70%, the resume to be identified and the reference resumes whose similarity to it falls in that range are sent to the first preset node for further judgment.
  • The first preset node may be an interviewer node. In addition, the reference keywords that the resume to be identified shares with the reference resume may be marked in the resume to be identified before it is sent to the interviewer, to help the interviewer further judge the authenticity of the resume.
  • After the interviewer confirms that the resume is real, confirmation information may be sent to the device of this embodiment; on receiving the confirmation information sent by the first preset node, the device determines that the resume to be identified is a real resume.
  • The values of the first preset threshold and the second preset threshold given above are only examples. In other embodiments, the user can set other values as needed.
  • the resume identification method proposed in this embodiment obtains the entry content belonging to the preset entries from the resume to be identified and matches it by keyword against the entry content under the same preset entries in each reference resume to calculate the similarity between the resume to be identified and the reference resume; if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, the resume to be identified is determined to be a genuine resume; otherwise it is determined to be a fraudulent resume. The scheme identifies whether a resume is fraudulent before the interview, filters out fraudulent resumes, and adds resumes determined to be genuine to the interview list, which improves interview efficiency, reduces wasted interview resources, and lowers hiring risk.
  • the embodiment of the present application further provides a computer readable storage medium on which a resume identification program is stored; the resume identification program can be executed by one or more processors to implement the following operations:
  • the ratio of the number of keywords contained under each preset entry to the total number of keywords of that preset entry is calculated, and the weight corresponding to each preset entry is obtained, where the weights corresponding to the multiple preset entries sum to 1;
  • if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than the second preset threshold, where the first preset threshold is greater than the second preset threshold;
  • the resume to be identified is sent to the first preset node, so that the first preset node can judge the authenticity of the resume to be identified; when confirmation information sent by the first preset node is received, the resume to be identified is determined to be a genuine resume, and the step of adding the resume to be identified to the interview list is performed.
  • the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc), including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Operations Research (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Storage Device Security (AREA)

Abstract

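A resume identification apparatus comprises a memory and a processor, the memory storing a resume identification program executable on the processor. When executed by the processor, the program implements the following steps: obtaining, from a resume to be identified, the entry content belonging to preset entries (S10); performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume (S20); determining whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold (S30); if so, determining that the resume to be identified is a genuine resume and adding it to an interview list (S40); if not, determining that the resume to be identified is a fraudulent resume (S50). A resume identification method and a computer-readable storage medium are also provided. Resume fraud is identified effectively, which reduces wasted interview resources.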

Description

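Resume identification apparatus, method, and computer-readable storage medium
This application claims priority under the Paris Convention to Chinese patent application No. 201710876431.8, filed on September 25, 2017 and entitled "Resume identification apparatus, method, and computer-readable storage medium", the entire content of which is incorporated into this application by reference.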
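Technical Field
This application relates to the field of computer technology, and in particular to a resume identification apparatus, a resume identification method, and a computer-readable storage medium.
Background
Suppliers upload the resumes of outsourced personnel. If an outsourced worker falsifies resume content, for example work experience or project experience, the falsification can be detected only when a departmental reviewer reads the resume or when an interviewer conducts the interview, which wastes interview resources.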
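Summary
This application provides a resume identification apparatus, method, and computer-readable storage medium, whose main purpose is to identify resume fraud effectively and reduce wasted interview resources.
To achieve the above purpose, this application provides a resume identification apparatus comprising a memory and a processor, the memory storing a resume identification program executable on the processor, the resume identification program implementing the following steps when executed by the processor:
obtaining, from a resume to be identified, the entry content belonging to preset entries;
performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume;
determining whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold;
if so, determining that the resume to be identified is a genuine resume, and adding the resume to be identified to an interview list;
if not, determining that the resume to be identified is a fraudulent resume.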
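In addition, to achieve the above purpose, this application further provides a resume identification method, comprising:
obtaining, from a resume to be identified, the entry content belonging to preset entries;
performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume;
determining whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold;
if so, determining that the resume to be identified is a genuine resume, and adding the resume to be identified to an interview list;
if not, determining that the resume to be identified is a fraudulent resume.
In addition, to achieve the above purpose, this application further provides a computer-readable storage medium storing a resume identification program, the resume identification program being executable by one or more processors to implement the steps of the resume identification method described above.
The resume identification apparatus, method, and computer-readable storage medium proposed in this application obtain, from the resume to be identified, the entry content belonging to preset entries, and perform keyword matching between the obtained entry content and the entry content under the same preset entries in the reference resumes to calculate the similarity between the resume to be identified and each reference resume. If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, the resume to be identified is determined to be a genuine resume; otherwise it is determined to be a fraudulent resume. In this way, this application identifies whether a resume is fraudulent before the interview, filters out fraudulent resumes, and adds resumes determined to be genuine to the interview list, which improves interview efficiency, reduces wasted interview resources, and lowers hiring risk.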
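Brief Description of the Drawings
FIG. 1 is a schematic diagram of a preferred embodiment of the resume identification apparatus of this application;
FIG. 2 is a schematic diagram of the program modules of the resume identification program in an embodiment of the resume identification apparatus of this application;
FIG. 3 is a flowchart of a preferred embodiment of the resume identification method of this application.
The realization of the purpose, the functional features, and the advantages of this application are further described below with reference to the embodiments and the accompanying drawings.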
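Detailed Description
It should be understood that the specific embodiments described here are only intended to explain this application and are not intended to limit it.
This application provides a resume identification apparatus. FIG. 1 is a schematic diagram of a preferred embodiment of the resume identification apparatus of this application.
In this embodiment, the resume identification apparatus may be a PC (Personal Computer), or a portable terminal device with a display function such as a smartphone or a tablet computer.
The resume identification apparatus includes at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.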
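The memory 11 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments the memory 11 may be an internal storage unit of the resume identification apparatus, such as its hard disk. In other embodiments the memory 11 may be an external storage device of the resume identification apparatus, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the apparatus. Further, the memory 11 may include both an internal storage unit and an external storage device of the resume identification apparatus. The memory 11 can be used not only to store the application software installed on the resume identification apparatus and various kinds of data, such as the code of the resume identification program, but also to temporarily store data that has been output or is to be output.
In some embodiments the processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the resume identification program.
The communication bus 13 is used to implement connection and communication between these components.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is typically used to establish a communication connection between the apparatus and other electronic devices.
FIG. 1 shows only a resume identification apparatus with components 11-14 and the resume identification program, but it should be understood that not all of the illustrated components are required; more or fewer components may be implemented instead.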
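Optionally, the apparatus may further include a user interface. The user interface may include a display and an input unit such as a keyboard, and may optionally also include a standard wired interface and a wireless interface. Optionally, in some embodiments the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display may also be referred to, as appropriate, as a display screen or display unit, and is used to display the information processed in the resume identification apparatus and to display a visual user interface.
Optionally, the apparatus may further include a touch sensor. The area that the touch sensor provides for the user's touch operations is called the touch area. The touch sensor described here may be a resistive touch sensor, a capacitive touch sensor, or the like. The touch sensor includes not only contact-type touch sensors but may also include proximity-type touch sensors. The touch sensor may be a single sensor or multiple sensors arranged in an array. The area of the apparatus's display may be the same as or different from the area of the touch sensor. Optionally, the display and the touch sensor are stacked to form a touch display screen, and the apparatus detects user-triggered touch operations through the touch display screen.
Optionally, the apparatus may further include a camera, an RF (Radio Frequency) circuit, sensors, an audio circuit, a WiFi module, and so on. The sensors may include, for example, a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. If the apparatus is a mobile terminal, the ambient light sensor can adjust the brightness of the display screen according to the ambient light, and the proximity sensor can turn off the display screen and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in each direction (generally three axes) and, when stationary, the magnitude and direction of gravity; it can be used in applications that recognize the attitude of the mobile terminal (such as switching between landscape and portrait, related games, and magnetometer attitude calibration) and in vibration-recognition functions (such as a pedometer or tap detection). The mobile terminal may of course also be equipped with other sensors such as a gyroscope, barometer, hygrometer, thermometer, and infrared sensor, which are not described further here.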
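In the apparatus embodiment shown in FIG. 1, the memory 11 stores the resume identification program; when the processor 12 executes the resume identification program stored in the memory 11, the following steps are implemented:
Obtain, from a resume to be identified, the entry content belonging to preset entries.
Perform keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume.
A resume identification program runs on the apparatus proposed in this embodiment. The program can interface with an outsourcing management system, which sends the resumes of outsourced personnel to the apparatus according to staffing requirements. After receiving such a resume, the apparatus treats it as the resume to be identified, identifies whether it is fraudulent, filters out the resumes identified as fraudulent, and keeps those identified as genuine.
It should be noted that a resume library is stored in the apparatus. The resume library contains multiple reference resumes, which may come from the resumes of current employees. The apparatus can process existing resumes into reference resumes and store them in the resume library as the baseline for judging whether a resume uploaded through the outsourcing management system is fraudulent. To make reference resumes, a resume template is set up that includes multiple preset entries, such as work experience, project experience, internship experience, and training experience. Existing resumes are entered according to this template, and they are entered as extracted keywords rather than as the full content under each entry.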
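Specifically, the processor 12 is further configured to execute the resume identification program to implement the following steps:
Extract keywords from each preset entry of an uploaded resume as reference keywords; after associating the reference keywords with the corresponding preset entries, generate a reference resume and store it in the resume library. Here, the entry content is shown to the user, the user manually selects keywords, and the keywords selected by the user are received as reference keywords and stored in association with the preset entries. In addition, the resume template can be provided to the outsourcing management system, so that suppliers enter and upload the resumes of outsourced personnel through the template.
When the resume to be identified is matched against the resumes in the resume library, the entry content belonging to the preset entries is obtained from the resume to be identified. For the preset entry of work experience, for example, the entry content under that entry is obtained from the resume and matched one by one against the reference resumes in the order in which they are stored in the resume library, and the similarity between each reference resume and the resume to be identified is calculated. The number of preset entries present in an outsourced worker's resume is not fixed: some resumes may contain several preset entries such as work experience, project experience, internship experience, and training experience, while others may contain only one, such as work experience or internship experience. The similarity can therefore be calculated in different ways depending on the number of preset entries in the resume to be identified.
For example, the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, includes: obtaining the reference keywords contained under the preset entry of the reference resume and the total number of those reference keywords; and calculating the proportion of that total accounted for by the number of keywords under the entry content of the preset entry of the resume to be identified, and taking the proportion as the similarity between the resume to be identified and that reference resume.
For example, if the resume to be identified contains only the preset entry of work experience, the reference keywords corresponding to that preset entry in the reference resume are obtained and their total number N is determined; the number of those reference keywords that appear under the work experience entry of the resume to be identified is counted, its proportion of the total number N is calculated, and the proportion is taken as the similarity between the resume to be identified and that reference resume.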
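The single-entry calculation can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `entry_similarity`, the sample keywords, and the choice of case-insensitive substring matching are all assumptions, since the text does not say how a keyword is detected in the entry content.

```python
from __future__ import annotations


def entry_similarity(entry_text: str, reference_keywords: set[str]) -> float:
    """Share of the reference keywords (out of N) that occur in the entry text."""
    if not reference_keywords:
        return 0.0  # assumption: an empty keyword set matches nothing
    text = entry_text.lower()
    # Assumption: a keyword "appears" if it is a case-insensitive substring.
    matched = sum(1 for kw in reference_keywords if kw.lower() in text)
    return matched / len(reference_keywords)


# Hypothetical data: N = 4 reference keywords for "work experience".
reference = {"Java", "Spring", "Oracle", "payment gateway"}
text = "Built a payment gateway in Java with Spring Boot."
print(entry_similarity(text, reference))  # 3 of 4 keywords found -> 0.75
```

Substring matching is deliberately crude (for instance, "Java" would also match "JavaScript"); a real system would probably tokenize the text first.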
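In other embodiments, the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, includes: when the resume to be identified contains multiple preset entries, obtaining the number of reference keywords contained under each preset entry of the reference resume; calculating, for each preset entry, the ratio of the number of keywords contained under that preset entry to the total number of keywords of that preset entry, and obtaining the weight corresponding to each preset entry, where the weights corresponding to the multiple preset entries sum to 1; and calculating the similarity between the resume to be identified and the reference resume by a weighted algorithm according to the ratio and the weight corresponding to each preset entry. The user can set the weight of each preset entry according to its importance; once the ratio for each preset entry has been calculated, it is combined with the preset weights to calculate the similarity between the resume to be identified and the reference resume.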
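A weighted variant might look like the sketch below. The text only says "a weighted algorithm"; reading that as a weighted sum of per-entry ratios is an assumption, as are the name `weighted_similarity`, the sample weights, the substring matching, and the decision to let an entry that is missing from either resume contribute zero.

```python
import math


def weighted_similarity(candidate, reference, weights):
    """Weighted sum of per-entry keyword ratios for one candidate/reference pair."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("entry weights must sum to 1")
    score = 0.0
    for entry, weight in weights.items():
        keywords = reference.get(entry, set())
        if not keywords:
            continue  # assumption: no reference keywords means no contribution
        text = candidate.get(entry, "").lower()
        matched = sum(1 for kw in keywords if kw.lower() in text)
        score += weight * matched / len(keywords)
    return score


# Hypothetical weights and data; the source leaves the weights to the user.
weights = {"work experience": 0.5, "project experience": 0.3, "training experience": 0.2}
reference = {
    "work experience": {"java", "spring", "oracle", "payments"},
    "project experience": {"billing", "migration", "kafka"},
    "training experience": {"pmp"},
}
candidate = {
    "work experience": "Java developer on a Spring stack",
    "project experience": "Kafka-based billing migration",
}
# Ratios are 2/4, 3/3 and 0/1, so the score is about 0.5*0.5 + 0.3*1 + 0.2*0.
print(round(weighted_similarity(candidate, reference, weights), 2))
```

Validating that the weights sum to 1 up front keeps the score inside the 0 to 1 range that the percentage thresholds assume.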
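Determine whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold.
If so, determine that the resume to be identified is a genuine resume, and add the resume to be identified to an interview list.
If not, determine that the resume to be identified is a fraudulent resume.
After the similarity between the resume to be identified and every reference resume in the resume library has been calculated, the first preset threshold, set in advance, is used to judge whether the resume to be identified is fraudulent. For example, in one embodiment of the apparatus the first preset threshold is set to 70%: if the similarity between every reference resume in the resume library and the resume to be identified is less than 70%, the resume to be identified is considered genuine; if the resume library contains a reference resume whose similarity to the resume to be identified is greater than 70%, the resume to be identified is considered fraudulent, and each resume whose similarity to it is greater than 70% is regarded as a near-identical resume.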
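The decision step can be sketched as below, under the assumption that similarities have already been computed and are keyed by a reference-resume identifier (the identifiers and the function name `classify` are hypothetical). The sketch follows the step wording (less than or equal to the threshold counts as genuine), so a score of exactly 70% is treated as genuine.

```python
def classify(similarities, first_threshold=0.70):
    """Return a verdict plus the reference resumes that exceed the threshold."""
    lookalikes = [rid for rid, s in similarities.items() if s > first_threshold]
    if lookalikes:
        # At least one reference resume is too similar: flag as fraudulent and
        # keep the near-identical resumes so they can be reported.
        return "fraudulent", lookalikes
    return "genuine", []


print(classify({"ref-001": 0.40, "ref-002": 0.70}))  # ('genuine', [])
print(classify({"ref-001": 0.71, "ref-002": 0.20}))  # ('fraudulent', ['ref-001'])
```

Note that an empty resume library yields "genuine" here, because no reference resume can exceed the threshold; whether that is the desired behaviour is not addressed in the text.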
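Further, after the resume to be identified has been determined to be fraudulent, prompt information can be sent to a second preset node. The prompt information contains the resume to be identified and the reference resumes whose similarity to it is greater than the first preset threshold; that is, once the resume is identified as fraudulent, it can be sent to the second preset node together with the near-identical resumes. In some embodiments the second preset node may be the outsourcing management system, so that the outsourcing management system can assign accountability according to the prompt information.
Alternatively, in another embodiment, the processor 12 is further configured to execute the resume identification program to implement the following steps before the step of determining that the resume to be identified is a genuine resume:
If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determine whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold.
If not, determine that the resume to be identified is a genuine resume and add it to the interview list.
If so, send the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified; when confirmation information sent by the first preset node is received, determine that the resume to be identified is a genuine resume and perform the step of adding the resume to be identified to the interview list.
In this embodiment, a second preset threshold, smaller than the first preset threshold, is introduced to improve the accuracy of resume identification. For example, the second preset threshold may be 50%: if the resume library contains a reference resume whose similarity to the resume to be identified is greater than 50% and less than or equal to 70%, the resume to be identified and each such reference resume are sent to the first preset node for further judgment. The first preset node may be an interviewer node. In addition, the reference keywords that the resume to be identified shares with the reference resume can be marked in the resume to be identified before it is sent to the interviewer, so that the interviewer can further judge the authenticity of these resumes. If the first preset node manually verifies that the resume to be identified is genuine, it can send confirmation information to the apparatus of this embodiment; on receiving the confirmation information sent by the first preset node, the apparatus determines that the resume to be identified is a genuine resume and performs the step of adding it to the interview list. It will be understood that the values of the first and second preset thresholds are only examples; in other embodiments the user can set other values as needed.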
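The two-threshold embodiment amounts to a three-way triage, sketched below. The `Verdict` enum, the function name `triage`, and the reference identifiers are hypothetical; the 70% and 50% defaults are the example values from the text, and delivery to the reviewing node is left out.

```python
from enum import Enum


class Verdict(Enum):
    GENUINE = "genuine"            # add to the interview list
    NEEDS_REVIEW = "needs_review"  # forward for manual confirmation
    FRAUDULENT = "fraudulent"      # report with the near-identical resumes


def triage(similarities, first_threshold=0.70, second_threshold=0.50):
    """Classify one resume from its similarity to each reference resume."""
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be below the first threshold")
    above_first = [rid for rid, s in similarities.items() if s > first_threshold]
    if above_first:
        return Verdict.FRAUDULENT, above_first
    # Every score is <= first_threshold here, so this is the (second, first] band.
    in_band = [rid for rid, s in similarities.items() if s > second_threshold]
    if in_band:
        return Verdict.NEEDS_REVIEW, in_band
    return Verdict.GENUINE, []


print(triage({"ref-001": 0.30, "ref-002": 0.65}))
```

In this sketch a resume marked `NEEDS_REVIEW` would be added to the interview list only after the reviewer's confirmation arrives, mirroring the confirmation step described above.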
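The resume identification apparatus proposed in this embodiment obtains, from the resume to be identified, the entry content belonging to preset entries, and performs keyword matching between the obtained entry content and the entry content under the same preset entries in the reference resumes to calculate the similarity between the resume to be identified and each reference resume. If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, the resume to be identified is determined to be a genuine resume; otherwise it is determined to be a fraudulent resume. In this way, the embodiment identifies whether a resume is fraudulent before the interview, filters out fraudulent resumes, and adds resumes determined to be genuine to the interview list, which improves interview efficiency, reduces wasted interview resources, and lowers hiring risk.
Optionally, in other embodiments the resume identification program may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out this application. A module, as the term is used in this application, is a series of computer program instruction segments capable of performing a specific function, and is used to describe the execution of the resume identification program in the resume identification apparatus.
For example, FIG. 2 is a schematic diagram of the functional modules of the resume identification program in an embodiment of the resume identification apparatus of this application. In this embodiment the resume identification program may be divided into an acquisition module 10, a calculation module 20, an identification module 30, and a judgment module 40. By way of example:
the acquisition module 10 is configured to obtain, from a resume to be identified, the entry content belonging to preset entries;
the calculation module 20 is configured to perform keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume;
the identification module 30 is configured to determine whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold;
the judgment module 40 is configured to determine, if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, that the resume to be identified is a genuine resume, and to add the resume to be identified to the interview list;
and, if the resume library contains a reference resume whose similarity to the resume to be identified is greater than the first preset threshold, to determine that the resume to be identified is a fraudulent resume.
The functions or operation steps implemented when the acquisition module 10, calculation module 20, identification module 30, and judgment module 40 are executed are substantially the same as in the embodiments above and are not repeated here.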
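One way to picture the four-module split is as four small functions chained together, as in the sketch below. The function names, the `PRESET_ENTRIES` tuple, the dictionary layout of a resume, and the equal weighting of entries inside `calculate` are all assumptions made to keep the example short; the text does not prescribe module interfaces.

```python
PRESET_ENTRIES = ("work experience", "project experience")  # hypothetical template


def acquire(resume):
    """Acquisition module 10: keep only the content of preset entries."""
    return {k: v for k, v in resume.items() if k in PRESET_ENTRIES}


def calculate(content, reference):
    """Calculation module 20: keyword ratio per entry, averaged with equal weights."""
    ratios = []
    for entry, keywords in reference.items():
        text = content.get(entry, "").lower()
        hits = sum(1 for kw in keywords if kw.lower() in text)
        ratios.append(hits / len(keywords) if keywords else 0.0)
    return sum(ratios) / len(ratios) if ratios else 0.0


def identify(similarities, threshold):
    """Identification module 30: are all similarities at or below the threshold?"""
    return all(s <= threshold for s in similarities)


def judge(resume, all_below, interview_list):
    """Judgment module 40: label the resume and queue genuine ones for interview."""
    if all_below:
        interview_list.append(resume)
        return "genuine"
    return "fraudulent"


def run(resume, library, interview_list, threshold=0.70):
    content = acquire(resume)
    similarities = [calculate(content, ref) for ref in library]
    return judge(resume, identify(similarities, threshold), interview_list)


library = [{"work experience": {"java", "spring"}, "project experience": {"billing"}}]
interviews = []
copied = {"name": "A", "work experience": "Java and Spring", "project experience": "billing system"}
fresh = {"name": "B", "work experience": "Regional sales lead"}
print(run(copied, library, interviews), run(fresh, library, interviews))
```

Keeping each module a pure function of its inputs, apart from the interview list that `judge` appends to, makes the stages easy to test in isolation.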
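In addition, this application further provides a resume identification method. FIG. 3 is a flowchart of a first embodiment of the resume identification method of this application.
In this embodiment, the resume identification method includes:
Step S10: obtain, from a resume to be identified, the entry content belonging to preset entries.
Step S20: perform keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume.
The method proposed in this embodiment may be performed by an apparatus, which may be implemented in software and/or hardware. The apparatus interfaces with an outsourcing management system, which sends the resumes of outsourced personnel to the apparatus according to staffing requirements. After receiving such a resume, the apparatus treats it as the resume to be identified, identifies whether it is fraudulent, filters out the resumes identified as fraudulent, and keeps those identified as genuine.
It should be noted that a resume library is stored in the apparatus. The resume library contains multiple reference resumes, which may come from the resumes of current employees. The apparatus can process existing resumes into reference resumes and store them in the resume library as the baseline for judging whether a resume uploaded through the outsourcing management system is fraudulent. To make reference resumes, a resume template is set up that includes multiple preset entries, such as work experience, project experience, internship experience, and training experience. Existing resumes are entered according to this template, and they are entered as extracted keywords rather than as the full content under each entry.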
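Specifically, the method further includes the following steps:
Extract keywords from each preset entry of an uploaded resume as reference keywords; after associating the reference keywords with the corresponding preset entries, generate a reference resume and store it in the resume library. Here, the entry content is shown to the user, the user manually selects keywords, and the keywords selected by the user are received as reference keywords and stored in association with the preset entries. In addition, the resume template can be provided to the outsourcing management system, so that suppliers enter and upload the resumes of outsourced personnel through the template.
When the resume to be identified is matched against the resumes in the resume library, the entry content belonging to the preset entries is obtained from the resume to be identified. For the preset entry of work experience, for example, the entry content under that entry is obtained from the resume and matched one by one against the reference resumes in the order in which they are stored in the resume library, and the similarity between each reference resume and the resume to be identified is calculated. The number of preset entries present in an outsourced worker's resume is not fixed: some resumes may contain several preset entries such as work experience, project experience, internship experience, and training experience, while others may contain only one, such as work experience or internship experience. The similarity can therefore be calculated in different ways depending on the number of preset entries in the resume to be identified.
For example, the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, includes: obtaining the reference keywords contained under the preset entry of the reference resume and the total number of those reference keywords; and calculating the proportion of that total accounted for by the number of keywords under the entry content of the preset entry of the resume to be identified, and taking the proportion as the similarity between the resume to be identified and that reference resume.
For example, if the resume to be identified contains only the preset entry of work experience, the reference keywords corresponding to that preset entry in the reference resume are obtained and their total number N is determined; the number of those reference keywords that appear under the work experience entry of the resume to be identified is counted, its proportion of the total number N is calculated, and the proportion is taken as the similarity between the resume to be identified and that reference resume.
In other embodiments, the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, includes: when the resume to be identified contains multiple preset entries, obtaining the number of reference keywords contained under each preset entry of the reference resume; calculating, for each preset entry, the ratio of the number of keywords contained under that preset entry to the total number of keywords of that preset entry, and obtaining the weight corresponding to each preset entry, where the weights corresponding to the multiple preset entries sum to 1; and calculating the similarity between the resume to be identified and the reference resume by a weighted algorithm according to the ratio and the weight corresponding to each preset entry. The user can set the weight of each preset entry according to its importance; once the ratio for each preset entry has been calculated, it is combined with the preset weights to calculate the similarity between the resume to be identified and the reference resume.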
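Written out, the two calculations above take the following form. Only N appears in the text; the symbols m_i, N_i, w_i, s_i, and S are notation introduced here for illustration, and reading the "weighted algorithm" as a weighted sum is an assumption. The numbers in the worked line are made up.

```latex
% Per-entry ratio: m_i reference keywords found out of N_i under preset entry i
s_i = \frac{m_i}{N_i}

% Weighted similarity over k preset entries, with weights summing to 1
S = \sum_{i=1}^{k} w_i \, s_i , \qquad \sum_{i=1}^{k} w_i = 1

% Worked example with two entries: w = (0.6, 0.4), s = (3/4, 1/2)
S = 0.6 \cdot \tfrac{3}{4} + 0.4 \cdot \tfrac{1}{2} = 0.45 + 0.20 = 0.65
```

With a single preset entry (k = 1, w_1 = 1) the weighted form reduces to the plain ratio m/N. Under the example threshold of 70% used in this description, a similarity of 0.65 would not by itself mark the resume as fraudulent.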
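Step S30: determine whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold.
Step S40: if so, determine that the resume to be identified is a genuine resume, and add the resume to be identified to an interview list.
Step S50: if not, determine that the resume to be identified is a fraudulent resume.
After the similarity between the resume to be identified and every reference resume in the resume library has been calculated, the first preset threshold, set in advance, is used to judge whether the resume to be identified is fraudulent. For example, in one embodiment of the apparatus the first preset threshold is set to 70%: if the similarity between every reference resume in the resume library and the resume to be identified is less than 70%, the resume to be identified is considered genuine; if the resume library contains a reference resume whose similarity to the resume to be identified is greater than 70%, the resume to be identified is considered fraudulent, and each resume whose similarity to it is greater than 70% is regarded as a near-identical resume.
Further, after the resume to be identified has been determined to be fraudulent, prompt information can be sent to a second preset node. The prompt information contains the resume to be identified and the reference resumes whose similarity to it is greater than the first preset threshold; that is, once the resume is identified as fraudulent, it can be sent to the second preset node together with the near-identical resumes. In some embodiments the second preset node may be the outsourcing management system, so that the outsourcing management system can assign accountability according to the prompt information.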
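Alternatively, in another embodiment, before the step of determining that the resume to be identified is a genuine resume, the method further includes the following steps:
If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determine whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold.
If not, determine that the resume to be identified is a genuine resume and add it to the interview list; if so, send the resume to be identified, together with each reference resume whose similarity to it is greater than the second preset threshold and less than or equal to the first preset threshold, to the second preset node.
In this embodiment, a second preset threshold, smaller than the first preset threshold, is introduced to improve the accuracy of resume identification. For example, the second preset threshold may be 50%: if the resume library contains a reference resume whose similarity to the resume to be identified is greater than 50% and less than or equal to 70%, the resume to be identified and each such reference resume are sent to the first preset node for further judgment. The first preset node may be an interviewer node. In addition, the reference keywords that the resume to be identified shares with the reference resume can be marked in the resume to be identified before it is sent to the interviewer, so that the interviewer can further judge the authenticity of these resumes. If the first preset node manually verifies that the resume to be identified is genuine, it can send confirmation information to the apparatus of this embodiment; on receiving the confirmation information sent by the first preset node, the apparatus determines that the resume to be identified is a genuine resume and performs the step of adding it to the interview list. It will be understood that the values of the first and second preset thresholds are only examples; in other embodiments the user can set other values as needed.
The resume identification method proposed in this embodiment obtains, from the resume to be identified, the entry content belonging to preset entries, and performs keyword matching between the obtained entry content and the entry content under the same preset entries in the reference resumes to calculate the similarity between the resume to be identified and each reference resume. If the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, the resume to be identified is determined to be a genuine resume; otherwise it is determined to be a fraudulent resume. In this way, the embodiment identifies whether a resume is fraudulent before the interview, filters out fraudulent resumes, and adds resumes determined to be genuine to the interview list, which improves interview efficiency, reduces wasted interview resources, and lowers hiring risk.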
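In addition, an embodiment of this application further provides a computer-readable storage medium storing a resume identification program, the resume identification program being executable by one or more processors to implement the following operations:
obtaining, from a resume to be identified, the entry content belonging to preset entries;
performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume;
determining whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold;
if so, determining that the resume to be identified is a genuine resume, and adding the resume to be identified to an interview list;
if not, determining that the resume to be identified is a fraudulent resume.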
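Further, when executed by the processor, the resume identification program also implements the following operations:
obtaining the reference keywords contained under the preset entry of the reference resume and the total number of those reference keywords;
calculating the proportion of that total accounted for by the number of keywords under the entry content of the preset entry of the resume to be identified, and taking the proportion as the similarity between the resume to be identified and that reference resume.
Further, when executed by the processor, the resume identification program also implements the following operations:
when the resume to be identified contains multiple preset entries, obtaining the number of reference keywords contained under each preset entry of the reference resume;
calculating, for each preset entry, the ratio of the number of keywords contained under that preset entry to the total number of keywords of that preset entry, and obtaining the weight corresponding to each preset entry, where the weights corresponding to the multiple preset entries sum to 1;
calculating the similarity between the resume to be identified and the reference resume by a weighted algorithm according to the ratio and the weight corresponding to each preset entry.
Further, when executed by the processor, the resume identification program also implements the following operations:
if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold;
if not, determining that the resume to be identified is a genuine resume and adding it to the interview list;
if so, sending the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified; when confirmation information sent by the first preset node is received, determining that the resume to be identified is a genuine resume and performing the step of adding the resume to be identified to the interview list.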
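It should be noted that the sequence numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments. The terms "comprise", "include", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, apparatus, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, apparatus, article, or method. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, apparatus, article, or method that includes the element.
From the description of the embodiments above, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. On this understanding, the technical solution of this application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, computer, server, network device, or the like) to perform the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the content of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of this application.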

Claims (20)

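  1. A resume identification apparatus, characterized in that the apparatus comprises a memory and a processor, the memory storing a resume identification program executable on the processor, the resume identification program implementing the following steps when executed by the processor:
    obtaining, from a resume to be identified, the entry content belonging to preset entries;
    performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume;
    determining whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold;
    if so, determining that the resume to be identified is a genuine resume, and adding the resume to be identified to an interview list;
    if not, determining that the resume to be identified is a fraudulent resume.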
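  2. The resume identification apparatus according to claim 1, characterized in that the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, comprises:
    obtaining the reference keywords contained under the preset entry of the reference resume and the total number of those reference keywords;
    calculating the proportion of that total accounted for by the number of keywords under the entry content of the preset entry of the resume to be identified, and taking the proportion as the similarity between the resume to be identified and that reference resume.
  3. The resume identification apparatus according to claim 1, characterized in that the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, comprises:
    when the resume to be identified contains multiple preset entries, obtaining the number of reference keywords contained under each preset entry of the reference resume;
    calculating, for each preset entry, the ratio of the number of keywords contained under that preset entry to the total number of keywords of that preset entry, and obtaining the weight corresponding to each preset entry, where the weights corresponding to the multiple preset entries sum to 1;
    calculating the similarity between the resume to be identified and the reference resume by a weighted algorithm according to the ratio and the weight corresponding to each preset entry.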
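  4. The resume identification apparatus according to claim 2, characterized in that the processor is further configured to execute the resume identification program so as to implement the following steps before the step of determining that the resume to be identified is a genuine resume:
    if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold;
    if not, determining that the resume to be identified is a genuine resume and adding the resume to be identified to the interview list;
    if so, sending the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified; when confirmation information sent by the first preset node is received, determining that the resume to be identified is a genuine resume and performing the step of adding the resume to be identified to the interview list.
  5. The resume identification apparatus according to claim 3, characterized in that the processor is further configured to execute the resume identification program so as to implement the following steps before the step of determining that the resume to be identified is a genuine resume:
    if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold;
    if not, determining that the resume to be identified is a genuine resume and adding the resume to be identified to the interview list;
    if so, sending the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified; when confirmation information sent by the first preset node is received, determining that the resume to be identified is a genuine resume and performing the step of adding the resume to be identified to the interview list.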
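  6. The resume identification apparatus according to claim 4, characterized in that the processor is further configured to execute the resume identification apparatus so as to implement the following steps:
    extracting keywords from each preset entry of an uploaded resume as reference keywords;
    after associating the reference keywords with the corresponding preset entries, generating a reference resume and storing it in the resume library.
  7. The resume identification apparatus according to claim 5, characterized in that the processor is further configured to execute the resume identification apparatus so as to implement the following steps:
    extracting keywords from each preset entry of an uploaded resume as reference keywords;
    after associating the reference keywords with the corresponding preset entries, generating a reference resume and storing it in the resume library.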
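  8. A resume identification method, characterized in that the resume identification method comprises:
    obtaining, from a resume to be identified, the entry content belonging to preset entries;
    performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume;
    determining whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold;
    if so, determining that the resume to be identified is a genuine resume, and adding the resume to be identified to an interview list;
    if not, determining that the resume to be identified is a fraudulent resume.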
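  9. The resume identification method according to claim 8, characterized in that the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, comprises:
    obtaining the reference keywords contained under the preset entry of the reference resume and the total number of those reference keywords;
    calculating the proportion of that total accounted for by the number of keywords under the entry content of the preset entry of the resume to be identified, and taking the proportion as the similarity between the resume to be identified and that reference resume.
  10. The resume identification method according to claim 8, characterized in that the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, comprises:
    when the resume to be identified contains multiple preset entries, obtaining the number of reference keywords contained under each preset entry of the reference resume;
    calculating, for each preset entry, the ratio of the number of keywords contained under that preset entry to the total number of keywords of that preset entry, and obtaining the weight corresponding to each preset entry, where the weights corresponding to the multiple preset entries sum to 1;
    calculating the similarity between the resume to be identified and the reference resume by a weighted algorithm according to the ratio and the weight corresponding to each preset entry.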
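  11. The resume identification method according to claim 9, characterized in that the processor is further configured to execute the resume identification program so as to implement the following steps before the step of determining that the resume to be identified is a genuine resume:
    if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold;
    if not, determining that the resume to be identified is a genuine resume and adding the resume to be identified to the interview list;
    if so, sending the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified;
    when confirmation information sent by the first preset node is received, determining that the resume to be identified is a genuine resume and performing the step of adding the resume to be identified to the interview list.
  12. The resume identification method according to claim 10, characterized in that the processor is further configured to execute the resume identification program so as to implement the following steps before the step of determining that the resume to be identified is a genuine resume:
    if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold;
    if not, determining that the resume to be identified is a genuine resume and adding the resume to be identified to the interview list;
    if so, sending the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified;
    when confirmation information sent by the first preset node is received, determining that the resume to be identified is a genuine resume and performing the step of adding the resume to be identified to the interview list.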
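  13. The resume identification method according to claim 11, characterized in that the method further comprises the following steps:
    extracting keywords from each preset entry of an uploaded resume as reference keywords;
    after associating the reference keywords with the corresponding preset entries, generating a reference resume and storing it in the resume library.
  14. The resume identification method according to claim 12, characterized in that the method further comprises the following steps:
    extracting keywords from each preset entry of an uploaded resume as reference keywords;
    after associating the reference keywords with the corresponding preset entries, generating a reference resume and storing it in the resume library.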
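  15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a resume identification program, the resume identification program being executable by one or more processors to implement the following steps:
    obtaining, from a resume to be identified, the entry content belonging to preset entries;
    performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in a resume library, to calculate the similarity between the resume to be identified and each reference resume;
    determining whether the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to a first preset threshold;
    if so, determining that the resume to be identified is a genuine resume, and adding the resume to be identified to an interview list;
    if not, determining that the resume to be identified is a fraudulent resume.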
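  16. The computer-readable storage medium according to claim 15, characterized in that the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, comprises:
    obtaining the reference keywords contained under the preset entry of the reference resume and the total number of those reference keywords;
    calculating the proportion of that total accounted for by the number of keywords under the entry content of the preset entry of the resume to be identified, and taking the proportion as the similarity between the resume to be identified and that reference resume.
  17. The computer-readable storage medium according to claim 15, characterized in that the step of performing keyword matching between the obtained entry content and the entry content under the same preset entries of the reference resumes in the resume library, to calculate the similarity between the resume to be identified and each reference resume, comprises:
    when the resume to be identified contains multiple preset entries, obtaining the number of reference keywords contained under each preset entry of the reference resume;
    calculating, for each preset entry, the ratio of the number of keywords contained under that preset entry to the total number of keywords of that preset entry, and obtaining the weight corresponding to each preset entry, where the weights corresponding to the multiple preset entries sum to 1;
    calculating the similarity between the resume to be identified and the reference resume by a weighted algorithm according to the ratio and the weight corresponding to each preset entry.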
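  18. The computer-readable storage medium according to claim 16, characterized in that the processor is further configured to execute the resume identification program so as to implement the following steps before the step of determining that the resume to be identified is a genuine resume:
    if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold;
    if not, determining that the resume to be identified is a genuine resume and adding the resume to be identified to the interview list;
    if so, sending the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified; when confirmation information sent by the first preset node is received, determining that the resume to be identified is a genuine resume and performing the step of adding the resume to be identified to the interview list.
  19. The computer-readable storage medium according to claim 17, characterized in that the processor is further configured to execute the resume identification program so as to implement the following steps before the step of determining that the resume to be identified is a genuine resume:
    if the similarity between every reference resume in the resume library and the resume to be identified is less than or equal to the first preset threshold, determining whether the resume library contains a reference resume whose similarity to the resume to be identified is greater than a second preset threshold, where the first preset threshold is greater than the second preset threshold;
    if not, determining that the resume to be identified is a genuine resume and adding the resume to be identified to the interview list;
    if so, sending the resume to be identified to a first preset node so that the first preset node can judge the authenticity of the resume to be identified; when confirmation information sent by the first preset node is received, determining that the resume to be identified is a genuine resume and performing the step of adding the resume to be identified to the interview list.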
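  20. The computer-readable storage medium according to claim 18, characterized in that the processor is further configured to execute the resume identification apparatus so as to implement the following steps:
    extracting keywords from each preset entry of an uploaded resume as reference keywords;
    after associating the reference keywords with the corresponding preset entries, generating a reference resume and storing it in the resume library.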
PCT/CN2018/089188 2017-09-25 2018-05-31 Resume identification apparatus, method, and computer-readable storage medium WO2019056793A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710876431.8A CN107870976A (zh) 2017-09-25 2017-09-25 Resume identification apparatus, method, and computer-readable storage medium
CN201710876431.8 2017-09-25

Publications (1)

Publication Number Publication Date
WO2019056793A1 true WO2019056793A1 (zh) 2019-03-28

Family

ID=61752853

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/089188 WO2019056793A1 (zh) 2017-09-25 2018-05-31 Resume identification apparatus, method, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN107870976A (zh)
WO (1) WO2019056793A1 (zh)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
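CN107870976A (zh) * 2017-09-25 2018-04-03 平安科技(深圳)有限公司 Resume identification apparatus, method, and computer-readable storage medium
CN108764825A (zh) * 2018-05-15 2018-11-06 中国平安人寿保险股份有限公司 Job information matching method, apparatus, computer device, and storage medium
CN108874928B (zh) * 2018-05-31 2024-02-02 平安科技(深圳)有限公司 Resume data information parsing and processing method, apparatus, device, and storage medium
CN108985707B (zh) * 2018-06-11 2021-08-10 安徽引航科技有限公司 A method for quickly judging the authenticity of resume content
CN109598192A (zh) * 2018-10-23 2019-04-09 平安科技(深圳)有限公司 Method, apparatus, and computer device for reviewing resumes based on image recognition technology
CN109545382A (zh) * 2018-10-30 2019-03-29 平安科技(深圳)有限公司 A big-data-based method and computing device for identifying near-identical medical cases
CN109472310B (zh) * 2018-11-12 2022-08-09 深圳八爪网络科技有限公司 Identification method and apparatus for determining that two resumes belong to the same candidate
CN109740147B (zh) * 2018-12-14 2023-08-04 国云科技股份有限公司 A deduplication and matching analysis method for large numbers of candidate resumes
CN109902726B (zh) * 2019-02-02 2022-07-12 天津字节跳动科技有限公司 Resume information processing method and apparatus
CN112559726A (zh) * 2020-12-22 2021-03-26 深圳市易博天下科技有限公司 Resume information filtering method, model training method, apparatus, device, and medium
CN113343816A (zh) * 2021-05-31 2021-09-03 的卢技术有限公司 An automated testing method and system for OCR resume recognition algorithms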

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
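CN101369279A (zh) * 2008-09-19 2009-02-18 江苏大学 A method for detecting academic paper similarity based on a computer retrieval system
CN103970722A (zh) * 2014-05-07 2014-08-06 江苏金智教育信息技术有限公司 A method for deduplicating text content
CN104679728A (zh) * 2015-02-06 2015-06-03 中国农业大学 A text similarity detection method
CN105224518A (zh) * 2014-06-17 2016-01-06 腾讯科技(深圳)有限公司 Text similarity calculation method and system, and similar-text search method and system
CN106408249A (zh) * 2016-08-31 2017-02-15 五八同城信息技术有限公司 Resume and job matching method and apparatus
CN107870976A (zh) * 2017-09-25 2018-04-03 平安科技(深圳)有限公司 Resume identification apparatus, method, and computer-readable storage medium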

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8051081B2 (en) * 2008-08-15 2011-11-01 At&T Intellectual Property I, L.P. System and method for generating media bookmarks
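CN104299083A (zh) * 2014-10-10 2015-01-21 河南智业科技发展有限公司 A system and method for cloud resume storage and lookup
CN106203935B (zh) * 2015-06-11 2019-11-05 唐锐 Skill assessment and job matching method based on user-generated content and user relationships
CN105117863A (zh) * 2015-09-28 2015-12-02 北京橙鑫数据科技有限公司 Resume-to-job matching method and apparatus
CN105302793A (zh) * 2015-10-21 2016-02-03 南方电网科学研究院有限责任公司 A method for automatically evaluating the novelty of scientific and technical literature using a computer
CN105701522B (zh) * 2016-01-13 2018-08-07 浙江工贸职业技术学院 Internet-based method and system for identifying the authenticity of personal resumes
CN106126589B (zh) * 2016-06-17 2018-05-22 广州视源电子科技股份有限公司 Resume search method and apparatus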

Also Published As

Publication number Publication date
CN107870976A (zh) 2018-04-03

Similar Documents

Publication Publication Date Title
WO2019056793A1 (zh) 简历识别装置、方法及计算机可读存储介质
WO2019041773A1 (zh) 预测模型的更新装置、方法及计算机可读存储介质
WO2019019255A1 (zh) 建立预测模型的装置、方法、预测模型建立程序及计算机可读存储介质
CN107818492B (zh) 产品推荐装置、方法及计算机可读存储介质
WO2019019798A1 (zh) 贷款产品的查询装置、方法及计算机可读存储介质
WO2019041521A1 (zh) 用户关键词提取装置、方法及计算机可读存储介质
WO2019205375A1 (zh) 牲畜识别方法、装置及存储介质
CN110751043A (zh) 基于人脸可见性的人脸识别方法、装置及存储介质
CN110866443B (zh) 人像存储方法、人脸识别方法、设备及存储介质
CN109151023A (zh) 任务分配方法、装置及存储介质
CN109190062B (zh) 目标语料数据的爬取方法、装置及存储介质
CN109829375A (zh) 一种机器学习方法、装置、设备及系统
CN110889045B (zh) 标签分析方法、装置及计算机可读存储介质
JP2020514681A (ja) 物質検出方法、装置、電子機器、およびコンピュータ可読記憶媒体
CN113706249B (zh) 数据推荐方法、装置、电子设备及存储介质
CN113869063A (zh) 数据推荐方法、装置、电子设备及存储介质
CN111241066B (zh) 平台数据库自动化运维方法、装置及计算机可读存储介质
CN111552829B (zh) 用于分析图像素材的方法和装置
CN111429085A (zh) 合同数据生成方法、装置、电子设备及存储介质
CN114329164B (zh) 用于处理数据的方法、装置、设备、介质和产品
CN111221917B (zh) 智能分区存储方法、装置及计算机可读存储介质
WO2021164122A1 (zh) 智能化用户识别方法、装置及计算机可读存储介质
CN115098644A (zh) 图像与文本匹配方法、装置、电子设备及存储介质
CN113989618A (zh) 可回收物品分类识别方法
CN113657971B (zh) 物品推荐方法、装置及电子设备

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM1205A DATED 13/10/2020)

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18859440

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 18859440

Country of ref document: EP

Kind code of ref document: A1