WO2014138838A1 - Method and system for candidate matching using dynamic dictionary maintenance heuristics - Google Patents


Info

Publication number
WO2014138838A1
Authority
WO
WIPO (PCT)
Prior art keywords
job posting
documents
dictionary
domain identifiers
dictionaries
Prior art date
Application number
PCT/CA2013/000247
Other languages
French (fr)
Inventor
Jay Tanner
Original Assignee
Whoplusyou Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Whoplusyou Inc.
Priority to PCT/CA2013/000247
Publication of WO2014138838A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 Scheduling, planning or task assignment for a person or group
    • G06Q10/063112 Skill-based matching of a person or a group to a task

Definitions

  • TITLE METHOD AND SYSTEM FOR CANDIDATE MATCHING USING DYNAMIC DICTIONARY MAINTENANCE HEURISTICS
  • the present disclosure relates to employment candidate matching methods and systems. Certain embodiments provide a method and system for candidate matching using dynamic dictionary maintenance heuristics.
  • searches or comparisons based on keywords can be limited to the search queries and the particular terminology used in the search queries. Where the electronic data files use different terminology, the searches and comparisons can miss relevant hits. Furthermore, as the job market changes and evolves over time, some electronic data files can use different or additional keywords to describe the same or new educational requirements, credentials, experiences, technology skills, phrases, buzzwords, etc. In such cases, previous keyword searches and comparisons can become obsolete or less effective in locating relevant hits.
  • FIG. 1 is a block diagram of a system for candidate matching using dynamic dictionary heuristics in accordance with an example;
  • FIG. 2 is a block diagram of a document database of the system of FIG. 1;
  • FIG. 3 is a flowchart illustrating a method for candidate matching using dynamic dictionary heuristics in accordance with an example;
  • FIG. 4 is a flowchart illustrating a method for candidate matching using dynamic dictionary heuristics in accordance with an alternative example.
  • the following describes a method in a server having a processor, a memory, and a network interface device that includes storing, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers, receiving, at the network interface device, a plurality of job posting documents, determining if one of the plurality of dictionaries requires
  • This disclosure relates generally to candidate matching methods and systems and particularly to methods and systems for candidate matching using dynamic dictionary maintenance heuristics.
  • A block diagram of an example of a system 100 for candidate matching using dynamic dictionary maintenance heuristics is shown in FIG. 1.
  • the system 100 includes a plurality of candidate devices 104-1, 104-2, ... 104-n (generically referred to herein as "candidate device 104" and collectively as "candidate devices 104"; this nomenclature will also be used for other elements herein), and a plurality of employer devices 106-1, 106-2, ... 106-o, all of which are connected to a server 110 via a network 108.
  • the server 110 is typically a server or mainframe within a housing containing an arrangement of one or more processors 114, volatile memory (i.e., random access memory or RAM), persistent memory 116 (e.g., hard disk devices), and a network interface device 112 (to allow the server 110 to communicate over the network 108) all of which are interconnected by a bus.
  • Many computing environments implementing the server 110 or components thereof are within the scope of the invention.
  • the server 110 is typically connected to other computing infrastructure including displays, printers, data warehouse or file servers, and the like (not shown in FIG. 1).
  • the server 110 includes a network interface device 112 interconnected with the processor 114.
  • the network interface device 112 allows the server 110 to communicate with other computing devices such as the candidate devices 104 and the employer devices 106 via a link with the network 108, or via a direct, local connection (such as a Universal Serial Bus (USB) or Bluetooth™ connection, not shown).
  • the network 108 can include any suitable combination of wired and/or wireless networks, including but not limited to a Wide Area Network (WAN) such as the Internet, a Local Area Network (LAN), cell phone networks, WiFi networks, LTE networks and the like.
  • the network interface device 112 is selected for compatibility with the network 108, as well as with local links as desired.
  • the link between the network interface device 112 and the network 108 is a wired link, such as an Ethernet link.
  • the network interface device 112 thus includes the necessary hardware for communicating over such a link.
  • the link between the server 110 and the network 108 can be wireless, and the network interface device 112 can include (in addition to, or instead of, any wired-link hardware) one or more transmitter/receiver assemblies, or radios, and associated circuitry.
  • the server 110 can include a keyboard, mouse, touch-sensitive display (or other input devices), a monitor (or display, such as a touch-sensitive display, or other output devices) (not shown in FIG. 1).
  • the server 110 stores, in the memory 116, a plurality of computer readable instructions executable by the processor 114. These instructions include an operating system and a variety of applications. Among the applications in the memory 116 is an application 131 (also referred to herein as "application 131"; not shown in FIG. 1).
  • When the processor 114 executes the instructions of application 131, the processor 114 is configured to perform various functions specified by the computer readable instructions of the application 131.
  • the server 110 can also store in the memory 116, a document database 117, as discussed below in greater detail.
  • the memory 116 can also store messages and records of other transactions between one or more of the candidate devices 104 and employer devices 106.
  • the employer devices 106 are associated with entities seeking to fill job openings. In one example, these entities are third parties acting on behalf of employers (e.g. third party operators of applicant tracking systems or talent management systems that host the data of employers). Typically, the employer device 106 can be any of a desktop computer, smart phone, laptop computer, tablet computer, and the like.
  • the employer device 106 includes a processor, a memory, input and output devices, and a network interface device as described above in connection with the server 110.
  • An employer device 106 can be operated by an employer.
  • the employer device 106 exchanges messages with the server 110, via the network 108 using a client application 132 (not shown in FIG. 1) loaded on the employer device 106.
  • the client application 132 can be a web browser or native application that uses a web-based or mobile interface and exchanges messages including requests for candidate profile matches and messages to post and present job posting documents 118 to candidate devices 104.
  • the candidate devices 104 are associated with candidates (i.e., potential employees for the employers) that are seeking employment and likewise seeking to fill the job openings.
  • the candidate device 104 can be any of a desktop computer, smart phone, laptop computer, tablet computer, and the like.
  • the candidate device 104 includes a processor, a memory, input and output devices, and a network interface device as described above in connection with the server 110.
  • a candidate device 104 can be operated by a candidate.
  • the candidate device 104 and the employer device 106 can be the same device, where for example, the device is operated by users who are candidates, employers, or both.
  • the candidate device 104 exchanges messages with the server 110, via the network 108 and using a client application 134 (not shown in FIG. 1) loaded on the candidate device 104.
  • the client application 134 can be a web browser or native application that uses a web-based or mobile interface and exchanges messages including requests for job posting matches, and messages to post and present candidate profile documents 120 to employer devices 106.
  • the server 110 is configured to perform professional social network service operations.
  • the server 110 is configured to process "connection" requests, searches, exchange of content, instant messages, video chats, and the like and any other transactions between any of the candidate devices 104 and the employer devices 106, the candidate devices 104 themselves, or the employer devices 106 themselves.
  • the server 110 includes a document database 117.
  • the document database 117 can be a database application loaded on the server 110, a stand-alone database server or a virtual machine in communication with the network interface device 112 of the server 110, or any other suitable database.
  • the document database 117 maintains a plurality of job posting documents 118-1, 118-2, ... 118-n and a plurality of candidate profile documents 120-1, 120-2, ... 120-o (both are generically referred to herein as "document 119" and collectively as "documents 119").
  • Each job posting document 118 can capture the particulars of a job opening, including without limitation contact information, job requirements, etc.
  • Each candidate profile document 120 can capture the particulars of a candidate including without limitation contact information, work experience, education, qualities and strengths, awards, community service, etc.
  • a candidate profile document 120 can also capture the particulars of a candidate's desired employment, projected work experience, other potential qualifications, etc.
  • the job posting document 118-1 is associated with the employer device 106-1, and so on, and the candidate profile document 120-1 is associated with the candidate device 104-1, and so on.
  • an example document database 117 is shown, including three job posting documents 118-1, 118-2, and 118-3, and one candidate profile document 120-1.
  • the document database 117 may be described as an "index" in that individual documents 119 are indexed or associated with domain identifiers 202.
  • document 118-1 is indexed according to domain identifiers 202-2 and 202-3;
  • document 118-2 is indexed according to domain identifiers 202-1 and 202-3;
  • document 118-3 is indexed according to domain identifier 202-3;
  • document 120-1 is indexed according to domain identifiers 202-1 and 202-3.
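The associations listed above for the example document database 117 can be pictured as a small inverted index. The following is a minimal Python sketch, not the disclosed implementation; the names `doc_to_domains` and `build_index` are illustrative assumptions, and only the document and domain identifier numerals come from the text.

```python
from collections import defaultdict

# Associations as listed for the example document database 117.
# Keys are document reference numerals; values are domain identifiers 202.
doc_to_domains = {
    "118-1": ["202-2", "202-3"],
    "118-2": ["202-1", "202-3"],
    "118-3": ["202-3"],
    "120-1": ["202-1", "202-3"],
}


def build_index(doc_to_domains):
    """Invert the mapping: domain identifier -> sorted list of documents."""
    index = defaultdict(list)
    for doc_id, domains in doc_to_domains.items():
        for domain_id in domains:
            index[domain_id].append(doc_id)
    return {domain_id: sorted(docs) for domain_id, docs in index.items()}


index = build_index(doc_to_domains)
```

Looking up `index["202-3"]` then returns every document indexed under that domain identifier, which is the grouping the matching steps rely on.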
  • each domain identifier 202 is associated with a dictionary 204 of keywords. For example, FIG.
  • the document database 117 can include names or labels such as "node", "keywords", and "documents" in association with each "entry" of the index (described below with reference to the "nodes" of a "tree" data structure); the names are provided for illustrative purposes, and can be omitted from the document database 117.
  • Each document 119 can be associated with one or more domain identifiers 202-1, 202-2,... 202-p that are stored in an index (as shown in FIG. 2).
  • the domain identifiers 202 can relate to a functional area, a focus area, or an educational program area.
  • the index of domain identifiers 202 can be stored as a "tree" data structure in the memory 116, in which each "node" of the tree data structure corresponds to a different domain identifier 202.
  • Each node can be associated with one or more data values including a label value, according to the location of the node in the tree data structure.
  • nodes corresponding to the label value "skills" can also be indexed under nodes or branches of nodes corresponding to the label values "focus areas", "functional areas", or "educational program areas".
  • each node of the index labeled by one or more label values corresponds to a different domain identifier 202.
  • the domain identifier 202-1 can be associated with the label value "Engineering";
  • the domain identifier 202-2 can be associated with the label value "Mechanical Engineering";
  • the domain identifier 202-3 can be associated with the label value "Aerospace Engineering".
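The tree-shaped index of domain identifiers described above might be represented as follows. This is a sketch under stated assumptions: the class name `DomainNode` and its methods are hypothetical, and placing "Mechanical Engineering" and "Aerospace Engineering" as children of "Engineering" is an assumption for illustration, since the text does not state the parent-child layout.

```python
from dataclasses import dataclass, field


@dataclass
class DomainNode:
    """One node of the index; each node corresponds to one domain identifier 202."""
    domain_id: str
    label: str
    children: list = field(default_factory=list)

    def add_child(self, child):
        """Attach a child node and return it."""
        self.children.append(child)
        return child

    def walk(self, depth=0):
        """Yield (depth, node) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# ASSUMPTION: the two specialised labels sit one layer below "Engineering".
root = DomainNode("202-1", "Engineering")
root.add_child(DomainNode("202-2", "Mechanical Engineering"))
root.add_child(DomainNode("202-3", "Aerospace Engineering"))

layers = [(depth, node.label) for depth, node in root.walk()]
```

The depth value gives the "layer" of each node, which becomes relevant when dictionaries of sibling nodes are compared.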
  • the domain identifiers 202 are used to associate, or rank, the job posting documents 118 to enable matching of candidate profile documents 120 to job posting documents 118, as discussed below. In alternative examples, the domain identifiers 202 can be used to enable matching documents 119 more generally.
  • the domain identifiers 202 can be indexed in a non-tree data structure.
  • the domain identifiers 202 can be indexed with respect to functional areas, focus areas, educational program areas, skills, and so on.
  • the domain identifiers 202 can be pre-determined according to business rules, and/or automatically refined by a bootstrapping process.
  • the bootstrapping process can combine or reduce domain identifiers 202 with few indexed documents 119, split domain identifiers 202 with many indexed documents 119, etc.
  • the bootstrapping process can start with a manually coded tree data structure and change the data structure following the application of the dynamic dictionary maintenance heuristics disclosed herein.
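One way to picture the bootstrapping refinement is a pass that flags sparsely populated domain identifiers for combining and heavily populated ones for splitting. The sketch below is an assumption-laden illustration: the function name `propose_refinements`, the cut-offs `min_docs` and `max_docs`, and the document counts are all hypothetical and do not come from the disclosure.

```python
def propose_refinements(doc_counts, min_docs=5, max_docs=1000):
    """Return (merge_candidates, split_candidates) as sorted lists.

    min_docs and max_docs are hypothetical cut-offs chosen for illustration.
    """
    merge = sorted(d for d, n in doc_counts.items() if n < min_docs)
    split = sorted(d for d, n in doc_counts.items() if n > max_docs)
    return merge, split


# Made-up counts of indexed documents 119 per domain identifier 202.
counts = {"202-1": 4200, "202-2": 310, "202-3": 2}
merge_candidates, split_candidates = propose_refinements(counts)
```

A real refinement step would also have to decide which neighbour a sparse node merges into and how a crowded node is divided; the sketch only identifies the candidates.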
  • Each domain identifier 202 is associated with a dictionary 204 of keywords.
  • the dictionaries 204 can be maintained by the server 110 in the memory 116 or in the document database 117.
  • a dictionary 204 is typically maintained for each "node" of the index, listing relevant keywords and variations for that particular node.
  • the term "keywords" can include phrases, types of phrases, single words, multiple words, words in proximity, and so on.
  • the dictionary 204 can consist of a single keyword, sometimes referred to as a "literal" match.
  • each of the domain identifiers 202-1 ("Engineering"), 202-2 ("Mechanical Engineering"), and 202-3 ("Aerospace Engineering") can correspond to a different dictionary 204-1, 204-2, and 204-3.
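Because keywords can be single words or multi-word phrases, a dictionary lookup has to match whole terms rather than substrings. The following minimal sketch uses Python's standard `re` module with word-boundary anchors; the function name `find_keywords` and the keywords placed in the example dictionary are hypothetical, as the disclosure does not list the contents of dictionary 204-2.

```python
import re


def find_keywords(text, dictionary):
    """Return the dictionary keywords (words or phrases) that occur in text."""
    found = set()
    for keyword in dictionary:
        # \b anchors keep "cad" from matching inside "cadence".
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(keyword)
    return found


# Hypothetical keywords for dictionary 204-2 ("Mechanical Engineering").
dictionary_204_2 = {"cad", "finite element analysis", "thermodynamics"}
posting = "Seeking an engineer with CAD and Finite Element Analysis experience."
hits = find_keywords(posting, dictionary_204_2)
```

A dictionary holding a single keyword behaves as the "literal" match mentioned above. Matching "words in proximity" would need a different pattern and is not covered by this sketch.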
  • the system 100 further includes a job posting acquisition engine 122, a posting analysis and indexing engine 126, and a dictionary development engine 124.
  • the system 100 can include an analytics engine 130 (not shown in FIG. 1).
  • the functioning of these engines, sometimes referred to as modules, is described below, with reference to the methods of FIG. 3 and FIG. 4.
  • each of the engines provides
  • the job posting documents 118 include plain-text files. In other examples, however, other suitable documents can be used, such as web form documents or other structured documents.
  • the job posting documents 118 can be associated with one or more domain identifiers 202, as discussed below.
  • the job posting document 118 includes data fields acquired from a populated questionnaire or online form by the employer device 106 using the client application 132.
  • the client application 132 captures and/or sends multimedia assets such as audios, videos, tags, etc. to be included in the job posting document 118. Keywords can be extracted from the multimedia assets.
  • the candidate profile document 120 includes data acquired from a populated questionnaire or online form by the candidate devices 104 using the client application 134.
  • the candidate profile document 120 can be associated with one or more domain identifiers, as discussed below.
  • Table 1 sets out example domain identifiers 202.
  • the client application 134 captures and/or sends multimedia assets such as an electronic CV, audios, videos, tags, etc. to be included in the candidate profile document 120. Keywords can be extracted from the
  • Job Environment Preferences (coaching & training, communications, creative, leadership & management, teamwork, technical)
  • FIG. 3 and FIG. 4 Flowcharts illustrating example methods of candidate matching using dynamic dictionary maintenance heuristics are shown in FIG. 3 and FIG. 4. These methods can be carried out by the application 131 or other software executed by, for example, the processor 114 of the server 110. These methods can contain additional or fewer processes than shown and/or described, and can be
  • Computer-readable code executable by at least one processor of the server 110 to perform the methods can be stored in a computer-readable storage medium, such as a non-transitory computer-readable medium.
  • a method 300 starts at 305 and, at 310, the server 110 is configured to store, in the memory 116, a first index of domain identifiers 202 and associated job posting documents 118, and a plurality of dictionaries 204, where each dictionary 204 is associated with one of the domain identifiers 202.
  • the network interface device 112 receives a plurality of job posting documents 118.
  • the job posting documents 118 can be received from the output of the job posting acquisition engine 122.
  • the received job posting documents 118 can be stored in the document database 117.
  • the job posting acquisition engine 122 acquires job posting documents 118 over the network 108 from third party websites, computing devices, or other data sources.
  • the job posting acquisition engine 122 can use crawling, spidering, or any similar technique, performed ad hoc or on a
  • the websites that are consulted can be pre-determined according to business rules, or determined by crawling, spidering, etc.
  • the pages that are accessed in this way are typically not structured or formatted in a consistent way, as they depend on the particular characteristics of various website configurations of employers, third party applicant tracking systems, etc., and the differences in the pages that can vary by employer, industry, job type, and the like.
  • a crawled page can be included for processing and/or storing in the document database 117 upon the performance of data handling and analysis routines.
  • the job posting documents 118 can be received from the client application 132 loaded on the employer devices 106 in the form of data fields populated from online forms or questionnaires.
  • the job posting acquisition engine 122 includes routines for organizing and maintaining currency of documents and associated links (currency refers to the document 119 being unexpired or active), de-duplication, acquisition scheduling, and data checking.
  • for data checking, job posting documents 118 sourced from website links can be checked for changes or updates to the website links.
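The currency and de-duplication routines could look roughly like the sketch below. It is an illustration only: the posting fields `id`, `text`, and `expires`, the function name `dedupe_and_filter`, and the choice of a SHA-256 hash over whitespace- and case-normalised text are assumptions; the disclosure does not specify how duplicates or expiry are detected.

```python
import hashlib
from datetime import date


def dedupe_and_filter(postings, today):
    """Drop expired postings and exact duplicates of normalised text."""
    seen = set()
    kept = []
    for posting in postings:
        if posting["expires"] < today:
            continue  # no longer current (expired)
        # Normalise case and whitespace so trivial variants hash identically.
        normalised = " ".join(posting["text"].lower().split())
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # duplicate of a posting already kept
        seen.add(digest)
        kept.append(posting)
    return kept


# Made-up postings for illustration.
postings = [
    {"id": "a", "text": "Senior  Welder wanted", "expires": date(2013, 6, 1)},
    {"id": "b", "text": "senior welder WANTED", "expires": date(2013, 7, 1)},
    {"id": "c", "text": "Junior machinist", "expires": date(2013, 1, 1)},
    {"id": "d", "text": "CNC operator", "expires": date(2013, 12, 31)},
]
current = dedupe_and_filter(postings, today=date(2013, 3, 14))
kept_ids = [p["id"] for p in current]
```

Exact hashing only catches postings that are identical after normalisation; near-duplicate detection would need a similarity measure instead.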
  • the document database 117 can be populated with documents according to one or more of several other techniques, such as monitoring XML streams, RSS feeds, batch files, manual input, and the like.
  • job posting documents 118 and candidate profile documents 120 can be populated in the document database 117 through one or more of several methods without departing from the present specification.
  • the job posting acquisition engine 122 provides as output, a stream of job posting documents 118 from selected data sources, for input to the posting analysis and indexing engine 126, discussed below.
  • a candidate profile acquisition engine (not shown) can acquire candidate profile documents 120 over the network 108 from third party websites or data sources through the use of crawling, spidering, etc. as described above with reference to the job posting acquisition engine 122, with appropriate modifications.
  • the method 300 continues at 320, where the server 110 determines if one of the plurality of dictionaries 204 requires maintenance. The determination can be executed by the dictionary development engine 124. If the determination of 320 is affirmative, then, at 325, the dictionary development engine 124 applies a dynamic dictionary maintenance heuristic to dynamically modify the dictionaries 204 using the received job posting
  • the determination can depend upon a predefined passage of time, such as one week. In another example, the determination can depend on the acquisition of a pre-defined number of job posting documents 118. In other examples, the determination can depend upon one or more threshold scores, as discussed below with reference to FIG. 4, following the classification of one or more job posting documents 118.
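The first two maintenance triggers (elapsed time and number of newly acquired documents) can be sketched as a simple predicate. The one-week interval comes from the text; the function name `needs_maintenance` and the document-count limit of 500 are hypothetical. The threshold-score trigger is not modelled here.

```python
from datetime import datetime, timedelta


def needs_maintenance(last_run, now, new_docs,
                      max_age=timedelta(weeks=1), max_new_docs=500):
    """True if enough time has passed or enough new postings have arrived.

    max_new_docs is a hypothetical limit; the text only says "pre-defined".
    """
    return (now - last_run) >= max_age or new_docs >= max_new_docs


last_run = datetime(2013, 3, 1)
due_by_time = needs_maintenance(last_run, datetime(2013, 3, 8), new_docs=0)
due_by_volume = needs_maintenance(last_run, datetime(2013, 3, 3), new_docs=500)
not_due = needs_maintenance(last_run, datetime(2013, 3, 3), new_docs=10)
```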
  • the various heuristics described herein can represent, without limitation, "learning" algorithms designed to dynamically modify and/or update the dictionaries 204 that have been previously updated, thus enabling machine learning.
  • the dictionary development engine 124 applies a dynamic dictionary maintenance heuristic to each dictionary 204.
  • This heuristic evaluates the job posting documents 118 already assigned to a domain identifier 202 (e.g. a functional area, a focus area, or an educational program area) to validate the keywords of the dictionary 204 associated with the domain identifier 202, discarding less relevant keywords and, importantly, acquiring new keywords from the received job posting documents 118.
  • The output of the dictionary development engine 124 is a new dictionary 204A for the domain identifier 202.
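The validate, discard, and acquire behaviour of the maintenance heuristic can be illustrated with a document-frequency rule. This is a swapped-in technique chosen for illustration, not the method of the disclosure: the function name `maintain_dictionary`, the cut-offs `keep_min` and `add_min`, the single-word tokenisation, and the example keywords and postings are all assumptions.

```python
from collections import Counter


def maintain_dictionary(dictionary, postings, keep_min=0.2, add_min=0.6,
                        stopwords=frozenset()):
    """Return a new dictionary (204A) derived from the current one (204).

    A keyword is kept if it appears in at least keep_min of the postings;
    a new term is acquired if it appears in at least add_min of them.
    Both cut-offs are hypothetical. Only single-word keywords are handled.
    """
    n = len(postings)
    if n == 0:
        return set(dictionary)
    doc_freq = Counter()
    for text in postings:
        doc_freq.update(set(text.lower().split()))  # count once per posting
    kept = {k for k in dictionary if doc_freq[k] / n >= keep_min}
    added = {t for t, c in doc_freq.items()
             if c / n >= add_min and t not in stopwords}
    return kept | added


# Hypothetical dictionary 204 and postings already assigned to its domain.
dictionary_204 = {"java", "cobol"}
assigned_postings = [
    "java kotlin developer",
    "kotlin java engineer",
    "kotlin backend developer",
    "java kotlin architect",
]
dictionary_204a = maintain_dictionary(dictionary_204, assigned_postings)
```

In this made-up sample, "cobol" never appears and is discarded, while "kotlin" appears in every posting and is acquired.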
  • Table 2 sets out example dictionaries 204 and 204A for the domain identifier 202 associated with the functional area "Software Development”.
  • both the dictionary 204 and the new dictionary 204A are then evaluated against a sample set of job posting
  • the size of the sample set of job posting documents 118 can be 1000, or 10,000, for example, drawn from a much larger complete set of job posting documents 118.
  • the thresholds can vary depending on the domain identifier 202. For example, certain domain identifiers 202 that index relatively few job posting documents 118 feature a 50% threshold so that many more relevant job posting documents 118 are indexed, and to introduce related or serendipitous matches. For other domain identifiers 202, thresholds of 75% can be suitable, providing a balance between indexing too few and too many job posting documents 118. On the other hand, certain domain identifiers 202 that index relatively many job posting documents 118 feature a 90% threshold to ensure dependable matches such that fewer, but more relevant, job posting documents 118 are indexed. In one example, the heuristic scores each dictionary 204 and 204A at one or more thresholds.
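The three threshold levels named above (50%, 75%, 90%) can be tied to how many job posting documents a domain identifier already indexes. In this sketch the percentages come from the text, while the function name `threshold_for` and the size cut-offs `few` and `many` are hypothetical.

```python
def threshold_for(indexed_count, few=100, many=10_000):
    """Pick a threshold from the number of postings a domain already indexes.

    The cut-offs `few` and `many` are hypothetical; the text only says
    "relatively few" and "relatively many".
    """
    if indexed_count < few:
        return 0.50  # cast a wide net, allow serendipitous matches
    if indexed_count > many:
        return 0.90  # keep only dependable matches
    return 0.75      # balance between too few and too many


sparse = threshold_for(12)
typical = threshold_for(2_500)
crowded = threshold_for(50_000)
```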
  • the heuristic can score each dictionary 204 and 204A by determining the "uniqueness" of each keyword occurrence in the sample set of job posting documents 118, in comparison to other dictionaries 204 within the same "layer" of nodes in the tree data structure.
  • the heuristic can score each dictionary 204 and 204A by comparing the number of job posting documents 118 selected from the sample set. In one example, the heuristic can assign different weights to selected job posting documents 118 that relate to higher popularity scores.
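One possible reading of the "uniqueness" score is the share of keyword occurrences in the sample that come from keywords found in no sibling dictionary of the same layer. The sketch below implements that reading only; the formula, the function name `uniqueness_score`, and the example dictionaries and sample are assumptions rather than the disclosed scoring.

```python
def uniqueness_score(dictionary, siblings, sample):
    """Share of keyword occurrences that come from keywords absent from
    every sibling dictionary (1.0 means fully distinctive).

    Handles single-word keywords only; this formula is an assumption.
    """
    shared = set().union(*siblings) if siblings else set()
    total = unique = 0
    for text in sample:
        tokens = text.lower().split()
        for keyword in dictionary:
            hits = tokens.count(keyword)
            total += hits
            if keyword not in shared:
                unique += hits
    return unique / total if total else 0.0


# Hypothetical current (204) and candidate (204A) dictionaries for one node,
# plus one sibling dictionary from the same layer.
old_dictionary = {"engineering", "turbine"}
new_dictionary = {"turbine", "airfoil"}
sibling_dictionaries = [{"engineering", "gearbox"}]
sample_postings = [
    "turbine engineering role",
    "airfoil and turbine design",
    "engineering manager",
]
old_score = uniqueness_score(old_dictionary, sibling_dictionaries, sample_postings)
new_score = uniqueness_score(new_dictionary, sibling_dictionaries, sample_postings)
```

Here the candidate dictionary scores higher because none of its keywords overlap with the sibling, so it would be the better discriminator for this node.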
  • each dictionary 204 can be tuned on a periodic basis to reflect changing terminology from received job posting documents 118 using heuristics rather than manual intervention.
  • the dictionaries 204 can be periodically or continually aligned and/or adjusted to the terminology of newly acquired job posting documents 118, improving the matching of documents 119.
  • at step 330, the server 110 associates, indexes, or ranks one or more of the acquired job posting documents 118 with one or more of the domain identifiers 202 in the first index using the plurality of dictionaries 204.
  • Step 330 can be executed by the posting analysis and indexing engine 126.
  • the posting analysis and indexing engine 126 evaluates each received job posting document 118 against one or more of the dictionaries 204. The result is a ratio of the strength of the relationship of the job posting document 118 to each dictionary 204, which provides a relevance score relative to all other dictionaries 204 (and associated domain identifiers 202). Using the relevance score, the job posting document 118 can be indexed according to one or more of the domain identifiers 202. For example, a job posting document 118 can be indexed according to domain identifiers corresponding to a functional area, a focus area, or an educational program area. In one example, each domain identifier 202 is associated with values representing "high" and "low" relevance parameters. Only job posting documents 118 meeting the parameters would be selected for matching.
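The relevance ratio can be sketched by taking the number of dictionary keywords found in a posting as the "strength" and dividing by the total strength across all dictionaries. Both that definition of strength and the reading of the "low" and "high" parameters as an inclusive range are assumptions, as are the function names and the example keywords.

```python
def relevance_scores(text, dictionaries):
    """Relevance of a posting to each dictionary, as a share of all hits.

    Strength is taken to be the count of distinct single-word keywords
    present in the posting; this definition is an assumption.
    """
    tokens = set(text.lower().split())
    strengths = {d: len(tokens & kws) for d, kws in dictionaries.items()}
    total = sum(strengths.values())
    return {d: (s / total if total else 0.0) for d, s in strengths.items()}


def select_domains(scores, bounds):
    """Keep domain identifiers whose score lies within their (low, high) range."""
    return sorted(d for d, s in scores.items()
                  if bounds[d][0] <= s <= bounds[d][1])


# Hypothetical dictionaries for domain identifiers 202-2 and 202-3.
dictionaries = {
    "202-2": {"cad", "gearbox", "thermodynamics"},
    "202-3": {"airfoil", "avionics", "propulsion"},
}
scores = relevance_scores("airfoil and propulsion design with cad", dictionaries)
bounds = {"202-2": (0.5, 1.0), "202-3": (0.5, 1.0)}
selected = select_domains(scores, bounds)
```

With these made-up inputs the posting is indexed under 202-3 only, because its share of hits for 202-2 falls below that identifier's "low" parameter.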
  • job posting documents 118 that include plain text files can be indexed according to one or more domain identifiers 202 providing a way to group and organize the job posting documents 118, enabling candidate matching.
  • the server 110 receives, from a candidate device 104, a candidate profile document 120 and associated domain identifiers 202, and a match request.
  • the server 110 evaluates job posting documents 118 from an index (the "first" index) matching, or belonging to the same node of, the associated domain identifiers 202 of the candidate profile document 120.
  • the server 110 transmits, to the candidate device 104, a message containing the matched job posting documents 118. Step 340 can be performed by the document evaluation engine 128.
  • the server 110 receives, from an employer device 106, a job posting document 118 and associated domain identifiers 202, and a match request.
  • the server 110 evaluates candidate profile documents 120 from an index (the "second" index) matching, or belonging to the same node of, the associated domain identifiers 202 of the job posting document 118.
  • the server 110 transmits, to the employer device 106, a message containing the matched candidate profile documents 120.
  • the first and second indexes may be the same index.
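Matching a candidate profile document to job posting documents through shared domain identifiers can be sketched as a lookup against the first index. The index contents below follow the example associations given earlier for documents 118-1 to 118-3 and 120-1; ranking by the number of shared domain identifiers, and the function name `match_postings`, are assumptions added for illustration.

```python
from collections import Counter


def match_postings(candidate_domains, first_index):
    """Rank job postings by how many domain identifiers they share
    with the candidate profile (ties broken by document numeral)."""
    shared = Counter()
    for domain_id in candidate_domains:
        for posting_id in first_index.get(domain_id, ()):
            shared[posting_id] += 1
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


# First index: domain identifier 202 -> job posting documents 118.
first_index = {
    "202-1": ["118-2"],
    "202-2": ["118-1"],
    "202-3": ["118-1", "118-2", "118-3"],
}
# Candidate profile document 120-1 is indexed under 202-1 and 202-3.
matches = match_postings(["202-1", "202-3"], first_index)
```

The employer-side request works the same way in reverse, looking up candidate profile documents in the second index by the domain identifiers of the job posting document.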
  • the document evaluation engine 128 matches a candidate profile
  • the match request from a candidate device 104 can include one or more candidate preferences, experience, education, or other attributes.
  • the match request from an employer device 106 can include one or more employer requirements or preferences.
  • the match request from an employer device 106 can be limited to candidate profile documents 120 including a particular degree or qualification.
  • candidate profile documents 120 and job posting documents 118 indexed, associated, or ranked according to the same domain identifier or identifiers 202 can indicate a match for selected experience, education, or other attributes.
  • mapping candidate profile documents 120 and job posting documents 118 provides a common way to evaluate candidate profile documents 120 and job posting documents 118 based on potentially different terminology used in the documents 119.
  • the document evaluation engine 128 communicates with the server 110 to provide an interface for access by the client applications 132 and 134.
  • Client application 132 can be provided with an interface to facilitate searching or browsing candidate profile documents 120.
  • Client application 134 can be provided with an interface to facilitate searching or browsing job posting documents 118.
  • Some or all of the matched documents 119 may be provided to the devices 104/106, depending on a privacy setting for the document 119.
  • the privacy setting can be determined by the devices 104/106, or based on predetermined business rules by the system 100.
  • each step of method 400 marked with a "B" corresponds to the like-numbered step of method 300.
  • dictionaries 204 are selected for possible dictionary maintenance at step 320B if they are associated with the received job posting documents 118 at 315B.
  • step 405 classifies the received job posting documents 118 according to one or more domain identifiers and therefore selects only those domain identifiers 202 (and associated dictionaries 204) for maintenance according to the method steps of FIG. 3.
  • screenshot 500 may be launched by opening the client application 134 on the candidate device 104.
  • a candidate profile document 120 may be populated including data fields acquired from online form 504.
  • a listing 502 of one or more job posting documents 118 is then shown, displaying one or more data fields for each job posting document 118.
  • Other data fields may be displayed according to the content of the job posting document 118.
  • the client application 134 may be launched on the candidate device 104 for exchanging messages with the server 110, including match requests (e.g.
  • screenshot 600 may be launched by opening the client application 132 on the employer device 106.
  • a summary of selected candidate profile documents 120 is shown at 602 and graphically at 604, displaying a summary of one or more data fields for the candidate profile document 120. Additional data fields including identifying information for the candidate may be displayed upon, for example, a "connection" request being offered and accepted by the employer device 106 and the candidate device 104, respectively.
  • the client application 132 may be launched on the employer device 106 for exchanging messages with the server 110, including match requests (e.g. receiving match request at 335 or 335B and transmitting matched documents at 345 or 345B).
  • the client application 132 displays a questionnaire or online form for populating a job posting document 118 (not shown in FIG. 6).
  • the content of the questionnaire can be determined according to the domain identifiers 202 (and associated fields) and/or the dictionaries 204 that can be dynamically modified.
  • the client application 132 may receive an electronic data file or plain text from the employer device 106 for populating a job posting document 118.
  • server 110 can check matched job posting documents 118 for currency.
  • the system 100 can include an analytics engine 130 (not shown in FIG. 1).
  • the optional analytics engine 130 can generate business intelligence information from the dictionaries 204.
  • the content of the dictionaries 204 can reveal trends or forecasts such as what current skills, education, and training (represented by keywords in the dictionaries 204 that are continually tuned or adjusted) employers are seeking in candidates.
  • baseline and trending information for domain identifiers 202 including dictionaries 204 can be generated. For example, trends across various industries such as new keywords listed in the dictionaries 204 can be detected and reported.
  • the analytics engine 130 can access the database 117 and allow the client applications 132 and/or 134, or third party users, to access the business intelligence information derived from the dictionaries 204.
  • client application 134 used by candidate device 104 sends a match request
  • the analytics engine 130 can send results to show the candidate device 104 how the candidate profile document 120 associated with the candidate device 104 compares with other candidate profile documents 120 interested in or matched for the job posting document 118.
  • the analytics engine 130 can also indicate to candidate devices 104 popular jobs, skills that are valued or sought after, or career paths for a given candidate profile document 120 (referred to as "personal analytics").
  • the analytics engine can provide the candidate device 104 with job prospects by location or geographical proximity, skills forecasting, education requirements or suggestions, and the like.
  • a method includes, in a server having a processor, a memory, and a network interface device storing, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers, receiving, at the network interface device, a plurality of job posting documents, determining if one of the plurality of dictionaries requires maintenance, and if the determination is affirmative, applying a dynamic dictionary maintenance heuristic to dynamically modify the one of the plurality of dictionaries using the received posting documents.
  • the plurality of job posting documents includes a first job posting document and the method can further include associating the first job posting document with one or more of the domain identifiers in the first index using the plurality of dictionaries.
  • the above method can include receiving, from an electronic device, a candidate profile document and associated domain identifiers, and a match request, selecting job posting documents from the first index matching the associated domain identifiers of the candidate profile document, and transmitting, to the electronic device, a message containing the matched job posting documents.
  • the storing can further include storing, in the memory, a second index of domain identifiers and associated candidate profile documents.
  • the method can further include receiving a match request from an electronic device, evaluating candidate profile documents from the second index matching the associated domain identifiers of the first job posting document, and transmitting, to the electronic device, a message containing the matched candidate profile
  • Each dictionary can include a set of keywords of terms or groups of terms, and the determining step can further include: selecting a sample set of job posting documents, for each domain identifier, generating a new dictionary, and scoring the dictionary and the new dictionary against a sample set of job posting documents.
  • the scoring includes scoring the dictionary and the new dictionary using one or more threshold scores.
  • the domain identifiers can be selected from one of a focus area, a functional area, and an educational program area.
  • the index can be maintained as a tree structure.
  • a system includes a server having a processor and connected to a network interface device and a memory, and the processor can be configured to store, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers, receive, at the network interface device, a plurality of job posting documents, determine if one of the plurality of dictionaries requires maintenance, and if the determination is affirmative, apply a dynamic dictionary maintenance heuristic to dynamically modify the one of the plurality of
  • the system can include a job posting acquisition engine for acquiring the plurality of job posting documents from one or more data sources over a network, a posting analysis and indexing engine for associating the received job posting documents with one or more of the domain identifiers, a dictionary development engine for applying the dynamic dictionary maintenance heuristics, and/or an analytics engine for deriving business intelligence information from the plurality of dictionaries.

Abstract

According to embodiments described in the specification, systems and methods are provided for enhancing candidate matching using dynamic dictionary maintenance heuristics. A method in a server having a processor, a memory, and a network interface device includes storing, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers, receiving, at the network interface device, a plurality of job posting documents, determining if one of the plurality of dictionaries requires maintenance, and if the determination is affirmative, applying a dynamic dictionary maintenance heuristic to dynamically modify the one of the plurality of dictionaries using the received job posting documents.

Description

TITLE: METHOD AND SYSTEM FOR CANDIDATE MATCHING USING DYNAMIC DICTIONARY MAINTENANCE HEURISTICS
FIELD OF TECHNOLOGY
[0001] The present disclosure relates to employment candidate matching methods and systems. Certain embodiments provide a method and system for candidate matching using dynamic dictionary maintenance heuristics.
BACKGROUND
[0002] Various techniques have been developed for indexing, searching, and locating documents or electronic data files to facilitate matching job postings with relevant candidates, and vice versa. Past approaches, in which job postings and candidate profiles are stored in electronic data files, have used keyword searches or comparisons of the electronic data files, and can suffer from several disadvantages. For example, searches or comparisons based on keywords can be limited to the search queries and the particular terminology used in the search queries. Where the electronic data files use different terminology, the searches and comparisons can miss relevant hits. Furthermore, as the job market changes and evolves over time, some electronic data files can use different or additional keywords to describe the same or new educational requirements, credentials, experiences, technology skills, phrases, buzzwords, etc. In such cases, previous keyword searches and comparisons can become obsolete or less effective in locating relevant hits.
[0003] Improvements in candidate matching methods and systems are desirable.
[0004] The foregoing examples of the related art and limitations related thereto are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent to those of skill in the art upon a reading of the specification and a study of the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
[0005] Examples are illustrated with reference to the attached figures. It is intended that the examples and figures disclosed herein are to be considered illustrative rather than restrictive.
[0006] FIG. 1 is a block diagram of a system for candidate matching using dynamic dictionary heuristics in accordance with an example;
[0007] FIG. 2 is a block diagram of a document database of the system of FIG. 1;
[0008] FIG. 3 is a flowchart illustrating a method for candidate matching using dynamic dictionary heuristics in accordance with an example;
[0009] FIG. 4 is a flowchart illustrating a method for candidate matching using dynamic dictionary heuristics in accordance with an alternative example;
[0010] FIG. 5 is a view illustrating a screenshot of a client application loaded on a candidate device for use in accordance with the methods of FIG. 3 and FIG. 4; and
[0011] FIG. 6 is a view illustrating a screenshot of a client application loaded on an employer device for use in accordance with the methods of FIG. 3 and FIG. 4.
DETAILED DESCRIPTION
[0012] The following describes a method in a server having a processor, a memory, and a network interface device that includes storing, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers, receiving, at the network interface device, a plurality of job posting documents, determining if one of the plurality of dictionaries requires maintenance, and if the determination is affirmative, applying a dynamic dictionary maintenance heuristic to dynamically modify one of the plurality of dictionaries using the received job posting documents.
[0013] Throughout the following description, specific details are set forth in order to provide a more thorough understanding to persons skilled in the art. However, well-known elements may not be shown or described in detail to avoid unnecessarily obscuring the disclosure. Accordingly, the description and drawings are to be regarded in an illustrative, rather than a restrictive, sense.
[0014] This disclosure relates generally to candidate matching methods and systems and particularly to methods and systems for candidate matching using dynamic dictionary maintenance heuristics.
[0015] The following description will provide, with reference to FIG. 1 and FIG. 2, detailed descriptions of exemplary systems for a candidate matching system. Detailed descriptions of corresponding computer-implemented methods will also be provided in connection with FIG. 3 and FIG. 4.
[0016] A block diagram of an example of a system 100 for candidate matching using dynamic dictionary maintenance heuristics is shown in FIG. 1.
[0017] According to this example, the system 100 includes a plurality of candidate devices 104-1, 104-2,... 104-n (generically referred to herein as "candidate device 104" and collectively as "candidate devices 104"; this nomenclature will also be used for other elements herein), and a plurality of employer devices 106-1, 106-2,... 106-o, all of which are connected to a server 110 via a network 108.
[0018] The server 110 is typically a server or mainframe within a housing containing an arrangement of one or more processors 114, volatile memory (i.e., random access memory or RAM), persistent memory 116 (e.g., hard disk devices), and a network interface device 112 (to allow the server 110 to communicate over the network 108) all of which are interconnected by a bus. Many computing environments implementing the server 110 or components thereof are within the scope of the invention.
[0019] The server 110 is typically connected to other computing infrastructure including displays, printers, data warehouse or file servers, and the like (not shown in FIG. 1).
[0020] The server 110 includes a network interface device 112 interconnected with the processor 114. The network interface device 112 allows the server 110 to communicate with other computing devices such as the candidate devices 104 and the employer devices 106 via a link with the network 108, or via a direct, local connection (such as a Universal Serial Bus (USB) or Bluetooth™ connection, not shown). The network 108 can include any suitable combination of wired and/or wireless networks, including but not limited to a Wide Area Network (WAN) such as the Internet, a Local Area Network (LAN), cell phone networks, WiFi networks, LTE networks and the like.
[0021] The network interface device 112 is selected for compatibility with the network 108, as well as with local links as desired. In one example, the link between the network interface device 112 and the network 108 is a wired link, such as an Ethernet link. The network interface device 112 thus includes the necessary hardware for communicating over such a link. In other examples, the link between the server 110 and the network 108 can be wireless, and the network interface device 112 can include (in addition to, or instead of, any wired-link hardware) one or more transmitter/receiver assemblies, or radios, and associated circuitry.
[0022] The server 110 can include a keyboard, mouse, touch-sensitive display (or other input devices), a monitor (or display, such as a touch-sensitive display, or other output devices) (not shown in FIG. 1).
[0023] The server 110 stores, in the memory 116, a plurality of computer readable instructions executable by the processor 114. These instructions include an operating system and a variety of applications. Among the applications in the memory 116 is an application 131 (not shown in FIG. 1). When the processor 114 executes the instructions of application 131, the processor 114 is configured to perform various functions specified by the computer readable instructions of the application 131, as will be discussed below in greater detail. The server 110 can also store, in the memory 116, a document database 117, as discussed below in greater detail. The memory 116 can also store messages and records of other transactions between one or more of the candidate devices 104 and employer devices 106.
[0024] The employer devices 106 are associated with entities seeking to fill job openings. In one example, these entities are third parties acting on behalf of employers (e.g. third party operators of applicant tracking systems or talent management systems that host the data of employers). Typically, the employer device 106 can be any of a desktop computer, smart phone, laptop computer, tablet computer, and the like. The employer device 106 includes a processor, a memory, input and output devices, and a network interface device as described above in connection with the server 110. An employer device 106 can be operated by an employer.
[0025] The employer device 106 exchanges messages with the server 110, via the network 108 using a client application 132 (not shown in FIG. 1) loaded on the employer device 106. In one example, the client application 132 can be a web browser or native application that uses a web-based or mobile interface and exchanges messages including requests for candidate profile matches and messages to post and present job posting documents 118 to candidate devices 104.
[0026] The candidate devices 104 are associated with candidates (i.e., potential employees for the employers) that are seeking employment and likewise seeking to fill the job openings. Typically, the candidate device 104 can be any of a desktop computer, smart phone, laptop computer, tablet computer, and the like. The candidate device 104 includes a processor, a memory, input and output devices, and a network interface device as described above in connection with the server 110. A candidate device 104 can be operated by a candidate. According to one example, the candidate device 104 and the employer device 106 can be the same device, where for example, the device is operated by users who are candidates, employers, or both.
[0027] The candidate device 104 exchanges messages with the server 110, via the network 108 and using a client application 134 (not shown in FIG. 1) loaded on the candidate device 104. In one example, the client application 134 can be a web browser or native application that uses a web-based or mobile interface and exchanges messages including requests for job posting matches, and messages to post and present candidate profile documents 120 to employer devices 106.
[0028] According to one example, the server 110 is configured to perform professional social network service operations. For example, the server 110 is configured to process "connection" requests, searches, exchange of content, instant messages, video chats, and the like and any other transactions between any of the candidate devices 104 and the employer devices 106, the candidate devices 104 themselves, or the employer devices 106 themselves.
[0029] As mentioned above, the server 110 includes a document database 117. The document database 117 can be a database application loaded on the server 110, a stand-alone database server or a virtual machine in communication with the network interface device 112 of the server 110, or any other suitable database. The document database 117 maintains a plurality of job posting documents 118-1, 118-2,... 118-n and a plurality of candidate profile documents 120-1, 120-2,... 120-o (each generically referred to herein as "document 119" and collectively as "documents 119"). Each job posting document 118 can capture the particulars of a job opening, including without limitation contact information, job requirements, etc. Each candidate profile document 120 can capture the particulars of a candidate including without limitation contact information, work experience, education, qualities and strengths, awards, community service, etc. A candidate profile document 120 can also capture the particulars of a candidate's desired employment, projected work experience, other potential qualifications, etc. Typically, the job posting document 118-1 is associated with the employer device 106-1, and so on, and the candidate profile document 120-1 is associated with the candidate device 104-1, and so on.
[0030] Turning to FIG. 2, an example document database 117 is shown, including three job posting documents 118-1, 118-2, and 118-3, and one candidate profile document 120-1. The document database 117 may be described as an "index" in that individual documents 119 are indexed or associated with domain identifiers 202. For example, document 118-1 is indexed according to domain identifiers 202-2 and 202-3; document 118-2 is indexed according to domain identifiers 202-1 and 202-3; document 118-3 is indexed according to domain identifier 202-3; document 120-1 is indexed according to domain identifiers 202-1 and 202-3. Furthermore, each domain identifier 202 is associated with a dictionary 204 of keywords. For example, FIG. 2 shows the keywords of dictionary 204-3 that is associated with domain identifier 202-3. The document database 117 can include names or labels such as "node", "keywords", and "documents" in association with each "entry" of the index (described below with reference to the "nodes" of a "tree" data structure); the names are provided for illustrative purposes, and can be omitted from the document database 117.
[0031] Each document 119 can be associated with one or more domain identifiers 202-1, 202-2,... 202-p that are stored in an index (as shown in FIG. 2). In one example, the domain identifiers 202 can relate to a functional area, a focus area, or an educational program area. The index of domain identifiers 202 can be stored as a "tree" data structure in the memory 116, in which each "node" of the tree data structure corresponds to a different domain identifier 202. Each node can be associated with one or more data values including a label value, according to the location of the node in the tree data structure. For example, nodes corresponding to the label value "skills" can also be indexed under nodes or branches of nodes corresponding to the label values "focus areas", "functional areas", or "educational program areas". Generally, each node of the index labeled by one or more label values corresponds to a different domain identifier 202. For example, the domain identifier 202-1 can be associated with the label value "Engineering", while the domain identifier 202-2 can be associated with the label value "Mechanical Engineering", and the domain identifier 202-3 can be associated with the label value "Aerospace Engineering".
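In one illustrative, non-limiting sketch, the tree data structure of the index can be modelled as follows. The class name `DomainNode`, the helper `find`, and the nesting of the three example labels under a "functional areas" root are assumptions made for illustration only; the specification gives the labels but does not prescribe a layout.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional, Tuple


@dataclass
class DomainNode:
    """One node of the index; each node stands for one domain identifier."""
    identifier: str  # reference numeral reused as an ID, e.g. "202-1"
    label: str       # label value, e.g. "Engineering"
    children: Dict[str, "DomainNode"] = field(default_factory=dict)

    def add_child(self, identifier: str, label: str) -> "DomainNode":
        """Attach a child node and return it."""
        child = DomainNode(identifier, label)
        self.children[label] = child
        return child

    def walk(
        self, path: Tuple[str, ...] = ()
    ) -> Iterator[Tuple[Tuple[str, ...], "DomainNode"]]:
        """Yield (label path, node) pairs in depth-first order."""
        here = path + (self.label,)
        yield here, self
        for child in self.children.values():
            yield from child.walk(here)


def find(root: DomainNode, identifier: str) -> Optional[Tuple[str, ...]]:
    """Return the label path from the root to the node with this identifier."""
    for path, node in root.walk():
        if node.identifier == identifier:
            return path
    return None


# Assumed nesting: the text names the labels but not their arrangement.
root = DomainNode("root", "functional areas")
engineering = root.add_child("202-1", "Engineering")
engineering.add_child("202-2", "Mechanical Engineering")
engineering.add_child("202-3", "Aerospace Engineering")

print(find(root, "202-3"))
```

The label path returned by `find` reflects the location of the node in the tree, which is how a label value can depend on the branch under which a node is indexed.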
[0032] The domain identifiers 202 are used to associate, or rank, the job posting documents 118 to enable matching of candidate profile documents 120 to job posting documents 118, as discussed below. In alternative examples, the domain identifiers 202 can be used to enable matching documents 119 more generally.
[0033] According to other examples, the domain identifiers 202 can be indexed in a non-tree data structure. The domain identifiers 202 can be indexed with respect to functional areas, focus areas, educational program areas, skills, and so on. The domain identifiers 202 can be pre-determined according to business rules, and/or automatically refined by a bootstrapping process. For example, the bootstrapping process can combine or reduce domain identifiers 202 with few indexed documents 119, split domain identifiers 202 with many indexed documents 119, etc. In one example, the bootstrapping process can start with a manually coded tree data structure and change the data structure following the application of the dynamic dictionary maintenance heuristics disclosed herein.
[0034] Each domain identifier 202 is associated with a dictionary 204 of keywords. In one example, the dictionaries 204 can be maintained by the server 110 in the memory 116 or in the document database 117. A dictionary 204 is typically maintained for each "node" of the index, listing relevant keywords and variations for that particular node. The term "keywords" can include phrases, types of phrases, single words, multiple words, words in proximity, and so on. In one example, the dictionary 204 can consist of a single keyword, sometimes referred to as a "literal" match. In the example given above, each of the domain identifiers 202-1 ("Engineering"), 202-2 ("Mechanical Engineering"), and 202-3 ("Aerospace Engineering") can correspond to a different dictionary 204-1, 204-2, and 204-3.
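As a minimal sketch of how dictionaries 204 of single-word and multi-word keywords might be used to associate a document with domain identifiers 202, the following counts whole-word keyword hits per dictionary. The function names, the `min_hits` parameter, the tokenization rule, and the sample keywords are all hypothetical and are not taken from the specification.

```python
import re
from typing import Dict, List, Set


def keyword_hits(text: str, keywords: Set[str]) -> Set[str]:
    """Return the dictionary keywords (words or phrases) found in the text."""
    # Lower-case and keep only word-like tokens so phrases match on word boundaries.
    normalized = " ".join(re.findall(r"[a-z0-9+#]+", text.lower()))
    padded = f" {normalized} "
    return {kw for kw in keywords if f" {kw.lower()} " in padded}


def classify(
    text: str, dictionaries: Dict[str, Set[str]], min_hits: int = 1
) -> List[str]:
    """Return domain identifiers with at least min_hits keyword hits, best first."""
    scores = {d: len(keyword_hits(text, kws)) for d, kws in dictionaries.items()}
    matched = [d for d, score in scores.items() if score >= min_hits]
    return sorted(matched, key=lambda d: (-scores[d], d))


# Hypothetical keywords for the three example domain identifiers.
dictionaries = {
    "202-1": {"engineering"},
    "202-2": {"mechanical engineering", "cad", "thermodynamics"},
    "202-3": {"aerospace", "avionics", "flight test"},
}
posting = "Seeking an avionics engineer with flight test experience and CAD skills."
print(classify(posting, dictionaries))
```

Under these assumptions, "engineer" does not count as a hit for the keyword "engineering", which illustrates why a dictionary typically lists variations of a keyword for its node.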
[0035] Where two or more documents 119 are indexed according to the same node or branch of nodes in the tree data structure, this can indicate a relevant match between the two or more documents 119.
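The matching described above can be sketched using the document-to-identifier associations given for FIG. 2. Keeping job postings and candidate profiles in two separate mappings, and ranking by the number of shared domain identifiers, are assumptions for illustration; the specification states only that shared nodes can indicate a relevant match.

```python
from typing import Dict, List, Set, Tuple

# Associations as described for FIG. 2 (reference numerals reused as IDs).
posting_index: Dict[str, Set[str]] = {
    "118-1": {"202-2", "202-3"},
    "118-2": {"202-1", "202-3"},
    "118-3": {"202-3"},
}
profile_index: Dict[str, Set[str]] = {
    "120-1": {"202-1", "202-3"},
}


def match_postings(
    profile_id: str,
    profiles: Dict[str, Set[str]],
    postings: Dict[str, Set[str]],
) -> List[Tuple[str, int]]:
    """Return (posting ID, shared identifier count) pairs, most shared first."""
    wanted = profiles[profile_id]
    results = []
    for posting_id, identifiers in postings.items():
        shared = len(wanted & identifiers)
        if shared:
            results.append((posting_id, shared))
    # Ranking by overlap is an assumed rule; ties are broken by posting ID.
    return sorted(results, key=lambda item: (-item[1], item[0]))


print(match_postings("120-1", profile_index, posting_index))
```

With the FIG. 2 associations, job posting document 118-2 shares two domain identifiers with candidate profile document 120-1, while 118-1 and 118-3 each share one.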
[0036] Returning to FIG. 1, the system 100 further includes a job posting acquisition engine 122, a posting analysis and indexing engine 126, and a dictionary development engine 124. Optionally, the system 100 can include an analytics engine 130 (not shown in FIG. 1). The functioning of these engines, sometimes referred to as modules, is described below, with reference to the methods of FIG. 3 and FIG. 4. Typically, each of the engines provides instructions, for example by way of application 131, to determine the functioning of the processor 114 of the server 110. In other examples, however, some of or all the engines may be part of other applications, servers, or other computing infrastructure and the method steps may communicate with components of the server 110 including the database 117 via the network interface device 112.
[0037] In the example of FIG. 2, which should be considered non-limiting, the job posting documents 118 include plain-text files. In other examples, however, other suitable documents can be used, such as web form documents or other structured documents. The job posting documents 118 can be associated with one or more domain identifiers 202, as discussed below.
[0038] According to an alternative example, the job posting document 118 includes data fields acquired from a populated questionnaire or online form by the employer device 106 using the client application 132. In one example, the client application 132 captures and/or sends multimedia assets such as audios, videos, tags, etc. to be included in the job posting document 118. Keywords can be extracted from the multimedia assets.
[0039] According to the example of FIG. 2, the candidate profile document 120 includes data acquired from a populated questionnaire or online form by the candidate devices 104 using the client application 134. The candidate profile document 120 can be associated with one or more domain identifiers, as discussed below. Table 1 sets out example domain identifiers 202. In one example, the client application 134 captures and/or sends multimedia assets such as an electronic CV, audios, videos, tags, etc. to be included in the candidate profile document 120. Keywords can be extracted from the multimedia assets.
TABLE 1: Example Domain Identifiers
Job Type (part-time, full-time, co-op, university, internship)
Functional Area (arts, entertainment, media, publishing; banking, investment, mortgage; biotech, research and development, scientific; clerical, administrative; engineering - chemical; engineering - civil, structural; engineering - computer - hardware, software; engineering - electrical; engineering - general; engineering - interdisciplinary, applied; engineering - mechanical; general labour; health care, social services; human resources; legal; marketing, communications, advertising; purchasing, logistics, inventory management; real estate, facilities, equipment management; sales, business development; skilled trades; software development, information technology; training, customer support)
Job Environment Preferences (coaching & training, communications, creative, leadership & management, teamwork, technical)
Job Title
Industry Sector (Agriculture; Arts, Entertainment, Media, Publishing; Associations & Non Profit Organizations; Business Services; Construction; Consumer Products; Creative Services; Education; Finance & Insurance; Health Care & Social Services; Hospitality; Manufacturing; Personal Services; Professional Services; Public Institution / Public Service; Real Estate; Resource; Retail / Wholesale; Science & Technology; Technical Services; Transportation & Shipping; Utilities & Energy; Waste Management)
Employer Name
Location
[0040] Flowcharts illustrating example methods of candidate matching using dynamic dictionary maintenance heuristics are shown in FIG. 3 and FIG. 4. These methods can be carried out by the application 131 or other software executed by, for example, the processor 114 of the server 110. These methods can contain additional or fewer processes than shown and/or described, and can be performed in a different order. Computer-readable code executable by at least one processor of the server 110 to perform the methods can be stored in a computer-readable storage medium, such as a non-transitory computer-readable medium.
[0041] With reference to FIG. 3, a method 300 starts at 305 and, at 310, the server 110 is configured to store, in the memory 116, a first index of domain identifiers 202 and associated job posting documents 118, and a plurality of dictionaries 204, where each dictionary 204 is associated with one of the domain identifiers 202. At 315, the network interface device 112 receives a plurality of job posting documents 118. The job posting documents 118 can be received from the output of the job posting acquisition engine 122. The received job posting documents 118 can be stored in the document database 117.
[0042] The job posting acquisition engine 122 acquires job posting documents 118 over the network 108 from third party websites, computing devices, or other data sources. In one example, the job posting acquisition engine 122 can use crawling, spidering, or any similar technique, performed ad hoc or on a scheduled basis, to access pages that are posted on employer websites or other data sources. The websites that are consulted can be pre-determined according to business rules, or determined by crawling, spidering, etc. The pages that are accessed in this way are typically not structured or formatted in a consistent way, as they depend on the particular characteristics of various website configurations of employers, third party applicant tracking systems, etc., and can vary by employer, industry, job type, and the like. A crawled page can be included for processing and/or storing in the document database 117 upon the performance of data handling and analysis routines. According to another alternative example, the job posting documents 118 can be received from the client application 132 loaded on the employer devices 106 in the form of data fields populated from online forms or questionnaires.
[0043] In some examples, the job posting acquisition engine 122 includes routines for organizing and maintaining currency of documents and associated links (currency refers to the document 119 being unexpired or active), de-duplication, acquisition scheduling, and data checking. For example, job posting documents 118 sourced from website links can be checked for changes or updates to the website links.
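By way of a hedged illustration, currency checking and de-duplication routines of this kind might look like the sketch below. The `Posting` record, its `expires` field, and the use of a SHA-256 fingerprint over whitespace-normalized text are assumptions; the specification does not say how duplicates or expiry are detected.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Posting:
    url: str
    text: str
    expires: Optional[date] = None  # None means no known expiry date


def fingerprint(text: str) -> str:
    """Hash the text after collapsing case and whitespace differences."""
    canonical = " ".join(text.lower().split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def current_unique(postings: Iterable[Posting], today: date) -> List[Posting]:
    """Drop expired postings, then keep the first copy of each distinct text."""
    seen = set()
    kept = []
    for posting in postings:
        if posting.expires is not None and posting.expires < today:
            continue  # not current: the posting has expired
        digest = fingerprint(posting.text)
        if digest in seen:
            continue  # duplicate of a posting already kept
        seen.add(digest)
        kept.append(posting)
    return kept


postings = [
    Posting("https://example.com/a", "Software Developer, agile team"),
    Posting("https://example.com/b", "software developer,   AGILE team"),
    Posting("https://example.com/c", "Avionics technician", date(2013, 1, 31)),
]
print([p.url for p in current_unique(postings, today=date(2013, 3, 12))])
```

An exact-hash fingerprint only catches copies that differ in case or spacing; near-duplicate detection would need a different technique.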
[0044] In alternative examples, the document database 117 can be populated with documents according to one or more of several other techniques, such as monitoring XML streams, RSS feeds, batch files, manual input, and the like. In other words, job posting documents 118 and candidate profile documents 120 can be populated in the document database 117 through one or more of several methods without departing from the present specification.
[0045] According to one example, the job posting acquisition engine 122 provides as output, a stream of job posting documents 118 from selected data sources, for input to the posting analysis and indexing engine 126, discussed below.
[0046] According to an alternative example, a candidate profile acquisition engine (not shown) can acquire candidate profile documents 120 over the network 108 from third party websites or data sources through the use of crawling, spidering, etc. as described above with reference to the job posting acquisition engine 122, with appropriate modifications.
[0047] Returning to FIG. 3, the method 300 continues at 320, where the server 110 determines if one of the plurality of dictionaries 204 requires maintenance. The determination can be executed by the dictionary development engine 124. If the determination of 320 is affirmative, then, at 325, the dictionary development engine 124 applies a dynamic dictionary maintenance heuristic to dynamically modify the dictionaries 204 using the received job posting documents 118. In one example, the determination can depend upon a predefined passage of time, such as one week. In another example, the determination can depend on the acquisition of a pre-defined number of job posting documents 118. In other examples, the determination can depend upon one or more threshold scores, as discussed below with reference to FIG. 4, following the classification of one or more job posting documents 118.
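The three example triggers for the determination at 320 can be sketched as a single check. The one-week interval is the example given above; the document count of 1000, the threshold score of 0.5, and all names in the sketch are assumed values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class MaintenancePolicy:
    max_age: timedelta = timedelta(weeks=1)  # "one week" is the example in the text
    max_new_documents: int = 1000            # assumed pre-defined document count
    min_score: float = 0.5                   # assumed threshold score


def needs_maintenance(
    last_maintained: datetime,
    new_documents: int,
    now: datetime,
    policy: MaintenancePolicy,
    latest_score: Optional[float] = None,
) -> bool:
    """Return True if any one of the configured triggers fires."""
    if now - last_maintained >= policy.max_age:
        return True  # pre-defined passage of time
    if new_documents >= policy.max_new_documents:
        return True  # pre-defined number of acquired job posting documents
    if latest_score is not None and latest_score < policy.min_score:
        return True  # classification score fell below the threshold
    return False


policy = MaintenancePolicy()
t0 = datetime(2013, 3, 1)
print(needs_maintenance(t0, 10, t0 + timedelta(days=8), policy))
```

Treating the triggers as alternatives (any one suffices) is a design choice of this sketch; an implementation could equally require a combination.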
[0048] The term "heuristic algorithm," or often simply "heuristic," as used herein, generally refers to any type or form of algorithm, formula, model, or tool that can be used to index, classify, or make decisions with respect to, a document 119. In some examples, the various heuristics described herein can represent, without limitation, "learning" algorithms designed to dynamically modify and/or update the dictionaries 204, including dictionaries 204 that have been previously updated, thereby enabling machine learning.
[0049] At 325, the dictionary development engine 124 applies a dynamic dictionary maintenance heuristic to each dictionary 204. This heuristic evaluates the job posting documents 118 already assigned to a domain identifier 202 (e.g. a functional area, a focus area, or an educational program area) to validate the keywords of the dictionary 204 associated with the domain identifier 202, discarding less relevant keywords and, importantly, acquiring new keywords from the received job posting documents 118. The output of the dictionary development engine 124 is a new dictionary 204A for the domain identifier 202. Table 2 sets out example dictionaries 204 and 204A for the domain identifier 202 associated with the functional area "Software Development".
TABLE 2: Example Dictionaries
First dictionary 204            | New dictionary 204A
agile development               | agile development
agile development methodologies | agile development methodologies
agile methodologies             | agile methodologies
agile methodology               | agile methodology
agile software development      | agile software development
aix                             | aix
ajax                            | ajax
algorithm design                | algorithm
amazon web services             | algorithm design
android                         | algorithms
apache                          | amazon web services
Apache ant                      | analytics
Apache maven                    | android
Apache subversion               | animation
Apache Tomcat                   | apache
application design              | Apache ant
application development         | Apache maven
applications development        | Apache subversion
architecture and design         | Apache Tomcat
                                | application design
                                | application development
                                | applications development
                                | architecture and design
                                | architecture design
                                | architectures
                                | asics
                                | atg
                                | audio
                                | automate
                                | automated
                                | automated test
                                | automated testing
                                | automation
                                | avionics
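One way the validate, discard, and acquire behaviour of the maintenance heuristic at 325 might be approximated is sketched below in Python. The specification does not disclose the actual procedure, so everything here is an assumption made for illustration: the use of document frequency over one- to three-word terms, the small stop-word list, and the `keep_ratio` and `add_ratio` cut-offs.

```python
import re
from collections import Counter

# Hypothetical stop-word list used to avoid acquiring terms such as "and" or "with".
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "or", "the", "to", "with"}


def candidate_terms(text, max_words=3):
    """Return the distinct one- to max_words-word terms in a posting."""
    words = re.findall(r"[a-z0-9+#]+", text.lower())
    terms = set()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Skip terms that begin or end with a stop word.
            if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                continue
            terms.add(" ".join(gram))
    return terms


def build_new_dictionary(dictionary, assigned_postings, keep_ratio=0.05, add_ratio=0.5):
    """Build a new dictionary from postings already assigned to one domain identifier."""
    doc_freq = Counter()
    for text in assigned_postings:
        doc_freq.update(candidate_terms(text))
    total = max(len(assigned_postings), 1)
    # Validate: keep existing keywords that still occur often enough.
    kept = {k for k in dictionary if doc_freq[k.lower()] / total >= keep_ratio}
    # Acquire: add terms that occur in a large share of the postings.
    acquired = {t for t, c in doc_freq.items() if c / total >= add_ratio}
    return kept | acquired
```

With postings that mention "automated testing" frequently and never mention "aix", this sketch would add the former and drop the latter, which mirrors the kind of change shown between dictionaries 204 and 204A in Table 2.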
[0050] According to one example heuristic, both the dictionary 204 and the new dictionary 204A are then evaluated against a sample set of job posting documents 118, and if the new dictionary 204A results in a greater number of relevant matches or hits, then the dictionary 204 is replaced with the dictionary 204A. The size of the sample set of job posting documents 118 can be 1000, or 10,000, for example, drawn from a much larger complete set of job posting documents 118.
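The comparison of the dictionary 204 with the new dictionary 204A can be sketched as follows. This is a simplified illustration, not the disclosed implementation: the function names are hypothetical, a "hit" is assumed to be a posting containing at least one keyword as a case-insensitive substring, and relevance judgments are not modelled.

```python
import random


def count_hits(dictionary, postings):
    """Count postings that contain at least one keyword (case-insensitive substring)."""
    keywords = [k.lower() for k in dictionary]
    return sum(any(k in p.lower() for k in keywords) for p in postings)


def choose_dictionary(current, candidate, all_postings, sample_size=1000, seed=0):
    """Return the candidate only if it yields strictly more hits on a random sample."""
    rng = random.Random(seed)  # fixed seed so the comparison is repeatable
    sample = rng.sample(all_postings, min(sample_size, len(all_postings)))
    if count_hits(candidate, sample) > count_hits(current, sample):
        return candidate
    return current
```

Both dictionaries are scored against the same sample, so the outcome does not depend on which postings happened to be drawn for each one.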
[0051] In one example, the heuristic scores each dictionary 204 and 204A at one or more thresholds. The thresholds can vary depending on the domain identifier 202. For example, certain domain identifiers 202 that index relatively few job posting documents 118 feature a 50% threshold so that many more relevant job posting documents 118 are indexed, and to introduce related or serendipitous matches. For other domain identifiers 202, thresholds of 75% can be suitable, providing a balance between indexing too few and too many job posting documents 118. On the other hand, certain domain identifiers 202 that index relatively many job posting documents 118 feature a 90% threshold to ensure dependable matches such that fewer, but more relevant, job posting documents 118 are indexed. In one example, the heuristic scores each dictionary 204 and 204A by determining the "uniqueness" of each keyword occurrence in the sample set of job posting documents 118, in comparison to other dictionaries 204 within the same "layer" of nodes in the tree data structure. As mentioned above, the heuristic can score each dictionary 204 and 204A by comparing the number of job posting documents 118 selected from the sample set. In one example, the heuristic can assign different weights to selected job posting documents 118 that relate to higher popularity scores.
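The threshold and "uniqueness" scoring can be illustrated with a short Python sketch. The weighting formula is an assumption, since the specification does not define uniqueness numerically: here a keyword shared by n dictionaries in the same layer is weighted 1/n, and a posting is selected for a domain identifier when that domain's share of the weighted hits meets the threshold. Keywords are assumed to be stored in lower case.

```python
from collections import Counter


def keyword_uniqueness(layer):
    """layer maps a domain identifier to its keyword set; returns per-keyword weights."""
    counts = Counter(k for keywords in layer.values() for k in keywords)
    return {d: {k: 1.0 / counts[k] for k in keywords} for d, keywords in layer.items()}


def select_postings(domain, layer, postings, threshold):
    """Select postings whose weighted share of hits for `domain` meets the threshold."""
    weights = keyword_uniqueness(layer)
    selected = []
    for text in postings:
        lowered = text.lower()
        per_domain = {
            d: sum(w for k, w in kw.items() if k in lowered)
            for d, kw in weights.items()
        }
        total = sum(per_domain.values())
        if total > 0 and per_domain[domain] / total >= threshold:
            selected.append(text)
    return selected
```

Under these assumptions a 50% threshold admits postings that are shared evenly with a sibling domain, while a 90% threshold admits only postings dominated by the domain's own keywords. The number of postings selected can then serve as the score for the dictionary.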
[0052] Advantageously, by employing the heuristics disclosed herein, each dictionary 204 can be tuned on a periodic basis to reflect changing terminology from received job posting documents 118 using heuristics rather than manual intervention. Through the use of the methods described herein, the dictionaries 204 can be periodically or continually aligned and/or adjusted to the terminology of newly acquired job posting documents 118, improving the matching of documents 119.
[0053] Still with reference to FIG. 3, the method continues at 330 where the server 110 associates, indexes, or ranks, one or more of the acquired job posting documents 118 with one or more of the domain identifiers 202 in the first index using the plurality of dictionaries 204. Step 330 can be executed by the posting analysis and indexing engine 126.
[0054] The posting analysis and indexing engine 126 evaluates each received job posting document 118 against one or more of the dictionaries 204. The result is a ratio of the strength of the relationship of the job posting document 118 to each dictionary 204, which provides a relevance score relative to all other dictionaries 204 (and associated domain identifiers 202). Using the relevance score, the job posting document 118 can be indexed according to one or more of the domain identifiers 202. For example, a job posting document 118 can be indexed according to domain identifiers corresponding to a functional area, a focus area, or an educational program area. In one example, each domain identifier 202 is associated with values representing "high" and "low" relevance parameters. Only job posting documents 118 meeting the parameters would be selected for matching.
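The relevance ratio and the "high" and "low" relevance parameters can be sketched in Python as follows. This is an illustration under stated assumptions: the relevance score is taken to be a dictionary's share of all keyword hits in the posting, the relevance parameters are read as an inclusive range, and the names `relevance_scores`, `index_posting`, and `bounds` are hypothetical.

```python
def relevance_scores(text, dictionaries):
    """Return each dictionary's share of the keyword hits found in the posting."""
    lowered = text.lower()
    hits = {
        domain: sum(1 for k in keywords if k in lowered)
        for domain, keywords in dictionaries.items()
    }
    total = sum(hits.values())
    return {d: (h / total if total else 0.0) for d, h in hits.items()}


def index_posting(posting_id, text, dictionaries, bounds, index):
    """Add the posting to every domain whose (low, high) relevance range it meets."""
    for domain, score in relevance_scores(text, dictionaries).items():
        low, high = bounds[domain]
        if low <= score <= high:
            index.setdefault(domain, []).append(posting_id)
    return index
```

Because the scores are shares of a common total, a posting can be indexed under more than one domain identifier whenever several ranges are met.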
[0055] Advantageously, job posting documents 118 that include plain text files can be indexed according to one or more domain identifiers 202 providing a way to group and organize the job posting documents 118, enabling candidate matching.
[0056] Still with reference to FIG. 3, at 335, the server 110 receives, from a candidate device 104, a candidate profile document 120 and associated domain identifiers 202, and a match request. At 340, the server 110 evaluates job posting documents 118 from an index (the "first" index) matching, or belonging to the same node of, the associated domain identifiers 202 of the candidate profile document 120. At 345, the server 110 transmits, to the candidate device 104, a message containing the matched job posting documents 118. Step 340 can be performed by the document evaluation engine 128.
[0057] In other examples, at 335, the server 110 receives, from an employer device 106, a job posting document 118 and associated domain identifiers 202, and a match request. At 340, the server 110 evaluates candidate profile documents 120 from an index (the "second" index) matching, or belonging to the same node of, the associated domain identifiers 202 of the job posting document 118. At 345, the server 110 transmits, to the employer device 106, a message containing the matched candidate profile documents 120. In one example, the first and second indexes may be the same index.
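The lookup at 340 can be sketched as a simple index query. The sketch below assumes a flat mapping from domain identifier to document identifiers rather than the tree data structure, and the function name `match_documents` is hypothetical. The same function serves both directions: a first index of job posting documents queried with a candidate's domain identifiers, or a second index of candidate profile documents queried with a job posting's domain identifiers.

```python
def match_documents(index, domain_identifiers):
    """Return document ids indexed under any of the given domain identifiers.

    Order of first appearance is preserved and duplicates are removed.
    """
    matched = []
    seen = set()
    for domain in domain_identifiers:
        for document_id in index.get(domain, []):
            if document_id not in seen:
                seen.add(document_id)
                matched.append(document_id)
    return matched
```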
[0058] The document evaluation engine 128 matches a candidate profile document 120 to a job posting document 118. The match request from a candidate device 104 can include one or more candidate preferences, experience, education, or other attributes. The match request from an employer device 106 can include one or more employer requirements or preferences. For example, the match request from an employer device 106 can be limited to candidate profile documents 120 including a particular degree or qualification. As mentioned above, candidate profile documents 120 and job posting documents 118 indexed, associated, or ranked according to the same domain identifier or identifiers 202 (e.g. in an index that is a "tree" data structure) can indicate a match for selected experience, education, or other attributes.
[0059] Advantageously, matching using the domain identifiers 202 provides a common way to evaluate candidate profile documents 120 and job posting documents 118 based on potentially different terminology used in the documents 119.
[0060] The document evaluation engine 128 communicates with the server 110 to provide an interface for access by the client applications 132 and 134. Client application 132 can be provided with an interface to facilitate searching or browsing candidate profile documents 120. Client application 134 can be provided with an interface to facilitate searching or browsing job posting documents 118. Some or all of the matched documents 119 (or portions thereof) may be provided to the devices 104/106, depending on a privacy setting for the document 119. The privacy setting can be determined by the devices 104/106, or based on predetermined business rules by the system 100.
[0061] With reference to FIG. 4, each step of method 400 marked with a "B" corresponds to the like-numbered step of method 300. According to method 400, dictionaries 204 are selected for possible dictionary maintenance at step 320B if they are associated with the received job posting documents 118 at 315B. Accordingly, step 405 classifies the received job posting documents 118 according to one or more domain identifiers 202 and therefore selects only those domain identifiers 202 (and associated dictionaries 204) for maintenance according to the method steps of FIG. 3.
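The selection performed by method 400 can be sketched in a few lines of Python. The classifier is passed in as a callable because the specification does not fix how step 405 classifies a posting; the name `dictionaries_to_review` and the `classify` parameter are hypothetical.

```python
def dictionaries_to_review(received_postings, classify):
    """Return the domain identifiers touched by the newly received postings.

    `classify` is any callable mapping a posting's text to an iterable of
    domain identifiers. Only the dictionaries of the returned domains would
    be considered for maintenance.
    """
    selected = set()
    for text in received_postings:
        selected.update(classify(text))
    return selected
```

Limiting maintenance to these domains avoids rebuilding dictionaries for which no new job posting documents have arrived.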
[0062] Example screenshots on the displays of the candidate device 104 and the employer device 106 when loaded with the application 134 and 132, respectively, to operate in accordance with the present disclosure are depicted in FIG. 5 and FIG. 6 and described with continued reference to FIG. 3 and FIG. 4.
[0063] With reference to FIG. 5, screenshot 500 may be launched by opening the client application 134 on the candidate device 104. A candidate profile document 120 may be populated including data fields acquired from online form 504. A listing 502 of one or more job posting documents 118 is then shown, displaying one or more data fields for each job posting document 118. Other data fields may be displayed according to the content of the job posting document 118. The client application 134 may be launched on the candidate device 104 for exchanging messages with the server 110, including match requests (e.g. receiving match request at 335 or 335B and transmitting matched documents at 345 or 345B).
[0064] With reference to FIG. 6, screenshot 600 may be launched by opening the client application 132 on the employer device 106. A summary of selected candidate profile documents 120 is shown at 602 and graphically at 604, displaying a summary of one or more data fields for the candidate profile document 120. Additional data fields including identifying information for the candidate may be displayed upon, for example, a "connection" request being offered and accepted by the employer device 106 and the candidate device 104, respectively. The client application 132 may be launched on the employer device 106 for exchanging messages with the server 110, including match requests (e.g. receiving match request at 335 or 335B and transmitting matched documents at 345 or 345B). In one example, the client application 132 displays a questionnaire or online form for populating a job posting document 118 (not shown in FIG. 6). The content of the questionnaire can be determined according to the domain identifiers 202 (and associated fields) and/or the dictionaries 204 that can be dynamically modified. According to an alternative example, the client application 132 may receive an electronic data file or plain text from the employer device 106 for populating a job posting document 118.
[0065] According to one example, prior to transmitting matched documents at 345 or 345B to the candidate devices 104, server 110 can check matched job posting documents 118 for currency.
[0066] Furthermore, the system 100 can include an analytics engine 130 (not shown in FIG. 1). The optional analytics engine 130 can generate business intelligence information from the dictionaries 204. For example, the content of the dictionaries 204 can reveal trends or forecasts such as what current skills, education, and training (represented by keywords in the dictionaries 204 that are continually tuned or adjusted) employers are seeking in candidates. Upon periodic refreshes of the database 117, including the job posting documents 118, baseline and trending information for domain identifiers 202, including dictionaries 204, can be generated. For example, trends across various industries such as new keywords listed in the dictionaries 204 can be detected and reported.
[0067] The analytics engine 130 can access the database 117 and allow the client applications 132 and/or 134, or third party users, to access the business intelligence information derived from the dictionaries 204. Advantageously, when client application 134 used by candidate device 104 sends a match request, the analytics engine 130 can send results to show the candidate device 104 how the candidate profile document 120 associated with the candidate device 104 compares with other candidate profile documents 120 interested in or matched for the job posting document 118. The analytics engine 130 can also indicate to candidate devices 104 popular jobs, skills that are valued or sought after, or career paths for a given candidate profile document 120 (referred to as "personal analytics"). Advantageously, the analytics engine can provide the candidate device 104 with job prospects by location or geographical proximity, skills forecasting, education requirements or suggestions, and the like.
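The detection of new keywords across refreshes, as described for the analytics engine 130, can be sketched as a set comparison between two snapshots of the dictionaries. The snapshot format (a mapping from domain identifier to keyword set) and the function name `keyword_trends` are assumptions for illustration.

```python
def keyword_trends(previous, current):
    """Report keywords added to or dropped from each domain's dictionary."""
    report = {}
    for domain in sorted(set(previous) | set(current)):
        old = previous.get(domain, set())
        new = current.get(domain, set())
        report[domain] = {
            "added": sorted(new - old),
            "dropped": sorted(old - new),
        }
    return report
```

Applied to the dictionaries 204 and 204A of Table 2, such a report would list keywords like "analytics" and "avionics" as added.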
[0068] A method includes, in a server having a processor, a memory, and a network interface device, storing, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers, receiving, at the network interface device, a plurality of job posting documents, determining if one of the plurality of dictionaries requires maintenance, and if the determination is affirmative, applying a dynamic dictionary maintenance heuristic to dynamically modify the one of the plurality of dictionaries using the received job posting documents.
[0069] The plurality of job posting documents includes a first job posting document and the method can further include associating the first job posting document with one or more of the domain identifiers in the first index using the plurality of dictionaries.
[0070] The above method can include receiving, from an electronic device, a candidate profile document and associated domain identifiers, and a match request, selecting job posting documents from the first index matching the associated domain identifiers of the candidate profile document, and transmitting, to the electronic device, a message containing the matched job posting documents.
[0071] The storing can further include storing, in the memory, a second index of domain identifiers and associated candidate profile documents. The method can further include receiving a match request from an electronic device, evaluating candidate profile documents from the second index matching the associated domain identifiers of the first job posting document, and transmitting, to the electronic device, a message containing the matched candidate profile documents.
[0072] Each dictionary can include a set of keywords comprising terms or groups of terms, and the determining step can further include: selecting a sample set of job posting documents, for each domain identifier, generating a new dictionary, and scoring the dictionary and the new dictionary against a sample set of job posting documents.
[0073] The scoring includes scoring the dictionary and the new dictionary using one or more threshold scores. The domain identifiers can be selected from one of a focus area, a functional area, and an educational program area. The index can be maintained as a tree structure.
[0074] A system includes a server having a processor and connected to a network interface device and a memory, and the processor can be configured to store, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers, receive, at the network interface device, a plurality of job posting documents, determine if one of the plurality of dictionaries requires maintenance, and if the determination is affirmative, apply a dynamic dictionary maintenance heuristic to dynamically modify the one of the plurality of dictionaries using the received job posting documents.
[0075] The system can include a job posting acquisition engine for acquiring the plurality of job posting documents from one or more data sources over a network, a posting analysis and indexing engine for associating the received job posting documents with one or more of the domain identifiers, a dictionary development engine for applying the dynamic dictionary maintenance heuristics, and/or an analytics engine for deriving business intelligence information from the plurality of dictionaries.
[0076] While a number of exemplary aspects and examples have been discussed above, those of skill in the art will recognize certain modifications, permutations, additions and sub-combinations thereof.

Claims

What is claimed is:
1. A method in a server having a processor, a memory, and a network interface device comprising: storing, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers; receiving, at the network interface device, a plurality of job posting documents; determining if one of the plurality of dictionaries requires maintenance; and if the determination is affirmative, applying a dynamic dictionary maintenance heuristic to dynamically modify the one of the plurality of dictionaries using the received job posting documents.
2. The method of claim 1 wherein the plurality of job posting documents includes a first job posting document, and the method further comprises: associating the first job posting document with one or more of the domain identifiers in the first index using the plurality of dictionaries.
3. The method of claim 2 further comprising the steps of: receiving, from an electronic device, a candidate profile document and associated domain identifiers, and a match request; selecting job posting documents from the first index matching the associated domain identifiers of the candidate profile document; and transmitting, to the electronic device, a message containing the matched job posting documents.
4. The method of claim 2 wherein the storing further comprises: storing, in the memory, a second index of domain identifiers and associated candidate profile documents; and wherein the method further comprises: receiving, from an electronic device, a match request; responsive to the match request, evaluating candidate profile documents from the second index matching the associated domain identifiers of the first job posting document; and transmitting, to the electronic device, a message containing the matched candidate profile documents.
5. The method of claim 1 wherein each dictionary comprises a set of keywords comprising terms or groups of terms, and the determining step further comprises: selecting a sample set of job posting documents; for each domain identifier, generating a new dictionary; and scoring the dictionary and the new dictionary against a sample set of job posting documents.
6. The method of claim 5 wherein the scoring comprises scoring the dictionary and the new dictionary using one or more threshold scores.
7. The method of claim 6 wherein the domain identifiers are selected from one of a focus area, a functional area, and an educational program area.
8. The method of claim 1 wherein the index is maintained as a tree structure.
9. A system comprising: a server having a processor and connected to a network interface device and a memory, wherein the processor is configured to: store, in the memory, a first index of domain identifiers and associated job posting documents, and a plurality of dictionaries, each dictionary associated with one of the domain identifiers; receive, at the network interface device, a plurality of job posting documents; determine if one of the plurality of dictionaries requires maintenance; and if the determination is affirmative, apply a dynamic dictionary maintenance heuristic to dynamically modify the one of the plurality of dictionaries using the received job posting documents.
10. The system of claim 9 further comprising a job posting acquisition engine for acquiring the plurality of job posting documents from one or more data sources over a network.
11. The system of claim 9 further comprising a posting analysis and indexing engine for associating the received job posting documents with one or more of the domain identifiers.
12. The system of claim 9 further comprising a dictionary development engine for applying the dynamic dictionary maintenance heuristics.
13. The system of claim 12 further comprising an analytics engine for deriving business intelligence information from the plurality of dictionaries.
PCT/CA2013/000247 2013-03-15 2013-03-15 Method and system for candidate matching using dynamic dictionary maintenance heuristics WO2014138838A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CA2013/000247 WO2014138838A1 (en) 2013-03-15 2013-03-15 Method and system for candidate matching using dynamic dictionary maintenance heuristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CA2013/000247 WO2014138838A1 (en) 2013-03-15 2013-03-15 Method and system for candidate matching using dynamic dictionary maintenance heuristics

Publications (1)

Publication Number Publication Date
WO2014138838A1 (en) 2014-09-18

Family

ID=51535641

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2013/000247 WO2014138838A1 (en) 2013-03-15 2013-03-15 Method and system for candidate matching using dynamic dictionary maintenance heuristics

Country Status (1)

Country Link
WO (1) WO2014138838A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11269901B2 (en) 2020-01-16 2022-03-08 International Business Machines Corporation Cognitive test advisor facility for identifying test repair actions

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003077151A2 (en) * 2002-03-05 2003-09-18 Siemens Medical Solutions Health Services Corporation A dynamic dictionary and term repository system
CA2771172A1 (en) * 2009-08-25 2011-03-17 Opko Curna, Llc Treatment of 'iq motif containing gtpase activating protein' (iqgap) related diseases by inhibition of natural antisense transcript to iqgap
WO2012060928A1 (en) * 2010-11-05 2012-05-10 Nextgen Datacom, Inc. Method and system for document classification or search using discrete words
CA2771525A1 (en) * 2011-03-18 2012-09-18 Mark Henry Harris Bailey Systems and methods for facilitating recruitment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FRAWLEY ET AL.: "Knowledge Discovery in Databases: An Overview''.", AI MAGAZINE, vol. 13, no. 3, 1992, pages 1 - 14 *
RILOFF: "An empirical study of automated dictionary construction for information extraction in three domains", ARTIFICIAL INTELLIGENCE, vol. 85, no. 1-2, August 1996 (1996-08-01), pages 101 - 134 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13878241

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13878241

Country of ref document: EP

Kind code of ref document: A1