TWI776146B - Resume scoring method and system - Google Patents

Info

Publication number
TWI776146B
Authority
TW
Taiwan
Prior art keywords
model
training
resume
content
text
Prior art date
Application number
TW109114559A
Other languages
Chinese (zh)
Other versions
TW202143122A (en)
Inventor
林志豪
鄒尚軒
羅宏瑜
吳浣青
馮國鈞
林佳妤
邱國豪
曾文忻
宋政隆
王俊權
Original Assignee
中國信託商業銀行股份有限公司
Priority date
Filing date
Publication date
Application filed by 中國信託商業銀行股份有限公司
Priority to TW109114559A
Publication of TW202143122A
Application granted
Publication of TWI776146B

Abstract

A resume scoring system includes a plurality of regular expressions, a vector generation model, and a scoring model. Each regular expression has a predetermined keyword. The vector generation model generates a text vector from text content, and the scoring model generates a score from the results produced by the regular expressions and the text vector produced by the vector generation model. When the system receives a resume of a job applicant, for each regular expression it uses that regular expression to obtain, from the resume, a keyword combination that includes the keyword content corresponding to the predetermined keyword of that regular expression. The system also uses the vector generation model to generate a text vector corresponding to the resume, and then uses the scoring model to generate a score from every keyword combination and the text vector.

Description

Resume scoring method and system

The present invention relates to an office automation method, and in particular to a method for automatically generating a score from a resume.

Today, when a company recruits, the first screening stage is usually based on the resumes submitted by job applicants. When a recruiting company receives tens of thousands of resumes, however, its human resources department typically reviews each one individually to pick out suitable applicants. This is laborious, and differing personal opinions or human oversight can cause resumes to be screened incorrectly.

A new solution is therefore needed to address the excessive time and labor cost of reviewing resumes and the tendency to screen them incorrectly.

Accordingly, an object of the present invention is to provide a resume scoring method that assists in evaluating resumes.

Another object of the present invention is to provide a resume scoring system that assists in evaluating resumes.

The resume scoring method of the present invention is implemented by a server. The resume includes education and work experience content filled in by a job applicant and autobiography content that serves as a personal introduction. The server includes a plurality of regular expressions, a vector generation model, and a scoring model. Each regular expression has a predetermined keyword. The vector generation model generates a text vector from content composed of text, and the scoring model generates a score from the results produced by the regular expressions and the text vector produced by the vector generation model. The resume scoring method includes a step (A), a step (B), and a step (C).

In step (A), when the server receives the resume of the job applicant, for each regular expression the server applies that regular expression to the education and work experience content to obtain the keyword content corresponding to the predetermined keyword of that regular expression. The predetermined keyword of each regular expression and its corresponding keyword content form a corresponding keyword combination.

In step (B), the server uses the vector generation model to generate, from the autobiography content, a text vector corresponding to the autobiography content.

In step (C), the server uses the scoring model to generate, from every keyword combination and the text vector, a score corresponding to the resume.
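Steps (A) to (C) can be read as a three-stage pipeline. The following is a minimal sketch of that control flow only, assuming the extraction, vector generation, and scoring stages are supplied as callables; the function name `score_resume`, its parameter names, and the toy stand-ins are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List


def score_resume(
    experience_text: str,
    autobiography_text: str,
    extract_keywords: Callable[[str], Dict[str, str]],
    embed_text: Callable[[str], List[float]],
    scoring_model: Callable[[Dict[str, str], List[float]], float],
) -> float:
    """Run steps (A), (B), and (C) in order and return the resume score."""
    # Step (A): keyword combinations from the education and work experience content.
    keyword_combinations = extract_keywords(experience_text)
    # Step (B): text vector from the autobiography content.
    text_vector = embed_text(autobiography_text)
    # Step (C): score from every keyword combination and the text vector.
    return scoring_model(keyword_combinations, text_vector)


# Toy stand-ins (assumptions) so the sketch runs without any trained model.
def toy_extract(text: str) -> Dict[str, str]:
    return {"work experience": text}


def toy_embed(text: str) -> List[float]:
    return [float(len(text))]


def toy_model(combos: Dict[str, str], vector: List[float]) -> float:
    return len(combos) + vector[0]
```

Passing the stages in as parameters keeps the sketch independent of any particular regular expression set or model, which the patent leaves open.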

The resume scoring system of the present invention generates a score for a resume and is connected to a management terminal via a communication network. The resume includes education and work experience content filled in by a job applicant and autobiography content that serves as a personal introduction. The resume scoring system includes a communication module, a storage module, and a processing module.

The communication module is connected to the communication network. The storage module stores a plurality of regular expressions, a vector generation model, and a scoring model. Each regular expression has a predetermined keyword. The vector generation model generates a text vector from content composed of text, and the scoring model generates a score from the results produced by the regular expressions and the text vector produced by the vector generation model.

The processing module is electrically connected to the communication module and the storage module. When the processing module receives, through the communication module, the resume of the job applicant from the management terminal, for each regular expression the processing module applies that regular expression to the education and work experience content to obtain the keyword content corresponding to the predetermined keyword of that regular expression, where the predetermined keyword of each regular expression and its corresponding keyword content form a corresponding keyword combination. The processing module also uses the vector generation model to generate, from the autobiography content, a text vector corresponding to the autobiography content, and uses the scoring model to generate, from every keyword combination and the text vector, a score corresponding to the resume.

The effect of the present invention is that, because the server generates a score corresponding to the resume, the relevant staff can screen job applicants by referring to the score. This saves the time and labor cost of reviewing every resume and avoids resumes being screened incorrectly because of human factors.

Before the present invention is described in detail, note that similar elements are denoted by the same reference numerals in the following description.

Referring to FIG. 1, a first embodiment of the resume scoring system of the present invention is implemented by a server 1. The server 1 is connected to a management terminal 2 via a communication network 100 and includes a communication module 11, a storage module 12, and a processing module 13.

The communication module 11 is connected to the management terminal 2 via the communication network 100.

The storage module 12 stores a plurality of regular expressions, a plurality of historical resumes, a vector generation model, and a scoring model. Each regular expression has a predetermined keyword. Each historical resume has autobiography content that is the personal introduction of a historical job applicant, a plurality of keyword combinations related to the education and work experience of that historical job applicant, a text vector related to the autobiography content, and a score related to the historical resume. The vector generation model generates a text vector from content composed of text, and the scoring model generates a score from the results produced by the regular expressions and the text vector produced by the vector generation model.

The processing module 13 is electrically connected to the communication module 11 and the storage module 12 and generates, from a resume of a job applicant, a score corresponding to that resume. The resume includes education and work experience content filled in by the job applicant and autobiography content that is the personal introduction of the job applicant.

Referring to FIGS. 2, 3, and 4, the resume scoring method of the present invention includes a vector generation model training procedure, a scoring model training procedure, and a scoring procedure.

Referring to FIG. 2, the vector generation model training procedure includes steps 301, 302, 303, 304, and 305, and describes how the processing module 13 adjusts and refines the vector generation model according to the historical resumes.

In step 301, the processing module 13 uses a deep learning algorithm to build, from the autobiography contents and text vectors of the historical resumes, a second training model that generates a text vector from content composed of text, for example a model such as BERT or XLNet.

In step 302, the processing module 13 uses the second training model to generate, from the autobiography content of each historical resume, a plurality of training text vectors that respectively correspond to the historical resumes.

In step 303, for each historical resume, the processing module 13 determines whether the similarity between the stored text vector of that historical resume and the corresponding training text vector is greater than a preset threshold. In the first embodiment, the processing module 13 obtains this similarity with a similarity comparison algorithm such as cosine similarity. When the processing module 13 determines that the similarity is not greater than the preset threshold, it adjusts the second training model and performs step 302 again (step 304). When the processing module 13 determines that the similarity is greater than the preset threshold, it adopts the second training model as the vector generation model that generates a text vector from content composed of text (step 305).
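The comparison in step 303 can be illustrated with a small cosine similarity check. This is a minimal sketch assuming the vectors are plain lists of floats of equal length; the helper names `cosine_similarity` and `all_above_threshold` are hypothetical, and the patent does not state a threshold value.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def all_above_threshold(
    stored_vectors: List[Sequence[float]],
    training_vectors: List[Sequence[float]],
    threshold: float,
) -> bool:
    """True when every stored/training pair is more similar than the threshold.

    A False result corresponds to the branch that adjusts the model and
    regenerates the training vectors; True corresponds to accepting the model.
    """
    return all(
        cosine_similarity(s, t) > threshold
        for s, t in zip(stored_vectors, training_vectors)
    )
```

Requiring every historical resume to pass is one reading of "for each historical resume"; an average over all resumes would be another possible design.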

Referring to FIG. 3, the scoring model training procedure includes steps 401, 402, 403, 404, and 405, and describes how the processing module 13 adjusts and refines the scoring model according to the historical resumes.

In step 401, the processing module 13 divides the stored historical resumes into a training subset and a test subset. The number of historical resumes in the training subset may be equal to or different from the number in the test subset.
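A split like the one in step 401 might be sketched as follows. The patent does not say how the records are assigned to each subset, so the random shuffle, the 80/20 default, and the name `split_records` are all assumptions made for illustration.

```python
import random
from typing import Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def split_records(
    records: Iterable[T], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the records reproducibly and cut them into two subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Setting `train_fraction=0.5` gives subsets of equal size, which the description also allows.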

In step 402, the processing module 13 uses a machine learning algorithm, for example logistic regression, random forest, gradient boosting, or an artificial neural network, to build, from the keyword combinations, text vector, and score of each historical resume in the training subset, a first training model that generates a training score from the keyword combinations and the text vector.

In step 403, the processing module 13 determines, from the keyword combinations, text vector, and score of each historical resume in the training subset and the test subset, whether the first training model is overfitted or underfitted. When it determines that the first training model is overfitted or underfitted, the processing module 13 adjusts the first training model and determines again whether the adjusted model is overfitted or underfitted (step 404). When it determines that the first training model is neither overfitted nor underfitted, the processing module 13 adopts the first training model as the scoring model that generates a score from the results produced by the regular expressions and the text vector produced by the vector generation model (step 405).
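The patent does not specify how overfitting or underfitting is detected. One common heuristic, sketched here purely as an assumption, compares the error on the training subset with the error on the test subset; the function name `diagnose_fit` and both limit values are hypothetical.

```python
def diagnose_fit(
    train_error: float,
    test_error: float,
    max_error: float = 0.2,
    max_gap: float = 0.1,
) -> str:
    """Classify a model as 'underfit', 'overfit', or 'ok' from two error values.

    Assumed heuristic, not taken from the patent:
    - high error even on the training subset suggests underfitting;
    - a test error much larger than the training error suggests overfitting.
    """
    if train_error > max_error:
        return "underfit"
    if test_error - train_error > max_gap:
        return "overfit"
    return "ok"
```

Under this reading, a result other than `"ok"` corresponds to the adjust-and-recheck branch, and `"ok"` corresponds to adopting the model as the scoring model.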

Referring to FIG. 4, the scoring procedure includes steps 501, 502, 503, and 504, and describes how the processing module 13 generates the score corresponding to the resume.

In step 501, when the processing module 13 receives, through the communication module 11, the resume of the job applicant from the management terminal 2, for each regular expression the processing module 13 applies that regular expression to the education and work experience content of the resume to obtain the keyword content corresponding to the predetermined keyword of that regular expression. The predetermined keyword of each regular expression and its corresponding keyword content form a corresponding keyword combination. In the first embodiment, the predetermined keywords of the regular expressions are "highest education", "work experience", and "English ability", and the keyword contents obtained by the processing module 13 with these regular expressions are, respectively, "master's degree, Department of Electrical Engineering, National Cheng Kung University", "3 years in the TSMC R&D department", and "TOEIC score of 870". Thus "highest education: master's degree, Department of Electrical Engineering, National Cheng Kung University" is the keyword combination for "highest education", and the keyword combinations for "work experience" and "English ability" are "work experience: 3 years in the TSMC R&D department" and "English ability: TOEIC score of 870".
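The extraction in step 501 can be illustrated with Python's `re` module. This is a minimal sketch assuming each field appears in the resume as a "label: value" line; that layout, the concrete patterns, and the names `KEYWORD_PATTERNS` and `extract_keyword_combinations` are assumptions, since the patent does not disclose the actual regular expressions.

```python
import re
from typing import Dict

# One hypothetical regular expression per predetermined keyword.
# Both the ASCII colon and the full-width colon are accepted as separators.
KEYWORD_PATTERNS = {
    "highest education": re.compile(r"Highest education\s*[:：]\s*(.+)"),
    "work experience": re.compile(r"Work experience\s*[:：]\s*(.+)"),
    "English ability": re.compile(r"English ability\s*[:：]\s*(.+)"),
}


def extract_keyword_combinations(experience_text: str) -> Dict[str, str]:
    """Map each predetermined keyword to the keyword content found for it."""
    combinations = {}
    for keyword, pattern in KEYWORD_PATTERNS.items():
        match = pattern.search(experience_text)
        if match:  # keywords with no match are simply left out
            combinations[keyword] = match.group(1).strip()
    return combinations


SAMPLE_EXPERIENCE = (
    "Highest education: master's degree, Department of Electrical Engineering, "
    "National Cheng Kung University\n"
    "Work experience: 3 years in the TSMC R&D department\n"
    "English ability: TOEIC score of 870\n"
)
```

Because `.` does not match a newline by default, each capture group stops at the end of its own line, so the three fields stay separate.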

In step 502, the processing module 13 uses the vector generation model to generate, from the autobiography content of the resume, a text vector corresponding to the autobiography content. The vector generation model may be the one confirmed in step 305.

In step 503, the processing module 13 uses the scoring model to generate, from every keyword combination and the text vector, a score corresponding to the resume. The scoring model may be the one confirmed in step 405.
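How the keyword combinations and the text vector are combined into model input is not described in the patent. The sketch below shows one possible arrangement, assuming each keyword contributes a presence flag that is concatenated with the text vector and passed through a logistic function (logistic regression is one of the algorithms the description names); `build_features`, `logistic_score`, and all weights are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def build_features(
    keyword_combinations: Dict[str, str],
    text_vector: Sequence[float],
    keywords: Sequence[str],
) -> List[float]:
    """Concatenate one presence flag per predetermined keyword with the text vector."""
    flags = [1.0 if k in keyword_combinations else 0.0 for k in keywords]
    return flags + list(text_vector)


def logistic_score(
    features: Sequence[float], weights: Sequence[float], bias: float
) -> float:
    """Score in (0, 1) from a weighted sum of the features (assumed model form)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

A real system would likely encode the keyword content itself (degree level, years of experience, test score) instead of a bare presence flag; the flag is used here only to keep the example short.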

In step 504, the processing module 13 stores the autobiography content, every keyword combination, the text vector, and the score of the resume as one of the historical resumes. The number of historical resumes thus accumulates, making the training samples for the scoring model and the vector generation model more diverse, and the two models can be refined by continuously tracking and adjusting the training and test samples.
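The record stored in step 504 could be represented with a small data class. This is a minimal sketch assuming an in-memory list stands in for the storage module; the class name `HistoricalRecord`, its field names, and `store_as_history` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class HistoricalRecord:
    """One scored resume kept for later retraining."""
    autobiography: str
    keyword_combinations: Dict[str, str]
    text_vector: List[float]
    score: float


def store_as_history(
    history: List[HistoricalRecord],
    autobiography: str,
    keyword_combinations: Dict[str, str],
    text_vector: Sequence[float],
    score: float,
) -> HistoricalRecord:
    """Append a scored resume to the history and return the stored record."""
    # Copy the mutable inputs so later changes by the caller do not alter the record.
    record = HistoricalRecord(
        autobiography, dict(keyword_combinations), list(text_vector), score
    )
    history.append(record)
    return record
```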

Referring to FIG. 5, a second embodiment of the resume scoring method of the present invention is similar to the first embodiment, and the features they share are not repeated. The second embodiment differs in that the storage module 12 also stores a summary generation model that generates a summary from content composed of text, and each historical resume stored in the storage module 12 also has a summary of the autobiography content of that historical resume. Before step 504, the method further includes a summary procedure that describes how the processing module 13 generates, from the autobiography content of the resume, a summary corresponding to that autobiography content. The summary procedure includes steps 601, 602, 603, 604, 605, and 606.

In step 601, the processing module 13 uses a deep learning algorithm, for example a recurrent neural network, to build, from the autobiography contents and summaries of the historical resumes, a training model that generates a summary from content composed of text, for example a model such as GPT-2 or Transformer.

In step 602, the processing module 13 uses the training model to generate, from the autobiography content of each historical resume, a plurality of training summaries that respectively correspond to the historical resumes.

In step 603, for each historical resume, the processing module 13 determines whether the similarity between the stored summary of that historical resume and the corresponding training summary is greater than another preset threshold. In this embodiment, the processing module 13 obtains this similarity with a similarity comparison algorithm such as cosine similarity. When it determines that the similarity is not greater than the other preset threshold, the processing module 13 adjusts the training model and performs step 602 again (step 604). When it determines that the similarity is greater than the other preset threshold, the processing module 13 confirms the training model as the summary generation model that generates a summary from content composed of text (step 605).
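Steps 602 to 605 form a generate, compare, adjust loop. The sketch below shows only that control flow, assuming the model, the adjustment, the generation, and the similarity measure are supplied as callables; the name `train_until_similar` and the `max_rounds` cap are assumptions (the patent does not bound the number of iterations).

```python
from typing import Callable, Sequence, TypeVar

M = TypeVar("M")  # model type
X = TypeVar("X")  # input type (e.g. autobiography content)
Y = TypeVar("Y")  # output type (e.g. summary)


def train_until_similar(
    model: M,
    adjust: Callable[[M], M],
    generate: Callable[[M, X], Y],
    inputs: Sequence[X],
    references: Sequence[Y],
    similarity: Callable[[Y, Y], float],
    threshold: float,
    max_rounds: int = 10,
) -> M:
    """Adjust the model until every output is similar enough to its reference."""
    for _ in range(max_rounds):
        # Generate one training output per historical input.
        outputs = [generate(model, x) for x in inputs]
        # Compare each output with the stored reference.
        if all(similarity(o, r) > threshold for o, r in zip(outputs, references)):
            return model  # accept the model
        model = adjust(model)  # otherwise adjust and try again
    raise RuntimeError("similarity threshold not reached within max_rounds")
```

The same loop shape also fits the vector generation model training procedure, with text vectors in place of summaries.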

In step 606, when the processing module 13 receives, through the communication module 11, the resume of the job applicant from the management terminal 2, the processing module 13 uses the summary generation model to generate, from the autobiography content of the resume, a summary corresponding to the autobiography content.

In addition, in step 504 of the second embodiment, the processing module 13 stores the autobiography content, every keyword combination, the text vector, the score, and the summary of the resume as one of the historical resumes.

In summary, in the resume scoring method of the present invention, the processing module 13 generates a score corresponding to the resume by means of the regular expressions, the vector generation model, and the scoring model. This saves the time and labor the relevant staff would spend evaluating and screening every resume. Because the processing module 13 generates every score uniformly from the scoring model, it also avoids problems caused by differing personal opinions or human error, such as resumes being screened or scored incorrectly. Furthermore, the processing module 13 generates a summary of the autobiography content by means of the summary generation model, so that a reviewer who is interested in resumes with certain scores can quickly learn more about the job applicant from the summary without spending the time and labor to read the entire resume. The object of the present invention is therefore achieved.

The foregoing describes only embodiments of the present invention and shall not limit the scope of its implementation. All simple equivalent changes and modifications made according to the claims and the specification of the present invention remain within the scope covered by this patent.

1: server
100: communication network
11: communication module
12: storage module
13: processing module
2: management terminal
301~305: steps
401~405: steps
501~504: steps
601~606: steps

Other features and effects of the present invention will be clearly presented in the embodiments described with reference to the drawings, in which:
FIG. 1 is a block diagram illustrating a first embodiment of the resume scoring system of the present invention connected to a management terminal via a communication network;
FIG. 2 is a flowchart illustrating a vector generation model training procedure of the resume scoring method of the present invention as executed by the first embodiment;
FIG. 3 is a flowchart illustrating a scoring model training procedure of the resume scoring method of the present invention as executed by the first embodiment;
FIG. 4 is a flowchart illustrating a scoring procedure of the resume scoring method of the present invention as executed by the first embodiment; and
FIG. 5 is a flowchart illustrating a summary procedure of the resume scoring method of the present invention as executed by a second embodiment.


Claims (8)

1. A resume scoring method implemented by a server, the resume including education and work experience content filled in by a job applicant and autobiography content serving as a personal introduction, the server including a plurality of regular expressions, a vector generation model, and a scoring model, each regular expression having a predetermined keyword, the vector generation model being used to generate a text vector from content composed of text, and the scoring model generating a score from the results produced by the regular expressions and the text vector produced by the vector generation model, the resume scoring method comprising the following steps:
(A) when the server receives the resume of the job applicant, for each regular expression, the server applies that regular expression to the education and work experience content to obtain the keyword content corresponding to the predetermined keyword of that regular expression, wherein the predetermined keyword of each regular expression and its corresponding keyword content form a corresponding keyword combination;
(B) the server uses the vector generation model to generate, from the autobiography content, a text vector corresponding to the autobiography content;
(C) the server uses the scoring model to generate, from every keyword combination and the text vector, a score corresponding to the resume;
(D) the server stores the autobiography content, every keyword combination, the text vector, and the score of the resume as a historical resume;
(E) the server divides the stored plurality of historical resumes into a training subset and a test subset;
(F) the server uses a machine learning algorithm to build, from the keyword combinations, the text vector, and the score of each historical resume in the training subset, a first training model that generates a training score from the keyword combinations and the text vector;
(G) the server determines, from the keyword combinations, the text vector, and the score of each historical resume in the training subset and the test subset, whether the first training model is overfitted or underfitted;
(H) when the server determines that the first training model is overfitted or underfitted, the server adjusts the first training model and performs step (G) again; and
(I) when the server determines that the first training model is neither overfitted nor underfitted, the server uses the first training model as the scoring model when performing step (C).
2. The resume scoring method according to claim 1, further comprising the following steps:
(J) the server uses a deep learning algorithm to build, from the autobiography contents and text vectors of a plurality of historical resumes, a second training model that generates a text vector from content composed of text;
(K) the server uses the second training model to generate, from the autobiography content of each historical resume, a plurality of training text vectors that respectively correspond to the historical resumes;
(L) for each historical resume, the server determines whether the similarity between the text vector of that historical resume and the corresponding training text vector is greater than a preset threshold;
(M) when the server determines that the similarity is not greater than the preset threshold, the server adjusts the second training model and performs step (K) again; and
(N) when the server determines that the similarity is greater than the preset threshold, the server uses the second training model as the vector generation model when performing step (B).

3. The resume scoring method according to claim 1, wherein the server further includes a summary generation model that generates a summary from content composed of text, the method further comprising the following step after step (C):
(O) the server uses the summary generation model to generate, from the autobiography content, a summary corresponding to the autobiography content.
4. The resume scoring method according to claim 3, wherein the server further includes a plurality of historical resumes, each historical resume including autobiography content and a summary of that historical resume, the method further comprising the following steps before step (O):
(P) the server uses a deep learning algorithm to build, from the autobiography contents and summaries of the historical resumes, a training model that generates a summary from content composed of text;
(Q) the server uses the training model to generate, from the autobiography content of each historical resume, a plurality of training summaries that respectively correspond to the historical resumes;
(R) for each historical resume, the server determines whether the similarity between the summary of that historical resume and the corresponding training summary is greater than another preset threshold;
(S) when the server determines that the similarity is not greater than the other preset threshold, the server adjusts the training model and performs step (Q) again; and
(T) when the server determines that the similarity is greater than the other preset threshold, the server confirms the training model as the summary generation model.
A resume scoring system for generating a score for a resume, the system being connected to a management terminal via a communication network, the resume including educational and work experience content filled in by a job seeker and autobiographical content in which the job seeker introduces himself or herself, the resume scoring system comprising:
a communication module connected to the communication network;
a storage module storing a plurality of regular expressions, a vector generation model, and a scoring model, each regular expression having a predetermined keyword, the vector generation model being used to generate a text vector from content composed of text, and the scoring model generating a score based on the results produced by the regular expressions and the text vector produced by the vector generation model; and
a processing module electrically connected to the communication module and the storage module;
wherein, when the processing module receives, through the communication module, the resume relating to the job seeker from the management terminal, then for each regular expression the processing module applies the regular expression to the educational and work experience content to obtain the keyword content corresponding to the predetermined keyword of that regular expression, the predetermined keyword of each regular expression and its corresponding keyword content constituting a corresponding keyword combination; the processing module generates, based on the autobiographical content and using the vector generation model, a text vector corresponding to the autobiographical content; and the processing module generates, based on each keyword combination and the text vector and using the scoring model, a score corresponding to the resume;
the processing module stores the autobiographical content of the resume, each keyword combination, the text vector, and the score in the storage module as a historical resume; and
the processing module divides the plurality of historical resumes stored in the storage module into a training subset and a test subset; builds, using a machine learning algorithm and based on the keyword combinations, the text vector, and the score of each historical resume in the training subset, a first training model that generates a training score from keyword combinations and a text vector; determines, based on the keyword combinations, the text vector, and the score of each historical resume in the training subset and the test subset, whether the first training model is overfitted or underfitted; when the first training model is determined to be overfitted or underfitted, adjusts the first training model and determines again whether it is overfitted or underfitted; and when the first training model is determined to be neither overfitted nor underfitted, uses the first training model as the scoring model.
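Two parts of this system claim lend themselves to a short sketch: building keyword combinations with regular expressions, and checking the first training model for overfitting or underfitting. Everything specific below is an assumption for illustration only: the patterns, the English field labels, the use of mean absolute error, and the `max_error` and `max_gap` limits are hypothetical, since the claim specifies none of them.

```python
import re
from statistics import mean

# Hypothetical regular expressions, each tied to one predetermined keyword.
PATTERNS = {
    "Education": re.compile(r"Education:\s*([^;]+)"),
    "Experience": re.compile(r"Experience:\s*(\d+)\s*years"),
}


def extract_keyword_combinations(experience_text):
    """Map each predetermined keyword to the keyword content its regular
    expression finds in the experience text (None when nothing matches)."""
    combinations = {}
    for keyword, pattern in PATTERNS.items():
        match = pattern.search(experience_text)
        combinations[keyword] = match.group(1).strip() if match else None
    return combinations


def mean_abs_error(predict, records):
    """Average absolute gap between predicted and stored scores."""
    return mean(abs(predict(features) - score) for features, score in records)


def diagnose_fit(predict, train, test, max_error=5.0, max_gap=3.0):
    """Return 'underfit', 'overfit', or 'ok' for a scoring function.

    Assumed criteria: a high training error means underfitting; a test
    error much larger than the training error means overfitting.
    """
    train_error = mean_abs_error(predict, train)
    test_error = mean_abs_error(predict, test)
    if train_error > max_error:
        return "underfit"
    if test_error - train_error > max_gap:
        return "overfit"
    return "ok"
```

In the claimed flow, any result other than `"ok"` would trigger an adjustment of the first training model followed by another diagnosis, and a model that passes would be stored as the scoring model.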
The resume scoring system according to claim 5, wherein the processing module, based on the autobiographical content and the text vectors of the plurality of historical resumes stored in the storage module, uses a deep learning algorithm to build a second training model for generating a text vector from content composed of text; generates, based on the autobiographical content of each historical resume and using the second training model, a plurality of training text vectors respectively corresponding to the historical resumes; determines, for each historical resume, whether the similarity between the text vector and the training text vector corresponding to that historical resume is greater than a preset threshold; when the similarity is determined not to be greater than the preset threshold, adjusts the second training model, regenerates the training text vectors, and determines the similarity again; and when the similarity is determined to be greater than the preset threshold, uses the second training model as the vector generation model.

The resume scoring system according to claim 5, wherein the storage module further stores a summary generation model for generating a summary from content composed of text, and the processing module, based on the autobiographical content, uses the summary generation model to generate a summary corresponding to the autobiographical content.
The resume scoring system according to claim 7, wherein the storage module further stores a plurality of historical resumes, each historical resume including an autobiographical content and a summary of that historical resume, and wherein the processing module, based on the autobiographical contents and the summaries, uses a deep learning algorithm to build a training model that generates a summary from content composed of text; generates, based on the autobiographical content of each historical resume and using the training model, a plurality of training summaries respectively corresponding to the historical resumes; determines, for each historical resume, whether the similarity between the summary and the training summary corresponding to that historical resume is greater than another preset threshold; when the similarity is determined not to be greater than the other preset threshold, adjusts the training model, regenerates the training summaries, and determines the similarity again; and when the similarity is determined to be greater than the other preset threshold, uses the training model as the summary generation model.
TW109114559A 2020-04-30 2020-04-30 Resume scoring method and system TWI776146B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109114559A TWI776146B (en) 2020-04-30 2020-04-30 Resume scoring method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW109114559A TWI776146B (en) 2020-04-30 2020-04-30 Resume scoring method and system

Publications (2)

Publication Number Publication Date
TW202143122A TW202143122A (en) 2021-11-16
TWI776146B TWI776146B (en) 2022-09-01

Family

ID=80783110

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109114559A TWI776146B (en) 2020-04-30 2020-04-30 Resume scoring method and system

Country Status (1)

Country Link
TW (1) TWI776146B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120246168A1 (en) * 2011-03-21 2012-09-27 Tata Consultancy Services Limited System and method for contextual resume search and retrieval based on information derived from the resume repository
CN106663230A (en) * 2014-03-14 2017-05-10 庞德·萨利尔 Career analytics platform
TW201741948A (en) * 2016-03-30 2017-12-01 Alibaba Group Services Ltd Resume assessment method and apparatus
CN107590133A (en) * 2017-10-24 2018-01-16 武汉理工大学 The method and system that position vacant based on semanteme matches with job seeker resume
CN108874928A (en) * 2018-05-31 2018-11-23 平安科技(深圳)有限公司 Resume data information analyzing and processing method, device, equipment and storage medium
CN109345198A (en) * 2018-09-17 2019-02-15 平安科技(深圳)有限公司 Resume selection method, apparatus, computer equipment and storage medium
CN109636337A (en) * 2018-12-12 2019-04-16 北京唐冠天朗科技开发有限公司 A kind of talent's base construction method and electronic equipment based on big data
CN109948120A (en) * 2019-04-02 2019-06-28 深圳市前海欢雀科技有限公司 A kind of resume analytic method based on dualization
CN110516261A (en) * 2019-09-03 2019-11-29 北京字节跳动网络技术有限公司 Resume appraisal procedure, device, electronic equipment and computer storage medium
TWM599954U (en) * 2020-04-30 2020-08-11 中國信託商業銀行股份有限公司 Resume scoring system

Also Published As

Publication number Publication date
TW202143122A (en) 2021-11-16

Similar Documents

Publication Publication Date Title
Washizaki et al. Studying software engineering patterns for designing machine learning systems
Mohamad et al. Educational data mining: A review
Zhang et al. A hybrid bug triage algorithm for developer recommendation
CN110111010B (en) Question and answer task allocation method and system based on crowd-sourcing network
Isljamovic et al. Predicting students’ academic performance using artificial neural network: a case study from faculty of organizational sciences
EP4134900A3 (en) Method and apparatus for recommending content, method and apparatus for training ranking model, device, and storage medium
CN112596731A (en) Programming teaching system and method integrating intelligent education
Ganeshan et al. An intelligent student advising system using collaborative filtering
CN113283488B (en) Learning behavior-based cognitive diagnosis method and system
Imran et al. A framework to provide personalization in learning management systems through a recommender system approach
Kuznetsov et al. Reducing cold start problems in educational recommender systems
TWI776146B (en) Resume scoring method and system
TWM599954U (en) Resume scoring system
Oriakhi et al. Design-by-analogy using the wordtree method and an automated wordtree generating tool
Huang et al. Semantic web enabled personalized recommendation for learning paths and experiences
Do et al. A fuzzy approach to detect spammer groups
Celikkan et al. A consolidated approach for design pattern recommendation
Monett et al. Using AI to Understand Intelligence: The Search for a Catalog of Intelligence Capabilities.
Wang et al. A semi-supervised learning method for Q-matrix specification under the DINA and DINO model with independent structure
Vitoria et al. A review of mathematical modelling in educational research in Indonesia
JP7052438B2 (en) Training data generation method, training data generation program and data structure
de Almeida Lima et al. Educational Data Mining: A Hybrid Approach to Predicting Academic Performance of Students.
Suriyasat et al. A Comparison of Machine Learning and Neural Network Algorithms for an Automated Thai Essay Scoring
Cheng et al. Exercise recommendation method combining neuralcd and neumf models
Bertović et al. Using Moodle Test Scores to Predict Success in an Online Course

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent