KR20210050155A

KR20210050155A - Document screening system using artificial intelligence

Info

Publication number: KR20210050155A
Application number: KR1020190134426A
Authority: KR
Inventors: 최윤종; 김응도; 노민영; 안종찬; 조영선; 유상기; 최윤석
Original assignee: 씨제이올리브네트웍스 주식회사
Priority date: 2019-10-28
Filing date: 2019-10-28
Publication date: 2021-05-07
Also published as: KR102358991B1

Abstract

The present invention relates to an application document guide method using artificial intelligence. In the method, a document screening server divides a document created by an applicant into sentences; parses each sentence of the document; acquires context information of the document based on the parsing; determines the importance of each sentence using the context information and information on the job for which the applicant applied; and learns the sentence-level importance as corrected by an evaluator in the document whose importance has been determined. According to the present invention, the context of the sentences in a self-introduction letter can be understood, important sentences can be highlighted, sentences that were not highlighted but that an evaluator regards as important can be selected and fed back to the document screening system, and the evaluator's opinion can be reflected in subsequent selections of important sentences. Accordingly, the same evaluation criteria can be applied to every applicant.

Description

Document screening system using artificial intelligence {DOCUMENT SCREENING SYSTEM USING ARTIFICIAL INTELLIGENCE}

The present invention relates to a document screening system using artificial intelligence. More specifically, it relates to a document screening system that uses natural language processing to determine the context of an application written by an applicant and marks important words and sentences on the application, so that an evaluator can reach a decision on the application quickly, thereby minimizing the personnel cost of evaluating a large number of applications.

In general, when a company conducts its regular recruitment of new employees, a small number of practitioners/evaluators spend a great deal of time reading and scoring several hundred application documents. Accordingly, technologies that analyze application documents using artificial intelligence (AI) have recently been disclosed in order to make the evaluators' work more efficient.

Representative approaches analyze an applicant's application documents with language processing techniques to check whether the content fits the company, whether plagiarism is likely, and whether the company's competency keywords are present. However, these techniques do no more than confirm whether the application documents contain plagiarized sentences or words specified by each company.

In other words, the most accurate way to determine from the application documents whether an applicant is the right person for a company is for an evaluator to read the documents directly and grasp their context. However, because the same evaluator cannot review the documents in every recruitment round, the evaluation scores lack consistency.

Therefore, there is a need for a technology that can provide a guide for document evaluation so that an evaluator can read application documents quickly and make a decision, and the present invention addresses this need.

Korean Patent Publication No. 10-2016-0137135 (2015.05.22.)

A technical problem to be solved by the present invention is to provide a guide for an evaluator's assessment of an application document (hereinafter, "self-introduction letter") by understanding the context of its sentences and highlighting the regions judged to be important sentences.

Another technical problem to be solved by the present invention is to provide a document screening system that reflects an evaluator's feedback on a self-introduction letter (e.g., sentences that the evaluator marked as important in addition to the highlighted sentences) when providing a guide for another self-introduction letter.

The technical problems of the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

An application document guide method using artificial intelligence according to an embodiment of the present invention includes: separating, by a document screening server, a document written by an applicant into sentences; analyzing, by the document screening server, the syntax of each of the plurality of sentences constituting the document; acquiring, by the document screening server, context information of the document based on the analyzed syntax; determining, by the document screening server, the importance of each of the plurality of sentences using the context information and information on the job for which the applicant applied; and learning, by the document screening server, the sentence-level importance as corrected by an evaluator in the document whose importance has been determined.

According to an embodiment, the method may further include, before the separating into sentences, vectorizing words in the document or words in previously stored documents using morpheme analysis, named-entity analysis, or semantic role analysis.

According to an embodiment, a sentence in the previously stored documents consists of consecutive phrases, and the method may further include combining the words that represent the core of each of the consecutive phrases, extracting the combination in the form of a first feature vector, and generating a word-vector conversion model based on the first feature vector.

According to an embodiment, the method may further include, before the separating into sentences, dividing the previously stored documents into sentences, extracting the context information of the consecutive phrases in each sentence in the form of a second feature vector, and generating an importance prediction model based on the second feature vector.

According to an embodiment, the generating of the importance prediction model may further include determining the importance of each sentence in the previously stored documents based on the degree of agreement among evaluators on the same sentence or on the document screening evaluation scores of the previously stored documents, and reflecting the determined sentence-level importance in the importance prediction model.
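As a non-limiting illustration, the degree of agreement among evaluators could be turned into a per-sentence importance label as sketched below. The function name `agreement_importance`, the input layout (one list of marked sentence indices per evaluator), and the use of a simple fraction are assumptions made for this sketch; the specification does not prescribe a particular formula.

```python
from typing import List


def agreement_importance(marks_by_evaluator: List[List[int]],
                         num_sentences: int) -> List[float]:
    """Return, per sentence, the fraction of evaluators who marked it important.

    marks_by_evaluator holds one list per evaluator with the indices of the
    sentences that evaluator marked as important (hypothetical layout).
    """
    counts = [0] * num_sentences
    for marks in marks_by_evaluator:
        for index in set(marks):  # count each evaluator at most once per sentence
            counts[index] += 1
    total = len(marks_by_evaluator)
    return [count / total if total else 0.0 for count in counts]


# Three evaluators reviewing a four-sentence document.
labels = agreement_importance([[0, 2], [0], [0, 1, 2]], num_sentences=4)
```

Labels of this kind could then serve as training targets for the importance prediction model.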

According to an embodiment, the job information includes core elements designated for each job, and the determining of the importance of each of the plurality of sentences may further include assigning an importance score to each of the plurality of sentences according to its relevance to the core elements and the context information, and designating, according to the assigned importance score, a highlight mark on the corresponding sentence in the document so that the evaluator can recognize it.

According to an embodiment, the vectorizing may include classifying the previously stored documents by job, extracting from the classified documents the words whose frequency of occurrence is greater than or equal to a reference value, and vectorizing those words so that they have similar vector values.

A document screening server according to another embodiment of the present invention includes: a sentence separation unit that separates a document written by an applicant into sentences; a word-vector conversion unit that vectorizes words in the document; a syntax analysis unit that analyzes the syntax of each of the plurality of sentences constituting the document; a context acquisition unit that acquires context information of the document based on the analyzed syntax; and a sentence importance determining unit that determines the importance of each of the plurality of sentences using the job information provided by the applicant and the context information, and learns the sentence-level importance as corrected by an evaluator in the document whose importance has been determined.

According to an embodiment, the word-vector conversion unit may vectorize words in the document or words in previously stored documents using morpheme analysis, named-entity analysis, or semantic role analysis.

According to an embodiment, a sentence in the previously stored documents consists of consecutive phrases, and the syntax analysis unit may combine the words that represent the core of each of the consecutive phrases, extract the combination in the form of a first feature vector, and generate a word-vector conversion model based on the first feature vector.

According to an embodiment, the context acquisition unit may divide the previously stored documents into sentences, extract the context information of the consecutive phrases in each sentence in the form of a second feature vector, and generate an importance prediction model based on the second feature vector.

According to an embodiment, the sentence importance determining unit may determine the importance of each sentence in the previously stored documents based on the degree of agreement among evaluators on the same sentence or on the document screening evaluation scores of the previously stored documents, and may reflect the determined sentence-level importance in the importance prediction model.

According to an embodiment, the job information includes core elements designated for each job, and the sentence importance determining unit may assign an importance score to each of the plurality of sentences according to its relevance to the core elements and the context information, and may designate, according to the assigned importance score, a highlight mark on the corresponding sentence in the document so that the evaluator can recognize it.

According to an embodiment, the word-vector conversion unit may classify the previously stored documents by job, extract from the classified documents the words whose frequency of occurrence is greater than or equal to a reference value, and vectorize those words so that they have similar vector values.

According to the present invention, the context of the sentences in a self-introduction letter is understood and the important sentences are highlighted before the letter is provided to the evaluator, which shortens the time the evaluator needs to read the letter and reach a judgment and reduces personnel costs.

Furthermore, because the document screening system provides only highlighting of the important sentences in the self-introduction letter, and the evaluator who decides pass/fail uses it only as a reference, recruitment can proceed more fairly and efficiently.

Furthermore, sentences that the evaluator considers important, other than the highlighted ones, can be selected and fed back to the document screening system so that the evaluator's opinion is reflected in later selections of important sentences in self-introduction letters. Because this feedback is reflected in the learning model, the same evaluation criteria can be applied to all applicants.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

FIG. 1 is a diagram schematically showing the configuration of a document screening system according to a first embodiment of the present invention.
FIG. 2 is a diagram schematically showing the configuration of a document screening system according to a second embodiment of the present invention.
FIG. 3 is a diagram schematically showing the configuration of a document screening server according to the second embodiment of the present invention.
FIG. 4 is a diagram for explaining the functions performed by each component in the configuration of the document screening system according to the second embodiment of the present invention.
FIG. 5 is a diagram illustrating the flow of a method in which the document screening server 200 according to the second embodiment of the present invention learns important words and infers and guides important sentences in an application document.
FIGS. 6A and 6B are diagrams showing examples of a document for which the document screening server according to the second embodiment of the present invention has completed the guide.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms; these embodiments are provided only so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains are fully informed of the scope of the invention, which is defined only by the scope of the claims. The same reference numerals refer to the same elements throughout the specification.

Unless otherwise defined, all terms (including technical and scientific terms) used in this specification have the meanings commonly understood by those of ordinary skill in the art to which the present invention pertains. Terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are explicitly and specifically defined. The terms used in this specification are for describing the embodiments and are not intended to limit the present invention. In this specification, the singular form also includes the plural form unless the wording specifically states otherwise.

As used in this specification, "comprises" and/or "comprising" do not exclude the presence or addition of one or more other components, steps, and/or operations in addition to the stated components, steps, and/or operations.

Hereinafter, the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram schematically showing the configuration of a document screening system 1000 according to a first embodiment of the present invention, and FIG. 2 is a diagram schematically showing the configuration of a document screening system according to a second embodiment of the present invention.

Referring to FIG. 1, the document screening system 1000 may include a legacy system and an artificial intelligence system (AI system). The legacy system is the existing system that each company uses for hiring new employees; it may receive and store various application documents from applicants, and it may receive the evaluators' assessments of the application documents through an evaluator terminal 300.

The artificial intelligence system is a system that can provide a guide so that an evaluator can easily read and review several hundred application documents, and it may consist of a learning server 10, a service server 20, and a document screening model 30.

Specifically, the learning server 10 may 1) receive existing application document evaluation results from a recruitment server 100 and, by learning from them, 2) generate and store the document screening model 30 for providing application document guides.

The service server 20 may 3) apply the document screening model 30 generated by the learning server 10 to an artificial intelligence engine and, 4) upon receiving application documents from the recruitment server 100, use the artificial intelligence engine to 5) generate a guide for the application documents (evaluation result) and provide it to the recruitment server 100.

Meanwhile, depending on the embodiment, a single document screening server 200 may perform the functions of both the learning server 10 and the service server 20, and a configuration including the document screening server 200 may be as shown in FIG. 2.

The overall configuration of the document screening system 1000 according to the first and second embodiments of the present invention has been described so far. The following describes in detail a method of providing a guide for application documents so that an evaluator can read them and make a decision easily.

FIG. 3 is a diagram schematically showing the configuration of the document screening server 200 according to the second embodiment of the present invention, and FIG. 4 is a diagram for explaining the functions performed by each component in the configuration of the document screening system according to the second embodiment of the present invention.

Referring to FIGS. 3 and 4, the document screening server 200 may include a sentence management unit 210, a word-vector conversion unit 220, a syntax analysis unit 230, a context acquisition unit 240, a sentence importance determining unit 250, and a memory 260, and may of course also include additional components for achieving the object of the present invention.

Meanwhile, the sentence management unit 210, the word-vector conversion unit 220, the syntax analysis unit 230, the context acquisition unit 240, the sentence importance determining unit 250, and the memory 260 are merely temporary or virtual components classified according to the functions of the document screening server 200 according to an embodiment of the present invention. A function performed by one component may also be performed by another component, and a single component may perform the functions of all the components. For example, one processor (not shown) may perform all the functions of the sentence management unit 210, the word-vector conversion unit 220, the syntax analysis unit 230, the context acquisition unit 240, the sentence importance determining unit 250, and the memory 260.

The sentence management unit 210 may separate an application document written by an applicant (hereinafter, "document") into sentences and refine the contents of the document into a form suitable for data processing. Before dividing all the sentences in the document into individual sentences, however, the sentence management unit 210 may identify the items present in the document and keep the sentences of the same category grouped together for each item. For example, the sentence management unit 210 may separate an applicant's document into a plurality of sentences as shown in [Table 1].

[Table 1]
Before separation: I think the experience of the POC project in which I developed an image recognition system is the most meaningful. The most important thing for smooth project progress is to understand user requirements accurately. Together with my team members, I analyzed the solution manual and visited the developers to improve my understanding through questions. Through a multifaceted way of thinking and efficient logic design, I developed a highly complete system.
After separation:
Sentence 1) I think the experience of the POC project in which I developed an image recognition system is the most meaningful.
Sentence 2) The most important thing for smooth project progress is to understand user requirements accurately.
Sentence 3) Together with my team members, I analyzed the solution manual and visited the developers to improve my understanding through questions.
Sentence 4) Through a multifaceted way of thinking and efficient logic design, I developed a highly complete system.
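As a non-limiting illustration, the separation shown in [Table 1] could be sketched as follows. The sketch assumes that a sentence ends with '.', '!' or '?' followed by whitespace, which is a simplification; the function names and the item name `project_experience` are hypothetical and do not appear in the specification.

```python
import re
from typing import Dict, List


def split_sentences(text: str) -> List[str]:
    """Split text at whitespace that follows '.', '!' or '?' (simplifying assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def split_by_item(items: Dict[str, str]) -> Dict[str, List[str]]:
    """Keep the sentences grouped under the application item they came from."""
    return {name: split_sentences(body) for name, body in items.items()}


application = {
    "project_experience": (  # hypothetical item name
        "I think the experience of the POC project in which I developed an "
        "image recognition system is the most meaningful. "
        "The most important thing for smooth project progress is to "
        "understand user requirements accurately. "
        "Together with my team members, I analyzed the solution manual and "
        "visited the developers to improve my understanding through questions. "
        "Through a multifaceted way of thinking and efficient logic design, "
        "I developed a highly complete system."
    ),
}
sentences_by_item = split_by_item(application)
```

Grouping by item before splitting keeps sentences of the same category together, as described for the sentence management unit 210.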

The word-vector conversion unit 220 may convert the words in a large volume of previously stored documents (documents whose evaluation by evaluators has been completed) into vectors through a word-embedding model (e.g., Doc2Vec). For example, the word-vector conversion unit 220 may vectorize each word using morpheme analysis, named-entity analysis, or semantic role analysis; in the case of morpheme analysis, only the content morphemes that carry meaning may be vectorized.

In addition, the word-vector conversion unit 220 may extract the words whose frequency of occurrence in the documents is greater than or equal to a reference value and vectorize them so that they have similar vector values, which may later be used to train the word-vector conversion model. Here, the reference value for the frequency of occurrence may be calculated from the frequency of occurrence of each word relative to the total number of sentences.
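As a non-limiting illustration, selecting the words whose frequency of occurrence relative to the total number of sentences meets a reference value could be sketched as follows. Whitespace tokenization, counting a word at most once per sentence, and the default reference value of 0.5 are assumptions made for this sketch only.

```python
from collections import Counter
from typing import Dict, List


def frequent_words(sentences: List[str], min_ratio: float = 0.5) -> Dict[str, float]:
    """Return words whose sentence-level frequency ratio is at least min_ratio.

    The ratio is (number of sentences containing the word) / (number of sentences).
    """
    counts: Counter = Counter()
    for sentence in sentences:
        counts.update(set(sentence.lower().split()))  # once per sentence
    total = len(sentences)
    return {
        word: count / total
        for word, count in counts.items()
        if count / total >= min_ratio
    }


corpus = [
    "system design review",
    "system test plan",
    "user system requirements",
    "user interview",
]
selected = frequent_words(corpus, min_ratio=0.5)
```

In a real deployment the tokens would more likely come from a morpheme analyzer than from whitespace splitting.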

Meanwhile, based on the feature vectors extracted by the syntax analysis unit 230 and the context acquisition unit 240, the word-vector conversion unit 220 may vectorize the job information provided by the applicant and each word so that they have similar values.

The syntax analysis unit 230 may analyze the syntax of each of the plurality of sentences constituting a document. Specifically, the syntax analysis unit 230 may split one sentence at phrase and clause boundaries, or it may fix a word count in advance and split the sentence into several phrases of that size. Here, this analysis (Phrase Analysis) is a technique that splits a long sentence into several phrases or clauses for analysis, and various natural language processing algorithms and machine learning algorithms may be used for it.
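As a non-limiting illustration, the simpler of the two splitting options, treating every fixed number of words as one phrase, could be sketched as follows. The function name and the default of three words per phrase are assumptions made for this sketch.

```python
from typing import List


def split_into_phrases(sentence: str, words_per_phrase: int = 3) -> List[str]:
    """Split a sentence into consecutive phrases of a fixed word count.

    The last phrase may be shorter when the word count does not divide evenly.
    """
    words = sentence.split()
    return [
        " ".join(words[start:start + words_per_phrase])
        for start in range(0, len(words), words_per_phrase)
    ]


phrases = split_into_phrases("We analyzed the solution manual with the team")
```

Splitting at actual phrase and clause boundaries would instead require a syntactic parser, which this sketch does not attempt.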

That is, in order to select important words in a document written by an applicant, the syntax analysis unit 230 may first learn important words from the evaluated documents previously stored in the memory 260. To this end, the syntax analysis unit 230 may use a convolutional neural network (CNN) to extract, in the form of a feature vector, k word combinations from n word vectors that influence the determination of sentence importance.
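As a non-limiting illustration, the idea of reducing n word vectors to a k-dimensional feature vector with convolution filters could be sketched in plain Python as follows. The specification does not describe the network architecture, so the use of one filter per output feature, ReLU activation, and max-over-time pooling are assumptions, and the filter values here are fixed toy numbers rather than learned weights.

```python
from typing import List

Vector = List[float]


def conv1d_max_pool(word_vectors: List[Vector],
                    filters: List[List[Vector]]) -> List[float]:
    """Slide each filter over consecutive word vectors and keep its best response.

    word_vectors: n vectors of dimension d.
    filters: k filters, each spanning w consecutive words (w vectors of dimension d).
    Returns a k-dimensional feature vector.
    """
    features = []
    for filt in filters:
        width = len(filt)
        best = 0.0  # ReLU floor; also the value when the sentence is too short
        for start in range(len(word_vectors) - width + 1):
            window = word_vectors[start:start + width]
            activation = sum(
                x * f
                for vec, fvec in zip(window, filt)
                for x, f in zip(vec, fvec)
            )
            best = max(best, activation)  # max-over-time pooling
        features.append(best)
    return features


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # n = 3 toy word vectors
toy_filters = [
    [[1.0, 0.0], [0.0, 1.0]],                    # responds to pattern A then B
    [[-1.0, 0.0], [0.0, 1.0]],                   # a second toy pattern
]
feature_vector = conv1d_max_pool(vectors, toy_filters)  # k = 2 features
```

Each output feature indicates how strongly some run of adjacent words matches one filter, which is one way a word combination could be summarized.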

In other words, the syntax analysis unit 230 may extract, from the previously stored documents, the word combination (key phrase) that represents the core of each sentence in the form of a first feature vector, and may generate a word-vector conversion model based on the first feature vector.

Meanwhile, the importance prediction model may be continuously updated by the evaluator during the document evaluation process.
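As a non-limiting illustration, one way such a continuous update could work is a single gradient step on a logistic model each time an evaluator corrects a sentence's importance. The logistic model, the learning rate, and the function names are assumptions made for this sketch; the specification does not state how the update is computed.

```python
import math
from typing import List


def predict(weights: List[float], features: List[float]) -> float:
    """Predicted probability that a sentence is important (logistic model)."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def update_with_feedback(weights: List[float], features: List[float],
                         evaluator_label: float, lr: float = 0.5) -> List[float]:
    """Apply one gradient step toward the evaluator's corrected label (0.0 or 1.0)."""
    error = evaluator_label - predict(weights, features)
    return [w + lr * error * x for w, x in zip(weights, features)]


weights = [0.0, 0.0]
sentence_features = [1.0, 0.5]   # toy feature vector for one sentence
before = predict(weights, sentence_features)
# The evaluator marks this sentence as important although it was not highlighted.
weights = update_with_feedback(weights, sentence_features, evaluator_label=1.0)
after = predict(weights, sentence_features)
```

After the update, the same sentence features receive a higher predicted importance, so similar sentences in later documents are more likely to be highlighted.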

In this way, the syntax analysis unit 230 may combine the important words within a sentence to generate one key phrase. For example, key phrases such as those in [Table 2] may be generated for each of the sentences separated in [Table 1].

[Table 2]
Sentence 1: I think the experience of the POC project in which I developed an image recognition system is the most meaningful.
Key phrase: Image recognition system
Sentence 2: The most important thing for smooth project progress is to understand user requirements accurately.
Key phrase: Understanding user requirements
Sentence 3: Together with my team members, I analyzed the solution manual and visited the developers to improve my understanding through questions.
Key phrase: Solution manual analysis
Sentence 4: Through a multifaceted way of thinking and efficient logic design, I developed a highly complete system.
Key phrase: Efficient logic design

The context acquisition unit 240 may acquire context information of the document based on the syntax analyzed by the syntax analysis unit 230, and may use a recurrent neural network (RNN) model for this purpose. More specifically, the context acquisition unit 240 may analyze the context by combining the key phrases in order, deriving the context of the second key phrase from the first key phrase and presenting the conclusion of the context through the last key phrase.

For example, when the key phrases appear in the order 'image recognition system' - 'understanding user requirements' - 'solution manual analysis' - 'efficient logic design', the context acquisition unit 240 may acquire the context information 'system operation job suitability context'.
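As a non-limiting illustration, the ordered combination of key phrases can be sketched with a minimal recurrent update in which each key phrase vector is folded into a running hidden state, so that the final state depends on both the phrases and their order. The function `rnn_context`, the two-dimensional phrase vectors, and the weight matrices `W_IN` and `W_REC` are hypothetical values chosen for this sketch, not trained parameters.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def rnn_context(phrase_vectors: List[Vector], w_in: Matrix, w_rec: Matrix) -> Vector:
    """Fold key phrase vectors, in order, into one hidden state (simple Elman-style RNN)."""
    hidden = [0.0] * len(w_rec)
    for x in phrase_vectors:
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))              # current phrase
                + sum(w_rec[i][j] * hidden[j] for j in range(len(hidden)))  # earlier phrases
            )
            for i in range(len(hidden))
        ]
    return hidden


# Hypothetical weights and key phrase vectors, for illustration only.
W_IN = [[0.5, -0.5], [0.3, 0.8]]
W_REC = [[0.1, 0.2], [-0.3, 0.4]]
phrases = [[1.0, 0.0], [0.0, 1.0]]

forward = rnn_context(phrases, W_IN, W_REC)
backward = rnn_context(list(reversed(phrases)), W_IN, W_REC)
print(forward, backward)  # the two states differ because the order differs
```

The final hidden state plays the role of the context summary; mapping it to a label such as 'system operation job suitability context' would require an additional trained classifier, which is omitted here.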

In addition, the context acquisition unit 240 may extract the consecutive phrases (context information) in the form of a second feature vector and generate a word-vector conversion model based on the second feature vector. The word-vector conversion model may be used when converting the words in a newly received document into word vectors.

The sentence importance determining unit 250 may predict and determine the importance of each of the plurality of sentences using the information on the job entered by the applicant and the context information of the document. Here, the job information may include key elements designated for each job (e.g., challenge, positivity, global competency). Specifically, the sentence importance determining unit 250 may assign an importance score to each of the plurality of sentences according to its relevance to the key elements and the context information, and may designate a different marking for each sentence according to the assigned score. That is, the score assigned to each sentence according to its relevance may serve as the indicator of its importance.

In other words, the sentence importance determining unit 250 may determine the importance of each sentence comprehensively by 1) assigning an importance score according to whether each of the plurality of sentences contains a word related to a key element, and 2) assigning an importance score to the key sentences that express the context information derived from the plurality of sentences.
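As a non-limiting illustration, the two-part scoring can be sketched as one point for containing a word related to a key element and one point for containing a word that expresses the derived context. The function `importance_scores`, the simple substring matching, and the word lists are assumptions made for this sketch; the disclosed server would rely on the learned importance prediction model rather than on literal string matching.

```python
from typing import List


def importance_scores(
    sentences: List[str],
    key_element_words: List[str],
    context_words: List[str],
) -> List[int]:
    """Score each sentence: +1 for a key element word, +1 for a context word."""
    scores = []
    for sentence in sentences:
        lowered = sentence.lower()
        element_score = 1 if any(w in lowered for w in key_element_words) else 0
        context_score = 1 if any(w in lowered for w in context_words) else 0
        scores.append(element_score + context_score)
    return scores


sentences = [
    "I developed an image recognition system.",
    "We analyzed the solution manual.",
    "The weather was nice.",
]
# Hypothetical word lists for a 'system development' job.
scores = importance_scores(sentences, ["develop"], ["system", "solution"])
print(scores)  # [2, 1, 0]
```

The resulting values line up with a 2 / 1 / 0 scale (important, related, other), although the exact weighting of the two parts is a design choice that the description leaves open.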

Referring to [Table 2], when the context information 'system operation job suitability context' has been generated by combining the key phrases and the keyword of the job for which the applicant applied is 'system development', the sentence importance determining unit 250 may determine the importance of each sentence as shown in [Table 3]. For reference, when importance is expressed as a score, the meaning of each number may be noted, for example 2: important sentence, 1: related sentence, 0: other.

[Table 3]
Sentence example | Importance
I think my experience on the POC project in which we developed an image recognition system is the most meaningful. | 2
The most important thing for a project to run smoothly is to understand the user requirements accurately. | 0
Together with a senior colleague in the PM role, I carefully analyzed the requirements definition document and distributed tasks to implement the features. | 0
However, the interns were at a loss because of their limited understanding of the in-house solution and the unfamiliar development environment. | 0
Together with my team members, I analyzed the solution manual and visited the developers to ask questions in order to improve our understanding. | 1
Through multifaceted thinking and efficient logic design, we developed a highly polished system. | 0

As shown in [Table 3], the sentence importance determining unit 250 may apply visual effects so that the evaluator can recognize the importance score assigned to each sentence in the document. In addition to the importance score, it may designate a different marking for each sentence, such as highlighting in different colors or bold text.

Furthermore, the sentence importance determining unit 250 may apply a visual effect to every sentence whose importance score is higher than a reference value. The reference value may be set to a fixed value by an administrator, or may be set to the value of the top n% (n being a natural number of 1 or more) among all sentences in the document.
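As a non-limiting illustration, the two ways of setting the reference value can be sketched as follows. The function `sentences_to_highlight` and its parameters are hypothetical, and the top n% case is interpreted here as selecting the highest-scoring n% of sentences (rounded up, at least one), which is an assumption about how ties and rounding are handled.

```python
import math
from typing import List, Optional


def sentences_to_highlight(
    scores: List[float],
    fixed_threshold: Optional[float] = None,
    top_percent: Optional[int] = None,
) -> List[int]:
    """Return indices of sentences to emphasize, by fixed threshold or by top n%."""
    if fixed_threshold is not None:
        # Administrator-defined reference value: keep strictly higher scores.
        return [i for i, s in enumerate(scores) if s > fixed_threshold]
    if top_percent is None or top_percent < 1:
        raise ValueError("set fixed_threshold or a top_percent of 1 or more")
    count = max(1, math.ceil(len(scores) * top_percent / 100))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:count])


scores = [2, 0, 0, 0, 1, 0]
print(sentences_to_highlight(scores, fixed_threshold=0))  # [0, 4]
print(sentences_to_highlight(scores, top_percent=10))     # [0]
```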

In addition, the sentence importance determining unit 250 may determine the importance of each sentence in the stored documents based on the degree of agreement among evaluators' opinions on the same sentence, or on the document screening evaluation score of the stored document, and may reflect this in the importance prediction model generated by the context acquisition unit 240.
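As a non-limiting illustration, the degree of agreement among evaluators can be sketched as the fraction of evaluators who marked a given sentence as important. The function `agreement_scores` and the boolean mark layout (one list per evaluator, one entry per sentence) are assumptions made for this sketch; the description does not specify how agreement is quantified.

```python
from typing import List


def agreement_scores(marks_by_evaluator: List[List[bool]]) -> List[float]:
    """For each sentence, return the share of evaluators who marked it as important."""
    if not marks_by_evaluator:
        raise ValueError("at least one evaluator is required")
    n_evaluators = len(marks_by_evaluator)
    n_sentences = len(marks_by_evaluator[0])
    return [
        sum(1 for marks in marks_by_evaluator if marks[i]) / n_evaluators
        for i in range(n_sentences)
    ]


# Two hypothetical evaluators reviewing the same three sentences.
marks = [
    [True, False, True],
    [True, False, False],
]
agreement = agreement_scores(marks)
print(agreement)  # [1.0, 0.0, 0.5]
```

A value near 1 could then be used as a strong training label and a value near 0.5 as a weaker one, which is one way of reflecting evaluator consensus in the importance prediction model.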

The memory 260 may store the application documents received by the recruitment server 100 and the data of application documents whose review by an evaluator has been completed, and may store the document screening model 30 trained on that basis.

So far, the configuration of the document screening server 200 according to an embodiment of the present invention has been described. According to the present invention, the document screening server 200 analyzes a document written by an applicant in detail, sentence by sentence and word by word, and identifies the context of the document, so that it can highlight the sentences that match the keywords set by each company. In addition, when the time taken to review applicants' documents using the document screening server 200 was measured, the average time required to evaluate one document was significantly shorter than before, which can improve review efficiency and reduce personnel costs. As the number of applications learned by the document screening server 200 continues to grow, the time required to evaluate one document may decrease further.

Hereinafter, a method of guiding application documents using the document screening server 200 described above will be described. Here, guiding means assigning specific markings to the important sentences in a document so that the evaluator can read the application document more quickly.

FIG. 5 is a flowchart of a method in which the document screening server 200 according to the second embodiment of the present invention learns important words and infers and guides important sentences in an application document.

<Learning unit>

Referring to FIG. 5, the document screening server 200 learns the context information of evaluated, previously stored documents so that words with similar meanings have similar vector values, and generates a vector conversion model for identifying context (S101). Specifically, the document screening server 200 may classify the evaluated documents according to the similarity of their contexts, extract the words corresponding to a context and the corresponding documents as word vectors and document vectors with similar values, and generate a word-vector conversion model on that basis. For example, when the classified context information is 'customer marketing job suitability context' and 'customer service job suitability context', the document screening server 200 may convert the corresponding words 'requirements'-'understand' and 'needs'-'identify' into vectors with similar values. That is, during learning, job documents such as 'marketing' and 'customer service' may be learned by averaging them together with the word vectors, so the document vectors take similar values in the same way as the word vectors, which makes it possible to grasp the relative meaning between documents. In addition, to convert words into vectors, the words and documents may be morphologically analyzed so that only content morphemes, which carry meaning, are vectorized.
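As a non-limiting illustration, forming a document vector by averaging the vectors of content morphemes only can be sketched as follows. The function `document_vector`, the part-of-speech tags, and the toy embedding table are assumptions made for this sketch; a real system would obtain the tags from a morphological analyzer and the embeddings from the trained word-vector conversion model.

```python
from typing import Dict, FrozenSet, List, Tuple

Vector = List[float]


def document_vector(
    tagged_tokens: List[Tuple[str, str]],
    embeddings: Dict[str, Vector],
    content_tags: FrozenSet[str] = frozenset({"NOUN", "VERB", "ADJ"}),
) -> Vector:
    """Average the vectors of content tokens; function words are skipped."""
    vectors = [
        embeddings[word]
        for word, tag in tagged_tokens
        if tag in content_tags and word in embeddings
    ]
    if not vectors:
        raise ValueError("no content tokens with known vectors")
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


# Hypothetical embeddings and pre-tagged tokens.
embeddings = {"needs": [1.0, 0.0], "identify": [0.0, 1.0], "of": [5.0, 5.0]}
tokens = [("needs", "NOUN"), ("of", "ADP"), ("identify", "VERB")]
doc_vec = document_vector(tokens, embeddings)
print(doc_vec)  # [0.5, 0.5]; the function word "of" does not contribute
```

Because the document vector lives in the same space as the word vectors, documents with similar content words end up with similar vectors, which is what allows relative meaning between documents to be compared.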

After step S101, the document screening server 200 learns the importance of the evaluated documents and generates an importance prediction model for predicting importance (S102). Here, an evaluated document is a document containing sentences that an evaluator regarded as important and marked separately, or a document in which important sentences have been designated by the document screening server 200 and an evaluator, and it may include job-specific keywords. Importance is an index that quantifies how well each of the plurality of sentences constituting the document fits the job for which the applicant applied. For example, depending on the evaluator's settings, the least suitable sentence may be tagged 0 and the most suitable sentence 5. The score range is not limited to this example, and various ranges may be set.

Specifically, the document screening server 200 may combine the words representing the core of each phrase in sentences that have been assigned importance scores (generating key phrases) and extract them in the form of a first feature vector, extract the context information of consecutive phrases in the sentences in the form of a second feature vector, and generate the importance prediction model on that basis. Because the importance scores given by evaluators are reflected, the model may be continuously updated by evaluators during the subsequent evaluation of applicants' documents.

For example, if the evaluator reads a marking that the document screening server 200 emphasized as important and judges it to be wrong, the evaluator may cancel the highlighting. Conversely, if the evaluator judges that a part the document screening server 200 passed over as unimportant is in fact important, the evaluator may select it as an important sentence. Because feedback from evaluators is continuously received and reflected in the training data in this way, the fairness of the evaluation process can be increased.

As described above, the document screening server 200 converts the important words in important sentences into word vectors using a word embedding model, and then uses neural network models (CNN, RNN) to convert them into the first feature vector and the second feature vector according to their characteristics, thereby generating models for identifying the context of a document written by an applicant and for rating importance.

So far, the method by which the document screening server 200 learns the importance of evaluated sentences and generates and trains the models for identifying context has been described. According to the present invention, because the opinion of each individual evaluator can be reflected in the learning model, the same evaluation criteria can be applied to all applicants.

<Inference unit>

The document screening server 200 separates the document written by the applicant into sentence units (S110). This step refines the content of the document into a form that the document screening server 200 can process as data, and every sentence in the document may be split into an individual sentence. Before splitting into sentences, however, the items present in the document may be identified, and sentences belonging to the same item may be grouped together.
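As a non-limiting illustration, step S110 can be sketched as grouping text by item and then splitting each item into sentences. The function `split_document`, the item names, and the punctuation-based splitting rule are assumptions made for this sketch; Korean application documents would typically need a language-specific sentence splitter.

```python
import re
from typing import Dict, List


def split_document(items: Dict[str, str]) -> Dict[str, List[str]]:
    """Split the text of each item into sentences, keeping sentences grouped by item."""
    result = {}
    for item, text in items.items():
        # Split on whitespace that follows sentence-ending punctuation.
        parts = re.split(r"(?<=[.!?])\s+", text.strip())
        result[item] = [p.strip() for p in parts if p.strip()]
    return result


document = {
    "Motivation": "I like systems. I built one!",
    "Experience": "It worked.",
}
sentences_by_item = split_document(document)
print(sentences_by_item)
```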

After step S110, the document screening server 200 analyzes the syntax of each of the plurality of sentences constituting the document (S120). That is, the document screening server 200 may divide the document into word groups, a unit smaller than paragraphs and sentences, and then perform syntax analysis.

Accordingly, the document screening server 200 may split each of the plurality of sentences into words or morphemes and select important words from the resulting word groups to generate the key phrase of each sentence. Here, the important words may be selected using the trained importance prediction model.

More specifically, the document screening server 200 may extract combinations of important words in the form of feature vectors and input them into the importance prediction model to select the important words. Because the document screening server 200 inputs vector values into the importance prediction model, it may use measures such as Euclidean distance or cosine similarity to measure the similarity between vectors.
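As a non-limiting illustration, the two similarity measures named above can be written as follows. The function names are chosen for this sketch. Note that a smaller Euclidean distance means more similar vectors, whereas a larger cosine similarity means more similar vectors.

```python
import math
from typing import List

Vector = List[float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    """Straight-line distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # 0.0
```

Cosine similarity ignores vector length and compares direction only, which is often preferred for word vectors, while Euclidean distance is also sensitive to magnitude.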

After step S120, the document screening server 200 acquires context information of the document based on the analyzed syntax (S130). To acquire the context information, the document screening server 200 may use a recurrent neural network (RNN) model and may analyze the context of the document by combining the key phrases in order.

For example, when the key phrases appear in the order 'image recognition system' - 'understanding user requirements' - 'solution manual analysis' - 'efficient logic design', the document screening server 200 may acquire the context information 'system operation job suitability context'.

After step S130, the document screening server 200 determines the importance of each of the plurality of sentences using the information on the job for which the applicant applied and the context information of the document (S140). Here, the job information may include key elements designated for each job (e.g., challenge, positivity). Specifically, the document screening server 200 may assign an importance score to each of the plurality of sentences according to its relevance to the key elements and the context information, and may designate a different marking for each sentence according to the assigned score. That is, the score assigned to each sentence according to its relevance may serve as the indicator of its importance.

In other words, the importance of each sentence may be determined comprehensively by 1) assigning an importance score according to whether each of the plurality of sentences contains a word related to a key element, and 2) assigning an importance score to the key sentences that express the context information derived from the plurality of sentences.

After step S140, the document screening server 200 learns the sentence-specific importance corrected by the evaluator in the document for which importance has been determined (S150). For example, the document screening server 200 may receive from the evaluator a selection of additional sentences beyond those highlighted as important, and may receive a cancellation of sentences that were highlighted as important. This may in turn be fed back to the learning unit as material for learning application documents.
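As a non-limiting illustration, turning the evaluator's corrections in step S150 into training labels can be sketched as follows. The function `corrected_labels`, the index-based feedback sets, and the use of 2 as the score for an important sentence are assumptions made for this sketch.

```python
from typing import List, Set, Tuple


def corrected_labels(
    sentences: List[str],
    predicted: List[int],
    cancelled: Set[int],
    added: Set[int],
    important_score: int = 2,
) -> List[Tuple[str, int]]:
    """Apply evaluator feedback to predicted scores and return (sentence, label) pairs."""
    labels = []
    for i, (sentence, score) in enumerate(zip(sentences, predicted)):
        if i in cancelled:        # evaluator removed the highlight
            score = 0
        elif i in added:          # evaluator selected an unhighlighted sentence
            score = important_score
        labels.append((sentence, score))
    return labels


training_examples = corrected_labels(
    ["A.", "B.", "C."],
    predicted=[2, 0, 1],
    cancelled={0},
    added={1},
)
print(training_examples)  # [('A.', 0), ('B.', 2), ('C.', 1)]
```

The corrected pairs can then be appended to the stored evaluated documents so that the next training run of the importance prediction model reflects the evaluator's judgment.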

Meanwhile, the document screening server 200 may apply visual effects so that the evaluator can recognize the score assigned to each sentence in the document.

In this regard, FIGS. 6A and 6B illustrate documents for which the document screening server 200 according to the second embodiment of the present invention has completed guiding. Referring to FIGS. 6A and 6B, the important sentences in the document are hatched so that the evaluator can recognize them at a glance. In addition, the key elements designated for each job (motivation for applying, global competency) are displayed, helping the evaluator read them and evaluate the document easily.

Meanwhile, a sentence may be displayed with its importance score in addition to the hatching, and may be shown in bold, underlined, or highlighted in different colors depending on its importance.
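As a non-limiting illustration, assigning different markings by importance can be sketched by wrapping each sentence in markup chosen from its score. The function `render`, the use of HTML tags, and the mapping of score 2 to a highlight and score 1 to an underline are assumptions made for this sketch; the description only requires that the markings differ by importance.

```python
import html
from typing import List

# Hypothetical mapping from importance score to (opening tag, closing tag).
MARKUP = {2: ("<mark>", "</mark>"), 1: ("<u>", "</u>")}


def render(sentences: List[str], scores: List[int], show_score: bool = False) -> str:
    """Return the document as one string with importance-dependent markings."""
    parts = []
    for sentence, score in zip(sentences, scores):
        open_tag, close_tag = MARKUP.get(score, ("", ""))
        text = html.escape(sentence)          # keep applicant text from breaking markup
        if show_score and score > 0:
            text += f" [{score}]"             # optional numeric score next to the sentence
        parts.append(f"{open_tag}{text}{close_tag}")
    return " ".join(parts)


rendered = render(["A & B.", "C.", "D."], [2, 1, 0])
print(rendered)  # <mark>A &amp; B.</mark> <u>C.</u> D.
```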

So far, the method by which the document screening server 200 according to an embodiment of the present invention guides the evaluator to the important sentences in a document has been described. According to the present invention, rather than reviewing a document numerically, for example by checking for plagiarism, the presence of company-specific keywords, or the number of keywords, the server identifies the context and reviews whether the document was written in line with that context, so the evaluator who actually makes the pass or fail decision can make that decision more easily.

Meanwhile, the present invention may also be implemented as computer-readable code on a computer-readable recording medium. Computer-readable recording media include all storage media, such as magnetic storage media and optically readable media. It is also possible to record on a recording medium the data format of the messages used in the present invention.

Although embodiments of the present invention have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be implemented in other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

1000: document screening system
100: recruitment server
200: document screening server
10: learning server
20: service server
30: document screening model
210: sentence separating unit
220: word-vector conversion unit
230: syntax analysis unit
240: context acquisition unit
250: sentence importance determining unit
260: memory
300: evaluator terminal

Claims

A method of guiding application documents using artificial intelligence, the method comprising:
separating, by a document screening server, a document written by an applicant into sentence units;
analyzing, by the document screening server, the syntax of each of a plurality of sentences constituting the document;
acquiring, by the document screening server, context information of the document based on the analyzed syntax;
determining, by the document screening server, the importance of each of the plurality of sentences using information on the job for which the applicant applied and the context information; and
learning, by the document screening server, the sentence-specific importance corrected by an evaluator in the document for which the importance has been determined.

The method of claim 1, further comprising,
before the separating into sentence units,
vectorizing words in the document or words in a pre-stored document using morpheme analysis, named entity analysis, or semantic role analysis.

The method of claim 2,
wherein sentences in the pre-stored document consist of consecutive phrases,
the method further comprising, before the separating into sentence units,
combining words representing the core of each of the consecutive phrases, extracting the combined words in the form of a first feature vector, and generating a word-vector conversion model based on the first feature vector.

The method of claim 2, further comprising,
before the separating into sentence units,
dividing the pre-stored document into sentences, extracting context information of consecutive phrases in each sentence in the form of a second feature vector, and generating an importance prediction model based on the second feature vector.

The method of claim 4,
wherein the generating of the importance prediction model further comprises
determining the importance of each sentence in the pre-stored document based on the degree of agreement among evaluators' opinions on the same sentence or on the document screening evaluation score of the pre-stored document, and reflecting the determined sentence-specific importance in the importance prediction model.

The method of claim 1,
wherein the information on the job includes key elements designated for each job, and
wherein the determining of the importance of each of the plurality of sentences further comprises
assigning an importance score to each of the plurality of sentences according to its relevance to the key elements and the context information, and designating, for each of the plurality of sentences according to the assigned importance score, a marking that emphasizes the sentence in the document so that the evaluator can recognize it.

The method of claim 2,
wherein the vectorizing comprises
classifying the pre-stored document by job, extracting words whose frequency of occurrence in the classified document is greater than or equal to a reference value, and vectorizing the extracted words so that they have similar vector values.

A document screening server comprising:
a sentence separating unit that separates a document written by an applicant into sentence units;
a word-vector conversion unit that vectorizes words in the document;
a syntax analysis unit that analyzes the syntax of each of a plurality of sentences constituting the document;
a context acquisition unit that acquires context information of the document based on the analyzed syntax; and
a sentence importance determining unit that determines the importance of each of the plurality of sentences using the job information entered by the applicant and the context information, and learns the sentence-specific importance corrected by an evaluator in the document for which the importance has been determined.

The document screening server of claim 8,
wherein the word-vector conversion unit
vectorizes words in the document or words in a pre-stored document using morpheme analysis, named entity analysis, or semantic role analysis.

The document screening server of claim 9,
wherein sentences in the pre-stored document consist of consecutive phrases, and
wherein the syntax analysis unit
combines words representing the core of each of the consecutive phrases, extracts the combined words in the form of a first feature vector, and generates a word-vector conversion model based on the first feature vector.

The document screening server of claim 9,
wherein the context acquisition unit divides the pre-stored document into sentences, extracts context information of consecutive phrases in each sentence in the form of a second feature vector, and generates an importance prediction model based on the second feature vector.

The document screening server of claim 11,
wherein the sentence importance determining unit
determines the importance of each sentence in the pre-stored document based on the degree of agreement among evaluators' opinions on the same sentence or on the document screening evaluation score of the pre-stored document, and reflects the determined sentence-specific importance in the importance prediction model.

The document screening server of claim 8,
wherein the information on the job includes key elements designated for each job, and
wherein the sentence importance determining unit
assigns an importance score to each of the plurality of sentences according to its relevance to the key elements and the context information, and designates, for each of the plurality of sentences according to the assigned importance score, a marking that emphasizes the sentence in the document so that the evaluator can recognize it.

The document screening server of claim 9,
wherein the word-vector conversion unit
classifies the pre-stored document by job, extracts words whose frequency of occurrence in the classified document is greater than or equal to a reference value, and vectorizes the extracted words so that they have similar vector values.