KR102156757B1

KR102156757B1 - System, method, and computer program for credit evaluation using artificial neural network

Info

Publication number: KR102156757B1
Application number: KR1020190119250A
Authority: KR
Inventors: 엄성민
Original assignee: (주)데이터리퍼블릭
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2020-09-16
Also published as: WO2021060593A1

Abstract

Provided is a credit evaluation system. The credit evaluation system includes: a missing data prediction unit that uses machine learning to receive an input data set including values for at least some of at least one item related to a financial service application and at least one item about an applicant, generate, from an input vector corresponding to the input data set, a hidden vector of a lower dimension than the input vector, correct the hidden vector based on a hidden vector distribution, and output, from the corrected hidden vector, a reconstruction vector including a predicted value for a missing value among the values for the at least one item of the input data set; and a credit evaluation processing unit that performs a credit evaluation on the application based on the reconstruction vector. The credit evaluation system can thereby improve the accuracy of credit evaluation by accurately predicting missing data used for the evaluation.

Description

System, method, and computer program for credit evaluation using artificial neural network

Embodiments of the present disclosure relate to a credit evaluation system, a credit evaluation method, and a computer program using deep learning.

To provide various financial services such as loans and card issuance, the importance of evaluating an applicant's credit is emphasized, and various systems are provided for this purpose. Because many factors must be considered in credit evaluation, many types of data are used for it. In addition, various principles and methods for performing credit evaluation based on diverse data are being studied. There is thus a need for a system and method capable of automatically processing such vast amounts of data and producing accurate credit evaluation results.

Embodiments of the present disclosure provide a system, a method, and a computer program that accurately predict missing data among the input information for credit evaluation and perform the credit evaluation.

In addition, embodiments of the present disclosure improve the accuracy of credit evaluation by using machine learning to accurately predict missing data used for the credit evaluation.

In addition, embodiments of the present disclosure provide a credit evaluation method that takes the maturity of the requested service into account by using predicted values of economic indicators.

According to an aspect of an embodiment of the present disclosure, there is provided a credit evaluation system including: a missing data prediction unit that, using machine learning, receives an input data set including values for at least some of at least one item related to a financial service application and at least one item about an applicant, generates, from an input vector corresponding to the input data set, a hidden vector having a lower dimension than the input vector, corrects the hidden vector based on a hidden vector distribution, and outputs, from the corrected hidden vector, a reconstruction vector including a predicted value for a missing value among the values for the at least one item of the input data set; and a credit evaluation processing unit that performs a credit evaluation on the application based on the reconstruction vector.

In addition, according to an embodiment, the machine learning processor may include: an input layer that receives the input data set and generates the input vector; an encoding layer that outputs the hidden vector having a lower dimension than the input vector; a correction layer that corrects the hidden vector based on a learned hidden vector distribution; a decoding layer that receives the corrected hidden vector and generates the reconstruction vector; and an output layer that outputs the reconstruction vector.
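The layer sequence described here (encoding, correction, decoding) can be illustrated with a minimal sketch. This is not the patented implementation: the weights are hand-picked rather than learned, the observed/missing `mask` is a hypothetical helper, and the correction step is only an assumed form (clamping each hidden coordinate to within `k` standard deviations of a learned mean), since the text does not specify how the correction is computed.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def encode(x, w_enc):
    """Encoding layer: project the input to a lower-dimensional hidden vector."""
    return [math.tanh(h) for h in matvec(w_enc, x)]

def correct(hidden, mean, std, k=2.0):
    """Correction layer (ASSUMED form): clamp each coordinate to mean +/- k*std."""
    return [min(max(h, m - k * s), m + k * s) for h, m, s in zip(hidden, mean, std)]

def decode(hidden, w_dec):
    """Decoding layer: map the corrected hidden vector back to item space."""
    return matvec(w_dec, hidden)

def reconstruct(x, mask, w_enc, w_dec, mean, std):
    """Keep observed items; fill missing items with the decoded predictions."""
    decoded = decode(correct(encode(x, w_enc), mean, std), w_dec)
    return [xi if observed else di for xi, observed, di in zip(x, mask, decoded)]

# Hypothetical parameters; in practice these would be learned from training data.
w_enc = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]    # 3-dimensional input -> 2-dimensional hidden
w_dec = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]  # 2-dimensional hidden -> 3-dimensional output
mean, std = [0.0, 0.0], [0.2, 0.2]            # assumed learned hidden vector distribution

x, mask = [1.0, 0.0, 1.0], [1, 0, 1]          # the second item is missing
out = reconstruct(x, mask, w_enc, w_dec, mean, std)
```

In this sketch only the missing position is overwritten, so observed values pass through unchanged; whether the described system preserves observed values in this way is also an assumption.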

In addition, according to an embodiment, the machine learning processor may learn rule information about relationships between items of the input data set, and may update the decoding layer based on the learned rule information.

In addition, according to an embodiment, the at least one item about the applicant may include at least one of gender, age, marital status, whether the applicant has children, housing type, home ownership, occupation, whether the applicant is a salaried worker, income, assets, presence of collateral, presence of existing loans, presence of guarantees, existing loan amount, presence of delinquency, or bad credit status, or a combination thereof.

In addition, according to an embodiment, the at least one item related to the application may include at least one of financial service type, presence of collateral, requested amount, loan period, or whether the principal is repaid, or a combination thereof.

In addition, according to an embodiment, the machine learning processor may be trained by reinforcement learning based on training data including application information, applicant information, and credit evaluation results.

In addition, according to an embodiment, the credit evaluation processing unit may receive predicted economic indicators over a period, and may perform the credit evaluation based on the predicted economic indicators and the reconstruction vector.

In addition, according to an embodiment, the at least one item about the applicant may include a location information history of the applicant, and the machine learning processor may generate a reconstruction vector including a value corresponding to an applicant propensity item generated based on the location information history of the applicant.

In addition, according to an embodiment, the reconstruction vector may have a higher dimension than the input vector and may include an additionally generated information item produced by the machine learning processor based on the input vector.

According to another aspect of an embodiment of the present disclosure, there is provided a credit evaluation method including: receiving an input data set including values for at least some of at least one item related to a financial service application and at least one item about an applicant; generating, by a machine learning processor, from an input vector corresponding to the input data set, a hidden vector having a lower dimension than the input vector; correcting, by the machine learning processor, the hidden vector based on a hidden vector distribution; outputting, by the machine learning processor, from the corrected hidden vector, a reconstruction vector including a predicted value for a missing value among the values for the at least one item of the input data set; and performing a credit evaluation on the subject based on the reconstruction vector.

According to still another aspect of an embodiment of the present disclosure, there is provided a computer program including a recording medium storing at least one instruction that, when executed by a processor, causes the processor to perform a credit evaluation method, the credit evaluation method including: receiving an input data set including values for at least some of at least one item related to a financial service application and at least one item about an applicant; generating, by a machine learning processor, from an input vector corresponding to the input data set, a hidden vector having a lower dimension than the input vector; correcting, by the machine learning processor, the hidden vector based on a hidden vector distribution; outputting, by the machine learning processor, from the corrected hidden vector, a reconstruction vector including a predicted value for a missing value among the values for the at least one item of the input data set; and performing a credit evaluation on the subject based on the reconstruction vector.

According to embodiments of the present disclosure, it is possible to provide a system, a method, and a computer program that accurately predict missing data among the input information for credit evaluation and perform the credit evaluation.

In addition, according to embodiments of the present disclosure, the accuracy of credit evaluation can be improved by using machine learning to accurately predict missing data used for the credit evaluation.

In addition, according to embodiments of the present disclosure, it is possible to provide a credit evaluation method that takes the maturity of the requested service into account by using predicted values of economic indicators.

FIG. 1 is a diagram showing a credit evaluation system according to an embodiment.
FIG. 2 is a diagram showing the structure of a credit evaluation system 100 according to an embodiment.
FIG. 3 is a flowchart illustrating a credit evaluation method according to an embodiment.
FIG. 4 is a diagram for describing a process of generating a reconstruction vector from an input vector.
FIG. 5 is a block diagram showing the structure of a machine learning processor according to an embodiment.
FIG. 6 is a diagram illustrating information on a hidden vector distribution according to an embodiment.
FIG. 7 is a diagram for describing a learning process of a machine learning model according to an embodiment.
FIG. 8 is a diagram for describing an operation of a credit evaluation processing unit according to an embodiment.

The present specification describes the principles of the embodiments of the present disclosure and discloses the embodiments, in order to clarify the scope of the claims of the present disclosure and to enable those of ordinary skill in the art to which the embodiments belong to practice them. The disclosed embodiments may be implemented in various forms.

The same reference numerals refer to the same elements throughout the specification. This specification does not describe every element of the embodiments, and content that is general in the technical field to which the embodiments of the present disclosure belong, or that overlaps between embodiments, is omitted. The term "unit" (part, portion) used in the specification may be implemented in software or hardware, and, depending on the embodiment, a plurality of "units" may be implemented as a single element, or a single "unit" may include a plurality of elements. Hereinafter, embodiments of the present disclosure and their operating principles are described with reference to the accompanying drawings.

FIG. 1 is a diagram showing a credit evaluation system according to an embodiment.

Financial institutions provide a variety of credit services such as loans and credit card issuance. In a credit service, to minimize future repayment risk, a risk score for the application is calculated before the credit is extended. To calculate a repayment risk score for a financial service application, the credit evaluation system 100 according to embodiments of the present disclosure receives information about the applicant and information about the requested service, and performs a credit evaluation on the application.

The credit evaluation system 100 receives applicant information and application information, and outputs credit evaluation information. In the embodiments of the present disclosure, an application refers to an application for a financial service received by a financial institution. Financial services include loans, credit card issuance, and the like.

Applicant information is information about the applicant who applied for the financial service. The applicant information includes values for a plurality of items related to the applicant. The plurality of items of the applicant information may include, for example, at least one of gender, age, marital status, whether the applicant has children, housing type, home ownership, occupation, whether the applicant is a salaried worker, income, assets, presence of collateral, presence of existing loans, presence of guarantees, existing loan amount, presence of delinquency, or bad credit status, or a combination thereof.

The applicant information may include values for only some of the plurality of items, or values for all of the items. That is, the values of some of the items may be missing from the applicant information. The credit evaluation system 100 according to embodiments of the present disclosure may use machine learning to predict the values of the missing items and thereby complete the data set of the applicant information.

In the embodiments of the present disclosure, machine learning includes various machine learning algorithms. A machine learning model according to embodiments of the present disclosure may include an artificial intelligence algorithm in which inference is executed by a combination of at least one matrix multiplication operation and at least one non-linear operation. In addition, the machine learning model may include an artificial neural network model or a deep neural network model.
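As a rough illustration of inference built from matrix multiplication and non-linear operations, the following sketch alternates a dense (matrix-vector) step with a ReLU. The layer sizes, parameters, and the choice of ReLU are assumptions made for the example and are not taken from the disclosure.

```python
def dense(x, weights, bias):
    """Matrix-vector product plus bias; one row of `weights` per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(v):
    """Element-wise non-linear operation."""
    return [max(0.0, a) for a in v]

def infer(x, layers):
    """Run inference by alternating matrix multiplication and a non-linearity."""
    for weights, bias in layers:
        x = relu(dense(x, weights, bias))
    return x

# Hypothetical, hand-picked parameters (not from the patent).
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # 2 inputs -> 2 hidden units
    ([[1.0, 1.0]], [-0.5]),                   # 2 hidden units -> 1 output
]
result = infer([2.0, 1.0], layers)
```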

Financial service application information is information about a financial service application received by a financial institution. The application information includes a plurality of items that define the requested financial service. The plurality of items of the application information may include, for example, at least one of financial service type, presence of collateral, requested amount, loan period, or whether the principal is repaid, or a combination thereof. The financial service type may be defined by, for example, a product code. The items of an application may be defined according to the product code. For example, when the product code corresponds to a mortgage loan, the items of the application may include whether the principal is repaid, the loan period, the loan amount, the appraised value of the collateral, and whether the interest rate is fixed.

The credit evaluation system 100 calculates and outputs credit evaluation information based on the input applicant information and application information. The credit evaluation information represents a credit score for the application. The credit evaluation information may be defined as a score expressed as a predetermined value, or as scores or values for one or more items. According to an embodiment, the credit evaluation information may be calculated for the application. According to another embodiment, the credit evaluation information may have separate values for the applicant and for the application.

The credit evaluation system 100 may be provided within the system of a financial institution. For example, the credit evaluation system 100 may be provided within the system of a bank, a securities company, an investment company, a card company, a credit rating company, or the like. The systems of financial institutions require a very high level of security and stability. For this reason, such systems are often isolated from the network, and data flowing into and out of them is strictly controlled. In general, data is prohibited from leaving a financial institution's system, and data is allowed to enter only through a predetermined security system. In this environment, it is difficult to perform credit evaluation using a module provided on an external server. As a result, a significant portion of the credit evaluation process at financial institutions is handled manually, which requires considerable time, cost, and manpower. The credit evaluation system 100 according to embodiments of the present disclosure is provided within the system of the financial institution and can perform credit evaluation inside that system using the data held there, without sending data outside the system. Accordingly, the credit evaluation system 100 according to embodiments of the present disclosure can automatically perform credit evaluation on financial service applications within the financial institution's system without any data leaving it.

In addition, because applications for financial services are filled out by hand by each individual at a financial institution or entered directly into the financial institution's system, some items of an application are often missing. However, data completeness is very important when a predetermined algorithm is used to automate the credit evaluation process, so filling in the missing data is a critical step. Data related to financial services has a very large number of items and falls under high-rank prediction, in which the relationships between items are hard to identify, so it is difficult to design a reliable algorithm for predicting missing data. Moreover, because financial institution systems restrict data from leaving the system, it is difficult to use high-rank prediction algorithms offered through external servers.

The credit evaluation system 100 according to embodiments of the present disclosure is provided within the financial system, learns the missing data prediction process within the financial system using machine learning, and uses the trained machine learning model to predict missing data in the applicant information and the application information. In addition, when training the machine learning model for predicting missing data, the credit evaluation system 100 according to embodiments of the present disclosure trains the artificial neural network model based on batches of data, in order to prevent the model from overfitting to a subset of the training data. In addition, to improve the reliability of prediction, the credit evaluation system 100 according to embodiments of the present disclosure places an active column in the machine learning model using rules obtained through learning.

FIG. 2 is a diagram showing the structure of a credit evaluation system 100 according to an embodiment.

The credit evaluation system 100 according to an embodiment includes a missing data prediction unit 210 and a credit evaluation processing unit 220. The missing data prediction unit 210 includes a machine learning processor 212.

The missing data prediction unit 210 receives an input data set including applicant information and application information, predicts missing values, that is, the values of items of the input data set that have no value, and outputs a reconstruction vector including the predicted values. The missing data prediction unit 210 predicts the missing values using the machine learning processor 212.

The machine learning processor 212 is a processor that performs the operations of a machine learning model including a plurality of nodes and a plurality of layers. The machine learning model may be defined by the plurality of nodes and the weights between the nodes. In addition, the machine learning model may correspond to a deep neural network model including a plurality of layers. The machine learning model is trained on a large amount of training data. According to an embodiment, the machine learning processor 212 may train the machine learning model using data stored in the financial system as training data.

The machine learning processor 212 may be implemented as a dedicated machine learning chip for predicting missing data, or as a general-purpose processor that includes a machine learning model implemented as a software module. The machine learning processor 212 may be provided within a central processor of the credit evaluation system 100, or may be provided as a processor separate from the central processor within the credit evaluation system 100.

The input data set is converted into a vector or a tensor and input to the machine learning processor 212. The present disclosure describes the input data set as being converted into a vector, but this description does not exclude data formats other than vectors, and various forms of tensor input may be used by the machine learning processor 212. The values of the plurality of items of the input data set are converted into an input vector. The dimension of the input vector may be determined in various ways based on the number of items in the input data set, the specifications and performance of the machine learning processor 212, and the like.
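One possible way to convert an input data set with missing items into an input vector is sketched below. The item names, the placeholder value for missing entries, and the accompanying observed/missing mask are all assumptions made for illustration; the disclosure does not specify how missing items are encoded.

```python
# Hypothetical item names, in a fixed order that defines the vector layout.
FIELDS = ["age", "income", "has_collateral", "requested_amount"]

def to_input_vector(record, fields=FIELDS, fill=0.0):
    """Return (vector, mask); mask[i] is 1 when the item is observed, 0 when missing."""
    vector, mask = [], []
    for name in fields:
        value = record.get(name)
        if value is None:
            # Missing item: store a placeholder and mark it as unobserved.
            vector.append(fill)
            mask.append(0)
        else:
            vector.append(float(value))
            mask.append(1)
    return vector, mask

vector, mask = to_input_vector({"age": 41, "income": None, "has_collateral": True})
```

Here `income` is explicitly missing and `requested_amount` is absent from the record, so both are marked as unobserved in `mask`.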

According to an embodiment, the missing data prediction unit 210 may determine the types and number of items in the input data set based on the value of at least one item of the application. For example, the types and number of items related to the application may vary depending on the financial service type. As another example, items such as collateral type, appraised value of the collateral, and presence of senior claims may be added to the items of the input data set only when collateral is present. According to an embodiment, the missing data prediction unit 210 may change the attribute of each element of the input vector, or change the dimension of the input vector, as the items of the input data set change.
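The item selection described in this paragraph could look roughly like the following sketch. The item names, the `MORTGAGE` product code, and the mapping tables are hypothetical and serve only to show how the item list (and therefore the input vector dimension) might depend on the application's own values.

```python
# Hypothetical item catalogues; real product codes and item names would differ.
BASE_ITEMS = ["product_code", "has_collateral", "requested_amount", "loan_period"]
COLLATERAL_ITEMS = ["collateral_type", "collateral_appraisal", "has_senior_claim"]
PRODUCT_ITEMS = {
    "MORTGAGE": ["principal_repayment", "fixed_rate"],
}

def select_items(application):
    """Choose the input data set items based on the application's own values."""
    items = list(BASE_ITEMS)
    # Items that depend on the financial service type (product code).
    items += PRODUCT_ITEMS.get(application.get("product_code"), [])
    # Collateral-related items are added only when collateral is present.
    if application.get("has_collateral"):
        items += COLLATERAL_ITEMS
    return items

mortgage_items = select_items({"product_code": "MORTGAGE", "has_collateral": True})
card_items = select_items({"product_code": "CARD", "has_collateral": False})
```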

The missing data prediction unit 210 uses the machine learning processor 212 to generate predicted values for the missing values of the input vector, and outputs a reconstruction vector in which the missing values are filled in. The reconstruction vector may have the same dimension as the input vector or a higher dimension than the input vector. According to an embodiment, the reconstruction vector may include, as an element, an additionally generated item produced based on the input vector. For example, an item for the applicant's DSR (Debt Service Ratio) value may be generated based on the input vector and added as one element of the reconstruction vector.
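A derived item such as DSR can be appended to the reconstruction vector as in the sketch below. The formula used here (annual debt service divided by annual income) is a common definition assumed for illustration; the disclosure does not state how the DSR item is computed, and the index arguments are hypothetical.

```python
def append_dsr(reconstruction, debt_service_idx, income_idx):
    """Return a new vector with an assumed DSR = annual debt service / annual income."""
    debt = reconstruction[debt_service_idx]
    income = reconstruction[income_idx]
    # Guard against division by zero when income is missing or zero.
    dsr = debt / income if income > 0 else float("inf")
    return reconstruction + [dsr]

# Hypothetical reconstruction vector: [annual debt service, annual income].
extended = append_dsr([12_000_000.0, 48_000_000.0], 0, 1)
```

The returned vector is one element longer than the input, which matches the case where the reconstruction vector has a higher dimension than the input vector.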

The credit evaluation processing unit 220 receives the reconstruction vector generated by the missing data prediction unit 210 and performs a credit evaluation on the application. The result of the credit evaluation is a value representing the credit risk of the application. As the result of the credit evaluation, the credit evaluation processing unit 220 may output a score for at least one item or a value in a predetermined format. The credit evaluation processing unit 220 may output a credit evaluation result for the application, or may output separate result values for the applicant and for the application. The credit evaluation processing unit 220 may perform the credit evaluation on the application using a predetermined credit evaluation algorithm. According to an embodiment, the credit evaluation processing unit 220 may output the result of the credit evaluation using a predetermined machine learning model.

According to an embodiment, the credit evaluation system 100 may further include an input interface 230 and an output interface 240.

The input interface 230 receives applicant information and application information from a user or an external device. The input interface 230 may include, for example, a keyboard, a touch pad, a touch screen, or a mouse. In addition, the input interface 230 may include a connection terminal or a communication interface through which data or commands can be received from an external device.

The output interface 240 outputs the result value of the credit evaluation generated by the credit evaluation processing unit 220. The output interface 240 may include, for example, a display, a printer, or a speaker. In addition, the output interface 240 may include a communication interface that outputs the result value of the credit evaluation.

In addition, according to an embodiment, the credit evaluation system 100 may further include components such as a processor that controls the overall operation of the credit evaluation system 100, a storage unit that stores predetermined data and instructions, and a communication unit that communicates with an external device.

FIG. 3 is a flowchart illustrating a credit evaluation method according to an embodiment.

Each step of the credit evaluation method of the present disclosure may be performed by various types of electronic devices that include a processor and use a machine learning model. This specification focuses on embodiments in which the credit evaluation system 100 according to embodiments of the present disclosure performs the credit evaluation method. Accordingly, embodiments described for the credit evaluation system 100 are applicable to the credit evaluation method, and conversely, embodiments described for the credit evaluation method are applicable to the credit evaluation system 100. The credit evaluation method according to the disclosed embodiments is not limited to being performed by the credit evaluation system 100 disclosed herein, and may be performed by various types of electronic devices.

The missing data prediction unit 210 receives an input data set including information on a financial service application and information on the applicant (S302). The input data set may include values for at least some of the at least one item related to the application and the at least one item about the applicant. That is, the input data set may be input with the values of some of the items missing.

Next, the missing data prediction unit 210 converts the input data set into an input vector. The missing data prediction unit 210 may generate the input vector in a predefined format and dimension.
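The conversion of an input data set into an input vector of a predefined dimension can be sketched as follows. This is only an illustration under stated assumptions: the item names in ITEM_ORDER, the use of NaN to mark a missing value, and the helper names are hypothetical and do not appear in the specification.

```python
import math

# Hypothetical, predefined item order that fixes the vector dimension.
ITEM_ORDER = ["age", "income", "loan_amount", "loan_period_months"]


def to_input_vector(data_set, item_order=ITEM_ORDER):
    """Map a dict of item values to a fixed-length list; missing items become NaN."""
    vector = []
    for item in item_order:
        value = data_set.get(item)
        vector.append(math.nan if value is None else float(value))
    return vector


def missing_mask(vector):
    """Return True at each position whose value is missing."""
    return [math.isnan(v) for v in vector]


# Example: income and loan period are not provided by the applicant.
x_i = to_input_vector({"age": 40, "loan_amount": 1000})
mask = missing_mask(x_i)
```

The vector always has the same length regardless of which items were supplied, so a downstream model can rely on a fixed input dimension.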

FIG. 4 is a diagram for describing a process of generating a reconstruction vector from an input vector. The process of generating the reconstruction vector from the input vector is described below with reference to FIGS. 3 and 4.

Next, the missing data prediction unit 210 generates, from the input vector (x_i), a hidden vector (h_i) of a lower dimension than the input vector (S304). The missing data prediction unit 210 may perform a predetermined encoding process on the input vector (x_i) to convert the input vector (x_i) into the lower-dimensional hidden vector (h_i). The value of each element of the hidden vector (h_i) is generated from the input vector (x_i).

Next, the missing data prediction unit 210 corrects the hidden vector (h_i) (S306). Distribution information on the hidden vector (h_i) may be stored in the credit evaluation system 100 in advance or may be calculated within the credit evaluation system 100. The missing data prediction unit 210 may correct the value of the hidden vector (h_i) generated from the input vector (x_i) based on the distribution information on the hidden vector (h_i). For example, the distribution information on the hidden vector (h_i) may be defined in a predetermined n-dimensional space, and a predetermined number of hidden vector clusters may be defined in the n-dimensional space. The missing data prediction unit 210 may correct the value of the hidden vector (h_i) so that the hidden vector (h_i) calculated from the input vector (x_i) belongs to one of the predetermined number of hidden vector clusters in the n-dimensional space. According to an embodiment, the missing data prediction unit 210 may correct the hidden vector (h_i) based on the hidden vector distribution information using a manifold learning algorithm. By correcting the hidden vector based on the distribution information of the hidden vector, embodiments of the present disclosure may train and operate the machine learning model so that it is less likely to overfit to random errors or noise.

Next, the missing data prediction unit 210 generates, from the corrected hidden vector (h_i), a reconstruction vector (reconstructed vector, x'_i) that has a higher dimension than the hidden vector (h_i) and a dimension equal to or higher than that of the input vector (x_i) (S308). The reconstruction vector (x'_i) includes predicted values for the missing values of the input vector (x_i). The missing data prediction unit 210 outputs the reconstruction vector (x'_i) to the credit evaluation processing unit 220.
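Steps S304 to S308 can be sketched as an encode, correct, decode pipeline. The sketch below is illustrative only and rests on several assumptions that are not taken from the specification: the encoder and decoder are plain linear maps with hand-picked weights, missing values are replaced by 0.0 before encoding, the correction simply snaps the hidden vector to the nearest cluster centroid (a simple stand-in for the manifold-learning-based correction mentioned above), and observed values are kept unchanged in the output.

```python
import math


def matvec(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def encode(x, w_enc):
    """S304: map the input vector to a lower-dimensional hidden vector."""
    filled = [0.0 if math.isnan(v) else v for v in x]  # assumed NaN handling
    return matvec(w_enc, filled)


def correct(h, centroids):
    """S306: move the hidden vector onto the nearest cluster centroid."""
    return list(min(centroids, key=lambda c: math.dist(c, h)))


def decode(h, w_dec):
    """S308: map the corrected hidden vector back to the input dimension."""
    return matvec(w_dec, h)


def reconstruct(x, w_enc, w_dec, centroids):
    x_hat = decode(correct(encode(x, w_enc), centroids), w_dec)
    # Keep observed values; use predictions only where the input is missing.
    return [p if math.isnan(v) else v for v, p in zip(x, x_hat)]


# Hypothetical weights: 3-dimensional input, 2-dimensional hidden vector.
W_ENC = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
W_DEC = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
CENTROIDS = [[0.0, 0.0], [1.0, 2.0]]

x_prime = reconstruct([1.0, math.nan, 3.0], W_ENC, W_DEC, CENTROIDS)
```

In a trained model the weights and centroids would be learned rather than fixed by hand; the fixed values here only make the data flow easy to follow.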

The credit evaluation processing unit 220 performs the credit evaluation based on the input reconstruction vector (S310). The credit evaluation processing unit 220 may generate the result value of the credit evaluation from the reconstruction vector using a machine learning model.

FIG. 5 is a block diagram illustrating the structure of a machine learning processor according to an embodiment.

The machine learning processor 212 generates the reconstruction vector from the input vector using a machine learning model trained with training data. The machine learning processor 212 may be composed of a plurality of layers, and the plurality of layers may be generated through training of the machine learning model. The machine learning processor 212 may be implemented as a combination of a plurality of registers and parameters designated for the plurality of registers.

In the training process, the machine learning processor 212 may perform reinforcement learning using input data sets that have values for all items. For example, the machine learning processor 212 may generate, from a complete input data set, a plurality of virtual input data sets in which the values of some items are missing, generate reconstruction vectors from the virtual input data sets, and then perform reinforcement learning by comparing the reconstruction vectors with the complete input data set.
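The generation of virtual input data sets from a complete input data set, and the comparison of a reconstruction with the complete data, can be sketched as follows. The masking probability, the use of NaN as the missing marker, and the mean squared error over the masked positions are assumptions chosen for illustration; the specification does not state how the comparison is scored.

```python
import math
import random


def make_virtual_sets(complete, n_sets, missing_rate, rng):
    """Create n_sets copies of a complete vector with random items masked as NaN."""
    return [
        [math.nan if rng.random() < missing_rate else v for v in complete]
        for _ in range(n_sets)
    ]


def reconstruction_error(complete, reconstructed, masked):
    """Mean squared error over the positions that were masked out."""
    idx = [i for i, v in enumerate(masked) if math.isnan(v)]
    if not idx:
        return 0.0
    return sum((complete[i] - reconstructed[i]) ** 2 for i in idx) / len(idx)


rng = random.Random(0)  # fixed seed so the example is repeatable
complete_set = [1.0, 2.0, 3.0]
virtual_sets = make_virtual_sets(complete_set, 5, 0.5, rng)

# Score a hypothetical reconstruction of a set whose second item was masked.
error = reconstruction_error(complete_set, [1.0, 4.0, 3.0], [1.0, math.nan, 3.0])
```

A training loop could, for instance, use the negative of this error as a feedback signal, although how such a signal enters the reinforcement learning described here is left open by the specification.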

According to an embodiment, the machine learning processor 212 may use at least one reward value in the reinforcement learning process. The at least one reward value may include at least one of an increase in financial institution revenue, a reduction in loss, forward transfer performance, backward transfer performance, and adaptability to other financial products, or a combination thereof. The increase in financial institution revenue is a value indicating how much the revenue of the financial institution has increased based on the predicted values of the machine learning processor 212. The reduction in loss is a value indicating how much the loss of the financial institution has decreased based on the predicted values of the machine learning processor 212. Forward transfer performance indicates how easily and how well new knowledge or a new model is learned based on what was previously learned. Backward transfer performance indicates how much newly learned content helps improve performance on what was already learned. Adaptability to other financial products indicates how well a model trained on a given financial product applies to a new financial product on which it has not been trained. The machine learning processor 212 may use the at least one reward value as training data and perform reinforcement learning based on the at least one reward value.

According to an embodiment, the machine learning processor 212 may perform reinforcement learning based on a convex combination of the at least one reward value; however, the manner in which the at least one reward value is taken into account is not limited thereto.
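A convex combination of reward values is a weighted sum in which every weight is non-negative and the weights sum to 1. The following minimal sketch computes such a combination; the function name and the example reward and weight values are hypothetical.

```python
def convex_combination(rewards, weights, tol=1e-9):
    """Combine reward values with non-negative weights that sum to 1."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    return sum(w * r for w, r in zip(weights, rewards))


# Hypothetical rewards: revenue increase, loss reduction, forward transfer.
total_reward = convex_combination([1.0, 0.0, 0.5], [0.5, 0.25, 0.25])
```

Because the weights are constrained in this way, the combined reward always lies between the smallest and the largest individual reward value.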

The machine learning processor 212 may include a plurality of layers. According to an embodiment, the machine learning processor 212 may include an input layer 510, an encoding layer 520, a correction layer 530, a decoding layer 540, and an output layer 550. The input layer 510, the encoding layer 520, the correction layer 530, the decoding layer 540, and the output layer 550 each include a plurality of nodes. Weights are defined between the nodes; each node receives weighted node values from at least one other node, and the node value of that node may be defined by the sum of the received values. The nodes of each layer and the weights between the nodes may be defined through training of the machine learning model in the machine learning processor 212.

The input layer 510 receives the input vector (x_i). The input vector (x_i) may be generated from the input data set within the missing data prediction unit 210. The values of some elements of the input vector (x_i) may be missing. According to an embodiment, the machine learning processor 212 may receive the input data set and generate the input vector (x_i) from the input data set in the input layer 510. The input layer 510 recognizes the input vector (x_i) and stores the elements of the input vector (x_i) in a plurality of nodes within the input layer 510. The element values of the input vector (x_i) input to the input layer 510 are multiplied by predetermined weights, summed, and input to a plurality of nodes in the encoding layer 520.

The encoding layer 520 receives weighted sums of the node values of the input layer 510. From the node values received from the input layer 510, the encoding layer 520 generates the hidden vector (h_i) of a lower dimension than the input vector (x_i).

The hidden vector (h_i) is input to the correction layer 530. The correction layer 530 corrects the element values of the hidden vector (h_i) based on the information on the hidden vector distribution.

FIG. 6 is a diagram illustrating information on a hidden vector distribution according to an embodiment. The operation of the correction layer 530 is described with reference to FIGS. 5 and 6.

The credit evaluation system 100 may store information on the hidden vector distribution. The information on the hidden vector distribution may be stored in advance or may be calculated within the credit evaluation system 100. The credit evaluation system 100 may calculate the hidden vector distribution using stored data. For example, the missing data prediction unit 210 may calculate hidden vectors (h_i) from the applicant information and application information stored in the credit evaluation system 100 using the machine learning processor 212, and may calculate the information on the hidden vector distribution from them. According to an embodiment, a predetermined machine learning model may be used to calculate the information on the hidden vector distribution.

The information on the hidden vector distribution includes information on clusters 610a, 610b, 610c, and 610d defined on the hidden vector distribution. Hidden vectors form certain clusters 610a, 610b, 610c, and 610d owing to the characteristics of the data, and by correcting a hidden vector so that it falls within one of these clusters 610a, 610b, 610c, and 610d, the accuracy of the predicted values for the missing values can be improved. In addition, by correcting the hidden vector based on the hidden vector distribution information during training of the machine learning processor 212, the problem of the machine learning model overfitting can be avoided.

During training, a machine learning model may overfit to a specific group of data or may suffer from catastrophic forgetting. Embodiments of the present disclosure may avoid the overfitting problem and the catastrophic forgetting problem by correcting the hidden vector during training using the information on the hidden vector distribution.

In addition, the credit evaluation system 100 according to an embodiment of the present disclosure corrects for the data distribution using the hidden vector, which reduces processing complexity and saves processing resources and time. Because the hidden vector has a lower dimension than the input vector and the reconstruction vector, the amount of data is smaller. Therefore, compared with performing distribution-based correction on the input vector or the reconstruction vector, the clusters are easier to define, and the resources and time required to correct the vector can be reduced.

According to an embodiment, the hidden vector distribution information may define the regions of the clusters 610a, 610b, 610c, and 610d while updating the boundaries 620a, 620b, 620c, and 620d between the clusters 610a, 610b, 610c, and 610d through an iterative process. For example, the missing data prediction unit 210 may calculate hidden vectors from a large amount of applicant information and application information stored in the credit evaluation system 100, learn the distribution of the hidden vectors, and thereby learn the boundaries 620a, 620b, 620c, and 620d between the clusters and define the clusters 610a, 610b, 610c, and 610d.
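One well-known way to define clusters through an iterative process is k-means clustering, in which cluster boundaries are implied by which centroid is nearest and shift as the centroids are updated. The sketch below uses k-means purely as a stand-in; the specification does not name the iterative algorithm, and the function names, the initial centroids, and the sample hidden vectors are hypothetical.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if empty)."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if members:
            new_centroids.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centroids.append(list(old))
    return new_centroids


def fit_clusters(points, centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop moving."""
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        new_centroids = update(points, assign(points, centroids), centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


hidden_vectors = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
centers, labels = fit_clusters(hidden_vectors, [[0.0, 0.0], [10.0, 0.0]])
```

Because the hidden vectors are low-dimensional, each iteration of such a procedure touches fewer values than it would on the input or reconstruction vectors.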

The correction layer 530 generates a corrected hidden vector (h_i') based on the hidden vector distribution information and outputs it to the decoding layer 540. The decoding layer 540 receives the corrected hidden vector (h_i') and generates the reconstruction vector (x'_i). The decoding layer 540 may generate the reconstruction vector (x'_i) by generating the value of each element of the reconstruction vector (x'_i) from the corrected hidden vector (h_i'). The decoding layer 540 may be trained to predict the value of each element of the reconstruction vector (x'_i) from the hidden vector (h_i').

The decoding layer 540 outputs the generated reconstruction vector (x'_i) to the output layer 550. The output layer 550 outputs the reconstruction vector (x'_i) to the credit evaluation processing unit 220.

FIG. 7 is a diagram for describing a training process of a machine learning model according to an embodiment.

According to an embodiment, the machine learning model 700 included in the machine learning processor 212 learns predetermined rule information 750 during training, and the learned rule information 750 is assigned to a predetermined node or layer within the machine learning model 700. The machine learning model 700 includes a plurality of layers. The machine learning model 700 includes an input layer 710, intermediate layers 720, and an output layer 730. According to an embodiment, the learned rule information 750 may be assigned to at least one node or at least one layer within the plurality of intermediate layers 720.

The rule information 750 may be learned by the machine learning model 700. The machine learning model 700 learns predetermined rule information 750, and when the reliability of the rule information 750 reaches or exceeds a predetermined reference value, the machine learning model 700 may assign the learned rule information 750 to a predetermined node or layer within the intermediate layers 720. For example, the rule information 750 may include rule information 1 (752) indicating a relationship between salary and bad credit information, rule information 2 (754) indicating a relationship between occupation and the presence or absence of arrears, and the like.
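The idea of adopting a rule only once its reliability reaches a reference value can be sketched as follows. This is a loose illustration under stated assumptions: reliability is interpreted here as the fraction of matching records for which the rule holds, the rule is expressed as a pair of Python predicates, and the record fields, rule name, and threshold are hypothetical. How a rule is actually assigned to a node or layer is not shown.

```python
def rule_reliability(records, antecedent, consequent):
    """Fraction of records matching the antecedent that also satisfy the consequent."""
    matched = [r for r in records if antecedent(r)]
    if not matched:
        return 0.0
    return sum(1 for r in matched if consequent(r)) / len(matched)


def select_rules(records, rules, reference_value):
    """Return the names of rules whose reliability meets the reference value."""
    return [
        name
        for name, (antecedent, consequent) in rules.items()
        if rule_reliability(records, antecedent, consequent) >= reference_value
    ]


# Hypothetical records and a hypothetical salary / bad-credit rule.
records = [
    {"salary": 5000, "bad_credit": False},
    {"salary": 4000, "bad_credit": False},
    {"salary": 3500, "bad_credit": True},
    {"salary": 1000, "bad_credit": True},
]
rules = {
    "rule_1_salary_vs_bad_credit": (
        lambda r: r["salary"] >= 3000,
        lambda r: not r["bad_credit"],
    ),
}
adopted = select_rules(records, rules, reference_value=0.6)
```

With these sample records the rule holds for two of the three matching records, so it is adopted at a reference value of 0.6 but would not be adopted at 0.7.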

According to an embodiment, the rule information 750 may be stored in advance or input from an external device. For example, rule information obtained during system design, or rule information obtained from another credit evaluation system, may be used for training the credit evaluation system 100. The machine learning model 700 may perform training using the rule information 750 stored in advance or input from an external device, and when the reliability of the rule information 750 reaches or exceeds a predetermined reference value, may assign the rule information 750 to a predetermined node or layer of the machine learning model 700.

FIG. 8 is a diagram for describing the operation of a credit evaluation processing unit according to an embodiment.

The credit evaluation processing unit 220 generates a credit evaluation result value for the application from the reconstruction vector. The credit evaluation processing unit 220 generates the credit evaluation result value from the reconstruction vector based on various criteria.

According to an embodiment, when the application is for a loan service, the credit evaluation processing unit 220 may calculate the credit evaluation result value based on predicted economic indicators at the time of loan maturity. The credit evaluation processing unit 220 may use one or more economic indicators. The economic indicators may include a base interest rate, a market interest rate, a domestic stock index, a stock index of at least one other country, a price index, and the like. For example, when the predicted economic indicators at loan maturity are favorable, the credit evaluation processing unit 220 may raise the credit rating or lower the risk. Conversely, when the predicted economic indicators at loan maturity are unfavorable, the credit evaluation processing unit 220 may lower the credit rating or raise the risk.
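The direction of this adjustment can be illustrated with a very small sketch. Everything specific in it is an assumption made for the example: the risk is a score in [0, 1], the forecast is summarized as a single outlook value in [-1, 1] (positive meaning favorable), and the adjustment is a simple proportional scaling with a hypothetical sensitivity parameter. The specification does not prescribe any particular formula.

```python
def adjust_risk(base_risk, outlook, sensitivity=0.2):
    """Lower the risk for a favorable outlook and raise it for an unfavorable one.

    base_risk: risk score in [0, 1] before the adjustment.
    outlook: summarized forecast in [-1, 1]; positive means favorable.
    sensitivity: hypothetical strength of the adjustment.
    """
    if not -1.0 <= outlook <= 1.0:
        raise ValueError("outlook must be in [-1, 1]")
    adjusted = base_risk * (1.0 - sensitivity * outlook)
    return min(1.0, max(0.0, adjusted))  # keep the score within [0, 1]


favorable = adjust_risk(0.5, 1.0)     # risk is lowered
unfavorable = adjust_risk(0.5, -1.0)  # risk is raised
```

A neutral outlook of 0.0 leaves the risk unchanged in this sketch, and the final clamp prevents the adjusted score from leaving the assumed [0, 1] range.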

According to an embodiment, the credit evaluation processing unit 220 may perform the credit evaluation based on predicted economic indicators that are stored in advance or input from an external device. According to another embodiment, the credit evaluation system 100 may perform processing to generate the predicted economic indicators. For example, a processor operating according to an algorithm for predicting economic indicators may calculate the predicted economic indicators using current economic indicators and predetermined related data. According to an embodiment, the credit evaluation system 100 may generate the predicted economic indicators using a predetermined machine learning model.

According to an embodiment, the machine learning processor 212 of the missing data prediction unit 210 may generate a value for an applicant propensity item based on the applicant's location information history, and may record the value corresponding to the applicant propensity item in the reconstruction vector. The input data set may include the applicant's location information history. When the applicant spends a high proportion of time at home and at work, the applicant's propensity to consume may be evaluated as low. On the other hand, when the applicant spends a high proportion of time at places related to shopping, entertainment, leisure, and the like, the applicant's propensity to consume may be evaluated as high. On this principle, the machine learning processor 212 may generate the value of the applicant propensity item based on the applicant's location information history.
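A simple way to turn a location history into a propensity value is to compute the share of time spent at consumption-related places. The sketch below does this with a hand-written ratio rather than a trained model; the place categories, the (category, hours) record format, and the function name are assumptions made for illustration.

```python
# Hypothetical categories treated as consumption-related places.
CONSUMPTION_CATEGORIES = {"shopping", "entertainment", "leisure"}


def consumption_propensity(location_history):
    """Return the share of recorded time spent at consumption-related places.

    location_history: list of (category, hours) pairs.
    Returns None when no time has been recorded.
    """
    total_hours = sum(hours for _, hours in location_history)
    if total_hours <= 0:
        return None
    consumption_hours = sum(
        hours for category, hours in location_history
        if category in CONSUMPTION_CATEGORIES
    )
    return consumption_hours / total_hours


# Example: mostly home and work, with a short shopping trip.
propensity = consumption_propensity(
    [("home", 6.0), ("work", 8.0), ("shopping", 2.0)]
)
```

The resulting value could then be written into the reconstruction vector as the applicant propensity element, in the same way as other generated items.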

Meanwhile, the disclosed embodiments may be implemented in the form of a computer-readable recording medium that stores computer-executable instructions and data. The instructions may be stored in the form of program code and, when executed by a processor, may generate a predetermined program module to perform a predetermined operation. In addition, the instructions, when executed by a processor, may perform predetermined operations of the disclosed embodiments.

The disclosed embodiments have been described above with reference to the accompanying drawings. Those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be practiced in forms different from the disclosed embodiments without changing its technical spirit or essential features. The disclosed embodiments are illustrative and should not be construed as limiting.

100 credit evaluation system
210 missing data prediction unit
212 machine learning processor
220 credit evaluation processing unit
230 input interface
240 output interface

Claims

A credit evaluation system comprising:
a missing data prediction unit that, using machine learning, receives an input data set including values for at least some of at least one item related to a financial service application and at least one item about an applicant, generates, from an input vector corresponding to the input data set, a hidden vector of a lower dimension than the input vector, corrects the hidden vector based on a hidden vector distribution, and outputs, from the corrected hidden vector, a reconstruction vector including a predicted value for a missing value among the values for the at least one item of the input data set; and
a credit evaluation processing unit that performs a credit evaluation on the application based on the reconstruction vector,
wherein the input data set includes a plurality of items, the plurality of items includes a missing item whose value is missing, and the missing data prediction unit generates and outputs the reconstruction vector in which the predicted value is written for the missing item of the input data set.

The credit evaluation system of claim 1,
wherein the missing data prediction unit includes a machine learning processor that performs the machine learning, and
the machine learning processor comprises:
an input layer that receives the input data set and generates the input vector;
an encoding layer that outputs the hidden vector having a lower dimension than the input vector;
a correction layer that corrects the hidden vector based on a learned hidden vector distribution;
a decoding layer that receives the corrected hidden vector and generates the reconstruction vector; and
an output layer that outputs the reconstruction vector.

The credit evaluation system of claim 1,
wherein the missing data prediction unit includes a machine learning processor that performs the machine learning, and
the machine learning processor learns rule information on a relationship between items of the input data set and updates the decoding layer based on the learned rule information.

The credit evaluation system of claim 1,
wherein the at least one item about the applicant includes at least one of, or a combination of, gender, age, marital status, child status, housing type, self-employed status, occupation, salary status, income, assets, security status, existing loan status, guarantee status, existing loan amount, arrears, or bad credit.

The credit evaluation system of claim 1,
wherein the at least one item related to the application includes at least one of, or a combination of, the financial service type, the presence of collateral, the application amount, the loan period, or the principal repayment.

The credit evaluation system of claim 1,
wherein the missing data prediction unit includes a machine learning processor that performs the machine learning, and
the machine learning processor is trained by reinforcement learning based on training data including application case information, information on the applicant, and credit evaluation results.

The credit evaluation system of claim 1,
wherein the missing data prediction unit includes a machine learning processor that performs the machine learning, and
the machine learning processor is trained by reinforcement learning using at least one reward value, and the at least one reward value includes at least one of, or a combination of, an increase in financial institution revenue, a reduction in losses, forward transfer performance, backward transfer performance, and adaptability to other financial products.
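
One way to read "at least one of, or a combination of" the recited reward terms is as a weighted sum. The Python sketch below shows that reading only. The term names, the weights, and the function `combined_reward` are hypothetical, and the source does not state how the terms are measured or combined.

```python
# Hypothetical identifiers for the five reward terms recited in the claim.
REWARD_TERMS = (
    "revenue_increase",
    "loss_reduction",
    "forward_transfer",
    "backward_transfer",
    "adaptability",
)


def combined_reward(terms, weights):
    """Weighted sum of the supplied reward terms (assumed combination rule)."""
    unknown = set(terms) - set(REWARD_TERMS)
    if unknown:
        raise ValueError(f"unknown reward terms: {sorted(unknown)}")
    # Terms without a weight contribute nothing to the total.
    return sum(weights.get(name, 0.0) * value for name, value in terms.items())


reward = combined_reward(
    {"revenue_increase": 1.0, "loss_reduction": 2.0},
    {"revenue_increase": 0.5, "loss_reduction": 0.25},
)
print(reward)
```

A single scalar of this kind could serve as the reward signal of a reinforcement learning loop; using only one term corresponds to giving every other term a weight of zero.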

The credit evaluation system of claim 1,
wherein the credit evaluation processing unit receives a predicted economic index for each period and performs the credit evaluation based on the predicted economic index and the reconstruction vector.
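
As a rough illustration of a period-dependent evaluation, the Python sketch below scores one reconstruction vector against a predicted economic index for each period. The logistic scoring rule, the item weights, the period labels, and the function `evaluate_by_period` are all assumptions; the source only states that both inputs are used.

```python
import math


def evaluate_by_period(reconstruction, item_weights, index_by_period,
                       index_weight=1.0):
    """Return one score in (0, 1) per period (assumed logistic scoring)."""
    # Contribution of the applicant and application items, shared by all periods.
    base = sum(w * x for w, x in zip(item_weights, reconstruction))
    # The predicted economic index shifts the score for each period.
    return {
        period: 1.0 / (1.0 + math.exp(-(base + index_weight * index)))
        for period, index in index_by_period.items()
    }


scores = evaluate_by_period(
    reconstruction=[0.3, 0.5, 0.7],
    item_weights=[1.0, -1.0, 0.5],                  # hypothetical weights
    index_by_period={"2020Q1": 0.0, "2020Q2": -0.5},  # hypothetical forecasts
)
print(scores)
```

With a positive `index_weight`, a lower predicted index for a period yields a lower score for that period, so the same applicant can be evaluated differently over the loan term.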

The credit evaluation system of claim 1,
wherein the at least one item about the applicant includes a location information history of the applicant, and
the machine learning processor generates a reconstruction vector including a value corresponding to an applicant propensity item generated based on the location information history of the applicant.

The credit evaluation system of claim 1,
wherein the reconstruction vector has a higher dimension than the input vector and includes an additionally generated information item that the machine learning processor generates based on the input vector.

A credit evaluation method comprising:
receiving an input data set including values for at least some of at least one item related to a financial service application and at least one item about an applicant;
generating, by a machine learning processor, from an input vector corresponding to the input data set, a hidden vector having a lower dimension than the input vector;
correcting, by the machine learning processor, the hidden vector based on a hidden vector distribution;
outputting, by the machine learning processor, from the corrected hidden vector, a reconstruction vector including a predicted value for a missing value among the values for the at least one item of the input data set; and
performing a credit evaluation on the application based on the reconstruction vector,
wherein the input data set includes a plurality of items, the plurality of items includes a missing item whose value is missing, and
the credit evaluation method further comprises generating the reconstruction vector in which the predicted value is entered for the missing item of the input data set.

A computer program stored in a storage medium, comprising at least one instruction that, when executed by a processor, causes the processor to perform the steps of a credit evaluation method, the credit evaluation method comprising:
receiving an input data set including values for at least some of at least one item related to a financial service application and at least one item about an applicant;
generating, by a machine learning processor, from an input vector corresponding to the input data set, a hidden vector having a lower dimension than the input vector;
correcting, by the machine learning processor, the hidden vector based on a hidden vector distribution;
outputting, by the machine learning processor, from the corrected hidden vector, a reconstruction vector including a predicted value for a missing value among the values for the at least one item of the input data set; and
performing a credit evaluation on the application based on the reconstruction vector,
wherein the input data set includes a plurality of items, the plurality of items includes a missing item whose value is missing, and
the credit evaluation method further comprises generating the reconstruction vector in which the predicted value is entered for the missing item of the input data set.