KR102085814B1

KR102085814B1 - Device and method for new unfair claim pattern analysis based on artificial intelligence

Info

Publication number: KR102085814B1
Application number: KR1020170162068A
Authority: KR
Inventors: 김지혁; 이지현
Original assignee: (주)위세아이텍
Priority date: 2017-11-29
Filing date: 2017-11-29
Publication date: 2020-03-06
Also published as: KR20190063175A

Abstract

A device and method for analyzing new unfair claim patterns are disclosed. A new unfair claim pattern analysis device according to an embodiment of the present application may include: a data preprocessing unit that structures insurance claim history data to derive feature variables; an unfair claim pattern classification unit that classifies new patterns of unfair claims using an unsupervised-learning clustering algorithm that takes the feature variables as input; a discrimination model construction unit that builds a new pattern discrimination model using a supervised-learning decision-making algorithm; and a new pattern discrimination unit that determines, based on the new pattern discrimination model, the new pattern for a new claim.

Description

DEVICE AND METHOD FOR NEW UNFAIR CLAIM PATTERN ANALYSIS BASED ON ARTIFICIAL INTELLIGENCE

The present application relates to an artificial intelligence-based device and method for analyzing new unfair claim patterns.

Existing insurance fraud prevention systems are based on business rules: for each claimed insurance case, rules are derived from the reviewer's experience and knowledge and are divided into rules for cases to be investigated and rules for cases to be excluded from investigation. However, as insurance fraud becomes increasingly intelligent and sophisticated, the rules must be continuously updated to cover new unfair claim patterns and fraud patterns.

Moreover, an insurance company can neither let insurance payouts leak away by taking no action against insurance fraud, nor spend unlimited investigation costs in order to uncover every case of fraud. An insurance company therefore needs to set its level of investigative effort at a reasonable point that reduces the payouts lost to fraud without incurring excessive investigation costs.

The background art of the present application is disclosed in Korean Registered Patent Publication No. 10-0862181.

The present application is intended to solve the problems of the prior art described above, and aims to provide an artificial intelligence-based device and method for analyzing new unfair claim patterns that provide a discrimination model capable of identifying new patterns of unfair insurance claims.

The present application is intended to solve the problems of the prior art described above, and aims to provide an artificial intelligence-based device and method for analyzing new unfair claim patterns that learn from unfair claim data to analyze new unfair claim patterns and determine the claim pattern type of a new claim.

However, the technical problems to be solved by the embodiments of the present application are not limited to those described above, and other technical problems may exist.

As a technical means for achieving the above technical problems, a new unfair claim pattern analysis device according to an embodiment of the present application may include: a data preprocessing unit that structures insurance claim history data to derive feature variables; an unfair claim pattern classification unit that classifies new patterns of unfair claims using an unsupervised-learning clustering algorithm that takes the feature variables as input; a discrimination model construction unit that builds a new pattern discrimination model using a supervised-learning decision-making algorithm; and a new pattern discrimination unit that determines the pattern type of a new claim based on the new pattern discrimination model.

According to an embodiment of the present application, the device may further include a database that records at least one of the insurance claim history data, the feature variables, and the new patterns, and the unfair claim history data may include at least one of claim data, contract data, payment data, insurance planner data, and customer data.

According to an embodiment of the present application, the unfair claim pattern classification unit may cluster the feature variables into a plurality of claim patterns through the clustering algorithm based on the frequency of the feature variables, and may detect the new patterns based on the degree of separation between the clusters of claim patterns.

According to an embodiment of the present application, the unfair claim pattern classification unit may classify the new patterns into a plurality of similar pattern groups based on the similarity among the new patterns.

According to an embodiment of the present application, the unfair claim pattern classification unit may set the risk level of each pattern group based on the number of feature variables, among the feature variables associated with each new pattern, that are at or above the threshold preset for each feature variable.

According to an embodiment of the present application, the discrimination model construction unit may learn new pattern discrimination rules using a decision-making algorithm that takes as input the new patterns included in the pattern groups for each risk level, and may build the new pattern discrimination model including the new pattern discrimination rules.

According to an embodiment of the present application, the new pattern discrimination unit may determine, based on the new pattern discrimination rules, which pattern type the new claim belongs to among a normal claim pattern, an unfair claim pattern, and a new pattern.

A new unfair claim pattern analysis method according to an embodiment of the present application may include: structuring insurance claim history data to derive feature variables; classifying new patterns of unfair claims using an unsupervised-learning clustering algorithm that takes the feature variables as input; building a new pattern discrimination model using a supervised-learning decision-making algorithm; and determining, based on the new pattern discrimination model, the new pattern for a new claim.

According to an embodiment of the present application, the method may further include recording at least one of the unfair claim history data, the feature variables, and the new patterns, and the unfair claim history data may include at least one of claim data, contract data, payment data, insurance planner data, and customer data.

According to an embodiment of the present application, the step of classifying new patterns of unfair claims may cluster the feature variables into a plurality of claim patterns through the clustering algorithm based on the frequency of the feature variables, and may detect the new patterns based on the degree of separation between the clusters of claim patterns.

According to an embodiment of the present application, the step of classifying new patterns of unfair claims may classify the new patterns into a plurality of similar pattern groups based on the similarity among the new patterns.

According to an embodiment of the present application, the step of classifying new patterns of unfair claims may set the risk level of each pattern group based on the number of feature variables, among the feature variables associated with each new pattern, that are at or above the threshold preset for each feature variable.

According to an embodiment of the present application, the step of building the new pattern discrimination model may learn new pattern discrimination rules using a decision-making algorithm that takes as input the new patterns included in the pattern groups for each risk level, and may build the new pattern discrimination model including the new pattern discrimination rules.

According to an embodiment of the present application, the step of determining the new pattern may determine, based on the new pattern discrimination rules, which pattern type the new claim belongs to among a normal claim pattern, an unfair claim pattern, and a new pattern.

The means for solving the problems described above are merely illustrative and should not be construed as limiting the present application. In addition to the exemplary embodiments described above, additional embodiments may exist in the drawings and the detailed description of the invention.

According to the means for solving the problems described above, it is possible to provide an artificial intelligence-based device and method for analyzing new unfair claim patterns that provide a discrimination model capable of identifying new patterns of unfair insurance claims.

According to the means for solving the problems described above, it is possible to provide an artificial intelligence-based device and method for analyzing new unfair claim patterns that learn from unfair claim data to analyze new unfair claim patterns and determine the pattern type of a new claim.

FIG. 1 is a diagram showing the configuration of a new unfair claim pattern analysis device according to an embodiment of the present application.
FIG. 2 is a diagram showing an example of new pattern detection by the new unfair claim pattern analysis device according to an embodiment of the present application.
FIG. 3 is a diagram showing an example of similar pattern group classification by the new unfair claim pattern analysis device according to an embodiment of the present application.
FIG. 4 is a diagram showing the flow of a new unfair claim pattern analysis method according to an embodiment of the present application.

Hereinafter, embodiments of the present application are described in detail with reference to the accompanying drawings so that a person of ordinary skill in the art to which the present application pertains can readily practice them. The present application may, however, be implemented in many different forms and is not limited to the embodiments described here. In the drawings, parts unrelated to the description are omitted so that the present application can be described clearly, and similar parts are given similar reference numerals throughout the specification.

Throughout this specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element in between.

Throughout this specification, when a member is said to be located "on", "above", "at the top of", "under", "below", or "at the bottom of" another member, this includes not only the case where the member is in contact with the other member but also the case where yet another member is present between the two members.

Throughout this specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a diagram showing the configuration of a new unfair claim pattern analysis device according to an embodiment of the present application. Referring to FIG. 1, the new unfair claim pattern analysis device 100 may include a data preprocessing unit 110, an unfair claim pattern classification unit 120, a discrimination model construction unit 130, a new pattern discrimination unit 140, and a database 150. The data preprocessing unit 110 may derive feature variables by structuring the insurance claim history data that is the subject of the new unfair claim pattern analysis.

It is self-evident that the following description covers not only unfair claims for insurance benefits but also unfair claims for subsidies, grants, deposits, and the like; for convenience, however, the description focuses on unfair insurance claims. The insurance claim history data may include, for example, at least one of claim data for claimed insurance benefits, insurance or subsidy contract data, insurance payment data, insurance planner data, customer data, and insurance fraud detection result data. The data preprocessing unit 110 may derive feature variables by structuring the insurance claim history data. Structuring the insurance claim history data means, for example in the case of customer data, quantifying the customer's income level, the number of hospitals the customer visited, the number of disease reasons the customer claimed, and so on. A feature variable derived by the data preprocessing unit 110 is a variable related to an insurance claim that can take a numeric value, and may include, for example, at least one of: customer ID number, whether the customer is a known insurance fraudster, number of duplicate claims under the same disease name, number of year-months in which contracts were signed, maximum number of contracts signed in one day, number of policies approved for payment, number of policies for which payment was requested, number of policies signed by the customer, number of protection-type insurance claims, change in credit rating, types of contracts subscribed, total number of visits to hospitals of concern, number of disease reasons claimed by the customer, number of doctors the customer saw, number of hospitals the customer visited, total number of valid inpatient/outpatient days, number of medical departments, customer income level, number of FP (Financial Planner) changes, number of actual-loss claims processed, and number of contracts with fraudulent FPs. The database 150 may record at least one of the insurance claim history data, the feature variables, and the new patterns. The database 150 may also store normal claim patterns.
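The structuring step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the record fields (`customer_id`, `hospital`, `doctor`, `diagnosis`, `planner`) and the output feature names are hypothetical, chosen only to mirror a few of the feature variables listed in the text, and the sketch assumes each customer's records arrive in chronological order.

```python
from collections import defaultdict

def derive_feature_variables(claims):
    """Aggregate raw claim records into per-customer numeric feature variables.

    All field names are hypothetical; records are assumed to be in time order.
    """
    hospitals = defaultdict(set)
    doctors = defaultdict(set)
    diagnoses = defaultdict(set)
    planners = defaultdict(list)
    claim_counts = defaultdict(int)

    for c in claims:
        cid = c["customer_id"]
        hospitals[cid].add(c["hospital"])
        doctors[cid].add(c["doctor"])
        diagnoses[cid].add(c["diagnosis"])
        claim_counts[cid] += 1
        # Record a planner only when it differs from the previous record's planner.
        if not planners[cid] or planners[cid][-1] != c["planner"]:
            planners[cid].append(c["planner"])

    return {
        cid: {
            "num_claims": claim_counts[cid],
            "num_hospitals": len(hospitals[cid]),
            "num_doctors": len(doctors[cid]),
            "num_diagnoses": len(diagnoses[cid]),
            "fp_changes": len(planners[cid]) - 1,  # number of planner switches
        }
        for cid in claim_counts
    }

sample_claims = [
    {"customer_id": "C1", "hospital": "H1", "doctor": "D1", "diagnosis": "flu", "planner": "P1"},
    {"customer_id": "C1", "hospital": "H2", "doctor": "D2", "diagnosis": "sprain", "planner": "P2"},
    {"customer_id": "C1", "hospital": "H2", "doctor": "D2", "diagnosis": "sprain", "planner": "P2"},
    {"customer_id": "C2", "hospital": "H1", "doctor": "D1", "diagnosis": "flu", "planner": "P1"},
]
features = derive_feature_variables(sample_claims)
```

Each customer ends up with a small dictionary of counts, which is the kind of numeric representation the later clustering step would take as input.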

The unfair claim pattern classification unit 120 may classify new patterns of unfair claims using an unsupervised-learning clustering algorithm that takes the feature variables as input. Unsupervised learning refers to algorithms that learn by analyzing or clustering the data itself rather than from a separately constructed training set. As this is well known, a detailed description is omitted. The unfair claim pattern classification unit 120 may cluster the feature variables into a plurality of claim patterns based on the frequency of the insurance claim history data. For example, the unfair claim pattern classification unit 120 may classify new patterns of unfair claims using at least one of the K-means clustering algorithm, the SOM (Self-Organizing Map) algorithm, and the EM & Canopy algorithm. The K-means clustering algorithm is a traditional technique that iteratively partitions the target population into K clusters based on the mean distance (similarity), and the SOM algorithm is an artificial neural network-based technique that clusters by learning the input patterns of the training set as weights. The EM & Canopy algorithm refers to a technique that, starting from given initial values, iteratively updates the parameter values toward maximum likelihood to form clusters.
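As an illustration of the K-means option named above, the following is a minimal pure-Python sketch of the standard algorithm (alternating assignment and centroid update). The deterministic initialization with the first `k` points and the two-dimensional sample data are assumptions made only to keep the example small and reproducible; the patent does not specify these details.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-means on a list of equal-length numeric tuples.

    Initial centroids are the first k points (an arbitrary choice for this sketch).
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical feature vectors, e.g. (number of hospitals visited, number of FP changes).
points = [(1, 1), (9, 9), (1, 2), (2, 1), (9, 8), (8, 9)]
labels, centroids = kmeans(points, k=2)
```

In practice a library implementation with better initialization would normally be preferred; the sketch only shows the mechanics of grouping feature vectors into claim patterns.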

The unfair claim pattern classification unit 120 may cluster the feature variables into a plurality of claim patterns through the clustering algorithm based on the frequency of the feature variables. That is, a claim pattern may be a cluster of homogeneous feature variables. The new patterns may then be detected based on the degree of separation between the clusters of claim patterns, that is, the distance between clusters.

FIG. 2 is a diagram showing an example of new pattern detection by the new unfair claim pattern analysis device according to an embodiment of the present application. FIG. 2 shows claim patterns clustered according to the frequency of the feature variables; when feature variables belonging to different claim patterns have similar frequencies, the patterns may be shown in the same or similar colors (blue), as in FIG. 2. In other words, when claim patterns have the same or similar colors, the degree of separation between their clusters can be said to be low. Claim patterns clustered from feature variables with similar frequencies can be judged to be patterns arising from the feature-variable frequencies of normal insurance claims, that is, normal claim patterns rather than unfair claims.

On the other hand, a claim pattern clustered from feature variables whose frequencies differ from those of the similar-frequency feature variables may be shown in a different color (red) from the claim patterns described above, as in FIG. 2. Because the feature-variable frequencies of such a claim pattern differ from those of normal claim patterns (for example, when the number of FP changes is relatively high compared with the feature variables of normal claim patterns), it may be a new pattern of unfair claims and may be detected by the unfair claim pattern classification unit 120.
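One way to read the separation-based detection described here is sketched below. The text only states that new patterns are detected from the inter-cluster distance; measuring each cluster's separation as its mean distance to all other centroids, and the cutoff `separation_threshold`, are assumptions introduced for illustration.

```python
import math

def detect_new_patterns(centroids, separation_threshold):
    """Return indices of clusters whose mean distance to all other centroids
    exceeds the threshold. Requires at least two centroids."""
    flagged = []
    for i, c in enumerate(centroids):
        distances = [math.dist(c, o) for j, o in enumerate(centroids) if j != i]
        separation = sum(distances) / len(distances)
        if separation > separation_threshold:
            flagged.append(i)
    return flagged

# Three centroids that sit close together (the "blue" clusters in FIG. 2 terms)
# and one that is far from the rest (a "red" cluster). Values are hypothetical.
centroids = [(1.0, 1.0), (1.5, 1.2), (1.2, 0.8), (9.0, 7.0)]
candidates = detect_new_patterns(centroids, separation_threshold=5.0)
```

The threshold would have to be tuned on real data; comparing against centroids of known normal claim patterns stored in the database would be an equally plausible variant.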

FIG. 3 is a diagram showing an example of similar pattern group classification by the new unfair claim pattern analysis device according to an embodiment of the present application.

The unfair claim pattern classification unit 120 may classify the new patterns into a plurality of similar pattern groups based on the similarity among the new patterns. Because this similarity can be computed using the clustering algorithm described above, the description is not repeated. FIG. 3 shows an example in which the new patterns are classified into seven similar pattern groups based on their mutual similarity. It is self-evident that the number of similar pattern groups may vary with the similarity among the new patterns. The unfair claim pattern classification unit 120 may also set the risk level of each pattern group based on the number of feature variables, among the feature variables associated with each new pattern, that are at or above the threshold preset for each feature variable. For example, the risk level may be divided into a caution group, a low-risk group, a medium-risk group, and a high-risk group as shown in FIG. 3, but is not limited to these. That is, a pattern group classified as high-risk is likely to be a group containing new patterns in which many feature variables are at or above their thresholds.
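The risk-level assignment can be sketched as follows. The text specifies only that the level depends on how many feature variables reach their preset thresholds; averaging that count over the patterns in a group, the cutoffs of 1, 2, and 3, and the feature names are assumptions made for illustration.

```python
def count_exceeding(pattern, thresholds):
    """Number of feature variables in a pattern at or above their preset threshold."""
    return sum(1 for name, limit in thresholds.items() if pattern.get(name, 0) >= limit)

def risk_level(group, thresholds):
    """Map a pattern group to one of the four levels named in the text.

    The averaging and the cutoffs (1, 2, 3) are hypothetical.
    """
    average = sum(count_exceeding(p, thresholds) for p in group) / len(group)
    if average >= 3:
        return "high"
    if average >= 2:
        return "medium"
    if average >= 1:
        return "low"
    return "caution"

# Hypothetical per-feature thresholds and pattern groups.
thresholds = {"fp_changes": 3, "num_hospitals": 5, "num_diagnoses": 4}
group_a = [
    {"fp_changes": 4, "num_hospitals": 6, "num_diagnoses": 5},
    {"fp_changes": 3, "num_hospitals": 7, "num_diagnoses": 4},
]
group_b = [{"fp_changes": 1, "num_hospitals": 2, "num_diagnoses": 1}]
group_c = [
    {"fp_changes": 5, "num_hospitals": 1, "num_diagnoses": 1},
    {"fp_changes": 3, "num_hospitals": 6, "num_diagnoses": 0},
]
levels = [risk_level(g, thresholds) for g in (group_a, group_b, group_c)]
```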

The discrimination model construction unit 130 may build a new pattern discrimination model using a supervised-learning decision-making algorithm. Supervised learning means training a model using training data constructed in advance. The decision-making algorithm may include, for example, the Decision Tree algorithm, but is not limited to it.

The discrimination model construction unit 130 may learn new pattern discrimination rules using a decision-making algorithm that takes as input the claim patterns classified into pattern groups with assigned risk levels. The discrimination model construction unit 130 may then build a new unfair claim pattern discrimination model that includes the new pattern discrimination rules. The unsupervised-learning classification described above can classify new patterns based on frequency alone, but it cannot tell which variables drove the classification. The new pattern discrimination rules are therefore learned with a supervised-learning decision-making algorithm.
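To show how a decision-tree style learner exposes the variable behind a classification, here is a minimal sketch of a one-level decision tree (a decision stump) that picks the single threshold rule with the lowest weighted Gini impurity. It is a simplified stand-in for a full Decision Tree algorithm, and the feature names and labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(v) / n) ** 2 for v in set(labels))

def learn_stump(rows, labels):
    """Return the (feature, threshold) of the split 'value >= threshold'
    with the lowest weighted Gini impurity, or None if no split is possible."""
    best = None
    best_score = float("inf")
    n = len(rows)
    for feature in rows[0]:
        for threshold in sorted({r[feature] for r in rows}):
            left = [lab for r, lab in zip(rows, labels) if r[feature] < threshold]
            right = [lab for r, lab in zip(rows, labels) if r[feature] >= threshold]
            if not left or not right:
                continue  # a split must put rows on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_score = score
                best = (feature, threshold)
    return best

# Hypothetical training data: claim patterns labeled by the clustering stage.
rows = [
    {"fp_changes": 0, "num_hospitals": 2},
    {"fp_changes": 1, "num_hospitals": 5},
    {"fp_changes": 4, "num_hospitals": 3},
    {"fp_changes": 5, "num_hospitals": 4},
]
labels = ["normal", "normal", "new", "new"]
rule = learn_stump(rows, labels)
```

The learned pair reads as a human-interpretable rule ("a claim pattern with `fp_changes` at or above 4 is a new pattern"), which is the property the paragraph above attributes to the supervised stage.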

A new pattern discrimination rule is a rule by which a claim pattern can be identified as a new pattern based on the feature variables that are at or above their thresholds. The discrimination model construction unit 130 may combine these rules into the new pattern discrimination model. It is self-evident that the new patterns in a pattern group set as high-risk are relatively more likely than those in other risk groups to have feature variables at or above the preset thresholds. However, even a low-risk pattern group may have feature variables that suggest an unfair claim (that is, feature variables at or above the preset thresholds), so the new pattern discrimination rules may be learned with the new patterns of the pattern groups at every risk level as input. The importance of the feature variables may also be taken into account when learning the rules. That is, among the various feature variables, those that carry a high risk of unfair claims may be given relatively high importance, and this importance may be considered when the new pattern discrimination rules are learned. For example, the total number of visits to hospitals of concern may be given higher importance than the number of doctors the customer saw. By considering the importance of the feature variables, the discrimination model construction unit 130 can build a more accurate new pattern discrimination model.

The new pattern discrimination unit 140 may determine the pattern type of a new claim based on the new pattern discrimination model. Based on the new pattern discrimination rules, it may determine which pattern type the new claim belongs to among a normal claim pattern, an unfair claim pattern, and a new pattern. In other words, the feature variables of the new claim are input into the constructed new pattern discrimination model to determine whether the claim matches a pattern judged to be normal, a pattern judged to be an unfair claim, or a new pattern of unfair claims. When the new claim is judged to match an unfair claim pattern or a new unfair claim pattern, the feature variables that acted as factors in that judgment can also be detected.
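A minimal sketch of this discrimination step is given below, assuming the learned model can be expressed as an ordered list of threshold rules. The rule format `(feature, threshold, pattern_type)`, the priority ordering, and the feature names are assumptions for illustration and are not taken from the patent.

```python
def discriminate(claim, rules):
    """Classify a claim and report the feature variables that triggered the result.

    rules: list of (feature, threshold, pattern_type), assumed to be in priority order.
    Returns (pattern_type, contributing_features).
    """
    matched = [(f, t, kind) for f, t, kind in rules if claim.get(f, 0) >= t]
    if not matched:
        return "normal", []
    kind = matched[0][2]  # the highest-priority matching rule decides the type
    factors = [f for f, _, k in matched if k == kind]
    return kind, factors

# Hypothetical rules: one for a known unfair claim pattern, two for a new pattern.
rules = [
    ("duplicate_claims", 3, "unfair"),
    ("fp_changes", 4, "new"),
    ("num_hospitals", 6, "new"),
]
result_new = discriminate({"duplicate_claims": 0, "fp_changes": 5, "num_hospitals": 7}, rules)
result_normal = discriminate({"duplicate_claims": 1, "fp_changes": 0, "num_hospitals": 2}, rules)
result_unfair = discriminate({"duplicate_claims": 4, "fp_changes": 5}, rules)
```

Returning the contributing features alongside the pattern type corresponds to detecting the feature variables that acted as factors in the judgment.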

For example, when health insurance accounts for the majority of the new claims, the patients are in their mid-20s, and the feature variables indicate that most are women, a high distribution of initial and follow-up consultations may be judged to be an unfair claim pattern that warrants attention. In addition, when the proportion of follow-up visits is relatively high compared with initial visits, this may be judged to be a pattern of unfair claims preceding treatment, procedures, or surgery.

FIG. 4 is a diagram showing the flow of a new unfair claim pattern analysis method according to an embodiment of the present application.

The new unfair claim pattern analysis method shown in FIG. 4 may be performed by the new unfair claim pattern analysis device described above with reference to FIGS. 1 to 3. Accordingly, even where omitted below, the description of the new unfair claim pattern analysis device given with reference to FIGS. 1 to 3 applies equally to FIG. 4.

Referring to FIG. 4, in step S410 the data preprocessing unit 110 may derive feature variables by structuring insurance claim history data. According to an embodiment of the present application, the database 150 may record at least one of the claim history data, the feature variables, and the new patterns. The database 150 may also store normal claim patterns. The claim history data may include at least one of claim data, contract data, payment data, insurance agent data, and customer data. Structuring the insurance claim history data means, for example in the case of customer data, quantifying the customer's income level, the number of hospitals the customer has visited, the number of disease reasons for which the customer has filed claims, and so on. A feature variable derived by the data preprocessing unit 110 is a variable related to an insurance claim that can take a numerical value.
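As an illustration of step S410, the following minimal sketch quantifies one customer record into numeric feature variables. The record layout, the field names, and the ordinal income encoding are assumptions made for this example; the patent does not specify them.

```python
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    """Raw customer data. Every field name here is hypothetical."""
    income_band: str        # assumed categories: "low", "middle", "high"
    hospitals_visited: list  # hospital identifiers appearing in the claims
    disease_reasons: list    # disease reasons the customer filed claims for

# Assumed ordinal encoding of the income level.
INCOME_LEVEL = {"low": 0, "middle": 1, "high": 2}

def derive_feature_variables(record: CustomerRecord) -> dict:
    """Turn one raw record into numeric feature variables."""
    return {
        "income_level": INCOME_LEVEL[record.income_band],
        # Count distinct hospitals and distinct disease reasons.
        "num_hospitals": len(set(record.hospitals_visited)),
        "num_disease_reasons": len(set(record.disease_reasons)),
    }

record = CustomerRecord("middle", ["H1", "H2", "H1"], ["back pain", "sprain"])
features = derive_feature_variables(record)
print(features)
```

Each resulting value is numeric, so the output can be passed directly to a clustering step.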

In step S420, the unfair claim pattern classification unit 120 may classify new patterns of unfair claims based on an unsupervised-learning clustering algorithm that takes the feature variables as input. The unfair claim pattern classification unit 120 may cluster the feature variables into a plurality of claim patterns based on the frequency of the insurance claim history data. For example, the unfair claim pattern classification unit 120 may classify new patterns of unfair claims based on at least one of the K-means clustering algorithm, the SOM (Self-Organizing Maps) algorithm, and the EM & Canopy algorithm. The unfair claim pattern classification unit 120 may cluster the feature variables into a plurality of claim patterns through the clustering algorithm based on the frequency of the feature variables. That is, a claim pattern may be a cluster of homogeneous feature variables. In addition, the new pattern may be detected based on the degree of separation between the clusters of claim patterns, that is, the inter-cluster distance.
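The clustering and separation-based detection of step S420 can be sketched as follows. This is a plain K-means written with the standard library only, followed by a check that flags any cluster whose centroid lies farther than a chosen distance from every already-known pattern. The toy feature vectors, the stored centroid of a known pattern, and the `min_separation` value are assumptions for illustration, not values from the patent.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, initial_centroids, iterations=10):
    """Plain K-means with caller-supplied initial centroids (deterministic)."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [centroid(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

def detect_new_patterns(centroids, known_centroids, min_separation):
    """Indices of clusters farther than min_separation from every known pattern."""
    return [i for i, c in enumerate(centroids)
            if all(distance(c, k) > min_separation for k in known_centroids)]

# Toy feature vectors: (hospitals visited, duplicate claims for one disease name).
claims = [(1, 0), (2, 1), (1, 1), (2, 0), (8, 9), (9, 8), (9, 9)]
centroids, clusters = kmeans(claims, initial_centroids=[claims[0], claims[4]])

known_patterns = [(1.5, 0.5)]  # hypothetical stored centroid of a normal pattern
new_ids = detect_new_patterns(centroids, known_patterns, min_separation=5.0)
print(new_ids)
```

Supplying the initial centroids keeps the run reproducible; a production system would more likely use a library implementation with several random restarts.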

In addition, the unfair claim pattern classification unit 120 may classify the new patterns into a plurality of similar pattern groups based on the similarity between the new patterns. Because the similarity between new patterns can be computed with the clustering algorithm described above, a redundant description is omitted.
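According to the claims, a risk level is set for each pattern group based on how many of the feature variables associated with each new pattern are above a threshold set for that variable. A minimal sketch of that counting rule is given below. The threshold values, the variable names other than the two named in the claims, the use of the worst member to represent a group, and the mapping from counts to "low", "medium", and "high" are all assumptions for illustration.

```python
# Hypothetical per-variable thresholds.
THRESHOLDS = {
    "duplicate_claims_same_disease": 3,
    "num_disease_reasons": 5,
    "num_hospitals": 4,
}

def count_exceeding(pattern: dict) -> int:
    """Number of feature variables in one pattern that are above their threshold."""
    return sum(1 for name, limit in THRESHOLDS.items()
               if pattern.get(name, 0) > limit)

def risk_level(pattern_group: list) -> str:
    """Assumed rule: the group's level follows its worst member pattern."""
    worst = max(count_exceeding(p) for p in pattern_group)
    if worst >= 2:
        return "high"
    if worst == 1:
        return "medium"
    return "low"

group = [
    {"duplicate_claims_same_disease": 5, "num_disease_reasons": 6, "num_hospitals": 2},
    {"duplicate_claims_same_disease": 1, "num_disease_reasons": 2, "num_hospitals": 1},
]
level = risk_level(group)
print(level)
```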

In step S430, the discrimination model construction unit 130 may construct a new pattern discrimination model based on a supervised-learning decision algorithm. The decision algorithm may include, for example, a Decision Tree algorithm, but is not limited thereto. The discrimination model construction unit 130 may learn new pattern discrimination rules based on a decision algorithm that takes as input the claim patterns classified into pattern groups for which risk levels have been set. The discrimination model construction unit 130 may then construct a new unfair claim pattern discrimination model that includes the new pattern discrimination rules. The importance of the feature variables may be taken into account when learning the new pattern discrimination rules. That is, among the various feature variables, those associated with a high risk of unfair claims may have relatively high importance, and this importance may be considered when the rules are learned. For example, the total number of visits to hospitals of concern may be given higher importance than the number of doctors the customer has seen. By considering the importance of the feature variables, the discrimination model construction unit 130 can construct a more accurate new pattern discrimination model.
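To make the notion of feature importance concrete, the sketch below scores each feature variable by the best Gini impurity reduction a single threshold split on it can achieve. This is the split criterion used when growing a Decision Tree, shown here for one split only (a decision stump) rather than a full tree. The toy rows, labels, and feature names are made up for this example and are not data from the patent.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature):
    """Return (impurity reduction, threshold) of the best binary split on one feature."""
    parent = gini(labels)
    best = (0.0, None)
    values = sorted({r[feature] for r in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if parent - weighted > best[0]:
            best = (parent - weighted, t)
    return best

# Toy claim patterns labelled with the risk level of their pattern group.
rows = [
    {"hospital_visits": 2, "doctors_met": 1},
    {"hospital_visits": 3, "doctors_met": 3},
    {"hospital_visits": 4, "doctors_met": 2},
    {"hospital_visits": 12, "doctors_met": 2},
    {"hospital_visits": 15, "doctors_met": 1},
    {"hospital_visits": 20, "doctors_met": 3},
]
labels = ["low", "low", "low", "high", "high", "high"]

importance = {f: best_split(rows, labels, f)[0] for f in rows[0]}
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

In this toy data the visit count separates the two risk levels perfectly while the number of doctors carries no information, so the visit count ranks first.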

In step S440, the new pattern determination unit 140 may determine the pattern type of a new claim based on the new pattern discrimination model. Based on the new pattern discrimination rule, the new pattern determination unit 140 may determine which pattern type the new claim belongs to: a normal claim pattern, an unfair claim pattern, or a new pattern. In other words, the feature variables of the new claim are input into the constructed new pattern discrimination model to determine whether the claim matches a pattern judged to be normal, a pattern judged to be an unfair claim, or a new pattern of unfair claims. In addition, when the new claim is determined to match a pattern judged to be an unfair claim or a new pattern of unfair claims, the feature variables that acted as factors in that determination may be detected.
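One way to picture step S440 is as a walk down a rule tree that records which feature variables were tested on the way to the verdict. In the sketch below the tree is hand-written and stands in for rules that would be learned in step S430; its structure, thresholds, and label names are hypothetical.

```python
# Hand-written stand-in for learned discrimination rules (hypothetical values).
TREE = {
    "feature": "duplicate_claims_same_disease", "threshold": 3,
    "left": {"label": "normal"},
    "right": {
        "feature": "num_disease_reasons", "threshold": 5,
        "left": {"label": "unfair"},
        "right": {"label": "new"},
    },
}

def classify(claim: dict, node: dict = TREE):
    """Return (pattern type, feature variables tested on the decision path)."""
    path = []
    while "label" not in node:
        feature = node["feature"]
        path.append(feature)
        # Go right when the value is above the node's threshold, else left.
        node = node["right"] if claim[feature] > node["threshold"] else node["left"]
    # Contributing factors are reported only for non-normal verdicts.
    factors = path if node["label"] != "normal" else []
    return node["label"], factors

new_claim = {"duplicate_claims_same_disease": 4, "num_disease_reasons": 7}
pattern_type, factors = classify(new_claim)
print(pattern_type, factors)
```

Reporting the tested variables gives a reviewer a starting point for why a claim was flagged, which is the role the description assigns to the detected factors.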

The new unfair claim pattern analysis method according to an embodiment of the present application may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

The foregoing description of the present application is provided for illustration, and those of ordinary skill in the art to which the present application pertains will understand that it can easily be modified into other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in combined form.

The scope of the present application is defined by the claims below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present application.

100: new unfair claim pattern analysis apparatus
110: data preprocessing unit
120: unfair claim pattern classification unit
130: discrimination model construction unit
140: new pattern determination unit
150: database

Claims

A new unfair claim pattern analysis apparatus comprising:
a data preprocessing unit that derives feature variables by structuring insurance claim history data;
an unfair claim pattern classification unit that classifies new patterns of unfair claims based on an unsupervised-learning clustering algorithm that takes the feature variables as input;
a discrimination model construction unit that constructs a new pattern discrimination model based on a supervised-learning decision algorithm; and
a new pattern determination unit that determines a pattern type of a new claim based on the new pattern discrimination model,
wherein the unfair claim pattern classification unit:
clusters the feature variables into a plurality of claim patterns through the clustering algorithm based on the frequency of the feature variables,
detects the new patterns based on the degree of separation between the clusters of the claim patterns, and
classifies the new patterns into a plurality of similar pattern groups based on the similarity between the new patterns,
wherein a risk level of each pattern group is set based on the number of feature variables, among the feature variables associated with each of the new patterns, that are above a threshold set for each feature variable, and
wherein each feature variable is a variable that can take a numerical value related to an insurance claim, and the feature variables include the number of duplicate claims for the same disease name and the number of disease reasons for which a customer has filed claims.

The apparatus of claim 1,
further comprising a database that records at least one of the claim history data, the feature variables, and the new patterns,
wherein the claim history data includes at least one of claim data, contract data, payment data, insurance agent data, and customer data.

delete

The apparatus of claim 1,
wherein the discrimination model construction unit
learns a new pattern discrimination rule based on a decision algorithm that takes as input the new patterns included in the pattern group for each risk level, and constructs the new pattern discrimination model including the new pattern discrimination rule.

The apparatus of claim 6,
wherein the new pattern determination unit
determines, based on the new pattern discrimination rule, which pattern type the new claim belongs to among a normal claim pattern, an unfair claim pattern, and a new pattern.

A new unfair claim pattern analysis method performed by a new unfair claim pattern analysis apparatus, the method comprising:
deriving feature variables by structuring insurance claim history data;
classifying new patterns of unfair claims based on an unsupervised-learning clustering algorithm that takes the feature variables as input;
constructing a new pattern discrimination model based on a supervised-learning decision algorithm; and
determining a pattern type of a new claim based on the new pattern discrimination model,
wherein classifying the new patterns comprises:
clustering the feature variables into a plurality of claim patterns through the clustering algorithm based on the frequency of the feature variables,
detecting the new patterns based on the degree of separation between the clusters of the claim patterns, and
classifying the new patterns into a plurality of similar pattern groups based on the similarity between the new patterns,
wherein a risk level of each pattern group is set based on the number of feature variables, among the feature variables associated with each of the new patterns, that are above a threshold set for each feature variable, and
wherein each feature variable is a variable that can take a numerical value related to an insurance claim, and the feature variables include the number of duplicate claims for the same disease name and the number of disease reasons for which a customer has filed claims.

The method of claim 8,
further comprising recording at least one of the claim history data, the feature variables, and the new patterns,
wherein the claim history data includes at least one of claim data, contract data, payment data, insurance agent data, and customer data.

delete

The method of claim 8,
wherein constructing the new pattern discrimination model comprises
learning a new pattern discrimination rule based on a decision algorithm that takes as input the new patterns included in the pattern group for each risk level, and constructing the new pattern discrimination model including the new pattern discrimination rule.

The method of claim 13,
wherein determining the pattern type comprises
determining, based on the new pattern discrimination rule, which pattern type the new claim belongs to among a normal claim pattern, an unfair claim pattern, and a new pattern.

A computer-readable recording medium having recorded thereon a program for executing the method of any one of claims 8, 9, 13 and 14 on a computer.