JP7245125B2 - Generation device, generation method, and generation program

Info

Publication number: JP7245125B2
Application number: JP2019118136A
Authority: JP
Inventors: 薫樹小林; 洋史近藤; 泰隆長谷川; 裕司鎌田; 俊太郎由井; 秀行伴; 隆秀新家
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2019-06-26
Filing date: 2019-06-26
Publication date: 2023-03-23
Anticipated expiration: 2039-06-26
Also published as: WO2020261869A1; JP2021005191A

Description

The present invention relates to a generation device, a generation method, and a generation program for generating data.

In life insurance enrollment screening, an applicant's future risk of disease onset and hospitalization is predicted from the applicant's health condition. The health condition is represented by multivariate data such as health checkup results and notification information. When changes in the health condition are also taken into account, the risk prediction draws on several years of health data, so the number of dimensions of the multivariate data becomes even larger.

Techniques for time-series data analysis include vector autoregressive models and LSTM (long short-term memory). Another approach treats the value at each time point as an independent variable and analyzes the data with a regression model or a neural network.

Patent Literature 1 discloses a data analysis device that analyzes health conditions over multiple years. This data analysis device stores quantitative data and qualitative data, each having an ID and time information; generates time-series quantitative event data from the quantitative data; generates time-series qualitative event data from the qualitative data; extracts a characteristic portion showing a change from one of the time-series quantitative and qualitative event data; generates a time-series event pattern from the set of event data corresponding to the characteristic portion; associates an ID included in the time-series event pattern with an ID included in the other of the time-series quantitative and qualitative event data; and displays the associated time-series event pattern together with the associated other of the time-series quantitative and qualitative event data.

Patent Literature 1: JP-A-2006-285672

However, when the features needed for an analysis are extracted by exhaustive search, features that the analysis does not need are generated as well, which increases the calculation cost. Such unnecessary features may also adversely affect the intended analysis.

An object of the present invention is to realize efficient and highly accurate data analysis.

A generation device according to one aspect of the invention disclosed in the present application includes a processor that executes a program and a storage device that stores the program. The generation device can access temporal feature information, which indicates temporal features obtained from temporal data, and grouping information, which defines a plurality of groups to which the temporal data should belong. The processor executes: a generation process of generating, based on the temporal feature information, a plurality of pieces of temporal feature data indicating the temporal features from temporal data to be analyzed; a division process of dividing, based on the grouping information, the plurality of pieces of temporal feature data generated by the generation process into the plurality of groups; and a dimension compression process of dimensionally compressing each of the plurality of groups divided by the division process and dimensionally compressing non-temporal data to be analyzed.

According to a representative embodiment of the present invention, efficient and highly accurate data analysis can be realized. Problems, configurations, and effects other than those described above will become clear from the following description of the embodiments.

FIG. 1 is a block diagram showing a hardware configuration example of the generation device.
FIG. 2 is a block diagram showing a functional configuration example of the generation device according to the first embodiment.
FIG. 3 is an explanatory diagram showing an example of the stored contents of the notification information.
FIG. 4 is an explanatory diagram showing an example of the stored contents of the domain knowledge.
FIG. 5 is an explanatory diagram showing an example of temporal features.
FIG. 6 is an explanatory diagram showing an example of divided temporal features.
FIG. 7 is an explanatory diagram showing an example of high-order feature data.
FIG. 8 is an explanatory diagram showing input/output screen example 1.
FIG. 9 is an explanatory diagram showing an example of a notification information input screen.
FIG. 10 is an explanatory diagram showing input/output screen example 2.
FIG. 11 is a flowchart showing an example of a generation processing procedure performed by the generation device.
FIG. 12 is a block diagram showing a functional configuration example of the generation device according to the second embodiment.
FIG. 13 is an explanatory diagram showing an example of a multimodal neural network.
FIG. 14 is an explanatory diagram showing an example of an input/output screen that presents analysis results obtained by the multimodal neural network.

The generation device according to the present invention is described below with reference to the accompanying drawings. This specification uses the prediction of insurance claim payment risk in life insurance underwriting as an example. In underwriting, the future claim payment risk is assessed based on the information disclosed by the contract applicant (hereinafter, notification information), and the application for insurance is approved or declined. The notification information includes health checkup results, medical interview answers, medical history, and the like.

<Hardware configuration example of generation device>
FIG. 1 is a block diagram showing a hardware configuration example of the generation device. The generation device 100 has a processor 101, a storage device 102, an input device 103, an output device 104, and a communication interface (communication IF) 105. The processor 101, the storage device 102, the input device 103, the output device 104, and the communication IF 105 are connected by a bus 106. The processor 101 controls the generation device 100. The storage device 102 serves as a work area for the processor 101. The storage device 102 is also a non-transitory or transitory recording medium that stores various programs and data. Examples of the storage device 102 include ROM (Read Only Memory), RAM (Random Access Memory), HDD (Hard Disk Drive), and flash memory. The input device 103 inputs data. Examples of the input device 103 include a keyboard, a mouse, a touch panel, a numeric keypad, and a scanner. The output device 104 outputs data. Examples of the output device 104 include a display and a printer. The communication IF 105 connects to a network and transmits and receives data.

<Functional configuration example of generation device 100>
FIG. 2 is a block diagram showing a functional configuration example of the generation device 100 according to the first embodiment. The generation device 100 has a determination unit 201, a generation unit 202, a division unit 203, a dimension compression unit 204, a combination unit 205, and an analysis unit 206. Specifically, the determination unit 201, the generation unit 202, the division unit 203, the dimension compression unit 204, the combination unit 205, and the analysis unit 206 are realized, for example, by causing the processor 101 to execute a program stored in the storage device 102 shown in FIG. 1.

The generation device 100 only needs to have at least the generation unit 202, the division unit 203, and the dimension compression unit 204. The determination unit 201, the combination unit 205, and the analysis unit 206 may be realized on another computer that can communicate with the generation device 100.

The generation device 100 also stores notification information 300 and domain knowledge 400 in the storage device 102. The notification information 300 and the domain knowledge 400 may be stored in the generation device 100 in advance, or may be acquired from another computer that can communicate with the generation device 100. The notification information 300 is described in detail first.

FIG. 3 is an explanatory diagram showing an example of the stored contents of the notification information 300. The notification information 300 is the information required for the insurance contract as disclosed by the contract applicant, and it is the data to be analyzed. The notification information 300 has basic notification information 310, health checkup results 320, medical interview results 330, and medical history 340. The basic notification information 310 is basic information about the contract applicant's disclosure. The basic notification information 310 includes a name ID 311, a date of birth 312, and an age 313.

The name ID 311 is identification information that uniquely identifies the contract applicant. The date of birth 312 is the date on which the contract applicant was born. The three entries in FIG. 3 for the contract applicant whose name ID 311 is "0001" show that applicant's analysis target data for the past three years. The age 313 is the number of years elapsed since the contract applicant's date of birth 312. In the examples described later, for the three entries of the contract applicant whose name ID 311 is "0001", the entry with age 313 of "47" is the first year of the time series, "48" is the second year, and "49" is the third year.

The health checkup results 320 are the results of the health checkup that the contract applicant underwent. The health checkup results 320 include weight 321, BMI (Body Mass Index) 322, systolic blood pressure 323, diastolic blood pressure 324, and fasting blood glucose 325. The weight 321 is the contract applicant's body weight. The BMI 322 is a body mass index that expresses a person's degree of obesity and is calculated as weight / (height^2). A smaller BMI 322 value indicates a thinner build, and a larger value indicates a heavier build.
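The BMI calculation above can be sketched as follows. This is a minimal illustration that assumes the standard units of kilograms and metres, which the text does not state; the function name `bmi` is hypothetical.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight divided by height squared.

    Units (kilograms, metres) are an assumption; the source text
    only gives the formula weight / (height^2).
    """
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical applicant: 70 kg, 1.75 m.
example_bmi = bmi(70.0, 1.75)
```

Because height changes little in adults, year-to-year movement in this value tracks weight almost exactly, which is the redundancy the grouping step later exploits.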

The systolic blood pressure 323 is the pressure applied to the wall of the aorta by the blood pushed out when the heart contracts, that is, while blood is being pumped from the heart into the aorta. The diastolic blood pressure 324 is the lowered pressure on the wall of the aorta while blood is returning to the heart: as the heart dilates, blood flows from the aorta into the heart and the volume of blood in the aorta decreases. The fasting blood glucose 325 is the blood glucose level measured in a fasting state.

The medical interview results 330 are the results of the medical interview that the contract applicant underwent. The medical interview results 330 include smoking habit 331, drinking habit 332, and exercise habit 333. The smoking habit 331 is whether the contract applicant smokes, how often, and how much. The drinking habit 332 is whether the contract applicant drinks alcohol, how often, and how much. The exercise habit 333 is whether the contract applicant exercises, how often, and how much.

The medical history 340 is the history of the contract applicant's past medical consultations and hospitalizations. The medical history 340 includes hypertension consultation history 341, hypertension hospitalization history 342, and diabetes consultation history 343. The hypertension consultation history 341 is the history of the contract applicant's medical consultations for hypertension. The hypertension hospitalization history 342 is the history of the contract applicant's hospitalizations for hypertension. The diabetes consultation history 343 is the history of the contract applicant's medical consultations for diabetes.

FIG. 4 is an explanatory diagram showing an example of the stored contents of the domain knowledge 400. The domain knowledge 400 is qualitative information about the various kinds of information included in the notification information 300, such as the health checkup results 320, the medical interview results 330, and the medical history 340, and corresponds to medical knowledge. Specifically, for example, the domain knowledge 400 includes temporal data determination knowledge 410, temporal feature knowledge 420, and temporal feature division knowledge 430.

The temporal data determination knowledge 410 is determination information for deciding whether the contract applicant's notification information 300, which is the data to be analyzed, corresponds to non-temporal data 231 or temporal data 232. Specifically, for example, the temporal data determination knowledge 410 defines which items of the notification information 300 correspond to the non-temporal data 231 and which correspond to the temporal data 232.

The non-temporal data items 411 include the items that correspond to the non-temporal data 231. The non-temporal data 231 is data that does not change over time, or whose change over time carries no meaning. The non-temporal data 231 includes, for example, the smoking habit 331, the drinking habit 332, and the various consultation and hospitalization histories. The temporal data items 412 include the items that correspond to the temporal data 232. The temporal data 232 is time-series data whose change over time is meaningful. The temporal data 232 includes, for example, the weight 321, the BMI 322, the systolic blood pressure 323, the diastolic blood pressure 324, and the fasting blood glucose 325.

The temporal feature knowledge 420 is temporal feature information indicating the temporal features obtained from the temporal data 232. For example, the temporal feature knowledge 420 defines a basic statistic item 421, a change amount item 422, a change rate item 423, and so on. The basic statistic item 421 includes the items that correspond to basic statistics obtained from the temporal data 232. The basic statistics are, for example, the maximum value, minimum value, and average value of the temporal data 232.

The change amount item 422 is the item that corresponds to the change amount, which is the difference between two consecutive values in the temporal data 232. For example, when the temporal data 232 is the yearly weight 321 for years 1 to 3, the change amount (1st and 2nd years) is the difference between the weight 321 in the first year and the weight 321 in the second year, and the change amount (2nd and 3rd years) is the difference between the weight 321 in the second year and the weight 321 in the third year.

The change rate item 423 is the item that corresponds to the change rate, which is the relative change between two consecutive values in the temporal data 232. For example, when the temporal data 232 is the yearly weight 321 for years 1 to 3, the change rate (1st and 2nd years) is the difference between the weight 321 in the first year and the weight 321 in the second year divided by the weight 321 in the first year, and the change rate (2nd and 3rd years) is the difference between the weight 321 in the second year and the weight 321 in the third year divided by the weight 321 in the second year.

The temporal feature division knowledge 430 is grouping information that defines the groups to which the temporal data 232 should belong. Specifically, the temporal feature division knowledge 430 is defined based on, for example, statistical knowledge or medical knowledge. For example, the temporal feature division knowledge 430 includes, each as a group, a body type basic information item 431, a blood pressure test value item 432, a blood glucose test value item 433, a liver function test value item 434, and so on.

The body type basic information item 431 includes the items that correspond to body type basic information. The body type basic information is basic information about the contract applicant's body type. The body type basic information item 431 includes, for example, the age 313, the weight 321, and the BMI 322.

The blood pressure test value item 432 includes the items that correspond to blood pressure test values. The blood pressure test values are test values related to the contract applicant's blood pressure. The blood pressure test value item 432 includes, for example, the systolic blood pressure 323 and the diastolic blood pressure 324.

The blood glucose test value item 433 includes the items that correspond to blood glucose test values. The blood glucose test values are test values related to the contract applicant's blood glucose. The blood glucose test value item 433 includes, for example, the fasting blood glucose 325 and HbA1c.

The liver function test value item 434 includes the items that correspond to liver function test values. The liver function test values are test values related to the contract applicant's liver function. The liver function test value item 434 includes, for example, GOT (glutamic oxaloacetic transaminase), GPT (glutamic pyruvic transaminase), and γ-GTP (γ-glutamyl transpeptidase).

Returning to FIG. 2, the determination unit 201 determines, based on the temporal data determination knowledge 410, whether the notification information 300 corresponds to the temporal data 232 or the non-temporal data 231. Take the entries in the notification information 300 of FIG. 3 whose name ID 311 is "0001" as an example. The age 313, weight 321, BMI 322, systolic blood pressure 323, diastolic blood pressure 324, and fasting blood glucose 325 are included in the temporal data items 412. The determination unit 201 therefore determines that the following values are the temporal data 232 of the contract applicant whose name ID 311 is "0001": "47", "48", "49" for the age 313; "83.4", "86.6", "92.0" for the weight 321; "22.8", "24.3", "26.0" for the BMI 322; "124.9", "128.5", "133.8" for the systolic blood pressure 323; "80.7", "86.1", "90.0" for the diastolic blood pressure 324; and "104.5", "107.2", "110.0" for the fasting blood glucose 325. The determination unit 201 outputs the determined temporal data 232 as a determination result.

The smoking habit 331, drinking habit 332, exercise habit 333, hypertension consultation history 341, hypertension hospitalization history 342, and diabetes consultation history 343 are included in the non-temporal data items 411. The determination unit 201 therefore determines that the following values are the non-temporal data 231 of the contract applicant whose name ID 311 is "0001": "none", "none", "none" for the smoking habit 331; "once a week", "once a week", "twice a week" for the drinking habit 332; "once a week", "once a week", "none" for the exercise habit 333; "none", "none", "none" for the hypertension consultation history 341; "none", "none", "none" for the hypertension hospitalization history 342; and "none", "none", "none" for the diabetes consultation history 343. The determination unit 201 outputs the determined non-temporal data 231 as a determination result.
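The determination step can be sketched as a lookup against two item sets. This is an illustrative sketch, not the patented implementation: the English field names, the set constants, and the function `split_by_temporality` are hypothetical, while the values are those quoted above for name ID "0001".

```python
# Item sets standing in for the temporal data determination knowledge 410.
# The field names are hypothetical labels for the items in FIG. 3.
TEMPORAL_ITEMS = {"age", "weight", "bmi", "systolic_bp",
                  "diastolic_bp", "fasting_glucose"}
NON_TEMPORAL_ITEMS = {"smoking_habit", "drinking_habit", "exercise_habit",
                      "hypertension_consultation",
                      "hypertension_hospitalization",
                      "diabetes_consultation"}


def split_by_temporality(yearly_records):
    """Split one applicant's yearly records into temporal and
    non-temporal series, keyed by item name, in year order."""
    temporal, non_temporal = {}, {}
    for record in yearly_records:
        for item, value in record.items():
            if item in TEMPORAL_ITEMS:
                temporal.setdefault(item, []).append(value)
            elif item in NON_TEMPORAL_ITEMS:
                non_temporal.setdefault(item, []).append(value)
            # Items in neither set (e.g. the name ID) are ignored here.
    return temporal, non_temporal


# Subset of the three yearly entries for name ID "0001".
records_0001 = [
    {"name_id": "0001", "age": 47, "weight": 83.4, "bmi": 22.8,
     "drinking_habit": "once a week", "exercise_habit": "once a week"},
    {"name_id": "0001", "age": 48, "weight": 86.6, "bmi": 24.3,
     "drinking_habit": "once a week", "exercise_habit": "once a week"},
    {"name_id": "0001", "age": 49, "weight": 92.0, "bmi": 26.0,
     "drinking_habit": "twice a week", "exercise_habit": "none"},
]
temporal_0001, non_temporal_0001 = split_by_temporality(records_0001)
```

Keeping the item sets as data rather than code mirrors the text's point that the classification comes from stored domain knowledge and can be replaced without changing the logic.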

The generation unit 202 generates, based on the temporal feature knowledge 420, temporal feature data indicating temporal features from the temporal data 232. For the temporal data 232, the generation unit 202 calculates the maximum value 511, the minimum value 512, the average value, and so on included in the basic statistic item. For the temporal data 232, the generation unit 202 calculates the change amount (1st and 2nd years) 521, the change amount (2nd and 3rd years) 522, and so on included in the change amount item. For the temporal data 232, the generation unit 202 calculates the change rate (1st and 2nd years) 531, the change rate (2nd and 3rd years) 532, and so on included in the change rate item.

Take the entries in the notification information 300 of FIG. 3 whose name ID 311 is "0001" as an example. For the weight 321 values "83.4", "86.6", "92.0" included in the temporal data 232, the maximum value 511 is "92.0", the weight 321 in the third year; the minimum value 512 is "83.4", the weight 321 in the first year; and the average value is "87.3", the mean of "83.4", "86.6", and "92.0".

The change amount (1st and 2nd years) 521 is "3.2", obtained by subtracting "83.4" from "86.6", and the change amount (2nd and 3rd years) 522 is "5.4", obtained by subtracting "86.6" from "92.0". The change rate (1st and 2nd years) 531 is "0.04", obtained by dividing the change amount (1st and 2nd years) 521 of "3.2" by the first-year weight 321 of "83.4", and the change rate (2nd and 3rd years) 532 is "0.06", obtained by dividing the change amount (2nd and 3rd years) 522 of "5.4" by the second-year weight 321 of "86.6". The same applies to items other than the weight 321, such as the BMI 322, the systolic blood pressure 323, the diastolic blood pressure 324, and the fasting blood glucose 325. The generation unit 202 outputs the generation result as temporal feature data 500.
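The worked example above can be reproduced with a short sketch. The function name `temporal_features` and the dictionary keys are hypothetical; the definitions of change amount and change rate follow the text, and the rounding applied for display is an assumption made to match the quoted figures.

```python
from statistics import mean


def temporal_features(values):
    """Compute basic statistics, change amounts, and change rates
    for one temporal series ordered from oldest to newest.

    Change rate divides by the earlier value, so a zero in the
    series would raise ZeroDivisionError; that case is not handled.
    """
    pairs = list(zip(values, values[1:]))  # consecutive (earlier, later)
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean(values),
        "change_amounts": [later - earlier for earlier, later in pairs],
        "change_rates": [(later - earlier) / earlier
                         for earlier, later in pairs],
    }


# Weight 321 of name ID "0001" for years 1 to 3.
weight_features = temporal_features([83.4, 86.6, 92.0])

# Rounded for display in the same precision as the text.
rounded_mean = round(weight_features["mean"], 1)
rounded_amounts = [round(a, 1) for a in weight_features["change_amounts"]]
rounded_rates = [round(r, 2) for r in weight_features["change_rates"]]
```

A series of n yearly values yields n - 1 change amounts and n - 1 change rates, so the feature count per item grows with the number of years covered.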

FIG. 5 is an explanatory diagram showing an example of the temporal feature data 500. The temporal feature data 500 includes basic statistics 510, change amounts 520, and change rates 530. The basic statistics 510 include the maximum value 511, the minimum value 512, the average value (not shown), and so on calculated according to the basic statistic item 421. The change amounts 520 include the change amount (1st and 2nd years) 521, the change amount (2nd and 3rd years) 522, and so on calculated according to the change amount item 422. The change rates 530 include the change rate (1st and 2nd years) 531, the change rate (2nd and 3rd years) 532, and so on calculated according to the change rate item 423.

The basic statistics 510, the change amounts 520, and the change rates 530 are each generated by the generation unit 202 for each type of temporal data 232. The entries 501-1, 501-2, ..., 502-1, 502-2, ..., 503-1, 503-2, ..., 504-1, 504-2, ... in the temporal features shown in FIG. 5 are temporal feature data for a single contract applicant (for example, the contract applicant whose name ID 311 is "0001"). The generation unit 202 generates the entries 501-1, 501-2, ..., 502-1, 502-2, ..., 503-1, 503-2, ..., 504-1, 504-2, ... for each contract applicant.

The entries 501 are temporal feature data that follow the body type basic information item 431. Specifically, for example, the entry 501-1 is temporal feature data indicating the basic statistics, change amounts, and change rates of the weight 321. The entry 501-2 is temporal feature data indicating the basic statistics, change amounts, and change rates of the BMI 322.

The entries 502 are temporal feature data that follow the blood pressure test value item 432. Specifically, for example, the entry 502-1 is temporal feature data indicating the basic statistics, change amounts, and change rates of the systolic blood pressure 323. The entry 502-2 is temporal feature data indicating the basic statistics, change amounts, and change rates of the diastolic blood pressure 324.

The entries 503 are temporal feature data that follow the blood glucose test value item 433. Specifically, for example, the entry 503-1 is temporal feature data indicating the basic statistics, change amounts, and change rates of the fasting blood glucose 325. The entry 503-2 is temporal feature data indicating the basic statistics, change amounts, and change rates of HbA1c.

The entries 504 are temporal feature data that follow the liver function test value item 434. Specifically, for example, the entry 504-1 is temporal feature data indicating the basic statistics, change amounts, and change rates of GOT. The entry 504-2 is temporal feature data indicating the basic statistics, change amounts, and change rates of GPT.

Returning to FIG. 2, the division unit 203 divides, based on the temporal feature division knowledge 430, the plurality of pieces of temporal feature data generated by the generation unit 202 into a plurality of groups, and outputs divided temporal feature data 600. The plurality of pieces of temporal feature data are, for example, the individual values that make up the entries 501-1, 501-2, ..., 502-1, 502-2, ..., 503-1, 503-2, ..., 504-1, 504-2, ... shown in FIG. 5. The plurality of groups are the divided temporal feature data 600-1, 600-2, ..., 600-n (n is an integer of 1 or more). Each of the divided temporal feature data 600-1, 600-2, ..., 600-n is defined by the temporal feature division knowledge 430 shown in FIG. 4 as the body type basic information item 431, the blood pressure test value item 432, the blood glucose test value item 433, the liver function test value item 434, and so on.
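The division step can be pictured as routing each item's feature data to the group named in the grouping information. The sketch below is illustrative only: the group names, field names, and the function `divide_into_groups` are hypothetical stand-ins for the temporal feature division knowledge 430, and the feature values are limited to the change amounts needed for the example.

```python
# Hypothetical encoding of the temporal feature division knowledge 430:
# group name -> items belonging to that group.
DIVISION_KNOWLEDGE = {
    "body_type": ["age", "weight", "bmi"],
    "blood_pressure": ["systolic_bp", "diastolic_bp"],
    "blood_glucose": ["fasting_glucose", "hba1c"],
    "liver_function": ["got", "gpt", "gamma_gtp"],
}


def divide_into_groups(feature_data, knowledge=DIVISION_KNOWLEDGE):
    """Return {group: {item: features}} for one applicant.

    Items missing from feature_data are skipped, and groups with
    no available item are omitted from the result.
    """
    divided = {}
    for group, items in knowledge.items():
        members = {item: feature_data[item]
                   for item in items if item in feature_data}
        if members:
            divided[group] = members
    return divided


# Change amounts only, derived from the values quoted for name ID "0001".
features_0001 = {
    "weight": {"change_amounts": [3.2, 5.4]},
    "bmi": {"change_amounts": [1.5, 1.7]},
    "systolic_bp": {"change_amounts": [3.6, 5.3]},
    "diastolic_bp": {"change_amounts": [5.4, 3.9]},
}
divided_0001 = divide_into_groups(features_0001)
```

Each resulting group corresponds to one piece of divided temporal feature data 600-1 to 600-n and can then be compressed independently.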

The temporal feature division knowledge 430 is defined based on statistical or medical knowledge. (Example 1) BMI 322 is an index calculated from height and weight 321. Height changes little at the ages of typical insurance applicants, so changes in BMI 322 correlate very strongly with changes in weight 321. When the notification information 300 contains several strongly correlated items, they carry redundant information and make data analysis inefficient.
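The redundancy described in Example 1 can be checked numerically. The sketch below uses the standard definition of BMI (weight in kilograms divided by the square of height in metres) and a hand-written Pearson correlation; the height and weight values are hypothetical. With height held constant, BMI is a fixed multiple of weight, so the two series are perfectly correlated.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

height_m = 1.70                       # assumed constant over the years
weights = [62.0, 64.5, 66.0, 69.5]    # hypothetical yearly weights in kg
bmis = [w / height_m ** 2 for w in weights]
r = pearson(weights, bmis)            # very close to 1.0
```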

Applying the temporal feature division knowledge 430 addresses this: the temporal feature data of the redundant items (age 313, weight 321, BMI 322, …) are gathered into the group of divided temporal feature data 600-1 for the body type basic information items 431 (hereinafter, body type group 600-1). The generation device 100 can then dimensionally compress the temporal feature data generated from the values of these items together as the body type group and extract high-dimensional features (hereinafter, high-order features), which makes data analysis more efficient.

(Example 2) Systolic blood pressure 323 and diastolic blood pressure 324 are both known to worsen gradually with age. When blood pressure falls or rises because lifestyle habits improve or deteriorate, the two also fall or rise while keeping their balance. A change in that balance, however, is said to be a sign of hypertensive disease such as arteriosclerosis.

Therefore, by applying the temporal feature division knowledge 430, the temporal feature data of systolic blood pressure 323 and diastolic blood pressure 324 are gathered into the group of divided temporal feature data 600-2 for the blood pressure test value items 432 (hereinafter, blood pressure group 600-2). The generation device 100 then dimensionally compresses the temporal feature data generated from the values of systolic blood pressure 323 and diastolic blood pressure 324 together as the blood pressure group and extracts high-order features, thereby obtaining the composite feature quantities needed for data analysis.

For the same reason, the temporal feature data of fasting blood glucose 325 and HbA1c are gathered into the group of divided temporal feature data 600-3 for the blood glucose test value items 433 (hereinafter, blood glucose group 600-3), and the temporal feature data of GOT, GPT, and γ-GTP are gathered into the group of divided temporal feature data 600-4 for the liver function test value items 434 (hereinafter, liver function group 600-4).

FIG. 6 is an explanatory diagram showing an example of the divided temporal feature data 600. The divided temporal feature data 600 includes the body type group 600-1, the blood pressure group 600-2, the blood glucose group 600-3, and the liver function group 600-4. The body type group 600-1 is a data set containing the temporal feature data that fall under the body type basic information items 431. The blood pressure group 600-2 is a data set containing the temporal feature data that fall under the blood pressure test value items 432. The blood glucose group 600-3 is a data set containing the temporal feature data that fall under the blood glucose test value items 433. The liver function group 600-4 is a data set containing the temporal feature data that fall under the liver function test value items 434.

The body type group 600-1 includes temporal feature data 601-1 of weight 321 and temporal feature data 601-2 of BMI 322. The data sequence of the temporal feature data 601-1 of weight 321 is denoted as vector Ub1-1. The data sequence of the temporal feature data 601-2 of BMI 322 is denoted as vector Ub1-2.

The blood pressure group 600-2 includes temporal feature data 602-1 of systolic blood pressure 323 and temporal feature data 602-2 of diastolic blood pressure 324. The data sequence of the temporal feature data 602-1 of systolic blood pressure 323 is denoted as vector Ub2-1. The data sequence of the temporal feature data 602-2 of diastolic blood pressure 324 is denoted as vector Ub2-2.

The blood glucose group 600-3 includes temporal feature data 603-1 of fasting blood glucose 325 and temporal feature data 603-2 of HbA1c. The data sequence of the temporal feature data 603-1 of fasting blood glucose 325 is denoted as vector Ub3-1. The data sequence of the temporal feature data 603-2 of HbA1c is denoted as vector Ub3-2.

The liver function group 600-4 includes temporal feature data 604-1 of GOT and temporal feature data 604-2 of GPT. The data sequence of the temporal feature data 604-1 of GOT is denoted as vector Ub4-1. The data sequence of the temporal feature data 604-2 of GPT is denoted as vector Ub4-2.

Returning to FIG. 2, the dimension compression unit 204 dimensionally compresses the data input to it and generates high-order feature data 700. Specifically, for example, the dimension compression unit 204 dimensionally compresses each of the plurality of groups divided by the dividing unit 203. That is, the dimension compression unit 204 performs dimension compression on each of the divided temporal feature data 600-1 to 600-n and generates high-order feature data 702-1 to 702-n for the temporal data 232.

The dimension compression unit 204 also performs dimension compression on the non-temporal data 231 and generates high-order feature data 701 for the non-temporal data 231. Extraction of high-order feature data by dimension compression is realized by a known dimension compression method such as principal component analysis (PCA) or a stacked autoencoder. The dimension compression unit 204 may also use a neural network or the like to first expand the number of dimensions of the data, thereby generating high-order feature data, and then perform dimension compression.
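To make the per-group compression concrete, the following is a minimal pure-Python sketch of PCA restricted to the first principal component, computed by power iteration on the sample covariance matrix. It is an illustration only and not the implementation of the dimension compression unit 204; the function name and the sample rows (one row per applicant, holding hypothetical changes in systolic and diastolic blood pressure) are assumptions.

```python
from math import sqrt

def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first principal component of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the direction: one compressed feature per row.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

# Hypothetical blood pressure group: [systolic change, diastolic change] per applicant.
blood_pressure_group = [[10.0, 6.0], [2.0, 1.0], [-4.0, -3.0], [8.0, 5.0]]
direction, scores = first_principal_component(blood_pressure_group)
```

Here two correlated columns are reduced to a single score per applicant. A production system would more likely rely on an established numerical library and keep more than one component.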

FIG. 7 is an explanatory diagram showing an example of high-order feature data. The high-order feature data includes high-order feature data 701 for the non-temporal data 231, body type high-order feature data 702-1 for the temporal data 232, blood pressure high-order feature data 702-2 for the temporal data 232, blood glucose high-order feature data 702-3 for the temporal data 232, and liver function high-order feature data 702-4 for the temporal data 232.

The high-order feature data 701 for the non-temporal data 231, and the body type high-order feature data 702-1, blood pressure high-order feature data 702-2, blood glucose high-order feature data 702-3, and liver function high-order feature data 702-4 for the temporal data 232, each include feature quantity 1, feature quantity 2, … obtained by dimension compression.

In the high-order feature data 701 for the non-temporal data 231, the set of values in the feature quantity 1 column is vector Va1, and the set of values in the feature quantity 2 column is vector Va2. In the body type high-order feature data 702-1 for the temporal data 232, the set of values in the feature quantity 1 column is vector Vb1-1, and the set of values in the feature quantity 2 column is vector Vb1-2.

In the blood pressure high-order feature data 702-2 for the temporal data 232, the set of values in the feature quantity 1 column is vector Vb2-1, and the set of values in the feature quantity 2 column is vector Vb2-2. In the blood glucose high-order feature data 702-3 for the temporal data 232, the set of values in the feature quantity 1 column is vector Vb3-1, and the set of values in the feature quantity 2 column is vector Vb3-2. In the liver function high-order feature data 702-4 for the temporal data 232, the set of values in the feature quantity 1 column is vector Vb4-1, and the set of values in the feature quantity 2 column is vector Vb4-2.

Returning to FIG. 2, the combining unit 205 combines the plurality of groups after dimension compression by the dimension compression unit 204 and generates combined high-order feature data 710. When only the divided temporal feature data 600-1 to 600-n have been dimensionally compressed, the plurality of groups after dimension compression are the high-order feature data 702-1 to 702-n for the temporal data 232. When both the non-temporal data 231 and the divided temporal feature data 600-1 to 600-n have been dimensionally compressed, they are the high-order feature data 701 for the non-temporal data 231 and the high-order feature data 702-1 to 702-n for the temporal data 232.

An example of combination by the combining unit 205 is described with reference to FIG. 7. The combining unit 205 concatenates Va1, Va2, …, Vb1-1, Vb1-2, …, Vb3-1, Vb3-2, …, Vb4-1, Vb4-2, … and generates a high-order feature vector Vall as the combined high-order feature data 710. The number of dimensions of the high-order feature vector Vall is the total number of elements of Va1, Va2, …, Vb1-1, Vb1-2, …, Vb3-1, Vb3-2, …, Vb4-1, Vb4-2, ….
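The combination step amounts to concatenation. The sketch below is a minimal Python illustration for a single applicant, assuming each group's compressed features are held as a list; the group names and values are hypothetical. The length of the result equals the sum of the lengths of its parts, which matches the statement about the number of dimensions of Vall.

```python
def combine(high_order_groups):
    """Concatenate per-group feature vectors for one applicant into V_all."""
    v_all = []
    for vector in high_order_groups.values():  # dicts keep insertion order
        v_all.extend(vector)
    return v_all

# Hypothetical compressed features for one applicant.
high_order_groups = {
    "non_temporal": [0.4, -1.2],
    "body": [0.9, 0.1],
    "blood_pressure": [-0.3, 0.7],
    "blood_glucose": [0.2],
    "liver_function": [1.1, -0.5],
}
v_all = combine(high_order_groups)
```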

Returning to FIG. 2, the analysis unit 206 takes the combination result of the combining unit 205 (the combined high-order feature data 710) as explanatory variables and outputs the corresponding objective variable. For example, the analysis unit 206 performs insurance payment risk analysis and outputs, as objective variables, the future probabilities of events such as death, hospitalization, surgery, and outpatient visits. The analysis unit 206 performs the data analysis using known techniques, for example statistical methods such as multiple regression analysis or machine learning methods such as neural networks. Specifically, for example, the analysis unit 206 generates a learning model using, as training data, combinations of the high-order feature vector Vall obtained from existing notification information 300 and its analysis result, and then inputs the high-order feature vector Vall obtained from new notification information 300 into the learning model to obtain a new analysis result for the new notification information 300.
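The train-then-predict pattern described above can be sketched as follows. This minimal Python example substitutes a small logistic regression trained by stochastic gradient descent for the unspecified "known technique"; it is not the method prescribed by the description, and the feature vectors, labels, learning rate, and epoch count are hypothetical.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    w, b = [0.0] * len(xs[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(model, x):
    """Estimated probability of the event for one feature vector."""
    w, b = model
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

# Hypothetical V_all vectors from existing applicants, with 1 = event occurred.
xs = [[-1.0, -0.5], [-0.8, 0.1], [0.9, 0.4], [1.2, -0.2]]
ys = [0, 0, 1, 1]
model = train_logistic(xs, ys)
risk_low = predict(model, xs[0])
risk_high = predict(model, xs[2])
```

A new applicant's Vall would be passed to predict in the same way to obtain an estimated event probability.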

<Screen example>
FIG. 8 is an explanatory diagram showing input/output screen example 1. The input/output screen 800 is displayed on a display that is an example of the output device 104 of the generation device 100, or on the display of another computer that can communicate with the generation device 100.

The input/output screen 800 includes a notification information read button 801, a domain knowledge read button 802, a feature extraction method selection pull-down 803, an analysis method selection pull-down 804, an analysis execution button 805, and an execution result display area 806. The notification information read button 801 is a button pressed with the input device 103. When the notification information read button 801 is pressed, the notification information 300 of the contract applicant stored in the storage device 102 is read.

Instead of pressing the notification information read button 801, the notification information 300 can also be entered with the input device 103 on a notification information input screen displayed by pressing a notification information input button 807. FIG. 9 is an explanatory diagram showing an example of the notification information input screen. The notification information input screen 900 includes a health checkup result input area 901 and an interview result input area 902. In the health checkup result input area 901, values such as systolic blood pressure 323, diastolic blood pressure 324, and fasting blood glucose 325 can be set with the input device 103. In the interview result input area 902, the presence or absence of a smoking habit 331, a drinking habit 332, an exercise habit 333, and the like can be set with the input device 103.

Returning to FIG. 8, the domain knowledge read button 802 is a button pressed with the input device 103. When the domain knowledge read button 802 is pressed, the domain knowledge 400 stored in the storage device 102 is read. Alternatively, pressing a domain knowledge input button 808 displays a setting screen (not shown) for entering domain knowledge. On the setting screen, the contents of the temporal data determination knowledge 410, the temporal feature knowledge 420, and the temporal feature division knowledge 430 can be added to, changed, or deleted with the input device 103.

The feature extraction method selection pull-down 803 is a button that, when operated with the input device 103, displays a pull-down list of feature extraction methods and lets the user select one of them. The feature extraction methods include, for example, known dimension compression methods such as the PCA and stacked autoencoder mentioned above. For example, when PCA is selected, the generation device 100 performs dimension compression with PCA.

The analysis method selection pull-down 804 is a button that, when operated with the input device 103, displays a pull-down list of analysis methods and lets the user select one of them. The analysis methods include, for example, known methods such as the statistical methods (for example, multiple regression analysis) and machine learning methods (for example, neural networks) mentioned above. For example, when multiple regression analysis is selected, the generation device 100 performs data analysis with multiple regression analysis.

The analysis execution button 805 is a button pressed with the input device 103. When the analysis execution button 805 is pressed, the generation device 100 loads the notification information 300 and the domain knowledge 400 from the storage device 102, performs dimension compression with the method selected in the feature extraction method selection pull-down 803, and executes data analysis with the method selected in the analysis method selection pull-down 804. The execution result display area 806 is the area that displays the result of the data analysis executed when the analysis execution button 805 was pressed.

FIG. 10 is an explanatory diagram showing input/output screen example 2. In FIG. 10, an analysis result 1000 for each name ID 311 is displayed in the execution result display area 806.

<Generation processing procedure example>
FIG. 11 is a flowchart showing an example of the generation processing procedure performed by the generation device 100. When the analysis execution button 805 is pressed, the generation device 100 reads the notification information 300 and the domain knowledge 400 (step S1101), and the determination unit 201 determines, for each piece of the contract applicant's analysis target data in the notification information 300, whether it is temporal data 232 or non-temporal data 231 (step S1102). For data determined to be temporal data 232 (step S1102: Yes), the generation device 100 generates temporal feature data 500 with the generation unit 202 (step S1103) and divides the plurality of temporal feature data into a plurality of groups with the dividing unit 203 (step S1104).

For data determined to be non-temporal data 231 (step S1102: No), the generation device 100 extracts the high-order feature data 701 for the non-temporal data 231 with the dimension compression unit 204 (step S1105). Similarly, the generation device 100 extracts the high-order feature data 702-1 to 702-n for the temporal data 232 with the dimension compression unit 204 (steps S1106-1 to S1106-n).

The generation device 100 then combines the high-order feature data 701 and 702-1 to 702-n with the combining unit 205 to generate the combined high-order feature data 710 (step S1107), executes data analysis with the analysis unit 206 (step S1108), and displays the analysis result 1000 in the execution result display area 806 of the input/output screen 800. This completes the series of generation processing.
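The order of steps S1102 to S1107 can be traced in a short Python sketch. Every callable passed in below is a hypothetical stand-in: the temporal test simply checks for a list, the feature function returns an amount of change and a mean, and the "compress" function is a placeholder that sums its input rather than a real dimension compression method. The analysis of step S1108 is left out.

```python
def run_generation(records, is_temporal, make_features, grouping, compress):
    """Follow steps S1102 to S1107 for one applicant's records."""
    temporal = {k: v for k, v in records.items() if is_temporal(k, v)}       # S1102: Yes
    non_temporal = {k: v for k, v in records.items() if k not in temporal}   # S1102: No
    features = {k: make_features(v) for k, v in temporal.items()}            # S1103
    groups = {g: [x for item in items if item in features for x in features[item]]
              for g, items in grouping.items()}                              # S1104
    combined = list(compress(list(non_temporal.values())))                   # S1105
    for g in grouping:                                                       # S1106-1 .. S1106-n
        combined.extend(compress(groups[g]))
    return combined                                                          # S1107

# Hypothetical records: scalars are non-temporal, lists are yearly values.
records = {"sex": 1, "weight": [60, 62], "bmi": [21.0, 21.7], "systolic_bp": [120, 130]}
grouping = {"body": ["weight", "bmi"], "blood_pressure": ["systolic_bp"]}
combined = run_generation(
    records,
    is_temporal=lambda key, value: isinstance(value, list),
    make_features=lambda v: [v[-1] - v[0], sum(v) / len(v)],
    grouping=grouping,
    compress=lambda vec: [sum(vec)],  # placeholder only, not real dimension compression
)
```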

Thus, according to the first embodiment, the temporal feature data 500 is grouped according to the temporal feature division knowledge 430, so the generation of feature quantities that are unnecessary for analysis can be suppressed. This makes it possible to generate high-quality explanatory variables and to improve the accuracy of data analysis. In addition, suppressing the generation of feature quantities unnecessary for analysis reduces calculation cost and improves calculation efficiency in data generation and data analysis.

Next, a second embodiment will be described. The description of the second embodiment focuses on the differences from the first embodiment, so components identical to those of the first embodiment are given the same reference numerals and their description is omitted.

FIG. 12 is a block diagram showing an example of the functional configuration of the generation device 100 according to the second embodiment. The generation device 100 according to the first embodiment executes the dimension compression processing by the dimension compression unit 204, the combination processing by the combining unit 205, and the analysis processing by the analysis unit 206 as independent processes. In the second embodiment, a multimodal neural network 1200 is applied in place of the dimension compression unit 204, the combining unit 205, and the analysis unit 206, so that the dimension compression processing, the combination processing, and the analysis processing are executed in succession.

FIG. 13 is an explanatory diagram showing an example of the multimodal neural network 1200. FIG. 14 is an explanatory diagram showing an example of an input/output screen displaying an analysis result 1400 produced by the multimodal neural network 1200.

The multimodal neural network 1200 first performs feature extraction on input vectors classified into a plurality of groups, using neural networks f1, f2, …, fn that branch per group. Next, the multimodal neural network 1200 concatenates the output vectors of the neural networks f1, f2, …, fn, performs feature extraction and analysis with a fully connected network g, and outputs the analysis result at an output layer h. The neural networks f1, f2, …, fn correspond to the dimension compression processing, and the fully connected network g and the output layer h correspond to the combination processing and the analysis processing.

Training of the multimodal neural network 1200 is supervised learning on the input vectors fed to the neural networks f1, f2, …, fn and the analysis results that are the output values. The multimodal neural network 1200 can learn three processes simultaneously: feature extraction for each group, high-order feature extraction over all groups combined, and analysis. Depending on how the network structure and parameters are designed, highly accurate analysis is therefore possible. The operation of the multimodal neural network 1200 is expressed, for example, by equation (1) shown in FIG. 13.

h = g(f(Ua), …, f(Ub1-1), f(Ub1-2), …, f(Ub3-1), f(Ub3-2), …, f(Ub4-1), f(Ub4-2), …) … (1)

In equation (1), the vector Ua is the vector representation of the non-temporal data 231. Applying the function f to the vector Ua generates the vectors Va1, Va2, … of the high-order feature data 701 for the non-temporal data 231. Likewise, applying the function f to the vectors Ub1-1, Ub1-2, … generates the vectors Vb1-1, Vb1-2, … of the high-order feature data 702-1 for the temporal data 232.

Applying the function f to the vectors Ub2-1, Ub2-2, … generates the vectors Vb2-1, Vb2-2, … of the high-order feature data 702-2 for the temporal data 232. Applying the function f to the vectors Ub3-1, Ub3-2, … generates the vectors Vb3-1, Vb3-2, … of the high-order feature data 702-3 for the temporal data 232. Applying the function f to the vectors Ub4-1, Ub4-2, … generates the vectors Vb4-1, Vb4-2, … of the high-order feature data 702-4 for the temporal data 232.
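The structure of equation (1) can be illustrated by a forward pass written in plain Python: one small branch network per group, concatenation of the branch outputs, a fully connected layer g, and an output layer h. The layer sizes, activation functions (tanh and sigmoid), and hand-picked weights are assumptions for illustration; the description does not specify them, and a real network would learn its weights by supervised training.

```python
from math import exp, tanh

def dense(weights, bias, x, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(b + sum(w * xi for w, xi in zip(row, x)))
            for row, b in zip(weights, bias)]

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

def forward(group_inputs, branches, g_layer, h_layer):
    """Apply the branch networks, concatenate, then apply g and the output layer h."""
    merged = []
    for x, (w, b) in zip(group_inputs, branches):
        merged.extend(dense(w, b, x, tanh))                 # branch network f_i
    hidden = dense(g_layer[0], g_layer[1], merged, tanh)    # fully connected network g
    return dense(h_layer[0], h_layer[1], hidden, sigmoid)   # output layer h

# Hypothetical inputs for two groups and hand-picked weights.
group_inputs = [[0.2, -0.1], [0.5, 0.3, -0.4]]
branches = [
    ([[0.5, -0.5]], [0.0]),                                 # f1: 2 inputs -> 1 feature
    ([[0.1, 0.2, 0.3], [-0.3, 0.2, 0.1]], [0.0, 0.1]),      # f2: 3 inputs -> 2 features
]
g_layer = ([[0.4, -0.2, 0.1], [0.3, 0.3, -0.1]], [0.0, 0.0])  # 3 merged -> 2 hidden
h_layer = ([[0.7, -0.6]], [0.0])                              # 2 hidden -> 1 output
risk = forward(group_inputs, branches, g_layer, h_layer)
```

Because the branches, g, and h form one differentiable function, a single training loop can adjust all of them together, which is the sense in which the three processes are learned simultaneously.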

Thus, according to the second embodiment, the dimension compression processing by the dimension compression unit 204, the combination processing by the combining unit 205, and the analysis processing by the analysis unit 206 are executed in succession, so the generation processing and the analysis processing can be made faster and more accurate.

The first and second embodiments above were described using the example of predicting insurance payment risk in life insurance underwriting, but they are also applicable to corporate financial analysis. In this case, the notification information 300 is replaced with data listed in securities reports or index data calculated from that data. The temporal feature division knowledge 430 is then, for example, information that groups this data from five viewpoints: profitability, safety, activity, productivity, and growth potential.

Profitability indicates how much profit a company is generating, and includes gross profit margin, operating profit margin, ordinary income to total capital ratio (ROA), and return on equity (ROE). Safety indicates a company's ability to pay, such as its ability to repay bank loans, and includes the current ratio, quick ratio, operating cash flow, investing cash flow, and the like. Activity indicates whether a company uses its capital efficiently to generate high sales, and includes total capital turnover, fixed asset turnover, inventory turnover, and the like.

Productivity indicates whether a company makes efficient use of its employees, facilities, and the like, and includes the value-added ratio to sales, labor share, labor productivity, and the like. Growth potential indicates a company's potential for future growth, and includes the sales growth rate, ordinary profit growth rate, net profit growth rate, and the like. In this way, the generation device 100 according to the first and second embodiments is applicable to various kinds of data analysis.
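For the financial case, the grouping information could be expressed as a simple table. The Python sketch below assumes a dictionary keyed by the five viewpoints; the identifier spellings are hypothetical, while the indicators themselves are those named above.

```python
# Hypothetical encoding of the temporal feature division knowledge 430 for financial data.
FINANCIAL_GROUPING = {
    "profitability": ["gross_profit_margin", "operating_profit_margin", "roa", "roe"],
    "safety": ["current_ratio", "quick_ratio", "operating_cash_flow", "investing_cash_flow"],
    "activity": ["total_capital_turnover", "fixed_asset_turnover", "inventory_turnover"],
    "productivity": ["value_added_ratio", "labor_share", "labor_productivity"],
    "growth": ["sales_growth", "ordinary_profit_growth", "net_profit_growth"],
}

def group_of(indicator, grouping=FINANCIAL_GROUPING):
    """Return the viewpoint an indicator belongs to, or None if it is not listed."""
    for group, indicators in grouping.items():
        if indicator in indicators:
            return group
    return None
```

Swapping this table in for the health-related grouping would leave the generation, division, and compression steps unchanged, which is the sense in which the device applies to other kinds of data.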

The present invention is not limited to the embodiments described above, and includes various modifications and equivalent configurations within the spirit of the appended claims. For example, the embodiments above have been described in detail to explain the present invention clearly, and the present invention is not necessarily limited to configurations having all of the elements described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment. The configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations may be added, deleted, or substituted.

Each of the configurations, functions, processing units, processing means, and the like described above may be realized partly or wholly in hardware, for example by designing them as integrated circuits, or in software, by having the processor 101 interpret and execute programs that realize the respective functions.

Information such as the programs, tables, and files that realize each function can be stored in a storage device such as a memory, hard disk, or SSD (Solid State Drive), or on a recording medium such as an IC (Integrated Circuit) card, SD card, or DVD (Digital Versatile Disc).

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines required for implementation are necessarily shown. In practice, almost all of the components may be considered to be interconnected.

100 generation device
101 processor
102 storage device
201 determination unit
202 generation unit
203 division unit
204 dimension compression unit
205 combining unit
206 analysis unit
300 notification information
400 domain knowledge
500 temporal feature data
600 divided temporal feature data
700 higher-order feature data
710 combined higher-order feature data
1200 multimodal neural network

Claims

1. A generation device comprising a processor that executes a program and a storage device that stores the program,
the generation device being able to access temporal feature information indicating temporal features obtained from temporal data, and grouping information defining a plurality of groups to which the temporal data should belong,
wherein the processor executes:
a generation process of generating, based on the temporal feature information, a plurality of pieces of temporal feature data indicating the temporal features from temporal data to be analyzed;
a division process of dividing the plurality of pieces of temporal feature data generated by the generation process into the plurality of groups based on the grouping information; and
a dimension compression process of dimensionally compressing each of the plurality of groups divided by the division process, and dimensionally compressing non-temporal data to be analyzed.
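
For illustration only, the three processes recited above can be sketched in Python. Everything named here is an assumption rather than part of the claimed device: the feature functions in `TEMPORAL_FEATURES`, the item-to-group mapping in `GROUPING`, the sample measurements, and in particular `compress`, which averages contiguous chunks as a simple stand-in for whatever learned dimension compression an actual implementation would use.

```python
from statistics import mean

# Hypothetical temporal feature information: feature name -> function of a yearly series.
TEMPORAL_FEATURES = {
    "mean": lambda xs: mean(xs),
    "change": lambda xs: xs[-1] - xs[0],
    "max": lambda xs: max(xs),
}

# Hypothetical grouping information: measurement item -> group.
GROUPING = {
    "weight": "obesity",
    "bmi": "obesity",
    "sbp": "blood_pressure",
    "dbp": "blood_pressure",
}


def generate_temporal_features(temporal_data):
    """Generation process: one value per (item, feature name) pair."""
    return {
        (item, name): fn(series)
        for item, series in temporal_data.items()
        for name, fn in TEMPORAL_FEATURES.items()
    }


def divide_into_groups(features):
    """Division process: collect feature values under the group of their item."""
    groups = {}
    for (item, _name), value in sorted(features.items()):
        groups.setdefault(GROUPING[item], []).append(value)
    return groups


def compress(vector, out_dim):
    """Stand-in for dimension compression: average contiguous chunks.

    This only demonstrates the reduction from len(vector) to out_dim values;
    it is not the compression method of the patent.
    """
    n = len(vector)
    if not 0 < out_dim <= n:
        raise ValueError("out_dim must be between 1 and len(vector)")
    bounds = [i * n // out_dim for i in range(out_dim + 1)]
    return [mean(vector[a:b]) for a, b in zip(bounds, bounds[1:])]


# Made-up sample: three yearly measurements per item, plus encoded yes/no answers.
temporal_data = {
    "weight": [70.0, 72.0, 75.0],
    "bmi": [23.0, 23.5, 24.5],
    "sbp": [120.0, 125.0, 135.0],
    "dbp": [80.0, 82.0, 85.0],
}
non_temporal_data = [1.0, 0.0, 0.0, 1.0]

features = generate_temporal_features(temporal_data)
groups = divide_into_groups(features)
compressed_groups = {g: compress(v, 2) for g, v in groups.items()}
compressed_non_temporal = compress(non_temporal_data, 2)
```

In this sketch each group is compressed separately from the non-temporal data, mirroring the order of the generation, division, and dimension compression processes.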

2. The generation device according to claim 1,
wherein the generation device is able to access determination information for determining whether data corresponds to the temporal data or the non-temporal data,
the processor executes a determination process of determining, based on the determination information, whether the data to be analyzed corresponds to the temporal data or the non-temporal data,
in the generation process, the processor treats the data to be analyzed that the determination process has determined to be the temporal data as the temporal data to be analyzed, and generates the plurality of pieces of temporal feature data from the temporal data to be analyzed based on the temporal feature information, and
in the dimension compression process, the processor treats the data to be analyzed that the determination process has determined to be the non-temporal data as the non-temporal data to be analyzed, and dimensionally compresses the non-temporal data to be analyzed.

3. The generation device according to claim 2,
wherein the generation device executes a combining process of combining the non-temporal data to be analyzed after dimension compression by the dimension compression process with the plurality of groups after dimension compression.

4. The generation device according to claim 3,
wherein the processor executes an analysis process of outputting a corresponding objective variable, using the combination result obtained by the combining process as an explanatory variable.

5. The generation device according to claim 4,
wherein the dimension compression process, the combining process, and the analysis process are executed by a multimodal neural network.
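
As a rough illustration of how one network can cover dimension compression, combining, and analysis, the following pure-Python forward pass uses one small encoder per input (each group and the non-temporal data), concatenates the encoder outputs, and feeds them to a single output unit. The class name `MultimodalNet`, the layer sizes, the ReLU and sigmoid activations, and the untrained random weights are all assumptions made for this sketch; the claims do not specify them.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(rng, in_dim, out_dim):
    """Random, untrained weights; a real model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


class MultimodalNet:
    """Hypothetical sketch: per-input encoders, concatenation, one output unit."""

    def __init__(self, input_dims, hidden_dim, seed=0):
        rng = random.Random(seed)
        # One encoder per input performs that input's dimension compression.
        self.encoders = {
            name: init_layer(rng, dim, hidden_dim) for name, dim in input_dims.items()
        }
        self.head = init_layer(rng, hidden_dim * len(input_dims), 1)

    def forward(self, inputs):
        combined = []
        for name, (weights, bias) in self.encoders.items():
            # Combining: concatenate each compressed representation.
            combined.extend(relu(dense(inputs[name], weights, bias)))
        weights, bias = self.head
        # Analysis: map the combined vector to a score between 0 and 1.
        return sigmoid(dense(combined, weights, bias)[0])


net = MultimodalNet({"obesity": 6, "blood_pressure": 6, "non_temporal": 4}, hidden_dim=2)
score = net.forward({
    "obesity": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    "blood_pressure": [0.6, 0.5, 0.4, 0.3, 0.2, 0.1],
    "non_temporal": [1.0, 0.0, 0.0, 1.0],
})
```

Because the weights are untrained, `score` carries no predictive meaning; the sketch only shows the data flow from separate inputs to a single output.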

6. A generation method executed by a generation device comprising a processor that executes a program and a storage device that stores the program,
the generation device being able to access temporal feature information indicating temporal features of temporal data, and grouping information defining a plurality of groups to which the temporal data should belong,
wherein the processor executes:
a generation process of generating, based on the temporal feature information, a plurality of pieces of temporal feature data indicating the temporal features from temporal data to be analyzed;
a division process of dividing the plurality of pieces of temporal feature data generated by the generation process into the plurality of groups based on the grouping information; and
a dimension compression process of dimensionally compressing each of the plurality of groups divided by the division process, and dimensionally compressing non-temporal data to be analyzed.

7. A generation program for causing a processor to execute data generation,
the processor being able to access temporal feature information indicating temporal features of temporal data, and grouping information defining a plurality of groups to which the temporal data should belong,
the generation program causing the processor to execute:
a generation process of generating, based on the temporal feature information, a plurality of pieces of temporal feature data indicating the temporal features from temporal data to be analyzed;
a division process of dividing the plurality of pieces of temporal feature data generated by the generation process into the plurality of groups based on the grouping information; and
a dimension compression process of dimensionally compressing each of the plurality of groups divided by the division process, and dimensionally compressing non-temporal data to be analyzed.