WO2017199445A1

WO2017199445A1 - Data analysis system, method for control thereof, program, and recording medium

Info

Publication number: WO2017199445A1
Application number: PCT/JP2016/065096
Authority: WO
Inventors: 公平松本
Original assignee: 株式会社Ｕｂｉｃ
Priority date: 2016-05-20
Filing date: 2016-05-20
Publication date: 2017-11-23
Also published as: JPWO2017199445A1; JP6748710B2

Abstract

Provided is a data analysis system, which: sets a plurality of evaluation items which are associated with a prescribed case; evaluates training data in which classification information is set which relates to the degree of association with each of the plurality of evaluation items; evaluates subject data for each of the plurality of evaluation items on the basis of the result of the evaluation of the training data; carries out an analysis based on a prescribed prediction model, with the evaluation of the subject data for each of the plurality of evaluation items as input; and outputs prediction information which relates to the occurrence of the prescribed case on the basis of the result of the analysis.

Description

DATA ANALYSIS SYSTEM, ITS CONTROL METHOD, PROGRAM, AND RECORDING MEDIUM

The present invention relates to a data analysis system for analyzing data based on a predetermined case.

In recent years, preventing incidents has become important in various industrial fields. In the medical field in particular, various measures for preventing medical accidents have been studied. As shown in the following patent documents, systems have been disclosed that record incident reports and, based on those reports, manage and prevent medical accidents and the dangerous actions that can lead to them.

JP 2008-165680 A; Japanese Patent No. 3861986; US Patent Application Publication No. 2013/0144816; US Patent Application Publication No. 2011/0295621

Medical accidents include not only accidents caused by the medical practice of doctors, nurses, and other staff, but also accidents arising from the patient's own circumstances, such as patient falls. The former can be prevented to a large extent by improving the quality of medical practice, but the latter, in which factors on the patient side play a large part, are difficult to prevent. Even if the conventional examples described above can predict that an accident may occur on the patient side, they cannot predict it quantitatively. Conventional measures have therefore been limited in practice to coarse ones, such as uniformly restricting the behavior of patients. Accordingly, an object of the present invention is to provide a data analysis technique capable of accurately predicting the occurrence of a predetermined case, such as an accident or dangerous state caused by a person, for example a fall.

The object is achieved by a data analysis system for analyzing data based on a predetermined case, comprising: a memory that stores a plurality of target data to be subjected to data analysis; and a controller that evaluates the plurality of target data. The controller sets a plurality of evaluation items related to the predetermined case; evaluates learning data in which classification information relating to the relevance to each of the plurality of evaluation items is set; evaluates the target data for each of the plurality of evaluation items based on the evaluation result of the learning data; performs an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and outputs prediction information relating to the occurrence of the predetermined case based on the result of the analysis. A control method for the data analysis system, a program therefor, and a recording medium are also provided.

According to the above-described disclosure, it is possible to provide a data analysis technique capable of accurately predicting the occurrence of a predetermined case.

FIG. 1 is a block diagram showing an example of the hardware configuration of the data analysis system. FIG. 2 is a flowchart showing the operation of the data analysis system. FIG. 3 is a graph showing how the objective variable of the logistic regression analysis (the occurrence probability of a fall) changes in time series.

[Data analysis system configuration]
FIG. 1 is a block diagram showing an example of the hardware configuration of the data analysis system according to the present embodiment (hereinafter sometimes simply referred to as the “system”). For example, the system is realized as a computer or a computer system (a system that analyzes data by operating a plurality of computers in an integrated manner) that has any recording medium (for example, a memory or a hard disk) capable of storing data (including digital data and / or analog data), can execute a control program stored in the recording medium, and analyzes data stored at least temporarily in the recording medium.

In the present embodiment, “learning data” is, for example, data that is presented to the user as reference data and is associated with classification information (that is, classified reference data: a combination of reference data and classification information). The learning data may also be referred to as “teacher data” or “training data”. “Evaluation target data” is data that is not associated with classification information (unclassified data that has not been presented to the user as reference data and has not been classified by the user; it may also be called “unknown data”). Here, the “classification information” may be an identification label used for arbitrarily classifying reference data. For example, it may be information that classifies the reference data into an arbitrary number of groups (for example, two) according to the relation between the reference data and a predetermined case (which broadly covers any target whose relevance to data the system evaluates, and whose scope is not limited).

The client device 3 presents a part of the data to the user as reference data. The user, as an evaluator (or viewer), can thus perform input for evaluation / classification of the reference data (giving classification information) via the client device 3. Based on the combination of reference data and classification information (learning data), the server device 2 learns patterns from the data (a pattern here broadly means, for example, the abstract rules, meanings, concepts, styles, distributions, and samples contained in the data, and is not limited to a so-called “specific pattern”), and evaluates the relevance between the evaluation target data and the predetermined case based on the learned patterns.

The management computer 6 executes predetermined management processing for the client device 3, the server device 2, and the storage system 5. The storage system 5 may be composed of, for example, a disk array system, and may include a database 4 that records data and results of evaluation / classification of the data. The server device 2 and the storage system 5 are communicably connected by a DAS (Direct Attached Storage) method or a SAN (Storage Area Network).

Note that the hardware configuration shown in FIG. 1 is merely an example, and the above system may be realized by other hardware configurations. For example, part or all of the processing executed in the server device 2 may be executed in the client device 3, or part or all of the processing executed in the client device 3 may be executed in the server device 2. The storage system 5 may be built into the server device 2. Further, the user can perform input for evaluation / classification of sample data (giving classification information) not only via the client device 3 but also via an input device directly connected to the server device 2. Those skilled in the art will understand that various hardware configurations can realize the system, and that the present invention is not limited to one specific configuration (for example, the configuration illustrated in FIG. 1).

[Data evaluation function]
The system can have a data evaluation function. The data evaluation function evaluates a large amount of evaluation target data (including big data) based on a small amount of manually classified data (learning data). With this function, the system can derive, for example, an index indicating the level of relevance between the evaluation target data and the predetermined case: a numerical value (for example, a score), characters (for example, “high”, “medium”, “low”), and / or symbols (for example, “◎”, “○”, “△”, “x”) that enable the evaluation target data to be ordered. The data evaluation function can be realized by the controller of the server device 2.

When the system derives a score as the index for the evaluation, it can calculate the score by any method. For example, the score may be calculated by various methods used in the field of machine learning or natural language processing (for example, the k-nearest neighbor method, a method using a support vector machine, a method using a neural network, a method that assumes a statistical model for the data (for example, a method using a Gaussian process), and / or a combination of these), or by various methods used in the field of statistics (for example, based on how often a component appears in the data).

The “component” (which may also be referred to as a data element) may be partial data that forms at least part of the data. For example, it may be a morpheme, keyword, sentence, paragraph, and / or metadata (for example, e-mail header information) constituting a document; a partial voice, volume (gain) information, and / or timbre information constituting audio; a partial image, partial pixels, and / or luminance information constituting an image; or a frame image, motion information, and / or 3D information constituting a video.

When the system calculates the score based on the frequency with which a component appears in the data, the following calculation method can be considered, for example. First, the system extracts from the learning data the components that constitute it, and evaluates those components. At this time, the system evaluates, for example, the degree to which each of a plurality of components constituting at least part of the learning data contributes to the combination of data and classification information (in other words, the frequency with which the component appears depending on the classification information). This degree may be rephrased as a weight. As a more specific example, the system evaluates a component using a transmitted information amount (for example, an information amount calculated from a predetermined formula using the appearance probability of the component and the appearance probability of the classification information), and thereby calculates an evaluation value as evaluation information of the component according to the following formula 1.

Here, wgt_{i,0} indicates the initial value of the evaluation value of the i-th component before evaluation, and wgt_{i,L} indicates the evaluation value of the i-th component after the L-th evaluation. γ is an evaluation parameter in the L-th evaluation, and θ is a threshold value in the evaluation. In this way, the system can, for example, evaluate a component as expressing the characteristics of the predetermined classification information more strongly the larger its calculated transmitted information amount is.
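Formula 1 itself is not reproduced in this text, so the following Python sketch does not implement it. It only illustrates the idea of evaluating a component from the appearance probability of the component and the appearance probability of the classification information, under the assumption that the transmitted information amount is taken to be the mutual information between "the component appears in a record" and "the record is classified as related". The function name `component_weights` and the input layout are hypothetical.

```python
import math
from collections import Counter

def component_weights(learning_data):
    """learning_data: list of (components, label) pairs; label 1 = related, 0 = not related.

    Returns {component: mutual information in bits between presence of the
    component and the label}. This stands in for the evaluation value (assumption).
    """
    n = len(learning_data)
    n_related = sum(label for _, label in learning_data)
    seen = Counter()          # number of records containing the component
    seen_related = Counter()  # number of related records containing the component
    for components, label in learning_data:
        for c in set(components):
            seen[c] += 1
            seen_related[c] += label
    weights = {}
    for c in seen:
        # 2x2 contingency table: (component present?, related?) -> record count
        table = {
            (1, 1): seen_related[c],
            (1, 0): seen[c] - seen_related[c],
            (0, 1): n_related - seen_related[c],
            (0, 0): (n - n_related) - (seen[c] - seen_related[c]),
        }
        p_present = {1: seen[c] / n, 0: 1 - seen[c] / n}
        p_label = {1: n_related / n, 0: 1 - n_related / n}
        weights[c] = sum(
            (count / n) * math.log2((count / n) / (p_present[a] * p_label[b]))
            for (a, b), count in table.items() if count > 0
        )
    return weights
```

Mutual information is large both for components that signal relevance and for components that signal its absence, so a real implementation would also need the direction of the association; that choice is left open here.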

Next, the system associates each component with its evaluation value and stores both in an arbitrary memory (for example, the storage system 5). The system then extracts components from the evaluation target data and checks whether each component is stored in the memory. If it is, the system reads the evaluation value associated with the component from the memory and evaluates the evaluation target data based on that evaluation value. As a more specific example, the system can calculate the score by evaluating the following expression using the evaluation values associated with the components that constitute at least part of the evaluation target data.

m_i: appearance frequency of the i-th component; wgt_i: evaluation value of the i-th component
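The score expression referred to above is not reproduced in this text. As a minimal sketch only, the following Python function assumes the simplest form consistent with the two symbols defined here: a sum over components of appearance frequency times stored evaluation value, with components that are not in the memory contributing nothing. The name `score` and the dictionary used as the memory are hypothetical.

```python
from collections import Counter

def score(components, stored_weights):
    """components: components extracted from one evaluation target datum.
    stored_weights: {component: evaluation value} read from memory.

    Assumed form (not the original expression): sum of m_i * wgt_i.
    """
    frequency = Counter(components)  # m_i for each component
    return sum(
        m * stored_weights[c]
        for c, m in frequency.items()
        if c in stored_weights       # components not stored are skipped
    )
```

For example, `score(["dizzy", "dizzy", "walk", "unknown"], {"dizzy": 1.0, "walk": 0.5})` adds 2 × 1.0 and 1 × 0.5 and ignores the component that was never stored.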

The server device 2 may continue (repeat) the extraction and evaluation of components until the recall reaches a predetermined target value. Recall is an index indicating the proportion (coverage) of the data to be discovered that is contained in a given amount of data. For example, a recall of 80% at 30% of all data means that 80% of the data to be discovered for the predetermined case is contained in the top 30% of the data when ordered by the index (score). When a person reviews the data exhaustively without a data analysis system (linear review), the amount of data discovered is proportional to the amount reviewed, so the further the system departs from this proportionality, the better its data analysis performance.
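The recall figure described above can be computed as in the following Python sketch. The function name `recall_at` and its arguments are hypothetical; the only assumption beyond the text is that the reviewed share is rounded to a whole number of records.

```python
def recall_at(scores, relevant, fraction):
    """scores: one score per datum; relevant: True where the datum should be discovered;
    fraction: share of all data reviewed, taken from the top of the ranking."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = round(len(scores) * fraction)           # number of top-ranked data reviewed
    found = sum(1 for i in order[:k] if relevant[i])
    total = sum(1 for r in relevant if r)
    return found / total if total else 0.0
```

With 20 data of which 5 should be discovered, finding 4 of them among the 6 top-ranked data gives a recall of 80% at 30% of all data, whereas a linear review of 30% of the data would be expected to find only about 30% of them.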

The implementation of the data evaluation function described above is merely an example. In other words, as long as the data evaluation function is a function of “evaluating evaluation target data based on learning data”, its specific mode is not limited to one particular configuration (for example, the score calculation method described above).

[Evaluation of Evaluation Target Data by Server Device 2]
The operation by which the server device 2 evaluates the evaluation target data will now be described. FIG. 2 is a flowchart of the processing of the server device 2 (specifically, the controller of the server device 2). Each step can be rephrased as a module or means. The server device 2 acquires one or more data as reference data from the evaluation target data recorded in the storage system 5 (step S300: data acquisition module). Next, the user determines the classification by actually reviewing the reference data, and the server device 2 acquires, from an arbitrary input device, the classification information that the user has input for the reference data (step S302: classification information acquisition module). The server device 2 composes learning data by combining the reference data and the classification information (step S304: learning data configuration module), and extracts components from the learning data (step S306: component extraction module). The controller then evaluates the components (step S308: component evaluation module), associates each component with its evaluation value, and stores both in the storage system 5 (step S310: component storage module). The processing of S300 to S310 corresponds to the “learning phase” (the phase in which the artificial intelligence learns patterns). Note that the learning data may be prepared in advance instead of being created from the reference data.

Next, the server device 2 acquires evaluation target data from the storage system 5 (step S312: evaluation target data acquisition module). The server device 2 further reads out the components and their evaluation values from the storage system 5 and extracts those components from the evaluation target data (step S314: component extraction module). The server device 2 evaluates the evaluation target data based on the evaluation values associated with the components (step S316: evaluation target data evaluation module), and can create ranking information for the plurality of evaluation target data. The higher evaluation target data ranks, the higher its relevance to the predetermined case. The processing from step S312 onward is the “evaluation phase”, in contrast to the learning phase described above. Each process included in the above flowchart is an example and does not indicate a limiting aspect.
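The two phases of the flowchart can be sketched end to end as follows. This is an illustration under stated assumptions, not the patented method: components are whitespace-separated words, the evaluation in S308 is replaced by a simple difference of relative frequencies, and a Python dictionary stands in for the storage system 5. The function names `learn` and `rank` are hypothetical.

```python
from collections import Counter

def learn(learning_data):
    """Learning phase (S304 to S310).

    learning_data: list of (text, label) pairs, label 1 = related, 0 = not related.
    Returns the stored mapping {component: evaluation value}.
    """
    related, unrelated = Counter(), Counter()
    for text, label in learning_data:
        # S306: extract components (here simply whitespace-separated words)
        (related if label == 1 else unrelated).update(text.split())
    n_related = max(sum(related.values()), 1)
    n_unrelated = max(sum(unrelated.values()), 1)
    # S308 (placeholder evaluation, an assumption): relative-frequency difference
    # S310: the returned dictionary plays the role of the stored components
    return {
        c: related[c] / n_related - unrelated[c] / n_unrelated
        for c in set(related) | set(unrelated)
    }

def rank(evaluation_targets, store):
    """Evaluation phase (S312 to S316): score each text and sort by descending score."""
    scored = [
        (sum(store.get(c, 0.0) for c in text.split()), text)
        for text in evaluation_targets
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

The list returned by `rank` corresponds to the ranking information: data nearer the top are judged more relevant to the predetermined case.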

[Operation example of data analysis system]
Next, an operation example of the data analysis system will be described, taking a “patient fall”, a type of medical incident, as the predetermined case. The data analysis system is realized as a fall prediction model that can predict a patient's fall. In the storage system 5, an electronic medical record is recorded for each patient. The structured data of the electronic medical record is recorded in the database 4. Medical personnel such as doctors, nurses, clinical technologists, and pharmacists access the electronic medical record via the client device 3 and record information each time they perform a consultation, examination, diagnosis, treatment, procedure, test, or medication. The electronic medical record is thus updated in time series, and the server device 2 records it in the storage system 5 in time series. For example, the transaction data of an electronic medical record consists of entries such as the electronic medical record as of 8:00 pm on May 1 and the electronic medical record as of 8:00 pm on May 2.

The electronic medical record of each patient may contain, for example, diagnostic information, treatment information, medication information, vital data such as blood pressure and pulse, clinical test data such as blood test results, the patient's symptoms, state, and condition as observed by medical personnel, symptoms reported by the patient to medical personnel, and conversations between medical personnel and the patient.

The aforementioned reference data may be an electronic medical record stored in the storage system 5. The server device 2 extracts a predetermined number of electronic medical records from the storage system 5 and provides them to the client device 3 (S300 in FIG. 2). The medical staff acquires evaluation items related to the fall from the server device 2 via the client device 3 (S304). For example, the following parameters have been found by the inventor as evaluation items regarding falls.
Evaluation items: somnolence, anemia, delirium, fatigue, breathing difficulty, nausea and vomiting, numbness, paralysis, abnormal excretion, light-headedness, poor food intake, pain, number of days to stop eating

As reviewers, medical personnel set, for each of these items, classification information indicating whether or not an electronic medical record is relevant to the item. The evaluation items provided to the medical personnel need not be all of the parameters listed above; they may be a plurality of items that the medical personnel select as related to falls. The parameters listed above are highly independent of one another with respect to falls, and the prediction accuracy for falls improves as the number of parameters increases. A plurality of parameters is therefore preferably selected as evaluation items.

The server device 2 configures learning data as a set of electronic medical records and classification information for each of the plurality of evaluation items (S304). The server device 2 then determines the components and their evaluation values for each of the plurality of evaluation items in accordance with S306 to S310. Next, the server device 2 acquires an electronic medical record from the storage system 5 as evaluation target data, and calculates a score for each of the plurality of evaluation items (S312 to S316).

(Example 1)
In the case where the following information is recorded in the electronic medical record (evaluation target data), the “delirium” score has a high value, and the nausea and vomiting score has a relatively high value.
“Xxxx year xx month xx day Daily observation: The line of sight hangs out. There are occasional remarks.
Although drinking water, there is no noticeable drought, but a slight wheezing is heard near the pharynx after drinking. "

(Example 2)
In the case where the following information is recorded in the electronic medical record, “somnolence” has a high score.
“Appealing sleepiness. The same is true during a family visit.”

(Example 3)
In the case where the following information is recorded in the electronic medical chart, both “easy fatigue” and “nausea and vomiting” have high scores, and “numbness” has a relatively high score.
“It seems to be a lot today.” (Since it ’s late in the evening. Sometimes it ’s easy in the morning. There ’s a little numbness. The effect of not being able to sleep at night, exhaustion due to nausea and vomiting.)

Next, in order to predict a patient's fall, the server device 2 performs an analysis based on a predetermined prediction model, using as input the evaluation of each of a plurality of electronic medical records (target data) for each of the plurality of evaluation items. In a preferred embodiment, the server device 2 uses logistic regression analysis as the predetermined prediction model. It calculates a score for each of the plurality of evaluation items for each of the plurality of electronic medical records, performs logistic regression analysis based on the following mathematical formula, and calculates the probability of a fall (a prediction of the occurrence of a fall), with the objective variable taking the values no fall (0) and fall (1).

β_{0i}, β_{1i}, ..., β_{ki} are regression coefficients, which need only be calculated in advance by the least squares method using historical electronic medical record data. x_{0i}, x_{1i}, ..., x_{ki} are explanatory variables, into each of which the score of an evaluation item is substituted. p_i is the objective variable (the probability of a fall). The server device 2 records p_i in time series and monitors the temporal change in the occurrence probability of a fall. For example, when the currently obtained occurrence probability exceeds a predetermined threshold, or moves to the plus side of the moving average over a predetermined number of days (for example, 7 days) (by a deviation rate of, for example, 25%), the server device 2 determines that the patient may fall and can notify the equipment of the medical personnel (a PC, smartphone, etc.) to that effect. FIG. 3 is a graph showing how the objective variable (the probability of a fall) changes in time series. At the time indicated by the arrow, the server device 2 notifies the medical personnel that a fall may occur for the patient corresponding to the electronic medical record.
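The regression formula is not reproduced in this text, so the following Python sketch assumes the standard logistic form, in which the probability is the logistic function of a linear combination of the scores. The notification rule follows the two conditions described above. The 7-day window and 25% deviation rate are the examples given in the text; the default threshold of 0.5, the separate intercept, the use of the preceding days (excluding the current day) for the moving average, and the function names are assumptions.

```python
import math

def fall_probability(scores, coefficients, intercept=0.0):
    """Standard logistic model (assumed form): p = 1 / (1 + exp(-(b0 + sum(b_j * x_j))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, scores))
    return 1.0 / (1.0 + math.exp(-z))

def should_notify(history, threshold=0.5, window=7, deviation_rate=0.25):
    """history: fall probabilities in time order; the last entry is the current value.

    Notify when the current value exceeds the threshold, or when it lies above the
    moving average of the preceding `window` days by at least `deviation_rate`.
    """
    current = history[-1]
    if current > threshold:
        return True
    previous = history[-(window + 1):-1]
    if len(previous) < window:
        return False                 # not enough history for a moving average
    moving_average = sum(previous) / window
    if moving_average <= 0:
        return False
    return (current - moving_average) / moving_average >= deviation_rate
```

A probability that stays well below the threshold can still trigger a notification under the second condition, which is what lets the rule react to a patient whose risk is rising from a low baseline.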

When a fall has actually occurred, it is known that changes in the evaluation items begin to appear a predetermined number of days before the fall. Therefore, in the logistic regression analysis, the flag of the objective variable (p_i) is set to “1” for the electronic medical records of the three days up to and including the day on which the fall actually occurred, and the regression coefficients are optimized. By performing logistic regression analysis on the current electronic medical record with the optimized regression coefficients, the probability of a fall within a predetermined number of days (for example, 3 days) is calculated as the objective variable. The objective variable obtained by the server device 2 through the logistic regression analysis therefore indicates the probability that the patient will fall within the predetermined number of days. Because the probability of a fall is obtained for a period of some length in this way, both medical personnel and patients can easily prepare for a fall.
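The labelling of historical charts described above can be sketched as follows, reading "three days including the day of the fall" as the fall day and the two days before it. The function name `label_charts` and the example dates are hypothetical.

```python
from datetime import date, timedelta

def label_charts(chart_dates, fall_dates, days=3):
    """Return 1 for each chart dated within the `days` days up to and including a fall, else 0."""
    labels = []
    for chart_day in chart_dates:
        in_window = any(
            timedelta(0) <= fall_day - chart_day < timedelta(days=days)
            for fall_day in fall_dates
        )
        labels.append(1 if in_window else 0)
    return labels

# Example: a fall on May 4 marks the charts of May 2, 3, and 4.
charts = [date(2016, 5, d) for d in range(1, 6)]
labels = label_charts(charts, [date(2016, 5, 4)])
```

Fitting the regression on labels built this way is what makes the model output a probability of falling within the window rather than on a single day.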

Numerical data such as vital data and test results are also known to be related to the occurrence of falls. For example, abnormalities in numerical data (medical data measured from the patient) such as blood pressure, pulse rate, body temperature, percutaneous arterial oxygen saturation (SpO₂), food intake, and the pain NRS (Numerical Rating Scale) index are known to be deeply involved in the occurrence of falls. Applying one or more of these numerical data to the logistic regression analysis as explanatory variables is therefore effective in predicting falls. Drug history data (drug name, efficacy, dose, usage, duration of medication, side effects, etc.) are similarly effective. For example, taking sleep inducers or tranquilizers is related to the occurrence of falls.

According to the data analysis system 1, the occurrence of a fall caused by the patient's behavior is output as an occurrence probability within a predetermined period, so the fall can be effectively predicted and both medical personnel and patients can very easily take measures against it in advance. In addition to patient falls, the data analysis system 1 can predict the occurrence of other accidents and dangerous acts arising from a patient's behavior or circumstances. Furthermore, the data analysis system 1 can predict accidents and dangerous acts arising from the actions and circumstances of operators and drivers not only in the medical field but also in other industrial fields such as transportation, construction, civil engineering, and product manufacturing.

In the above description, the data analysis system obtained the information contributing to the logistic regression from the electronic medical record as scores through predictive coding, but this may be changed as follows. The data analysis system includes a dictionary in its artificial intelligence, analyzes the text data of the electronic medical record (target data) using the dictionary, and sets a flag for each item for which related text is present. For example, if “anemia” appears in the electronic medical record, the anemia flag is set. The data analysis system applies the flag for each of the plurality of evaluation items to the logistic regression analysis as an explanatory variable. The flag for each of the plurality of evaluation items may instead be set by a doctor or nurse when recording information in the electronic medical record.
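The dictionary-based variant can be sketched in a few lines of Python. The dictionary contents and the function name `item_flags` are hypothetical, and plain substring matching is an assumption chosen for brevity.

```python
def item_flags(chart_text, dictionary):
    """dictionary: {evaluation item: list of terms}. Returns {item: 1 or 0}.

    A flag is set when any term for the item appears in the chart text
    (case-insensitive substring match, an assumption).
    """
    text = chart_text.lower()
    return {
        item: int(any(term.lower() in text for term in terms))
        for item, terms in dictionary.items()
    }

# Hypothetical dictionary covering two of the evaluation items
fall_dictionary = {
    "anemia": ["anemia"],
    "somnolence": ["sleepiness", "drowsy"],
}
flags = item_flags("Appealing sleepiness. Anemia noted on blood test.", fall_dictionary)
```

Substring matching also fires on negated statements such as "no anemia", so a practical dictionary would need some handling of negation before the flags are used as explanatory variables.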

Although it has been explained that logistic regression analysis is applied to a fall prediction model, it may be replaced with SVM (support vector machine), decision tree learning, or the like. In the above explanation, the fall of the patient was explained as an incident, but it is not limited to this. Abnormal behavior can be predicted by data analysis.

A period (lag) may be provided for the explanatory variables of the logistic regression analysis. For example, with five explanatory variables and a lag of 3 days, the procedure is as follows.
The score for each of the five evaluation items is obtained using the May 1 chart as input.
The score for each of the five evaluation items is obtained using the May 2 chart as input.
The score for each of the five evaluation items is obtained using the May 3 chart as input.
On May 3, the three sets of five scores are concatenated, and logistic regression analysis is performed on the resulting 15-dimensional explanatory variables.
According to this aspect, since fluctuations in the explanatory variables during the period can be collectively applied to the regression analysis, the overall physical condition change of the patient during this period can be reflected in the fall prediction.
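The construction of the lagged explanatory variables can be sketched as follows. The function name `lagged_features` and the input layout (one list of scores per day, oldest first) are hypothetical.

```python
def lagged_features(daily_scores, lag=3):
    """daily_scores: per-day score lists, oldest first (e.g. five scores per day).

    Returns one feature row per day from the `lag`-th day onward, each row being
    the scores of that day and the preceding lag - 1 days concatenated in time order.
    """
    rows = []
    for end in range(lag, len(daily_scores) + 1):
        window = daily_scores[end - lag:end]
        rows.append([score for day in window for score in day])
    return rows

# Example: charts of May 1 to May 3, five evaluation items each
may_scores = [[0.1] * 5, [0.2] * 5, [0.3] * 5]
rows = lagged_features(may_scores, lag=3)
```

With three days of five scores, the single resulting row has 15 dimensions and is the input for the analysis performed on May 3.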

[Data format processed by the data analysis system]
In the present embodiment, “data” may be any data expressed in a format that can be processed by a computer. The data may be, for example, unstructured data whose structure definition is at least partly incomplete. It broadly includes (without being limited to these examples) document data (for example, e-mails (including attached files and header information), technical documents (broadly, documents explaining technical matters, such as academic papers, patent publications, product specifications, and design drawings), presentation materials, spreadsheets, financial statements, meeting materials, reports, sales documents, contracts, organization charts, business plans, company analysis information, electronic medical records, web pages, blogs, and comments posted on social network services), audio data (for example, recordings of conversation or music), image data (for example, data composed of a plurality of pixels or vector information), and video data (for example, data composed of a plurality of frame images).

For example, when analyzing document data, the system can extract, as components, the morphemes contained in the document data serving as learning data, evaluate those components, and evaluate the relevance between document data and the predetermined case based on the components extracted from the document data serving as evaluation target data. When analyzing audio data, the system may analyze the audio data itself, or may convert the audio data into document data by speech recognition and analyze the converted document data. In the former case, for example, the system divides the audio data into partial voices of a predetermined length to form components, and identifies the partial voices using any voice analysis method (for example, a hidden Markov model or a Kalman filter), thereby analyzing the audio data. In the latter case, the system recognizes the speech using any speech recognition algorithm (for example, a recognition method using a hidden Markov model) and analyzes the recognized data (document data) by the same procedure as described above. When analyzing image data, the system, for example, divides the image data into partial images of a predetermined size to form components, and identifies the partial images using any image recognition method (for example, pattern matching, a support vector machine, or a neural network), thereby analyzing the image data. When analyzing video data, the system, for example, divides each of the frame images contained in the video data into partial images of a predetermined size to form components, and identifies the partial images using any image recognition method (for example, pattern matching, a support vector machine, or a neural network), thereby analyzing the video data.

[Example of implementation using software and hardware]
The control block of the above system may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software using a CPU. In the latter case, the system includes a CPU that executes a program (the control program for the data analysis system), which is software realizing each function; a ROM (Read Only Memory) or storage device (referred to as a “recording medium”) on which the program and various data are recorded so as to be readable by a computer (or CPU); a RAM (Random Access Memory) into which the program is loaded; and the like. The object of the present invention is achieved when the computer (or CPU) reads the program from the recording medium and executes it. As the recording medium, a “non-transitory tangible medium” such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. The program may be supplied to the computer via any transmission medium capable of transmitting it (such as a communication network or a broadcast wave). The present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission. The program can be implemented in any programming language, and any recording medium on which the program is recorded falls within the scope of the present invention.

[Other application examples]
The system is not limited to the fall prediction system described in [Operation example of data analysis system] above. It may be implemented as any artificial intelligence system that analyzes big data (any system capable of evaluating the relevance between data and a predetermined case), for example, a discovery support system, a forensic system, an e-mail monitoring system, a medical application system (for example, a pharmacovigilance support system, clinical trial efficiency system, medical risk hedging system, prognosis prediction system, or diagnosis support system), an Internet application system (for example, a smart mail system, information aggregation (curation) system, user monitoring system, or social media management system), an information leakage detection system, a project evaluation system, a marketing support system, an intellectual property evaluation system, a fraudulent transaction monitoring system, a call center escalation system, or a credit check system. For example, when the system is realized as a discovery support system, the predetermined case is a “lawsuit”, and when it is realized as a forensic system, the predetermined case is an “illegal case” (crime). When it is implemented as another system, the predetermined case depends on the field in which the system is implemented. Depending on the field to which the data analysis system of the present invention is applied, and in consideration of circumstances peculiar to that field, preprocessing may be applied to the data (for example, extracting an important part from the data and making only that part the target of analysis), or the mode of displaying the data analysis result may be changed. Those skilled in the art will understand that a variety of such variations can exist, and all of them fall within the scope of the present invention.

The present invention is not limited to the above-described embodiments, and various modifications can be made within the scope of the claims. Embodiments obtained by appropriately combining the technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, a new technical feature can be formed by combining the technical means disclosed in each embodiment.

1 ... Data analysis system, 2 ... Server device, 3 ... Client device, 4 ... Database, 5 ... Storage system, 6 ... Management computer

Claims

[Claim 1]
A data analysis system for analyzing data based on a predetermined case, comprising:
a memory that stores a plurality of target data to be analyzed; and
a controller that evaluates the plurality of target data,
wherein the controller:
sets a plurality of evaluation items related to the predetermined case;
evaluates learning data in which classification information related to the relevance to each of the plurality of evaluation items is set;
evaluates the target data for each of the plurality of evaluation items based on the evaluation result of the learning data;
performs an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and
outputs prediction information related to the occurrence of the predetermined case based on the result of the analysis.
[Claim 2]
A data analysis system in which the controller:
extracts numerical data related to the predetermined case from the target data; and
performs the analysis based on the numerical data in addition to the evaluation of the target data.
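The extraction of numerical data recited in the preceding claim could be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the source: the measurement names (`body_temp_c`, `pulse_bpm`), the regular expressions, and the helper names `extract_numeric` and `build_features` are all hypothetical, since the source does not say which values are extracted or how.

```python
import re

# Hypothetical patterns: the source does not specify which measurements are
# extracted from the target data or how they appear in the text.
PATTERNS = {
    "body_temp_c": re.compile(r"temp(?:erature)?\s*[:=]?\s*(\d+(?:\.\d+)?)", re.IGNORECASE),
    "pulse_bpm": re.compile(r"pulse\s*[:=]?\s*(\d+)", re.IGNORECASE),
}


def extract_numeric(text):
    """Return {measurement name: value} for each pattern found in the text."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = float(match.group(1))
    return found


def build_features(item_scores, text):
    """Merge per-evaluation-item scores with the extracted numerical data,
    so that both can be passed to the prediction model as one input."""
    features = dict(item_scores)
    features.update(extract_numeric(text))
    return features
```

For instance, `build_features({"gait": 0.7}, "Temp: 37.2, pulse 88")` would yield a single dictionary holding the assumed item score alongside the two extracted measurements.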
[Claim 3]
The data analysis system according to claim 1 or 2, wherein the controller:
provides the user with reference data to be referred to for evaluation of the target data;
sets the classification information in the reference data for each of the plurality of evaluation items based on input from the user;
uses a combination of the reference data and the classification information as the learning data; and
evaluates, as the evaluation of the learning data, the degree to which the constituent elements of the reference data contribute to the combination,
and wherein evaluating the target data includes calculating, as a score, the relevance between each of the plurality of evaluation items and the target data based on the evaluation of the constituent elements.
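The two steps recited above (evaluating how much each constituent element contributes to the classification, then scoring target data from those evaluations) could be sketched as follows for a single evaluation item. This is an assumption-laden illustration: constituent elements are taken to be word tokens, and the weight is a smoothed log-odds ratio chosen here for simplicity. The function names `component_weights` and `score` are hypothetical and the weighting is not claimed to be the one the source uses.

```python
import math
from collections import Counter


def component_weights(learning_data):
    """learning_data: list of (tokens, label) pairs for ONE evaluation item,
    where label is True if the user classified the reference data as relevant.
    Returns a smoothed log-odds weight per token (an assumed weighting)."""
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for tokens, label in learning_data:
        if label:
            pos.update(set(tokens))
            n_pos += 1
        else:
            neg.update(set(tokens))
            n_neg += 1
    vocab = set(pos) | set(neg)
    return {
        t: math.log((pos[t] + 1) / (n_pos + 2)) - math.log((neg[t] + 1) / (n_neg + 2))
        for t in vocab
    }


def score(tokens, weights):
    """Score target data as the sum of the weights of the tokens it contains.
    Tokens unseen in the learning data contribute nothing."""
    return sum(weights.get(t, 0.0) for t in set(tokens))


# Toy learning data for a hypothetical evaluation item ("gait instability").
learning_data = [
    (["unsteady", "gait"], True),
    (["dizzy", "unsteady"], True),
    (["stable", "gait"], False),
    (["stable", "alert"], False),
]
weights = component_weights(learning_data)
```

Running `component_weights` once per evaluation item would give one weight table per item, and `score` would then produce one score per item for each piece of target data.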
[Claim 4]
The data analysis system according to claim 3, wherein:
performing the analysis includes applying the score of the target data for each of the plurality of evaluation items to a logistic regression analysis as an explanatory variable; and
outputting the prediction information includes outputting the occurrence probability of the predetermined case obtained by the logistic regression analysis.
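Using per-item scores as explanatory variables in a logistic regression could look like the sketch below. It is written in plain Python with batch gradient descent so that it stays self-contained. The learning rate, epoch count, toy data, and function names are assumptions made for illustration, and the source does not state how the regression is fitted.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent on the log loss.
    X: list of score vectors (one score per evaluation item).
    y: list of 0/1 outcomes (1 = the predetermined case occurred)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_probability(x, w, b):
    """Occurrence probability of the predetermined case for one score vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Toy data: two hypothetical evaluation-item scores per record.
X = [[0.9, 0.8], [0.8, 0.6], [0.2, 0.1], [0.1, 0.3]]
y = [1, 1, 0, 0]
w, b = fit_logistic(X, y)
```

In practice a library implementation would normally replace the hand-written fitting loop. The point here is only that the model input is the vector of per-item scores and the output is a probability between 0 and 1.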
[Claim 5]
The data analysis system according to any one of claims 1 to 4, wherein:
the controller obtains an electronic medical record from the memory as the target data; and
the prediction information includes a probability that a patient falls, the fall being the predetermined case.
[Claim 6]
The data analysis system according to claim 2, wherein:
the plurality of evaluation items include a parameter related to a patient's fall; and
the numerical data includes medical data measured from a patient.
[Claim 7]
The data analysis system according to claim 4, wherein outputting the occurrence probability by the logistic regression analysis includes outputting the probability that the predetermined case will occur within a predetermined period.
[Claim 8]
A control method for a data analysis system that analyzes data based on a predetermined case, the method causing the data analysis system to execute:
storing a plurality of target data to be analyzed;
setting a plurality of evaluation items related to the predetermined case;
evaluating learning data in which classification information related to the relevance to each of the plurality of evaluation items is set;
evaluating the target data for each of the plurality of evaluation items based on the evaluation result of the learning data;
performing an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and
outputting prediction information related to the occurrence of the predetermined case based on the result of the analysis.
[Claim 9]
A program that causes a computer to analyze data based on a predetermined case, the program causing the computer to realize:
a function of storing a plurality of target data to be analyzed;
a function of setting a plurality of evaluation items related to the predetermined case;
a function of evaluating learning data in which classification information related to the relevance to each of the plurality of evaluation items is set;
a function of evaluating the target data for each of the plurality of evaluation items based on the evaluation result of the learning data;
a function of performing an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and
a function of outputting prediction information related to the occurrence of the predetermined case based on the result of the analysis.
[Claim 10]
A computer-readable recording medium on which the program according to claim 9 is recorded.