WO2017199445A1 - Data analysis system, method for control thereof, program, and recording medium - Google Patents

Data analysis system, method for control thereof, program, and recording medium

Info

Publication number
WO2017199445A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
evaluation
analysis
evaluation items
target data
Prior art date
Application number
PCT/JP2016/065096
Other languages
French (fr)
Japanese (ja)
Inventor
公平 松本
Original Assignee
株式会社Ubic
Application filed by 株式会社Ubic
Priority to JP2018518059A (patent JP6748710B2)
Priority to PCT/JP2016/065096 (publication WO2017199445A1)
Publication of WO2017199445A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • the present invention relates to a data analysis system that analyzes data with respect to a predetermined case.
  • patent literature: JP 2008-165680 A; Japanese Patent No. 3861986; US Patent Application Publication No. 2013/0144816; US Patent Application Publication No. 2011/0295621.
  • an object of the present invention is to provide a data analysis technique capable of accurately predicting the occurrence of a predetermined case, such as an accident or a dangerous state attributable to a person (for example, a fall).
  • the object is achieved by a data analysis system for analyzing data based on a predetermined case, comprising: a memory that stores a plurality of target data to be subjected to data analysis; and a controller that evaluates the plurality of target data.
  • the controller sets a plurality of evaluation items related to the predetermined case; evaluates learning data in which classification information concerning relevance to each of the evaluation items is set; evaluates the target data for each of the evaluation items based on the evaluation result of the learning data; performs an analysis based on a predetermined prediction model, using the evaluation of the target data for each evaluation item as input; and outputs prediction information on the occurrence of the predetermined case based on the result of the analysis. A control method for the data analysis system, a program therefor, and a recording medium are also provided.
  • FIG. 1 is a block diagram showing an example of a hardware configuration of a data analysis system according to the present embodiment (hereinafter sometimes simply referred to as “system”).
  • the system includes any recording medium (for example, a memory or a hard disk) capable of storing data (including digital and/or analog data) and can execute a control program stored in the recording medium.
  • the system is realized as a computer (or as a computer system in which a plurality of computers operate in an integrated manner) that analyzes data stored at least temporarily in the recording medium.
  • “learning data” (training data) is, for example, reference data that has been presented to the user and associated with classification information (that is, classified reference data: a combination of reference data and classification information).
  • the learning data may also be referred to as “teacher data” or “training data”.
  • “evaluation data” (evaluation target data) is data that is not associated with classification information (unclassified data that has not been presented to the user as reference data and has not been classified by the user; it may also be called “unknown data”).
  • the “classification information” may be an identification label used for arbitrarily classifying reference data.
  • the “classification information” may classify the reference data into an arbitrary number of groups (for example, two) in relation to a predetermined case (broadly, any target against which the system evaluates the relevance of data; its scope is not limited).
  • the client device 3 presents a part of the data as reference data to the user.
  • the user, as an evaluator (or viewer), can input an evaluation/classification for the reference data (that is, give classification information) via the client device 3.
  • based on the combinations of reference data and classification information (learning data), the server device 2 learns patterns from the data (broadly including, for example, abstract rules, meanings, concepts, styles, distributions, and samples contained in the data; not limited to a so-called “specific pattern”) and, based on the learned patterns, evaluates the relevance between the evaluation target data and the predetermined case.
  • the management computer 6 executes predetermined management processing for the client device 3, the server device 2, and the storage system 5.
  • the storage system 5 may be composed of, for example, a disk array system, and may include a database 4 that records data and results of evaluation / classification of the data.
  • the server device 2 and the storage system 5 are communicably connected by a DAS (Direct Attached Storage) method or a SAN (Storage Area Network).
  • the hardware configuration shown in FIG. 1 is merely an example, and the above system may be replaced by other hardware configurations.
  • part or all of the processing executed in the server device 2 may be executed in the client device 3, or part or all of the processing executed in the client device 3 may be executed in the server device 2.
  • the storage system 5 may be built into the server device 2.
  • the user can input an evaluation/classification of sample data (give classification information) not only via the client device 3 but also via an input device directly connected to the server device 2.
  • the system can have a data evaluation function.
  • the data evaluation function evaluates a large number of evaluation target data (including big data) based on a small number of data (learning data) classified manually.
  • the system can, for example, derive an index indicating the level of relevance between the evaluation target data and the predetermined case: a numerical value that allows the evaluation target data to be ordered (for example, a score), characters (for example, “high”, “medium”, “low”), and/or symbols (for example, “◎”, “○”, “△”, “×”).
  • the data evaluation function can be realized by the controller of the server device 2.
  • the system can calculate the score by any method.
  • for example, the score may be calculated using various methods from the field of machine learning or natural language processing (for example, the k-nearest neighbor method, a support vector machine, a neural network, a method that assumes a statistical model for the data (such as a Gaussian process), and/or a combination of these), or using various methods from the field of statistics (for example, based on how often a component appears in the data).
  • the “component” (which may be referred to as a data element) may be partial data that forms at least a part of the data.
  • for example, the components may be morphemes, keywords, sentences, paragraphs, and/or metadata (for example, e-mail header information) constituting a document; partial voice, volume (gain) information, and/or timbre information constituting audio; partial images, partial pixels, and/or luminance information constituting an image; or frame images, motion information, and/or 3D information constituting a video.
  • the system extracts the constituent elements constituting the learning data from the learning data, and evaluates the constituent elements.
  • the system evaluates the degree to which each of the components constituting at least part of the learning data contributes to the combination of data and classification information (in other words, the frequency with which the component appears according to the classification information).
  • this degree may also be called a weight.
  • for example, the system evaluates a component using a transmitted information amount (for example, an information amount calculated from a predetermined formula using the appearance probability of the component and the appearance probability of the classification information), and thereby calculates an evaluation value, as evaluation information of the component, according to Formula 1.
  • wgt_{i,0} indicates the initial evaluation value of the i-th component before evaluation.
  • wgt_{i,L} indicates the evaluation value of the i-th component after the L-th evaluation.
  • means an evaluation parameter in the L-th evaluation, and ⁇ means a threshold value in the evaluation.
  • the system associates each component with its evaluation value and stores both in an arbitrary memory (for example, the storage system 5). The system then extracts components from the evaluation target data and checks whether each component is stored in the memory; if it is, the evaluation value associated with the component is read from the memory, and the evaluation target data is evaluated based on that evaluation value.
  • the system can calculate the score from the evaluation values associated with the components that constitute at least part of the evaluation target data, where m_i is the appearance frequency of the i-th component and wgt_i is the evaluation value of the i-th component.
  • the server device 2 may continue (repeat) the extraction and evaluation of components until the recall reaches a predetermined target value.
  • recall is an index indicating the proportion (coverage) of the data to be discovered that is contained in a given amount of data. For example, a recall of 80% at 30% of all data means that 80% of the data to be discovered is contained in the top 30% of the data when ranked by the index (score).
  • because the amount of data discovered is otherwise proportional to the amount a person reviews, the greater the deviation from this proportionality, the better the data analysis performance of the system.
  • the specific mode is not limited to one particular configuration (for example, the aforementioned score calculation method).
  • FIG. 2 is a flowchart of the server device 2 (specifically, the controller of the server device 2).
  • the server device 2 acquires one or more data as reference data from the evaluation target data recorded in the storage system 5 (step S300: data acquisition module). Each step can be rephrased as a module or means.
  • the user determines the classification by actually reviewing the reference data, and the server device 2 acquires the classification information that the user inputs for the reference data from any input device (step S302: classification information acquisition module).
  • the server device 2 composes learning data by combining the reference data and the classification information (step S304: learning data configuration module), and extracts components from the learning data (step S306: component extraction module). The controller then evaluates each component (step S308: component evaluation module), associates the component with its evaluation value, and stores both in the storage system 5 (step S310: component storage module).
  • the processing of S300 to S308 corresponds to the “learning phase” (phase in which artificial intelligence learns a pattern). Note that the learning data may be prepared in advance instead of creating from the reference data.
  • the server device 2 acquires evaluation target data from the storage system 5 (step S312: evaluation target data acquisition module).
  • the server device 2 further reads out the constituent elements and their evaluation values from the storage system 5 and extracts the constituent elements from the evaluation target data (step S314: constituent element extraction module).
  • the server device 2 evaluates the evaluation target data based on the evaluation values associated with the components (step S316: evaluation target data evaluation module), and can create ranking information for the plurality of evaluation target data.
  • the higher an item of evaluation target data ranks, the higher its relevance to the predetermined case.
  • the processing from step S312 onward is the evaluation phase, as opposed to the learning phase described above. Each process included in the above flowchart is an example and does not indicate a limiting aspect.
  • the data analysis system is realized as a fall prediction model that can predict the fall of a patient.
  • an electronic medical record is recorded for each patient.
  • the structured data of the electronic medical record is recorded in the database 4.
  • Medical personnel such as doctors, nurses, clinical technologists, and pharmacists access the electronic medical record via the client device 3 and record information whenever they carry out a consultation, examination, diagnosis, treatment, test, or medication.
  • the electronic medical record is updated in time series.
  • the server device 2 records the electronic medical record in the storage system 5 in time series.
  • for example, the transaction data of an electronic medical record consists of successive snapshots, such as the electronic medical record as of 8:00 pm on May 1 and the electronic medical record as of 8:00 pm on May 2.
  • the electronic medical record for each patient may contain, for example, diagnostic information, treatment information, medication information, vital data such as blood pressure and pulse, clinical analysis data such as blood test results, the patient's symptoms, state, and condition as observed by medical personnel, symptoms reported by the patient to medical personnel, and conversations between medical personnel and the patient.
  • the aforementioned reference data may be an electronic medical record stored in the storage system 5.
  • the server device 2 extracts a predetermined number of electronic medical records from the storage system 5 and provides them to the client device 3 (S300 in FIG. 2).
  • the medical staff acquires evaluation items related to falls from the server device 2 via the client device 3 (S304). For example, the inventor has found the following parameters to be useful as evaluation items for falls: somnolence, anemia, delirium, fatigue, breathing difficulty, nausea and vomiting, numbness, paralysis, abnormal excretion, light-headedness, poor food intake, pain, and number of days to stop eating.
  • the evaluation items provided to the medical personnel need not be all of the parameters described above, and may be a plurality of items that the medical personnel select as related to falls.
  • the parameters described above are highly independent of each other with respect to falls, and the prediction accuracy for falls improves as the number of parameters increases. Therefore, a plurality of parameters are preferably selected as evaluation items.
  • the server device 2 configures learning data as a set of electronic medical records and classification information for each of the plurality of evaluation items (S304). The server device 2 then determines the components and their evaluation values for each of the evaluation items in accordance with S306 to S310. Next, the server device 2 acquires an electronic medical record as evaluation target data from the storage system 5, and calculates a score for each of the plurality of evaluation items (S312 to S316).
  • Example 1: when the following information is recorded in the electronic medical record (evaluation target data), the “delirium” score is high and the “nausea and vomiting” score is relatively high. “Xxxx year xx month xx day Daily observation: The line of sight hangs out. There are occasional remarks. Although drinking water, there is no noticeable drought, but a slight wheezing is heard near the pharynx after drinking.”
  • Example 2: when the following information is recorded in the electronic medical record, the “somnolence” score is high. “Appealing sleepiness. The same is true during a family visit.”
  • Example 3: when the following information is recorded in the electronic medical record, both the “easy fatigue” and “nausea and vomiting” scores are high, and the “numbness” score is relatively high. “It seems to be a lot today.” (Since it’s late in the evening. Sometimes it’s easy in the morning. There’s a little numbness. The effect of not being able to sleep at night, exhaustion due to nausea and vomiting.)
  • the server device 2 performs an analysis based on a predetermined prediction model, using as input the evaluation of each of the plurality of electronic medical records (target data) for each of the plurality of evaluation items.
  • the server device 2 uses logistic regression analysis as the predetermined prediction model. It calculates a score for each of the plurality of evaluation items for each of the plurality of electronic medical records, then performs logistic regression analysis with the objective variable coded as no fall (0) or fall (1), and calculates the probability of falling (that is, predicts the occurrence of a fall).
  • β_{0i}, β_{1i}, …, β_{ki} are the regression coefficients, which need only be calculated in advance by the least squares method using historical electronic medical record data.
  • x_{0i}, x_{1i}, …, x_{ki} are the explanatory variables; the score of each evaluation item is substituted for each.
  • p_i is the objective variable (probability of falling).
  • the server device 2 records p_i in time series. Based on the temporal change in the fall probability, for example when the currently obtained fall probability exceeds a predetermined threshold, or based on a condition defined over a predetermined number of days (for example, 7 days), the server device 2 judges that the patient may fall and can notify a terminal of the medical personnel (PC, smartphone, etc.) accordingly.
  • FIG. 3 is a graph showing characteristics in which the objective variable (probability of falling) changes in time series.
  • the server device 2 notifies medical personnel that a fall may occur for the patient corresponding to the electronic medical record.
  • numerical data such as vital data and test results are known to be related to the occurrence of falls.
  • abnormalities in numerical data such as blood pressure, pulse rate, body temperature, percutaneous arterial oxygen saturation (SpO2), food intake, and the pain NRS index are known to be deeply involved in the occurrence of falls. Therefore, applying one or more of these numerical data as explanatory variables in the logistic regression analysis is effective in predicting the occurrence of falls.
  • drug history data (drug name, efficacy, dose, usage, duration of medication, side effects, etc.) is also relevant: taking sleep inducers and tranquilizers is related to the occurrence of falls.
  • because the data analysis system 1 outputs the occurrence of a fall caused by the patient's behavior as an occurrence probability within a predetermined period, falls can be predicted effectively, and medical personnel and patients can easily take measures against falls in advance. In addition to patient falls, the data analysis system 1 can predict the occurrence of other accidents or dangerous acts arising from a patient's behavior or circumstances. Furthermore, the data analysis system 1 can predict the occurrence of accidents and dangerous acts due to the actions and circumstances of operators and drivers, not only in the medical field but also in other industrial fields such as transportation, construction, civil engineering, and product manufacturing.
  • in the above description, the data analysis system obtains the information contributing to the regression from the electronic medical record as a score through predictive coding, but this may be changed as follows. The data analysis system includes a dictionary in its artificial intelligence, analyzes the text data of the electronic medical record (target data) against the dictionary, and sets a flag if text related to an item is present. For example, if “anemia” appears in the electronic medical record, an anemia flag is set.
  • the data analysis system applies a flag for each of a plurality of evaluation items as an explanatory variable to logistic regression analysis. A flag for each of the plurality of evaluation items may be set by a doctor or a nurse when information is recorded in the electronic medical record.
  • in the above description, logistic regression analysis is applied to the fall prediction model, but the model may instead use a support vector machine (SVM), decision tree learning, or the like.
  • the fall of a patient was described as the incident, but the invention is not limited to this.
  • for example, abnormal behavior can be predicted by data analysis.
  • periods (lags) may be provided for the explanatory variables of the logistic regression analysis. For example, with five explanatory variables and a lag of 3 days, the score for each of the five evaluation items is obtained with the May 1 chart as input, again with the May 2 chart as input, and again with the May 3 chart as input; on May 3, the three sets of five scores are concatenated and a logistic regression analysis is performed on the resulting 15-dimensional explanatory variables. In this way, fluctuations in the explanatory variables during the period are applied to the regression analysis together, so the patient's overall change in physical condition during the period can be reflected in the fall prediction.
  • data may be any data expressed in a format that can be processed by a computer.
  • the data may be, for example, unstructured data whose structure definition is at least partially incomplete, and broadly includes document data (for example, e-mail (including attached files and header information), technical documents (a wide range of documents explaining technical matters, such as academic papers, patent publications, product specifications, and design drawings), presentation materials, spreadsheets, financial statements, meeting materials, reports, sales documents, contracts, organization charts, business plans, company analysis information, electronic medical records, web pages, blogs, and comments posted on social network services), audio data (for example, recordings of conversation or music), image data (for example, data composed of a plurality of pixels or vector information), and video data (for example, data composed of a plurality of frame images); it is not limited to these examples.
  • when analyzing document data, the system extracts, as components, the morphemes contained in the document data serving as learning data, evaluates those components, and evaluates the relevance between document data and the predetermined case based on the components extracted from the document data serving as evaluation target data.
  • when analyzing voice data, the system may analyze the voice data itself, or may convert the voice data into document data by voice recognition and analyze the converted document data.
  • in the former case, the system divides the voice data into partial voices of a predetermined length to form components, and analyzes the voice data by identifying the partial voices with an arbitrary voice analysis method (for example, a hidden Markov model or a Kalman filter).
  • in the latter case, speech is recognized using an arbitrary speech recognition algorithm (for example, a recognition method using a hidden Markov model), and the same procedure as described above is applied to the recognized data (document data).
  • when analyzing image data, the system, for example, divides the image data into partial images of a predetermined size to form components, and identifies the partial images using any image recognition method (for example, pattern matching, a support vector machine, or a neural network).
  • when analyzing video data, the system divides the frame images contained in the video data into partial images of a predetermined size to form components, and analyzes the video data by identifying the partial images with an arbitrary image recognition technique (for example, pattern matching, a support vector machine, or a neural network).
  • the control block of the above system may be realized by a logic circuit (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software using a CPU.
  • in the latter case, the system includes a CPU that executes the instructions of a program (the control program for the data analysis system), which is software that implements each function; a ROM (Read Only Memory) or a storage device (these are referred to as “recording media”) in which the program and various data are recorded so as to be readable by the computer (or CPU); a RAM (Random Access Memory) into which the program is loaded; and the like.
  • the objective of this invention is achieved when a computer (or CPU) reads the said program from the said recording medium and runs it.
  • as the recording medium, a “non-transitory tangible medium” such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used.
  • the program may be supplied to the computer via an arbitrary transmission medium (such as a communication network or a broadcast wave) that can transmit the program.
  • the present invention can also be realized in the form of a data signal embedded in a carrier wave in which the program is embodied by electronic transmission. Note that the above program can be implemented in any programming language. Also, any recording medium that records the above program falls within the scope of the present invention.
  • the system is not limited to the fall prediction system described in the above [Example of operation of data analysis system].
  • for example, the system may be realized as any artificial intelligence system that analyzes big data (any system capable of evaluating the relevance between data and a predetermined case), such as a discovery support system, a forensic system, an e-mail monitoring system, a medical application system (for example, a pharmacovigilance support system, a clinical trial efficiency system, a medical risk hedging system, a prognosis prediction system, or a diagnosis support system), an Internet application system (for example, a smart mail system, an information aggregation (curation) system, a user monitoring system, or a social media management system), an information leakage detection system, a project evaluation system, a marketing support system, an intellectual property evaluation system, a fraudulent transaction monitoring system, a call center escalation system, or a credit check system.
  • for example, when the system is realized as a discovery support system, the predetermined case is a “lawsuit”; when it is realized as a forensic system, the predetermined case is an “illegal case” (crime).
  • the predetermined case depends on the field in which the system is implemented.
  • for example, preprocessing may be applied to the data (for example, extracting an important part from the data and using only that part as the analysis target), or the mode of displaying the data analysis result may be changed. It will be understood by those skilled in the art that a variety of such variations can exist, and all variations fall within the scope of the present invention.
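
The scoring, ranking, and recall ideas described in the list above can be illustrated with a minimal Python sketch. It is not the patent's method: the score here is assumed to be a plain frequency-weighted sum (appearance frequency m_i times evaluation value wgt_i), because the source formula is not reproduced in this text, and the component weights, document tokens, and function names are all hypothetical.

```python
from collections import Counter

def score(tokens, weights):
    """Assumed score: sum of m_i * wgt_i over components with a stored evaluation value."""
    counts = Counter(tokens)
    return sum(m * weights[c] for c, m in counts.items() if c in weights)

def rank(docs, weights):
    """Return document ids ordered from highest to lowest score."""
    return sorted(docs, key=lambda d: score(docs[d], weights), reverse=True)

def recall_at(ranking, relevant, fraction):
    """Share of the relevant documents found in the top `fraction` of the ranking."""
    k = max(1, round(len(ranking) * fraction))
    return len(set(ranking[:k]) & relevant) / len(relevant)

# Hypothetical evaluation values learned for one evaluation item.
weights = {"dizzy": 2.0, "sleepy": 1.5, "stable": -1.0}

# Hypothetical evaluation target data, already split into components.
docs = {
    "a": ["dizzy", "dizzy", "sleepy"],  # 2 * 2.0 + 1.5 = 5.5
    "b": ["stable", "walks"],           # -1.0 ("walks" has no stored value)
    "c": ["sleepy"],                    # 1.5
    "d": ["stable", "stable"],          # -2.0
}

ranking = rank(docs, weights)
print(ranking)                               # ['a', 'c', 'b', 'd']
print(recall_at(ranking, {"a", "c"}, 0.5))   # 1.0
```

In this toy case both relevant records sit in the top half of the ranking, so recall at 50% of the data is 100%, the kind of deviation from proportional review that the text treats as good performance.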

Abstract

Provided is a data analysis system, which: sets a plurality of evaluation items which are associated with a prescribed case; evaluates training data in which classification information is set which relates to the degree of association with each of the plurality of evaluation items; evaluates subject data for each of the plurality of evaluation items on the basis of the result of the evaluation of the training data; carries out an analysis based on a prescribed prediction model, with the evaluation of the subject data for each of the plurality of evaluation items as input; and outputs prediction information which relates to the occurrence of the prescribed case on the basis of the result of the analysis.

Description

データ分析システム、その制御方法、プログラム、及び、記録媒体DATA ANALYSIS SYSTEM, ITS CONTROL METHOD, PROGRAM, AND RECORDING MEDIUM
 本発明は、データを所定事案に基づいて分析するデータ分析システム等に関する。 The present invention relates to a data analysis system for analyzing data based on a predetermined case.
 近年、インシデントを予防することが様々な産業分野において重要視されている。実際に、医療分野でも、医療事故を防ぐために様々な方策が検討され、下記特許文献に示すように、インシデントレポートを記録し、医療事故や医療事故に繋がり得る危険な行為をインシデントレポートに基づいて管理して医療事故を未然に防止するためのシステムが開示されている。 In recent years, the prevention of incidents has become important in various industrial fields. In fact, in the medical field, various measures have been studied to prevent medical accidents, and as shown in the following patent document, incident reports are recorded, and dangerous actions that can lead to medical accidents and medical accidents are based on incident reports. A system for managing and preventing medical accidents is disclosed.
JP 2008-165680 A
Japanese Patent No. 3861986
US Patent Application Publication No. 2013/0144816
US Patent Application Publication No. 2011/0295621
Medical accidents include not only accidents caused by the medical practice of doctors, nurses, and other staff, but also accidents caused by circumstances on the patient's side, such as patient falls. The former can be reduced as far as possible by improving the quality of medical practice by doctors and nurses, but the latter, in which patient-side factors dominate, are inherently difficult to prevent. Even if the conventional examples described above make it possible to foresee that an accident may occur on the patient's side, it has been difficult to predict such accidents quantitatively. In practice, therefore, conventional measures have been limited to rough responses such as uniformly restricting patient behavior. Accordingly, an object of the present invention is to provide a data analysis technique capable of accurately predicting the occurrence of a predetermined case, for example an accident or dangerous state attributable to a person, such as a fall.
The above object is achieved by a data analysis system that analyzes data based on a predetermined case, the system comprising a memory that stores a plurality of target data to be analyzed and a controller that evaluates the plurality of target data, wherein the controller: sets a plurality of evaluation items related to the predetermined case; evaluates learning data in which classification information concerning the relevance to each of the plurality of evaluation items is set; evaluates the target data for each of the plurality of evaluation items based on the evaluation result of the learning data; performs an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and outputs prediction information on the occurrence of the predetermined case based on the result of the analysis. A control method for the data analysis system, a program therefor, and a recording medium are also provided.
The above disclosure makes it possible to provide a data analysis technique capable of accurately predicting the occurrence of a predetermined case.
FIG. 1 is a block diagram showing an example of the hardware configuration of the data analysis system.
FIG. 2 is a flowchart showing the operation of the data analysis system.
FIG. 3 is a graph showing how the objective variable (probability of a fall) of the logistic regression analysis changes over time.
[Data analysis system configuration]
FIG. 1 is a block diagram showing an example of the hardware configuration of the data analysis system according to the present embodiment (hereinafter sometimes abbreviated simply as the "system"). The system is realized, for example, as a computer or a computer system (a system in which a plurality of computers operate in an integrated manner to perform data analysis) that includes an arbitrary recording medium capable of storing data (including digital data and/or analog data), such as a memory or a hard disk, and a controller capable of executing a control program stored in the recording medium, such as a CPU (Central Processing Unit), and that analyzes data stored at least temporarily in the recording medium.
In the present embodiment, "learning data" (training data) may be, for example, data that has been presented to a user as reference data and has classification information associated with it (classified reference data, that is, a combination of reference data and classification information). Learning data may also be called "teacher data" or "training data". "Evaluation target data" (evaluation data) may be data with which no such classification information is associated (unclassified data that has not been presented to the user as reference data and that the user has not classified; it may also be called "unknown data"). The "classification information" may be an identification label used to classify the reference data arbitrarily. For example, it may be information that classifies the reference data into an arbitrary number of groups (for example, two), such as a "Related" label indicating that the reference data is related to a predetermined case and a "Non-Related" label indicating that it is not. Here, the predetermined case broadly covers any subject whose relevance to data the system evaluates, and its scope is not limited.
The client device 3 presents part of the data to the user as reference data. The user, acting as an evaluator (or viewer), can thereby enter an evaluation or classification of the reference data (that is, give classification information) via the client device 3. Based on the combination of reference data and classification information (learning data), the server device 2 learns patterns from the data and, based on the learned patterns, evaluates the relevance between the evaluation target data and the predetermined case. Here, "pattern" refers broadly to, for example, abstract rules, meanings, concepts, styles, distributions, and samples contained in the data, and is not limited to a so-called "specific pattern".
The management computer 6 executes predetermined management processing for the client device 3, the server device 2, and the storage system 5. The storage system 5 is composed of, for example, a disk array system, and may include a database 4 that records data and the results of evaluating and classifying that data. The server device 2 and the storage system 5 are communicably connected by DAS (Direct Attached Storage) or by a SAN (Storage Area Network).
Note that the hardware configuration shown in FIG. 1 is merely an example, and the system may be implemented with other hardware configurations. For example, part or all of the processing executed in the server device 2 may instead be executed in the client device 3, part or all of that processing may be executed in the server device 2, or the storage system 5 may be built into the server device 2. The user may also enter the evaluation or classification of sample data (give classification information) not only via the client device 3 but also via an input device connected directly to the server device 2. Those skilled in the art will understand that many hardware configurations can realize the system, and the system is not limited to one specific configuration (for example, the configuration illustrated in FIG. 1).
[Data evaluation function]
The system can have a data evaluation function. This function evaluates a large amount of evaluation target data (including big data) based on a small amount of manually classified data (learning data). With the data evaluation function, the system can perform the evaluation by deriving, for example, an index indicating how strongly the evaluation target data is related to the predetermined case. The index may be, for example, a numerical value that allows the evaluation target data to be ranked (for example, a score), text (for example, "high", "medium", "low"), and/or a symbol (for example, "◎", "○", "△", "×"). The data evaluation function is realized by the controller of the server device 2.
When the system derives a score as the index for the evaluation, it can calculate the score by any method. For example, the score may be calculated using various techniques from the fields of machine learning or natural language processing (for example, the k-nearest neighbor method, techniques using support vector machines, techniques using neural networks, techniques that assume a statistical model for the data, such as those using Gaussian processes, and/or combinations of these), or using various techniques from the field of statistics (for example, based on the frequency with which components appear in the data).
A "component" (which may also be called a data element) may be partial data that forms at least part of the data. For example, it may be a morpheme, keyword, sentence, paragraph, and/or metadata (for example, e-mail header information) that makes up a document; a partial sound, volume (gain) information, and/or timbre information that makes up audio; a partial image, partial pixels, and/or luminance information that makes up an image; or a frame image, motion information, and/or three-dimensional information that makes up a video.
When the system calculates the score based on the frequency with which components appear in the data, the following calculation method can be used, for example. First, the system extracts from the learning data the components that make up the learning data and evaluates those components. In doing so, the system evaluates, for example, the degree to which each of a plurality of components forming at least part of the learning data contributes to the combination of data and classification information (in other words, the frequency with which the component appears depending on the classification information). This degree may also be called a weight. As a more specific example, the system evaluates each component using the transmitted information amount (for example, an information amount calculated by a predetermined formula from the appearance probability of the component and the appearance probability of the classification information), and thereby calculates an evaluation value, as evaluation information for the component, according to Formula 1 below.
[Formula 1] (equation image JPOXMLDOC01-appb-M000001)
Here, wgt_{i,0} denotes the initial evaluation value of the i-th component before evaluation, and wgt_{i,L} denotes the evaluation value of the i-th component after the L-th evaluation. γ_L is the evaluation parameter in the L-th evaluation, and θ is the threshold used in the evaluation. The system can thereby evaluate a component as representing the characteristics of the given classification information more strongly the larger its calculated transmitted information amount is.
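The component evaluation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `component_weights`, the whitespace tokenization, and the use of pointwise mutual information with the "Related" label (clipped at zero) are all hypothetical choices, and the document's own Formula 1 may differ.

```python
import math
from collections import Counter

def component_weights(docs, labels):
    """Weight each component by how strongly its presence is associated with
    the 'Related' label (pointwise mutual information, clipped at zero).
    Assumption: components are whitespace-separated tokens; label 1 = Related."""
    n = len(docs)
    n_related = sum(1 for y in labels if y == 1)
    doc_freq = Counter()      # number of documents containing the component
    related_freq = Counter()  # number of 'Related' documents containing it
    for text, y in zip(docs, labels):
        for comp in set(text.split()):
            doc_freq[comp] += 1
            if y == 1:
                related_freq[comp] += 1
    weights = {}
    for comp, df in doc_freq.items():
        joint = related_freq[comp] / n
        if joint == 0:
            weights[comp] = 0.0  # never co-occurs with 'Related'
            continue
        pmi = math.log(joint / ((df / n) * (n_related / n)))
        weights[comp] = max(pmi, 0.0)
    return weights
```

With this sketch, a component that appears only in "Related" learning data receives a positive weight, while one that appears mostly in "Non-Related" data receives zero.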
Next, the system associates each component with its evaluation value and stores both in an arbitrary memory (for example, the storage system 5). The system then extracts components from the evaluation target data and checks whether each component is stored in the memory. If it is, the system reads the evaluation value associated with the component from the memory and evaluates the evaluation target data based on that value. As a more specific example, the system can calculate the score by evaluating the following formula using the evaluation values associated with the components that form at least part of the evaluation target data.
[Formula 2] (equation image JPOXMLDOC01-appb-M000002)
m_j: appearance frequency of the i-th component; wgt_i: evaluation value of the i-th component
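The lookup-and-score step can be sketched as follows. Since Formula 2 is not reproduced in this text, the sketch simply assumes a sum of appearance frequency times stored evaluation value; the names `score_document` and `rank_documents` and the whitespace tokenization are hypothetical.

```python
from collections import Counter

def score_document(text, weights):
    """Sum frequency * weight over the components of `text` that have a
    stored evaluation value; components not in `weights` contribute nothing.
    Assumption: this weighted sum stands in for the document's Formula 2."""
    counts = Counter(text.split())
    return sum(freq * weights[comp]
               for comp, freq in counts.items() if comp in weights)

def rank_documents(texts, weights):
    """Return (index, score) pairs sorted from highest to lowest score."""
    scored = [(i, score_document(t, weights)) for i, t in enumerate(texts)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Sorting by score in this way yields the kind of ranking that lets higher-scored evaluation target data be reviewed first.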
The server device 2 may be configured to continue (repeat) extracting and evaluating components until the recall reaches a predetermined target value. Recall is an index of coverage: the proportion of the data that should be found that is contained in a given portion of the data. For example, a recall of 80% at 30% of all data means that 80% of the data that should be found as relevant to the predetermined case is contained in the top 30% of the data ranked by the index (score). When a person reviews the data exhaustively without a data analysis system (linear review), the amount of relevant data found is proportional to the amount reviewed, so the further the system departs from this proportionality, the better its data analysis performance.
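The recall measure described above can be computed as in the following minimal sketch. The function name `recall_at_top` and its parameters are hypothetical; `relevant` is assumed to hold 1 for data that should be found and 0 otherwise.

```python
def recall_at_top(scores, relevant, fraction):
    """Share of all relevant items that fall within the top `fraction` of
    items when sorted by descending score."""
    total_relevant = sum(relevant)
    if total_relevant == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    cutoff = int(len(scores) * fraction)
    found = sum(relevant[i] for i in order[:cutoff])
    return found / total_relevant
```

Under linear review the expected value of this measure equals `fraction` itself, so a result well above `fraction` indicates that the scoring is concentrating the relevant data near the top.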
The implementation of the data evaluation function described above is only one example. That is, as long as the function "evaluates evaluation target data based on learning data", its specific form is not limited to one particular configuration (for example, the score calculation method described above).
[Evaluation of evaluation target data by the server device 2]
The evaluation operation of the server device 2 on the evaluation target data will now be described. FIG. 2 is a flowchart of the server device 2 (more precisely, of the controller of the server device 2). The server device 2 acquires one or more pieces of data, as reference data, from the evaluation target data recorded in the storage system 5 (step S300: data acquisition module). Each step may also be referred to as a module or means. Next, after the user has actually reviewed the reference data and decided on a classification, the server device 2 acquires, from an arbitrary input device, the classification information the user entered for the reference data (step S302: classification information acquisition module). The server device 2 constructs learning data by combining the reference data with the classification information (step S304: learning data construction module) and extracts components from the learning data (step S306: component extraction module). The controller then evaluates the components (step S308: component evaluation module), associates each component with its evaluation value, and stores both in the storage system 5 (step S310: component storage module). The processing of S300 to S308 corresponds to the "learning phase" (the phase in which the artificial intelligence learns patterns). The learning data may also be prepared in advance instead of being created from reference data.
Next, the server device 2 acquires evaluation target data from the storage system 5 (step S312: evaluation target data acquisition module). The server device 2 further reads the components and their evaluation values from the storage system 5 and extracts those components from the evaluation target data (step S314: component extraction module). The server device 2 evaluates the evaluation target data based on the evaluation values associated with the components (step S316: evaluation target data evaluation module) and can create ranking information for the plurality of evaluation target data. The higher a piece of evaluation target data is ranked, the more strongly it is related to the predetermined case. The processing from step S312 onward constitutes the evaluation phase, in contrast to the learning phase described above. Note that the processes in the flowchart are examples and are not intended to be limiting.
[Operation example of the data analysis system]
Next, an operation example of the data analysis system will be described, taking a "patient fall", a medical incident, as the predetermined case. The data analysis system is realized as a fall prediction model that can predict patient falls. An electronic medical record is recorded in the storage system 5 for each patient, and the structured data of the electronic medical records is recorded in the database 4. Medical personnel such as doctors, nurses, clinical laboratory technicians, and pharmacists access the electronic medical record via the client device 3 and record the necessary information each time they provide care, examine, diagnose, treat, perform a procedure, test, or administer medication. The electronic medical record is thus updated over time. The server device 2 records the electronic medical records in the storage system 5 in time series, so that transaction data of the electronic medical record is formed, for example the record as of 8:00 p.m. on May 1, the record as of 8:00 p.m. on May 2, and so on.
The electronic medical record for each patient may contain, for example, diagnostic information, treatment information, medication information, vital data such as blood pressure and pulse, clinical analysis data such as blood test results, the patient's symptoms, state, and condition as observed by medical personnel, symptoms the patient has reported to medical personnel, and conversations between medical personnel and the patient.
The reference data described above may be electronic medical records stored in the storage system 5. The server device 2 extracts a predetermined number of electronic medical records from the storage system 5 and provides them to the client device 3 (S300 in FIG. 2). The medical personnel acquire evaluation items related to falls from the server device 2 via the client device 3 (S304). The evaluation items related to falls include, for example, the following parameters identified by the inventor.
Evaluation items: somnolence, anemia, delirium, easy fatigability, dyspnea, nausea and vomiting, numbness, paralysis, abnormal excretion, unsteadiness, poor food intake, pain, number of days on which meals were withheld
As reviewers, the medical personnel set, for each of these items, classification information indicating whether or not the electronic medical record is related to the item. The evaluation items provided to the medical personnel need not be all of the parameters listed above; they may be a plurality of parameters that the medical personnel have selected as being related to falls. The parameters listed above are highly independent of one another with respect to falls, and the prediction accuracy for falls improves as the number of parameters increases. It is therefore preferable to select a plurality of parameters as evaluation items.
For each of the plurality of evaluation items, the server device 2 constructs learning data as pairs of an electronic medical record and classification information (S304). Following S306 to S310, the server device 2 then determines the components and their evaluation values for each of the plurality of evaluation items. Next, the server device 2 acquires from the storage system 5 the electronic medical records to be classified and calculates a score for each of the plurality of evaluation items (S312 to S316).
(Example 1)
When the following information is recorded in the electronic medical record (evaluation target data), the score for "delirium" is high and the score for "nausea and vomiting" is relatively high.
"Date: xxxx (year) xx (month) xx (day)
Daily observation: Gaze is somewhat vacant. Occasionally makes incoherent remarks.
Drinking water: no marked choking, but slight wheezing is heard near the pharynx after drinking."
(Example 2)
When the following information is recorded in the electronic medical record, the score for "somnolence" is high.
"Complains of drowsiness. The same during a visit from family."
(Example 3)
When the following information is recorded in the electronic medical record, the scores for "easy fatigability" and "nausea and vomiting" are both high, and the score for "numbness" is relatively high.
"You seem to be having a hard time today.
Patient: (after nodding) It gets hard in the evening. Some mornings are easier. I might have a little numbness, or maybe not. Effects of not being able to sleep at night; exhaustion from nausea and vomiting."
Next, in order to predict a patient's fall, the server device 2 performs, for each of a plurality of electronic medical records (target data), an analysis based on a predetermined prediction model, using the evaluation for each of the plurality of evaluation items as input. In a preferred aspect, the server device 2 uses logistic regression analysis as the predetermined prediction model. Having calculated a score for each evaluation item for each electronic medical record, the server device 2 performs logistic regression analysis according to the formula below with the per-item scores as explanatory variables, and computes the probability of a fall (fall prediction) based on the objective variable, which takes the values no fall (0) and fall (1).
[Formula 3] (equation image JPOXMLDOC01-appb-M000003)
β_0i, β_1i, ..., β_ki are regression coefficients, which need only be calculated in advance by the least squares method using past electronic medical record data. x_0i, x_1i, ..., x_ki are explanatory variables, into which the scores of the respective evaluation items are substituted. p_i is the objective variable (probability of a fall). The server device 2 records p_i in time series and monitors how the probability of a fall changes over time. For example, when the currently obtained probability exceeds a predetermined threshold, or when it deviates upward from the moving average over a predetermined number of days (for example, 7 days) by a given deviation rate (for example, 25%), the server device 2 can judge that the patient is at risk of falling and notify the electronic devices (personal computers, smartphones, and the like) of the medical personnel. FIG. 3 is a graph showing how the objective variable (probability of a fall) changes over time. At the time indicated by the arrow, the server device 2 notifies the medical personnel that the patient corresponding to the electronic medical record may fall.
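The prediction and notification logic can be sketched as follows. Formula 3 is not reproduced in this text, so the sketch assumes the standard logistic form p = 1 / (1 + exp(-(β0 + Σ βk·xk))). The names `fall_probability` and `should_alert` and the default threshold of 0.5 are hypothetical; the 7-day window and 25% deviation rate follow the examples in the text.

```python
import math

def fall_probability(scores, coefficients, intercept):
    """Standard logistic model (an assumption standing in for Formula 3):
    p = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, scores))
    return 1.0 / (1.0 + math.exp(-z))

def should_alert(history, current, threshold=0.5, window=7, deviation=0.25):
    """Alert if `current` exceeds `threshold`, or exceeds the moving average
    of the last `window` recorded probabilities by more than `deviation`
    (relative to that average)."""
    if current > threshold:
        return True
    recent = history[-window:]
    if not recent:
        return False
    average = sum(recent) / len(recent)
    return average > 0 and (current - average) / average > deviation
```

Here `history` would hold the previously recorded values of p_i for one patient, and `current` the value computed from the latest electronic medical record.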
It is known that, in cases where a fall actually occurred, changes appear in the items from a certain number of days before the day of the fall. Therefore, in the logistic regression analysis of the electronic medical records for the three days up to and including the day on which a fall actually occurred, the flag of the objective variable (p_i) is set to "1" and the regression coefficients are optimized. By then performing logistic regression analysis on the current electronic medical record based on the optimized regression coefficients, the probability of a fall within a predetermined number of days (for example, 3 days) is calculated as the objective variable. The objective variable obtained by the logistic regression analysis of the server device 2 thus indicates the probability that the patient will fall within the predetermined number of days. Because the probability of a fall is obtained with some margin of time in this way, both medical personnel and patients can more easily prepare for a fall.
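The labeling of past records for fitting the regression coefficients can be sketched as follows. This assumes one reading of the text, namely that the three-day window ends on, and includes, the day of the fall; the function name `label_records` is hypothetical.

```python
from datetime import date, timedelta

def label_records(record_dates, fall_dates, days=3):
    """Return 1 for each record dated within the `days`-day window that ends
    on (and includes) a fall date, otherwise 0."""
    positive = set()
    for fall in fall_dates:
        for offset in range(days):
            positive.add(fall - timedelta(days=offset))
    return [1 if d in positive else 0 for d in record_dates]
```

For a fall on May 4, for instance, the records of May 2, 3, and 4 would be labeled 1 under this assumption, and all other records 0.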
Numerical data such as vital data and test results are also known to be related to the occurrence of falls. For example, anomalies in numerical data (medical data measured from the patient) such as blood pressure, pulse, body temperature, percutaneous arterial oxygen saturation (SpO2), food intake, and the pain NRS index are known to be deeply involved in the occurrence of falls. Applying one or more of these numerical data to the logistic regression analysis as explanatory variables is therefore effective for predicting falls. Medication history data (drug name, efficacy, dose, usage, duration of medication, side effects, and so on) is similarly effective. For example, taking sleep-inducing drugs or tranquilizers is related to the occurrence of falls.
Because the data analysis system 1 outputs the occurrence of a fall caused by the patient's own behavior as a probability of occurrence within a predetermined period, falls can be predicted effectively, and both medical personnel and patients can more easily take precautions against them. In addition to patient falls, the data analysis system 1 can predict accidents and dangerous acts arising from patient-side behavior or circumstances, such as a patient falling from a height. Furthermore, the data analysis system 1 is not limited to the medical field: it can also predict accidents and dangerous acts arising from the behavior or circumstances of people such as operators and drivers in other industrial fields, including transportation, construction, civil engineering, and product manufacturing.
In the description above, the data analysis system obtained the information contributing to the regression from the electronic medical record as scores, through predictive coding, for use in the logistic regression analysis; this may be modified as follows. The data analysis system equips the artificial intelligence with a dictionary, analyzes the text data of the electronic medical record (target data) against the dictionary, and sets a flag for each item for which related text is found. For example, if the electronic medical record contains "anemia", the anemia flag is set. The data analysis system then applies the flags of the plurality of evaluation items to the logistic regression analysis as explanatory variables. The flag for each evaluation item may instead be set by a doctor or nurse when information is recorded in the electronic medical record.
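The dictionary-based variant can be sketched as a simple keyword lookup. The dictionary entries and the sample record below are hypothetical, and plain substring matching is an assumption; the publication does not specify how the dictionary is matched against the text.

```python
# Hypothetical dictionary: evaluation item -> keywords that indicate it (assumption).
DICTIONARY = {
    "anemia": ["anemia", "anaemia"],
    "dizziness": ["dizzy", "dizziness", "light-headed"],
    "sleep_medication": ["sleep inducer", "hypnotic"],
}

def set_flags(record_text, dictionary=DICTIONARY):
    """Return a 0/1 flag per evaluation item: 1 if any keyword occurs in the text."""
    text = record_text.lower()
    return {
        item: int(any(keyword in text for keyword in keywords))
        for item, keywords in dictionary.items()
    }

flags = set_flags("Patient reports dizziness on standing; anemia noted in blood test.")
# flags == {"anemia": 1, "dizziness": 1, "sleep_medication": 0}
```

The resulting flags can then be passed to the regression as binary explanatory variables, one per evaluation item.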
Although logistic regression analysis has been described as the fall prediction model, it may be replaced with an SVM (support vector machine), decision tree learning, or the like. The description above used a patient's fall as the incident, but the incident is not limited to this: abnormal behavior on the patient's side that is not directly related to medical acts by doctors or nurses, such as a patient's seizure, wandering, or loss of consciousness, can also be predicted by the data analysis.
A period may be provided for the explanatory variables of the logistic regression analysis. For example, with five explanatory variables and a lag of three days, the procedure is as follows.
Using the May 1 medical record as input, a score is obtained for each of the five evaluation items.
Using the May 2 medical record as input, a score is obtained for each of the five evaluation items.
Using the May 3 medical record as input, a score is obtained for each of the five evaluation items.
On May 3, the three sets of five scores are concatenated, and logistic regression analysis is performed on the resulting 15-dimensional explanatory variable.
According to this aspect, the variation of the explanatory variables over the period is applied to the regression analysis as a whole, so the overall change in the patient's physical condition during the period can be reflected in the fall prediction.
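The three-day lag described above can be sketched as a concatenation of per-day score vectors. The function name and the score values are hypothetical; only the shape (three days of five scores giving 15 dimensions) follows the description.

```python
def lagged_features(daily_scores, end_index, lag_days):
    """Concatenate the score vectors of the lag_days days ending at end_index."""
    start = end_index - lag_days + 1
    if start < 0:
        raise ValueError("not enough history for the requested lag")
    vector = []
    for day_scores in daily_scores[start:end_index + 1]:
        vector.extend(day_scores)
    return vector

# Hypothetical scores for five evaluation items on three consecutive days.
daily_scores = [
    [0.1, 0.2, 0.0, 0.4, 0.3],  # May 1
    [0.2, 0.2, 0.1, 0.5, 0.3],  # May 2
    [0.4, 0.1, 0.1, 0.7, 0.2],  # May 3
]

# On May 3 (index 2), build the 15-dimensional explanatory variable.
x = lagged_features(daily_scores, end_index=2, lag_days=3)
```

Keeping each day's scores in separate dimensions, rather than averaging them, lets the regression weight recent and older days differently.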
[Data format processed by the data analysis system]
In the present embodiment, "data" may be any data expressed in a format that can be processed by a computer. The data may be, for example, unstructured data whose structure definition is at least partly incomplete, and broadly includes document data containing, at least in part, text written in natural language (for example, e-mail (including attachments and header information), technical documents (broadly including documents that explain technical matters, such as academic papers, patent publications, product specifications, and design drawings), presentation materials, spreadsheets, financial statements, meeting materials, reports, sales materials, contracts, organization charts, business plans, corporate analysis information, electronic medical records, web pages, blogs, and comments posted to social network services), audio data (for example, recordings of conversation or music), image data (for example, data composed of a plurality of pixels or vector information), and video data (for example, data composed of a plurality of frame images). The data is not limited to these examples.
For example, when analyzing document data, the system extracts the morphemes contained in the document data serving as learning data as components, evaluates each component, and then evaluates the relevance between document data serving as evaluation data and the predetermined case based on the components extracted from that document data. When analyzing audio data, the system may analyze the audio data itself, or may convert the audio data into document data by speech recognition and analyze the converted document data. In the former case, the system can, for example, divide the audio data into partial sounds of a predetermined length to serve as components, and analyze the audio data by identifying the partial sounds using any audio analysis technique (for example, a hidden Markov model or a Kalman filter). In the latter case, the system can recognize the speech using any speech recognition algorithm (for example, a recognition method using a hidden Markov model) and analyze the recognized data (document data) by the same procedure as described above. When analyzing image data, the system can, for example, divide the image data into partial images of a predetermined size to serve as components, and analyze the image data by identifying the partial images using any image recognition technique (for example, pattern matching, a support vector machine, or a neural network). Further, when analyzing video data, the system can, for example, divide each of the plurality of frame images contained in the video data into partial images of a predetermined size to serve as components, and analyze the video data by identifying the partial images using any image recognition technique (for example, pattern matching, a support vector machine, or a neural network).
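For document data, the flow of extracting components, evaluating them on learning data, and then scoring evaluation data can be sketched as follows. This is an illustration under stated assumptions: whitespace tokenization stands in for morphological analysis, and a smoothed log-odds weight stands in for the component evaluation, which this passage does not specify. The training sentences and labels are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: whitespace splitting stands in for morpheme extraction.
    return text.lower().split()

def learn_weights(training_docs):
    """training_docs: list of (text, label), label 1 = relevant, 0 = not relevant.
    Each token's weight is the smoothed log-odds of appearing in relevant
    versus non-relevant documents (a stand-in evaluation, not the patent's)."""
    pos, neg = Counter(), Counter()
    n_pos = n_neg = 0
    for text, label in training_docs:
        tokens = set(tokenize(text))  # count each token once per document
        if label:
            pos.update(tokens)
            n_pos += 1
        else:
            neg.update(tokens)
            n_neg += 1
    vocab = set(pos) | set(neg)
    return {
        t: math.log((pos[t] + 1) / (n_pos + 2)) - math.log((neg[t] + 1) / (n_neg + 2))
        for t in vocab
    }

def score(text, weights):
    """Relevance score of a document: sum of the weights of its known tokens."""
    return sum(weights.get(t, 0.0) for t in tokenize(text))

# Hypothetical learning data for one evaluation item ("unsteady gait").
training = [
    ("patient unsteady when walking", 1),
    ("unsteady gait at night", 1),
    ("patient walking well", 0),
    ("appetite good", 0),
]
weights = learn_weights(training)
relevant_score = score("unsteady at night", weights)
unrelated_score = score("appetite good", weights)
```

Repeating this per evaluation item would yield one score per item for each record, which is the form of input the prediction model expects.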
[Example of implementation by software and hardware]
The control blocks of the system may be realized by logic circuits (hardware) formed in an integrated circuit (IC chip) or the like, or may be realized by software using a CPU. In the latter case, the system includes a CPU that executes the instructions of a program (the control program of the data analysis system), which is software realizing each function; a ROM (Read Only Memory) or storage device (referred to as a "recording medium") on which the program and various data are recorded so as to be readable by a computer (or CPU); a RAM (Random Access Memory) into which the program is loaded; and the like. The object of the present invention is achieved when the computer (or CPU) reads the program from the recording medium and executes it. As the recording medium, a "non-transitory tangible medium" such as a tape, a disk, a card, a semiconductor memory, or a programmable logic circuit can be used. The program may also be supplied to the computer via any transmission medium capable of transmitting it (such as a communication network or a broadcast wave). The present invention can also be realized in the form of a data signal embedded in a carrier wave, in which the program is embodied by electronic transmission. The program can be implemented in any programming language. Any recording medium on which the program is recorded also falls within the scope of the present invention.
[Other application examples]
The system can be realized not only as the fall prediction system described in [Example of operation of data analysis system] above, but also as an artificial intelligence system that analyzes big data (any system capable of evaluating the relevance between data and a predetermined case), such as a discovery support system, a forensic system, an e-mail monitoring system, a medical application system (for example, a pharmacovigilance support system, a clinical trial efficiency system, a medical risk hedging system, a prognosis prediction system, or a diagnosis support system), an Internet application system (for example, a smart mail system, an information aggregation (curation) system, a user monitoring system, or a social media management system), an information leakage detection system, a project evaluation system, a marketing support system, an intellectual property evaluation system, a fraudulent transaction monitoring system, a call center escalation system, or a credit check system. For example, when the system is realized as a discovery support system, the predetermined case is a "lawsuit", and when it is realized as a forensic system, the predetermined case is an "illegal incident" (crime). When the system is realized as another system, the predetermined case depends on the field in which that system is realized. Depending on the field to which the data analysis system of the present invention is applied, and in consideration of circumstances specific to that field, the data may, for example, be preprocessed (for example, important portions may be extracted from the data and only those portions subjected to data analysis), or the manner in which the results of the data analysis are displayed may be changed. Those skilled in the art will understand that many such variations can exist, and all of them fall within the scope of the present invention.
The present invention is not limited to the embodiments described above, and various modifications are possible within the scope of the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, new technical features can be formed by combining the technical means disclosed in the respective embodiments.
1 ... data analysis system, 2 ... server device, 3 ... client device, 4 ... database, 5 ... storage system, 6 ... management computer

Claims (10)

  1.  A data analysis system that analyzes data based on a predetermined case, comprising:
     a memory that stores a plurality of target data to be analyzed; and
     a controller that evaluates the plurality of target data,
     wherein the controller performs:
     setting a plurality of evaluation items related to the predetermined case;
     evaluating learning data in which classification information concerning the relevance to each of the plurality of evaluation items has been set;
     evaluating the target data for each of the plurality of evaluation items based on the evaluation result of the learning data;
     performing an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and
     outputting prediction information on the occurrence of the predetermined case based on the result of the analysis.
  2.  The data analysis system, wherein the controller performs:
     extracting numerical data related to the predetermined case from the target data; and
     performing the analysis based on the numerical data in addition to the evaluation of the target data.
  3.  The data analysis system according to claim 1 or 2, wherein the controller further performs:
     providing a user with reference data to be referred to for the evaluation of the target data;
     setting the classification information in the reference data for each of the plurality of evaluation items based on input from the user;
     using the combination of the reference data and the classification information as the learning data;
     evaluating, as the evaluation of the learning data, the degree to which each component of the reference data contributes to the combination; and
     in evaluating the target data, calculating as a score the relevance between each of the plurality of evaluation items and the target data, based on the evaluation of the components.
  4.  The data analysis system according to claim 3, wherein:
     performing the analysis includes applying the score of the target data for each of the plurality of evaluation items to logistic regression analysis as an explanatory variable; and
     outputting the prediction information includes outputting the probability of occurrence of the predetermined case obtained by the logistic regression analysis.
  5.  The data analysis system according to any one of claims 1 to 4, wherein the controller further performs:
     acquiring an electronic medical record from the memory as the target data; and
     including in the prediction information the probability that a patient's fall, as the predetermined case, will occur.
  6.  The data analysis system according to claim 2, wherein:
     the plurality of evaluation items include a parameter related to a patient's fall; and
     the numerical data include medical data measured from the patient.
  7.  The data analysis system according to claim 4, wherein outputting the probability of occurrence by the logistic regression analysis includes outputting the probability that the predetermined case will occur within a predetermined period.
  8.  A method of controlling a data analysis system that analyzes data based on a predetermined case, in which the data analysis system executes the steps of:
     storing a plurality of target data to be analyzed;
     setting a plurality of evaluation items related to the predetermined case;
     evaluating learning data in which classification information concerning the relevance to each of the plurality of evaluation items has been set;
     evaluating the target data for each of the plurality of evaluation items based on the evaluation result of the learning data;
     performing an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and
     outputting prediction information on the occurrence of the predetermined case based on the result of the analysis.
  9.  A program that causes a computer to analyze data based on a predetermined case, the program causing the computer to realize:
     a function of storing a plurality of target data to be analyzed;
     a function of setting a plurality of evaluation items related to the predetermined case;
     a function of evaluating learning data in which classification information concerning the relevance to each of the plurality of evaluation items has been set;
     a function of evaluating the target data for each of the plurality of evaluation items based on the evaluation result of the learning data;
     a function of performing an analysis based on a predetermined prediction model, using the evaluation of the target data for each of the plurality of evaluation items as input; and
     a function of outputting prediction information on the occurrence of the predetermined case based on the result of the analysis.
  10.  A computer-readable recording medium on which the program according to claim 9 is recorded.
PCT/JP2016/065096 2016-05-20 2016-05-20 Data analysis system, method for control thereof, program, and recording medium WO2017199445A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2018518059A JP6748710B2 (en) 2016-05-20 2016-05-20 Data analysis system, control method thereof, program, and recording medium
PCT/JP2016/065096 WO2017199445A1 (en) 2016-05-20 2016-05-20 Data analysis system, method for control thereof, program, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/065096 WO2017199445A1 (en) 2016-05-20 2016-05-20 Data analysis system, method for control thereof, program, and recording medium

Publications (1)

Publication Number Publication Date
WO2017199445A1 true WO2017199445A1 (en) 2017-11-23

Family

ID=60326349

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/065096 WO2017199445A1 (en) 2016-05-20 2016-05-20 Data analysis system, method for control thereof, program, and recording medium

Country Status (2)

Country Link
JP (1) JP6748710B2 (en)
WO (1) WO2017199445A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019212006A1 (en) * 2018-05-02 2019-11-07 株式会社Fronteoヘルスケア Phenomenon prediction device, prediction model generation device, and phenomenon prediction program
WO2019212005A1 (en) * 2018-05-02 2019-11-07 株式会社Fronteoヘルスケア Dangerous behavior prediction device, prediction model generation device, and dangerous behavior prediction program
CN111144658A (en) * 2019-12-30 2020-05-12 医渡云(北京)技术有限公司 Medical risk prediction method, device, system, storage medium and electronic equipment
JP6826652B1 (en) * 2019-12-27 2021-02-03 株式会社ビデオリサーチ Customer estimation device and customer estimation method
JP2021043513A (en) * 2019-09-06 2021-03-18 株式会社ビデオリサーチ Customer estimation device and customer estimation method
JP2021117757A (en) * 2020-01-27 2021-08-10 株式会社ビデオリサーチ Device and method for estimating customers
JP2021117758A (en) * 2020-01-27 2021-08-10 株式会社ビデオリサーチ Device and method for estimating customers
JP7466865B2 (en) 2020-08-07 2024-04-15 公立大学法人名古屋市立大学 Self-removal occurrence prediction device, self-removal occurrence prediction method, program, fall occurrence prediction device, fall occurrence prediction method, and medical safety improvement support method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015194974A (en) * 2014-03-31 2015-11-05 株式会社東芝 Home/visit medical service system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5833068B2 (en) * 2013-09-03 2015-12-16 株式会社東芝 Series data analysis device and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHIN'ICHIRO YOKOTA: "Construction and Implementation of a formula predicting a patient's fall risk based on a historical cohort study using electronic medical records data", JAPAN JOURNAL OF MEDICAL INFORMATION, vol. 34, no. 3, 162, 29 September 2014 (2014-09-29), pages 119 - 128, XP055598180, ISSN: 0289-8055 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019212006A1 (en) * 2018-05-02 2019-11-07 株式会社Fronteoヘルスケア Phenomenon prediction device, prediction model generation device, and phenomenon prediction program
WO2019212005A1 (en) * 2018-05-02 2019-11-07 株式会社Fronteoヘルスケア Dangerous behavior prediction device, prediction model generation device, and dangerous behavior prediction program
JP2019194807A (en) * 2018-05-02 2019-11-07 株式会社Fronteoヘルスケア Dangerous action prediction device, prediction model generation device, and program for dangerous action prediction
EP3779727A4 (en) * 2018-05-02 2021-05-19 Fronteo, Inc. Dangerous behavior prediction device, prediction model generation device, and dangerous behavior prediction program
JP2021043513A (en) * 2019-09-06 2021-03-18 株式会社ビデオリサーチ Customer estimation device and customer estimation method
JP6826652B1 (en) * 2019-12-27 2021-02-03 株式会社ビデオリサーチ Customer estimation device and customer estimation method
JP2021108004A (en) * 2019-12-27 2021-07-29 株式会社ビデオリサーチ Customer estimation device and customer estimation method
CN111144658A (en) * 2019-12-30 2020-05-12 医渡云(北京)技术有限公司 Medical risk prediction method, device, system, storage medium and electronic equipment
CN111144658B (en) * 2019-12-30 2023-06-16 医渡云(北京)技术有限公司 Medical risk prediction method, device, system, storage medium and electronic equipment
JP2021117757A (en) * 2020-01-27 2021-08-10 株式会社ビデオリサーチ Device and method for estimating customers
JP2021117758A (en) * 2020-01-27 2021-08-10 株式会社ビデオリサーチ Device and method for estimating customers
JP7466865B2 (en) 2020-08-07 2024-04-15 公立大学法人名古屋市立大学 Self-removal occurrence prediction device, self-removal occurrence prediction method, program, fall occurrence prediction device, fall occurrence prediction method, and medical safety improvement support method

Also Published As

Publication number Publication date
JPWO2017199445A1 (en) 2019-03-28
JP6748710B2 (en) 2020-09-02

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2018518059

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16902464

Country of ref document: EP

Kind code of ref document: A1

122 Ep: pct application non-entry in european phase

Ref document number: 16902464

Country of ref document: EP

Kind code of ref document: A1