WO2017010103A1 - Data analysis device, data analysis method, and storage medium storing data analysis program - Google Patents
Data analysis device, data analysis method, and storage medium storing data analysis program
- Publication number
- WO2017010103A1 (PCT/JP2016/003332, JP2016003332W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- attribute
- field
- attendance
- employee
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/109—Time management, e.g. calendars, reminders, meetings or time accounting
- G06Q10/1091—Recording time for administrative or management purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06398—Performance of employee with respect to a job function
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/105—Human resources
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
Definitions
- the present invention relates to a data analysis apparatus, a data analysis method, and a storage medium storing a data analysis program for supporting health guidance of companies and the like.
- Maintaining or promoting the health of employees and of those who belong to an organization for business execution (hereinafter simply referred to as “employees”) is one of the most important roles of business owners and those who manage organizations (hereinafter simply referred to as “business owners”). For this reason, business owners deploy medical workers such as industrial physicians and public health nurses, and implement many measures related to health checkups and health guidance for employees.
- Patent Literature 1 describes a health support system in which individuals are grouped on the basis of a plurality of individual health checkup results and lifestyle inquiry results, and advice on maintaining and improving health is provided based on the health condition and lifestyle features extracted for each group.
- If the technique described in Patent Document 1 is used, then, for example, individuals belonging to a group whose blood pressure is high compared with other groups can be given advice from a medical viewpoint, such as adopting a low-salt diet to reduce blood pressure.
- For advisors who provide health guidance, such as medical workers, the work status of employees, such as daily work time, leaving time, work / non-work, absence / overtime, and overtime hours, is an important source of information.
- To check this work status, attendance data, which is information in which items related to work status are arranged in time series, is utilized.
- Attendance data includes dozens of fields, and in many cases one record is recorded per day. Attendance data therefore tends to have a large number of records in addition to a large number of fields. Since the time available for health guidance for each employee is limited, it is difficult for the advisor to check all of this information within that time.
- As a result, there was a problem that the advisor could not easily obtain from the attendance data the specific work status related to health (for example, whether or not there is overwork or an irregular work pattern).
- Patent Document 1 describes generating a plurality of health condition groups by covariance structure analysis from data relating to health conditions and their management, and presenting items characteristic of those belonging to a health condition group as recommended item data for remaining in that group or shifting to another group. The health instructor can thereby give advice, such as presenting recommended action information, based on the presented recommended item data.
- However, the method described in Patent Document 1 only extracts items that are characteristic of those belonging to each group. It cannot extract items that match the purpose of the advisor, or the degree of relevance of those items. For example, suppose that an advisor focuses on a symptom and wants to keep only the fields of the attendance data that are particularly relevant to that symptom, without presenting the other fields. In such a case, even if the method described in Patent Document 1 is applied, there is no guarantee that the grouping is performed according to the presence or absence and the degree of the symptom. Further, at another time, the advisor may focus on a different symptom and want to keep only the fields of the attendance data that are particularly relevant to that symptom. However, Patent Document 1 does not describe any method for appropriately summarizing attendance data (selecting information, processing it, and so on) and presenting it in accordance with the purpose of the advisor.
- Therefore, an object of the present invention is to provide a data analysis apparatus, a data analysis method, and a data analysis program that allow an advisor to easily obtain information on the specific fields, among the data related to an employee's health status including attendance data, that are related to any item of interest.
- The data analysis apparatus according to the present invention includes: data acquisition means for acquiring at least a designation of a target field, which is the field whose relationships are to be extracted from among the fields of health status data (information relating to the health status of an employee), together with the health status data and attendance data (information relating to work status) of two or more employees; attribute data generation means for aggregating, for each employee, a predetermined field of the attendance data using a predetermined time resolution, time range, and aggregation method, and generating attribute data having each of the aggregation results as an attribute field; model learning means for learning a model, in which the target field is the objective variable and each of the attribute fields of the attribute data is an explanatory variable, using the content of the target field and the content of the attribute data; related field extraction means for extracting the attribute fields related to the target field as indicated by the learned model; and summarizing means for summarizing and outputting the attendance data of a designated employee based on the information of the extracted attribute fields.
- In the data analysis method according to the present invention, an information processing apparatus acquires at least a designation of a target field, which is the field whose relationships are to be extracted from among the fields of health status data (information on the health status of an employee), together with the health status data and attendance data (information on work status) of two or more employees; the information processing apparatus aggregates, for each employee, a predetermined field of the attendance data using a predetermined time resolution, time range, and aggregation method, and generates attribute data having each of the aggregation results as an attribute field; the information processing apparatus learns a model, represented by a polynomial in which the target field is the objective variable and each of the attribute fields of the attribute data is an explanatory variable, using the content of the target field in the health status data of the two or more employees and the content of the attribute data; the information processing apparatus extracts the attribute fields related to the target field as indicated by the learned model; and the information processing apparatus summarizes and outputs the attendance data of a designated employee based on the information of the extracted attribute fields.
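- The data analysis program stored in the storage medium according to the present invention causes a computer to execute: processing for acquiring at least a designation of a target field, which is the field whose relationships are to be extracted from among the fields of health status data (information related to an employee's health status), together with the health status data and attendance data of two or more employees; processing for aggregating, for each employee, a predetermined field of the attendance data using a predetermined time resolution, time range, and aggregation method, and generating attribute data having each of the aggregation results as an attribute field; processing for learning a model, represented by a polynomial in which the target field is the objective variable and each of the attribute fields is an explanatory variable, using the content of the target field in the health status data of the two or more employees and the content of the attribute data; processing for extracting the attribute fields related to the target field as indicated by the learned model; and processing for summarizing and outputting the attendance data of a designated employee based on the information of the extracted attribute fields.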
- According to the present invention, the advisor can easily obtain information on the specific fields, among the data related to the health condition of the employee including the attendance data, that are related to an arbitrary item of interest.
- FIG. 2 is a configuration diagram illustrating an example of a hardware configuration of a data analysis device 10.
- A flowchart showing an example of operation.
- An explanatory diagram showing the time-series relationship between attendance data and the time resolution applied to the attendance data.
- An explanatory diagram showing an example of attendance data.
- An explanatory diagram showing an example of attribute data setting information.
- FIG. 1 is a block diagram illustrating a configuration example of the data analysis apparatus according to the first embodiment of this invention.
- The data analysis apparatus 10 shown in FIG. 1 includes a data input unit 11, an attribute data generation unit 12, a model learning unit 13, a related field extraction unit 14, and a summarization unit 15.
- the data input unit 11 inputs information required by each processing unit of the data analysis apparatus 10.
- The input information may include, for example, a designation of a target field, which is the field whose relationships are to be extracted from among the fields of health status data (data related to the health status of an employee), past health status data, and attendance data covering a predetermined period preceding the data measurement date, which is the date on which the health status data was measured.
- The health status data is, for example, health examination data indicating the results of the employee's health examination, examination, or health inquiry.
- the data measurement date is the date when the health examination data was acquired (if there are multiple examination dates)
- the health status data and the data measurement date are not limited to these.
- the health data in a broad sense includes attendance data.
- the data input unit 11 may input information indicating a method for generating attribute data as described later, for example.
- the target field may be any field included in the health condition data, or a plurality of fields may be included.
- The attribute data generation unit 12 generates attribute data indicating various characteristics of the work status of each employee based on the input attendance data. More specifically, for each employee and for each attendance data field, the attribute data generation unit 12 performs various aggregations at a predetermined time resolution, such as monthly, quarterly, semiannually, or annually, and generates attribute data that summarizes the information for each time range.
- The aggregation method used is not limited to one, and a plurality of aggregation methods may be used. Moreover, it is preferable that aggregation be performed on one attendance field using a plurality of time resolutions and time ranges.
- The model learning unit 13 learns a model represented by a polynomial in which the target field is the objective variable and each field of the attribute data (hereinafter referred to as an attribute field) is an explanatory variable, that is, a polynomial for calculating the value of the objective variable from the values of the explanatory variables, using the health examination data and attendance data of a plurality of employees. Specifically, the model learning unit 13 learns the coefficient of each explanatory variable in the polynomial.
- The related field extraction unit 14 extracts the attribute fields related to the target field as indicated by the learned model. Specifically, the related field extraction unit 14 may extract the attribute fields corresponding to explanatory variables whose coefficients are other than zero. As information on an extracted attribute field, the related field extraction unit 14 may also extract the attendance field used to generate the attribute field and the summarization method for that field (the time resolution, time range, aggregation method, etc. applied to the attendance field). In addition, the related field extraction unit 14 may extract the coefficient value as information indicating the degree of relevance.
- the summarization unit 15 summarizes and outputs the attendance data based on the extraction result by the related field extraction unit 14.
- The summarization unit 15 may, for example, exclude from the attendance data of the designated employee all fields other than the attendance fields used to generate the extracted attribute fields, and output the result.
- The summarization unit 15 may also aggregate, in the attendance data of the designated employee, the attendance fields used to generate the extracted attribute fields, using the extracted time resolution, time range, and aggregation method, and output the aggregation results together with the coefficient values.
- the summary unit 15 may omit the aggregation process.
- FIG. 2 is a configuration diagram illustrating an example of a hardware configuration of the data analysis apparatus 10.
- a data analysis apparatus 10 illustrated in FIG. 2 includes a CPU (Central Processing Unit) 1001, a memory 1002, an output device 1003, an input device 1004, and a network interface 1005.
- the memory 1002 is, for example, a RAM (Random Access Memory), a ROM (Read Only Memory), an auxiliary storage device (such as a hard disk), or the like.
- the output device 1003 is a device that outputs information, such as a display device or a printer.
- the input device 1004 is a device that receives an input of a user operation, such as a keyboard and a mouse.
- the network interface 1005 is an interface connected to a network constituted by, for example, the Internet, a LAN (Local Area Network), a public line network, a wireless communication network, or a combination thereof.
- each of the functional blocks of the data analysis apparatus 10 shown in FIG. 1 is configured by a CPU 1001 that reads and executes a computer program stored in the memory 1002 and controls other units.
- the hardware configuration of the data analysis device 10 and each functional block thereof is not limited to the above configuration.
- the data input unit 11 may read the input information from the memory 1002 in addition to inputting the input information from the outside.
- FIG. 3 is a flowchart showing an example of the operation of the data analysis apparatus 10 of the present embodiment.
- the data input unit 11 inputs the designation of the target field and each medical checkup data and attendance data of the employee (step S11).
- FIG. 4 is an explanatory diagram showing the time-series relationship between the attendance data used as input information and the time resolution applied to the attendance data.
- The attendance data may include, for example, records for a predetermined period (for example, one year) preceding the most recent medical examination date before the determination time (for example, the time at which advice is given).
- FIG. 4A shows an example in which the last day of the predetermined period is the medical examination date, but there may be a gap of any number of days between the predetermined period, which is the attendance data collection period, and the medical examination date.
- Alternatively, the attendance data may cover a predetermined period before a first time point, where the first time point is an arbitrary time point (for example, the end of the fiscal year) earlier than the most recent medical examination date.
- the medical examination date is not limited to the latest one.
- the attendance data only needs to include a record in a time range including a predetermined period that does not exceed the date (physical examination date) on which the contents of the target field to be extracted for the relationship are acquired.
- the time resolution used for generating attribute data is not particularly limited as long as it is a period shorter than the time range of the entire attendance data.
- FIG. 5 is an explanatory diagram showing a configuration example of attendance data.
- The attendance data may be information in which items relating to work status, such as daily work time, leaving time, working / non-working, and vacation / non-working hours for each employee, are arranged in time series.
- each item related to work status included in the attendance data is called a field, that is, an attendance field.
- a set of values of each attendance field at a certain point included in attendance data is called a record of attendance data.
- the attribute data generation unit 12 performs aggregation on a predetermined field included in the attendance data using an arbitrary time resolution, time range, and aggregation method to generate attribute data (step S12).
- FIG. 6 is an explanatory diagram showing an example of attribute data setting information.
- the attribute data generation unit 12 may generate attribute data by performing aggregation processing on the attendance field according to attribute data setting information indicating a generation method of attribute data as illustrated in FIG. 6.
- FIG. 6 shows an example of attribute data setting information that includes, for each attribute field, an identifier, a summary, the attendance field to be aggregated, and the time resolution, time range, and aggregation method for that attendance field.
- The attribute data generation unit 12 performs aggregation on the designated attendance field based on the time resolution, time range, and aggregation method indicated by such attribute data setting information, and uses each of the aggregation results as the value of an attribute field.
- The value of one attribute field may be calculated using a plurality of attendance fields.
- An example is a ratio calculated using the values of a plurality of attendance fields.
- In that case, the plurality of attendance fields are registered in the attribute data setting information as the attendance fields to be aggregated.
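As a rough illustration of how attribute data setting information might drive the aggregation, the following Python sketch models one setting entry as a small record and applies it to attendance rows. The class `AttributeSetting`, the field names, the identifiers `A1` and `A2`, and the three method names are assumptions made for this sketch, not the format used in FIG. 6.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AttributeSetting:
    """One hypothetical row of attribute data setting information."""
    identifier: str
    summary: str
    attendance_fields: tuple  # one or more attendance fields to aggregate
    time_resolution: str      # e.g. "month", "quarter"
    time_range: str           # e.g. "Q2"
    method: str               # "sum", "mean", or "ratio" in this sketch

SETTINGS = [
    AttributeSetting("A1", "leave days in Q2", ("leave_taken",), "quarter", "Q2", "sum"),
    AttributeSetting("A2", "overtime share of working hours in Q2",
                     ("overtime_hours", "work_hours"), "quarter", "Q2", "ratio"),
]

def aggregate(setting, rows):
    """Aggregate `rows`, assumed to be already restricted to setting.time_range."""
    columns = [[row[name] for row in rows] for name in setting.attendance_fields]
    if setting.method == "sum":
        return sum(columns[0])
    if setting.method == "mean":
        return sum(columns[0]) / len(columns[0]) if columns[0] else 0.0
    if setting.method == "ratio":
        # A ratio uses two attendance fields: numerator total / denominator total.
        denominator = sum(columns[1])
        return sum(columns[0]) / denominator if denominator else 0.0
    raise ValueError(f"unknown aggregation method: {setting.method}")

rows = [
    {"leave_taken": 1, "overtime_hours": 2.0, "work_hours": 10.0},
    {"leave_taken": 0, "overtime_hours": 1.0, "work_hours": 8.0},
]
attribute_values = {s.identifier: aggregate(s, rows) for s in SETTINGS}
```

The `ratio` entry shows the case in which a single attribute field is computed from a plurality of attendance fields.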
- FIG. 7 is an explanatory diagram showing an example of attribute data.
- the attribute data generation unit 12 may generate attribute data including the value of each attribute field (total result) as an attribute value for each employee.
- The model learning unit 13 learns a model composed of a polynomial in which the target field is the objective variable and each attribute field included in the attribute data generated in step S12 is an explanatory variable, using the medical examination data (in particular, the value of the target field) and the attribute data (in particular, the value of each attribute field) (step S13).
- the related field extracting unit 14 extracts an attribute field related to the target field, which is indicated by the model learned in step S13 (step S14).
- the related field extraction unit 14 may extract, for example, attribute field information corresponding to an explanatory variable in which a model parameter (polynomial coefficient) takes a value other than zero.
- The summarization unit 15 summarizes the attendance data of the designated employee based on the information extracted in step S14, and outputs it as information on the attendance data related to the designated target field (step S15).
- The designation of the employee is not limited to one, but may be plural (including all employees).
- In that case, the summarization unit 15 may summarize the attendance data for each of the designated employees and output it as information on the attendance data related to the designated target field.
- steps S13 to S15 may be repeated for each target field.
- steps S12 to S15 will be described in more detail.
- More detailed example of operation in the attribute data generation phase (step S12)
- Let N be the number of employees, where N is an integer of 1 or more.
- The attribute data of the nth employee is represented as X_n (n = 1, ..., N).
- The attribute data X_n in this example is represented as a vector composed of a plurality of elements.
- For example, when the number of elements (number of fields) of the attribute data is 7, X_n = (0, 0, 3, 2, 1, 0, 0) indicates that the value of the first attribute field is 0, the value of the second attribute field is 0, the value of the third attribute field is 3, the value of the fourth attribute field is 2, the value of the fifth attribute field is 1, the value of the sixth attribute field is 0, and the value of the seventh attribute field is 0.
- the attribute data generation unit 12 generates attribute data for each employee, and stores the generated attribute data in the memory 1002.
- One element (attribute field) of the attribute data of an employee may be, for example, the number of vacations or the number of late arrivals obtained by aggregating fields of the employee's attendance data, such as vacation acquisition dates and working hours, at an arbitrary time resolution. For example, for the number of vacations taken per month, the total number of days in the month on which vacation was taken is calculated as the attribute value of the attribute field. For the average working hours per month, (total working hours of the month / number of actual working days of the month) is calculated as the attribute value of the attribute field.
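The two monthly aggregations mentioned above (number of vacations per month, and total working hours divided by actual working days) can be sketched as follows. The record keys `date`, `on_leave`, and `work_hours` and the function name are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict
from datetime import date

def monthly_attributes(records):
    """Per (year, month): count leave days and average hours over days actually worked."""
    leave_days = defaultdict(int)
    total_hours = defaultdict(float)
    worked_days = defaultdict(int)
    for r in records:
        key = (r["date"].year, r["date"].month)
        if r["on_leave"]:
            leave_days[key] += 1
        else:
            total_hours[key] += r["work_hours"]
            worked_days[key] += 1
    months = sorted(set(leave_days) | set(worked_days))
    return {
        m: {
            "leave_count": leave_days[m],
            # total working hours of the month / number of actual working days
            "avg_work_hours": total_hours[m] / worked_days[m] if worked_days[m] else 0.0,
        }
        for m in months
    }

records = [
    {"date": date(2016, 1, 4), "on_leave": False, "work_hours": 8.0},
    {"date": date(2016, 1, 5), "on_leave": False, "work_hours": 10.0},
    {"date": date(2016, 1, 6), "on_leave": True, "work_hours": 0.0},
    {"date": date(2016, 2, 1), "on_leave": False, "work_hours": 9.0},
]
attributes = monthly_attributes(records)
```

A coarser time resolution such as quarterly could be obtained by changing how `key` is derived from the record date.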
- More detailed example of operation in the model learning phase (step S13)
- The jth element of the attribute data of employee n is represented as X_nj (j = 1, ..., M, where M is the number of elements of the attribute data).
- The value of the target field of employee n is represented as Y_n.
- The following equation (1) shows the relationship between Y_n and X_n:
- $Y_n = f(X_n)$ (1)
- the model learning unit 13 learns parameters necessary for expressing the function f () represented by the above equation (1).
- f () is a function expressed by a polynomial composed of explanatory variables and coefficients for each explanatory variable.
- X_n is an M-dimensional explanatory variable corresponding to the attribute data
- Y_n is a numerical value.
- When W is an M-dimensional weight vector, equation (1) is expressed as equation (2):
- $Y_n = W^T X_n$ (2)
- An M + 1-dimensional weight vector W may instead be obtained by adding to the M-dimensional vector W one dimension representing the intercept of the polynomial.
- In the following, the weight vector W is described as an M-dimensional vector, without distinguishing between the M-dimensional and M + 1-dimensional cases.
- the superscript T represents transposition of a vector.
- The value of the parameter W can be calculated by optimizing the objective function of the following equation (3):
- $L(W) = \sum_{n=1}^{N} (Y_n - W^T X_n)^2 + \lambda \|W\|$ (3)
- Here, λ is a parameter for adjusting the balance between the sum of squared errors (first term on the right side) and the penalty term (second term on the right side).
- $\|W\|$ is the norm of W.
- For example, the L1 norm or the L2 norm is used.
- L(W) is a convex function of W, and can be minimized by a method based on the gradient method.
- The model learning unit 13 may obtain the value of the parameter W that minimizes L(W) in the above equation (3).
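As one possible illustration of learning W for an objective made of a sum of squared errors plus an L1 norm penalty, the sketch below uses proximal gradient descent with soft-thresholding (often called ISTA). This particular algorithm is a choice made for the sketch, not something the text prescribes beyond "a method based on the gradient method", and the sketch treats the objective as a quantity to be minimized. The function names, the toy data, the step size, and λ = 0.5 are all assumptions.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, clipping small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_l1(X, Y, lam, step, iterations=2000):
    """Approximately minimize sum_n (Y_n - W.X_n)^2 + lam * ||W||_1 (sketch only)."""
    m = len(X[0])
    w = [0.0] * m
    for _ in range(iterations):
        residuals = [y - sum(wj * xj for wj, xj in zip(w, x)) for x, y in zip(X, Y)]
        # Gradient of the squared-error term with respect to each coefficient.
        grad = [-2.0 * sum(r * x[j] for r, x in zip(residuals, X)) for j in range(m)]
        w = [soft_threshold(wj - step * g, step * lam) for wj, g in zip(w, grad)]
    return w

# Toy data: the target depends only on the first attribute field (Y = 2 * x0).
X = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0], [0, 1, 1]]
Y = [2.0, 0.0, 2.0, 0.0, 2.0, 0.0]
w = fit_l1(X, Y, lam=0.5, step=0.05)
```

With an L1 penalty, coefficients of attribute fields that do not help explain the target tend to become exactly zero, which is what makes the later "extract fields with nonzero coefficients" step meaningful. The step size must be small enough for the iteration to converge; 0.05 was chosen for this toy data only.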
- The value of the parameter W obtained here may be expressed as Wc.
- The model learning unit 13 stores the obtained Wc in the memory 1002.
- FIG. 8 is an explanatory diagram showing an example of an attribute table including the model parameters Wc obtained as a result of the learning.
- In this example, the parameters Wc_14 and Wc_20 corresponding to the coefficients of the 14th and 20th attribute fields are values other than 0, and the other parameters Wc_1 to Wc_13, Wc_15 to Wc_19, and Wc_21 onward are 0.
- The model learning unit 13 may store, in the memory 1002, an attribute table in which an attribute field identifier and the parameter Wc_j obtained as the coefficient of that attribute field are associated with each other, as illustrated in FIG. 8.
- The related field extraction unit 14 may read Wc_j and extract the identifiers of the attribute fields whose Wc_j is nonzero. Then, based on the extracted identifiers, the set of attendance field, time resolution, time range, and aggregation method used to generate each attribute field may be extracted.
- The related field extraction unit 14 may treat the absolute value of Wc_j (j = 1, ..., M) as the degree of relevance.
- When Wc_j is a negative value, it indicates that there is a negative correlation between the target field and the j-th attribute field. When Wc_j is a positive value, it indicates that there is a positive correlation between the target field and the j-th attribute field. When Wc_j is 0, it indicates that there is no correlation between the target field and the j-th attribute field.
- The related field extraction unit 14 may extract the set of attendance field, time resolution, time range, and aggregation method for every attribute field whose Wc_j is a value other than 0 as a result of the model learning by the model learning unit 13. Further, the related field extraction unit 14 may store the extracted information in the memory 1002.
- For example, when the parameters Wc_14 and Wc_20 corresponding to the coefficients of the 14th and 20th attribute fields are values other than 0, the set of attendance field, time resolution, time range, and aggregation method is extracted for the 14th and 20th attribute fields and stored in the memory 1002.
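The extraction of attribute fields with nonzero coefficients can be sketched in Python as below. The dictionary layouts, the identifiers `A13`, `A14`, and `A20`, and the ordering by absolute coefficient value are assumptions introduced for illustration; the text itself only states that fields with nonzero coefficients and their generation settings may be extracted.

```python
def extract_related_fields(coefficients, settings):
    """Return the generation settings of attribute fields whose learned coefficient is nonzero."""
    related = []
    for identifier, coefficient in coefficients.items():
        if coefficient == 0.0:
            continue  # a zero coefficient is read as "no relationship"
        entry = {
            "identifier": identifier,
            "coefficient": coefficient,
            "direction": "positive" if coefficient > 0 else "negative",
        }
        entry.update(settings[identifier])
        related.append(entry)
    # Ordering by |coefficient| is an assumption of this sketch.
    related.sort(key=lambda e: abs(e["coefficient"]), reverse=True)
    return related

coefficients = {"A13": 0.0, "A14": 0.8, "A20": -1.2}
settings = {
    "A13": {"attendance_field": "late_arrival", "time_resolution": "month",
            "time_range": "last 12 months", "method": "sum"},
    "A14": {"attendance_field": "leave_taken", "time_resolution": "quarter",
            "time_range": "Q2", "method": "mean"},
    "A20": {"attendance_field": "work_hours", "time_resolution": "month",
            "time_range": "last 12 months", "method": "mean"},
}
related = extract_related_fields(coefficients, settings)
```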
- The summarization unit 15 reads from the memory 1002 the set of attendance field, time resolution, time range, and aggregation method stored as information on the attribute fields related to the target field. The summarization unit 15 then summarizes the attendance data of the designated employee based on the read information and outputs the result.
- the output destination may be the memory 1002, the output device 1003, or another device connected via the network interface 1005.
- FIG. 9 is an explanatory diagram illustrating an example of a summary result of attendance data output by the summarizing unit 15.
- The summarization unit 15 may output, for every attribute field that has a positive or negative correlation with the target field, the attribute value of the designated employee together with an overview of the attribute field, the attendance field used to generate it, and the degree of positive or negative correlation.
- this attribute value corresponds to the summary result of the attendance data of the employee.
- The model parameter Wc_j corresponds to the information indicating the degree of positive or negative correlation.
- FIG. 9 shows an example in which the average of the attribute values of all employees is also output.
- Information on the summarization method (time resolution, time range, aggregation method, etc.) may also be output.
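A summary output of the kind described (the designated employee's attribute value, the coefficient, and the all-employee average for each related attribute field) might be assembled as in the following sketch. The function name, the data layout, and the sample values are hypothetical and are not taken from FIG. 9.

```python
def summarize(employee_id, attribute_data, related_fields):
    """Build one summary row per related attribute field for the designated employee."""
    rows = []
    for field in related_fields:
        identifier = field["identifier"]
        all_values = [attrs[identifier] for attrs in attribute_data.values()]
        rows.append({
            "identifier": identifier,
            "coefficient": field["coefficient"],            # degree of +/- correlation
            "employee_value": attribute_data[employee_id][identifier],
            "all_employee_average": sum(all_values) / len(all_values),
        })
    return rows

attribute_data = {
    "e1": {"A14": 2.0, "A20": 160.0},
    "e2": {"A14": 4.0, "A20": 180.0},
}
related_fields = [
    {"identifier": "A14", "coefficient": 0.8},
    {"identifier": "A20", "coefficient": -1.2},
]
summary = summarize("e1", attribute_data, related_fields)
```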
- For example, suppose that the value of the model parameter Wc_j, the coefficient of the attribute field for the average number of leave acquisitions in the second quarter, indicates a positive correlation with the target field. From this, it can be interpreted that the larger the attribute value, the larger the value of the target field.
- For example, when the target field is a blood glucose level and the employee's value is high, the advisor can consider that a high average number of leave acquisitions in the second quarter may be one of the factors behind the high value. The same applies to the average monthly working hours.
- The advisor can thereby give accurate advice.
- As described above, the advisor is presented not only with the attendance fields related to the designated medical examination field, but also with the appropriate time resolution, time range, aggregation method, and so on for those attendance fields, and with the attendance data actually summarized by those methods. Therefore, the advisor can give appropriate advice based on such information. In addition, since the summarized attendance data is presented together with how the attendance fields it contains are related to the target field and to what degree (positive or negative correlation), the advisor can give more appropriate advice based on this information.
- the data analysis apparatus shown in FIG. 1 is intended to let an advisor easily grasp and understand from the attendance data whether or not the employee has a work situation related to the health condition at the time of determination. For this reason, the data analysis apparatus expresses the relationship between the medical examination data before the determination time and the attendance data for a predetermined period before the medical examination date on which that data was obtained as the coefficients of a polynomial model. Based on the value of each coefficient obtained by learning the model, the attendance data of each employee for the predetermined period is summarized and output.
- the following information is added to the input information. That is, as the attendance data, the second attendance data used for learning and the first attendance data targeted for presenting the relationship with the target field at a future time are input.
- FIG. 10 is an explanatory diagram showing an example of attendance data input in the first modification.
- the data input unit 11 of the present example inputs, as attendance data, for example, first attendance data including records for a first period before a first time point, which is a predetermined time point before the determination time, and second attendance data including records for the first period before a second time point, which is a predetermined time point at least a predetermined second period earlier than the most recent medical examination date (first medical examination date).
- the first medical examination date is shown as a date later than the first time point, but the relationship between the first medical examination date and the first time point is not limited to this. That is, the first medical examination date may be earlier than the first time point (see FIG. 11 described later).
- the contents of the target field of the medical examination data on the first medical examination date are used as the objective variable, and the contents of each attribute field of the second attribute data generated from the second attendance data are used as explanatory variables for learning. Then, based on what is learned, the relationship between the contents of the first attribute data generated from the first attendance data and the contents of the target field at the future predicted time point is presented. In other words, the first period before the second time point is used as the learning period, and the first period before the first time point is used as the prediction period. More specifically, the first attendance data is the attendance data used for prediction, that is, the object for deriving the relationship with the target field at the predicted time point, and the second attendance data is the attendance data used for learning for that prediction.
- FIG. 11 is an explanatory diagram showing another example of attendance data input in the first modification.
- the data input unit 11 may set, as the first time point, a predetermined past time point at least the second period earlier than the predicted time point, and input first attendance data including records for the first period before the first time point. It may also set, as the second time point, a predetermined time point at least the second period earlier than the most recent medical examination date before the determination time point (the first medical examination date in the figure), and input second attendance data including records for the first period before the second time point.
- the predicted time point may be a future time point at least the second period after any first time point before the determination time point.
- the first time point need only be earlier than the determination time point, and does not necessarily need to be earlier than the first medical examination date.
- the second time point may be any date at least the second period earlier than the first medical examination date, which is itself earlier than the determination time point.
- the first attendance data collection period and the second attendance data collection period do not necessarily have to be continuous or overlapped. That is, there may be an arbitrary number of days between the first attendance data collection period and the second attendance data collection period.
- the most recent medical examination data before the determination time may be referred to as first medical examination data.
- the first period may be referred to as a collection period
- the second period may be referred to as a retroactive period.
- the interval between the first medical examination date and the second time point is at least a certain length, more specifically, at least the period from the first time point to the desired predicted time point.
- the second period may be the same as the first period, or may be shorter or longer than the first period.
- the first attendance data and the second attendance data need not be distinguished; a single set of attendance data including records for a period covering both the first attendance data collection period and the second attendance data collection period may be input. Even in such a case, hereinafter, for convenience of explanation, the first attendance data and the second attendance data are expressed separately.
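The relationship between the time points and periods described above can be sketched with date arithmetic. This is only an illustration under assumed values: the dates, the period lengths, and the helper `collection_window` are hypothetical, and the second time point and predicted time point are placed at the minimum distance the text allows ("at least the second period").

```python
from datetime import date, timedelta


def collection_window(end_point, collection_period):
    """Return (start, end) of the collection period that ends at end_point."""
    return end_point - collection_period, end_point


# All dates and period lengths below are hypothetical.
collection_period = timedelta(days=365)   # the "first period" (collection period)
retroactive_period = timedelta(days=180)  # the "second period" (retroactive period)
first_exam_date = date(2015, 6, 1)        # most recent examination before the determination time
first_time_point = date(2015, 7, 1)       # a time point not later than the determination time

# Second time point: at least the retroactive period before the first examination date.
second_time_point = first_exam_date - retroactive_period
# Earliest predicted time point that can be reached from the first time point.
predicted_time_point = first_time_point + retroactive_period

# Second attendance data (learning) and first attendance data (prediction).
learning_window = collection_window(second_time_point, collection_period)
prediction_window = collection_window(first_time_point, collection_period)
```

Because both windows are derived from the same two period lengths, the gap between the learning window and the first examination date matches the gap between the prediction window and the predicted time point, which is what lets a model learned on the former be applied to the latter.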
- the configuration of the present modification may be basically the same as that of the first embodiment shown in FIG.
- the data input unit 11 inputs the second attendance data of each employee in addition to the input information in the first embodiment described above.
- the attribute data generation unit 12 generates attribute data for each employee based on the input second attendance data. Note that the method for generating attribute data may be the same as in the first embodiment.
- attribute data generated using the second attendance data may be referred to as second attribute data
- attribute data generated using the first attendance data may be referred to as first attribute data.
- the attribute data generation unit 12 may generate first attribute data in addition to the second attribute data.
- the time range is indicated by a specific date or the like.
- the time range of the attendance data to be aggregated is assumed to be set relative to the collection start time (for example, the past time point that is the first period back from the second medical examination date), with contents such as “January of the year of the start time”.
- the model learning unit 13 learns a polynomial model in which the target field of the first medical examination data is the objective variable and each attribute field of the second attribute data is an explanatory variable, using the first medical examination data and the second attribute data.
- this differs from the first embodiment described above in that the second attribute data is used instead of the first attribute data.
- the model can be said to represent the influence of attendance data from before the second time point, which is at least the second period earlier, on the target field value acquired on the first medical examination date.
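The source does not name the learning algorithm. As one possible illustration, the sketch below treats the polynomial as first-order and fits it with L1-regularised coordinate descent, chosen here only because it drives the coefficients of unrelated attribute fields to exactly zero. The function names, the penalty value, and the toy data are all assumptions.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_sparse_linear(rows, targets, penalty, sweeps=100):
    """Fit targets ~ sum_j w[j] * row[j] by coordinate descent with an L1 penalty."""
    n_cols = len(rows[0])
    weights = [0.0] * n_cols
    for _ in range(sweeps):
        for j in range(n_cols):
            rho = 0.0   # correlation of column j with the residual excluding column j
            norm = 0.0  # squared norm of column j
            for row, target in zip(rows, targets):
                partial = sum(weights[k] * row[k] for k in range(n_cols) if k != j)
                rho += row[j] * (target - partial)
                norm += row[j] ** 2
            weights[j] = soft_threshold(rho, penalty) / norm if norm else 0.0
    return weights


# Hypothetical second attribute data (centred toy values), one row per employee.
second_attributes = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
# Hypothetical target field values from the first medical examination data.
target_values = [1.1, 4.9, -0.9, -5.1]

weights = fit_sparse_linear(second_attributes, target_values, penalty=1.0)
# Attribute fields whose coefficient is non-zero are treated as related.
related_fields = [j for j, w in enumerate(weights) if w != 0.0]
```

In this toy data the second column contributes only weakly to the target, so the penalty removes it and only the first and third attribute fields remain as related fields.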
- the related field extraction unit 14 may be the same as that in the first embodiment described above. That is, the related field extraction unit 14 extracts an attribute field related to the target field, which is indicated by the learned model.
- the summarization unit 15 summarizes and outputs the first attendance data, for example, based on the information extracted by the related field extraction unit 14.
- the summarization process may be the same as in the first embodiment.
- the summarization unit 15 may output the attribute field information (attendance field, summarization method, degree of association, etc.) together with the attribute values of the specified employee's first attribute data.
- the summarizing unit 15 can omit the summarization process and use the attribute value of the first attribute data.
- for each employee, the advisor can easily grasp and understand the presence or absence of work status that is expected to affect the value of the target field in the medical examination data at least the second period in the future (e.g., six months later, one year later) from the first medical examination date.
- the relationship indicated by the information extracted by the related field extraction unit 14, between the target field in the first medical examination data and the second attendance data, is derived from past data, more specifically, data from before the determination time point. It is therefore assumed that there is no significant difference between the relationship of the target field value on the first medical examination date to the second attendance data and the relationship of the target field value at the future predicted time to the first attendance data.
- based on the first attendance data summarized by the summarization method specified by the model learned with the second attribute data (generated from the second attendance data) as learning data, the advisor can easily grasp and understand, for any employee, the presence of overwork and irregular working conditions related to the target field on the future medical examination date assumed as the predicted time.
- FIG. 12 is a block diagram illustrating a configuration example of the data analysis apparatus according to this modification.
- the data analysis apparatus 10 illustrated in FIG. 12 further includes a prediction unit 16 in addition to the configuration of the first modification.
- the data input unit 11, the attribute data generation unit 12, the model learning unit 13, and the related field extraction unit 14 may be the same as in the first modification.
- the prediction unit 16 predicts the value of the target field at a predetermined prediction time using the learned model and the first attribute data of the designated employee.
- the prediction unit 16 may calculate the value of the target field at the predicted time using the following equation (4), using the learned model parameter Wc and the first attribute data.
- the first attribute data of the designated employee used for prediction is represented as X′_n.
- the predicted time point may be the nearest future medical examination date that is at least the second period after the first medical examination date.
- the prediction unit 16 stores the calculated Y′_n in the memory 1002.
- Y′_n represents the predicted value of the target field at the predicted time for employee n.
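Equation (4) itself is not reproduced in this text. As a hedged sketch, the prediction can be pictured as a weighted sum of the first attribute data X′_n using the learned parameters Wc; the first-order form, the optional intercept, the function name `predict_target`, and the numbers are assumptions made for illustration.

```python
def predict_target(coefficients, first_attributes, intercept=0.0):
    """Return Y'_n as intercept plus the weighted sum of the attribute values X'_n."""
    if len(coefficients) != len(first_attributes):
        raise ValueError("coefficient and attribute vectors differ in length")
    return intercept + sum(w * x for w, x in zip(coefficients, first_attributes))


w_c = [0.0, 0.5, 0.0, -0.25]    # hypothetical learned Wc
x_prime = [3.0, 4.0, 1.0, 8.0]  # hypothetical X'_n for the designated employee
y_prime = predict_target(w_c, x_prime, intercept=5.0)
```

Only the second and fourth attribute fields influence the result here, because the other coefficients are zero.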
- the summarization unit 15 outputs, for example, the predicted value of the target field predicted by the prediction unit 16 in addition to the function of the summarization unit 15 in the first modification.
- FIG. 13 is a flowchart showing an example of the operation of the data analysis apparatus according to this modification.
- the data input unit 11 inputs necessary information (step S21).
- the data input unit 11 inputs the designation of the target field and the first medical examination data, the first attendance data, and the second attendance data of each employee.
- the attribute data generation unit 12 generates second attribute data based on the second attendance data (step S22).
- the model learning unit 13 learns a model using the value of the target field in the first medical examination data and the content of the second attribute data of a plurality of employees (step S23).
- the related field extraction unit 14 extracts information on attribute fields related to the target field indicated by the learned model (step S24).
- the prediction unit 16 calculates a predicted value of the target field of the employee at the time of prediction using the learned model and the first attribute data of the designated employee (step S25).
- the summary unit 15 summarizes the first attendance data of the designated employee based on the information extracted in step S24, and outputs the predicted value calculated in step S25 together with the summary result (step S26).
- the advisor can grasp and understand the current overwork and irregular working conditions related to the future medical examination results of any employee of interest and, based on whether the predicted value is good or bad, give health promotion advice to the employee.
- the attribute data generation unit 12 may include, in the attribute fields of the attribute data, the values of predetermined medical examination fields in the medical examination data, such as blood pressure, blood glucose (HbA1c, etc.), lipids (HDL, LDL, etc.), height, weight, and interview results (answers to questions about smoking habits, sleep habits, eating habits, etc.), or the results of tabulating those values by a predetermined method.
- K indicates the number of medical examination fields to be added to X_nj.
- the target field is not included in K.
- the target field in the medical examination data that is actually the target for extracting the relationship is referred to as “target field”.
- FIG. 14 is an explanatory diagram showing an example of attribute data setting information in this modification.
- the attribute data generation unit 12 may store in advance attribute data setting information indicating a method for generating attribute data from input information that includes not only attendance data but also medical examination data.
- FIG. 14 shows an example in which blood glucose (HbA1c), body weight, and lipid (HDL) values are used as elements of attribute data, that is, attribute fields, among the fields of medical examination data.
- medical examination result data can be specified as a data field.
- the data field “work / vacation acquisition” indicates that the data field to be aggregated is a leave acquisition field for attendance data.
- the data field “Healthy. Blood glucose” indicates that the data field to be counted is the blood glucose field of the medical examination data.
- the aggregation method “none” indicates that the value is used as it is.
- FIG. 15 is an explanatory diagram showing an example of attribute data generated based on the attribute data setting information shown in FIG.
- the values of at least the 50th to 52nd attribute fields are the values of the medical examination field.
- the attribute data generation unit 12 generates attribute data from the input attendance data and medical examination data for each employee based on the attribute data setting information.
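The setting-driven generation described above can be sketched as follows. This is a minimal illustration, assuming each setting names a data source, a field, and an aggregation method; the field names, the `settings` layout, and the function `build_attributes` are hypothetical and are not the format of FIG. 14.

```python
from statistics import mean

# Aggregation methods applied to attendance fields (names are illustrative).
AGGREGATORS = {"sum": sum, "mean": mean}

# Hypothetical attribute data setting information: one entry per attribute field.
settings = [
    {"source": "attendance", "field": "leave_taken", "method": "sum"},
    {"source": "attendance", "field": "hours_worked", "method": "mean"},
    {"source": "checkup", "field": "hba1c", "method": "none"},
]


def build_attributes(attendance_records, checkup_record, settings):
    """Return the attribute vector for one employee, one element per setting."""
    vector = []
    for setting in settings:
        if setting["source"] == "attendance":
            column = [record[setting["field"]] for record in attendance_records]
            vector.append(AGGREGATORS[setting["method"]](column))
        else:
            # Method "none": the examination value is used as it is.
            vector.append(checkup_record[setting["field"]])
    return vector


attendance = [
    {"leave_taken": 1, "hours_worked": 8.0},
    {"leave_taken": 0, "hours_worked": 10.0},
    {"leave_taken": 1, "hours_worked": 9.0},
]
checkup = {"hba1c": 5.6, "weight": 70.0}
attribute_vector = build_attributes(attendance, checkup, settings)
```

Keeping the generation rules in data rather than in code means a medical examination field can be added to the attribute data by appending one setting entry.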
- the related field extraction unit 14 may also extract, as information on the attribute fields related to the target field indicated by the learned model, the identifiers and summaries of the attendance fields and/or medical examination fields used to generate them.
- the summarization unit 15 summarizes and outputs the attendance data and the medical examination data based on the information extracted by the related field extraction unit 14.
- FIG. 16 is an explanatory diagram showing another example of the attribute table. According to FIG. 16, as a result of the model learning, the parameters Wc_14, Wc_20, and Wc_50, corresponding to the coefficients of the 14th, 20th, and 50th attribute fields, take values other than 0.
- FIG. 17 is an explanatory diagram illustrating an example of a summary result of attendance data and medical examination data output by the summary unit 15.
- the summary result may include an attribute field identifier, an outline, the field name of the original attendance data or medical examination data, a time range, a degree of relevance (model parameter Wc_j), an average value, and an aggregation result (attribute value).
- the summary result may further include information on time resolution and a totaling method.
- in addition to the designation of the target field and the first medical examination data, the first attendance data, and the second attendance data of each employee, the data input unit 11 inputs second medical examination data that was collected during the collection period of the second attendance data or within a predetermined number of days of that period (for example, before a predetermined number of days has elapsed).
- FIG. 18 is an explanatory diagram showing the relationship between attendance data and medical examination data (more specifically, medical examination date) in this modification.
- the data input unit 11 may set, as the first medical examination date, the medical examination date nearest to the last day of the first attendance data collection period, and input the medical examination data of that date as the first medical examination data. Likewise, it may set, as the second medical examination date, the medical examination date nearest to the last day of the second attendance data collection period, and input the medical examination data of that date as the second medical examination data.
- alternatively, the data input unit 11 may, for example, set a medical examination date that falls within the collection period of the first attendance data as the first medical examination date and input the medical examination data of that date as the first medical examination data, and set a medical examination date that falls within the collection period of the second attendance data as the second medical examination date and input the medical examination data of that date as the second medical examination data.
- the attribute data generation unit 12 generates second attribute data from the input second attendance data and second medical examination data for each employee based on the attribute data setting information.
- the attribute data generation unit 12 may further generate first attribute data for each employee from the input first attendance data and first medical examination data.
- the model learning unit 13 uses a target field included in the first medical examination data as an objective variable, and learns a model for calculating the value of the objective variable using the second attribute data.
- the related field extraction unit 14 may extract, as information on the attribute fields related to the target field indicated by the learned model, the identifiers and summaries of the attendance fields and/or medical examination fields used to generate them.
- the prediction unit 16 predicts the value of the target field at the time of prediction using the learned model and the first attribute data of the designated employee.
- the summarization unit 15 summarizes the first attendance data and the first medical examination data based on the information extracted by the related field extraction unit 14, and outputs the predicted value of the target field together with the summary result.
- the advisor can easily grasp and understand not only the presence or absence of overwork or irregular work status related to any health-related item of interest, but also other test values and interview results related to that item, which further improves the efficiency of health guidance.
- the employees may be divided into groups, and the processes of the above-described embodiments and modifications may be performed for each group.
- a model is learned for each group, and processing for extracting attribute field information related to the target field is performed.
- attendance data and, if necessary, medical examination data are summarized based on attribute field information extracted for each group to which the designated employee belongs.
- when calculating the predicted value of the target field, it is calculated using the model of the group to which the designated employee belongs.
- the employees may be grouped based on a predetermined condition. Alternatively, the employees may be grouped based on the attribute data generated by the attribute data generation unit 12.
- FIG. 19 is a block diagram illustrating a configuration example of the data analysis apparatus according to this modification.
- FIG. 19 shows a configuration example in which the fourth modification is combined with the third modification.
- the data analysis apparatus 10 illustrated in FIG. 19 further includes a grouping unit 17 in addition to the configuration of the third modification.
- the grouping unit 17 groups the employees based on a predetermined condition. For example, the grouping unit 17 may assign employees whose items such as office, department, occupation, age, and gender are the same or similar to the same group.
- the grouping unit 17 may use a general grouping method such as k-means clustering to assign employees with similar attribute data to the same group.
- the grouping unit 17 may group the employees based on a predetermined condition indicated by the value of the attribute data or a condition designated by the advisor.
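The simplest of the grouping conditions above, assigning employees with identical items to the same group, can be sketched as a group-by on selected keys. The employee profiles and the function `group_employees` are hypothetical; clustering by similarity, such as k-means, is not shown.

```python
from collections import defaultdict


def group_employees(employees, keys):
    """Group employee ids by the tuple of values they hold for the given keys."""
    groups = defaultdict(list)
    for employee_id, profile in employees.items():
        groups[tuple(profile[key] for key in keys)].append(employee_id)
    return dict(groups)


# Hypothetical employee profiles.
employees = {
    "emp1": {"office": "Tokyo", "occupation": "sales"},
    "emp2": {"office": "Tokyo", "occupation": "engineer"},
    "emp3": {"office": "Osaka", "occupation": "sales"},
    "emp4": {"office": "Tokyo", "occupation": "sales"},
}

groups = group_employees(employees, keys=("office", "occupation"))
```

A separate model would then be learned from the employees in each group, so the grouping keys should be coarse enough that every group still contains enough employees for learning.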
- a prediction formula for calculating a predicted value of the target field is provided for each group (see FIG. 20B).
- ⁇ 1 to ⁇ 4 represent intercepts of the respective prediction equations.
- the prediction unit 16 may calculate the predicted value of the employee's target field not only with the prediction formula of the group to which the employee belongs, but also with the prediction formulas of the other groups.
- based on the predicted value of the target field for each group and the grouping conditions, the advisor can easily identify, among the groups for which the predicted value of the target field falls within the target range, a group to which the employee could readily move, and can use this to suggest improvements to items such as work status.
- the prediction unit 16 may perform a process of calculating the predicted value of the target field of each group even in the case of other grouping methods.
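Evaluating one employee against every group's prediction formula can be sketched as below. The per-group intercepts and coefficients, the first-order form of the formulas, the target range, and the function names are all assumptions for illustration.

```python
# Hypothetical prediction formula per group: an intercept plus per-field coefficients.
group_models = {
    "G1": {"intercept": 5.0, "coefficients": [0.5, 0.0]},
    "G2": {"intercept": 5.5, "coefficients": [0.0, 0.25]},
}


def predict_all_groups(attribute_values, models):
    """Return the predicted target value under each group's prediction formula."""
    return {
        name: model["intercept"]
        + sum(w * x for w, x in zip(model["coefficients"], attribute_values))
        for name, model in models.items()
    }


def groups_within_target(predictions, low, high):
    """Return the groups whose predicted value falls within [low, high]."""
    return [name for name, value in predictions.items() if low <= value <= high]


employee_attributes = [2.0, 4.0]  # hypothetical first attribute data
predictions = predict_all_groups(employee_attributes, group_models)
candidates = groups_within_target(predictions, low=5.0, high=6.2)
```

The advisor would still need the grouping conditions to judge which of the candidate groups the employee could realistically move to.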
- when the grouping unit 17 performs grouping using the medical examination data of each employee, it may assign employees whose medical examination data are similar to the same group using, for example, a general grouping method such as k-means clustering. Further, when there are two types of medical examination data as in the first modification, the difference between the first medical examination data and the second medical examination data of each employee may be obtained, and employees whose differences are similar in magnitude may be assigned to the same group.
- FIG. 21 is a block diagram showing an outline of a data analysis apparatus according to the present invention.
- the data analysis apparatus 50 includes data acquisition means 51, attribute data generation means 52, model learning means 53, related field extraction means 54, and summarization means 55.
- the data acquisition means 51 (for example, the data input unit 11) acquires at least a designation of a target field, which is the field from which relations are to be extracted among the fields of health condition data that is information on the health condition of employees, and the health condition data and attendance data, which is information on work status, for two or more employees.
- the attribute data generation means 52 (for example, the attribute data generation unit 12) performs, for each employee, aggregation on predetermined fields of the attendance data using a predetermined time resolution, time range, and aggregation method, and generates attribute data having each aggregation result as an attribute field.
- the model learning means 53 (for example, the model learning unit 13) learns a model represented by a polynomial, with the target field as the objective variable and each attribute field of the attribute data as an explanatory variable, using the contents of the target field in the health condition data and the contents of the attribute data of two or more employees.
- the related field extracting means 54 (for example, the related field extracting unit 14) extracts an attribute field related to the target field, as indicated by the learned model.
- the summarizing means 55 summarizes and outputs the attendance data of the designated employee based on the extracted attribute field information.
- the model learning means may learn each coefficient of the explanatory variables included in the polynomial as a model parameter, and the related field extraction means may extract the attribute field corresponding to an explanatory variable whose coefficient takes a value other than zero as an attribute field related to the target field.
- the attribute data generation means may perform aggregation for one field of attendance data using a plurality of time resolutions, a plurality of time ranges, or a plurality of aggregation methods.
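Aggregating one attendance field over a time range with a given aggregation method can be sketched as follows. The monthly time resolution is fixed in this illustration, and the record layout, the field name `leave`, and the function `aggregate` are assumptions; months with no records are simply not counted.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def aggregate(records, field, start, end, method):
    """Total a daily field per month within [start, end], then combine the monthly totals."""
    monthly = defaultdict(float)
    for day, values in records:
        if start <= day <= end:
            monthly[(day.year, day.month)] += values[field]
    totals = list(monthly.values())
    return method(totals) if totals else 0.0


# Hypothetical daily attendance records: (date, field values).
records = [
    (date(2015, 4, 10), {"leave": 1}),
    (date(2015, 5, 1), {"leave": 1}),
    (date(2015, 5, 20), {"leave": 1}),
    (date(2015, 6, 15), {"leave": 0}),
    (date(2015, 7, 3), {"leave": 1}),
]

# Two attribute fields from the same attendance field, using different aggregation methods.
q2_start, q2_end = date(2015, 4, 1), date(2015, 6, 30)
q2_mean_leave = aggregate(records, "leave", q2_start, q2_end, mean)
q2_max_leave = aggregate(records, "leave", q2_start, q2_end, max)
```

Generating several attribute fields from one attendance field in this way lets the learned model indicate which time range and aggregation method is most closely related to the target field.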
- the data acquisition means may acquire designations of two or more target fields, and the model learning means may learn, for each of the two or more designated target fields, a model represented by a polynomial, with that target field as the objective variable and each attribute field of the attribute data as an explanatory variable, using the contents of the target field in the health condition data and the contents of the attribute data of two or more employees.
- the attendance data may include records for a first period before a first time point, which is a predetermined past time point at least a predetermined second period earlier than a predicted time point that is a predetermined future time point, and records for the first period before a second time point, which is a predetermined time point at least the second period earlier than the date on which the most recent health condition data was acquired. The attribute data generation means may perform aggregation using a predetermined time resolution, time range, and aggregation method on predetermined fields of second attendance data consisting of the records for the first period before the second time point, and generate second attribute data having each aggregation result as an attribute field. The model learning means may learn a model represented by a polynomial, with the target field in the most recent health condition data as the objective variable and each attribute field of the second attribute data as an explanatory variable, using the contents of the target field in the most recent health condition data and the contents of the second attribute data of two or more employees. The summarizing means may summarize first attendance data consisting of the designated employee's records for the first period before the first time point, based on the extracted attribute field information, and output the result as information on attendance data related to the target field.
- the data analysis device 50 may include prediction means (not shown; for example, the prediction unit 16) that predicts the value of the employee's target field at the predicted time point, based on the learned model and the first attribute data, which is attribute data generated using the first attendance data of the designated employee.
- the data analysis device 50 may also include grouping means (not shown; for example, the grouping unit 17) that groups employees based on predetermined conditions, health condition data, attendance data, or attribute data.
- the model learning means may learn a model for each group of employees using the contents of the target field and the contents of the attribute data in the health condition data of the employees belonging to the group.
- the attribute data may include an attribute field in which an aggregation result for a predetermined field of the health condition data other than the target field is registered.
- the attribute data generation means may perform aggregation using a predetermined time resolution, time range, and aggregation method on each of the predetermined fields of the attendance data and the fields of the health condition data other than the target field, and generate attribute data having each aggregation result as an attribute field.
- the summarizing means may summarize and output the attendance data and health condition data of the designated employee based on the extracted attribute field information.
- the present invention is not limited to providing, for the purpose of health guidance, information on the attendance data fields related to a given medical examination result; it can be suitably applied to analyzing the relationship between data having many fields and records and any item.
- This application claims priority based on Japanese Patent Application No. 2015-142404 filed on July 16, 2015, the entire disclosure of which is incorporated herein.
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Operations Research (AREA)
- General Business, Economics & Management (AREA)
- Tourism & Hospitality (AREA)
- Quality & Reliability (AREA)
- Marketing (AREA)
- Data Mining & Analysis (AREA)
- Educational Administration (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Development Economics (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- Game Theory and Decision Science (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Pathology (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
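Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a block diagram showing a configuration example of the data analysis device according to the first embodiment of the present invention.
The input information may include, for example: a designation of a target field, which is the field, among the fields contained in health condition data (data on the health condition of employees), from which relevance is to be extracted; past health condition data; and attendance data covering a predetermined period preceding the data measurement date, that is, the date on which the health condition data was measured or otherwise obtained.
In this example, assume that attendance data for N employees has been input, where N is an integer of 1 or more. The attribute data of the n-th employee is denoted X_n, where n = 1, ..., N. The attribute data X_n in this example is represented as a vector consisting of multiple elements. Suppose, for example, that the number of elements (number of fields) of the attribute data is 7. In this case, the attribute data generation unit 12 may generate, as the attribute data of the first employee, data expressed as X_1 = (0, 0, 3, 2, 1, 0, 0). This indicates that, for the first employee, the value of the first attribute field is 0, the second is 0, the third is 3, the fourth is 2, the fifth is 1, the sixth is 0, and the seventh is 0. The attribute data generation unit 12 generates attribute data for each employee and stores the generated attribute data in the memory 1002.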
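As a rough illustration of how an attribute vector such as X_1 could be built, the sketch below aggregates one attendance field under several combinations of time resolution, time range, and aggregation method. The function names, the bucketing rule (values are summed per bucket before the aggregation method is applied), and the sample overtime figures are assumptions made for illustration; the source does not specify them.

```python
from datetime import date, timedelta

# Aggregation methods applied to the per-bucket totals (illustrative set).
AGGREGATORS = {
    "sum": sum,
    "max": max,
    "mean": lambda values: sum(values) / len(values),
}


def aggregate_field(records, reference_date, resolution_days, range_days, method):
    """Aggregate one attendance field for one employee.

    records: list of (day, value) pairs, e.g. daily overtime hours.
    Values are summed into buckets of `resolution_days`, counted back from
    `reference_date`; the bucket totals inside `range_days` are then combined
    with `method`.
    """
    n_buckets = range_days // resolution_days
    buckets = [0.0] * n_buckets
    for day, value in records:
        age = (reference_date - day).days  # days before the reference date
        if 1 <= age <= n_buckets * resolution_days:
            buckets[(age - 1) // resolution_days] += value
    return AGGREGATORS[method](buckets)


def build_attribute_vector(records, reference_date, specs):
    """Return one attribute field per (resolution, range, method) spec."""
    return [aggregate_field(records, reference_date, *spec) for spec in specs]


# Hypothetical data: overtime hours logged 1, 3, 9 and 20 days before the checkup.
reference = date(2016, 7, 14)
overtime = [
    (reference - timedelta(days=days_ago), hours)
    for days_ago, hours in [(1, 2.0), (3, 1.0), (9, 4.0), (20, 3.0)]
]
# (resolution_days, range_days, method)
specs = [(7, 14, "max"), (7, 28, "mean"), (1, 7, "sum")]

x_1 = build_attribute_vector(overtime, reference, specs)
print(x_1)  # [4.0, 2.5, 3.0]
```

Applying several specs to the same field corresponds to the idea in claim 3 of aggregating a single attendance field with multiple resolutions, ranges, or methods.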
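Hereinafter, the j-th element of the attribute data of employee n is denoted X_nj, where j = 1, ..., M (M is the number of elements of the attribute data). The value of the target field in the health checkup data of employee n is denoted Y_n. Equation (1) below expresses the relationship between Y_n and X_n.
The related field extraction unit 14 reads, for example, the value of each model parameter Wc_j (j = 1, ..., M), which corresponds to a coefficient of the polynomial, from the attribute table stored in the memory 1002.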
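The extraction step can be sketched as follows, assuming (as claim 2 states) that an attribute field counts as related to the target field when its learned coefficient is non-zero. The attribute table layout, the field names, and the coefficient values are hypothetical.

```python
# Hypothetical attribute table: entry j records how attribute field j was built.
ATTRIBUTE_TABLE = [
    {"field": "overtime_hours", "resolution_days": 1, "range_days": 30, "method": "mean"},
    {"field": "overtime_hours", "resolution_days": 7, "range_days": 90, "method": "max"},
    {"field": "start_time", "resolution_days": 1, "range_days": 30, "method": "mean"},
    {"field": "holiday_work_days", "resolution_days": 30, "range_days": 90, "method": "sum"},
]


def extract_related_fields(attribute_table, coefficients, tolerance=0.0):
    """Return the table entries whose coefficient magnitude exceeds `tolerance`."""
    if len(attribute_table) != len(coefficients):
        raise ValueError("one coefficient is required per attribute field")
    return [
        entry
        for entry, weight in zip(attribute_table, coefficients)
        if abs(weight) > tolerance
    ]


# Made-up coefficients standing in for the learned values of Wc_j.
learned_coefficients = [0.0, 0.8, 0.0, -0.3]
related = extract_related_fields(ATTRIBUTE_TABLE, learned_coefficients)
for entry in related:
    print(entry)
```

The `tolerance` parameter is an added convenience for coefficients that are numerically tiny rather than exactly zero; with the default of 0.0 the function follows the non-zero rule literally.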
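The summarization unit 15 reads from the memory 1002 the sets of attendance field, time resolution, time range, and aggregation method stored as information on the attribute fields related to the target field. Based on the information read, the summarization unit 15 summarizes the attendance data of the designated employee and outputs the result. The output destination may be the memory 1002, the output device 1003, another device connected via the network interface 1005, or the like.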
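A minimal sketch of this summarization step is shown below, assuming that each extracted attribute field is stored as a dictionary holding the attendance field, time resolution, time range, and aggregation method, and that attendance data is a list of daily values ordered from oldest to newest. The field names, data values, and output wording are illustrative assumptions.

```python
def summarize_attendance(attendance, related_fields):
    """Build one summary line per extracted (field, resolution, range, method) set."""
    lines = []
    for spec in related_fields:
        # Keep only the most recent `range_days` daily values.
        values = attendance[spec["field"]][-spec["range_days"]:]
        step = spec["resolution_days"]
        # Sum the daily values into windows of `step` days, oldest window first.
        totals = [sum(values[i:i + step]) for i in range(0, len(values), step)]
        if spec["method"] == "max":
            result = max(totals)
        elif spec["method"] == "mean":
            result = sum(totals) / len(totals)
        elif spec["method"] == "sum":
            result = sum(totals)
        else:
            raise ValueError(f"unknown aggregation method: {spec['method']}")
        lines.append(
            f"{spec['field']}: {spec['method']} of {step}-day totals "
            f"over the last {spec['range_days']} days = {result:.1f}"
        )
    return lines


# Hypothetical daily overtime hours for the designated employee (oldest first).
attendance = {"overtime_hours": [0, 1, 2, 0, 3, 0, 0, 4, 0, 0, 1, 0, 0, 0]}
related_fields = [
    {"field": "overtime_hours", "resolution_days": 7, "range_days": 14, "method": "max"},
    {"field": "overtime_hours", "resolution_days": 1, "range_days": 7, "method": "mean"},
]

summary = summarize_attendance(attendance, related_fields)
for line in summary:
    print(line)
```

Only the fields selected by the extraction step are summarized, which is what keeps the output short enough for an advisor to read at a glance.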
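The data analysis device shown in FIG. 1 is intended to enable an advisor to easily grasp and understand, from attendance data, matters such as whether an employee's work situation includes anything related to the employee's health condition at the time of judgment. To this end, the data analysis device expresses, as coefficients of a polynomial model, the relevance between health checkup data from before the time of judgment and attendance data for a predetermined period preceding the checkup date on which that health checkup data was obtained. Based on the values of the coefficients obtained by learning the model, it then summarizes and outputs each employee's attendance data for the predetermined period.
In this modification, in addition to the functions of the first modification, a predicted value of the target field of the health checkup data at a future point in time is also provided to the advisor.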
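The prediction can be sketched as evaluating the learned model on the employee's attribute data. The example below assumes a first-order polynomial with a bias term; the exact model form, the coefficient values, and the attribute values are assumptions, not taken from the source.

```python
def predict_target(coefficients, bias, attributes):
    """Evaluate a first-order polynomial: bias + sum_j coefficients[j] * attributes[j]."""
    if len(coefficients) != len(attributes):
        raise ValueError("coefficients and attributes must have the same length")
    return bias + sum(w * x for w, x in zip(coefficients, attributes))


# Made-up learned parameters and attribute data for one designated employee.
coefficients = [0.0, 0.5, 0.0, 0.25]
bias = 100.0
first_attribute_data = [3.0, 4.0, 1.0, 8.0]

predicted = predict_target(coefficients, bias, first_attribute_data)
print(predicted)  # 104.0
```

Attribute fields with a zero coefficient do not affect the result, so the prediction depends only on the fields that the extraction step reports as related.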
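In this modification, not only attendance data but also health checkup data is used when generating attribute data.
Next, a fourth modification will be described. Health guidance is expected to give accurate advice suited to the characteristics of each employee, taking into account the characteristics of the employee's job type and place of business as well. For example, employees who differ in job type or place of business are likely to differ in arrival time, break time, average overtime hours, and so on.
This application claims priority based on Japanese Patent Application No. 2015-142404, filed on July 16, 2015, the entire disclosure of which is incorporated herein.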
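11 Data input unit
12 Attribute data generation unit
13 Model learning unit
14 Related field extraction unit
15 Summarization unit
16 Prediction unit
17 Grouping unit
50 Data analysis device
51 Data acquisition means
52 Attribute data generation means
53 Model learning means
54 Related field extraction means
55 Summarization means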
1001 CPU
1002 Memory
1003 Output device
1004 Input device
1005 Network interface
Claims (10)
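- A data analysis device comprising:
data acquisition means for acquiring at least a designation of a target field, which is the field from which relevance is to be extracted among the fields of health condition data, the health condition data being information on the health condition of employees, together with the health condition data and attendance data, the attendance data being information on work status, for two or more employees;
attribute data generation means for aggregating, for each of the employees, predetermined fields of the attendance data using a predetermined time resolution, time range, and aggregation method, and generating attribute data having each aggregation result as an attribute field;
model learning means for learning a model expressed as a polynomial, in which the target field is the objective variable and each attribute field of the attribute data is an explanatory variable, using the content of the target field in the health condition data and the content of the attribute data of the two or more employees;
related field extraction means for extracting the attribute fields related to the target field, as indicated by the learned model; and
summarization means for summarizing and outputting the attendance data of a designated employee based on information on the extracted attribute fields.
- The data analysis device according to claim 1, wherein
the model learning means learns, as model parameters, the coefficient of each explanatory variable included in the polynomial, and
the related field extraction means extracts, as the attribute fields related to the target field, the attribute fields corresponding to explanatory variables whose coefficients take non-zero values.
- The data analysis device according to claim 1 or claim 2, wherein
the attribute data generation means aggregates a single field of the attendance data using a plurality of time resolutions, a plurality of time ranges, or a plurality of aggregation methods.
- The data analysis device according to any one of claims 1 to 3, wherein
the data acquisition means acquires a designation of two or more target fields, and
the model learning means learns, for each of the two or more designated target fields, a model expressed as a polynomial, in which that target field is the objective variable and each attribute field of the attribute data is an explanatory variable, using the content of that target field in the health condition data and the content of the attribute data of the two or more employees.
- The data analysis device according to any one of claims 1 to 4, wherein
the attendance data includes records for a first period up to a first time point, the first time point being a predetermined past time point that precedes a prediction time point, which is a predetermined future time point, by a predetermined second period, and records for the first period up to a second time point, the second time point being a predetermined time point that precedes the date on which the most recent health condition data was acquired by the second period or more,
the attribute data generation means aggregates, for each of the employees, predetermined fields of second attendance data consisting of the records for the first period up to the second time point, using a predetermined time resolution, time range, and aggregation method, and generates second attribute data having each aggregation result as an attribute field,
the model learning means learns a model expressed as a polynomial, in which the target field in the most recent health condition data is the objective variable and each attribute field of the second attribute data is an explanatory variable, using the content of the target field in the most recent health condition data and the content of the second attribute data of two or more employees, and
the summarization means summarizes, based on information on the extracted attribute fields, first attendance data consisting of the records of the designated employee for the first period up to the first time point, and outputs the summary result as information on the attendance data related to the target field at the prediction time point.
- The data analysis device according to claim 5, further comprising
prediction means for predicting the value of the target field of the employee at the prediction time point, based on the learned model and first attribute data, which is attribute data generated using the first attendance data of the designated employee.
- The data analysis device according to any one of claims 1 to 6, further comprising
grouping means for grouping the employees based on a predetermined condition, the health condition data, the attendance data, or the attribute data, wherein
the model learning means learns a model for each group of employees, using the content of the target field in the health condition data and the content of the attribute data of the employees belonging to that group.
- The data analysis device according to any one of claims 1 to 7, wherein
the attribute data has an attribute field in which an aggregation result for a predetermined field of the health condition data other than the target field is registered,
the attribute data generation means aggregates, for each of the employees, predetermined fields of the attendance data and predetermined fields of the health condition data other than the target field, using a predetermined time resolution, time range, and aggregation method, and generates attribute data having each aggregation result as an attribute field, and
the summarization means summarizes and outputs the attendance data and the health condition data of the designated employee based on information on the extracted attribute fields.
- A data analysis method in which:
an information processing device acquires at least a designation of a target field, which is the field from which relevance is to be extracted among the fields of health condition data, the health condition data being information on the health condition of employees, together with the health condition data and attendance data, the attendance data being information on work status, for two or more employees;
the information processing device aggregates, for each of the employees, predetermined fields of the attendance data using a predetermined time resolution, time range, and aggregation method, and generates attribute data having each aggregation result as an attribute field;
the information processing device learns a model expressed as a polynomial, in which the target field is the objective variable and each attribute field of the attribute data is an explanatory variable, using the content of the target field in the health condition data and the content of the attribute data of the two or more employees;
the information processing device extracts the attribute fields related to the target field, as indicated by the learned model; and
the information processing device summarizes and outputs the attendance data of a designated employee based on information on the extracted attribute fields.
- A storage medium storing a data analysis program for causing a computer to execute:
a process of acquiring at least a designation of a target field, which is the field from which relevance is to be extracted among the fields of health condition data, the health condition data being information on the health condition of employees, together with the health condition data and attendance data, the attendance data being information on work status, for two or more employees;
a process of aggregating, for each of the employees, predetermined fields of the attendance data using a predetermined time resolution, time range, and aggregation method, and generating attribute data having each aggregation result as an attribute field;
a process of learning a model expressed as a polynomial, in which the target field is the objective variable and each attribute field of the attribute data is an explanatory variable, using the content of the target field in the health condition data and the content of the attribute data of the two or more employees;
a process of extracting the attribute fields related to the target field, as indicated by the learned model; and
a process of summarizing and outputting the attendance data of a designated employee based on information on the extracted attribute fields.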
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
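CN201680041982.6A CN107851295B (zh) | 2015-07-16 | 2016-07-14 | Data analysis device, data analysis method, and storage medium storing data analysis program |
JP2016573620A JP6105825B1 (ja) | 2015-07-16 | 2016-07-14 | Data analysis device, data analysis method, and data analysis program |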
US15/742,948 US20180225634A1 (en) | 2015-07-16 | 2016-07-14 | Data analysis device, data analysis method, and storage medium storing data analysis program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2015142404 | 2015-07-16 | ||
JP2015-142404 | 2015-07-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017010103A1 (ja) | 2017-01-19 |
Family
ID=57757273
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2016/003332 WO2017010103A1 (ja) | 2015-07-16 | 2016-07-14 | Data analysis device, data analysis method, and storage medium storing data analysis program |
Country Status (4)
Country | Link |
---|---|
US (1) | US20180225634A1 (ja) |
JP (1) | JP6105825B1 (ja) |
CN (1) | CN107851295B (ja) |
WO (1) | WO2017010103A1 (ja) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
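JP6481794B1 (ja) * | 2018-04-20 | 2019-03-13 | 富士通株式会社 | Learning data generation method and learning data generation program |
JP2019212155A (ja) * | 2018-06-07 | 2019-12-12 | 富士通株式会社 | Learning data generation program and learning data generation method |
JP2020038431A (ja) * | 2018-09-03 | 2020-03-12 | 孝文 栢 | Behavior recommendation device and behavior recommendation system |
JP2020095448A (ja) * | 2018-12-12 | 2020-06-18 | 株式会社日立ソリューションズ | Data processing system, method, and program |
JP2020190764A (ja) * | 2019-05-17 | 2020-11-26 | 株式会社エンジョイ | User stress coping system, user stress coping method, and user stress coping program |
JP2022112434A (ja) * | 2021-01-21 | 2022-08-02 | Tis株式会社 | Workplace improvement device, workplace improvement method, and workplace improvement program |
JP2022112435A (ja) * | 2021-01-21 | 2022-08-02 | Tis株式会社 | Workplace improvement device, workplace improvement method, and workplace improvement program |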
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7067235B2 (ja) * | 2018-04-20 | 2022-05-16 | 富士通株式会社 | Machine learning program, machine learning method, and machine learning device |
US20220237536A1 (en) * | 2019-06-17 | 2022-07-28 | Nec Corporation | Risk estimation apparatus, risk estimation method, computer program and recording medium |
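CN111462910A (zh) * | 2020-03-31 | 2020-07-28 | 上海商汤智能科技有限公司 | Item matching method and apparatus, electronic device, and storage medium |
CN117196560B (zh) * | 2023-11-07 | 2024-02-13 | 深圳市慧云智跑网络科技有限公司 | Data collection method and system for clock-in devices based on the Internet of Things |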
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003256578A (ja) * | 2002-03-05 | 2003-09-12 | Kobe Steel Ltd | Health management system |
JP2014130579A (ja) * | 2012-11-29 | 2014-07-10 | Exbrain Inc | Mental health management system, mental health management program, and mental health management method |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7403901B1 (en) * | 2000-04-13 | 2008-07-22 | Accenture Llp | Error and load summary reporting in a health care solution environment |
KR20050109919A (ko) * | 2002-12-10 | 2005-11-22 | 텔어바웃 인크 | Content creation, distribution, interaction, and monitoring system |
JP3992111B2 (ja) * | 2005-12-26 | 2007-10-17 | 日本アイ・ビー・エム株式会社 | System, method, and program for adjusting work schedules |
JP5471019B2 (ja) * | 2009-04-30 | 2014-04-16 | 株式会社リコー | Health management system and health management program |
US20140278455A1 (en) * | 2013-03-14 | 2014-09-18 | Microsoft Corporation | Providing Feedback Pertaining to Communication Style |
WO2015009287A1 (en) * | 2013-07-16 | 2015-01-22 | Wright Beth Ann | Learning model for competency based performance |
JP2015056150A (ja) * | 2013-09-13 | 2015-03-23 | 株式会社東芝 | Threat analysis system and information processing device |
US20150186817A1 (en) * | 2013-12-28 | 2015-07-02 | Evolv Inc. | Employee Value-Retention Risk Calculator |
EP2889822A1 (en) * | 2013-12-28 | 2015-07-01 | Evolv Inc. | Employee value-retention risk calculator |
EP3051483A1 (en) * | 2015-01-30 | 2016-08-03 | Ricoh Company, Ltd. | Information processing apparatus, and non-transitory recording medium |
CN104636889A (zh) * | 2015-03-10 | 2015-05-20 | 刘升 | Comprehensive personnel information management system |
-
2016
- 2016-07-14 JP JP2016573620A patent/JP6105825B1/ja active Active
- 2016-07-14 CN CN201680041982.6A patent/CN107851295B/zh active Active
- 2016-07-14 WO PCT/JP2016/003332 patent/WO2017010103A1/ja active Application Filing
- 2016-07-14 US US15/742,948 patent/US20180225634A1/en not_active Abandoned
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6481794B1 (ja) * | 2018-04-20 | 2019-03-13 | 富士通株式会社 | Learning data generation method and learning data generation program |
JP2019191781A (ja) * | 2018-04-20 | 2019-10-31 | 富士通株式会社 | Learning data generation method and learning data generation program |
JP2019212155A (ja) * | 2018-06-07 | 2019-12-12 | 富士通株式会社 | Learning data generation program and learning data generation method |
US11829867B2 (en) | 2018-06-07 | 2023-11-28 | Fujitsu Limited | Computer-readable recording medium and learning data generation method |
JP2020038431A (ja) * | 2018-09-03 | 2020-03-12 | 孝文 栢 | Behavior recommendation device and behavior recommendation system |
JP7224618B2 (ja) | 2018-09-03 | 2023-02-20 | 孝文 栢 | Behavior recommendation device and behavior recommendation system |
JP2020095448A (ja) * | 2018-12-12 | 2020-06-18 | 株式会社日立ソリューションズ | Data processing system, method, and program |
JP2020190764A (ja) * | 2019-05-17 | 2020-11-26 | 株式会社エンジョイ | User stress coping system, user stress coping method, and user stress coping program |
JP2022112434A (ja) * | 2021-01-21 | 2022-08-02 | Tis株式会社 | Workplace improvement device, workplace improvement method, and workplace improvement program |
JP2022112435A (ja) * | 2021-01-21 | 2022-08-02 | Tis株式会社 | Workplace improvement device, workplace improvement method, and workplace improvement program |
JP7198296B2 (ja) | 2021-01-21 | 2022-12-28 | Tis株式会社 | Workplace improvement device, workplace improvement method, and workplace improvement program |
Also Published As
Publication number | Publication date |
---|---|
CN107851295A (zh) | 2018-03-27 |
JPWO2017010103A1 (ja) | 2017-07-13 |
US20180225634A1 (en) | 2018-08-09 |
CN107851295B (zh) | 2022-02-25 |
JP6105825B1 (ja) | 2017-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6105825B1 (ja) | Data analysis device, data analysis method, and data analysis program | |
Malik et al. | Data mining and predictive analytics applications for the delivery of healthcare services: a systematic literature review | |
Abrizah et al. | LIS journals scientific impact and subject categorization: a comparison between Web of Science and Scopus | |
Young et al. | A survey of methodologies for the treatment of missing values within datasets: Limitations and benefits | |
Milovic et al. | Prediction and decision making in health care using data mining | |
Einav et al. | Selection on moral hazard in health insurance | |
Gorgens-Ekermans et al. | Psychological capital: Internal and external validity of the Psychological Capital Questionnaire (PCQ-24) on a South African sample | |
Keong Choong | Understanding the features of performance measurement system: a literature review | |
Li et al. | A literature review of nursing turnover costs | |
Bao et al. | Quantifying repetitive hand activity for epidemiological research on musculoskeletal disorders–Part II: comparison of different methods of measuring force level and repetitiveness | |
Oswald et al. | Meta-analysis and the art of the average | |
Briggs | Handling uncertainty in economic evaluation | |
Bauer et al. | How does scientific success relate to individual and organizational characteristics? A scientometric study of psychology researchers in the German-speaking countries | |
Barth et al. | The effects of scientists and engineers on productivity and earnings at the establishment where they work | |
Converse et al. | Thinking ahead: Assuming linear versus nonlinear personality-criterion relationships in personnel selection | |
Pool et al. | Size and characteristics of the biomedical research workforce associated with US National Institutes of Health extramural grants | |
Tong et al. | Testing the generalizability of an automated method for explaining machine learning predictions on asthma patients’ asthma hospital visits to an academic healthcare system | |
Wolcott et al. | Modeling time-dependent and-independent indicators to facilitate identification of breakthrough research papers | |
Bae et al. | Age and workplace ageism: a systematic review and meta-analysis | |
Piccialli et al. | A robust ensemble technique in forecasting workload of local healthcare departments | |
Steele et al. | Multilevel structural equation models for longitudinal data where predictors are measured more frequently than outcomes: An application to the effects of stress on the cognitive function of nurses | |
Maharani et al. | The effect of education, health, minimum wage, foreign investment on labor productivity in 33 provinces of Indonesia | |
Radha et al. | An experimental analysis of work-life balance among the employees using machine learning classifiers | |
US20230274834A1 (en) | Model-based evaluation of assessment questions, assessment answers, and patient data to detect conditions | |
Ang et al. | Employee turnover prediction by machine learning techniques |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | ENP | Entry into the national phase | Ref document number: 2016573620; Country of ref document: JP; Kind code of ref document: A |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16824087; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 15742948; Country of ref document: US |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 16824087; Country of ref document: EP; Kind code of ref document: A1 |