WO2019146123A1 - Alertness estimation device, alertness estimation method, and computer readable recording medium - Google Patents


Info

Publication number
WO2019146123A1
WO2019146123A1 (application PCT/JP2018/002804)
Authority
WO
WIPO (PCT)
Prior art keywords
series data
time
awakening
frame rate
information indicating
Prior art date
Application number
PCT/JP2018/002804
Other languages
French (fr)
Japanese (ja)
Inventor
Masanori Tsujikawa (辻川 剛範)
Original Assignee
NEC Corporation (日本電気株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corporation
Priority to PCT/JP2018/002804 priority Critical patent/WO2019146123A1/en
Priority to JP2019567823A priority patent/JP6879388B2/en
Publication of WO2019146123A1 publication Critical patent/WO2019146123A1/en

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B5/16: Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state

Definitions

  • The present invention relates to an arousal level estimation apparatus and an arousal level estimation method for estimating an arousal level representing the arousal state of a person, and further relates to a computer readable recording medium storing a program for realizing these.
  • One application is to detect the current arousal state of a person and, depending on the detected state, control the environment, for example the temperature, humidity, and illuminance of an office. In such applications, it is important to detect the arousal state of the person with high accuracy.
  • Patent Document 1 discloses an apparatus for estimating the arousal level of a person from the degree of opening of the eye.
  • The apparatus disclosed in Patent Document 1 obtains the eye-opening time of the driver's eyes from camera images sent at a set frame rate, obtains the variation in the obtained eye-opening times, and estimates the driver's arousal level from that variation.
  • Non-Patent Document 1 discloses an apparatus for estimating the arousal level (stress) of a person from a face image.
  • The device disclosed in Non-Patent Document 1 calculates the low-frequency HRV (Heart Rate Variability) component and the respiration rate from camera images of a person's face sent at a set frame rate, and inputs the calculated values into a statistical model to estimate the person's arousal level.
  • One example of the object of the present invention is to provide an arousal level estimation device, an arousal level estimation method, and a computer readable recording medium capable of accurately estimating the arousal level of a person while solving the above problems and reducing the processing load.
  • The arousal level estimation apparatus is an apparatus for estimating a user's arousal level, and includes:
  • An image data acquisition unit for acquiring image data including a face image of the user at a set frame rate;
  • a time-series data extraction unit that extracts time-series data indicating biological information of the user from the image data acquired at the set frame rate;
  • a data processing unit that interpolates the time-series data such that the sampling number of the extracted time-series data becomes a set value;
  • an arousal level estimation unit that inputs the interpolated time-series data into a learning model constructed using a convolutional neural network and estimates the arousal level of the user.
  • The arousal level estimation method is a method for estimating a user's arousal level, including the steps of: (a) acquiring image data including a face image of the user at a set frame rate; (b) extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate; (c) interpolating the time-series data so that the sampling number of the extracted time-series data becomes a set value; and (d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network to estimate the arousal level of the user.
  • The computer readable recording medium is a medium storing a program for causing a computer to estimate a user's arousal level.
  • The recording medium records a program including instructions that cause the computer to execute the steps of: (a) acquiring image data including a face image of the user at a set frame rate; (b) extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate; (c) interpolating the time-series data so that the sampling number of the extracted time-series data becomes a set value; and (d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network to estimate the arousal level of the user.
  • FIG. 1 is a block diagram showing a configuration of an arousal level estimation apparatus according to Embodiment 1 of the present invention.
  • FIG. 2 is a diagram showing an example of interpolation of time-series data performed in the first embodiment of the present invention.
  • FIG. 3 is a diagram showing an example of a convolutional neural network used in the first embodiment of the present invention.
  • FIG. 4 is a flowchart showing the operation of the alertness level estimation apparatus 10 according to the first embodiment of the present invention.
  • FIG. 5 is a block diagram showing the configuration of the awakening level estimation apparatus according to the second embodiment of the present invention.
  • FIG. 6 is a flow chart showing the operation of the awakening level estimation device 30 according to the second embodiment of the present invention.
  • FIG. 7 is a block diagram showing an example of a computer for realizing the arousal level estimation device in the first and second embodiments of the present invention.
  • Embodiment 1: The arousal level estimation device, arousal level estimation method, and program according to the first embodiment of the present invention will be described below with reference to FIGS. 1 to 4.
  • FIG. 1 is a block diagram showing a configuration of an arousal level estimation apparatus according to Embodiment 1 of the present invention.
  • the awakening level estimation device 10 is a device for estimating the awakening degree of the user. As shown in FIG. 1, the awakening level estimation device 10 includes an image data acquisition unit 11, a time series data extraction unit 12, a data processing unit 13, and an awakening level estimation unit 14.
  • the image data acquisition unit 11 acquires image data including a face image of the user at a set frame rate. Further, the time-series data extraction unit 12 extracts time-series data indicating biometric information of the user from the image data acquired at the set frame rate.
  • the data processing unit 13 interpolates time series data so that the sampling number of the extracted time series data becomes a set value.
  • the awakening level estimation unit 14 inputs time series data after interpolation into a learning model constructed using a convolutional neural network, and estimates the awakening degree of the user.
  • Because the interpolation restores the sampling number, the frame rate of the image data can be kept low in advance. Therefore, according to the first embodiment, it is possible to accurately estimate a person's arousal level while reducing the processing load on the device.
  • the awakening level estimation device 10 is connected to an external imaging device 20.
  • the imaging device 20 is a digital camera and outputs image data at a set frame rate.
  • the imaging device 20 is disposed so as to be able to capture a face image of the user.
  • the image data acquisition unit 11 acquires image data output from the imaging device 20.
  • time-series data extraction unit 12 extracts the above-mentioned information from the image data for each frame to generate time-series data.
  • The time-series data extraction unit 12, for example, detects the user's eyes from the image data, obtains the degree of eye opening/closing from the detected size of the eyes, and generates time-series data of information indicating the degree of eye opening/closing. The time-series data extraction unit 12 also detects the center positions of the user's eyes from the image data, calculates the gaze direction from the detected center positions, and generates time-series data of information indicating the calculated gaze direction.
  • The time-series data extraction unit 12 also detects the center line and contour of the user's face from the image data, calculates the direction of the user's face from the positional relationship between the detected center line and contour, and generates time-series data of information indicating the calculated face direction. Further, the time-series data extraction unit 12 detects the user's mouth from the image data, obtains the degree of mouth opening/closing from the size of the detected mouth, and generates time-series data of information indicating the degree of mouth opening/closing.
  • The time-series data extraction unit 12 can also calculate the user's pulse wave or blood flow using the property that hemoglobin in blood absorbs the green component of light (see, for example, the following reference). Specifically, the time-series data extraction unit 12 identifies the region of the user's skin from the image data, and calculates the luminance value of each of the R, G, and B channels in the identified region. Then, using the fact that green light is absorbed and the G luminance value decreases when blood flow increases, the unit calculates the user's pulse wave or blood flow and generates time-series data of information indicating the pulse wave or blood flow.
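As an illustration of the G-channel idea above, the following is a minimal NumPy sketch: when blood flow increases, hemoglobin absorbs more green light and the mean G luminance of the skin region drops, so the inverted G trace follows the pulse wave. The function name `green_trace`, the synthetic frames, and the skin mask are all hypothetical illustrations, not the patent's implementation.

```python
import numpy as np

def green_trace(frames, skin_mask):
    """Mean G-channel luminance over the skin region, one value per frame."""
    return np.array([frame[..., 1][skin_mask].mean() for frame in frames])

rng = np.random.default_rng(0)
skin_mask = np.ones((4, 4), dtype=bool)  # pretend the whole patch is skin
frames = [rng.uniform(0, 255, size=(4, 4, 3)) for _ in range(5)]  # RGB frames
trace = green_trace(frames, skin_mask)
pulse_like = -trace  # inverted: a rise corresponds to increased blood flow
```

In practice the skin region would come from a face detector, and the trace would be band-pass filtered before being used as pulse-wave time-series data.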
  • the data processing unit 13 interpolates time-series data by performing up-sampling on the time-series data.
  • FIG. 2 is a diagram showing an example of interpolation of time-series data performed in the first embodiment of the present invention.
  • In FIG. 2, the frame rate at which the time-series data should originally be obtained is R, while the frame rate of the actual time-series data is (R/2). The circles in the figure represent the time-series data points.
  • The data processing unit 13 adds new data points between each pair of consecutive time-series data points so that the sampling number becomes the set value, that is, so that the frame rate changes from (R/2) to R.
  • The interpolation of the time-series data is performed by, for example, linear interpolation or spline interpolation.
  • Alternatively, a neural network may be constructed using data with a frame rate of R/2 and data with a frame rate of R as learning data, and the interpolation may be performed by this neural network. Thereafter, as shown in FIG. 2, the up-sampled time-series data is input to the convolutional neural network by the arousal level estimation unit 14.
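The linear-interpolation case above can be sketched as follows: a series captured at (R/2) is upsampled to the set sampling number, as in FIG. 2. The series values and lengths are illustrative assumptions.

```python
import numpy as np

def upsample_linear(samples, target_len):
    """Linearly interpolate a 1-D series to target_len samples."""
    samples = np.asarray(samples, dtype=float)
    src = np.linspace(0.0, 1.0, num=len(samples))   # original sample times
    dst = np.linspace(0.0, 1.0, num=target_len)     # target sample times
    return np.interp(dst, src, samples)

# A series of 4 samples (frame rate R/2) upsampled to 7 samples (rate R):
eye_openness = [0.9, 0.8, 0.4, 0.7]
upsampled = upsample_linear(eye_openness, 7)
```

Spline interpolation (for example `scipy.interpolate.CubicSpline`) could be substituted for `np.interp` in the same structure.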
  • The data processing unit 13 can also perform various other types of signal processing, such as noise removal, interpolation of missing data, and removal of outliers. Such processing can be expected to improve the estimation accuracy of the arousal level estimation unit 14.
  • When the biological information indicated by the time-series data comprises two or more pieces of information, the learning model has a layer that performs convolution for each piece of biological information.
  • the estimation process by the awakening level estimation unit 14 in the first embodiment will be specifically described with reference to FIG.
  • FIG. 3 is a diagram showing an example of a convolutional neural network used in the first embodiment of the present invention.
  • In the example of FIG. 3, the time-series data indicates three pieces of information, namely information indicating the degree of eye opening/closing, information indicating the gaze direction, and information indicating the face direction, and convolution is performed for each piece of information.
  • Each time series is normalized to a range such as 0 to 1 or -1 to +1.
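A per-series min-max normalization of the kind described above can be sketched as follows; the target ranges and the example gaze values are illustrative assumptions.

```python
import numpy as np

def minmax_normalize(x, lo=0.0, hi=1.0):
    """Scale a 1-D series into [lo, hi], independently per time series."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.full_like(x, lo)  # constant series: map to the lower bound
    return lo + (x - x.min()) * (hi - lo) / span

gaze = [10.0, 20.0, 30.0]
norm01 = minmax_normalize(gaze)            # values in [0, 1]
norm_pm1 = minmax_normalize(gaze, -1, 1)   # values in [-1, +1]
```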
  • The size of the time-series data indicating the degree of eye opening/closing is (D_EC, W_T).
  • The arousal level estimation unit 14 inputs the time-series data of size (D_EC, W_T) into the convolution layer, convolves a weight filter over each window smaller than (D_EC, W_T), adds a bias, and obtains the output through an activation function. Next, the arousal level estimation unit 14 shifts the position of the window and performs the same operation using different weight filters and biases, obtaining an output of size (D_EC_C1, W_EC_C1).
  • the activation function may, for example, be a ReLU (Rectified Linear Unit).
  • In the pooling layer, the arousal level estimation unit 14 applies pooling to the input data of size (D_EC_C1, W_EC_C1), over each window smaller than (D_EC_C1, W_EC_C1). For example, with commonly used max pooling, only the maximum value in each window is kept.
  • The arousal level estimation unit 14 shifts the window position and performs the same pooling process, obtaining an output of size (D_EC_P1, W_EC_P1).
  • Similarly, in the next convolution layer, the arousal level estimation unit 14 performs filter convolution, bias addition, and input to the activation function on the input data of size (D_EC_P1, W_EC_P1), over each window smaller than (D_EC_P1, W_EC_P1). Here too, the unit shifts the window position and performs the same process, obtaining an output of size (D_EC_C2, W_EC_C2).
  • Similarly, in the following pooling layer, the arousal level estimation unit 14 applies pooling to the input data of size (D_EC_C2, W_EC_C2), over each window smaller than (D_EC_C2, W_EC_C2).
  • The arousal level estimation unit 14 shifts the window position and performs the same pooling process, obtaining, for example, an output of size (D_EC_P2, 1).
  • The arousal level estimation unit 14 performs the same processing on the time-series data indicating the gaze direction and the time-series data indicating the face direction, obtaining an output of size (D_EG_P2, 1) and an output of size (D_HP_P2, 1), respectively.
  • In the "concatenation and flattening" process, the arousal level estimation unit 14 concatenates and flattens the outputs of the individual time-series data (see FIG. 3), obtaining an output of size (1, D_EC_P2 + D_EG_P2 + D_HP_P2).
  • In the fully connected layer, the arousal level estimation unit 14 applies a weight filter and a bias to the input data of size (1, D_EC_P2 + D_EG_P2 + D_HP_P2), applies an activation function, and obtains the arousal level estimate as the output.
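The branch-per-signal pipeline of FIG. 3 (convolution, pooling, convolution, pooling per signal, then concatenation, flattening, and a fully connected output) can be sketched in NumPy as follows. All filter sizes, channel counts, window length, and weights here are illustrative assumptions with random values, not the patent's trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def conv1d(x, w, b):
    """Valid 1-D convolution + ReLU. x: (in_ch, T); w: (out_ch, in_ch, k)."""
    out_ch, in_ch, k = w.shape
    T = x.shape[1] - k + 1
    y = np.empty((out_ch, T))
    for o in range(out_ch):
        for t in range(T):
            y[o, t] = np.sum(w[o] * x[:, t:t + k]) + b[o]
    return relu(y)

def maxpool1d(x, k=2):
    """Non-overlapping max pooling along time."""
    T = x.shape[1] // k
    return x[:, :T * k].reshape(x.shape[0], T, k).max(axis=2)

def branch(x, w1, b1, w2, b2):
    """One per-signal branch: conv -> pool -> conv -> pool."""
    return maxpool1d(conv1d(maxpool1d(conv1d(x, w1, b1)), w2, b2))

T = 32  # samples per window after interpolation (illustrative)
signals = {name: rng.standard_normal((1, T))
           for name in ("eye_openness", "gaze_direction", "face_direction")}
params = {name: (rng.standard_normal((4, 1, 3)) * 0.1, np.zeros(4),
                 rng.standard_normal((4, 4, 3)) * 0.1, np.zeros(4))
          for name in signals}

# Concatenation and flattening of the three branch outputs:
features = np.concatenate(
    [branch(signals[n], *params[n]).ravel() for n in signals])
w_fc = rng.standard_normal(features.size) * 0.1  # fully connected weights
arousal_estimate = float(features @ w_fc)        # scalar arousal-level output
```

A real implementation would use a deep-learning framework and learned weights; the sketch only shows how the per-signal branches feed a single concatenated feature vector.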
  • The learning model uses a convolutional neural network; the weight filters and biases in the convolution layers are learned by deep learning, using sample data labeled in advance with the correct arousal level. The learning can be performed by, for example, the error backpropagation method, so that the difference between the correct arousal label and the estimated arousal value decreases.
  • The learning model may also be constructed by taking a plurality of time-series data with different frame rates (for example, time-series data with frame rates of R, R/2, R/3, R/6, and R/10), interpolating each so that the sampling number becomes the set value, and inputting the resulting data into the convolutional neural network as learning data. In this case, more accurate modeling is possible.
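The multi-frame-rate training idea above can be sketched as follows: lower frame rates are simulated by subsampling a full-rate series, each variant is interpolated back to the set sampling number, and all variants become learning data. The divisors and the sine-wave series are illustrative assumptions.

```python
import numpy as np

def simulate_rates(series, divisors=(1, 2, 3, 6)):
    """Subsample to rates R/d, then interpolate each back to the set length."""
    series = np.asarray(series, dtype=float)
    target = np.linspace(0.0, 1.0, num=len(series))
    variants = []
    for d in divisors:
        sub = series[::d]  # simulated capture at frame rate R/d
        src = np.linspace(0.0, 1.0, num=len(sub))
        variants.append(np.interp(target, src, sub))
    return variants

full_rate = np.sin(np.linspace(0, 2 * np.pi, 30))  # a full-rate series
training_inputs = simulate_rates(full_rate)        # 4 series, all length 30
```

Each element of `training_inputs` has the set sampling number, so the same network input size serves every simulated frame rate.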
  • the configuration of the convolutional neural network is not particularly limited.
  • The learning model may, for example, have a configuration in which the second pooling layer is removed and, instead, an additional fully connected layer is added after the concatenation and flattening.
  • Various modifications may be added to the learning model.
  • FIG. 4 is a flowchart showing the operation of the alertness level estimation apparatus 10 according to the first embodiment of the present invention.
  • FIGS. 1 to 3 will be referred to as appropriate.
  • In the first embodiment, the arousal level estimation method is implemented by operating the arousal level estimation apparatus 10. Therefore, the following description of the operation of the apparatus 10 substitutes for a description of the arousal level estimation method in the first embodiment.
  • First, the image data acquisition unit 11 acquires the image data output from the imaging device 20 and holds the acquired image data (step S1).
  • the image data acquisition unit 11 determines whether the number of stored image data has reached a predetermined value (step S2). As a result of the determination in step S2, when the number of image data has not reached the predetermined value, the image data acquisition unit 11 executes step S1 again. On the other hand, when the number of pieces of image data has reached the predetermined value as a result of the determination in step S2, the image data acquisition unit 11 passes the held image data to the time series data extraction unit 12.
  • the time-series data extraction unit 12 extracts time-series data indicating biological information of the user from the image data acquired in step S1 (step S3).
  • the time-series data extraction unit 12 can also extract time-series data for each user in step S3.
  • the data processing unit 13 interpolates time series data such that the sampling number of time series data extracted in step S3 becomes a set value (step S4).
  • Next, the arousal level estimation unit 14 inputs the time-series data interpolated in step S4 into the learning model constructed using the convolutional neural network, and estimates the user's arousal level (step S5).
  • When the time-series data indicates two or more pieces of biological information, the arousal level estimation unit 14 performs convolution for each piece of information to estimate the arousal level. After step S5 is executed, steps S1 to S5 are executed again, so the user's arousal level is estimated continuously.
  • the awakening degree estimation device 10 inputs the estimated awakening degree to the control system of the air conditioner, the operation system of the vehicle, and the like. Thereby, each system can perform optimization control based on the awakening degree of the user.
  • As described above, in the first embodiment, the frame rate of the image data can be kept low in advance. Therefore, the heavy processing by which the time-series data extraction unit 12 extracts data from the image data can be reduced, and as a result, a person's arousal level can be estimated accurately while reducing the processing load on the entire device. Furthermore, in the first embodiment, the estimation accuracy can be further improved by building the convolutional neural network model from interpolated time-series data of a plurality of frame rates. In addition, since plural pieces of biological information can be used as time-series data, the accuracy of the arousal level estimate can be further improved.
  • the program in the first embodiment may be a program that causes a computer to execute steps S1 to S5 shown in FIG.
  • the processor of the computer functions as the image data acquisition unit 11, the time series data extraction unit 12, the data processing unit 13, and the awakening level estimation unit 14 to perform processing.
  • each computer may function as any of the image data acquisition unit 11, the time series data extraction unit 12, the data processing unit 13, and the awakening level estimation unit 14.
  • FIG. 5 is a block diagram showing the configuration of the awakening level estimation apparatus according to the second embodiment of the present invention.
  • As shown in FIG. 5, the arousal level estimation apparatus 30 includes a frame rate adjustment unit 31 in addition to the same configuration as the arousal level estimation apparatus 10 of the first embodiment shown in FIG. 1.
  • The frame rate adjustment unit 31 adjusts the frame rate according to the arousal level estimated by the arousal level estimation unit 14. After adjusting the frame rate, the frame rate adjustment unit 31 notifies the imaging device 20, which outputs the image data, of the adjusted frame rate. The frame rate adjustment unit 31 may instead notify the image data acquisition unit 11 and the time-series data extraction unit 12 of the adjusted frame rate. The frame rate adjustment unit 31 can also adjust the frame rate according to the biological information indicated by the extracted time-series data.
  • When the estimated arousal level is stable, the frame rate adjustment unit 31 sets the frame rate low, reducing the processing load in the arousal level estimation apparatus 30.
  • When the arousal level changes significantly (when the change exceeds a predetermined range), the frame rate adjustment unit 31 sets the frame rate high, improving the estimation accuracy of the arousal level.
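The adjustment rule of the second embodiment can be sketched as a simple threshold function: raise the frame rate when the arousal level swings beyond a set range (to improve accuracy), lower it when the level is stable (to cut load). The threshold and the 10/30 fps values are illustrative assumptions, not values from the patent.

```python
def adjust_frame_rate(prev_arousal, new_arousal,
                      threshold=0.2, low_rate=10, high_rate=30):
    """Return the frame rate to request from the imaging device."""
    if abs(new_arousal - prev_arousal) > threshold:
        return high_rate  # large change: estimate at higher accuracy
    return low_rate       # stable: reduce the processing load

rate = adjust_frame_rate(0.8, 0.3)     # large drop in arousal level
steady = adjust_frame_rate(0.5, 0.55)  # stable arousal level
```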
  • FIG. 6 is a flow chart showing the operation of the awakening level estimation device 30 according to the second embodiment of the present invention.
  • FIG. 5 is referred to as appropriate.
  • In the second embodiment, the arousal level estimation method is implemented by operating the arousal level estimation apparatus 30. Therefore, the following description of the operation of the apparatus 30 substitutes for a description of the arousal level estimation method in the second embodiment.
  • the image data acquisition unit 11 acquires the output image data, and holds the acquired image data (step S11).
  • Next, the image data acquisition unit 11 determines whether the number of held image data items has reached a predetermined value (step S12). If the number has not reached the predetermined value as a result of the determination in step S12, the image data acquisition unit 11 executes step S11 again. On the other hand, when the number has reached the predetermined value as a result of the determination in step S12, the image data acquisition unit 11 passes the held image data to the time-series data extraction unit 12.
  • the time-series data extraction unit 12 extracts time-series data indicating biological information of the user from the image data acquired in step S11 (step S13).
  • the time-series data extraction unit 12 can also extract time-series data for each user in step S13.
  • the data processing unit 13 interpolates time series data such that the sampling number of time series data extracted in step S13 becomes a set value (step S14).
  • Next, the arousal level estimation unit 14 inputs the time-series data interpolated in step S14 into the learning model constructed using the convolutional neural network, and estimates the user's arousal level (step S15).
  • Steps S11 to S15 are similar to steps S1 to S5 shown in FIG.
  • After step S15, the frame rate adjustment unit 31 adjusts the frame rate according to the arousal level estimated in step S15 (step S16). Subsequently, the frame rate adjustment unit 31 notifies the imaging device 20 of the frame rate adjusted in step S16 (step S17).
  • After step S17 is executed, the imaging device 20 outputs image data at the instructed frame rate. Steps S11 to S17 are then executed again; at that time, the time-series data is generated at the instructed frame rate and the arousal level is newly estimated. Thus, in the second embodiment as well, the user's arousal level is estimated continuously.
  • the awakening level estimation device 30 inputs the estimated awakening level to the control system of the air conditioner, the operation system of the vehicle, and the like. Thereby, each system can perform optimization control based on the awakening degree of the user.
  • the frame rate of image data can be adjusted.
  • An appropriate frame rate can therefore be set in accordance with the required accuracy of the arousal level estimate. The second embodiment also provides the same effects as the first embodiment.
  • the program in the second embodiment may be a program that causes a computer to execute steps S11 to S17 shown in FIG.
  • the processor of the computer functions as the image data acquisition unit 11, the time series data extraction unit 12, the data processing unit 13, the alertness estimation unit 14, and the frame rate adjustment unit 31, and performs processing.
  • the program in the second embodiment may be executed by a computer system constructed by a plurality of computers.
  • Each computer may function as any of the image data acquisition unit 11, the time-series data extraction unit 12, the data processing unit 13, the arousal level estimation unit 14, and the frame rate adjustment unit 31.
  • FIG. 7 is a block diagram showing an example of a computer for realizing the arousal level estimation device in the first and second embodiments of the present invention.
  • The computer 110 includes a central processing unit (CPU) 111, a main memory 112, a storage device 113, an input interface 114, a display controller 115, a data reader/writer 116, and a communication interface 117. These units are communicably connected to one another via a bus 121.
  • the computer 110 may include a graphics processing unit (GPU) or a field-programmable gate array (FPGA) in addition to or instead of the CPU 111.
  • The CPU 111 loads the program (code) of the present embodiment stored in the storage device 113 into the main memory 112 and executes it in a predetermined order, thereby performing various operations.
  • the main memory 112 is typically a volatile storage device such as a dynamic random access memory (DRAM).
  • the program in the present embodiment is provided in the state of being stored in computer readable recording medium 120.
  • the program in the present embodiment may be distributed on the Internet connected via communication interface 117.
  • Examples of the storage device 113 include, besides a hard disk drive, a semiconductor storage device such as a flash memory.
  • the input interface 114 mediates data transmission between the CPU 111 and an input device 118 such as a keyboard and a mouse.
  • the display controller 115 is connected to the display device 119 and controls the display on the display device 119.
  • the data reader / writer 116 mediates data transmission between the CPU 111 and the recording medium 120, and executes reading of a program from the recording medium 120 and writing of the processing result in the computer 110 to the recording medium 120.
  • the communication interface 117 mediates data transmission between the CPU 111 and another computer.
  • Examples of the recording medium 120 include general-purpose semiconductor storage devices such as CF (CompactFlash (registered trademark)) and SD (Secure Digital) cards, magnetic recording media such as a flexible disk, and optical recording media such as a CD-ROM (Compact Disk Read Only Memory).
  • the awakening level estimation apparatus can also be realized by using hardware corresponding to each unit, not a computer on which a program is installed. Furthermore, the awakening level estimation device may be partially realized by a program, and the remaining portion may be realized by hardware.
  • (Supplementary note 1) An apparatus for estimating a user's arousal level, comprising: an image data acquisition unit that acquires image data including a face image of the user at a set frame rate; a time-series data extraction unit that extracts time-series data indicating biological information of the user from the image data acquired at the set frame rate; a data processing unit that interpolates the time-series data so that the sampling number of the extracted time-series data becomes a set value; and an arousal level estimation unit that inputs the interpolated time-series data into a learning model constructed using a convolutional neural network and estimates the arousal level of the user.
  • (Supplementary note 2) The arousal level estimation apparatus according to supplementary note 1, wherein the learning model is constructed by inputting, as learning data into a convolutional neural network, data obtained by interpolating a plurality of time-series data with different frame rates so that the sampling number becomes a set value.
  • (Supplementary note 3) The arousal level estimation apparatus according to supplementary note 1 or 2, wherein the biological information indicated by the time-series data is at least one of: information indicating the degree of eye opening/closing, information indicating the gaze direction, information indicating the face direction, information indicating the pulse wave, information indicating the blood flow, and information indicating the degree of mouth opening/closing.
  • (Supplementary note 4) The arousal level estimation apparatus according to supplementary note 3, wherein the learning model has a layer that performs convolution for each piece of the biological information.
  • (Supplementary note 5) The arousal level estimation apparatus according to any one of supplementary notes 1 to 4, further comprising a frame rate adjustment unit that adjusts the frame rate according to the estimated arousal level.
  • (Supplementary note 6) The arousal level estimation apparatus according to supplementary note 5, wherein the frame rate adjustment unit further adjusts the frame rate according to the biological information indicated by the extracted time-series data.
  • (Supplementary note 7) A method for estimating a user's arousal level, comprising the steps of: (a) acquiring image data including a face image of the user at a set frame rate; (b) extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate; (c) interpolating the time-series data so that the sampling number of the extracted time-series data becomes a set value; and (d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network to estimate the arousal level of the user.
	• (Supplementary Note 8) The alertness estimation method according to Supplementary Note 7, wherein the learning model is constructed by inputting, as learning data, data obtained by interpolating a plurality of time-series data with different frame rates so that the sampling number becomes a set value into a convolutional neural network.
	• (Supplementary Note 9) The alertness estimation method according to Supplementary Note 7 or 8, wherein the biological information indicated by the time-series data is at least one of: information indicating the degree of eye opening and closing, information indicating the direction of the line of sight, information indicating the direction of the face, information indicating the pulse wave, information indicating the blood flow, and information indicating the degree of mouth opening and closing.
	• (Supplementary Note 10) The alertness estimation method according to Supplementary Note 9, wherein the learning model has a layer for performing convolution for each piece of the biological information.
	• (Supplementary Note 11) The alertness estimation method according to any one of Supplementary Notes 7 to 10, further comprising (e) adjusting the frame rate according to the estimated alertness level.
	• (Supplementary Note 12) The alertness estimation method according to Supplementary Note 11, wherein, in the step (e), the frame rate is further adjusted according to the biological information indicated by the extracted time-series data.
	• (Supplementary Note 13) A computer-readable recording medium storing a program for causing a computer to estimate a user's alertness level, the program including instructions that cause the computer to execute: (a) acquiring image data including a face image of the user at a set frame rate; (b) extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate; (c) interpolating the time-series data so that the sampling number of the extracted time-series data becomes a set value; and (d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network to estimate the alertness level of the user.
	• (Supplementary Note 14) The computer-readable recording medium according to Supplementary Note 13, wherein the learning model is constructed by inputting, as learning data, data obtained by interpolating a plurality of time-series data with different frame rates so that the sampling number becomes a set value into a convolutional neural network.
	• (Supplementary Note 15) The computer-readable recording medium according to Supplementary Note 13 or 14, wherein the biological information indicated by the time-series data is at least one of: information indicating the degree of eye opening and closing, information indicating the direction of the line of sight, information indicating the direction of the face, information indicating the pulse wave, information indicating the blood flow, and information indicating the degree of mouth opening and closing.
	• (Supplementary Note 16) The computer-readable recording medium according to Supplementary Note 15, wherein the learning model has a layer for performing convolution for each piece of the biological information.
	• (Supplementary Note 17) The computer-readable recording medium according to any one of Supplementary Notes 13 to 16, wherein the program further includes instructions that cause the computer to execute (e) adjusting the frame rate according to the estimated alertness level.
	• (Supplementary Note 18) The computer-readable recording medium according to Supplementary Note 17, wherein, in the step (e), the frame rate is further adjusted according to the biological information indicated by the extracted time-series data.
	• According to the present invention, it is possible to accurately estimate a person's alertness level while reducing the processing load.
	• The present invention is useful for various systems that require estimation of a person's alertness level, for example, air conditioning systems and driving systems of vehicles such as automobiles.
	• Reference Signs List: 10 alertness estimation device (first embodiment), 11 image data acquisition unit, 12 time-series data extraction unit, 13 data processing unit, 14 alertness estimation unit, 20 imaging device, 30 alertness estimation device (second embodiment), 31 frame rate adjustment unit, 110 computer, 111 CPU, 112 main memory, 113 storage device, 114 input interface, 115 display controller, 116 data reader/writer, 117 communication interface, 118 input device, 119 display device, 120 recording medium, 121 bus

Abstract

An alertness estimation device 10 is a device for estimating a user's alertness and includes: an image data acquisition unit 11 which acquires image data that includes the user's facial image at a set frame rate; a time-series data extraction unit 12 which extracts time-series data that indicates the user's bioinformation from the image data acquired at the set frame rate; a data processing unit 13 which interpolates the time-series data such that the number of samples of the time-series data is at a set value; and an alertness estimation unit 14 which inputs the interpolated time-series data into a learning model constructed using a convolutional neural network and estimates the alertness of the user.

Description

Alertness estimation device, alertness estimation method, and computer readable recording medium
 The present invention relates to an alertness estimation device and an alertness estimation method for estimating an alertness level representing a person's arousal state, and further relates to a computer-readable recording medium storing a program for realizing these.
 In recent years, the working-age population has been shrinking due to the declining birthrate and aging society, and the labor shortage is worsening. Under these circumstances, attempts to replace part of the work previously performed by people with robots or artificial intelligence (AI) are increasing. However, among the tasks people perform, those that require intellectual labor are difficult to replace with robots or AI. It will therefore be essential for people to maintain and improve the productivity of intellectual labor.
 Unlike machines, however, people feel sleepy (a state of low arousal) or stressed (a state of hyperarousal). That is, the productivity of a person's intellectual labor changes according to his or her mental and physical arousal state. To improve the productivity of intellectual labor, it is therefore important to keep the arousal state at just the right level.
 One approach to keeping a person's arousal state at the right level during intellectual labor is to detect the person's current arousal state and, according to the detected state, control the environment, such as the temperature, humidity, and illuminance in the office. In this approach, it is particularly important to detect the person's arousal state with high accuracy.
 For example, Patent Document 1 discloses a device that estimates a person's alertness level from the degree of eye opening. The device disclosed in Patent Document 1 obtains the driver's eye-open durations from camera images sent at a set frame rate, computes the variability of those durations, and estimates the driver's alertness level from that variability.
 Non-Patent Document 1 discloses a device that estimates a person's alertness level (stress) from face images. The device disclosed in Non-Patent Document 1 calculates a low-frequency HRV (Heart Rate Variability) component and the respiration rate of a person's face from camera images sent at a set frame rate, inputs the calculated values into a statistical model, and estimates the person's alertness level.
Patent Document 1: International Publication No. WO 2010/092860
 In both the device disclosed in Patent Document 1 and the device disclosed in Non-Patent Document 1, the frame rate of the camera images must be set high and the images processed at that rate in order to maintain the estimation accuracy of the alertness level. This places a heavy processing load on the device.
 An example object of the present invention is to provide an alertness estimation device, an alertness estimation method, and a computer-readable recording medium that solve the above problem and can accurately estimate a person's alertness level while reducing the processing load.
 To achieve the above object, an alertness estimation device according to one aspect of the present invention is a device for estimating a user's alertness level, comprising:
an image data acquisition unit that acquires image data including a face image of the user at a set frame rate;
a time-series data extraction unit that extracts time-series data indicating biological information of the user from the image data acquired at the set frame rate;
a data processing unit that interpolates the time-series data so that the sampling number of the extracted time-series data becomes a set value; and
an alertness estimation unit that inputs the interpolated time-series data into a learning model constructed using a convolutional neural network and estimates the alertness level of the user.
 To achieve the above object, an alertness estimation method according to one aspect of the present invention is a method for estimating a user's alertness level, comprising the steps of:
(a) acquiring image data including a face image of the user at a set frame rate;
(b) extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate;
(c) interpolating the time-series data so that the sampling number of the extracted time-series data becomes a set value; and
(d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network to estimate the alertness level of the user.
 Furthermore, to achieve the above object, a computer-readable recording medium according to one aspect of the present invention is a computer-readable recording medium storing a program for causing a computer to estimate a user's alertness level, the program including instructions that cause the computer to execute the steps of:
(a) acquiring image data including a face image of the user at a set frame rate;
(b) extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate;
(c) interpolating the time-series data so that the sampling number of the extracted time-series data becomes a set value; and
(d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network to estimate the alertness level of the user.
 As described above, according to the present invention, it is possible to accurately estimate a person's alertness level while reducing the processing load.
FIG. 1 is a block diagram showing the configuration of the alertness estimation device according to Embodiment 1 of the present invention.
FIG. 2 is a diagram showing an example of the interpolation of time-series data performed in Embodiment 1 of the present invention.
FIG. 3 is a diagram showing an example of the convolutional neural network used in Embodiment 1 of the present invention.
FIG. 4 is a flowchart showing the operation of the alertness estimation device 10 according to Embodiment 1 of the present invention.
FIG. 5 is a block diagram showing the configuration of the alertness estimation device according to Embodiment 2 of the present invention.
FIG. 6 is a flowchart showing the operation of the alertness estimation device 30 according to Embodiment 2 of the present invention.
FIG. 7 is a block diagram showing an example of a computer that realizes the alertness estimation devices according to Embodiments 1 and 2 of the present invention.
(Embodiment 1)
 Hereinafter, an alertness estimation device, an alertness estimation method, and a program according to Embodiment 1 of the present invention will be described with reference to FIGS. 1 to 4.
[Device Configuration]
 First, the configuration of the alertness estimation device according to Embodiment 1 will be described with reference to FIG. 1. FIG. 1 is a block diagram showing the configuration of the alertness estimation device according to Embodiment 1 of the present invention.
 The alertness estimation device 10 according to the present embodiment, shown in FIG. 1, is a device for estimating a user's alertness level. As shown in FIG. 1, the alertness estimation device 10 includes an image data acquisition unit 11, a time-series data extraction unit 12, a data processing unit 13, and an alertness estimation unit 14.
 Of these, the image data acquisition unit 11 acquires image data including a face image of the user at a set frame rate. The time-series data extraction unit 12 extracts time-series data indicating biological information of the user from the image data acquired at the set frame rate.
 The data processing unit 13 interpolates the time-series data so that the sampling number of the extracted time-series data becomes a set value. The alertness estimation unit 14 inputs the interpolated time-series data into a learning model constructed using a convolutional neural network and estimates the alertness level of the user.
 As described above, in Embodiment 1, the sampling number of the time-series data extracted from the image data can be raised by interpolation, so the frame rate of the image data can be kept low in advance. Therefore, according to Embodiment 1, a person's alertness level can be estimated accurately while the processing load on the device is reduced.
 Next, the configuration of the alertness estimation device 10 according to Embodiment 1 will be described more specifically. First, as shown in FIG. 1, in Embodiment 1 the alertness estimation device 10 is connected to an external imaging device 20.
 The imaging device 20 is a digital camera and outputs image data at a set frame rate. The imaging device 20 is arranged so that it can capture a face image of the user. The image data acquisition unit 11 acquires the image data output from the imaging device 20.
 In Embodiment 1, examples of the biological information indicated by the time-series data include information indicating the user's degree of eye opening and closing, information indicating the direction of the line of sight, information indicating the direction of the face, information indicating the pulse wave, information indicating the blood flow, and information indicating the degree of mouth opening and closing. In Embodiment 1, the time-series data extraction unit 12 extracts the above information from the image data for each frame to generate the time-series data.
 Specifically, the time-series data extraction unit 12, for example, detects both of the user's eyes in the image data, obtains the degree of eye opening and closing from the detected size of each eye, and generates time-series data of information indicating the degree of eye opening and closing. The time-series data extraction unit 12 also detects the center position of each of the user's eyes in the image data, calculates the direction of the line of sight from the detected center positions, and generates time-series data of information indicating the calculated gaze direction.
 Furthermore, the time-series data extraction unit 12 detects the center line and contour of the user's face in the image data, calculates the direction of the user's face from the positional relationship between the detected center line and contour, and generates time-series data of information indicating the calculated face direction. The time-series data extraction unit 12 also detects the user's mouth in the image data, obtains the degree of mouth opening and closing from the detected size of the mouth, and generates time-series data of information indicating the degree of mouth opening and closing.
 The time-series data extraction unit 12 also calculates the user's pulse wave or blood flow using the property that hemoglobin in blood absorbs the green component of light (see, for example, the reference below). Specifically, the time-series data extraction unit 12 identifies the region of the user's skin in the image data and calculates the luminance value of each of the R, G, and B channels in the identified region. Then, using the fact that green light is absorbed and the G luminance value decreases as the blood flow increases, the time-series data extraction unit 12 calculates the user's pulse wave or blood flow and generates time-series data of information indicating the pulse wave or blood flow.
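As a minimal illustrative sketch (an assumption of this edit, not code from the publication), the green-channel property described above can be approximated by averaging the G channel over a skin region in each frame and flipping the sign of its deviation from the mean; the function name `pulse_signal` and the fixed rectangular ROI are hypothetical simplifications of the skin-region identification step:

```python
import numpy as np

def pulse_signal(frames, roi):
    """Rough pulse-wave trace from a stack of RGB frames.

    frames: array of shape (n_frames, height, width, 3), RGB order.
    roi: (top, bottom, left, right) bounds of a skin region, e.g. the forehead.
    Returns one value per frame: the zero-mean, sign-flipped mean green
    luminance of the ROI (more blood -> more green absorbed -> lower G).
    """
    top, bottom, left, right = roi
    g = frames[:, top:bottom, left:right, 1].mean(axis=(1, 2))
    return -(g - g.mean())

# Example with synthetic frames whose green channel oscillates like a pulse.
t = np.arange(100)
frames = np.zeros((100, 8, 8, 3))
frames[..., 1] = 0.5 + 0.1 * np.sin(2 * np.pi * t / 25)[:, None, None]
signal = pulse_signal(frames, (0, 8, 0, 8))
```

In a real system, the sign-flipped G trace would be band-pass filtered around plausible heart rates before use; that step is omitted here.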
 Reference: A. Umematsu and T. Tsujikawa, "High-Accuracy Heart Rate Estimation from Facial Video Based on ICA-R," NEC Data Science Research Laboratories, IEICE Technical Report, The Institute of Electronics, Information and Communication Engineers, 2017.
 In Embodiment 1, as shown in FIG. 2, the data processing unit 13 interpolates the time-series data by up-sampling it. FIG. 2 is a diagram showing an example of the interpolation of time-series data performed in Embodiment 1 of the present invention. In the example of FIG. 2, the frame rate originally required for the time-series data is R, while the frame rate of the actual time-series data is R/2. The circles ("○") in the figure represent time-series data samples.
 As shown in FIG. 2, the data processing unit 13 adds a new sample between each pair of consecutive time-series samples so that the sampling number reaches the set value, that is, so that the frame rate rises from R/2 to R.
 The new samples are added by, for example, linear interpolation or spline interpolation. Alternatively, for example, a neural network may be constructed using data with a frame rate of R/2 and data with a frame rate of R as learning data, and the interpolation may be performed by this neural network. Thereafter, as shown in FIG. 2, the up-sampled time-series data is input into the convolutional neural network by the alertness estimation unit 14.
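The linear-interpolation variant of this up-sampling can be sketched in a few lines of NumPy (this is an illustration, not the publication's implementation; `n_target` stands for the set sampling number):

```python
import numpy as np

def upsample_to(series, n_target):
    """Linearly interpolate a 1-D time series so it has n_target samples.

    This corresponds to raising an effective frame rate of R/2 back to R:
    the original samples are kept, and new samples are inserted between them.
    """
    n_src = len(series)
    # Place source and target samples on a common normalized time axis.
    x_src = np.linspace(0.0, 1.0, n_src)
    x_tgt = np.linspace(0.0, 1.0, n_target)
    return np.interp(x_tgt, x_src, series)

# Example: 5 samples captured at rate R/2, interpolated up to 9 samples (rate R).
raw = np.array([0.0, 1.0, 0.5, 0.25, 1.0])
up = upsample_to(raw, 9)
```

Spline interpolation would replace `np.interp` with a spline evaluator; the surrounding logic stays the same.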
 In addition, the data processing unit 13 can also perform various kinds of signal processing, for example noise removal, interpolation of missing data, and removal of outliers. Such processing can be expected to improve the accuracy of the subsequent alertness estimation by the alertness estimation unit 14.
 In Embodiment 1, when the time-series data indicates two or more kinds of biological information, the learning model has a layer for performing convolution for each kind of biological information. The estimation processing performed by the alertness estimation unit 14 in Embodiment 1 will now be described in detail with reference to FIG. 3.
 FIG. 3 is a diagram showing an example of the convolutional neural network used in Embodiment 1 of the present invention. In the example of FIG. 3, the time-series data indicates three kinds of information, namely information indicating the degree of eye opening and closing, information indicating the gaze direction, and information indicating the face direction, and convolution is performed for each kind of information. Each time series is assumed to be normalized to a range such as 0 to 1 or -1 to +1.
 As shown in FIG. 3, let the size of the time-series data indicating the degree of eye opening and closing be (D_EC, W_T). D_EC is the number of time series indicating the degree of eye opening and closing; for example, when time-series data on the opening and closing of the right eye and the left eye are input, D_EC = 2. W_T is the time window width used for alertness estimation; for example, when 10 s of data are used at a frame rate R, W_T = 10R.
 First, the alertness estimation unit 14 inputs the time-series data of size (D_EC, W_T) into a convolution layer, and for each window smaller than (D_EC, W_T) it convolves a weight filter, adds a bias, and passes the result through an activation function to obtain an output. The alertness estimation unit 14 then shifts the window position and performs the same operation with different weight filters and biases, obtaining an output of size (D_EC_C1, W_EC_C1). An example of the activation function is the ReLU (Rectified Linear Unit).
 Next, in a pooling layer, the alertness estimation unit 14 applies pooling to the input data of size (D_EC_C1, W_EC_C1) for each window smaller than (D_EC_C1, W_EC_C1). For example, in the commonly used max pooling, only the maximum value within the window is kept. The alertness estimation unit 14 then shifts the window position and repeats the pooling, obtaining an output of size (D_EC_P1, W_EC_P1).
 In the next convolution layer, the alertness estimation unit 14 likewise convolves a filter, adds a bias, and applies the activation function to the input data of size (D_EC_P1, W_EC_P1), for each window smaller than (D_EC_P1, W_EC_P1). Again shifting the window position and repeating the operation, it obtains an output of size (D_EC_C2, W_EC_C2).
 In the next pooling layer, the alertness estimation unit 14 likewise applies pooling to the input data of size (D_EC_C2, W_EC_C2) for each window smaller than (D_EC_C2, W_EC_C2). Shifting the window position and repeating the pooling, it obtains an output of size, for example, (D_EC_P2, 1).
 In Embodiment 1, the alertness estimation unit 14 performs the same processing on the time-series data indicating the gaze direction and the time-series data indicating the face direction, obtaining, for example, an output of size (D_EG_P2, 1) and an output of size (D_HP_P2, 1).
 Next, as a "concatenate & flatten" operation (see FIG. 3), the alertness estimation unit 14 concatenates and flattens the outputs for the individual time series, obtaining an output of size (1, D_EC_P2 + D_EG_P2 + D_HP_P2).
 Then, in a fully connected layer, the alertness estimation unit 14 applies a weight filter to all of the input data of size (1, D_EC_P2 + D_EG_P2 + D_HP_P2), adds a bias, and passes the result through an activation function, obtaining an alertness estimate as the output.
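The per-modality branch structure above can be sketched compactly in NumPy. This is a compressed illustration under assumptions of this edit, not the network of FIG. 3: each branch uses a single convolution/pooling stage with a global max pool instead of the two stages described above, the channel counts and filter sizes are made up, and random weights stand in for learned ones:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    """Valid 1-D convolution over time. x: (d_in, t), w: (d_out, d_in, k), b: (d_out,)."""
    d_out, d_in, k = w.shape
    t_out = x.shape[1] - k + 1
    out = np.empty((d_out, t_out))
    for j in range(t_out):
        window = x[:, j:j + k]  # (d_in, k) slice under the current window position
        out[:, j] = np.tensordot(w, window, axes=([1, 2], [0, 1])) + b
    return out

def relu(x):
    return np.maximum(x, 0.0)

def branch(x, w, b, pool=2):
    """Convolution -> ReLU -> max pooling -> global max pool, one modality."""
    h = relu(conv1d(x, w, b))
    t = h.shape[1] - h.shape[1] % pool
    h = h[:, :t].reshape(h.shape[0], -1, pool).max(axis=2)  # non-overlapping pools
    return h.max(axis=1)  # collapse time, analogous to the final (D, 1) output

# Three modalities, each 2 channels x 100 time steps, normalized to [0, 1].
eye, gaze, head = (rng.random((2, 100)) for _ in range(3))
params = [(rng.standard_normal((4, 2, 5)) * 0.1, np.zeros(4)) for _ in range(3)]

# Per-modality convolution, then concatenate & flatten, then a fully
# connected sigmoid unit producing the alertness estimate.
features = np.concatenate([branch(x, w, b)
                           for x, (w, b) in zip((eye, gaze, head), params)])
w_fc, b_fc = rng.standard_normal(features.shape[0]) * 0.1, 0.0
alertness = 1.0 / (1.0 + np.exp(-(features @ w_fc + b_fc)))
```

A production implementation would express the same structure with a deep learning framework's `Conv1d`/`MaxPool1d` layers and train the weights as described in the following paragraphs.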
 In Embodiment 1, the learning model uses a convolutional neural network, and the weight filters and biases of its convolution layers are learned in advance by deep learning using sample data labeled with correct alertness values. The learning can be performed using, for example, error backpropagation so that the difference between the correct alertness label and the estimated alertness value becomes small.
 Furthermore, in learning, overfitting can be prevented by using dropout, in which weights and biases are set to zero with a certain probability. In Embodiment 1, the learning model may also be constructed by inputting, as learning data, data obtained by interpolating a plurality of time-series data with different frame rates (for example, time-series data with frame rates R, R/2, R/3, R/6, and R/10) so that the number of samples becomes the set value into the convolutional neural network. In this case, more precise modeling becomes possible.
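The multi-frame-rate learning data mentioned above can be emulated from a single full-rate recording by decimating it and interpolating each decimated copy back to the set sample count; the following sketch (an assumption of this edit, not the publication's training code) shows the idea for rates R, R/2, R/3, R/6, and R/10:

```python
import numpy as np

def resample(series, n_target):
    # Linear interpolation to exactly n_target samples on a normalized axis.
    x_src = np.linspace(0.0, 1.0, len(series))
    x_tgt = np.linspace(0.0, 1.0, n_target)
    return np.interp(x_tgt, x_src, series)

def multi_rate_training_data(series, divisors=(1, 2, 3, 6, 10)):
    """From one full-rate series, emulate capture at R, R/2, R/3, R/6, R/10
    and interpolate each back to the original length."""
    n = len(series)
    out = []
    for d in divisors:
        decimated = series[::d]  # what a camera running at rate R/d would deliver
        out.append(resample(decimated, n))
    return np.stack(out)

# One synthetic full-rate biometric trace (60 samples of a 2 Hz oscillation).
t = np.linspace(0, 1, 60)
full_rate = np.sin(2 * np.pi * 2 * t)
augmented = multi_rate_training_data(full_rate)
```

Each row of `augmented`, paired with the recording's alertness label, would then be fed to the convolutional neural network as one training example.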
 Furthermore, in the learning model used in Embodiment 1, the configuration of the convolutional neural network is not particularly limited. For example, the learning model may have a configuration in which the second pooling layer is removed and, instead, another fully connected layer is added after the concatenate & flatten operation. Various other modifications may be made to the learning model.
[Device Operation]
 Next, the operation of the alertness estimation device 10 according to Embodiment 1 will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the operation of the alertness estimation device 10 according to Embodiment 1 of the present invention. In the following description, FIGS. 1 to 3 are referred to as appropriate. In Embodiment 1, the alertness estimation method is carried out by operating the alertness estimation device 10; the following description of the operation of the device therefore also serves as the description of the alertness estimation method in Embodiment 1.
 As shown in FIG. 4, when image data is output from the imaging device 20, the image data acquisition unit 11 acquires the output image data and holds it (step S1).
 Next, the image data acquisition unit 11 determines whether the number of held image data items has reached a predetermined value (step S2). If, as a result of the determination in step S2, the number of image data items has not reached the predetermined value, the image data acquisition unit 11 executes step S1 again. If, on the other hand, the number of image data items has reached the predetermined value, the image data acquisition unit 11 passes the held image data to the time-series data extraction unit 12.
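The buffering of steps S1 and S2 can be sketched as follows. This is a hypothetical illustration; the class name, method names, and batch size are not from the patent.

```python
# Illustrative sketch of the frame buffering in steps S1-S2 (assumed names).
from collections import deque

class ImageDataBuffer:
    def __init__(self, batch_size):
        self.batch_size = batch_size        # the "predetermined value" of step S2
        self.frames = deque()

    def add(self, frame):
        """Hold one frame; return the full batch once the predetermined count is reached."""
        self.frames.append(frame)           # step S1: acquire and hold
        if len(self.frames) < self.batch_size:
            return None                     # step S2 fails: go back to step S1
        batch = list(self.frames)
        self.frames.clear()
        return batch                        # pass the held frames on to extraction (S3)

buf = ImageDataBuffer(batch_size=3)
results = [buf.add(f) for f in ["f0", "f1", "f2", "f3"]]
# The first two calls return None; the third releases ["f0", "f1", "f2"],
# and the fourth starts filling a new batch.
```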
 Next, upon receiving the image data, the time-series data extraction unit 12 extracts time-series data indicating the user's biological information from the image data acquired in step S1 (step S3). If the image data contains a plurality of users, the time-series data extraction unit 12 can also extract time-series data for each user in step S3.
 Next, the data processing unit 13 interpolates the time-series data extracted in step S3 so that its number of samples equals the set value (step S4).
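Step S4 can be sketched as follows, assuming simple linear interpolation (the patent does not fix the interpolation method); the function name and the sample values are illustrative.

```python
# Illustrative sketch of step S4: resample an extracted time series to a set
# number of samples, independent of the frame rate it was captured at.
import numpy as np

def interpolate_to_set_value(series, set_value):
    """Resample a 1-D time series to exactly `set_value` samples by linear interpolation."""
    src = np.linspace(0.0, 1.0, num=len(series))   # normalized time of input samples
    dst = np.linspace(0.0, 1.0, num=set_value)     # normalized time of output samples
    return np.interp(dst, src, np.asarray(series, dtype=float))

# Eye-openness values captured at a low frame rate (e.g. R/2) ...
low_rate = [1.0, 0.8, 0.2, 0.0, 0.6]
# ... stretched to the sample count the learning model expects.
fixed = interpolate_to_set_value(low_rate, set_value=9)
# len(fixed) == 9; the original samples survive at every other index,
# with linearly interpolated values in between.
```

Because the interpolation runs on the short extracted series rather than on the image frames, the image data itself can stay at a low frame rate, which is the load-reduction point made in the effects section below.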
 Next, the arousal level estimation unit 14 inputs the time-series data interpolated in step S4 into the learning model constructed using the convolutional neural network, and estimates the user's arousal level (step S5).
 Specifically, as shown in FIG. 3, when the time-series data comprises information indicating the degree of eye opening and closing, information indicating the gaze direction, and information indicating the face direction, the arousal level estimation unit 14 performs convolution for each type of information to estimate the arousal level. After step S5 is executed, steps S1 to S5 are executed again, so that the user's arousal level is estimated continuously.
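The per-information convolution can be illustrated with a minimal NumPy sketch. The kernel values, the signal values, and the use of a single kernel per branch are simplifying assumptions; this does not reproduce the patent's actual network, only the idea of one convolution branch per biological signal whose outputs are then combined.

```python
# Illustrative sketch: one 1-D convolution branch per biological signal,
# with the branch outputs concatenated for the later layers of the model.
import numpy as np

def conv1d_valid(x, kernel):
    """Valid-mode 1-D convolution (cross-correlation) of a signal with a kernel."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def multi_branch_features(modalities, kernels):
    """Convolve each modality with its own kernel and concatenate the results."""
    return np.concatenate([conv1d_valid(x, k) for x, k in zip(modalities, kernels)])

eye  = np.array([1.0, 0.9, 0.2, 0.1, 0.8])   # degree of eye opening and closing
gaze = np.array([0.0, 0.1, 0.0, -0.2, 0.1])  # gaze direction
face = np.array([0.0, 0.0, 0.1, 0.1, 0.0])   # face direction
smoothing = np.array([0.5, 0.5])             # illustrative 2-tap kernel per branch
feats = multi_branch_features([eye, gaze, face], [smoothing] * 3)
# Each branch yields 5 - 2 + 1 = 4 values, so feats has 12 entries.
```

Keeping the branches separate lets each signal be filtered on its own timescale before the combined features are used for the arousal estimate.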
 The arousal level estimation apparatus 10 also inputs the estimated arousal level into, for example, the control system of an air conditioner or the operation system of a vehicle. Each of these systems can then perform optimization control based on the user's arousal level.
[Effects of the First Embodiment]
 As described above, in the first embodiment, the number of samples is interpolated not in the image data but in the time-series data extracted from the images, so the frame rate of the image data can be kept low in advance. This reduces the heavy processing performed by the time-series data extraction unit 12 on the image data, and as a result, a person's arousal level can be estimated accurately while the processing load on the apparatus as a whole is reduced. In the first embodiment, the estimation accuracy of the arousal level can be improved further by modeling the convolutional neural network with data interpolated from input time-series data at a plurality of frame rates. Moreover, since a plurality of pieces of biological information can be used as the time-series data, the accuracy of the arousal level can be improved even further.
[Program]
 The program in the first embodiment may be any program that causes a computer to execute steps S1 to S5 shown in FIG. 4. By installing this program on a computer and executing it, the arousal level estimation apparatus 10 and the arousal level estimation method according to the first embodiment can be realized. In this case, the processor of the computer functions as the image data acquisition unit 11, the time-series data extraction unit 12, the data processing unit 13, and the arousal level estimation unit 14, and performs the processing.
 The program in the first embodiment may also be executed by a computer system constructed from a plurality of computers. In this case, for example, each computer may function as any one of the image data acquisition unit 11, the time-series data extraction unit 12, the data processing unit 13, and the arousal level estimation unit 14.
(Second Embodiment)
 Next, an arousal level estimation apparatus according to a second embodiment of the present invention will be described with reference to FIGS. 5 and 6.
[Device configuration]
 First, the configuration of the arousal level estimation apparatus according to the second embodiment will be described with reference to FIG. 5. FIG. 5 is a block diagram showing the configuration of the arousal level estimation apparatus according to the second embodiment of the present invention.
 As shown in FIG. 5, the arousal level estimation apparatus 30 according to the second embodiment includes, in addition to the same configuration as the arousal level estimation apparatus 10 according to the first embodiment shown in FIG. 1, a frame rate adjustment unit 31. The following description focuses on the differences from the first embodiment.
 The frame rate adjustment unit 31 adjusts the frame rate according to the arousal level estimated by the arousal level estimation unit 14. After adjusting the frame rate, the frame rate adjustment unit 31 instructs the imaging device 20, which outputs the image data, to use the adjusted frame rate. The frame rate adjustment unit 31 may instead instruct the image data acquisition unit 11 and the time-series data extraction unit 12, rather than the imaging device 20, to use the adjusted frame rate. The frame rate adjustment unit 31 can also adjust the frame rate according to the biological information indicated by the extracted time-series data.
 Specifically, when the arousal level is steady, the frame rate adjustment unit 31 sets the frame rate low, reducing the processing load on the arousal level estimation apparatus 30. When the arousal level is changing significantly (when the range of change exceeds a predetermined range), the frame rate adjustment unit 31 sets the frame rate high, improving the estimation accuracy of the arousal level.
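This adjustment rule can be sketched as follows; the concrete rates and the threshold standing in for the "predetermined range" are assumptions for illustration only.

```python
# Illustrative sketch of the frame rate adjustment rule (assumed values).
def adjust_frame_rate(recent_arousal, low_rate=5, high_rate=30, allowed_range=0.2):
    """Pick a frame rate from the spread of recent arousal-level estimates."""
    spread = max(recent_arousal) - min(recent_arousal)
    # Within the allowed range: arousal is steady, so a low rate suffices;
    # beyond it: arousal is changing significantly, so raise the rate.
    return high_rate if spread > allowed_range else low_rate

# Steady arousal -> lighten the processing load with the low rate.
stable = adjust_frame_rate([0.70, 0.72, 0.71])    # returns 5
# Rapidly changing arousal -> raise the rate for estimation accuracy.
changing = adjust_frame_rate([0.9, 0.4, 0.7])     # returns 30
```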
[Device operation]
 Next, the operation of the arousal level estimation apparatus 30 according to the second embodiment will be described with reference to FIG. 6. FIG. 6 is a flowchart showing the operation of the arousal level estimation apparatus 30 according to the second embodiment of the present invention. In the following description, FIG. 5 is referred to as appropriate. In the second embodiment, the arousal level estimation method is carried out by operating the arousal level estimation apparatus 30. Accordingly, the description of the arousal level estimation method in the second embodiment is replaced by the following description of the operation of the arousal level estimation apparatus 30.
 As shown in FIG. 6, when image data is output from the imaging device 20, the image data acquisition unit 11 acquires the output image data and holds it (step S11).
 Next, the image data acquisition unit 11 determines whether the number of held image data items has reached a predetermined value (step S12). If, as a result of the determination in step S12, the number of image data items has not reached the predetermined value, the image data acquisition unit 11 executes step S11 again. If, on the other hand, the number of image data items has reached the predetermined value, the image data acquisition unit 11 passes the held image data to the time-series data extraction unit 12.
 Next, upon receiving the image data, the time-series data extraction unit 12 extracts time-series data indicating the user's biological information from the image data acquired in step S11 (step S13). If the image data contains a plurality of users, the time-series data extraction unit 12 can also extract time-series data for each user in step S13.
 Next, the data processing unit 13 interpolates the time-series data extracted in step S13 so that its number of samples equals the set value (step S14).
 Next, the arousal level estimation unit 14 inputs the time-series data interpolated in step S14 into the learning model constructed using the convolutional neural network, and estimates the user's arousal level (step S15).
 The user's arousal level is estimated by executing steps S11 to S15 above. Steps S11 to S15 are the same as steps S1 to S5 shown in FIG. 4.
 Next, after step S15 is executed, the frame rate adjustment unit 31 adjusts the frame rate according to the arousal level estimated in step S15 (step S16). The frame rate adjustment unit 31 then instructs the imaging device 20 to use the frame rate adjusted in step S16 (step S17).
 After step S17 is executed, the imaging device 20 outputs image data at the instructed frame rate. Steps S11 to S17 are then executed again; this time, the time-series data is generated at the instructed frame rate and a new arousal level is estimated. In the second embodiment as well, the user's arousal level is estimated continuously by repeatedly executing steps S11 to S15.
 In the second embodiment as well, the arousal level estimation apparatus 30 inputs the estimated arousal level into, for example, the control system of an air conditioner or the operation system of a vehicle. Each of these systems can then perform optimization control based on the user's arousal level.
[Effects of the Second Embodiment]
 As described above, in the second embodiment, the frame rate of the image data can be adjusted. According to the second embodiment, an appropriate frame rate can be set in accordance with the required accuracy of the arousal level. The second embodiment also provides the same effects as the first embodiment.
[Program]
 The program in the second embodiment may be any program that causes a computer to execute steps S11 to S17 shown in FIG. 6. By installing this program on a computer and executing it, the arousal level estimation apparatus 30 and the arousal level estimation method according to the second embodiment can be realized. In this case, the processor of the computer functions as the image data acquisition unit 11, the time-series data extraction unit 12, the data processing unit 13, the arousal level estimation unit 14, and the frame rate adjustment unit 31, and performs the processing.
 The program in the second embodiment may also be executed by a computer system constructed from a plurality of computers. In this case, for example, each computer may function as any one of the image data acquisition unit 11, the time-series data extraction unit 12, the data processing unit 13, the arousal level estimation unit 14, and the frame rate adjustment unit 31.
(Physical configuration)
 Here, a computer that realizes the arousal level estimation apparatus by executing the programs in the first and second embodiments of the present invention will be described with reference to FIG. 7. FIG. 7 is a block diagram showing an example of a computer that realizes the arousal level estimation apparatus in the first and second embodiments of the present invention.
 As shown in FIG. 7, the computer 110 includes a CPU (Central Processing Unit) 111, a main memory 112, a storage device 113, an input interface 114, a display controller 115, a data reader/writer 116, and a communication interface 117. These units are connected to one another via a bus 121 so that they can exchange data. The computer 110 may include a GPU (Graphics Processing Unit) or an FPGA (Field-Programmable Gate Array) in addition to, or instead of, the CPU 111.
 The CPU 111 loads the program (code) of the present embodiments stored in the storage device 113 into the main memory 112 and executes it in a predetermined order, thereby performing various operations. The main memory 112 is typically a volatile storage device such as a DRAM (Dynamic Random Access Memory). The program of the present embodiments is provided stored on a computer-readable recording medium 120. The program may also be distributed over the Internet, to which the computer is connected via the communication interface 117.
 Specific examples of the storage device 113 include a hard disk drive and a semiconductor storage device such as a flash memory. The input interface 114 mediates data transmission between the CPU 111 and input devices 118 such as a keyboard and a mouse. The display controller 115 is connected to the display device 119 and controls the display on the display device 119.
 The data reader/writer 116 mediates data transmission between the CPU 111 and the recording medium 120; it reads the program from the recording medium 120 and writes the processing results of the computer 110 to the recording medium 120. The communication interface 117 mediates data transmission between the CPU 111 and other computers.
 Specific examples of the recording medium 120 include general-purpose semiconductor storage devices such as CF (Compact Flash (registered trademark)) and SD (Secure Digital), magnetic recording media such as a flexible disk, and optical recording media such as a CD-ROM (Compact Disk Read Only Memory).
 The arousal level estimation apparatus of the present embodiments can also be realized by using hardware corresponding to each unit, rather than a computer on which the program is installed. Furthermore, the arousal level estimation apparatus may be realized partly by a program and partly by hardware.
 Some or all of the embodiments described above can be expressed by the following (Supplementary Note 1) to (Supplementary Note 18), but are not limited to the following description.
(Supplementary Note 1)
 An apparatus for estimating a user's arousal level, comprising:
 an image data acquisition unit that acquires image data including a face image of the user at a set frame rate;
 a time-series data extraction unit that extracts time-series data indicating biological information of the user from the image data acquired at the set frame rate;
 a data processing unit that interpolates the time-series data so that the number of samples of the extracted time-series data becomes a set value; and
 an arousal level estimation unit that inputs the interpolated time-series data into a learning model constructed using a convolutional neural network and estimates the user's arousal level.
(Supplementary Note 2)
 The arousal level estimation apparatus according to Supplementary Note 1, wherein the learning model is constructed by inputting into a convolutional neural network, as learning data, data obtained by interpolating a plurality of time-series data with different frame rates so that the number of samples becomes a set value.
(Supplementary Note 3)
 The arousal level estimation apparatus according to Supplementary Note 1 or 2, wherein the biological information indicated by the time-series data is at least one of information indicating the degree of eye opening and closing, information indicating the gaze direction, information indicating the face direction, information indicating a pulse wave, information indicating blood flow, and information indicating the degree of mouth opening and closing of the user.
(Supplementary Note 4)
 The arousal level estimation apparatus according to Supplementary Note 3, wherein, when the biological information indicated by the time-series data comprises two or more types of information, the learning model has a layer for performing convolution for each type of biological information.
(Supplementary Note 5)
 The arousal level estimation apparatus according to any one of Supplementary Notes 1 to 4, further comprising a frame rate adjustment unit that adjusts the frame rate according to the estimated arousal level.
(Supplementary Note 6)
 The arousal level estimation apparatus according to Supplementary Note 5, wherein the frame rate adjustment unit further adjusts the frame rate according to the biological information indicated by the extracted time-series data.
(Supplementary Note 7)
 A method for estimating a user's arousal level, comprising:
 (a) acquiring image data including a face image of the user at a set frame rate;
 (b) extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate;
 (c) interpolating the time-series data so that the number of samples of the extracted time-series data becomes a set value; and
 (d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network and estimating the user's arousal level.
(Supplementary Note 8)
 The arousal level estimation method according to Supplementary Note 7, wherein the learning model is constructed by inputting into a convolutional neural network, as learning data, data obtained by interpolating a plurality of time-series data with different frame rates so that the number of samples becomes a set value.
(Supplementary Note 9)
 The arousal level estimation method according to Supplementary Note 7 or 8, wherein the biological information indicated by the time-series data is at least one of information indicating the degree of eye opening and closing, information indicating the gaze direction, information indicating the face direction, information indicating a pulse wave, information indicating blood flow, and information indicating the degree of mouth opening and closing of the user.
(Supplementary Note 10)
 The arousal level estimation method according to Supplementary Note 9, wherein, when the biological information indicated by the time-series data comprises two or more types of information, the learning model has a layer for performing convolution for each type of biological information.
(Supplementary Note 11)
 The arousal level estimation method according to any one of Supplementary Notes 7 to 10, further comprising:
 (e) adjusting the frame rate according to the estimated arousal level.
(Supplementary Note 12)
 The arousal level estimation method according to Supplementary Note 11, wherein, in step (e), the frame rate is further adjusted according to the biological information indicated by the extracted time-series data.
(Supplementary Note 13)
 A computer-readable recording medium storing a program for estimating a user's arousal level by a computer, the program including instructions that cause the computer to execute:
 (a) a step of acquiring image data including a face image of the user at a set frame rate;
 (b) a step of extracting time-series data indicating biological information of the user from the image data acquired at the set frame rate;
 (c) a step of interpolating the time-series data so that the number of samples of the extracted time-series data becomes a set value; and
 (d) a step of inputting the interpolated time-series data into a learning model constructed using a convolutional neural network and estimating the user's arousal level.
(Supplementary Note 14)
 The computer-readable recording medium according to Supplementary Note 13, wherein the learning model is constructed by inputting into a convolutional neural network, as learning data, data obtained by interpolating a plurality of time-series data with different frame rates so that the number of samples becomes a set value.
(Supplementary Note 15)
 The computer-readable recording medium according to Supplementary Note 13 or 14, wherein the biological information indicated by the time-series data is at least one of information indicating the degree of eye opening and closing, information indicating the gaze direction, information indicating the face direction, information indicating a pulse wave, information indicating blood flow, and information indicating the degree of mouth opening and closing of the user.
(Supplementary Note 16)
 The computer-readable recording medium according to Supplementary Note 15, wherein, when the biological information indicated by the time-series data comprises two or more types of information, the learning model has a layer for performing convolution for each type of biological information.
(Supplementary Note 17)
 The computer-readable recording medium according to any one of Supplementary Notes 13 to 16, wherein the program further includes instructions that cause the computer to execute:
 (e) a step of adjusting the frame rate according to the estimated arousal level.
(Supplementary Note 18)
 The computer-readable recording medium according to Supplementary Note 17, wherein, in step (e), the frame rate is further adjusted according to the biological information indicated by the extracted time-series data.
 The present invention has been described above with reference to the embodiments, but the present invention is not limited to the embodiments described above. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the present invention.
 As described above, according to the present invention, a person's arousal level can be estimated accurately while the processing load is reduced. The present invention is useful for various systems that require estimation of a person's arousal level, for example, air conditioning systems and operation systems for vehicles such as automobiles.
 10 arousal level estimation apparatus (first embodiment)
 11 image data acquisition unit
 12 time-series data extraction unit
 13 data processing unit
 14 arousal level estimation unit
 20 imaging device
 30 arousal level estimation apparatus (second embodiment)
 31 frame rate adjustment unit
 110 computer
 111 CPU
 112 main memory
 113 storage device
 114 input interface
 115 display controller
 116 data reader/writer
 117 communication interface
 118 input device
 119 display device
 120 recording medium
 121 bus
Claims (18)

  1.  An awakening degree estimation apparatus for estimating an awakening degree of a user, the apparatus comprising:
     an image data acquisition unit that acquires image data including a face image of the user at a set frame rate;
     a time-series data extraction unit that extracts, from the image data acquired at the set frame rate, time-series data indicating biological information of the user;
     a data processing unit that interpolates the time-series data so that the number of samples of the extracted time-series data becomes a set value; and
     an awakening degree estimation unit that inputs the interpolated time-series data into a learning model constructed using a convolutional neural network and estimates the awakening degree of the user.
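Outside the claim language itself, the interpolation performed by the data processing unit can be sketched as follows. This is an illustrative sketch only, not part of the disclosure; the function name and the use of linear interpolation are assumptions (the claim only requires that the number of samples become a set value):

```python
import numpy as np

def interpolate_to_fixed_length(samples, target_length):
    """Resample a 1-D biological time series to a fixed number of
    samples by linear interpolation, so that data captured at any
    frame rate has the length the learning model expects."""
    samples = np.asarray(samples, dtype=float)
    # Map both the original and the target sample positions onto a
    # common normalized [0, 1] time axis.
    src = np.linspace(0.0, 1.0, num=len(samples))
    dst = np.linspace(0.0, 1.0, num=target_length)
    return np.interp(dst, src, samples)
```

With this, a series recorded at any frame rate is stretched or compressed to `target_length` samples before being passed to the model.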
  2.  The awakening degree estimation apparatus according to claim 1, wherein
     the learning model is constructed by inputting, to a convolutional neural network as learning data, data obtained by interpolating a plurality of pieces of time-series data having different frame rates so that the number of samples becomes a set value.
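The training-data construction of claim 2 can be illustrated as follows. This is a hypothetical sketch (the function name and the stacking into a single array are assumptions); the point is that recordings of different lengths, arising from different frame rates, are normalized to one common length before training:

```python
import numpy as np

def build_training_set(recordings, target_length):
    """Interpolate recordings captured at different frame rates (and
    hence of different lengths) to a common length, so that all of
    them can serve as learning data for a single CNN."""
    out = []
    for series in recordings:
        src = np.linspace(0.0, 1.0, num=len(series))
        dst = np.linspace(0.0, 1.0, num=target_length)
        out.append(np.interp(dst, src, np.asarray(series, dtype=float)))
    return np.stack(out)  # shape: (num_recordings, target_length)
```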
  3.  The awakening degree estimation apparatus according to claim 1 or 2, wherein
     the biological information indicated by the time-series data is at least one of information indicating a degree of eye opening and closing, information indicating a gaze direction, information indicating a face orientation, information indicating a pulse wave, information indicating blood flow, and information indicating a degree of mouth opening and closing of the user.
  4.  The awakening degree estimation apparatus according to claim 3, wherein,
     when the biological information indicated by the time-series data comprises two or more pieces of information, the learning model has a layer for performing convolution for each piece of the biological information.
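The per-modality convolution of claim 4 can be pictured with a toy sketch. This is purely illustrative and not the disclosed model: a real implementation would use learned filters in a deep-learning framework, whereas here a single fixed 1-D convolution stands in for each modality's branch, and the function and parameter names are assumptions:

```python
import numpy as np

def multi_branch_features(signals, kernels):
    """Toy illustration of one convolution branch per biological
    signal: each signal (e.g. eye opening, pulse wave) is convolved
    with its own filter before the results are merged into a single
    feature vector."""
    branches = []
    for name, series in signals.items():
        kernel = kernels[name]           # per-modality filter weights
        branches.append(np.convolve(series, kernel, mode="valid"))
    return np.concatenate(branches)      # merged representation
```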
  5.  The awakening degree estimation apparatus according to any one of claims 1 to 4, further comprising
     a frame rate adjustment unit that adjusts the frame rate according to the estimated awakening degree.
  6.  The awakening degree estimation apparatus according to claim 5, wherein
     the frame rate adjustment unit further adjusts the frame rate according to the biological information indicated by the extracted time-series data.
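One conceivable policy for the frame rate adjustment of claims 5 and 6 is sketched below. The claims do not specify any particular rule; the threshold, the specific rates, and the direction of adjustment here are all assumptions chosen only to illustrate trading off processing load against tracking accuracy:

```python
def adjust_frame_rate(current_rate, awakening_degree, threshold=0.5,
                      low_rate=5, high_rate=30):
    """Illustrative policy: when the estimated awakening degree falls
    below a threshold, sample faster so that signs of drowsiness are
    tracked more closely; otherwise reduce the frame rate to save
    processing.  current_rate is accepted for interface symmetry but
    not used by this simple two-level rule."""
    return high_rate if awakening_degree < threshold else low_rate
```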
  7.  A method for estimating an awakening degree of a user, the method comprising:
     (a) acquiring image data including a face image of the user at a set frame rate;
     (b) extracting, from the image data acquired at the set frame rate, time-series data indicating biological information of the user;
     (c) interpolating the time-series data so that the number of samples of the extracted time-series data becomes a set value; and
     (d) inputting the interpolated time-series data into a learning model constructed using a convolutional neural network and estimating the awakening degree of the user.
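Steps (a) through (d) of the method can be strung together in a minimal end-to-end sketch. The callables `capture_frame`, `extract_biosignal`, and `model_predict` are hypothetical stand-ins for the camera, the feature extractor, and the trained CNN; none of these names appear in the disclosure:

```python
import numpy as np

def estimate_awakening(capture_frame, extract_biosignal, model_predict,
                       frame_rate, duration_s, target_length):
    """Sketch of method steps (a)-(d) over one observation window."""
    # (a) acquire image frames at the set frame rate
    frames = [capture_frame() for _ in range(int(frame_rate * duration_s))]
    # (b) extract a biological time series from the frames
    series = np.array([extract_biosignal(f) for f in frames], dtype=float)
    # (c) interpolate to the set number of samples
    src = np.linspace(0.0, 1.0, num=len(series))
    dst = np.linspace(0.0, 1.0, num=target_length)
    series = np.interp(dst, src, series)
    # (d) feed the interpolated series to the learning model
    return model_predict(series)
```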
  8.  The awakening degree estimation method according to claim 7, wherein
     the learning model is constructed by inputting, to a convolutional neural network as learning data, data obtained by interpolating a plurality of pieces of time-series data having different frame rates so that the number of samples becomes a set value.
  9.  The awakening degree estimation method according to claim 7 or 8, wherein
     the biological information indicated by the time-series data is at least one of information indicating a degree of eye opening and closing, information indicating a gaze direction, information indicating a face orientation, information indicating a pulse wave, information indicating blood flow, and information indicating a degree of mouth opening and closing of the user.
  10.  The awakening degree estimation method according to claim 9, wherein,
     when the biological information indicated by the time-series data comprises two or more pieces of information, the learning model has a layer for performing convolution for each piece of the biological information.
  11.  The awakening degree estimation method according to any one of claims 7 to 10, further comprising
     (e) adjusting the frame rate according to the estimated awakening degree.
  12.  The awakening degree estimation method according to claim 11, wherein,
     in the step (e), the frame rate is further adjusted according to the biological information indicated by the extracted time-series data.
  13.  A computer-readable recording medium storing a program for causing a computer to estimate an awakening degree of a user, the program including instructions that cause the computer to execute:
     (a) a step of acquiring image data including a face image of the user at a set frame rate;
     (b) a step of extracting, from the image data acquired at the set frame rate, time-series data indicating biological information of the user;
     (c) a step of interpolating the time-series data so that the number of samples of the extracted time-series data becomes a set value; and
     (d) a step of inputting the interpolated time-series data into a learning model constructed using a convolutional neural network and estimating the awakening degree of the user.
  14.  The computer-readable recording medium according to claim 13, wherein
     the learning model is constructed by inputting, to a convolutional neural network as learning data, data obtained by interpolating a plurality of pieces of time-series data having different frame rates so that the number of samples becomes a set value.
  15.  The computer-readable recording medium according to claim 13 or 14, wherein
     the biological information indicated by the time-series data is at least one of information indicating a degree of eye opening and closing, information indicating a gaze direction, information indicating a face orientation, information indicating a pulse wave, information indicating blood flow, and information indicating a degree of mouth opening and closing of the user.
  16.  The computer-readable recording medium according to claim 15, wherein,
     when the biological information indicated by the time-series data comprises two or more pieces of information, the learning model has a layer for performing convolution for each piece of the biological information.
  17.  The computer-readable recording medium according to any one of claims 13 to 16, wherein
     the program further includes an instruction that causes the computer to execute (e) a step of adjusting the frame rate according to the estimated awakening degree.
  18.  The computer-readable recording medium according to claim 17, wherein,
     in the step (e), the frame rate is further adjusted according to the biological information indicated by the extracted time-series data.
PCT/JP2018/002804 2018-01-29 2018-01-29 Alertness estimation device, alertness estimation method, and computer readable recording medium WO2019146123A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2018/002804 WO2019146123A1 (en) 2018-01-29 2018-01-29 Alertness estimation device, alertness estimation method, and computer readable recording medium
JP2019567823A JP6879388B2 (en) 2018-01-29 2018-01-29 Alertness estimation device, alertness estimation method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2018/002804 WO2019146123A1 (en) 2018-01-29 2018-01-29 Alertness estimation device, alertness estimation method, and computer readable recording medium

Publications (1)

Publication Number Publication Date
WO2019146123A1 true WO2019146123A1 (en) 2019-08-01

Family

ID=67395274

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/002804 WO2019146123A1 (en) 2018-01-29 2018-01-29 Alertness estimation device, alertness estimation method, and computer readable recording medium

Country Status (2)

Country Link
JP (1) JP6879388B2 (en)
WO (1) WO2019146123A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021090402A1 (en) * 2019-11-06 2021-05-14 日本電気株式会社 Stress estimation device, stress estimation method, and recording medium
WO2023135632A1 (en) * 2022-01-11 2023-07-20 日本電気株式会社 Stress estimation device, stress estimation method, and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009005239A (en) * 2007-06-25 2009-01-08 Sony Computer Entertainment Inc Encoder and encoding method
JP2017217472A (en) * 2016-06-02 2017-12-14 オムロン株式会社 State estimation device, state estimation method, and state estimation program

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10902331B2 (en) * 2016-08-19 2021-01-26 Massachusetts Institute Of Technology Systems and methods for providing visual allocation management
JP6952299B2 (en) * 2017-03-28 2021-10-20 国立大学法人大阪大学 Sleep depth judgment system, sleep depth judgment device and sleep depth judgment method



Also Published As

Publication number Publication date
JPWO2019146123A1 (en) 2021-01-14
JP6879388B2 (en) 2021-06-02

Similar Documents

Publication Publication Date Title
JP7014129B2 (en) Estimator generator, monitoring device, estimator generator method and estimator generator
US20200034739A1 (en) Method and device for estimating user's physical condition
US20200037732A1 (en) Beauty counseling information providing device and beauty counseling information providing method
CN111598038B (en) Facial feature point detection method, device, equipment and storage medium
JP2016515242A (en) Method and apparatus for gazing point estimation without calibration
KR20150113700A (en) System and method for diagnosis
WO2019146123A1 (en) Alertness estimation device, alertness estimation method, and computer readable recording medium
KR20200036085A (en) Artificial intelligence device
KR20200080419A (en) Hand gesture recognition method using artificial neural network and device thereof
US20220330848A1 (en) Method, Computer Program, and Device for Determining Vehicle Occupant Respiration
WO2021029881A1 (en) Systems and methods using person recognizability across a network of devices
CN113255648B (en) Sliding window frame selection method and terminal based on image recognition
KR102076759B1 (en) Multi-sensor based noncontact sleep monitoring method and apparatus using ensemble of deep neural network and random forest
CN116916504B (en) Intelligent control method, device and equipment for dimming panel and storage medium
Toivanen An advanced Kalman filter for gaze tracking signal
WO2020209117A1 (en) Stress estimation device, stress estimation method, and computer-readable recording medium
CA3139034A1 (en) System and method for filtering time-varying data for physiological signal prediction
CN109620269B (en) Fatigue detection method, device, equipment and readable storage medium
Boudaoud et al. Corrected integral shape averaging applied to obstructive sleep apnea detection from the electrocardiogram
KR102495889B1 (en) Method for detecting facial wrinkles using deep learning-based wrinkle detection model trained according to semi-automatic labeling and apparatus for the same
JP7211441B2 (en) Arousal level estimation device, arousal level estimation method, and program
EP2793102A2 (en) Information processing device
JP2023089729A (en) Computer system and emotion estimation method
KR20120072291A (en) Method for imputating of missing values using incremental expectation maximization principal component analysis
KR101508842B1 (en) Systme and method for providign evolutionary adaptive eye tracking for low-cost human computer interaction application

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18902266

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019567823

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18902266

Country of ref document: EP

Kind code of ref document: A1