WO2021209426A1 - Method and system for driver assistance - Google Patents

Method and system for driver assistance Download PDF

Info

Publication number
WO2021209426A1
WO2021209426A1 (PCT/EP2021/059512)
Authority
WO
WIPO (PCT)
Prior art keywords
driver
map
time period
computing unit
level
Prior art date
Application number
PCT/EP2021/059512
Other languages
French (fr)
Inventor
Antonyo Musabini
Mousnif CHETITAH
Kevin Nguyen
Original Assignee
Valeo Schalter Und Sensoren Gmbh
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Valeo Schalter Und Sensoren Gmbh
Publication of WO2021209426A1 publication Critical patent/WO2021209426A1/en

Links

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/93Radar or analogous systems specially adapted for specific applications for anti-collision purposes
    • G01S13/931Radar or analogous systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/59Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06V20/597Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/19Sensors therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/193Preprocessing; Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/197Matching; Classification
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/16Anti-collision systems
    • G08G1/165Anti-collision systems for passive traffic, e.g. including static obstacles, trees
    • GPHYSICS
    • G08SIGNALLING
    • G08GTRAFFIC CONTROL SYSTEMS
    • G08G1/00Traffic control systems for road vehicles
    • G08G1/16Anti-collision systems
    • G08G1/166Anti-collision systems for active traffic, e.g. moving vehicles, pedestrians, bikes
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/88Sonar systems specially adapted for specific applications
    • G01S15/93Sonar systems specially adapted for specific applications for anti-collision purposes
    • G01S15/931Sonar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S17/00Systems using the reflection or reradiation of electromagnetic waves other than radio waves, e.g. lidar systems
    • G01S17/88Lidar systems specially adapted for specific applications
    • G01S17/93Lidar systems specially adapted for specific applications for anti-collision purposes
    • G01S17/931Lidar systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S13/00Systems using the reflection or reradiation of radio waves, e.g. radar systems; Analogous systems using reflection or reradiation of waves whose nature or wavelength is irrelevant or unspecified
    • G01S13/88Radar or analogous systems specially adapted for specific applications
    • G01S13/93Radar or analogous systems specially adapted for specific applications for anti-collision purposes
    • G01S13/931Radar or analogous systems specially adapted for specific applications for anti-collision purposes of land vehicles
    • G01S2013/9323Alternative operation using light waves

Definitions

  • the present invention relates to a method for assisting a driver of a vehicle, wherein a gaze tracking system is used to determine a gaze direction of the driver. Furthermore, the invention is related to a respective driver assistance system for a vehicle and a computer program product.
  • distraction can take different forms. For example, if the driver takes the eyes off the road, it can be considered as a visual distraction and if the driver takes the hands off the steering wheel, it can be considered as a manual distraction of the driver. In addition, there may be cognitive distraction, where the driver takes his mind off the driving task.
  • Existing approaches to distraction detection include sensors that recognize the driver taking the hands off the steering wheel or using an eye tracker to recognize the driver taking the eyes off the road.
  • document US 6,496,117 B2 describes a system for monitoring a driver’s attention to driving a vehicle.
  • the gaze and a facial pose of the driver are determined by means of a camera.
  • the system may issue an alarm signal.
  • the improved concept is based on the idea to track the gaze of the driver over a time period and determine the level of cognitive distraction based on a map representing the distribution of an intersection point of the gaze direction with a virtual plane in front of the driver during the time period.
  • a gaze tracking system is used to determine a gaze direction of the driver and a computing unit, in particular of the vehicle, is used to determine an intersection point of the gaze direction with a virtual plane and to track the gaze direction and thereby the intersection point during a predefined time period, in particular, a measuring time period or capturing time period.
  • the computing unit is used to determine a map representing a distribution of the position of the intersection point on the virtual plane during the time period.
  • the computing unit is used to determine a level of cognitive distraction of the driver based on the map and to generate or cause an output signal to assist the driver depending on the determined level of cognitive distraction.
  • the gaze tracking system may for example comprise an infrared sensor or an infrared camera, wherein a potential position of the head or eyes of the driver are in a field of view of the camera or sensor. Infrared radiation emitted by an infrared emitter of the gaze tracking system is partially reflected by the eyes of the driver depending on the gaze direction. In this way, the gaze tracking system can determine the gaze direction.
  • the gaze direction can for example be understood as a straight line defined by a gaze direction vector.
  • the gaze direction is, for example, determined repeatedly at a predefined frequency, for example in the order of several tens of milliseconds, for example around 50 ms, and the intersection point is determined for each repetition.
  • the positions of the intersection point may be stored or buffered during the time period in order to determine the map.
  • the map can be understood as a representation, in particular an image representation, of the intersection point in two dimensions, in particular on the virtual plane.
  • the map may for example have the form of a matrix with entries, wherein values of the entries depend on the duration or likelihood of the gaze direction having the corresponding intersection point with the virtual plane during the time period.
  • the map represents a local distribution of the gaze intensity.
  • the map may for example be visually represented as a two-dimensional heatmap or color shaded map or grey shaded map. In other words, the map may be represented as a two-dimensional image.
  • Causing the output signal may for example include triggering a further unit, for example an actuator, a display or a further computing unit, to generate the output signal.
  • the virtual plane may be understood as a geometric construction of a plane surface of predefined size.
  • the virtual plane may also be denoted as a virtual wall or an imaginary surface.
  • the virtual plane is virtually located at a predefined distance from the gaze tracking system, for example the camera or sensor, or a potential reference position for the driver’s head or eyes.
  • a position and orientation of the virtual plane is fixed to a vehicle coordinate system of the vehicle or a sensor coordinate system of the gaze tracking system.
  • the position and orientation of the virtual plane with respect to the gaze tracking system or the vehicle does not change, even when the gaze direction of the driver changes.
  • the virtual plane may be positioned parallel to a lateral axis of the vehicle coordinate system or the sensor coordinate system and may be positioned with a fixed angle or predefined angle with respect to a normal axis of the vehicle coordinate system or the sensor coordinate system.
  • the angle may for example be approximately zero degrees, such that the virtual plane is oriented perpendicular to a longitudinal axis of the vehicle coordinate system or the sensor coordinate system.
  • the virtual plane may for example be positioned at a predefined distance from an origin of the vehicle coordinate system or the sensor coordinate system in a longitudinal direction.
  • the gaze dynamics are directly analyzed to identify cognitive distraction of the driver, for example in so-called day-dreaming situations, where the gaze of the driver is directed at the road but the gaze dynamics are not appropriate to the current traffic situation or driving situation.
  • the level of distraction or the respective cognitive state of the driver may be determined in a particularly reliable way, for example compared to a mere analysis of the gaze direction. Furthermore, by the image-based representation, an algorithm for analyzing the map to determine the level of cognitive distraction may require less data for training and/or may generate more precise results.
  • the movement of the eyes of the driver can be considered to be amplified by the distance between the eyes of the driver and the virtual plane. Therefore, more precise results may be achieved, for example compared to a mere tracking of the pupils of the driver.
  • the output signal comprises an alert signal to the driver.
  • the alert signal includes a visual signal and/or an acoustic signal and/or a haptic signal.
  • the output signal may be used, for example, by a driver assistance system, which may comprise the computing unit to initiate a risk reduction measure, such as limiting a maximum speed of the vehicle or increasing a degree of driver assistance.
  • the computing unit is used to apply a classification algorithm, in particular a trained classification algorithm, to the map or to data being derived based on the map, in order to determine the level of cognitive distraction.
  • the classification algorithm may, for example, use a binary classification to classify the cognitive distraction into two classes, one corresponding to a distracted driving and one corresponding to a neutral or undistracted driving.
  • more than two classes can be used to allow for a more sensitive reaction.
  • the computing unit is used to extract a predefined set of one or more characteristic features from the map and to apply a classification algorithm, in particular a trained classification algorithm, to the set of the characteristic features.
  • the characteristic features may include, for example, statistical properties of the map or the distribution, such as a standard deviation or a maximum value of the distribution in a given direction on the virtual plane and so forth.
  • the classification algorithm may also comprise the transformation of the predefined set of characteristic features into a suitable form for being classified by the classification algorithm.
  • the classification algorithm comprises a support vector machine or an artificial neural network, for example a convolutional neural network, CNN.
  • the computing unit is used to determine a spatial region on the virtual plane, wherein all entries of the map corresponding to the spatial region have values that are equal to or greater than a predefined threshold value.
  • the computing unit is used to determine the level of distraction based on the spatial region.
  • the characteristic features may for example comprise one or more geometric features of the spatial region and/or one or more features concerning the shape of the spatial region. For example, an area or a size in a predefined direction on the virtual plane may be used as such features.
  • one or more further spatial regions are determined by using the computing unit, wherein all entries of the map corresponding to the respective further spatial region have respective values that are equal to or greater than a predefined respective further threshold value.
  • the computing unit is used to determine the level of cognitive distraction based on the further spatial regions.
  • the further threshold values may be different from the threshold value.
  • the computing unit is used to determine an image moment and/or a Hu moment based on the map.
  • the computing unit is used to determine the level of cognitive distraction based on the image moment and/or based on the Hu moment.
  • the image moment and/or the Hu moment may be considered as features of the predefined set of characteristic features.
  • the computing unit is used to compare the level of cognitive distraction with a predefined maximum distraction value and to generate the output signal as an alert signal, if the level of cognitive distraction is greater than the maximum distraction value.
  • the computing unit is used to track the gaze direction and thereby the intersection point during a predefined further time period, wherein a duration of the further time period is different from a duration of the time period.
  • the computing unit is used to determine a further map representing a further distribution of the position of the intersection point during the further time period.
  • the computing unit is used to determine a further level of cognitive distraction of the driver based on the further map and to generate the output signal depending on the determined further level of cognitive distraction.
  • the duration of the time period or the further time period may be chosen according to a trade-off between reactiveness and reliability. The shorter the duration, the higher is the reactiveness, but the longer the duration, the higher is the reliability.
  • both advantages may be combined and a high reactiveness at a high reliability may be achieved.
  • the further time period and the time period may overlap.
  • the duration of the time period lies in the interval [1 s, 30 s], preferably in the interval [10 s, 20 s].
  • the duration of the further time period lies in the interval [15 s, 120 s], preferably in the interval [30 s, 90 s].
  • the duration of the time period may lie in the interval [10 s, 20 s], while the duration of the further time period lies in the interval [30 s, 90 s].
  • the time period may be approximately 15 s and the further time period may be approximately 60 s.
  • the further time period is at least twice as long as the time period, preferably the further time period is at least four times as long as the time period.
  • an environmental sensor system of the vehicle is used to identify an object present in an environment of the vehicle, in particular during the time period.
  • the computing unit is used to augment the map with information regarding a position of the object with respect to the virtual plane and the computing unit is used to determine the level of cognitive distraction based on the augmented map.
  • the computing unit may insert a symbol representing the object, such as a bounding box or the like.
  • the environmental sensor system may for example comprise a further camera system, a Lidar system, a radar system and/or an ultrasonic sensor system.
  • the environmental sensor system may generate environmental sensor data representing the object and the computing unit augments the map based on the environmental sensor data.
  • the interpretation of the gaze distribution may depend on further objects in the scene, such as other vehicles, pedestrians, traffic signs and so forth. Therein, the interpretation may depend on the number of objects, their respective positions and/or their type.
  • a driver assistance system for a vehicle, in particular an advanced driver assistance system, ADAS
  • the driver assistance system comprises a gaze tracking system, which is configured to determine a gaze direction of a driver of a vehicle.
  • the driver assistance system comprises a computing unit, which is configured to determine an intersection point of the gaze direction with a virtual plane and to track the intersection point during a predefined time period.
  • the computing unit is configured to determine a map representing a distribution of the position of the intersection point during the time period, to determine a level of cognitive distraction of the driver based on the map and to generate or cause an output signal to assist the driver depending on the determined level of cognitive distraction.
  • the driver assistance system may, in some implementations, comprise the environmental sensor system explained above with respect to the method.
  • the gaze tracking system comprises an emitter, which is configured to emit infrared radiation towards an eye of the driver and an infrared sensor configured to detect a portion of the infrared radiation reflected from the eye of the driver and to generate a sensor signal depending on the detected portion of the infrared radiation.
  • the gaze tracking system comprises a control unit, which is configured to determine the gaze direction based on the sensor signal.
  • the control unit may be partially or completely comprised by the computing unit or may be implemented separately.
  • a driver assistance system according to the improved concept may be programmed or configured to perform a method according to the improved concept, or the system performs such a method.
  • a computer program comprising instructions is provided.
  • when the instructions or the computer program are executed by a driver assistance system according to the improved concept, the instructions cause the driver assistance system to carry out a method according to the improved concept.
  • a computer readable storage medium storing a computer program according to the improved concept is provided.
  • Both the computer program and the computer readable storage medium can therefore be considered as respective computer program products comprising the instructions.
  • Fig. 1 shows schematically a vehicle with an exemplary implementation of a driver assistance system according to the improved concept
  • Fig. 2 shows a flow diagram of an exemplary implementations of a method according to the improved concept
  • Fig. 3 shows schematically a virtual plane for use in exemplary implementations according to the improved concept
  • Fig. 4 shows schematically an example of a map for use in exemplary implementations according to the improved concept
  • Fig. 5 shows schematically further examples of maps for use in further exemplary implementations according to the improved concept
  • Fig. 6 shows schematically steps for generating a map for use according to the improved concept
  • Fig. 7 shows schematically distributions for use according to the improved concept
  • Fig. 8 shows schematically further examples of maps for use in further exemplary implementations according to the improved concept
  • Fig. 9 shows further examples of maps for use in further exemplary implementations according to the improved concept.
  • a vehicle 1 is shown schematically, which comprises an exemplary implementation of a driver assistance system 2 according to the improved concept.
  • the driver assistance system 2 comprises a gaze tracking system 3, configured to determine a gaze direction 6 of a driver 5 (see Fig. 3) of the vehicle 1. Furthermore, the driver assistance system 2 comprises a computing unit 4, which may, in particular in combination with the gaze tracking system 3, carry out a method according to the improved concept.
  • the driver assistance system 2 further comprises an environmental sensor system 10, such as a camera system, a Lidar system, a radar system and/or an ultrasonic sensor system.
  • The function of the driver assistance system 2 according to the improved concept will in the following be explained in more detail with respect to exemplary implementations of methods according to the improved concept and, in particular, with respect to Fig. 2 to Fig. 9.
  • Fig. 2 shows a flow diagram of an exemplary implementation of a method according to the improved concept.
  • Step S0 denotes the start of the method.
  • the gaze tracking system 3 determines a gaze direction 6 of the driver 5 (see Fig. 3). Furthermore, it is determined by the computing unit 4 in step S1 whether the gaze direction intersects a virtual plane 8 located in front of the driver 5, as schematically shown in Fig. 3.
  • the gaze direction 6 intersects the virtual plane 8 at an intersection point 7.
  • the method starts over with step S0.
  • the gaze direction 6 intersects the virtual plane 8
  • the coordinates of the intersection point 7 on the virtual plane 8 may for example be stored on a storage element of the computing unit 4.
  • the described steps of detecting the gaze direction 6, checking whether it intersects with the virtual plane 8 and, optionally, storing the position of the respective intersection point 7, are repeated regularly during a predefined time window or time period.
  • in step S2, the computing unit 4 generates a map based on the tracked gaze direction and the correspondingly tracked position of the intersection point 7. It is pointed out that method steps S1 and S2 may also be carried out partially in parallel to each other. In particular, the time period may be a floating time window.
  • the computing unit 4 determines a distribution or a density distribution of the gaze direction or the intersection point 7 during the time interval.
  • the result is a distribution map 9a that can be considered as a heatmap, as depicted in Fig. 4.
  • the map 9a may have different colors or different grey values for different values of densities of intersection points 7 on the virtual plane 8 during the time interval.
  • the computing unit 4 determines a level of cognitive distraction of the driver 5 based on the map 9a.
  • the computing unit 4 may extract a predefined set of characteristic features.
  • the computing unit 4 may apply a trained classification algorithm to the set of characteristic features. As a result of the classification algorithm, the level of cognitive distraction is obtained.
  • the computing unit 4 may apply a classification algorithm directly to the map 9a to determine the level of cognitive distraction.
  • a feature extraction module of the computing unit 4 may compute the set of features and send them to the classification algorithm, which predicts the level of cognitive distraction of the driver.
  • the result may be a set of information including the state of the driver, in particular his or her level of cognitive distraction or, in other words, cognitive load. Potentially, other useful information may be extracted from the map 9a as well, for example the willingness or intention of the driver to change a driving lane.
  • the classification algorithm may determine whether there is a high level of cognitive distraction, as for example indicated by the map 9b shown exemplarily in panel a) or the map 9c as shown in panel b) of Fig. 5, or if the level of cognitive distraction is low, as for example indicated by the map 9d in panel c) of Fig. 5.
  • in step S5, the computing unit 4 may compare the determined level of cognitive distraction with a predefined threshold or maximum distraction. If the level of cognitive distraction is lower than the maximum distraction, the method starts over with step S0. Otherwise, an output signal is generated in step S6 to alert or assist the driver 5 by making him or her aware of the cognitive distraction.
  • the environmental sensor system 10 may generate sensor data representing an environment of the vehicle 1.
  • the sensor data may, for example, indicate that a further vehicle is present in the environment.
  • the computing unit 4 may augment the map by including a representation 11 of the further vehicle in an augmented map 9e, as indicated in panel b) of Fig. 5.
  • the representation of the further vehicle can be used in step S4 to estimate the level of cognitive distraction.
  • the number of further vehicles and their positions may affect the expected dynamics of the gaze direction and can be taken into account in this way.
  • Distraction leads to inattention from a particular task and this causes a high cognitive load. Therefore, the following assumption is made: During neutral driving, the driver has enough cognitive resources to explore the environment and performs normal tasks related to the driving, such as regularly checking the mirrors, other vehicles, road signs etc. Vestibulo-ocular eye movements (fixations), saccades (rapid, ballistic movements) and smooth pursuits (slower tracking movements) should be observed. However, during distracted driving, the driver has fewer cognitive resources left for the driving task. Thus, the gaze behavior has a smaller range. As a result, a variation in the eye movements is expected.
  • the experimental session was composed of two driving laps on the same route.
  • the first lap constitutes the baseline where the driver performed the driving task naturally. He (all the participants were male with an average age of 29.4 years) is told to relax and drive carefully. This lap is called neutral driving, which is important to learn the baseline eye gaze variations of the participants.
  • the second lap was performed immediately after the first one on the same portion of the road. During the second lap, the driver had to perform four secondary tasks, which aimed to cognitively overload the driver. This lap is called distracted driving.
  • An important aspect of the experimental protocol was to recreate the driving conditions (road, weather, traffic jam) as similar as possible between sessions, but also for both laps completed by a single participant. Therefore, a highway road in France was defined as the experimentation route for each participant. The speed limit on this highway was constant (90 km/h) and it took 22 minutes to complete a single lap. Driving was performed during daytime between 10 a.m. and 5 p.m., in order to minimize the variation in weather and traffic conditions.
  • the very aim of the secondary tasks was to increase the mental workload for the driver.
  • N-back tasks are cognitively distracting tasks where the participants have to recall successive instructions. Recalling these successive instructions leads to increasing the mental workload. Each game was designed to last four minutes. After one minute of pause, the following game started.
  • the first game is called “neither yes nor no”. This game is based on abolishing the words “yes”, “no” and their alternatives like “yeah” or “oui”.
  • the accompanist asks successive questions to force the participants to pronounce these words.
  • the second game is called “in my trunk there is”. It consists of citing the words “in my trunk there is” followed by an item’s name. The participant and the accompanist, turn by turn, should recall all the previously cited objects and add a new one to the list.
  • the third game is called “guess who”.
  • the participant thinks about a real or imaginary character and the accompanist tries to find out the identity of the character by asking questions from a mobile application. The participant should answer the questions correctly.
  • the fourth game is called “the 21”.
  • the accompanist states one to three numbers in the numerical order, for example, three numbers 1, 2, 3, and the driver follows the order and states a different number of digits than the accompanist, for example, two numbers 4, 5.
  • the game continues so on.
  • the number 21 is forbidden. Instead of saying 21, a rule should be added to the game, for example, “do not say multiples of four”.
  • the position of the vehicle’s interior parts, such as the interior mirrors and the instrument cluster, are measured and illustrated in a 3D-world representation, as sketched in Fig. 3. These objects are static in a reference frame, where the vehicle is the origin.
  • While driving, the driver is monitored with a rear infrared camera, placed in front of the instrument cluster. This camera extracts the head position, the eye positions and their direction. The head position and eye gaze direction are also illustrated in Fig. 3. Hence, if the driver is looking at one of the objects present in the scene, this can be determined.
  • the circle can be considered to represent the driver’s head position.
  • the driver’s eye gaze vector is represented as a line going from this circle.
  • the small solid rectangles represent the vehicle’s interior parts, such as the left and right mirrors, the central mirror, the instrument cluster’s speed indicator, the RPM-indicator, the central stack and the navigation screen.
  • the virtual plane is also called a virtual wall.
  • the virtual wall is a surface placed in front of the vehicle, as if it was one of the vehicle’s interior parts.
  • the intersection point between this virtual wall and the eye gaze vector is tracked on a given time window.
  • an image-based representation is generated.
  • This representation, which may be denoted as a heatmap, is the collected data, which is used to detect the cognitive load of the driver.
  • the vehicle is also equipped with a frontal RGB camera.
  • the position and the dimensions of the imaginary surface are set to maximize the overlap of this surface with the camera’s field of view and the area where the gaze detection is available.
  • This camera is located in the inner center of the vehicle, whereas the driver is sitting on the front left seat. Thus, the driver's gaze activities seem to be concentrated on the left side of the image on the overlays and heatmaps.
  • the heatmap is a data visualization technique used in different studies and solutions. Heatmaps are often used to highlight the areas of interest of people; therefore, several situations can be explored from them.
  • a heatmap as shown in Fig. 4 may be used for visualization and feature extraction after successive steps for creating the heatmap, as indicated in the panels of Fig. 6.
  • the time stamped raw intersection points in x and y between the eye gaze vectors and the imaginary surface are the heatmap generator's input. This data may be acquired every 50 ms, if the driver is looking at the imaginary plane. If the driver is not looking at the plane, the distraction source is visual and not necessarily cognitive. For example, the driver might be checking a smartphone. Nevertheless, this distraction can also be detected by the present method.
  • the points are buffered as a sliding window, as depicted in panel a) of Fig. 6.
  • a circle of 15 pixels is placed, centered on the intersection points, as indicated in panel b) of Fig. 6.
  • the choice of the circle diameter that represents the gaze fixation is mainly influenced by the dimensions of the heatmap, which is 640 x 400 in the present case. After normalization of the field of view circles, this mask is used to vary the opacity of intersections as indicated in panel c) of Fig. 6.
  • a Gaussian filter is applied on this mask to reduce the noise due to the gaze activity and also to reduce detail, which keeps the focus on the most explored area, as indicated in panel d) of Fig. 6. A schematic sketch of this heatmap construction is given below.
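  • As an illustration only, and not part of the original disclosure, the buffering, circle stamping and Gaussian filtering described above could be sketched as follows; the 640 x 400 heatmap size is taken from the text, while the circle radius, the filter sigma and all function names are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

HEATMAP_W, HEATMAP_H = 640, 400   # heatmap dimensions mentioned in the text

def build_heatmap(points_uv, radius=15, sigma=10.0):
    """Stamp a circle around each buffered intersection point, then smooth the result."""
    mask = np.zeros((HEATMAP_H, HEATMAP_W), dtype=np.float32)
    yy, xx = np.mgrid[0:HEATMAP_H, 0:HEATMAP_W]
    for u, v in points_uv:                       # points buffered over the sliding window
        circle = (xx - u) ** 2 + (yy - v) ** 2 <= radius ** 2
        mask[circle] += 1.0                      # overlapping fixations raise the opacity
    if mask.max() > 0:
        mask /= mask.max()                       # normalize the stamped circles
    return gaussian_filter(mask, sigma=sigma)    # Gaussian filter suppresses gaze noise
```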
  • Feature engineering is applied to the generated heatmaps in order to reduce the data dimension. From each heatmap, the following feature sets, based on their pixel intensities and shape, are extracted.
  • appearance features may be extracted.
  • the pixel intensity variation of a heatmap contains information on the area checked by the driver.
  • the histogram is an efficient tool to visualize those data distributions, especially if the gaze activity is focused on a specific area or if it is exploring a wider area. During distracted driving, more concentration at higher intensities is expected, while during natural driving, as the driver should cover a wider area, more concentration at lower intensities is expected.
  • Hence, a 6-bin histogram of the number of pixels in terms of pixel intensity is generated per heatmap, as shown in panels a) and b) in Fig. 7; an illustrative sketch is given further below.
  • Fig. 7 shows the computed histograms of the heatmap depicted in Fig. 4.
  • pixel intensities are represented on the abscissa and the respective number of pixels is represented on the ordinate.
  • the values from panel a) are summed up to 6 bins.
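  • A minimal sketch of such a 6-bin intensity histogram, assuming the heatmap values are normalized to [0, 1] and using an assumed function name, could look as follows:

```python
import numpy as np

def intensity_histogram(heatmap, bins=6):
    """Count pixels per intensity bin; one feature vector per heatmap."""
    counts, _ = np.histogram(heatmap, bins=bins, range=(0.0, 1.0))
    return counts   # distracted driving is expected to shift mass to the higher bins
```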
  • the generated heatmap is divided into contours according to the difference in pixel intensities: binary large objects (BLOBs), as indicated in the panels of Fig. 8.
  • Panel a) of Fig. 8 depicts the driver's gaze activity heatmap. The central regions in the BLOB represent the most fixated area and the boundary region is the least fixated.
  • Panel b) shows a thresholded heatmap, converted to a grey scale image with distinguished contours of focus.
  • in panels c), d), e) and f), extracted contours from the thresholded heatmap of panel b) are shown. Each contour is defined by the pixel intensities. A binary threshold is performed for each zone; an illustrative sketch of this contour extraction is given below.
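  • A possible sketch of this BLOB/contour extraction with OpenCV is given below; the concrete threshold values and function name are assumptions, not values from the original text:

```python
import cv2
import numpy as np

def extract_blobs(heatmap, thresholds=(0.2, 0.4, 0.6, 0.8)):
    """Apply one binary threshold per intensity zone and collect the resulting contours."""
    grey = (255 * heatmap).astype(np.uint8)      # grey scale image of the heatmap
    blobs = []
    for t in thresholds:
        _, binary = cv2.threshold(grey, int(255 * t), 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        blobs.append(contours)                   # inner contours correspond to the most fixated areas
    return blobs
```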
  • image moments may be extracted.
  • the grey scale BLOBs are constructed from movements containing motion energy and motion history. Therefore, the motion energy and history from each of them are explored.
  • the following equation (1) describes the computed features, which are based on the image moments, i.e., particular weighted averages of the image intensities or a representation of the pixel distribution in the image.
  • i and j are integers that define the moment order.
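  • The equation referenced as equation (1) is not reproduced in this text; presumably it is the standard definition of the raw image moments, sketched here for orientation:

```latex
% Presumed form of equation (1): raw image moment of order (i, j),
% where I(x, y) is the heatmap pixel intensity at position (x, y).
M_{ij} = \sum_{x} \sum_{y} x^{i} \, y^{j} \, I(x, y)
```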
  • Hu moments are seven particular image moments, which are invariant under translation, rotation and scale. They are particular combinations of central moments, denoted h1 to h7. Hu moments are extracted per BLOB.
  • Fig. 9 represents heatmaps during 5, 15 and 30 s in distracted and neutral driving scenarios.
  • Panels a), c), e) correspond to distracted driving scenarios while panels b), d), f) correspond to neutral driving scenarios.
  • Panels a) and b) correspond to 5 s
  • panels c) and d) correspond to 15 s
  • panels e) and f) correspond to 30 s.

Abstract

According to a method for assisting a driver (5) of a vehicle (1), a gaze tracking system (3) is used to determine a gaze direction (6) of the driver (5). A computing unit (4) is used to determine an intersection point (7) of the gaze direction (6) with a virtual plane (8) and to track the intersection point (7) during a predefined time period, to determine a map (9a, 9b, 9c, 9d) representing a distribution of the position of the intersection point (7) during the time period. The computing unit (4) is also used to determine a level of cognitive distraction of the driver (5) based on the map (9a, 9b, 9c, 9d) and to generate or cause an output signal to assist the driver (5) depending on the determined level of cognitive distraction.

Description

Method and system for driver assistance
The present invention relates to a method for assisting a driver of a vehicle, wherein a gaze tracking system is used to determine a gaze direction of the driver. Furthermore, the invention is related to a respective driver assistance system for a vehicle and a computer program product.
Distraction of a driver of a motor vehicle can lead to serious accidents. Therein, distraction can take different forms. For example, if the driver takes the eyes off the road, it can be considered as a visual distraction and if the driver takes the hands off the steering wheel, it can be considered as a manual distraction of the driver. In addition, there may be cognitive distraction, where the driver takes his mind off the driving task.
Existing approaches to distraction detection include sensors that recognize the driver taking the hands off the steering wheel or using an eye tracker to recognize the driver taking the eyes off the road.
For example, document US 6,496,117 B2 describes a system for monitoring a driver’s attention to driving a vehicle. The gaze and a facial pose of the driver are determined by means of a camera. In case the gaze or the facial pose of the driver are not oriented in a forward direction of travel, the system may issue an alarm signal.
However, known approaches are not capable of taking into account cognitive distraction in a reliable manner, since cognitive distraction is not necessarily correlated with the gaze of the driver being on or off the road or with the hands of the driver being on or off the steering wheel.
It is therefore an object of the present invention to provide an improved concept for assisting a driver of a vehicle that reduces a risk for accidents due to cognitive distraction of the driver.
This object is achieved by the subject-matter of the independent claims. Further implementations and preferred embodiments are subject-matter of the dependent claims. The improved concept is based on the idea to track the gaze of the driver over a time period and determine the level of cognitive distraction based on a map representing the distribution of an intersection point of the gaze direction with a virtual plane in front of the driver during the time period.
According to the improved concept, a method for assisting a driver of a vehicle is provided. A gaze tracking system is used to determine a gaze direction of the driver and a computing unit, in particular of the vehicle, is used to determine an intersection point of the gaze direction with a virtual plane and to track the gaze direction and thereby the intersection point during a predefined time period, in particular, a measuring time period or capturing time period. The computing unit is used to determine a map representing a distribution of the position of the intersection point on the virtual plane during the time period. The computing unit is used to determine a level of cognitive distraction of the driver based on the map and to generate or cause an output signal to assist the driver depending on the determined level of cognitive distraction.
The gaze tracking system may for example comprise an infrared sensor or an infrared camera, wherein a potential position of the head or eyes of the driver are in a field of view of the camera or sensor. Infrared radiation emitted by an infrared emitter of the gaze tracking system is partially reflected by the eyes of the driver depending on the gaze direction. In this way, the gaze tracking system can determine the gaze direction.
However, also alternative approaches to tracking or determining the gaze direction may be equally applicable.
The gaze direction can for example be understood as a straight line defined by a gaze direction vector.
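As an illustrative sketch only, and not as part of the claimed method, such a gaze ray can be intersected with a fixed virtual plane by a standard ray-plane intersection; the coordinate conventions, the function name and the example values below are assumptions.

```python
import numpy as np

def gaze_plane_intersection(eye_pos, gaze_dir, plane_point, plane_normal):
    """Return the intersection of the gaze ray with the virtual plane, or None.

    eye_pos, gaze_dir: 3D origin and direction of the gaze ray (vehicle coordinates).
    plane_point, plane_normal: any point on the virtual plane and its normal vector.
    """
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    denom = np.dot(plane_normal, gaze_dir)
    if abs(denom) < 1e-9:          # gaze parallel to the plane: no intersection
        return None
    t = np.dot(plane_normal, plane_point - eye_pos) / denom
    if t <= 0:                     # plane lies behind the driver
        return None
    return eye_pos + t * gaze_dir  # 3D intersection point on the plane

# Example: virtual plane 2 m ahead of the sensor origin, perpendicular to the longitudinal axis
point = gaze_plane_intersection(
    eye_pos=np.array([0.0, 0.0, 1.2]),
    gaze_dir=np.array([1.0, 0.1, -0.05]),
    plane_point=np.array([2.0, 0.0, 0.0]),
    plane_normal=np.array([1.0, 0.0, 0.0]),
)
```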
In order to track the intersection point during the time period, the gaze direction is, for example, determined repeatedly at a predefined frequency, for example in the order of several tens of milliseconds, for example around 50 ms, and the intersection point is determined for each repetition. The positions of the intersection point may be stored or buffered during the time period in order to determine the map. The map can be understood as a representation, in particular an image representation, of the intersection point in two dimensions, in particular on the virtual plane. The map may for example have the form of a matrix with entries, wherein values of the entries depend on the duration or likelihood of the gaze direction having the corresponding intersection point with the virtual plane during the time period. In other words, the map represents a local distribution of the gaze intensity. The map may for example be visually represented as a two-dimensional heatmap or color shaded map or grey shaded map. In other words, the map may be represented as a two-dimensional image.
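A minimal sketch of how such a map could be accumulated from the tracked intersection points is shown below; the 640 x 400 resolution follows the heatmap dimensions mentioned later in the text, while everything else (names, scaling) is an assumption.

```python
import numpy as np

MAP_W, MAP_H = 640, 400                       # map resolution, taken from the example heatmaps

def accumulate_map(points_uv):
    """Build a matrix whose entries reflect how long the gaze rested on each map cell.

    points_uv: iterable of (u, v) intersection positions on the virtual plane,
               already scaled to map coordinates, sampled e.g. every 50 ms.
    """
    gaze_map = np.zeros((MAP_H, MAP_W), dtype=np.float32)
    for u, v in points_uv:
        u, v = int(u), int(v)
        if 0 <= v < MAP_H and 0 <= u < MAP_W:
            gaze_map[v, u] += 1.0             # each sample adds one dwell unit
    if gaze_map.max() > 0:
        gaze_map /= gaze_map.max()            # normalize for visualization as a heatmap
    return gaze_map
```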
Causing the output signal may for example include triggering a further unit, for example an actuator, a display or a further computing unit, to generate the output signal.
The virtual plane may be understood as a geometric construction of a plane surface of predefined size. The virtual plane may also be denoted as a virtual wall or an imaginary surface.
In particular, the virtual plane is virtually located at a predefined distance from the gaze tracking system, for example the camera or sensor, or a potential reference position for the driver’s head or eyes. According to several implementations, a position and orientation of the virtual plane is fixed to a vehicle coordinate system of the vehicle or a sensor coordinate system of the gaze tracking system. In particular, the position and orientation of the virtual plane with respect to the gaze tracking system or the vehicle does not change, even when the gaze direction of the driver changes.
For example, the virtual plane may be positioned parallel to a lateral axis of the vehicle coordinate system or the sensor coordinate system and may be positioned with a fixed angle or predefined angle with respect to a normal axis of the vehicle coordinate system or the sensor coordinate system. The angle may for example be approximately zero degrees, such that the virtual plane is oriented perpendicular to a longitudinal axis of the vehicle coordinate system or the sensor coordinate system. The virtual plane may for example be positioned at a predefined distance from an origin of the vehicle coordinate system or the sensor coordinate system in a longitudinal direction. By means of a method according to the improved concept, the gaze dynamics are directly analyzed to identify cognitive distraction of the driver, for example in so-called day-dreaming situations, where the gaze of the driver is directed at the road but the gaze dynamics are not appropriate to the current traffic situation or driving situation.
By taking into account the cognitive distraction level, risky situations and accidents may be prevented.
Since an image-based representation of the gaze dynamics is used in form of the map, the level of distraction or the respective cognitive state of the driver may be determined in a particularly reliable way, for example compared to a mere analysis of the gaze direction. Furthermore, by the image-based representation, an algorithm for analyzing the map to determine the level of cognitive distraction may require less data for training and/or may generate more precise results.
By analyzing the intersection point of the gaze direction with the virtual plane, the movement of the eyes of the driver can be considered to be amplified by the distance between the eyes of the driver and the virtual plane. Therefore, more precise results may be achieved, for example compared to a mere tracking of the pupils of the driver.
According to several implementations of the method according to the improved concept, the output signal comprises an alert signal to the driver.
According to several implementations, the alert signal includes a visual signal and/or an acoustic signal and/or a haptic signal. According to several implementations, the output signal may be used, for example, by a driver assistance system, which may comprise the computing unit to initiate a risk reduction measure, such as limiting a maximum speed of the vehicle or increasing a degree of driver assistance. According to several implementations, the computing unit is used to apply a classification algorithm, in particular a trained classification algorithm, to the map or to data being derived based on the map, in order to determine the level of cognitive distraction.
The classification algorithm may, for example, use a binary classification to classify the cognitive distraction into two classes, one corresponding to a distracted driving and one corresponding to a neutral or undistracted driving. However, also more than two classes can be used to allow for a more sensitive reaction.
According to several implementations, the computing unit is used to extract a predefined set of one or more characteristic features from the map and to apply a classification algorithm, in particular a trained classification algorithm, to the set of the characteristic features.
The characteristic features may include, for example, statistical properties of the map or the distribution, such as a standard deviation or a maximum value of the distribution in a given direction on the virtual plane and so forth.
The classification algorithm may also comprise the transformation of the predefined set of characteristic features into a suitable form for being classified by the classification algorithm.
According to several implementations, the classification algorithm comprises a support vector machine or an artificial neural network, for example a convolutional neural network, CNN. These algorithms have proven to represent powerful tools for image-based classification.
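As a non-authoritative sketch, a support vector machine operating on characteristic features extracted from the maps could be set up as follows; the feature dimension, label convention and training data are placeholders, not values from the application.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: one row of characteristic features per map (e.g. histogram bins, Hu moments),
# y: labels collected offline, e.g. 0 = neutral driving, 1 = distracted driving.
X = np.random.rand(200, 13)            # placeholder training data for illustration only
y = np.random.randint(0, 2, size=200)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
clf.fit(X, y)

features_of_new_map = np.random.rand(1, 13)
level_of_distraction = clf.predict_proba(features_of_new_map)[0, 1]  # value in [0, 1]
```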
According to several implementations, the computing unit is used to determine a spatial region on the virtual plane, wherein all entries of the map corresponding to the spatial region have values that are equal to or greater than a predefined threshold value. The computing unit is used to determine the level of distraction based on the spatial region.
The characteristic features may for example comprise one or more geometric features of the spatial region and/or one or more features concerning the shape of the spatial region. For example, an area or a size in a predefined direction on the virtual plane may be used as such features.
According to several implementations, one or more further spatial regions are determined by using the computing unit, wherein all entries of the map corresponding to the respective further spatial region have respective values that are equal to or greater than a predefined respective further threshold value. The computing unit is used to determine the level of cognitive distraction based on the further spatial regions. The further threshold values may be different from the threshold value.
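The thresholded spatial region and some simple geometric features of it could, for example, be derived as sketched below; the threshold value and the feature names are assumptions.

```python
import numpy as np

def spatial_region_features(gaze_map, threshold=0.5):
    """Extract simple geometric features of the region where the map exceeds a threshold."""
    region = gaze_map >= threshold                  # boolean mask of the spatial region
    if not region.any():
        return {"area": 0, "width": 0, "height": 0}
    rows, cols = np.nonzero(region)
    return {
        "area": int(region.sum()),                  # number of map entries above the threshold
        "width": int(cols.max() - cols.min() + 1),  # extent along the lateral direction
        "height": int(rows.max() - rows.min() + 1), # extent along the vertical direction
    }
```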
According to several implementations, the computing unit is used to determine an image moment and/or a Hu moment based on the map. The computing unit is used to determine the level of cognitive distraction based on the image moment and/or based on the Hu moment.
The image moment and/or the Hu moment may be considered as features of the predefined set of characteristic features.
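A possible way to obtain image moments and the seven Hu moments from a map, using OpenCV, is sketched below; the log-scaling of the Hu moments is a common practical choice, not something prescribed by the application.

```python
import cv2
import numpy as np

def hu_moment_features(gaze_map):
    """Compute the seven Hu moments of a map as classification features."""
    img = (255 * gaze_map / max(gaze_map.max(), 1e-9)).astype(np.uint8)
    moments = cv2.moments(img)             # raw, central and normalized image moments
    hu = cv2.HuMoments(moments).flatten()  # seven translation/rotation/scale-invariant moments
    # log-scaling brings the Hu moments to comparable orders of magnitude
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)
```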
According to several implementations, the computing unit is used to compare the level of cognitive distraction with a predefined maximum distraction value and to generate the output signal as an alert signal, if the level of cognitive distraction is greater than the maximum distraction value.
According to several implementations, the computing unit is used to track the gaze direction and thereby the intersection point during a predefined further time period, wherein a duration of the further time period is different from a duration of the time period. The computing unit is used to determine a further map representing a further distribution of the position of the intersection point during the further time period. The computing unit is used to determine a further level of cognitive distraction of the driver based on the further map and to generate the output signal depending on the determined further level of cognitive distraction. In general, the duration of the time period or the further time period may be chosen according to a trade-off between reactiveness and reliability. The shorter the duration, the higher is the reactiveness, but the longer the duration, the higher is the reliability.
By tracking the gaze direction and the intersection point during different time periods of different durations, and the respective parallel generation of the maps and the further maps, both advantages may be combined and a high reactiveness at a high reliability may be achieved.
In particular, the further time period and the time period may overlap.
According to several implementations, the duration of the time period lies in the interval [1 s, 30 s], preferably in the interval [10 s, 20 s]. According to several implementations, the duration of the further time period lies in the interval [15 s, 120 s], preferably in the interval [30 s, 90 s]. In particular, the duration of the time period may lie in the interval [10 s, 20 s], while the duration of the further time period lies in the interval [30 s, 90 s]. For example, the time period may be approximately 15 s and the further time period may be approximately 60 s.
According to several implementations, the further time period is at least twice as long as the time period, preferably the further time period is at least four times as long as the time period.
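A minimal sketch of running the two differently sized time windows in parallel is given below, assuming a gaze sample roughly every 50 ms and the exemplary durations of 15 s and 60 s mentioned above; the class and method names are assumptions.

```python
from collections import deque

SAMPLE_PERIOD_S = 0.05   # one gaze sample roughly every 50 ms

class DualWindowTracker:
    """Keep a short (reactive) and a long (reliable) sliding window of intersection points."""

    def __init__(self, short_s=15.0, long_s=60.0):
        self.short = deque(maxlen=int(short_s / SAMPLE_PERIOD_S))
        self.long = deque(maxlen=int(long_s / SAMPLE_PERIOD_S))

    def add(self, point_uv):
        self.short.append(point_uv)
        self.long.append(point_uv)

    def windows(self):
        # each window can be turned into its own map and classified separately
        return list(self.short), list(self.long)
```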
According to several implementations, an environmental sensor system of the vehicle is used to identify an object present in an environment of the vehicle, in particular during the time period. The computing unit is used to augment the map with information regarding a position of the object with respect to the virtual plane and the computing unit is used to determine the level of cognitive distraction based on the augmented map.
For example, the computing unit may insert a symbol representing the object, such as a bounding box or the like.
The environmental sensor system may for example comprise a further camera system, a Lidar system, a radar system and/or an ultrasonic sensor system. In particular, the environmental sensor system may generate environmental sensor data representing the object and the computing unit augments the map based on the environmental sensor data.
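One conceivable way to augment the map with a detected object, here simply drawn as a bounding box, is sketched below; the projection of the sensor detections onto the virtual plane is assumed to be available, and the function name is an assumption.

```python
import cv2
import numpy as np

def augment_map(gaze_map, object_boxes):
    """Overlay detected objects (e.g. other vehicles) as bounding boxes on the map.

    object_boxes: list of (u_min, v_min, u_max, v_max) boxes, already projected
    from the environmental sensor data onto the virtual plane.
    """
    augmented = gaze_map.copy()
    for u0, v0, u1, v1 in object_boxes:
        cv2.rectangle(augmented, (int(u0), int(v0)), (int(u1), int(v1)),
                      color=1.0, thickness=2)   # box drawn into the map itself
    return augmented
```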
The interpretation of the gaze distribution may depend on further objects in the scene, such as other vehicles, pedestrians, traffic signs and so forth. Therein, the interpretation may depend on the number of objects, their respective positions and/or their type.
Consequently, such implementations allow to take into account such effects when determining the level of cognitive distraction. In particular, the distribution of the gaze direction may be different in case there are further objects in the scene compared to an empty road in front of the driver. In consequence, a higher reliability of the method and an improved level of safety may be achieved. According to the improved concept, also a driver assistance system for a vehicle, in particular an advanced driver assistance system, ADAS, is provided. The driver assistance system comprises a gaze tracking system, which is configured to determine a gaze direction of a driver of a vehicle. The driver assistance system comprises a computing unit, which is configured to determine an intersection point of the gaze direction with a virtual plane and to track the intersection point during a predefined time period. The computing unit is configured to determine a map representing a distribution of the position of the intersection point during the time period, to determine a level of cognitive distraction of the driver based on the map and to generate or cause an output signal to assist the driver depending on the determined level of cognitive distraction.
The driver assistance system may, in some implementations, comprise the environmental sensor system explained above with respect to the method.
According to several implementations of the driver assistance system, the gaze tracking system comprises an emitter, which is configured to emit infrared radiation towards an eye of the driver and an infrared sensor configured to detect a portion of the infrared radiation reflected from the eye of the driver and to generate a sensor signal depending on the detected portion of the infrared radiation. The gaze tracking system comprises a control unit, which is configured to determine the gaze direction based on the sensor signal.
The control unit may be partially or completely comprised by the computing unit or may be implemented separately.
Further implementations of the driver assistance system according to the improved concept follow from the various implementations of the method according to the improved concept and vice versa. In particular, a driver assistance system according to the improved concept may be programmed to or configured to perform a method according to the improved concept or the system performs such a method.
According to the improved concept, also a computer program comprising instructions is provided. When the instructions or the computer program are executed by a driver assistance system according to the improved concept, the instructions cause the driver assistance system to carry out a method according to the improved concept. According to the improved concept, also a computer readable storage medium storing a computer program according to the improved concept is provided.
Both the computer program and the computer readable storage medium can therefore be considered as respective computer program products comprising the instructions.
Further features of the invention are apparent from the claims, the figures and the description of figures. The features and feature combinations mentioned above in the description as well as the features and feature combinations mentioned below in the description of figures and/or shown in the figures alone are usable not only in the respectively specified combination, but also in other combinations without departing from the scope of the invention. Thus, implementations are also to be considered as encompassed and disclosed by the invention, which are not explicitly shown in the figures and explained, but arise from and can be generated by separated feature combinations from the explained implementations. Implementations and feature combinations are also to be considered as disclosed, which do not have all of the features of an originally formulated independent claim. Moreover, implementations and feature combinations are to be considered as disclosed, in particular by the implementations set out above, which extend beyond or deviate from the feature combinations set out in the relations of the claims.
In the figures:
Fig. 1 shows schematically a vehicle with an exemplary implementation of a driver assistance system according to the improved concept,
Fig. 2 shows a flow diagram of an exemplary implementation of a method according to the improved concept,
Fig. 3 shows schematically a virtual plane for use in exemplary implementations according to the improved concept,
Fig. 4 shows schematically an example of a map for use in exemplary implementations according to the improved concept,
Fig. 5 shows schematically further examples of maps for use in further exemplary implementations according to the improved concept,
Fig. 6 shows schematically steps for generating a map for use according to the improved concept,
Fig. 7 shows schematically distributions for use according to the improved concept,
Fig. 8 shows schematically further examples of maps for use in further exemplary implementations according to the improved concept, and
Fig. 9 shows further examples of maps for use in further exemplary implementations according to the improved concept.
In Fig. 1, a vehicle 1 is shown schematically, which comprises an exemplary implementation of a driver assistance system 2 according to the improved concept.
The driver assistance system 2 comprises a gaze tracking system 3, configured to determine a gaze direction 6 of a driver 5 (see Fig. 3) of the vehicle 1. Furthermore, the driver assistance system 2 comprises a computing unit 4, which may, in particular in combination with the gaze tracking system 3, carry out a method according to the improved concept. Optionally, the driver assistance system 2 further comprises an environmental sensor system 10, such as a camera system, a Lidar system, a radar system and/or an ultrasonic sensor system.
The function of the driver assistance system 2 according to the improved concept will in the following be explained in more detail with respect to exemplary implementations of methods according to the improved concept and, in particular, with respect to Fig. 2 to Fig. 9.
Fig. 2 shows a flow diagram of an exemplary implementation of a method according to the improved concept. Step S0 denotes the start of the method. In step S1, the gaze tracking system 3 determines a gaze direction 6 of the driver 5 (see Fig. 3). Furthermore, it is determined by the computing unit 4 in step S1 whether the gaze direction intersects a virtual plane 8 located in front of the driver 5, as schematically shown in Fig. 3.
In the example of Fig. 3, the gaze direction 6 intersects the virtual plane 8 at an intersection point 7. In case the gaze direction 6 does not intersect the predefined virtual plane 8, the method starts over with step S0. If, on the other hand, the gaze direction 6 intersects the virtual plane 8, the coordinates of the intersection point 7 on the virtual plane 8 may for example be stored on a storage element of the computing unit 4. The described steps of detecting the gaze direction 6, checking whether it intersects with the virtual plane 8 and, optionally, storing the position of the respective intersection point 7, are repeated regularly during a predefined time window or time period.
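As an illustration only (not part of the disclosure), this intersection test can be sketched as a standard ray-plane computation. The function name, the plane representation (a point on the plane plus its normal) and the example coordinates below are assumptions chosen for the sketch.

```python
import numpy as np

def gaze_plane_intersection(eye_pos, gaze_dir, plane_point, plane_normal):
    """Return the intersection point of the gaze ray with the virtual plane,
    or None if the gaze does not point towards the plane."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    denom = np.dot(plane_normal, gaze_dir)
    if abs(denom) < 1e-6:          # gaze parallel to the plane
        return None
    t = np.dot(plane_normal, plane_point - eye_pos) / denom
    if t <= 0:                     # plane lies behind the driver
        return None
    return eye_pos + t * gaze_dir  # 3D point on the virtual plane

# Hypothetical usage: plane 10 m in front of the driver, facing back towards them
point = gaze_plane_intersection(
    eye_pos=np.array([0.0, 0.0, 1.2]),
    gaze_dir=np.array([0.05, 1.0, -0.02]),
    plane_point=np.array([0.0, 10.0, 1.2]),
    plane_normal=np.array([0.0, -1.0, 0.0]),
)
```

In such a sketch, a result of None would correspond to the case in which the method starts over with step S0.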
In step S2, the computing unit 4 generates a map based on the tracked gaze direction and the correspondingly tracked position of the intersection point 7. It is pointed out that method steps S1 and S2 may also be carried out partially in parallel to each other. In particular, the time period may be a floating time window.
In order to determine the map, the computing unit 4 determines a distribution or a density distribution of the gaze direction or the intersection point 7 during the time interval. The result is a distribution map 9a that can be considered as a heatmap, as depicted in Fig. 4.
In particular, the map 9a may have different colors or different grey values for different values of densities of intersection points 7 on the virtual plane 8 during the time interval. In steps S3 and S4, the computing unit 4 determines a level of cognitive distraction of the driver 5 based on the map 9a.
For example, during S3 the computing unit 4 may extract a predefined set of characteristic features. In step S4, the computing unit 4 may apply a trained classification algorithm to the set of characteristic features. As a result of the classification algorithm, the level of cognitive distraction is obtained.
Alternatively, the computing unit 4 may apply a classification algorithm directly to the map 9a to determine the level of cognitive distraction.
For extracting the characteristic features, a feature extraction module of the computing unit 4, in particular a software module, may compute the set of features and send them to the classification algorithm, which predicts the level of cognitive distraction of the driver. The result may be a set of information including the state of the driver, in particular his or her level of cognitive distraction or, in other words, cognitive load. Potentially, other useful information may be extracted from the map 9a as well, for example the willingness or intention of the driver to change a driving lane. By comparing statistical or geometrical characteristics of the map 9a, like area and standard deviation of geometric features on the map 9a, the classification algorithm may determine whether there is a high level of cognitive distraction, as for example indicated by the map 9b shown in panel a) or the map 9c shown in panel b) of Fig. 5, or whether the level of cognitive distraction is low, as for example indicated by the map 9d in panel c) of Fig. 5.
In step S5, the computing unit 4 may compare the determined level of cognitive distraction with a predefined threshold or maximum distraction. If the level of cognitive distraction is lower than the maximum distraction, the method starts over with S0. Otherwise, an output signal is generated in step S6 to alert or assist the driver 5 by making him or her aware of the cognitive distraction.
Optionally, the environmental sensor system 10 may generate sensor data representing an environment of the vehicle 1. The sensor data may, for example, indicate that a further vehicle is present in the environment. Then, the computing unit 4 may augment the map by including a representation 11 of the further vehicle in an augmented map 9e, as indicated in panel b) of Fig. 5.
Knowing about the presence of the further vehicle makes the classification algorithm expect a wider heatmap, which should cover, for example, at least a part of the further vehicle. Therefore, the representation of the further vehicle can be used in step S4 to estimate the level of cognitive distraction.
The number of further vehicles and their positions may also affect the expected dynamics of the gaze direction and can be taken into account in this way.
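One conceivable way to encode such objects in the map, sketched here purely as an assumption and not taken from the description, is to rasterize the projected bounding box of a detected vehicle onto the same grid as the heatmap, for example as an additional channel of the augmented map:

```python
import numpy as np

def augment_map(heatmap, vehicle_bbox):
    """Stack the gaze heatmap with a mask marking a detected further vehicle.

    heatmap:       2D float array (e.g. 400 x 640) of gaze densities
    vehicle_bbox:  (x_min, y_min, x_max, y_max) of the vehicle projected
                   onto the virtual plane, in heatmap pixel coordinates
    """
    mask = np.zeros_like(heatmap)
    x0, y0, x1, y1 = vehicle_bbox
    mask[y0:y1, x0:x1] = 1.0           # representation of the further vehicle
    return np.stack([heatmap, mask])   # 2-channel augmented map
```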
In the following, a detailed explanation of an experimental setup to verify the advantages of the improved concept is given with respect to Fig. 6 to Fig. 9. The explanations can be understood as representing the basis of the improved concept as elaborated by the inventors. Consequently, the following explanations may refer to rather specific and non-limiting implementations. However, individual features or combinations of features described in the following may also be combined with various implementations of the method and/or the driver assistance system according to the improved concept.

Specific eye-related measurements like blinks, saccades, pupils and fixations are relevant for assessing cognitive load. An observer's visual scanning behavior tends to narrow during periods of increased cognitive demand, which is in parallel to the fact that mental tasks produce spatial gaze concentration and visual detection impairment. Based on this knowledge, instead of detecting and analyzing all eye-related movements individually, an approach which sums up all the gaze activities is described. Hence, the driver's eye gaze vector is projected on an imaginary distant surface, also denoted as virtual plane. By following the temporal variation of this projection, an image-based representation, also denoted as map, is created. These shapes reveal the cognitive load of the driver. In particular, a link between short-term memory and distraction while driving is explored.

Cognitive load, inattention and distraction are three different concepts. Cognitive load refers to the percentage of used resources in working memory. Inattention is the fact that the driver is losing attention from the driving task to other secondary tasks, and distraction refers to the involvement of the driver in other tasks. Distraction leads to inattention from a particular task and this causes a high cognitive load. Therefore, the following assumption is made: During neutral driving, the driver has enough cognitive resources to explore the environment and performs normal tasks related to the driving, such as regularly checking the mirrors, other vehicles, road signs etc. Vestibulo-ocular eye movements (fixations), saccades (rapid, ballistic movements) and smooth pursuits (slower tracking movements) should be observed. However, during distracted driving, the driver has less cognitive resources left for the driving task. Thus, the gaze behavior has a smaller range. As a result, a variation in the eye movements is expected.
The experimental protocol will be explained in the following.
The experimental session was composed of two driving laps on the same route. The first lap constitutes the baseline, where the driver performed the driving task naturally. He (all the participants were male, with an average age of 29.4 years) was told to relax and drive carefully. This lap is called neutral driving and is important to learn the baseline eye gaze variations of the participants. The second lap was performed immediately after the first one on the same portion of the road. During the second lap, the driver had to perform four secondary tasks, which aimed to cognitively overload the driver. This lap is called distracted driving. An important aspect of the experimental protocol was to recreate the driving conditions (road, weather, traffic) as similarly as possible between sessions, but also for both laps completed by a single participant. Therefore, a highway road in France was defined as the experimentation route for each participant. The speed limit on this highway was constant (90 km/h) and it took 22 minutes to complete a single lap. Driving was performed during daytime between 10 a.m. and 5 p.m., in order to minimize the variation in weather and traffic conditions.
An expert was in charge of the experimental protocol, launching the secondary tasks, annotating events and guiding the driver on the driving path. This expert is called the accompanist in the following.
Five drivers participated in the data collection protocol. All of them were volunteers working in the automotive industry. However, they were not aware of the purpose of the driving session.
The very aim of the secondary tasks was to increase the mental workload for the driver.
In the literature, distinct secondary tasks are cited, such as foot tapping (secondary task) while learning (primary task) and measuring the rhythmic position, or measuring detection response tasks while driving. In a driving simulator-based experiment, drivers had to accomplish visual manual, auditory, verbal and haptic secondary tasks. Results of eye glance analysis showed that the visual detection response tasks, DRTs, were more efficient than the other ones. However, in the present study, in order to keep the eye gaze patterns as neutral as possible, the visual secondary task has been discarded. Attention was paid to having immersive and fun secondary tasks in order to approach a more naturalistic experimentation. The following four games were designed, all based on the N-back task strategy. N-back tasks are cognitively distracting tasks where the participants have to recall successive instructions. Recalling these successive instructions leads to an increased mental workload. Each game was designed to last four minutes. After one minute of pause, the following game started.
The first game is called “neither yes nor no”. This game is based on abolishing the words “yes”, “no” and their alternatives like “yeah” or "oui". The accompanist asks successive questions to force the participant to pronounce these words. The second game is called “in my trunk there is”. It consists of citing the words “in my trunk there is” followed by an item’s name. The participant and the accompanist, turn by turn, should recall all the past objects and add a new one to the list.
The third game is called “guess who”. The participant thinks about a real or imaginary character and the accompanist tries to find out the identity of the character by asking questions from a mobile application. The participant should answer the questions correctly.
The fourth game is called “the 21”. The accompanist states one to three numbers in numerical order, for example, three numbers 1, 2, 3, and the driver follows the order and states a different number of digits than the accompanist, for example, two numbers 4, 5. The game continues in this way. The number 21 is forbidden. Instead of saying 21, a rule should be added to the game, for example, “do not say multiples of four”.

For data acquisition, the positions of the vehicle’s interior parts, such as interior mirrors and instrument cluster, are measured and illustrated in a 3D-world representation, as sketched in Fig. 3. These objects are static in a reference frame whose origin is the vehicle. While driving, the driver is monitored with a rear infrared camera, placed in front of the instrument cluster. This camera extracts the head position, the eye positions and their direction. The head position and eye gaze direction are also illustrated in Fig. 3. Hence, if the driver is looking at one of the objects present in the scene, this can be determined.
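As an illustrative sketch only (not part of the disclosure), checking which interior part the gaze hits could be done by intersecting the gaze ray with each part modelled as a small rectangle in the vehicle reference frame. All names, coordinates and dimensions below are hypothetical; handling of the nearest hit among several candidates is omitted for brevity.

```python
import numpy as np

# Hypothetical interior-part model: each part is a small rectangle lying in a
# plane of the vehicle reference frame (origin = vehicle), given by a centre
# point, a plane normal and half-sizes along the in-plane x and z axes.
INTERIOR_PARTS = {
    "left_mirror": {"point": np.array([-0.9, 1.0, 1.0]),
                    "normal": np.array([0.0, -1.0, 0.0]),
                    "half_size": (0.12, 0.08)},
    "speedometer": {"point": np.array([-0.35, 0.8, 0.9]),
                    "normal": np.array([0.0, -1.0, 0.0]),
                    "half_size": (0.06, 0.06)},
}

def looked_at_object(eye_pos, gaze_dir, parts=INTERIOR_PARTS, tol=1e-6):
    """Return the name of the interior part the gaze ray hits, if any."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    for name, part in parts.items():
        denom = np.dot(part["normal"], gaze_dir)
        if abs(denom) < tol:
            continue                              # ray parallel to this plane
        t = np.dot(part["normal"], part["point"] - eye_pos) / denom
        if t <= 0:
            continue                              # plane behind the driver
        hit = eye_pos + t * gaze_dir
        dx = hit[0] - part["point"][0]            # in-plane offsets of the hit
        dz = hit[2] - part["point"][2]
        if abs(dx) <= part["half_size"][0] and abs(dz) <= part["half_size"][1]:
            return name
    return None
```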
In Fig. 3, the circle can be considered to represent the driver’s head position. The driver’s eye gaze vector is represented as a line going from this circle. The small solid rectangles represent the vehicle’s interior parts, such as the left and right mirrors, the central mirror, the instrument cluster’s speed indicator, the RPM indicator, the central stack and the navigation screen. The virtual plane is also called a virtual wall.
The virtual wall is a surface placed in front of the vehicle, as if it was one of the vehicle’s interior parts. The intersection point between this virtual wall and the eye gaze vector is tracked over a given time window. Following the variations of the intersection point over the surface, an image-based representation is generated. This representation, which may be denoted as a heatmap, is the collected data which is used to detect the cognitive load of the driver.
The vehicle is also equipped with a frontal RGB camera. The position and the dimensions of the imaginary surface are set to maximize the overlap of this surface with the camera’s field of view and the area where gaze detection is available. This camera is located at the inner center of the vehicle, whereas the driver is sitting on the front left seat. Thus, the driver's gaze activities appear concentrated on the left side of the image in the overlays and heatmaps.
The heatmap is a data visualization technique used in different studies and solutions. Heatmaps are often used to highlight people's areas of interest, so several situations can be explored from them. A heatmap as shown in Fig. 4 may be used for visualization and feature extraction after successive steps for creating the heatmap, as indicated in the panels of Fig. 6.
The time-stamped raw intersection points in x and y between the eye gaze vectors and the imaginary surface are the heatmap generator's input. This data may be acquired every 50 ms, if the driver is looking at the imaginary plane. If the driver is not looking at the plane, the distraction source is visual and not necessarily cognitive. For example, the driver might be checking a smartphone. Nevertheless, this distraction can also be detected by the present method.
Since a single intersection point is not sufficiently meaningful for the present problem, the points are buffered as a sliding window, as depicted in panel a) of Fig. 6. With the aim of covering the field of view of the driver, a circle of 15 pixels is placed, centered on the intersection points, as indicated in panel b) of Fig. 6. The choice of the circle diameter that represents the gaze fixation is mainly influenced by the dimensions of the heatmap, which are 640 x 400 pixels in the present case. After normalization of the field-of-view circles, this mask is used to vary the opacity of intersections, as indicated in panel c) of Fig. 6. A Gaussian filter is applied on this mask to reduce the noise due to the gaze activity and also to reduce detail, which keeps the focus on the most explored area, as indicated in panel d) of Fig. 6.

Feature engineering is applied to the generated heatmaps in order to reduce the data dimension. From each heatmap, the following feature sets, based on pixel intensities and shape, are extracted.
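Before turning to the individual feature sets, the heatmap construction of panels a) to d) of Fig. 6 (sliding-window buffering, fixation circles, normalization and Gaussian filtering) could be sketched in Python roughly as follows. This is an illustrative reconstruction and not part of the description; the window size, the blur kernel, the class interface and the interpretation of the 15-pixel value as a radius are assumptions.

```python
import cv2
import numpy as np
from collections import deque

WIDTH, HEIGHT = 640, 400   # heatmap dimensions mentioned in the description
CIRCLE_RADIUS = 15         # 15-pixel fixation circle; radius vs. diameter is assumed here

class HeatmapGenerator:
    """Accumulate gaze intersection points over a sliding time window
    and render them as a blurred density map (heatmap)."""

    def __init__(self, window_size=300):          # e.g. 15 s at one sample per 50 ms
        self.points = deque(maxlen=window_size)   # sliding-window buffer

    def add_point(self, x, y):
        self.points.append((int(x), int(y)))

    def render(self):
        mask = np.zeros((HEIGHT, WIDTH), dtype=np.float32)
        for x, y in self.points:
            circle = np.zeros_like(mask)
            cv2.circle(circle, (x, y), CIRCLE_RADIUS, 1.0, thickness=-1)
            mask += circle                         # overlapping fixations accumulate
        if mask.max() > 0:
            mask /= mask.max()                     # normalise to [0, 1]
        # Gaussian filter reduces gaze jitter and keeps focus on the most explored area
        return cv2.GaussianBlur(mask, (31, 31), 0)
```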
For example, appearance features may be extracted. The pixel intensity variation of a heatmap contains information on the area checked by the driver. The histogram is an efficient tool to visualize those data distributions, especially whether the gaze activity is focused on a specific area or exploring a wider area. During distracted driving, more concentration on higher intensities is expected, and during natural driving, as the driver should cover a wider area, more concentration on lower intensities is expected. Hence, a 6-bin histogram of the number of pixels in terms of pixel intensity is generated per heatmap, as shown in panels a) and b) of Fig. 7. Fig. 7 shows the computed histograms of the heatmap depicted in Fig. 4. In panel a), pixel intensities are represented on the abscissa and the respective number of pixels is represented on the ordinate. In panel b), the values from panel a) are summed up into 6 bins.
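A minimal sketch of this appearance feature, assuming the heatmap has already been normalised to the range [0, 1]:

```python
import numpy as np

def appearance_features(heatmap, n_bins=6):
    """6-bin histogram of pixel intensities of a normalised heatmap.
    Distracted driving is expected to shift mass towards high intensities."""
    hist, _ = np.histogram(heatmap, bins=n_bins, range=(0.0, 1.0))
    return hist.astype(np.float32)
```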
Geometric features can also be extracted. Beyond the pixel intensities that are exploited with the heatmap histograms, the shape of the heatmaps has to be considered as well. Thus, the generated heatmap is divided into contours according to the differences in pixel intensities: binary large objects (BLOBs), as indicated in the panels of Fig. 8. Panel a) of Fig. 8 depicts the driver's gaze activity heatmap. The central regions in the BLOB represent the most fixated area and the boundary region is the least fixated. Panel b) shows a thresholded heatmap, converted to a grey-scale image with distinguished contours of focus. In panels c), d), e), f), extracted contours from the thresholded heatmap of panel b) are shown. Each contour is defined by the pixel intensities. A binary threshold is performed for each zone.
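A possible OpenCV-based sketch of this BLOB extraction, with the threshold levels chosen arbitrarily for illustration (the description does not specify them):

```python
import cv2
import numpy as np

def extract_blobs(heatmap, levels=(0.2, 0.4, 0.6, 0.8)):
    """Threshold the normalised heatmap at several intensity levels and
    return the contours (BLOBs) found at each level."""
    grey = (heatmap * 255).astype(np.uint8)        # grey-scale heatmap
    blobs = []
    for level in levels:
        _, binary = cv2.threshold(grey, int(level * 255), 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        blobs.append(contours)                     # one contour set per zone
    return blobs
```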
Furthermore, image moments may be extracted. The grey-scale BLOBs are constructed from movements containing motion energy and motion history. Therefore, the motion energy and history of each of them are explored. The following equation (1) describes the computed features, which are based on the image moments, that is, weighted averages of the image intensities or a representation of the pixel distribution in the image.
M_ij = Σ_x Σ_y x^i · y^j · I(x, y)    (1)
For a given image I(x, y), i and j are integers that define the moment order.
Moments are sensitive to the pixel position in the image, whereas the shapes are expected to be detected independently of their positions (so-called regular moments).
Hu moments are seven particular image moments, which are invariant under translation, rotation and scale. They are particular combinations of central moments, denoted h1 to h7. Hu moments are extracted per BLOB.
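A short sketch of computing raw moments and Hu moments per BLOB contour with OpenCV; the log-scaling at the end is a common practice assumed here for illustration and is not stated in the description:

```python
import cv2
import numpy as np

def moment_features(contour):
    """Raw/central moments and Hu moments of a single BLOB contour.
    Hu moments are invariant under translation, rotation and scale."""
    m = cv2.moments(contour)                       # dictionary of image moments
    hu = cv2.HuMoments(m).flatten()                # h1 ... h7
    # Log-scaling keeps the seven Hu moments in a comparable numeric range
    hu = -np.sign(hu) * np.log10(np.abs(hu) + 1e-12)
    return m, hu
```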
Also statistical measures may be extracted. In order to grasp the information about the driver's gaze dispersion across the imaginary plane, the following features are extracted as statistical measures over all BLOBs:
• standard deviation on x and y
• centroid coordinates (in case there are several contours, their mean value)
• boundaries of each zone (maximum and minimum of x and y)
• first quartile, median and third quartile on x and y
• area of the contour
• perimeter of the contour

If the driver does not look at the imaginary plane during the entire given time window, the heatmap could contain less relevant data. Therefore, the information of how much time the driver spent looking ahead is another feature, which determines the quality of the heatmap.
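These dispersion statistics could be computed over the extracted contours roughly as follows; the dictionary layout and the aggregation over all contour points are assumptions made for this sketch:

```python
import cv2
import numpy as np

def statistical_features(contours):
    """Dispersion statistics of the driver's gaze computed over all BLOBs."""
    if not contours:
        return {}
    points = np.vstack([c.reshape(-1, 2) for c in contours])    # all contour points
    xs, ys = points[:, 0], points[:, 1]
    return {
        "std": (xs.std(), ys.std()),
        "centroid": points.mean(axis=0),                        # mean over contours
        "bounds": (xs.min(), xs.max(), ys.min(), ys.max()),
        "quartiles_x": np.percentile(xs, [25, 50, 75]),
        "quartiles_y": np.percentile(ys, [25, 50, 75]),
        "area": sum(cv2.contourArea(c) for c in contours),
        "perimeter": sum(cv2.arcLength(c, True) for c in contours),
    }
```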
Finally, all the extracted features are standardized by removing the mean and scaling to unit variance per heatmap. A supervised binary classification algorithm based on a support vector machine is trained with the extracted features. Data collected during neutral driving has been annotated as neutral, and data collected in the presence of secondary tasks has been annotated as distracted. The classification is validated through the stratified K-fold cross-validation technique with ten iterations (K = 10). Stratification seeks to ensure that each fold is representative of all strata of the data, which aims to ensure each class is equally represented across the test folds, and consists of splitting the dataset into samples.

In line with the initial expectations, the variation of the obtained shapes is visually different between neutral and distracted driving, as indicated in Fig. 9. These shapes occupy a wide area in neutral driving, as the driver checks the environment often. However, in the presence of cognitive distraction, the covered area narrows down, as the driver fixates more on a single zone. For a heatmap observed over a longer time, a better separable visual pattern is obtained. This is due to the fact that the driver has more time to explore the environment in neutral driving, whereas in distracted driving the gaze remains concentrated on a smaller area. As a result, the distances between neutral and distracted driving patterns become more obvious with time, as indicated in Fig. 9.
Fig. 9 represents heatmaps during 5, 15 and 30 s in distracted and neutral driving scenarios. Panels a), c), e) correspond to distracted driving scenarios while panels b), d), f) correspond to neutral driving scenarios. Panels a) and b) correspond to 5 s, panels c) and d) correspond to 15 s and panels e) and f) correspond to 30 s.
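The training and validation procedure described above (per-heatmap standardisation, support vector machine classification, stratified 10-fold cross-validation) could be sketched as follows. The use of scikit-learn, the RBF kernel, the shuffling and the interpretation of "per heatmap" standardisation as row-wise scaling are assumptions, not details stated in the description.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

def evaluate(features, labels):
    """features: (n_heatmaps, n_features) array; labels: 0 = neutral, 1 = distracted."""
    # One interpretation of "per heatmap" standardisation: remove the mean and
    # scale to unit variance along each heatmap's feature vector.
    mean = features.mean(axis=1, keepdims=True)
    std = features.std(axis=1, keepdims=True) + 1e-12
    X = (features - mean) / std

    clf = SVC(kernel="rbf")                              # support vector machine
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    scores = cross_val_score(clf, X, labels, cv=cv)      # stratified 10-fold validation
    return scores.mean(), scores.std()
```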

Claims
1. Method for assisting a driver (5) of a vehicle (1), wherein a gaze tracking system (3) is used to determine a gaze direction (6) of the driver (5), characterized in that a computing unit (4) is used to determine an intersection point (7) of the gaze direction (6) with a virtual plane (8) and to track the intersection point (7) during a predefined time period; determine a map (9a, 9b, 9c, 9d) representing a distribution of the position of the intersection point (7) during the time period; determine a level of cognitive distraction of the driver (5) based on the map (9a, 9b, 9c, 9d); and generate or cause an output signal to assist the driver (5) depending on the determined level of cognitive distraction.

2. Method according to claim 1, characterized in that the computing unit (4) is used to apply a classification algorithm to the map (9a, 9b, 9c, 9d) to determine the level of cognitive distraction.

3. Method according to claim 1, characterized in that the computing unit (4) is used to extract a predefined set of characteristic features from the map (9a, 9b, 9c, 9d) and apply a classification algorithm to the set of characteristic features.

4. Method according to one of claims 2 or 3, characterized in that the classification algorithm comprises a support vector machine or an artificial neural network.

5. Method according to one of the preceding claims, characterized in that the computing unit (4) is used to determine a spatial region on the virtual plane (8), wherein all entries of the map (9a, 9b, 9c, 9d) corresponding to the spatial region have respective values that are equal to or greater than a predefined threshold value; and the computing unit (4) is used to determine the level of cognitive distraction based on the spatial region.

6. Method according to one of the preceding claims, characterized in that the computing unit (4) is used to determine an image moment and/or a Hu moment based on the map (9a, 9b, 9c, 9d); and the computing unit (4) is used to determine the level of cognitive distraction based on the image moment and/or based on the Hu moment.
7. Method according to one of the preceding claims, characterized in that the computing unit (4) is used to compare the level of cognitive distraction with a predefined maximum distraction value and to generate the output signal as an alert signal, if the level of cognitive distraction is greater than the maximum distraction value.
8. Method according to one of the preceding claims, characterized in that the computing unit (4) is used to track the intersection point (7) during a predefined further time period, wherein a duration of the further time period is different from a duration of the time period; determine a further map (9a, 9b, 9c, 9d) representing a further distribution of the position of the intersection point (7) during the further time period; determine a further level of cognitive distraction of the driver (5) based on the map (9a, 9b, 9c, 9d); and generate the output signal depending on the determined further level of cognitive distraction.

9. Method according to claim 8, characterized in that
- the duration of the time period lies in the interval [1 s, 30 s], preferably in the interval [10 s, 20 s]; and/or
- the duration of the further time period lies in the interval [15 s, 120 s], preferably in the interval [30 s, 90 s].
10. Method according to one of claims 8 or 9, characterized in that the further time period is at least twice as long as the time period, preferably at least four times as long as the time period.
11. Method according to one of the preceding claims, characterized in that the output signal includes a visual signal and/or an acoustic signal and/or a haptic signal.
12. Method according to one of the preceding claims, characterized in that
- an environmental sensor system (10) is used to identify an object present in an environment of the vehicle (1);
- the computing unit (4) is used to augment the map (9a, 9b, 9c, 9d) with information regarding a position of the object with respect to the virtual plane (8); and
- the computing unit (4) is used to determine the level of cognitive distraction based on the augmented map (9e).
13. Driver assistance system for a vehicle (1), the driver assistance system (2) comprising a gaze tracking system (3), configured to determine a gaze direction (6) of a driver (5) of the vehicle (1), characterized in that the driver assistance system (2) comprises a computing unit (4), configured to determine an intersection point (7) of the gaze direction with a virtual plane (8) and to track the intersection point (7) during a predefined time period; determine a map (9a, 9b, 9c, 9d) representing a distribution of the position of the intersection point (7) during the time period; determine a level of cognitive distraction of the driver (5) based on the map (9a, 9b, 9c, 9d); and generate or cause an output signal to assist the driver (5) depending on the determined level of cognitive distraction.
14. Driver assistance system according to claim 13, characterized in that the gaze tracking system (3) comprises
- an emitter, configured to emit infrared radiation towards an eye of the driver (5);
- an infrared sensor, configured to detect a portion of the infrared radiation reflected from the eye of the driver (5) and to generate a sensor signal depending on the detected portion of the infrared radiation; and
- a control unit, configured to determine the gaze direction based on the sensor signal.
15. Computer program product comprising instructions, which, when executed by a driver assistance system (2) according to one of claims 13 or 14, cause the driver assistance system to carry out a method according to one of claims 1 to 12.
PCT/EP2021/059512 2020-04-16 2021-04-13 Method and system for driver assistance WO2021209426A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102020110417.0A DE102020110417A1 (en) 2020-04-16 2020-04-16 Method and system for assisting a driver
DE102020110417.0 2020-04-16

Publications (1)

Publication Number Publication Date
WO2021209426A1 true WO2021209426A1 (en) 2021-10-21

Family

ID=75497937

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2021/059512 WO2021209426A1 (en) 2020-04-16 2021-04-13 Method and system for driver assistance

Country Status (2)

Country Link
DE (1) DE102020110417A1 (en)
WO (1) WO2021209426A1 (en)


Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
CN116756590B (en) * 2023-08-17 2023-11-14 天津德科智控股份有限公司 EPS system vehicle speed signal interference identification and processing method


Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
AU2003269772A1 (en) 2002-10-15 2004-05-04 Volvo Technology Corporation Method and arrangement for interpreting a subjects head and eye activity

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6496117B2 (en) 2001-03-30 2002-12-17 Koninklijke Philips Electronics N.V. System for monitoring a driver's attention to driving
US20140272810A1 (en) * 2013-03-15 2014-09-18 State Farm Mutual Automobile Insurance Company Real-Time Driver Observation and Scoring For Driver's Education
US20180126901A1 (en) * 2016-11-07 2018-05-10 Nauto, Inc. System and method for driver distraction determination
DE102017202194A1 (en) * 2017-02-13 2018-08-16 Robert Bosch Gmbh Method and device for determining a visual distraction of a driver of a vehicle
WO2020122986A1 (en) * 2019-06-10 2020-06-18 Huawei Technologies Co.Ltd. Driver attention detection using heat maps

Non-Patent Citations (3)

Title
CEGOVNIK TOMAZ ET AL: "An analysis of the suitability of a low-cost eye tracker for assessing the cognitive load of drivers", APPLIED ERGONOMICS, BUTTERWORTH SCIENTIFIC LTD, GUILDFORD, GB, vol. 68, 31 October 2017 (2017-10-31), pages 1 - 11, XP085346476, ISSN: 0003-6870, DOI: 10.1016/J.APERGO.2017.10.011 *
REIMER BRYAN ET AL: "A Field Study on the Impact of Variations in Short-Term Memory Demands on Drivers' Visual Attention and Driving Performance Across Three Age Groups", HUMAN FACTORS, vol. 54, no. 3, June 2012 (2012-06-01), US, pages 454 - 468, XP055812831, ISSN: 0018-7208, DOI: 10.1177/0018720812437274 *
VICTOR T W ET AL: "Sensitivity of eye-movement measures to in-vehicle task difficulty", TRANSPORTATION RESEARCH PART F: TRAFFIC PSYCHOLOGY AND BEHAVIOUR, PERGAMON, AMSTERDAM, NL, vol. 8, no. 2, March 2005 (2005-03-01), pages 167 - 190, XP027849327, ISSN: 1369-8478, [retrieved on 20050301] *

Cited By (2)

Publication number Priority date Publication date Assignee Title
WO2023137106A1 (en) * 2022-01-17 2023-07-20 Veoneer Us Llc Systems and methods for dynamic attention area
US11922706B2 (en) 2022-01-17 2024-03-05 Veoneer Us, Llc Systems and methods for dynamic attention area

Also Published As

Publication number Publication date
DE102020110417A1 (en) 2021-10-21

Similar Documents

Publication Publication Date Title
Deng et al. Where does the driver look? Top-down-based saliency detection in a traffic driving environment
Ohn-Bar et al. Head, eye, and hand patterns for driver activity recognition
EP3488382B1 (en) Method and system for monitoring the status of the driver of a vehicle
Doshi et al. On-road prediction of driver's intent with multimodal sensory cues
Tawari et al. A computational framework for driver's visual attention using a fully convolutional architecture
Doshi et al. A comparative exploration of eye gaze and head motion cues for lane change intent prediction
CN111369617B (en) 3D target detection method of monocular view based on convolutional neural network
Yuen et al. An occluded stacked hourglass approach to facial landmark localization and occlusion estimation
WO2021209426A1 (en) Method and system for driver assistance
Jha et al. Probabilistic estimation of the driver's gaze from head orientation and position
Jha et al. Estimation of driver’s gaze region from head position and orientation using probabilistic confidence regions
Yuen et al. Looking at hands in autonomous vehicles: A convnet approach using part affinity fields
Pech et al. Head tracking based glance area estimation for driver behaviour modelling during lane change execution
Dua et al. Dgaze: Driver gaze mapping on road
Doshi et al. Head and gaze dynamics in visual attention and context learning
Meng et al. Application and development of AI technology in automobile intelligent cockpit
Louie et al. Towards a driver monitoring system for estimating driver situational awareness
Xu et al. Bio-inspired vision mimetics toward next-generation collision-avoidance automation
Chien et al. An integrated driver warning system for driver and pedestrian safety
Loeb et al. Automated recognition of rear seat occupants' head position using Kinect™ 3D point cloud
Debnath et al. Detection and controlling of drivers' visual focus of attention
WO2021024905A1 (en) Image processing device, monitoring device, control system, image processing method, computer program, and recording medium
Talamonti et al. Eye glance and head turn correspondence during secondary task performance in simulator driving
JP2021009503A (en) Personal data acquisition system, personal data acquisition method, face sensing parameter adjustment method for image processing device and computer program
Hernández et al. Vision-Based distraction analysis tested on a realistic driving simulator

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21718570

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21718570

Country of ref document: EP

Kind code of ref document: A1