CN115158325A - Method and vehicle for determining a gaze area of a person - Google Patents

Method and vehicle for determining a gaze area of a person

Info

Publication number
CN115158325A
Authority
CN
China
Prior art keywords
camera
probability
region
cameras
person
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210269005.9A
Other languages
Chinese (zh)
Inventor
H-J·比格
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Robert Bosch GmbH
Original Assignee
Robert Bosch GmbH
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Robert Bosch GmbH filed Critical Robert Bosch GmbH
Publication of CN115158325A publication Critical patent/CN115158325A/en
Pending legal-status Critical Current

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00: Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit, e.g. by using mathematical models
    • B60W40/08: Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub-unit, e.g. by using mathematical models, related to drivers or passengers
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R16/00: Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for
    • B60R16/02: Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for; electric constitutive elements
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2420/00: Indexing codes relating to the type of sensors based on the principle of their operation
    • B60W2420/40: Photo, light or radio wave sensitive means, e.g. infrared sensors
    • B60W2420/403: Image sensing, e.g. optical camera
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60W: CONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00: Input parameters relating to occupants
    • B60W2540/225: Direction of gaze

Landscapes

  • Engineering & Computer Science (AREA)
  • Mechanical Engineering (AREA)
  • Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Mathematical Physics (AREA)
  • Transportation (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a method for determining a gaze area (150) of a person (140) in an environment in which a plurality of different regions (110-118) are each assigned a camera (120-128), wherein a video signal (132) is obtained from each of the plurality of cameras (120-128), wherein, by means of the video signal (132) of each of the cameras (120-128), a probability (W1, W2) is determined that the gaze area (150) of the person (140) comprises the region assigned to the respective camera (120-128), and wherein the gaze area (150) of the person (140) is determined as that region (110-118) of the plurality of different regions (110-118) which is assigned to the camera (120-128) with the highest probability (W2).

Description

Method and vehicle for determining a gaze area of a person
Technical Field
The present invention relates to a method for determining the gaze area of a person, in particular of a driver in a vehicle, to a computing unit and a computer program for carrying it out, and to a vehicle having such a computing unit.
Background
In modern vehicles, the direction of the driver's gaze can be determined, for example in order to assess his attention to the current traffic situation.
Disclosure of Invention
According to the invention, a method for determining the gaze area of a person, a computing unit and a computer program for carrying out the method, and a vehicle, each with the features of the independent patent claims, are proposed. Advantageous embodiments are the subject matter of the dependent claims and of the following description.
The present invention relates to determining the gaze area of a person in an environment, for example the gaze area of a driver in a vehicle. In a camera-based driver attention estimation system, the primary source of the estimate may be one or more camera-based gaze direction estimates (i.e., estimates of the driver's gaze direction). The camera data can be interpreted with the aid of model assumptions, so that the gaze direction vector and the origin at the eyes can be estimated relative to the vehicle interior. On this basis, it can be estimated whether the driver's line of sight is directed at a specific location or a specific region, in particular a so-called "region of interest" (ROI), at a specific point in time, and a corresponding attention rating can be carried out.
The driver's attention can be understood here as a variable that describes how fully the driver is concentrating on the current traffic situation. The value of this variable can be determined, in particular, using measuring devices present in the vehicle, which detect, in particular, the driver's interaction with the vehicle. In this sense, the driver's attention is representative of variables that describe the driver's interaction with the vehicle.
With the typical positioning of a gaze-detection camera in the vehicle, for example near the dashboard, gaze toward other positions or ROIs (for example toward the right exterior rear-view mirror) is difficult to estimate or cannot be recognized at all.
Against this background, it is proposed to assign a camera to each of a plurality of different (local) regions or points in the environment, wherein a video signal is obtained from each of the plurality of cameras. Based on the video signal (which represents the images detected by the camera, ideally with the person in the image), the probability that the respective camera lies in the person's gaze area, in particular in its center or in the area of best vision, is then determined for each camera. In particular, a (statistical) gaze model may be used here, which will be explained in more detail later. The person's gaze area is then determined as the region assigned to the camera with the highest probability.
Preferably, however, the determined gaze area is output as information, for example to a further processing unit such as a driver assistance system in the vehicle, only if the highest probability deviates from the second-highest probability by more than a predefined threshold value (which may be specified absolutely or relatively, for example). Otherwise the estimate may be considered not good enough; in this way the quality of the estimate can also be quantified.
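To make this selection rule concrete, the following is a minimal sketch in Python (not part of the patent; the function name, region labels and the absolute threshold are illustrative) of choosing the gaze area and applying the quality check:

```python
# Minimal sketch of the region-selection rule, assuming the per-camera gaze
# probabilities have already been estimated (e.g., by the models below).

def select_gaze_region(probabilities: dict[str, float],
                       threshold: float = 0.10) -> str | None:
    """Return the region whose camera has the highest gaze probability,
    or None if the estimate is not distinctive enough."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    (best_region, w_best), (_, w_second) = ranked[0], ranked[1]
    # Output the gaze area only if the highest probability exceeds the
    # second-highest by more than the predefined threshold.
    if w_best - w_second > threshold:
        return best_region
    return None  # estimate considered not good enough

# With the example values used later in the description:
print(select_gaze_region({"left_mirror": 0.20, "rear_mirror": 0.80}))
# -> "rear_mirror", since 0.80 - 0.20 > 0.10
```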
Instead of estimating the gaze direction abstractly in the form of a gaze direction vector and from it deriving possible gaze contact with a region or ROI, the attention to one or more local regions is thus estimated directly on the basis of the camera image assigned to that region. The estimate can therefore be more accurate and reliable. It can also be obtained using cameras of lower quality (e.g., lower resolution) or simpler, less computationally demanding algorithms. Such a system can therefore be more cost-effective than a system with a central driver-observation camera, even if more cameras are required for this purpose. The cameras are preferably assigned to the regions such that each camera records images from the direction of its region toward the person.
There are two preferred possibilities for determining the probability that a camera lies in the gaze area. One possibility is to use a statistical model. This involves identifying a face in the camera image and then identifying an eye region. An eye region is to be understood here to mean, in particular, a rectangular area surrounding one or both eyes. The identification in the camera image is performed, for example, as a two-dimensional rectangular area in camera-image coordinates. A concrete possibility is described, for example, in "Viola, P. & Jones, M., Rapid object detection using a boosted cascade of simple features, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2001, 1, 511-518". Alternatively, the eye region may be found by estimating the pose of the face using a facial landmark model, such as described in "Kazemi, V. & Sullivan, J., One millisecond face alignment with an ensemble of regression trees, 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014, 1867-1874", and then determined, for example, by means of the eye landmarks.
If the eye region cannot be determined or identified by these steps, i.e. if the identification of the eye region fails, it is assumed that the relevant camera is not being looked at, i.e. the probability for the relevant camera is assumed to be zero.
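By way of illustration, this first stage (face detection, then eye-region detection, with probability zero on failure) could be sketched with OpenCV's bundled Viola-Jones (Haar cascade) detectors; this is an assumed stand-in, not the implementation of the patent:

```python
# Sketch: locate a face and then a rectangular eye region in one camera
# frame, using OpenCV's bundled Viola-Jones (Haar cascade) detectors.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def detect_eye_region(frame):
    """Return an eye rectangle (x, y, w, h) in camera-image coordinates,
    or None if face or eye detection fails (probability is then zero)."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                          minNeighbors=5)
    if len(faces) == 0:
        return None  # no face: this camera is assumed not to be looked at
    x, y, w, h = faces[0]
    eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w])
    if len(eyes) == 0:
        return None  # eye-region identification failed
    ex, ey, ew, eh = eyes[0]
    return (x + ex, y + ey, ew, eh)
```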
If the eye region can be determined by these steps, i.e. if the eye region is successfully identified, further steps can be performed. The probability for the relevant camera is then determined, for example, by means of a statistical classification method. This includes, in particular, creating a feature vector based on the pixel information in the eye region and inputting the feature vector into a statistical classification method for assessing the gaze. It also includes, in particular, determining a confidence value, for example based on the following features: successful identification of the eye region; the position and size of the eye region in the camera image; the position and size of the eye region in the camera image compared with the temporally preceding estimate; the probability estimated by the respective statistical (classification) method.
One possibility for the statistical classification method is to calculate feature vectors based on the image content, which are interpreted by means of decision trees, random forests or other statistical learning methods. Such a feature vector may, for example, consist of various statistics formed from the image content of the eye region or from a transformation of it (e.g., a transformation for edge detection). An example of such a statistic is the average luminance of the region. Alternatively, the values of a histogram of the luminance values in the region, or the luminance values of individual or all pixels within the eye region, may be included directly in the feature vector. The last possibility presupposes that the eye region is rescaled to a uniform size. Such learning methods can be trained accordingly before use, for example using labeled images for which it is known whether the eyes of the face shown are looking at the camera.
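A minimal sketch of such a feature-based classification stage, assuming a scikit-learn random forest trained offline on labeled eye patches (the patch size, histogram binning and all names are illustrative):

```python
# Sketch: feature vector from a grayscale eye patch, classified by a random
# forest trained on labeled "looking at camera" / "not looking" examples.
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

PATCH_SIZE = (24, 16)  # eye regions are rescaled to a uniform size

def eye_features(eye_patch_gray: np.ndarray) -> np.ndarray:
    patch = cv2.resize(eye_patch_gray, PATCH_SIZE)
    mean_luminance = patch.mean()
    hist = cv2.calcHist([patch], [0], None, [16], [0, 256]).ravel()
    # Feature vector: mean luminance, 16-bin luminance histogram, raw pixels.
    return np.concatenate([[mean_luminance], hist, patch.ravel()])

# Offline training, with X_train built from labeled eye patches:
#   clf = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
# At runtime, the gaze probability for this camera would then be, e.g.:
#   w = clf.predict_proba(eye_features(patch).reshape(1, -1))[0, 1]
```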
Another possibility is, in particular, to determine the probability using an artificial neural network, whereby the gaze can be classified directly from the camera image. For this purpose, for example, so-called "convolutional neural networks" are used, into which the respective camera image can be fed directly, i.e. the network receives the camera image or the camera's video signal and outputs the probability that the person is looking in the direction of the camera. Such a neural network can be trained accordingly before use, for example using labeled images for which it is known whether the eyes of the face shown are looking at the camera.
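For illustration only, such a network could be sketched in PyTorch as follows; the architecture, input size and training setup are assumptions and are not taken from the patent:

```python
# Sketch: a small convolutional network that maps one camera frame directly
# to the probability that the person is looking toward this camera.
import torch
import torch.nn as nn

class GazeAtCameraNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # For 64x64 grayscale input frames: 32 channels at 16x16 resolution.
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 16 * 16, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sigmoid turns the logit into a gaze probability in [0, 1].
        return torch.sigmoid(self.head(self.features(x)))

# One frame per camera in, one probability (e.g., W1, W2, ...) out:
net = GazeAtCameraNet()
frame = torch.rand(1, 1, 64, 64)  # placeholder for a video-signal frame
w = net(frame).item()
```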
Possible regions or ROIs are, in particular, the dashboard, the rear-view mirror, the exterior rear-view mirrors (left or right), the infotainment display, the center console and the door frames. In particular, each of these regions relates to the part of the relevant component that is visible to the driver. If desired, a region may also be just a single point. It is easy to see that two or more such regions can be selected and equipped with corresponding cameras. It is also easy to see that these regions are merely exemplary, although they are regions that are particularly important or relevant for determining the driver's attention. The cameras themselves should each be arranged in, or within a predefined vicinity of, the region to which they are assigned. In the case of an arrangement in a mirror, it should be noted that the mirror must be partially transparent in the relevant area (in front of the camera lens).
The described use of multiple region- or ROI-associated cameras or miniature cameras can also be combined with a classical system having a central, high-quality driver-observation camera. This can be advantageous in particular for ROIs that such a central camera alone recognizes only poorly, or that it can distinguish only with difficulty because of small differences in viewing angle (e.g., rear-view mirror, side mirror, instrument cluster/road).
Modern vehicles often have various driver assistance systems which automate the vehicle or its operation to a certain extent. These include, for example, so-called lane departure warning systems and distance and/or speed control systems. However, when the vehicle is operated in a semi-automated manner, it is usually required for safety reasons that the driver remains responsible for and monitors the operation of the vehicle, i.e. pays close attention to the surrounding traffic and remains able to react to unforeseen events at any time. There is a risk here, however, that the driver relies too heavily on these driver assistance systems and then reacts too late or incorrectly to unforeseen events. Independently of this, the problem may also arise that the driver is overwhelmed by a complex driving situation.
The proposed approach for determining the gaze area and, via it, the driver's attention can be used here. It allows checking, for example, whether the driver looks at the side or rear-view mirror when turning, or whether he keeps his eyes on the road at high speed, even when an active cruise control system is engaged.
The described method steps can be performed, for example, by means of a suitable computing system, i.e. the computing system processes the video signals and performs the calculations for the estimate. The computing unit according to the invention, for example a control device of a motor vehicle, is accordingly configured, in particular by programming, to carry out the method according to the invention.
The invention also relates to a system, for example a vehicle, having a plurality of cameras, each assigned to one of a plurality of different regions, and a computing unit according to the invention.
It is easy to see that the proposed method can also be applied outside of vehicles, for example in a monitoring or control station where various monitors or devices must be kept in view.
Implementing the method according to the invention in the form of a computer program or computer program product with program code for executing all method steps is also advantageous, since this results in particularly low costs, in particular when the executing control device is also used for other tasks and is therefore present anyway. Suitable data carriers for supplying the computer program are, in particular, magnetic, optical and electronic memories, such as hard disks, flash memories, EEPROMs, DVDs, etc. The program may also be downloaded via a computer network (Internet, intranet, etc.).
Other advantages and design aspects of the invention will be apparent from the description and drawings.
The invention is schematically illustrated by means of embodiments in the drawings and described below with reference to the drawings.
Drawings
Fig. 1 schematically shows a vehicle in which the method according to the invention can be implemented.
Fig. 2 schematically shows the flow of the method according to the invention in a preferred embodiment.
Detailed Description
In fig. 1, a vehicle 100 is schematically shown as an environment in which the method according to the invention can be performed. The vehicle is shown from the perspective of a driver 140, who is the person in this environment, and comprises, as components, a dashboard 110, an infotainment display 112, a rear-view mirror 114, a left exterior rear-view mirror 116 and a right exterior rear-view mirror 118. In the situation shown, the driver 140 is looking at the rear-view mirror 114; the corresponding gaze area is designated 150. These components 110 to 118 at the same time form regions toward which the line of sight of the driver 140 can typically be directed. In particular, the center of the driver's field of view or the area of best vision can be determined as the gaze area.
Furthermore, one camera is assigned to each of these regions; in particular, the cameras are also arranged in or near the relevant region. In the present case, a camera is assigned to a region in such a way that the camera records images from the direction of the region toward the person. A camera 120 is assigned to the dashboard 110, a camera 122 (arranged slightly above it) to the infotainment display 112, a camera 124 to the rear-view mirror 114, a camera 126 to the left exterior rear-view mirror 116, and a camera 128 to the right exterior rear-view mirror 118. As already mentioned, when arranging a camera in a mirror, it should be ensured that the mirror is partially transparent there. However, it is also conceivable to arrange the camera at the edge of the mirror, for example in or on the housing or the housing edge of the rear-view mirror.
A computing unit 130 is also provided, which may be, for example, a control device or another computing system in the vehicle 100. This computing unit 130 is connected to each of the cameras 120 to 128 and is thus able to obtain and process the video signals of the cameras, in particular in real time. Such a video signal 132 is shown by way of example for the camera 128. It will be readily appreciated that a corresponding power supply should also be provided for the cameras.
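As a minimal sketch, assuming standard webcam-style camera interfaces (device indices and region names are illustrative, not from the patent), the computing unit might poll one frame per region-assigned camera like this:

```python
# Sketch: poll one frame of the video signal from each region-assigned
# camera, so the per-camera gaze probabilities can be computed.
import cv2

cameras = {
    "dashboard": cv2.VideoCapture(0),    # stand-in for camera 120
    "rear_mirror": cv2.VideoCapture(1),  # stand-in for camera 124
}

def grab_frames() -> dict:
    frames = {}
    for region, cap in cameras.items():
        ok, frame = cap.read()  # one frame of the camera's video signal
        if ok:
            frames[region] = frame
    return frames
```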
Fig. 2 schematically shows the sequence of the method according to the invention in a preferred embodiment, as it can be carried out, for example, in the vehicle 100 of fig. 1. By way of example, the video signals of the camera 126 for the region 116 (left exterior rear-view mirror) and of the camera 124 for the region 114 (rear-view mirror) are considered here. The video signal of camera 126 provides a camera image 216, and the video signal of camera 124 provides a camera image 214.
The two camera images 216 and 214 are now analyzed, for example in the computing unit 130 of fig. 1, to determine the probability with which the driver's line of sight is directed toward the relevant camera. This is explained in more detail by way of example using the camera image 214 and one of the possibilities mentioned above (the statistical model).
For this purpose, a face is first recognized in the camera image 214, or it is checked whether a face is present. Here the face is labeled 224. Then an eye region, here indicated at 234, is identified, or an attempt is made to identify it.
If the eye region (i.e. the rectangle containing one or both eyes) cannot be determined or identified with these steps, i.e. if the identification of the eye region fails, it is assumed that the relevant camera is not being looked at, i.e. the probability for the relevant camera, here the probability W2, is assumed to be zero. In the example shown, however, the eye region is identified, in particular because both eyes are visible there.
The probability W2 for the camera is then determined using a statistical classification method M. This includes, for example, creating a feature vector based on the pixel information in the eye region and inputting it into a statistical classification method for assessing gaze, as already explained in more detail above. This ultimately yields a probability that the driver's gaze area includes the camera 124, e.g., W2 = 80%.
This analysis is also performed for the camera image 216 of the camera 126, and the associated probability W1 is determined. A face 226 is recognized there, and the region 236 is likewise identified as an eye region, even though only one eye is visible in it. The line of sight, however, is not directed at the camera, and ultimately the eye is captured only from the side, because the driver is not looking at the left exterior rear-view mirror and hence not at the associated camera. This may be determined, for example, as W1 = 20%. If no eye were visible in the image at all, the region 236 would not be recognized as an eye region, which would then result in a probability of zero.
Subsequently, in step 240, the probabilities of all camera images, here by way of example only W1 and W2, are compared with one another. Basically, the region assigned to the camera with the highest probability is determined as the gaze area (the region toward which the line of sight is directed). In this example, this would be the region 114, i.e. the rear-view mirror.
In addition, however, the quality of the gaze-area determination can be checked. Only when the highest probability deviates from the second-highest probability by more than a threshold ΔW of, for example, 10%, i.e. when W2 > W1 + ΔW holds, is the gaze area determined in this way assumed to correspond to the actual gaze area. Information 250 describing the current gaze area is then output. This information 250 can then be processed, for example, in a further control device or in the context of a driver assistance function.

Claims (13)

1. A method for determining a gaze area (150) of a person (140) in an environment in which a plurality of different regions (110-118) are each assigned a camera (120-128),
wherein a video signal (132) is obtained from each of the plurality of cameras (120-128),
wherein, by means of the video signal (132) of each of the cameras (120-128), a probability (W1, W2) is determined that the gaze area (150) of the person (140) comprises the region assigned to the respective camera (120-128), and
wherein the gaze area (150) of the person (140) is determined as that region (110-118) of the plurality of different regions (110-118) which is assigned to the camera (120-128) with the highest probability (W2).
2. The method of claim 1, wherein the determined gaze area (150) is output as information (250) only if the highest probability (W2) deviates from the second-highest probability (W1) by more than a predetermined threshold value (ΔW).
3. The method according to claim 1 or 2, wherein, for determining the probability (W1, W2) of each camera (120-128), a face (224) in the camera image (210, 214) is identified and then an eye region (234) is identified.
4. The method according to claim 3, wherein if the recognition of the eye region (234) fails, the probability of the associated camera is assumed to be zero.
5. The method according to claim 3 or 4, wherein, if the eye region (234) is successfully recognized, the probability (W2) of the relevant camera (124) is determined by means of a statistical classification method (M).
6. The method of any preceding claim, wherein the probability is determined using an artificial neural network.
7. The method according to any one of the preceding claims, wherein the cameras (120-128) are each arranged in, or within a predefined vicinity of, the region (110-118) to which the respective camera (120-128) is assigned.
8. The method according to any of the preceding claims, wherein the environment is a vehicle (100) and the person (140) is a driver of the vehicle.
9. The method of claim 8, wherein the plurality of regions are selected from a dashboard (110), a rear-view mirror (114), exterior rear-view mirrors (116, 118), an infotainment display (112), a center console, and a door frame.
10. A computing unit (130) arranged to perform all method steps of the method according to one of the preceding claims.
11. A system (100) having a plurality of cameras (120-128), each assigned to one of a plurality of different regions (110-118), and a computing unit (130) according to claim 10.
12. A computer program which, when executed on a computing unit (130), causes the computing unit (130) to carry out all the method steps of the method according to one of claims 1 to 9.
13. A machine readable storage medium having stored thereon a computer program according to claim 12.
CN202210269005.9A 2021-03-19 2022-03-18 Method and vehicle for determining a gaze area of a person Pending CN115158325A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102021202704.0A DE102021202704A1 (en) 2021-03-19 2021-03-19 Method for determining a viewing area of a person and vehicle
DE102021202704.0 2021-03-19

Publications (1)

Publication Number Publication Date
CN115158325A (en) 2022-10-11

Family

ID=83114988

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210269005.9A Pending CN115158325A (en) 2021-03-19 2022-03-18 Method and vehicle for determining a gaze area of a person

Country Status (2)

Country Link
CN (1) CN115158325A (en)
DE (1) DE102021202704A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9354718B2 (en) 2010-12-22 2016-05-31 Zspace, Inc. Tightly coupled interactive stereo display
US9652031B1 (en) 2014-06-17 2017-05-16 Amazon Technologies, Inc. Trust shifting for user position detection
WO2017053966A1 (en) 2015-09-24 2017-03-30 Tobii Ab Eye-tracking enabled wearable devices
JP7219041B2 (en) 2018-10-05 2023-02-07 現代自動車株式会社 Gaze detection device and its congestion control method

Also Published As

Publication number Publication date
DE102021202704A1 (en) 2022-09-22

Similar Documents

Publication Publication Date Title
RU2756256C1 (en) System and methods for monitoring the behaviour of the driver for controlling a car fleet in a fleet of vehicles using an imaging apparatus facing the driver
US10088899B2 (en) Eye gaze tracking utilizing surface normal identification
CN111469802B (en) Seat belt state determination system and method
RU2764646C2 (en) System and methods for monitoring the behaviour of the driver for controlling a car fleet in a fleet of vehicles using an imaging apparatus facing the driver
CN110826370B (en) Method and device for identifying identity of person in vehicle, vehicle and storage medium
US9881221B2 (en) Method and system for estimating gaze direction of vehicle drivers
US10521683B2 (en) Glare reduction
Chuang et al. Estimating gaze direction of vehicle drivers using a smartphone camera
US9662977B2 (en) Driver state monitoring system
US10764536B2 (en) System and method for a dynamic human machine interface for video conferencing in a vehicle
CN112016457A (en) Driver distraction and dangerous driving behavior recognition method, device and storage medium
US11458979B2 (en) Information processing system, information processing device, information processing method, and non-transitory computer readable storage medium storing program
WO2018145028A1 (en) Systems and methods of a computational framework for a driver's visual attention using a fully convolutional architecture
US20220180483A1 (en) Image processing device, image processing method, and program
US11455810B2 (en) Driver attention state estimation
CN108860045A (en) Driving support method, driving support device, and storage medium
CN111033559A (en) Image processing for image blur correction, image processing method, and program
EP4009287A1 (en) Devices and methods for monitoring drivers of vehicles
JP7154959B2 (en) Apparatus and method for recognizing driver's state based on driving situation judgment information
US20120189161A1 (en) Visual attention apparatus and control method based on mind awareness and display apparatus using the visual attention apparatus
JP2017129973A (en) Driving support apparatus and driving support method
Xiao et al. Detection of drivers visual attention using smartphone
Shirpour et al. A probabilistic model for visual driver gaze approximation from head pose estimation
CN115158325A (en) Method and vehicle for determining a gaze area of a person
CN115995142A (en) Driving training reminding method based on wearable device and wearable device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination