WO2023054661A1 - Gaze position analysis system and gaze position analysis method - Google Patents

Gaze position analysis system and gaze position analysis method

Info

Publication number
WO2023054661A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
gaze
gaze position
space
object model
Prior art date
Application number
PCT/JP2022/036643
Other languages
French (fr)
Japanese (ja)
Inventor
浩彦 佐川
貴之 藤原
Original Assignee
株式会社日立製作所
Priority date
Filing date
Publication date
Application filed by 株式会社日立製作所 (Hitachi, Ltd.)
Publication of WO2023054661A1 publication Critical patent/WO2023054661A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Definitions

  • The present invention relates to a gaze position analysis system and a gaze position analysis method.
  • When analyzing a user's interests and work situation from line-of-sight information obtained with a line-of-sight measurement device worn by the user, the user moves freely in a three-dimensional space in which various objects are arranged. For this reason, it is desirable to be able to confirm where in the three-dimensional space the user was gazing, how the gaze position transitioned, and so on.
  • Techniques for analyzing line-of-sight information, acquired while a user wearing such a line-of-sight measurement device moves freely in a three-dimensional space, as gaze positions in the three-dimensional space are disclosed in Patent Document 1 and Patent Document 2.
  • Patent Document 1 discloses a technique in which a virtual three-dimensional space is generated from images photographed at a plurality of different photographing positions, the photographing positions in the virtual three-dimensional space are calculated, and the user's gaze position and gaze time in the virtual three-dimensional space are calculated from the user's gaze direction obtained at the timing at which each image was photographed.
  • Patent Document 2 discloses a technique for identifying the object the user was focusing on from the position of a display device in the three-dimensional space, the user's line of sight, and the positions of objects in the three-dimensional space.
  • Patent Document 2 assumes that the user's line of sight in the three-dimensional space and the positions of the objects in the three-dimensional space can always be obtained, so the required equipment is large-scale and the scope of application is limited. In addition, the positions of all objects to be analyzed must be clarified in advance, which requires labor for advance preparation.
  • An object of the present invention is to automatically associate the user's gaze position with the three-dimensional model corresponding to each object existing in a three-dimensional space, in a gaze position analysis system.
  • A gaze position analysis system according to one aspect of the present invention includes a space model creation unit, an object position/orientation estimation unit, and a gaze position calculation unit, and automatically associates an object model gaze position, which is the user's gaze position on an object model that is a three-dimensional model of an object existing in space, with the object model. The space model creation unit acquires a first-person video, which is video similar to the user's field of view, acquires a first-person video gaze position, which is the user's gaze position on the first-person video, creates a space model, which is a three-dimensional model of the space within the range to which the user's line of sight is directed, from the first-person video, and calculates the shooting position and orientation of the first-person video in the space model. The object position/orientation estimation unit estimates the position and orientation of the object in the space model by matching the object model with the space model, and places the object model in the space model using the estimated position and orientation. The gaze direction in the space model is calculated using the shooting position and orientation of the first-person video in the space model and the first-person video gaze position, and the gaze position calculation unit calculates the object model gaze position by obtaining the intersection of the gaze direction and the object model in the space model.
  • A gaze position analysis system according to one aspect of the present invention automatically associates the user's gaze position on an object model, which is a three-dimensional model of an object existing in space, with the object model, and includes: a space model creation unit that creates a space model, which is a three-dimensional model of the surrounding space, from a plurality of captured images; an object position/orientation estimation unit that matches the space model with the object model and places the object model on the space model according to the position and orientation obtained by the matching; and a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and the gaze direction in the space model.
  • According to one aspect of the present invention, in the gaze position analysis system, it is possible to automatically associate the user's gaze position with the three-dimensional model corresponding to each object existing in the three-dimensional space.
  • FIG. 1 is a configuration diagram of a computer in the case where the gaze position analysis system of an embodiment of the present invention is executed by a general computer.
  • FIG. 2 is a diagram showing the basic configuration of the line-of-sight measurement device assumed by the present invention.
  • FIG. 3 is a diagram showing an example of the format of first-person video data.
  • FIG. 4 is a diagram showing an example of the format of gaze position data.
  • FIG. 5 is a diagram showing an example of the format of object model data.
  • FIG. 6 is a diagram showing an example of the format of space model data.
  • FIG. 7 is a diagram showing an example of the format of object placement data.
  • FIG. 8 is a diagram explaining the processing performed by the space model creation program.
  • FIG. 9 is a diagram showing an example of a flowchart of the processing executed by the gaze position calculation program.
  • FIG. 10 is a diagram explaining the processing executed by the gaze position calculation program.
  • FIG. 11 is a diagram showing an example of an AR marker placed on an actual object.
  • FIG. 12 is a diagram showing an example of an object model whose shape and size change.
  • FIG. 13 is a diagram showing an example of an object model and gaze positions on the object model.
  • FIG. 14 is a diagram showing an example of a case where each gaze position is displayed so as to face the user.
  • FIG. 15 is a diagram showing an example of displaying gaze positions while restricting how the position and orientation of the object model may be adjusted.
  • FIG. 1 is a configuration diagram of a computer when a gaze position analysis system according to an embodiment of the present invention is executed by a general computer.
  • The line-of-sight measurement device 101 in FIG. 1 is an input device that measures a first-person video, which is video similar to the user's field of view, and the gaze position on the first-person video, and records them in the databases storing the first-person video data 110 and the gaze position data 111, respectively. A device commonly available under the name of "eye tracker" or the like can be used.
  • In particular, the line-of-sight measurement device 101 in the present invention is assumed to be a portable device that can be worn by the user. As a result, the gaze direction can be measured while the user moves freely in space.
  • FIG. 2 shows the basic configuration of the sight line measuring device 101 worn by the user.
  • Reference numeral 201 in FIG. 2 denotes a photographing device for acquiring a first-person image of the user, and a device equivalent to a camera generally used in a personal computer or the like can be used.
  • Reference numeral 202 denotes a spectacle-type device equipped with a sensor for detecting the movement of the user's eyes and measuring the gaze direction.
  • a terminal 203 is used to record the obtained first-person video data 110 and gaze position data 111 in a database. The terminal 203 may transmit data to the information processing device 102 .
  • The line-of-sight measurement device 101 is not limited to a portable type; a stationary device may be used as long as the gaze position can be measured while the user moves freely in space and the gaze position on the first-person video can be acquired.
  • the information processing device 102 in FIG. 1 is an information processing device for executing each program in the gaze position analysis system.
  • the input device 103 includes general computer input devices such as keyboards, buttons, mice, and touch panels for controlling the start and end of the system.
  • the output device 104 is a means for displaying the result of gaze position analysis, the operating status of the system, etc. to the user, and includes the screen of a smartphone or tablet terminal, or a display device for general computers.
  • the storage device 105 is a storage device for storing each program in the gaze position analysis system.
  • the storage device 105 includes a spatial model creation program 106 , an object position/orientation estimation program 107 , a gaze position calculation program 108 and a gaze position display program 109 .
  • Here, the information processing device 102 functions as a space model creation unit by executing processing according to the space model creation program 106.
  • the information processing apparatus 102 also functions as an object position/orientation estimation unit by executing processing according to the object position/orientation estimation program 107 .
  • the information processing apparatus 102 also functions as a gaze position calculation unit by executing processing according to the gaze position calculation program 108 .
  • the information processing apparatus 102 functions as a gaze position display unit by executing processing according to the gaze position display program 109 .
  • The first-person video data 110 database stores first-person video data, which is video similar to the user's field of view. The first-person video data 110 is assumed to be acquired by the line-of-sight measurement device 101 as described above.
  • FIG. 3 shows an example of the format of the first-person video data 110.
  • a data name 301 in FIG. 3 is a name given to the first-person video data, and any string of characters and symbols can be used.
  • The first-person video data 110 contains a plurality of images acquired at predetermined time intervals or at arbitrary timing, in the order in which they were acquired, and the number of data 302 in FIG. 3 represents the number of images included in the data.
  • Time 1 of 303 represents the time when the first image was acquired
  • Image 1 of 304 represents the first acquired image
  • Photographing position and orientation 1 of 305 represents the position and orientation of the camera that captured the first image.
  • Time n 306 represents the time when the n-th image was acquired
  • image n 307 represents the n-th image
  • shooting position and orientation n 308 represents the position and orientation of the camera that shot the n-th image. Note that the shooting positions and orientations 305 and 308 are data calculated by the space model creation program 106, as will be described later, and are blank when each image is acquired.
  • the gaze position data 111 database stores gaze position data acquired by the eye gaze measuring device 101 .
  • FIG. 4 shows an example of the format of the gaze position data 111.
  • The gaze position data 111 has corresponding first-person video data 110. The correspondence between the gaze position data 111 and the first-person video data 110 is therefore represented by writing, in the data name 401, the same name as the data name 301 of the corresponding first-person video data.
  • the gaze position data 111 includes a plurality of position coordinates acquired at predetermined time intervals or arbitrary timing in the order in which they were acquired.
  • the number of data 402 in FIG. 4 represents the number of position coordinates included in the data.
  • 403 is the time at which the position coordinates of the gaze position were first acquired.
  • 404 is the position coordinates of the first acquired gaze position, and 405 describes the name of the object when the gaze position data is associated with an object model by the gaze position calculation program 108.
  • Likewise, 406 is the time at which the position coordinates of the n-th gaze position were acquired, 407 is the position coordinates of the n-th acquired gaze position, and 408 describes the name of the associated object for the n-th entry. Since no object model has yet been associated at the time the gaze position data is acquired, 405 and 408 are initially blank.
  • When the gaze position data is acquired, each gaze position is given as position coordinates on the first-person video, that is, as two-dimensional coordinate data. When a gaze position is associated with an object model by the gaze position calculation program 108, it becomes position coordinates on the object model and is rewritten as three-dimensional coordinate data.
  • the gaze position data 111 described above is based on the premise that the gaze position data acquired by the eye gaze measuring device 101 and the gaze position data associated with the object model are managed with the same data. However, the gaze position data acquired by the eye gaze measuring device 101 and the gaze position data associated with the object model may be managed as separate data.
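  • For illustration only, the record layouts of FIG. 3 and FIG. 4 could be held in memory with structures along the following lines; the field names are hypothetical and merely mirror the reference numerals 301-308 and 401-408 described above, not a prescribed implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

import numpy as np


@dataclass
class FirstPersonFrame:
    """One entry of the first-person video data 110 (FIG. 3)."""
    time: float                                # time the image was acquired (303/306)
    image: np.ndarray                          # the captured frame (304/307)
    camera_pose: Optional[np.ndarray] = None   # 4x4 shooting position/orientation, filled in
                                               # later by the space model creation program (305/308)


@dataclass
class GazeSample:
    """One entry of the gaze position data 111 (FIG. 4)."""
    time: float                                # acquisition time (403/406)
    position: Tuple[float, ...]                # 2-D image coordinates at capture, rewritten as
                                               # 3-D object model coordinates after association (404/407)
    object_name: Optional[str] = None          # set when associated with an object model (405/408)


@dataclass
class FirstPersonVideoData:
    name: str                                  # data name (301), shared with the gaze data (401)
    frames: List[FirstPersonFrame] = field(default_factory=list)
```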
  • the database of object model data 112 stores three-dimensional models of objects that exist in the space within the range where the user's line of sight is directed.
  • the three-dimensional model of each object is represented as a collection of points representing the shape of the object, that is, as point cloud data.
  • Object models created with general three-dimensional CAD tools are often represented as collections of polygons, but a polygon model can easily be converted into a point cloud model. For example, first divide each polygon into triangles by lines connecting one vertex to its non-adjacent vertices, then repeatedly subdivide the triangles, and finally collect the vertices of all resulting triangles. The object model represented by polygons can thereby be represented as point cloud data.
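  • A rough sketch of this conversion is shown below; it fan-triangulates each polygon, subdivides the triangles by their edge midpoints, and collects the resulting vertices. It assumes the polygon model is given as a vertex array plus per-face vertex index lists, which is an assumption for illustration and not the format used in this embodiment.

```python
import numpy as np


def polygons_to_point_cloud(vertices, faces, subdivisions=2):
    """vertices: (N, 3) array; faces: list of vertex-index lists (convex polygons)."""
    # Fan-triangulate every polygon: connect one vertex to its non-adjacent vertices.
    triangles = []
    for face in faces:
        for i in range(1, len(face) - 1):
            triangles.append((vertices[face[0]], vertices[face[i]], vertices[face[i + 1]]))

    # Repeatedly split each triangle into four smaller triangles by its edge midpoints.
    for _ in range(subdivisions):
        finer = []
        for a, b, c in triangles:
            ab, bc, ca = (a + b) / 2, (b + c) / 2, (c + a) / 2
            finer += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
        triangles = finer

    # The point cloud is the set of all (deduplicated) triangle vertices.
    points = np.array([p for tri in triangles for p in tri])
    return np.unique(np.round(points, 6), axis=0)
```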
  • FIG. 5 shows an example of the format of an object model stored in the object model data 112 database.
  • 501 in FIG. 5 is the name given to the object, and any string of letters and symbols can be used.
  • 502 represents the number of points included in the object model
  • 503 represents the positional coordinates of the first point
  • 504 represents the positional coordinates of the n-th point.
  • Although the format shown in FIG. 5 includes only the positional coordinates of the points, it may also include information attached to each point, such as color information.
  • the spatial model data 113 database stores a spatial model, which is a three-dimensional model for the space within the range where the user looks.
  • the spatial model is data created by the spatial model creation program 106 using the first-person video data 110, and is assumed to be represented as point cloud data, like the object model described above.
  • FIG. 6 shows an example of the format of the spatial model stored in the database of the spatial model data 113.
  • a model name 601 in FIG. 6 is a name given to the spatial model, and any string of letters and symbols can be used.
  • 602 is the name of the first-person video data used to create the space model
  • 603 is the number of points included in the space model
  • 604 is the position coordinates of the first point
  • 605 is the position coordinates of the nth point.
  • Although the format shown in FIG. 6 includes only the positional coordinates of the points, it may also include information attached to each point, such as color information.
  • the database of the object placement data 114 stores data on the object model that has been matched with the space model and placed on the space model by the object position/orientation estimation program 107 .
  • FIG. 7 shows an example of the format of the object placement data 114 stored in its database. Assuming that the object placement data 114 is stored for each space model, the model name 701 in FIG. 7 describes the name of the corresponding space model. The number of objects 702 represents the number of object models arranged in the target space model.
  • 703 is the name of the first object placed on the space model
  • 704 is the position and orientation of the first object on the space model
  • 705 is the time when the first object was placed on the space model
  • 706 is the name of the nth object placed on the space model
  • 707 is the position and orientation of the nth object on the space model
  • 708 is the time when the nth object was placed on the space model.
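  • The object model (FIG. 5), space model (FIG. 6), and object placement (FIG. 7) records could likewise be sketched as follows; the field names are illustrative assumptions that simply mirror the numerals 501-504, 601-605, and 701-708.

```python
from dataclasses import dataclass, field
from typing import List

import numpy as np


@dataclass
class ObjectModel:
    name: str               # object name (501)
    points: np.ndarray      # (n, 3) point coordinates (502-504); color etc. could be added


@dataclass
class SpaceModel:
    name: str               # model name (601)
    source_video: str       # name of the first-person video data used (602)
    points: np.ndarray      # (n, 3) point coordinates (603-605)


@dataclass
class PlacedObject:
    name: str               # object name (703/706)
    pose: np.ndarray        # 4x4 position/orientation on the space model (704/707)
    placed_at: float        # time the object was placed (705/708)


@dataclass
class ObjectPlacementData:
    space_model_name: str                                       # model name (701)
    objects: List[PlacedObject] = field(default_factory=list)   # number of objects = 702
```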
  • The present invention uses the first-person video data 110 and the gaze position data 111 acquired from the line-of-sight measurement device 101, and analyzes the gaze position on each object by means of the space model creation program 106, the object position/orientation estimation program 107, and the gaze position calculation program 108. In particular, this embodiment assumes that processing is performed while the first-person video data 110 and the gaze position data 111 are acquired from the line-of-sight measurement device 101 in real time.
  • The space model creation program 106 constantly reads newly stored data from the data acquired from the line-of-sight measurement device 101 and stored in the first-person video data 110 database, and performs processing for creating a space model.
  • As techniques used in the space model creation program 106, the well-known SLAM (Simultaneous Localization and Mapping) and MVS (Multi-View Stereo) methods can be used.
  • SLAM is a technique that creates rough point cloud data by analyzing the correspondence between images, using multiple consecutive images obtained while the camera is moved.
  • The MVS method is a technique that creates more detailed, dense point cloud data by using the analysis results of SLAM.
  • Fig. 8 shows an image of creating a spatial model from point cloud data using the SLAM and MVS methods.
  • 801 represents an object that exists within the range where the user's line of sight moves, and for the sake of simplicity, it is assumed that there are no other objects in the surrounding area.
  • 802 and 804 represent the positions of the camera when the first-person video was captured, and 803 and 805 illustrate the range of the camera's field of view, corresponding to the camera posture, when the first-person video was captured from each camera position.
  • a space model represented by point cloud data as shown in 806 is created by using images of the same object or the same location photographed from multiple locations.
  • the space model 806 can be created in real time according to the movement of the gaze position of the user.
  • any technique may be used in the spatial model creation program 106 as long as it can create a spatial model represented by point cloud data.
  • the position and orientation of the camera are information necessary for calculating the gaze position on the object model. Therefore, when using a space model creation technique that cannot calculate the position and orientation of the camera, it is necessary to separately use means for acquiring the position and orientation of the camera on the space model. For example, any technology that acquires the position and orientation in space, such as using a sensor that acquires the position and orientation, can be used.
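  • As a highly simplified stand-in for such a pipeline (this is only an illustration, not the actual implementation of the space model creation program 106), the sketch below estimates the relative camera pose between two consecutive first-person frames and triangulates a sparse set of points with OpenCV; the intrinsic matrix K is assumed to be known, and a full SLAM/MVS system would extend this to many frames and densify the result.

```python
import cv2
import numpy as np


def two_view_points_and_pose(img1, img2, K):
    """Estimate the relative pose of frame 2 w.r.t. frame 1 and triangulate sparse points."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)

    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])

    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, p1, p2, K, mask=mask)

    # Projection matrices of the two views expressed in the first camera's frame.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, p1.T, p2.T)
    points = (pts4d[:3] / pts4d[3]).T      # rough sparse point cloud (SLAM-like map)
    return points, R, t                    # R, t correspond to a shooting position/orientation
```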
  • The object position/orientation estimation program 107 is a program that matches the point cloud data of an object model against part of the point cloud data in the space model created by the space model creation program 106, that is, adjusts the position and orientation of the object model so that it fits the space model, and thereby calculates the position and orientation of the object model in the space model.
  • As techniques used in the object position/orientation estimation program 107, the well-known ICP (Iterative Closest Point) algorithm or NDT (Normal Distribution Transform) algorithm can be used. Alternatively, any technique may be used as long as it can perform matching between point cloud data and determine the position and orientation of an object model in the space model. The timing at which the object position/orientation estimation program 107 is executed is controlled by the gaze position calculation program 108, which will be described later.
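  • Purely as an illustration of such a matching step (Open3D is used here only as an example library; the embodiment does not prescribe a particular implementation), an object model could be registered to the selected part of the space model as follows.

```python
import numpy as np
import open3d as o3d


def match_object_to_space(object_points, space_points, init_pose=None, max_dist=0.05):
    """Estimate the object model's pose in the space model by point-to-point ICP."""
    if init_pose is None:
        init_pose = np.eye(4)
    src = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(object_points))
    tgt = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(space_points))
    result = o3d.pipelines.registration.registration_icp(
        src, tgt, max_dist, init_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    # result.transformation is the 4x4 pose of the object model in the space model;
    # result.fitness is an overlap ratio that could serve as a "degree of matching".
    return result.transformation, result.fitness
```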
  • The gaze position calculation program 108 is a program that obtains the gaze position on the object model using the gaze position data 111 acquired by the line-of-sight measurement device 101, the object model data 112, and the space model data 113 created by the space model creation program 106.
  • In step 901, newly stored gaze position data is acquired from the gaze position data 111 stored in the gaze position data 111 database.
  • Alternatively, new gaze position data may be acquired directly from the line-of-sight measurement device 101.
  • In step 902, the gaze direction on the space model is calculated using the newly acquired gaze position data 111 and the shooting positions and orientations 305 and 308 that were calculated when the space model was created by the space model creation program 106 and written into the first-person video data 110.
  • the direction of gaze on the space model is represented by a vector that indicates the starting point of the gaze and the line-of-sight direction from that position.
  • Specifically, the shooting position and orientation of the first-person video at the same time as the newly acquired gaze position data 111 are obtained from the first-person video data 110.
  • Alternatively, the shooting positions and orientations of the first-person video data 110 corresponding to the times immediately before and after the time of the gaze position data 111 may be acquired, and a shooting position and orientation obtained by interpolation based on the relation to the time of the gaze position data 111 may be used.
  • Next, the gaze position on the first-person video is coordinate-transformed into a gaze position on the space model.
  • The gaze position data 111 is represented as a gaze position on the first-person video, and the positional relationship between the first-person video and the gaze position can be obtained at the same scale as the actual space.
  • Furthermore, by applying a coordinate transformation using the shooting position and orientation acquired from the first-person video data 110 to the obtained positional relationship, the positional relationship among the camera's shooting position, the first-person video, and the gaze position on the space model can be represented as shown in FIG. 10.
  • 1001 is a space model
  • 1002 is the shooting position of the camera that shot the first-person video on the space model 1001
  • 1003 is the shooting range of the first-person video on the space model 1001, which corresponds to the shooting posture of the camera. .
  • 1004 is the gaze position on the first-person video associated with the spatial model 1001 .
  • the gaze direction on the space model 1001 can be obtained as a vector 1005 that starts at 1002 and passes through the gaze position 1004 on the first-person video in FIG.
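  • The gaze direction vector 1005 can be illustrated with a few lines of linear algebra. The sketch below assumes a pinhole camera with a known intrinsic matrix K and a 4x4 camera-to-world pose; how the shooting position and orientation 305/308 are actually stored is an assumption made only for this example.

```python
import numpy as np


def gaze_ray_on_space_model(gaze_px, K, camera_pose):
    """gaze_px: (u, v) gaze position on the first-person image.
    camera_pose: 4x4 camera-to-world matrix (shooting position and orientation).
    Returns the ray origin (corresponding to 1002) and the unit gaze direction (1005)."""
    u, v = gaze_px
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing direction in camera coordinates
    R, t = camera_pose[:3, :3], camera_pose[:3, 3]
    direction = R @ ray_cam                               # rotate into space model coordinates
    return t, direction / np.linalg.norm(direction)
```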
  • In step 903 of FIG. 9, it is determined whether or not an object model has already been placed on the space model 1001 in the gaze direction on the space model 1001 obtained in step 902. If an object model is placed there, the process proceeds to step 906; otherwise, it proceeds to step 904.
  • For this determination, for example, the distance between the vector representing the gaze direction on the space model 1001 (hereinafter referred to as the gaze direction vector), i.e., the vector 1005 in FIG. 10, and each point of the object models placed on the space model is calculated, and the smallest distance is selected.
  • If the selected distance is equal to or less than a predetermined threshold, it can be determined that the target object model exists in the gaze direction.
  • As the predetermined threshold, for example, the largest distance among the distances between points in the point cloud data constituting the object model, half of that distance, or the like can be selected.
  • Alternatively, the number of points whose distance from the gaze direction vector is equal to or less than the predetermined threshold may be obtained, and the object model may be determined to exist in the gaze direction when this number is equal to or greater than a predetermined number.
  • If the object model also has polygon data, which is data represented by a set of polygons, the polygon data may likewise be placed on the space model in accordance with the placed object model, and the target object model may be determined to exist in the gaze direction if any of the placed polygons intersects the gaze direction vector placed on the space model.
  • In addition, the position on the gaze direction vector of the point closest to the gaze direction vector 1005, or of the polygon intersecting the gaze direction vector 1005, may be obtained, and the object model containing the point or polygon closest to the starting point of the gaze direction vector 1005, that is, closest to the shooting position, may be selected as the object model existing in the gaze direction.
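  • One way to realize this determination for a point cloud object model is the point-to-ray distance test sketched below; this is an illustration only, with the threshold chosen by the caller as described above. When several object models pass the test, the one with the smallest returned distance along the ray can be selected.

```python
import numpy as np


def object_in_gaze_direction(points, origin, direction, threshold):
    """points: (N, 3) object model points already placed on the space model.
    origin, direction: gaze direction vector (ray) on the space model.
    Returns (hit, distance along the ray of the closest qualifying point)."""
    rel = points - origin
    along = rel @ direction                      # position of each point along the gaze ray
    perp = rel - np.outer(along, direction)      # perpendicular offset from the ray
    dist = np.linalg.norm(perp, axis=1)
    hit = (dist <= threshold) & (along > 0)      # keep only points in front of the camera
    if not np.any(hit):
        return False, None
    return True, along[hit].min()                # closest qualifying point to the shooting position
```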
  • In step 904, the object position/orientation estimation program 107 is used to match an object model against the point cloud data on the space model 1001 that exists in the gaze direction on the space model 1001. As shown in FIG. 10, this processing adjusts the position and orientation of the object model 1006 with respect to the space model 1001 in various ways, and thereby makes it possible to replace part of the space model 1001 with the target object model 1006.
  • As the point cloud data on the space model 1001 existing in the gaze direction, for example, point cloud data on the space model whose distance from the gaze direction vector 1005 is equal to or less than a predetermined distance can be selected.
  • As the predetermined distance, the maximum size of the object model to be matched, half of that size, or the like can be used.
  • Alternatively, the selected data can be the point cloud data existing within a range of a specific shape of predetermined size having the gaze direction vector 1005 as its central axis. It is also possible to project each point of the point cloud data onto the gaze direction vector and select the points within a predetermined range centered on the position on the gaze direction vector where the points are most concentrated. Any other method may be used as long as it can select the point cloud data to be matched around the gaze direction vector.
  • The object models to be matched against the point cloud data on the space model can be all object models stored in the object model data 112 database. Alternatively, well-known QR codes (registered trademark), AR markers, or symbols and character strings representing the names of objects may be installed on the actual objects and read using well-known image recognition and character recognition techniques, and the corresponding object models may be selected from the object model data 112 for matching.
  • FIG. 11 shows an example of an AR marker installed on an actual object.
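  • As one concrete possibility for reading such markers (the embodiment does not depend on a specific library), OpenCV's QR code detector could return an object name used to select candidate object models; AR-marker or character recognition would be handled analogously. The decoded string shown in the comment is a hypothetical example.

```python
import cv2


def object_names_from_qr(first_person_image):
    """Decode a QR code in the first-person image and return the encoded object name, if any."""
    detector = cv2.QRCodeDetector()
    text, points, _ = detector.detectAndDecode(first_person_image)
    return [text] if text else []   # e.g. ["pump_unit_A"]: keys into the object model data 112
```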
  • Alternatively, well-known point cloud feature values (three-dimensional feature values) may be extracted from the point cloud data on the space model and from each object model, and the object models whose feature values are similar to those of the point cloud data on the space model may be selected from the object model data 112 and matched.
  • There are also cases where the shape and size of an object existing at a specific location change as a work process progresses.
  • In such cases, object models representing the states of the object and their order relationship are stored in the object model data 112 for each work process, and by matching them with the space model in order, the object model corresponding to the current work process can be matched with the space model.
  • First, the object model corresponding to the first process is selected and matched.
  • If the object model corresponding to one of the processes has already been matched to the point cloud data on the space model existing in the gaze direction, the object model corresponding to the next process is selected and matched against that point cloud data; if the object model corresponding to the next process matches the space model with a higher degree of matching than the already matched object model, the already matched object model can be replaced with the object model corresponding to the next process.
  • As the degree of matching, the same measure as that used in the processing of step 905, described later, can be used. Furthermore, matching of the object model corresponding to the next process against the space model may be performed when the gaze position moves on the already matched object model, or when the space model creation program 106 detects a change in the space model.
  • The change in the space model can be detected by the space model creation program 106, for example, by generating a space model from a newly acquired first-person video image and a predetermined number of first-person video images immediately preceding it, comparing it with the previously generated space model, and determining that the space model has changed if the difference between the two is equal to or greater than a predetermined threshold.
  • The difference between the space models can be calculated, for example, by searching, for each point in one space model, for the closest point in the other space model, calculating the distance to the found point, and averaging these distances over all points in the model. Alternatively, the number of points whose distance is equal to or greater than a predetermined threshold may be used, or any other method capable of calculating the difference between point clouds may be used.
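  • The mean nearest-neighbour distance described above is simple to state; a sketch using SciPy's KD-tree is shown below purely for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree


def space_model_difference(points_a, points_b):
    """Mean distance from each point of model A to its nearest neighbour in model B."""
    distances, _ = cKDTree(points_b).query(points_a)
    return distances.mean()   # compare against a threshold to decide that the space model changed
```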
  • In step 905, it is determined whether or not the object model has been correctly matched with the point cloud data on the space model existing in the gaze direction on the space model. If the matching is correct, the position and orientation of the placed object model are saved in the object placement data 114 database and the process proceeds to step 906; otherwise, it proceeds to step 908.
  • Specifically, if the degree of matching between the point cloud data on the space model and the object model placed on the space model based on the result of the matching processing is greater than a predetermined value, it is determined that the object model has been correctly matched.
  • As the degree of matching, for example, for each point of the object model placed on the space model using the matching result, the closest point in the point cloud data on the space model is searched for, the number of points whose distance to the found point is less than a predetermined threshold is counted, and the ratio of that number to the total number of points in the object model can be used.
  • any index other than the above may be used as long as it is an index that can determine the quality of the matching result between the point cloud data.
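  • The degree of matching described above can likewise be written down directly; the sketch below computes the ratio of placed object model points whose nearest space model point lies within the threshold.

```python
import numpy as np
from scipy.spatial import cKDTree


def matching_degree(object_points_in_space, space_points, threshold):
    """Ratio of placed object-model points that have a space-model point within `threshold`."""
    distances, _ = cKDTree(space_points).query(object_points_in_space)
    return np.count_nonzero(distances < threshold) / len(object_points_in_space)
```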
  • In step 906, the gaze position on the object model placed on the space model is calculated using the matching result of step 904. Since the object model placed on the space model and the gaze direction vector 1005 on the space model are both available, a process similar to the process used in step 903 for determining whether an object model exists in the gaze direction can be used.
  • In step 903, when the object model is point cloud data, attention is paid only to whether or not a point exists whose distance from the gaze direction vector 1005 is equal to or less than a predetermined threshold.
  • Here, the difference is that the point whose distance from the gaze direction vector is equal to or less than the predetermined threshold and which is closest to the shooting position 1002 is selected and set as the gaze position.
  • If there are a plurality of points whose distance from the gaze direction vector 1005 is equal to or less than the predetermined threshold and which lie within a predetermined range of the point closest to the shooting position 1002, the average of the point closest to the shooting position 1002 and the points within the predetermined range from it may be used as the gaze position.
  • When the object model is expressed as a collection of polygons, the intersection points between the gaze direction vector and the polygons of the object model may be obtained, and the intersection point closest to the shooting position 1002 may be used as the gaze position.
  • The obtained gaze position is then coordinate-transformed using the position and orientation of the object model on the space model obtained from the matching result of step 904. As a result, the gaze position 1004 on the space model 1001 is transformed into the gaze position 1007 on the object model 1006, that is, into the object model coordinate system.
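  • Combining the pieces above, the gaze position 1007 in the object model coordinate system could be obtained roughly as follows; this is a sketch under the assumption that `object_pose` is the 4x4 placement transform obtained in step 904 and that the object model is point cloud data.

```python
import numpy as np


def gaze_position_on_object(points_in_space, origin, direction, object_pose, threshold):
    """points_in_space: object model points already transformed onto the space model.
    Returns the gaze position in the object model coordinate system, or None if no hit."""
    rel = points_in_space - origin
    along = rel @ direction
    perp_dist = np.linalg.norm(rel - np.outer(along, direction), axis=1)
    candidates = np.where((perp_dist <= threshold) & (along > 0))[0]
    if candidates.size == 0:
        return None
    hit = points_in_space[candidates[np.argmin(along[candidates])]]   # closest to shooting position 1002

    # Transform the gaze position 1004 on the space model into the object coordinate system (1007).
    hit_h = np.append(hit, 1.0)
    return (np.linalg.inv(object_pose) @ hit_h)[:3]
```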
  • The gaze position on the placed object model may also be calculated for past gaze position data. This is because, in the initial stage of processing by the gaze position calculation program 108 and the like, it is expected that the point cloud data for each object on the space model is still sparse and the object model cannot yet be placed on the space model.
  • The gaze position on the object model obtained in step 906 is then saved in the gaze position data 111. When saving the gaze position on the object model, the shooting position and orientation of the first-person video on the space model, the gaze position in the space model, the position and orientation of the object model on the space model, and the like may be saved together.
  • In step 908, if there is an end instruction from the input device 103 or the like, the process ends; otherwise, the process returns to step 901.
  • The gaze position calculation program 108 described above acquires the first-person video data 110 and the gaze position data 111 in real time, and calculates the gaze position on the object model while generating the space model data 113 and placing the object model data 112 on the space model. However, the processing of the gaze position calculation program 108 may instead be executed after the target first-person video data 110 and gaze position data 111 have been acquired, by sequentially reading the data stored in the first-person video data 110 and the gaze position data 111.
  • In the initial stage of processing, the point cloud data for each object on the space model may still be sparse, so that no object model can be placed on the space model and the gaze position on the object model cannot be calculated for some gaze position data.
  • For such gaze positions, the gaze position calculation program 108 may be executed again using the space model data 113 and the object placement data 114 generated by the earlier execution of the gaze position calculation program 108, so that the gaze position on the object model can also be calculated for them.
  • In that case, the gaze position calculation program 108 is executed with the following changes.
  • the first change is that the acquisition of the gaze position data in step 901 is performed by reading data from the data stored in the gaze position data 111 in the order in which they were saved. Further, it is determined whether or not the gaze position on the object model has been calculated for the acquired gaze position data, and if it has been calculated, the process of proceeding to step 908 is added.
  • the second change is to proceed to step 908 instead of step 904 if, in step 903, the object model is not arranged in the gaze direction on the space model.
  • the third change is to delete steps 904 and 905.
  • By executing the gaze position calculation program 108 with these changes, the gaze position on the object model can be calculated for those gaze positions for which it had not yet been calculated.
  • Alternatively, the above-described processing may be performed after manually placing an object model at a location on the space model where no object model has been placed.
  • The gaze position display program 109 is a program that displays the gaze position data according to instructions from the user. The gaze positions can be displayed by well-known gaze analysis methods, for example a method using a heat map, which shows the frequency distribution of gaze positions, or a method that displays the length of time the gaze position stayed within a certain range.
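  • As a small illustration of a heat-map style display on an object model (one of several display methods mentioned here, not a prescribed one), fixation counts can be accumulated per model point within a radius of each gaze position and then mapped to colors when rendering.

```python
import numpy as np
from scipy.spatial import cKDTree


def gaze_heatmap(model_points, gaze_positions_on_model, radius=0.02):
    """Count, for every object model point, how many gaze positions fell within `radius`."""
    counts = np.zeros(len(model_points), dtype=int)
    tree = cKDTree(model_points)
    for gaze in gaze_positions_on_model:
        counts[tree.query_ball_point(gaze, radius)] += 1
    return counts   # can be mapped to colors when rendering the object model
```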
  • In addition to displaying the gaze position on the first-person video, the program also displays the gaze position on the object model.
  • For example, a virtual space model, which is a virtual space corresponding to the space model, is prepared, the object models are placed in the virtual space model based on the data stored in the object placement data 114, and the gaze positions are displayed on the object models placed in the virtual space model.
  • a method of selecting only a specific object model and displaying the gaze position only on the selected object model can also be used.
  • the virtual space model or the object model may be adjusted and displayed so that the gaze position always faces the user.
  • Reference numeral 1301 in FIG. 13 denotes an example of an object model, and 1302, 1303, 1304, and 1305 denote examples of gaze positions on the object model.
  • FIG. 14 shows an example of the case where the gaze position shown in FIG. 13 is always displayed facing the user.
  • 1401 is an object model displaying a point of gaze 1302 facing the user, and 1402 is the point of gaze displayed on 1401 .
  • Reference numeral 1403 denotes an object model in which the gaze point 1303 is displayed facing the user, and 1404 denotes the gaze point displayed on the 1403 .
  • Reference numeral 1405 denotes an object model in which the gaze point 1304 is displayed facing the user, and 1406 the gaze point displayed on 1405 .
  • Reference numeral 1407 denotes an object model in which the gaze point 1305 is displayed facing the user, and 1408 denotes the gaze point displayed on 1407 .
  • In FIG. 14, the gaze position is displayed by adjusting the object model so that each gaze position completely faces the user.
  • When the object model is point cloud data, it is sufficient to select the points within a predetermined range containing the gaze position, obtain the normal of the plane formed by those points, and adjust the position and orientation of the object model so that the obtained normal faces the user.
  • When the object model is represented by a collection of polygons, the polygon containing the gaze position is selected from the object model, and the position and orientation of the object model are adjusted so that its normal faces the user.
  • When displaying the gaze position, it is also possible to restrict how the position and orientation of the displayed object model may be adjusted. Furthermore, by adjusting the extent to which the gaze position faces the user, a large change in the posture of the object model when displaying the gaze position can be suppressed.
  • For example, the position and orientation of the object model are adjusted so that the first gaze position faces the user, and the subsequent gaze positions are displayed so that the degree to which they face the user does not exceed a predetermined range.
  • As the degree of facing the user, the angle or the inner product between the normal direction of the plane containing the gaze position and the direction facing the user, that is, the direction perpendicular to the screen, can be used.
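  • For illustration, the normal of the region around a gaze position can be estimated by PCA of the neighbouring model points, and the rotation that turns this normal toward the viewer follows from Rodrigues' formula; clamping the rotation angle corresponds to the restriction described above. Treating the viewing direction as the negative z axis is an assumption made only for this sketch.

```python
import numpy as np


def normal_at_gaze(model_points, gaze_position, radius=0.02):
    """Estimate the surface normal near the gaze position via PCA of neighbouring points."""
    nearby = model_points[np.linalg.norm(model_points - gaze_position, axis=1) <= radius]
    _, _, vt = np.linalg.svd(nearby - nearby.mean(axis=0))
    n = vt[-1]                                   # direction of smallest variance = normal
    return n / np.linalg.norm(n)


def rotation_facing_user(normal, view_dir=np.array([0.0, 0.0, -1.0]), max_angle=None):
    """Rotation that turns `normal` toward the viewer, optionally clamped to `max_angle` (radians)."""
    n = normal / np.linalg.norm(normal)
    v = view_dir / np.linalg.norm(view_dir)
    axis = np.cross(n, v)
    angle = np.arccos(np.clip(n @ v, -1.0, 1.0))
    if max_angle is not None:
        angle = min(angle, max_angle)            # suppress large posture changes of the object model
    if np.linalg.norm(axis) < 1e-9:
        return np.eye(3)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    # Rodrigues' rotation formula.
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
```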
  • FIG. 15 shows a display example when the gaze position is displayed by the method described above.
  • FIG. 15 shows a case where the object model is displayed with its vertical coordinate axis tilted toward the user, and where the adjustment of the position and orientation when displaying each gaze position is restricted to rotation around the vertical coordinate axis of the object model.
  • the position and orientation of the object model 1501 are adjusted so that the first gaze position 1502 faces the user.
  • For the subsequent gaze positions, since the vertical coordinate axis of the object model is tilted and only rotation around it is allowed, the gaze position cannot be made to completely face the user.
  • Therefore, the normal of the plane containing the gaze position is obtained in the same manner as described above, and the position and orientation of the object model are adjusted within the allowed range. The object model 1503 is displayed in a state in which the gaze position 1504 faces the user to a lesser degree than the gaze position 1502, that is, the gaze position can be confirmed but does not completely face the user.
  • a gaze position 1506 on an object model 1505 and a gaze position 1508 on an object model 1507 are also displayed in the same manner as the gaze position 1504.
  • As described above, the gaze position analysis system of this embodiment generates, from the first-person video, which is video similar to the user's field of view, a space model, which is a three-dimensional model of the space to which the user's line of sight is directed, together with the shooting position and orientation of the first-person video in the space model.
  • The object model is placed in the space model according to the position and orientation of the object on the space model estimated by matching the object model with the space model, and the gaze position on the object model is calculated by obtaining the intersection of the gaze direction and the object model on the space model.
  • As a result, from the line-of-sight information of the user, measured while the user moves freely in the space to be measured, the user's gaze positions can be automatically associated with the three-dimensional models of the various objects existing in that space.
  • 101 Line-of-sight measurement device
  • 102 Information processing device
  • 103 Input device
  • 104 Output device
  • 105 Storage device
  • 106 Space model creation program
  • 107 Object position/orientation estimation program
  • 108 Gaze position calculation program
  • 109 Gaze position display program
  • 110 First-person video data
  • 111 Gaze position data
  • 112 Object model data
  • 113 Space model data
  • 114 Object placement data

Abstract

The present invention comprises: a space model creation unit that creates a space model, which is a three-dimensional model of a surrounding space, from a plurality of captured images; an object position/posture estimation unit that matches the space model and an object model, and arranges the object model in the space model in accordance with the position/posture obtained by the matching; and a gaze position calculation unit that calculates the gaze position in the object model on the basis of the arranged object model and the gaze direction in the space model.

Description

Gaze position analysis system and gaze position analysis method
The present invention relates to a gaze position analysis system and a gaze position analysis method.
When analyzing a user's interests and work situation from line-of-sight information obtained with a line-of-sight measurement device worn by the user, the user moves freely in a three-dimensional space in which various objects are arranged. For this reason, it is desirable to be able to confirm where in the three-dimensional space the user was gazing, how the gaze position transitioned, and so on.
Techniques for analyzing line-of-sight information, acquired while a user wearing such a line-of-sight measurement device moves freely in a three-dimensional space, as gaze positions in the three-dimensional space are disclosed in Patent Document 1 and Patent Document 2.
Patent Document 1 discloses a technique in which a virtual three-dimensional space is generated from images photographed at a plurality of different photographing positions, the photographing positions in the virtual three-dimensional space are calculated, and the user's gaze position and gaze time in the virtual three-dimensional space are calculated from the user's gaze direction obtained at the timing at which each image was photographed.
On the other hand, Patent Document 2 discloses a technique for identifying the object the user was focusing on from the position of a display device in the three-dimensional space, the user's line of sight, and the positions of objects in the three-dimensional space.
JP 2020-135737 A
JP 2018-195319 A
In Patent Document 1, the data constituting the virtual three-dimensional space generated from a plurality of images may be coarse, or the data may be generated only partially; in such cases, it becomes difficult to determine what the user was gazing at.
On the other hand, Patent Document 2 assumes that the user's line of sight in the three-dimensional space and the positions of objects in the three-dimensional space can always be obtained, so the required equipment is large-scale and the scope of application is limited. In addition, the positions of all objects to be analyzed must be clarified in advance, which requires labor for advance preparation.
When analyzing gaze positions in a three-dimensional space, in addition to viewing how the gaze position transitioned through the entire three-dimensional space, it is desirable to be able to focus on a specific object and easily view how the gaze position transitioned on that object.
In this case, if the gaze position can be associated with the three-dimensional model of each object existing in the three-dimensional space, a more detailed analysis can be expected. Furthermore, if each object model can be placed in the three-dimensional space automatically, the advance settings required for analysis can be simplified.
An object of the present invention is to automatically associate the user's gaze position with the three-dimensional model corresponding to each object existing in a three-dimensional space, in a gaze position analysis system.
A gaze position analysis system according to one aspect of the present invention includes a space model creation unit, an object position/orientation estimation unit, and a gaze position calculation unit, and automatically associates an object model gaze position, which is the user's gaze position on an object model that is a three-dimensional model of an object existing in space, with the object model. The space model creation unit acquires a first-person video, which is video similar to the user's field of view, acquires a first-person video gaze position, which is the user's gaze position on the first-person video, creates a space model, which is a three-dimensional model of the space within the range to which the user's line of sight is directed, from the first-person video, and calculates the shooting position and orientation of the first-person video in the space model. The object position/orientation estimation unit estimates the position and orientation of the object in the space model by matching the object model with the space model, and places the object model in the space model using the estimated position and orientation. The gaze direction in the space model is calculated using the shooting position and orientation of the first-person video in the space model and the first-person video gaze position, and the gaze position calculation unit calculates the object model gaze position by obtaining the intersection of the gaze direction and the object model in the space model.
A gaze position analysis system according to one aspect of the present invention automatically associates the user's gaze position on an object model, which is a three-dimensional model of an object existing in space, with the object model, and includes: a space model creation unit that creates a space model, which is a three-dimensional model of the surrounding space, from a plurality of captured images; an object position/orientation estimation unit that matches the space model with the object model and places the object model on the space model according to the position and orientation obtained by the matching; and a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and the gaze direction in the space model.
According to one aspect of the present invention, in the gaze position analysis system, the user's gaze position can be automatically associated with the three-dimensional model corresponding to each object existing in the three-dimensional space.
FIG. 1 is a configuration diagram of a computer in the case where the gaze position analysis system of an embodiment of the present invention is executed by a general computer.
FIG. 2 is a diagram showing the basic configuration of the line-of-sight measurement device assumed by the present invention.
FIG. 3 is a diagram showing an example of the format of first-person video data.
FIG. 4 is a diagram showing an example of the format of gaze position data.
FIG. 5 is a diagram showing an example of the format of object model data.
FIG. 6 is a diagram showing an example of the format of space model data.
FIG. 7 is a diagram showing an example of the format of object placement data.
FIG. 8 is a diagram explaining the processing performed by the space model creation program.
FIG. 9 is a diagram showing an example of a flowchart of the processing executed by the gaze position calculation program.
FIG. 10 is a diagram explaining the processing executed by the gaze position calculation program.
FIG. 11 is a diagram showing an example of an AR marker placed on an actual object.
FIG. 12 is a diagram showing an example of an object model whose shape and size change.
FIG. 13 is a diagram showing an example of an object model and gaze positions on the object model.
FIG. 14 is a diagram showing an example of a case where each gaze position is displayed so as to face the user.
FIG. 15 is a diagram showing an example of displaying gaze positions while restricting how the position and orientation of the object model may be adjusted.
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a configuration diagram of a computer in the case where the gaze position analysis system according to an embodiment of the present invention is executed by a general computer.
The line-of-sight measurement device 101 in FIG. 1 is an input device that measures a first-person video, which is video similar to the user's field of view, and the gaze position on the first-person video, and records them in the databases storing the first-person video data 110 and the gaze position data 111, respectively. A device commonly available under the name of "eye tracker" or the like can be used.
In particular, the line-of-sight measurement device 101 in the present invention is assumed to be a portable device that can be worn by the user. As a result, the gaze direction can be measured while the user moves freely in space.
FIG. 2 shows the basic configuration of the line-of-sight measurement device 101 worn by the user.
Reference numeral 201 in FIG. 2 denotes a photographing device for acquiring the user's first-person video; a device equivalent to a camera generally used in personal computers and the like can be used. Reference numeral 202 denotes a spectacle-type device equipped with sensors for detecting the movement of the user's eyes and measuring the gaze direction. Reference numeral 203 denotes a terminal for recording the acquired first-person video data 110 and gaze position data 111 in the databases. The terminal 203 may also transmit the data to the information processing device 102.
 視線計測装置101としては、空間中を利用者が自由に移動できる状態での注視位置の計測が可能で、且つ、一人称映像上での注視位置の取得が可能であれば、携帯型に限らず、据え置き型の装置を用いても良い。 The line-of-sight measurement device 101 is not limited to a portable type, as long as it is possible to measure the gaze position in a state in which the user can move freely in space and acquire the gaze position on the first-person video. , a stationary device may be used.
The information processing device 102 in FIG. 1 is an information processing device for executing each program of the gaze position analysis system.
The input device 103 includes input devices of a general computer, such as a keyboard, buttons, a mouse, or a touch panel, for controlling the start and end of the system.
The output device 104 is a means for presenting the results of the gaze position analysis, the operating status of the system, and the like to the user, and includes the screen of a smartphone or tablet terminal, or a display device of a general computer.
Reference numeral 105 denotes a storage device for storing each program of the gaze position analysis system. The storage device 105 contains a space model creation program 106, an object position/orientation estimation program 107, a gaze position calculation program 108, and a gaze position display program 109.
Here, the information processing device 102 functions as a space model creation unit by executing processing according to the space model creation program 106, as an object position/orientation estimation unit by executing processing according to the object position/orientation estimation program 107, as a gaze position calculation unit by executing processing according to the gaze position calculation program 108, and as a gaze position display unit by executing processing according to the gaze position display program 109.
The database of first-person video data 110 stores first-person video data, that is, video corresponding to the user's field of view. As described above, the first-person video data 110 is assumed to be acquired by the line-of-sight measurement device 101.
FIG. 3 shows an example of the format of the first-person video data 110.
The data name 301 in FIG. 3 is a name given to the first-person video data; any string of characters and symbols can be used. The first-person video data 110 contains a plurality of images acquired at a predetermined time interval, or at arbitrary timing, in the order of acquisition, and the data count 302 in FIG. 3 indicates the number of images contained in the data.
Time 1 (303) indicates the time at which the first image was acquired, image 1 (304) is the first acquired image, and shooting position/orientation 1 (305) indicates the position and orientation of the camera that captured the first image. Similarly, time n (306) is the time at which the n-th image was acquired, image n (307) is the n-th image, and shooting position/orientation n (308) is the position and orientation of the camera that captured the n-th image. As described later, the shooting positions/orientations 305 and 308 are data calculated by the space model creation program 106 and are blank at the time each image is acquired.
Any format may be used for the first-person video data 110, such as a commonly used video format, as long as the image at each time in the data can be easily retrieved. The database of gaze position data 111 stores the gaze position data acquired by the line-of-sight measurement device 101.
FIG. 4 shows an example of the format of the gaze position data 111.
As described above, a gaze position is assumed to be expressed as position coordinates on the first-person video data 110, so each set of gaze position data 111 has corresponding first-person video data 110. The correspondence between the gaze position data 111 and the first-person video data 110 is therefore expressed by writing, in the data name 401, the same name as the data name 301 of the corresponding first-person video data.
The gaze position data 111 also contains a plurality of position coordinates acquired at a predetermined time interval, or at arbitrary timing, in the order of acquisition. The data count 402 in FIG. 4 indicates the number of position coordinates contained in the data. Reference numeral 403 denotes the time at which the position coordinates of the first gaze position were acquired, 404 denotes the position coordinates of the first acquired gaze position, and 405 holds the name of the object when the gaze position calculation program 108 has associated the gaze position data with an object model.
Reference numeral 406 denotes the time at which the position coordinates of the n-th gaze position were acquired, 407 denotes the position coordinates of the n-th acquired gaze position, and 408 holds the name of the object when the gaze position calculation program 108 has associated the gaze position data with an object model. At the time the gaze position data is acquired, no object model has yet been associated, so fields 405 and 408 are blank.
At the time the gaze position data 111 is acquired, each gaze position is a set of position coordinates on the first-person video, that is, two-dimensional coordinate data. When the gaze position data 111 is associated with an object model, it becomes a position on the object model and is rewritten as three-dimensional coordinate data.
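As a purely illustrative sketch of the record layout described for FIG. 4, the following Python fragment models one gaze position entry; the class and field names are assumptions introduced here for explanation and are not part of the embodiment itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class GazeRecord:
    """One entry of the gaze position data (fields 403/404/405 and 406/407/408)."""
    time: float                          # acquisition time
    position: Tuple[float, ...]          # 2D image coordinates at first; 3D after association
    object_name: Optional[str] = None    # name of the associated object model; None until set

def associate_with_object(record: GazeRecord,
                          model_position: Tuple[float, float, float],
                          object_name: str) -> GazeRecord:
    # Rewrite the 2D image coordinates as 3D coordinates on the object model
    # and record the object's name, as described above.
    return GazeRecord(time=record.time, position=model_position, object_name=object_name)
```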
The gaze position data 111 described above assumes that the gaze position data acquired by the line-of-sight measurement device 101 and the gaze position data associated with object models are managed as the same data. However, the gaze position data acquired by the line-of-sight measurement device 101 and the gaze position data associated with object models may instead be managed as separate data.
The database of object model data 112 stores three-dimensional models of the objects that exist in the space within the range toward which the user directs his or her line of sight. In the present invention, the three-dimensional model of each object is assumed to be represented as a collection of points describing the shape of the object, that is, as point cloud data.
Object models created with general three-dimensional CAD tools and the like are often represented as collections of polygons, but an object model represented by polygons can easily be converted into a point cloud model. For example, each polygon is first divided into triangles by lines connecting one vertex to the vertices not adjacent to it; each triangle is then repeatedly divided by a line connecting a vertex to the midpoint of the opposite edge; finally, by collecting the vertices of all resulting triangles, the object model represented by polygons can be expressed as point cloud data.
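The following is a minimal sketch of this conversion in Python (NumPy), under the assumption that each polygon is convex and given as an ordered list of 3D vertices; the function names, the rotation of the split vertex between recursion levels, and the subdivision depth are illustrative choices, not part of the embodiment.

```python
import numpy as np

def polygon_to_points(vertices, depth=4):
    """Convert one convex polygon (ordered 3D vertices) into sample points.

    The polygon is fan-triangulated from its first vertex; each triangle is
    repeatedly split by joining a vertex to the midpoint of the opposite edge,
    and the vertices of all final triangles are collected.
    """
    vertices = np.asarray(vertices, dtype=float)
    # Fan triangulation from vertex 0.
    triangles = [(vertices[0], vertices[i], vertices[i + 1])
                 for i in range(1, len(vertices) - 1)]

    def subdivide(tri, d):
        if d == 0:
            return [tri]
        a, b, c = tri
        m = (b + c) / 2.0          # midpoint of the edge opposite the first vertex
        # Rotate the vertex order so later splits use different edges and the
        # sample points spread over the whole triangle.
        return subdivide((b, m, a), d - 1) + subdivide((m, c, a), d - 1)

    points = []
    for tri in triangles:
        for t in subdivide(tri, depth):
            points.extend(t)
    # Deduplicate shared vertices.
    return np.unique(np.round(np.asarray(points), 6), axis=0)

# Example: a unit square becomes a small point cloud.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
cloud = polygon_to_points(square, depth=3)
```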
FIG. 5 shows an example of the format of an object model stored in the database of object model data 112.
Reference numeral 501 in FIG. 5 denotes the name given to the object; any string of characters and symbols can be used. Reference numeral 502 denotes the number of points contained in the object model, 503 denotes the position coordinates of the first point, and 504 denotes the position coordinates of the n-th point. Although the format shown in FIG. 5 contains only the position coordinates of the points, information attached to each point, such as color information, may also be included.
The database of space model data 113 stores a space model, which is a three-dimensional model of the space within the range toward which the user directs his or her line of sight. The space model is data created by the space model creation program 106 from the first-person video data 110 and, like the object models described above, is assumed to be represented as point cloud data.
FIG. 6 shows an example of the format of a space model stored in the database of space model data 113.
The model name 601 in FIG. 6 is a name given to the space model; any string of characters and symbols can be used. Reference numeral 602 denotes the name of the first-person video data used to create the space model, 603 denotes the number of points contained in the space model, 604 denotes the position coordinates of the first point, and 605 denotes the position coordinates of the n-th point. Although the format shown in FIG. 6 contains only the position coordinates of the points, information attached to each point, such as color information, may also be included.
The database of object placement data 114 stores data on the object models that have been matched to the space model by the object position/orientation estimation program 107 and placed on the space model.
FIG. 7 shows an example of the format of the object placement data 114 stored in its database.
Since the object placement data 114 is assumed to be stored for each space model, the name of the corresponding space model is written in the model name 701 in FIG. 7. The object count 702 indicates the number of object models placed on the target space model.
In FIG. 7, reference numeral 703 denotes the name of the first object placed on the space model, 704 denotes the position and orientation of the first object on the space model, and 705 denotes the time at which the first object was placed on the space model. Reference numeral 706 denotes the name of the n-th object placed on the space model, 707 denotes the position and orientation of the n-th object on the space model, and 708 denotes the time at which the n-th object was placed on the space model.
In the present invention, the gaze position on each object is analyzed from the first-person video data 110 and the gaze position data 111 acquired from the line-of-sight measurement device 101, by means of the space model creation program 106, the object position/orientation estimation program 107, and the gaze position calculation program 108. In this embodiment, in particular, the processing is assumed to be performed while the first-person video data 110 and the gaze position data 111 are acquired from the line-of-sight measurement device 101 in real time.
To this end, the space model creation program 106 first continuously reads newly stored data from among the data acquired from the line-of-sight measurement device 101 and stored in the database of first-person video data 110, and creates the space model.
As a technique for creating a space model represented by point cloud data from video data, a combination of the well-known SLAM (Simultaneous Localization and Mapping) and MVS (Multi-View Stereo) methods can be used.
SLAM is a technique that creates coarse point cloud data by analyzing the correspondences between a series of consecutive images obtained while the camera is moved. The MVS method, on the other hand, is a technique that creates more detailed, dense point cloud data by using the analysis results of SLAM.
FIG. 8 illustrates how a space model consisting of point cloud data is created using the SLAM and MVS methods.
In FIG. 8, reference numeral 801 denotes an object that exists within the range over which the user's line of sight moves; for simplicity, it is assumed that no other objects exist in the surroundings. Reference numerals 802 and 804 denote the positions of the camera when the first-person video was captured, and 803 and 805 illustrate the camera's field of view and orientation when the first-person video was captured from the respective camera positions.
As shown at 802 and 804, a space model represented by point cloud data, as shown at 806, is created by using images of the same object or the same location captured from multiple positions.
Although only two camera positions are shown in FIG. 8, a larger number of images is generally used in order to create a more accurate space model 806. By using the techniques described above, the space model 806 can be created in real time as the user's gaze position moves. Besides the above, any technique may be used in the space model creation program 106 as long as it can likewise create a space model represented by point cloud data.
Furthermore, by using the SLAM technique, the position and orientation of the camera that captured the first-person video on the created space model can be calculated at the same time. As described later, the camera position and orientation are information required for calculating the gaze position on the object model. Therefore, when a space model creation technique that cannot calculate the camera position and orientation is used, a separate means for obtaining the camera position and orientation on the space model must be employed. Any technique for obtaining a position and orientation in space can be used, for example a sensor that measures position and orientation.
The object position/orientation estimation program 107 is a program for matching the point cloud data of an object model to part of the point cloud data of the space model created by the space model creation program 106, that is, fitting the object model so that its position and orientation agree well with the space model, and thereby determining the position and orientation of the object model in the space model.
As techniques used in the object position/orientation estimation program 107, the well-known ICP (Iterative Closest Point) algorithm or NDT (Normal Distributions Transform) algorithm can be used. Alternatively, any technique may be used as long as it can match point clouds to each other and determine the position and orientation of the object model in the space model. The timing at which the object position/orientation estimation program 107 performs its processing is controlled by the gaze position calculation program 108 described later.
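As one concrete possibility, the following is a minimal point-to-point ICP iteration in Python (NumPy/SciPy). It is a sketch of the general, well-known algorithm under illustrative convergence settings, not the implementation of the embodiment.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(object_pts, space_pts, iterations=50, tol=1e-6):
    """Fit object_pts (N,3) to space_pts (M,3); return rotation R and translation t."""
    R, t = np.eye(3), np.zeros(3)
    src = object_pts.copy()
    tree = cKDTree(space_pts)
    prev_err = np.inf
    for _ in range(iterations):
        # 1. Nearest space-model point for every object-model point.
        dist, idx = tree.query(src)
        tgt = space_pts[idx]
        # 2. Best rigid transform between the matched pairs (Kabsch / SVD).
        src_c, tgt_c = src.mean(axis=0), tgt.mean(axis=0)
        H = (src - src_c).T @ (tgt - tgt_c)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T
        t_step = tgt_c - R_step @ src_c
        # 3. Apply the step and accumulate the overall transform.
        src = (R_step @ src.T).T + t_step
        R, t = R_step @ R, R_step @ t + t_step
        err = dist.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R, t
```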
The gaze position calculation program 108 is a program that determines the gaze position on an object model, using the gaze position data 111 acquired by the line-of-sight measurement device 101, the object model data 112, and the space model data 113 created by the space model creation program 106.
The flow of processing of the gaze position calculation program 108 will be described with reference to the flowchart of FIG. 9.
In step 901 of FIG. 9, new gaze position data 111 is obtained from the gaze position data 111 stored in its database. Alternatively, new gaze position data may be obtained directly from the line-of-sight measurement device 101.
In step 902, the gaze direction on the space model is calculated using the newly obtained gaze position data 111 and the shooting positions/orientations 305 and 308 that were calculated when the space model was created by the space model creation program 106 and written into the first-person video data 110.
The gaze direction on the space model is represented by a vector that indicates the starting point of the gaze and the line-of-sight direction from that point. To calculate the gaze direction on the space model, the shooting position/orientation of the first-person video at the same time as the newly obtained gaze position data 111 is first obtained from the first-person video data 110.
If no shooting position/orientation exists at exactly the same time, the shooting positions/orientations of the first-person video data 110 at the times immediately before and after the time of the gaze position data 111 can, for example, be obtained, and the shooting position/orientation interpolated according to its relation to the time of the gaze position data 111 can be used; a sketch of one such interpolation is given below. Next, using the obtained shooting position/orientation, the gaze position on the first-person video is converted into a gaze position on the space model by coordinate transformation.
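A minimal sketch of such interpolation in Python (NumPy/SciPy), under the assumption that the orientation part of each shooting position/orientation is stored as a quaternion; the record layout is an assumption for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_pose(t, t0, pos0, quat0, t1, pos1, quat1):
    """Interpolate a camera pose at time t between the poses at times t0 and t1.

    Positions are interpolated linearly, orientations by spherical linear
    interpolation (slerp) of the two quaternions.
    """
    w = (t - t0) / (t1 - t0)
    position = (1.0 - w) * np.asarray(pos0) + w * np.asarray(pos1)
    slerp = Slerp([t0, t1], Rotation.from_quat([quat0, quat1]))
    orientation = slerp(t)            # Rotation at time t
    return position, orientation.as_quat()
```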
As described above, the gaze position data 111 is expressed as a gaze position on the first-person video, but by applying information such as the camera's viewing angle and focal length to the well-known pinhole camera model, the positional relationship among the shooting position, the first-person video, and the gaze position can be obtained at the same scale as the actual space. Furthermore, by transforming this positional relationship with the shooting position/orientation obtained from the first-person video data 110, the positional relationship among the camera's shooting position, the first-person video, and the gaze position on the space model can be represented, as shown in FIG. 10.
In FIG. 10, reference numeral 1001 denotes the space model, 1002 denotes the shooting position on the space model 1001 of the camera that captured the first-person video, and 1003 denotes the shooting range of the first-person video on the space model 1001, which corresponds to the camera's shooting orientation.
Reference numeral 1004 denotes the gaze position on the first-person video mapped onto the space model 1001. The gaze direction on the space model 1001 can be obtained as a vector 1005 in FIG. 10 that starts at 1002 and passes through the gaze position 1004 on the first-person video.
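The following Python (NumPy) sketch illustrates how such a gaze direction vector can be formed from the 2D gaze position and the camera pose using the pinhole camera model; the intrinsic parameters and the pose representation are assumptions made for illustration.

```python
import numpy as np

def gaze_ray(gaze_px, fx, fy, cx, cy, cam_R, cam_t):
    """Return (origin, unit direction) of the gaze ray in space-model coordinates.

    gaze_px        : (u, v) gaze position in image pixels
    fx, fy, cx, cy : pinhole intrinsics (focal lengths and principal point)
    cam_R          : (3,3) rotation of the camera in the space model
    cam_t          : (3,)  position of the camera in the space model
    """
    u, v = gaze_px
    # Back-project the pixel onto the normalized image plane (camera coordinates).
    d_cam = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    # Rotate into space-model coordinates; the ray starts at the camera position.
    d_world = cam_R @ d_cam
    return np.asarray(cam_t, dtype=float), d_world / np.linalg.norm(d_world)
```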
In step 903 of FIG. 9, it is determined whether an object model placed on the space model 1001 exists in the gaze direction on the space model 1001 obtained in step 902. If it is determined that such an object model exists in the gaze direction, the process proceeds to step 906; otherwise, it proceeds to step 904.
As a method of making this determination, first, information on the object models placed on the target space model 1001 is obtained from the object placement data 114, and the point cloud data of the corresponding object models is read from the object model data 112 and placed on the space model. Next, the distance between a vector representing the gaze direction on the space model 1001 (hereinafter, the gaze direction vector), for example the vector 1005 in FIG. 10, and each point of the object model placed on the space model 1001 is calculated, and the smallest distance to the gaze direction vector is selected.
If the selected distance is equal to or less than a predetermined threshold, it can be determined that the target object model exists in the gaze direction. As the distance threshold, for example, the largest inter-point distance in the point cloud data constituting the object model, or half of that largest distance, can be chosen. As an alternative determination method, the target object model may be determined to exist in the gaze direction when the number of points whose distance from the gaze direction vector on the space model is equal to or less than a predetermined threshold is itself equal to or greater than a predetermined number.
Alternatively, in addition to the point cloud data, polygon data, that is, data represented as a set of polygons, may also be stored as object models in the database of object model data 112; the polygon data of the corresponding object model is likewise placed on the space model 1001, and when there is a point at which any polygon placed on the space model 1001 intersects the vector representing the line-of-sight direction placed on the space model, the target object model may be determined to exist in the gaze direction.
Furthermore, when a plurality of objects are placed on the space model 1001, the position along the gaze direction vector of the point closest to the gaze direction vector 1005, or of the polygon intersected by the gaze direction vector 1005, is determined, and the object model containing the point or polygon closest to the starting point of the gaze direction vector 1005, that is, to the shooting position, may be selected as the object model existing in the gaze direction.
Furthermore, a virtual space capable of representing the same positional relationships as the space model 1001 may be prepared; each time an object is placed on the space model 1001, the object model is also placed in the virtual space, and the determination processing described above is performed in the virtual space.
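A minimal Python (NumPy) sketch of the point cloud variant of this determination, i.e. checking whether any point of a placed object model lies within a threshold distance of the gaze direction vector, and selecting the nearest such object when several are placed; the threshold and array shapes are assumptions for illustration.

```python
import numpy as np

def point_ray_distances(points, origin, direction):
    """Perpendicular distance of each point (N,3) from the ray (origin, unit direction)."""
    v = points - origin
    proj = v @ direction                       # signed distance along the ray
    closest = origin + np.outer(proj, direction)
    return np.linalg.norm(points - closest, axis=1), proj

def object_in_gaze_direction(object_points, origin, direction, threshold):
    """Step-903-style test: is this placed object model hit by the gaze direction vector?"""
    dist, proj = point_ray_distances(object_points, origin, direction)
    hits = (dist <= threshold) & (proj > 0.0)  # only points in front of the camera
    return bool(hits.any())

def nearest_gazed_object(models, origin, direction, threshold):
    """Among several placed object models (name -> points), pick the one nearest the camera."""
    best_name, best_proj = None, np.inf
    for name, pts in models.items():
        dist, proj = point_ray_distances(pts, origin, direction)
        hits = (dist <= threshold) & (proj > 0.0)
        if hits.any() and proj[hits].min() < best_proj:
            best_name, best_proj = name, proj[hits].min()
    return best_name
```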
In step 904, the object position/orientation estimation program 107 is used to match an object model to the point cloud data of the space model 1001 that exists in the gaze direction on the space model 1001. In FIG. 10, this processing makes it possible to replace part of the space model 1001 with the target object model 1006 by adjusting the position and orientation of the object model 1006 with respect to the space model 1001 in various ways.
As the point cloud data of the space model 1001 existing in the line-of-sight direction, for example, the point cloud data of the space model whose distance from the gaze direction vector 1005 is equal to or less than a predetermined distance can be selected. As the predetermined distance, the maximum size of the object model to be matched, half of that maximum size, or the like can be used.
Alternatively, the point cloud data existing within a region of a specific shape of a predetermined size centered on the gaze direction vector 1005 may be used. It is also possible to determine the position of each point of the point cloud data along the gaze direction vector and to select points within a predetermined range centered on the location where the point cloud data is most concentrated along the gaze direction vector. Besides the above, any method may be used as long as it can select the point cloud data to be matched with the gaze direction vector as the reference.
The object models matched to the point cloud data of the space model may be all object models stored in the database of object model data 112. Alternatively, a well-known QR code (registered trademark), an AR marker, or a symbol or character string representing the name of the object may be placed on the actual object and read using well-known image recognition or character recognition techniques, and the corresponding object model may be selected from the object model data 112 and matched.
Reference numeral 1101 in FIG. 11 shows an example of an AR marker placed on an actual object. Alternatively, well-known point cloud features (three-dimensional features) may be extracted from the point cloud data of the space model and the point cloud data of the object models, and an object model having point cloud features similar to those of the point cloud data of the space model may be selected from the object model data 112 and matched.
In a case such as assembly work, the shape and size of the object existing at a particular location change over time. In such a case, as shown at 1201, 1202, and 1203 in FIG. 12, object models representing the state of the object at each step of the work, together with their order relationship, are stored in the object model data 112, and by matching them to the space model in order, the object model corresponding to the current step of the work can be matched to the space model.
Specifically, for example, if no object model has yet been matched to the point cloud data of the space model existing in the gaze direction, the object model corresponding to the first step is selected and matched.
On the other hand, if an object model corresponding to some step has already been matched to the point cloud data of the space model existing in the gaze direction, the object model corresponding to the next step is selected and matched to the point cloud data of the space model; if the object model corresponding to the next step is matched to the space model with a higher degree of agreement than that of the already matched object model, the already matched object model is replaced with the object model corresponding to the next step.
As the degree of agreement, the same measure as that used in the processing of step 905 described later can be used. Furthermore, the matching of the object model corresponding to the next step to the space model may be performed when the gaze position moves onto the already matched object model, or when a change in the space model is detected by the space model creation program 106.
A change in the space model can be detected in the space model creation program 106 by, for example, comparing the space model generated from a newly acquired first-person video image and a predetermined number of immediately preceding first-person video images with the space model generated before that, and determining that the space model has changed if the difference between the two is equal to or greater than a predetermined threshold.
The difference between two space models can be calculated, for example, by searching, for each point of one space model, for the closest point of the other space model, calculating the distance to the found point, and averaging the distances over all points of the one space model. Alternatively, the number of points for which the calculated distance is equal to or greater than a predetermined threshold may be used. Any other method may be used as long as it can calculate a difference between point clouds.
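A minimal Python (NumPy/SciPy) sketch of this difference measure between two point clouds; both variants described above are computed, and the threshold value is an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_difference(cloud_a, cloud_b, threshold=None):
    """Difference between point clouds cloud_a (N,3) and cloud_b (M,3).

    Returns the mean nearest-neighbor distance from cloud_a to cloud_b and,
    if a threshold is given, the number of points of cloud_a farther than it.
    """
    dist, _ = cKDTree(cloud_b).query(cloud_a)      # nearest point of cloud_b for each point of cloud_a
    mean_diff = float(dist.mean())
    count_over = int((dist >= threshold).sum()) if threshold is not None else None
    return mean_diff, count_over
```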
In step 905, it is determined whether an object model has been correctly matched to the point cloud data of the space model existing in the gaze direction on the space model. If it has been correctly matched, the matching result of the object model is stored in the database of object placement data 114 and the process proceeds to step 906; otherwise, the process proceeds to step 908.
As a method of determining whether an object model has been correctly matched to the point cloud data of the space model, the degree of agreement between the point cloud data of the space model and the object model placed on the space model based on the result of the matching processing is calculated, and the object model may be determined to have been correctly matched if the degree of agreement is greater than a predetermined value.
When a plurality of object models are matched to the point cloud data of the space model, the highest degree of agreement is selected; if the selected degree of agreement is greater than the predetermined value, it is determined that an object model has been correctly matched, and the object model corresponding to the selected degree of agreement is judged to have been matched to the point cloud data of the space model.
As the degree of agreement, for example, for each point of the object model placed on the space model using the matching result, the closest point of the point cloud data of the space model is searched for, the number of points for which the found distance is smaller than a predetermined threshold is counted, and the ratio of that number to the total number of points of the object model can be used. Besides the above, any index may be used as the degree of agreement between the point cloud data of the space model and the object model, as long as it can evaluate the quality of the matching result between point clouds.
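A Python (NumPy/SciPy) sketch of this inlier-ratio style degree of agreement; the pose representation (R, t) and the threshold are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def matching_agreement(object_points, space_points, R, t, threshold):
    """Ratio of object-model points that land close to the space model after matching.

    object_points : (N,3) object model point cloud (model coordinates)
    space_points  : (M,3) space model point cloud
    R, t          : rotation and translation obtained from the matching step
    """
    placed = (R @ object_points.T).T + t            # object model placed on the space model
    dist, _ = cKDTree(space_points).query(placed)   # nearest space-model point per model point
    return float((dist < threshold).mean())         # fraction of inliers in [0, 1]
```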
In step 906, the gaze position on the object model placed on the space model is calculated using the matching result of step 904. Since the relationship between the object model placed on the space model and the gaze direction vector on the space model is that shown at 1005 in FIG. 10, processing similar to that of step 903 described above, that is, the processing for determining whether an object model placed on the space model exists in the gaze direction on the space model, can be used.
However, whereas step 903, in the case where the object model is point cloud data, focused on whether any point exists whose distance from the gaze direction vector 1005 is equal to or less than a predetermined threshold, the calculation of the gaze position differs in that the point whose distance from the gaze direction vector is equal to or less than the predetermined threshold and which is closest to the shooting position 1002 is selected and taken as the gaze position.
Alternatively, when there are a plurality of points whose distance from the gaze direction vector 1005 is equal to or less than the predetermined threshold and which lie within a predetermined range from the point closest to the shooting position 1002, the average of the point closest to the shooting position 1002 and the points within the predetermined range from it may be used as the gaze position.
Alternatively, when the object model is represented as a collection of polygons, the intersections between the gaze direction vector and each polygon of the object model may be calculated, and the intersection closest to the shooting position 1002 may be taken as the gaze position.
Since the gaze position obtained as described above is a gaze position on the space model, in step 906 the obtained gaze position is coordinate-transformed using the position and orientation of the object model on the space model obtained from the matching result of step 904. As a result, the gaze position 1004 on the space model 1001 is converted into the gaze position 1007 on the object model 1006, that is, into the coordinate system of the object model.
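A Python (NumPy) sketch of this part of step 906 for the point cloud case: the gaze point is taken as the model point nearest the shooting position among the points within the threshold distance of the gaze direction vector, and is then expressed in the object model's coordinate system. The pose (R, t) stands for the object placement obtained from the matching step and is an assumption of the sketch.

```python
import numpy as np

def gaze_point_on_object(placed_points, origin, direction, threshold):
    """Gaze point on an object model already placed on the space model.

    placed_points     : (N,3) object model points in space-model coordinates
    origin, direction : gaze direction vector (origin and unit direction) from the shooting position
    """
    v = placed_points - origin
    proj = v @ direction
    closest = origin + np.outer(proj, direction)
    dist = np.linalg.norm(placed_points - closest, axis=1)
    hits = (dist <= threshold) & (proj > 0.0)
    if not hits.any():
        return None
    # Among the points near the ray, take the one nearest the shooting position.
    idx = np.where(hits)[0][np.argmin(proj[hits])]
    return placed_points[idx]

def to_object_coordinates(gaze_point, R, t):
    """Convert a gaze point from space-model coordinates into the object model's coordinate
    system, where R, t place the object model on the space model (p_space = R @ p_model + t)."""
    return R.T @ (np.asarray(gaze_point) - t)
```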
Furthermore, when an object model is placed on the space model in step 905, the gaze position on the placed object model may also be calculated in step 906 for past gaze position data. This is because, for example in the initial stage of processing of the gaze position calculation program 108, the point cloud data for each object on the space model may still be sparse and the object model may not yet be placeable on the space model.
In step 907, the gaze position on the object model obtained in step 906 is stored in the gaze position data 111. When the gaze position on the object model is stored, the shooting position/orientation of the first-person video on the space model, the gaze position in the space model, the position and orientation of the object model on the space model, and the like may also be stored together.
In step 908, the processing ends if an end instruction has been given from the input device 103 or the like; otherwise, the process returns to step 901.
In the gaze position calculation program 108 described above, the first-person video data 110 and the gaze position data 111 are acquired in real time, the object model data 112 is placed on the space model while the space model data 113 is generated, and the gaze position on the object model is calculated. However, after the acquisition of the target first-person video data 110 and gaze position data 111 has been completed, the processing of the gaze position calculation program 108 may instead be executed while reading, in order, the data stored in the first-person video data 110 and the gaze position data 111.
In that case, the gaze position on the object model may be calculated after the space model has been created, or after the space model has been created and the object models have been placed on the space model. Also, in that case, the processing described above may be performed after object models have been manually placed at locations on the space model where no object model has been placed.
Alternatively, after the processing of the gaze position calculation program 108 has been executed, the gaze position calculation program 108 may be executed again using the space model data 113 and the object placement data 114 generated by the earlier execution. This is because, for example in the initial stage of processing of the gaze position calculation program 108, the point cloud data for each object on the space model may still be sparse and object models may not yet be placeable on the space model.
In such a case, some of the gaze positions stored in the gaze position data 111 may remain without a corresponding gaze position on an object model ever being calculated. In this case, the gaze position calculation program 108 is executed with the following modifications.
The first modification is that in step 901 the gaze position data is obtained by reading the data stored in the gaze position data 111 in the order in which it was stored. In addition, a process is added that determines whether the gaze position on an object model has already been calculated for the obtained gaze position data and, if so, proceeds to step 908.
The second modification is that in step 903, when no object model is placed in the gaze direction on the space model, the process proceeds to step 908 instead of step 904.
The third modification is to remove steps 904 and 905. By making the above three modifications to the gaze position calculation program 108, the gaze position on an object model can be calculated for gaze positions for which this calculation has not yet been performed. Also, in that case, the processing described above may be performed after object models have been manually placed at locations on the space model where no object model has been placed.
The gaze position display program 109 is a program that displays the gaze position data in response to instructions from the user. The gaze positions are displayed by methods well known in gaze analysis, for example a heat map representing the frequency distribution of gaze positions, or a method that visualizes, for example by the size of a circle, the length of time for which the gaze position remained within a certain range.
In addition to displaying the gaze positions on the first-person video, the gaze positions on the object models are also displayed. As a method of displaying gaze positions on an object model, a virtual space model, that is, a virtual space corresponding to the space model, is prepared, the object models are placed in the virtual space model on the basis of the data stored in the object placement data 114, and the gaze positions are displayed on the object models placed in the virtual space model. Alternatively, only a particular object model may be selected and the gaze positions displayed only on the selected object model.
Furthermore, when gaze positions are displayed on an object model, the virtual space model or the object model may be adjusted so that each gaze position always faces the user.
Reference numeral 1301 in FIG. 13 shows an example of an object model, and 1302, 1303, 1304, and 1305 show examples of gaze positions on the object model. FIG. 14 shows an example in which the gaze positions shown in FIG. 13 are displayed so as to always face the user.
In FIG. 14, reference numeral 1401 denotes the object model displayed so that the gaze point 1302 faces the user, and 1402 denotes that gaze point displayed on 1401. Reference numeral 1403 denotes the object model displayed so that the gaze point 1303 faces the user, and 1404 denotes that gaze point displayed on 1403. Reference numeral 1405 denotes the object model displayed so that the gaze point 1304 faces the user, and 1406 denotes that gaze point displayed on 1405. Reference numeral 1407 denotes the object model displayed so that the gaze point 1305 faces the user, and 1408 denotes that gaze point displayed on 1407.
In the display method shown in FIG. 14, the gaze positions are displayed by adjusting the object model so that each gaze position fully faces the user. To do this, for example when the object model is point cloud data, points within a predetermined range containing the gaze position are selected, the direction of the normal of the plane containing the gaze position is determined from the selected points, and the position and orientation of the object model are adjusted so that the obtained normal points toward the user. When the object model is represented as a collection of polygons, the polygon containing the gaze position is selected from the object model, and the position and orientation of the object model are adjusted so that its normal points toward the user.
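A Python (NumPy) sketch of the point cloud variant: the local plane normal around the gaze position is estimated by principal component analysis of the neighboring points, and the rotation that turns this normal toward the viewing direction is computed. The neighborhood radius and the choice of "toward the user" direction are assumptions of the sketch.

```python
import numpy as np

def local_normal(points, gaze_pos, radius):
    """Estimate the normal of the plane through the points near the gaze position (PCA)."""
    nearby = points[np.linalg.norm(points - gaze_pos, axis=1) <= radius]
    centered = nearby - nearby.mean(axis=0)
    # The normal is the direction of least variance: the last right-singular vector.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1] / np.linalg.norm(vt[-1])

def rotation_facing_user(normal, view_dir=np.array([0.0, 0.0, -1.0])):
    """Rotation matrix that turns `normal` so it points along `view_dir` (toward the viewer)."""
    a = normal / np.linalg.norm(normal)
    b = view_dir / np.linalg.norm(view_dir)
    v, c = np.cross(a, b), float(a @ b)
    if np.isclose(c, -1.0):
        # Opposite vectors: rotate 180 degrees about any axis orthogonal to a.
        axis = np.cross(a, np.array([1.0, 0.0, 0.0]))
        if np.linalg.norm(axis) < 1e-8:
            axis = np.cross(a, np.array([0.0, 1.0, 0.0]))
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + K + K @ K / (1.0 + c)     # Rodrigues' formula aligning a with b
```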
When displaying the gaze positions, restrictions may also be placed on how the position and orientation of the displayed object model may be adjusted. Furthermore, by adjusting the degree to which each gaze position faces the user, large changes in the orientation of the object model when displaying the gaze positions can be suppressed.
For example, the first gaze position is displayed facing the user, and for subsequent gaze positions the position and orientation of the object model are adjusted so that the degree to which the gaze position faces the user does not exceed a predetermined range. As the degree to which a gaze position faces the user, the angle or the inner product between the normal direction of the plane containing the gaze position and the direction facing the user, that is, the direction perpendicular to the screen, can be used.
FIG. 15 shows a display example in which the gaze positions are displayed by the method described above.
FIG. 15 shows a case in which, when the object model is displayed, its vertical coordinate axis is tilted toward the user, and only rotation about the vertical coordinate axis of the object model is permitted as the method of adjusting the position and orientation when displaying a gaze position. The position and orientation of the object model 1501 are adjusted so that the first gaze position 1502 faces the user.
In FIG. 15, however, the vertical coordinate axis of the object model is tilted, so the gaze positions cannot be made to face the user completely; the position and orientation are therefore adjusted so that they face the user as closely as possible. Specifically, the normal of the plane containing the gaze position is determined in the same manner as described above, and the position and orientation of the object model are adjusted so that the difference between the obtained normal and the direction facing the user, that is, the direction perpendicular to the screen, becomes smallest. In the object model 1503, the gaze position 1504 is displayed facing the user to a lesser degree than the gaze position 1502, that is, the gaze position can be confirmed but does not directly face the user. The gaze position 1506 on the object model 1505 and the gaze position 1508 on the object model 1507 are displayed in the same manner as the gaze position 1504.
As described above, the gaze position analysis system according to the embodiment of the present invention generates, from the first-person video, which is a video corresponding to the user's field of view, a space model, which is a three-dimensional model of the space within the range toward which the user directed his or her line of sight, and calculates the shooting position/orientation of the first-person video in the space model; calculates the gaze direction on the space model using the shooting position/orientation of the first-person video on the space model and the gaze position on the first-person video; places the object model, which is a three-dimensional model of an object, in the space model according to the position and orientation of the object on the space model estimated by matching the object model with the space model; and calculates the gaze position on the object model by determining the intersection of the gaze direction on the space model with the object model.
According to the embodiment of the present invention, for a line-of-sight measurement device worn by the user, the user's gaze positions can be automatically associated with the three-dimensional models of the various objects existing in the space, on the basis of the user's line-of-sight information measured while the user moves freely through the space to be measured.
101 line-of-sight measurement device
102 information processing device
103 input device
104 output device
105 storage device
106 space model creation program
107 object position/orientation estimation program
108 gaze position calculation program
109 gaze position display program
110 first-person video data
111 gaze position data
112 object model data
113 space model data
114 object placement data

Claims (15)

1.  A gaze position analysis system comprising a space model creation unit, an object position/orientation estimation unit, and a gaze position calculation unit, the system automatically associating an object model gaze position, which is a user's gaze position on an object model, with the object model, the object model being a three-dimensional model of an object existing in a space, wherein
    the space model creation unit
    acquires a first-person video, which is a video corresponding to the user's field of view,
    acquires a first-person video gaze position, which is the user's gaze position on the first-person video,
    creates, from the first-person video, a space model, which is a three-dimensional model of the space within the range toward which the user directed his or her line of sight, and
    calculates a shooting position/orientation of the first-person video in the space model;
    the object position/orientation estimation unit
    estimates a position and orientation of the object in the space model by matching the object model with the space model,
    places the object model in the space model using the position and orientation of the object in the space model, and
    calculates a gaze direction in the space model using the shooting position/orientation of the first-person video in the space model and the first-person video gaze position; and
    the gaze position calculation unit
    calculates the object model gaze position by determining an intersection of the gaze direction in the space model with the object model.
2.  The gaze position analysis system according to claim 1, wherein the object position/orientation estimation unit,
    when data to which the object model has not been matched exists in the gaze direction in the space model, estimates the position and orientation of the object in the space model by matching the object model with the data in the space model to which the object model has not been matched.
3.  The gaze position analysis system according to claim 1, wherein the object position/orientation estimation unit
    selects, from the data in the space model, data in the space model existing within a predetermined range with the gaze direction as a reference, and
    estimates the position and orientation of the object in the space model by matching the object model with the selected data in the space model.
4.  The gaze position analysis system according to claim 3, wherein the object position/orientation estimation unit
reads a preset image pattern, character string, or symbol string from the first-person video,
selects the object model corresponding to the image pattern, character string, or symbol string,
calculates the position of the image pattern, character string, or symbol string in the space model from the space model, the shooting position and orientation of the first-person video in the space model, and the position of the image pattern, character string, or symbol string in the first-person video,
selects data in the space model that exists within the predetermined range with respect to the position of the image pattern, character string, or symbol string in the space model, and
estimates the position and orientation of the object in the space model by matching the selected object model against the selected data in the space model.
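One possible, purely illustrative way to read a preset character string from the first-person video and localize it in the space model, as described in claim 4. The OCR library (pytesseract), the label-to-model table, and the availability of a depth map aligned with the frame are assumptions, not part of the claim.

```python
# Sketch only: detect a known label string and back-project it into the space model.
import numpy as np
import pytesseract

LABEL_TO_MODEL = {"PUMP-01": "pump_model.ply", "VALVE-07": "valve_model.ply"}  # hypothetical

def locate_label(frame, depth, cam_pos, cam_rot, fx, fy, cx, cy):
    """Return (object model file, 3D label position in the space model) or None."""
    data = pytesseract.image_to_data(frame, output_type=pytesseract.Output.DICT)
    for text, left, top, w, h in zip(data["text"], data["left"], data["top"],
                                     data["width"], data["height"]):
        label = text.strip()
        if label not in LABEL_TO_MODEL:
            continue
        u, v = left + w / 2.0, top + h / 2.0        # centre of the detected string
        z = float(depth[int(v), int(u)])            # depth sampled from the space model
        p_cam = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
        p_world = cam_rot @ p_cam + cam_pos         # position in the space model
        return LABEL_TO_MODEL[label], p_world
    return None
```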
5.  The gaze position analysis system according to claim 1, wherein the object position/orientation estimation unit
calculates a difference between data representing a space model created from the first-person video at a certain time and the data, within the already created space model, of the range corresponding to the first-person video at that time,
replaces, when the calculated difference is larger than a predetermined threshold, the data of the corresponding range in the space model with the data representing the space model created from the first-person video,
performs matching of the object model against the data at the replaced location, and
replaces the object model that had been matched against the data before the replacement with the newly matched object model.
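A minimal sketch of the difference test in claim 5, assuming both the already created space model and the data rebuilt from the current frame are compared as voxel occupancy over a common grid; the voxel size and threshold are illustrative.

```python
# Sketch only: voxel-occupancy difference between old and newly built space-model data.
import numpy as np

def voxel_set(points, voxel=0.05):
    """Quantise a point cloud into a set of occupied voxel indices."""
    return {tuple(idx) for idx in np.floor(points / voxel).astype(int)}

def region_changed(old_points, new_points, voxel=0.05, threshold=0.3):
    """True if the overlapping region differs by more than `threshold`."""
    old_v, new_v = voxel_set(old_points, voxel), voxel_set(new_points, voxel)
    union = old_v | new_v
    if not union:
        return False
    difference = len(old_v ^ new_v) / len(union)     # symmetric-difference ratio
    return difference > threshold
```

When the function returns True, the corresponding range of the space model would be replaced by the newly created data and object-model matching repeated there.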
6.  The gaze position analysis system according to claim 1, further comprising an object model storage unit that stores, for an object whose shape or structure changes, the object models corresponding to the respective stages of the change and the order relationship of the changes, wherein
the object position/orientation estimation unit, for a location to which the object model corresponding to the changing object has been matched, selects from the object model storage unit, each time a change in the space model is detected, the object model corresponding to the stage of change of the object, and matches the selected object model against the space model.
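A minimal sketch of the object model storage unit of claim 6, keeping an ordered list of stage models per changing object and advancing to the next stage model each time a change is detected at the matched location; the object and file names are hypothetical.

```python
# Sketch only: ordered stage models for objects whose shape or structure changes.
class ChangingObjectModelStore:
    def __init__(self):
        # object name -> ordered stage models (the order encodes the change sequence)
        self.stages = {"assembly_A": ["stage0.ply", "stage1.ply", "stage2.ply"]}
        self.current = {name: 0 for name in self.stages}

    def next_stage_model(self, name):
        """Advance to the model of the next change stage and return it."""
        idx = min(self.current[name] + 1, len(self.stages[name]) - 1)
        self.current[name] = idx
        return self.stages[name][idx]
```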
7.  The gaze position analysis system according to claim 1, wherein the gaze position calculation unit, when any object model is newly matched to any location in the space model, determines, for the previously obtained object model gaze positions, whether an object model gaze position lies on the object model newly matched onto the space model.
8.  The gaze position analysis system according to claim 1, wherein
the object position/orientation estimation unit allows the user to manually match the object model to a location in the space model to which the object model has not been matched, and
the gaze position calculation unit determines, for all object model gaze positions that have already been obtained, whether an object model gaze position lies on the object model manually matched onto the space model by the user.
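A rough sketch of the re-association check in claims 7 and 8: previously obtained object model gaze positions (3D points in space-model coordinates) are tested against a newly matched object model, here approximated by distance to the model's transformed vertices with an illustrative tolerance.

```python
# Sketch only: find which stored gaze positions fall on a newly matched model.
import numpy as np

def reassociate(gaze_points, model_vertices, tol=0.02):
    """Return indices of previous gaze positions lying on the new object model."""
    hits = []
    for i, p in enumerate(gaze_points):
        dist = np.min(np.linalg.norm(model_vertices - p, axis=1))
        if dist < tol:                      # close enough to the model surface
            hits.append(i)
    return hits
```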
9.  The gaze position analysis system according to claim 1, wherein the gaze position calculation unit stores at least one of the name of the object model, the shooting position and orientation of the first-person video in the space model, the first-person video gaze position in the space model, and the position and orientation of the object model in the space model, in association with the object model gaze position on the object model.
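As a sketch of the association described in claim 9, each object model gaze position could be stored as a record bundling the listed attributes; the field names are illustrative.

```python
# Sketch only: record stored per object model gaze position.
from dataclasses import dataclass
import numpy as np

@dataclass
class GazeRecord:
    time: float                    # timestamp of the first-person frame
    model_name: str                # name of the gazed object model
    gaze_on_model: np.ndarray      # object model gaze position (model coordinates)
    camera_pose: np.ndarray        # 4x4 shooting position/orientation in the space model
    gaze_in_space: np.ndarray      # first-person video gaze position mapped into the space model
    model_pose: np.ndarray         # 4x4 pose of the object model in the space model
```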
10.  The gaze position analysis system according to claim 1, further comprising a gaze position display unit that displays the object model gaze position on the object model, wherein
the gaze position display unit, when displaying the object model gaze position on the object model, adjusts the position and orientation of the object model so that the object model gaze position at each time is always displayed facing the front.
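A minimal sketch of the display adjustment in claim 10, assuming the display camera looks along -Z: the object model is rotated about its centroid so that the direction from the centroid to the gaze position points toward the viewer, using Rodrigues' formula for the rotation between two unit vectors.

```python
# Sketch only: rotate the model so the gaze position faces the display camera.
import numpy as np

def rotation_between(a, b):
    """Rotation matrix turning unit vector a onto unit vector b."""
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.linalg.norm(v) < 1e-9:
        if c > 0:
            return np.eye(3)                       # already aligned
        axis = np.cross(a, np.array([1.0, 0.0, 0.0]))
        if np.linalg.norm(axis) < 1e-9:
            axis = np.cross(a, np.array([0.0, 1.0, 0.0]))
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)   # 180-degree rotation
    vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + vx + vx @ vx * (1.0 / (1.0 + c))

def face_gaze_to_viewer(vertices, gaze_point, view_dir=np.array([0.0, 0.0, -1.0])):
    """Return vertices rotated about the centroid so the gaze point faces the viewer."""
    centre = vertices.mean(axis=0)
    outward = gaze_point - centre
    outward /= np.linalg.norm(outward)
    R = rotation_between(outward, -view_dir)       # turn the gaze point toward the camera
    return (vertices - centre) @ R.T + centre
```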
11.  The gaze position analysis system according to claim 1, further comprising a line-of-sight measurement device that captures the first-person video of the user and measures the first-person video gaze position while the user is free to move in the space, wherein
the space model creation unit obtains the first-person video and the first-person video gaze position from the line-of-sight measurement device.
12.  The gaze position analysis system according to claim 11, wherein
the line-of-sight measurement device has an imaging device, and
the space model creation unit creates the surrounding space model from a plurality of images captured by the imaging device.
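As an illustrative stand-in for creating the surrounding space model from a plurality of captured images, the sketch below reconstructs a sparse point cloud from just two overlapping frames of the measurement device's camera using OpenCV (ORB matching, essential matrix, triangulation); a practical system would integrate many frames incrementally, and the intrinsics matrix K is assumed known.

```python
# Sketch only: two-view sparse reconstruction as a stand-in for space-model creation.
import cv2
import numpy as np

def two_view_points(img1, img2, K):
    """Return a sparse 3D point cloud (N, 3) reconstructed from two frames."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, p1, p2, K, mask=mask)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # first camera at the origin
    P2 = K @ np.hstack([R, t])                          # second camera pose
    good = mask.ravel() > 0
    pts4 = cv2.triangulatePoints(P1, P2, p1[good].T, p2[good].T)
    return (pts4[:3] / pts4[3]).T                       # homogeneous -> Euclidean
```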
13.  A gaze position analysis system that automatically associates a user's gaze position on an object model with the object model, the object model being a three-dimensional model of an object existing in a space, the system comprising:
a space model creation unit that creates a space model, which is a three-dimensional model of the surrounding space, from a plurality of captured images;
an object position/orientation estimation unit that matches the space model against the object model and places the object model in the space model according to the position and orientation obtained by the matching; and
a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and a gaze direction in the space model.
14.  The gaze position analysis system according to claim 13, wherein the gaze position calculation unit calculates the gaze position on the object model by obtaining an intersection point between the gaze direction and the object model in the space model.
15.  A gaze position analysis method for automatically associating an object model gaze position, which is a user's gaze position on an object model, with the object model, the object model being a three-dimensional model of an object existing in a space, the method comprising:
a step of obtaining a first-person video, which is a video approximating the user's field of view;
a step of obtaining a first-person video gaze position, which is the user's gaze position on the first-person video;
a step of creating, from the first-person video, a space model, which is a three-dimensional model of the space within the range to which the user's line of sight is directed;
a step of calculating a shooting position and orientation of the first-person video in the space model;
a step of estimating a position and orientation of the object in the space model by matching the object model against the space model;
a step of placing the object model in the space model using the position and orientation of the object in the space model;
a step of calculating a gaze direction in the space model using the shooting position and orientation of the first-person video in the space model and the first-person video gaze position; and
a step of calculating the object model gaze position by obtaining an intersection point between the gaze direction and the object model in the space model.
PCT/JP2022/036643 2021-10-01 2022-09-30 Gaze position analysis system and gaze position analysis method WO2023054661A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-162786 2021-10-01
JP2021162786A JP2023053631A (en) 2021-10-01 2021-10-01 Gazing point analysis system and gazing point analysis method

Publications (1)

Publication Number Publication Date
WO2023054661A1 true WO2023054661A1 (en) 2023-04-06

Family

ID=85782972

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/036643 WO2023054661A1 (en) 2021-10-01 2022-09-30 Gaze position analysis system and gaze position analysis method

Country Status (2)

Country Link
JP (1) JP2023053631A (en)
WO (1) WO2023054661A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015528359A (en) * 2012-09-17 2015-09-28 ゼンゾモトリック インストルメンツ ゲゼルシャフト フュア イノベイティブ ゼンゾリク ミット ベシュレンクテル ハフツング Method and apparatus for determining a point of interest on a three-dimensional object
JP2019121136A (en) * 2017-12-29 2019-07-22 富士通株式会社 Information processing apparatus, information processing system and information processing method

Also Published As

Publication number Publication date
JP2023053631A (en) 2023-04-13

Similar Documents

Publication Publication Date Title
US10872459B2 (en) Scene recognition using volumetric substitution of real world objects
US9495068B2 (en) Three-dimensional user interface apparatus and three-dimensional operation method
US11222471B2 (en) Implementing three-dimensional augmented reality in smart glasses based on two-dimensional data
US8055061B2 (en) Method and apparatus for generating three-dimensional model information
JP5248806B2 (en) Information processing apparatus and information processing method
JP6008397B2 (en) AR system using optical see-through HMD
JP6688088B2 (en) Information processing apparatus and control method thereof
US20150277555A1 (en) Three-dimensional user interface apparatus and three-dimensional operation method
US10769437B2 (en) Adaptive sampling of training views
JP2013050947A (en) Method for object pose estimation, apparatus for object pose estimation, method for object estimation pose refinement and computer readable medium
US11842514B1 (en) Determining a pose of an object from rgb-d images
US20150339819A1 (en) Method for processing local information
US11490062B2 (en) Information processing apparatus, information processing method, and storage medium
JP7379065B2 (en) Information processing device, information processing method, and program
JP6129363B2 (en) Interactive system, remote control and operation method thereof
JP6946087B2 (en) Information processing device, its control method, and program
CN108629799B (en) Method and equipment for realizing augmented reality
CN110070578B (en) Loop detection method
JP6061334B2 (en) AR system using optical see-through HMD
US20230325009A1 (en) Methods, devices, apparatuses, and storage media for mapping mouse models for computer mouses
US20210327160A1 (en) Authoring device, authoring method, and storage medium storing authoring program
JP6719945B2 (en) Information processing apparatus, information processing method, information processing system, and program
WO2023054661A1 (en) Gaze position analysis system and gaze position analysis method
CN115100257A (en) Sleeve alignment method and device, computer equipment and storage medium
US20220207832A1 (en) Method and apparatus for providing virtual contents in virtual space based on common coordinate system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22876524

Country of ref document: EP

Kind code of ref document: A1