WO2023054661A1 - Gaze position analysis system and gaze position analysis method - Google Patents

Gaze position analysis system and gaze position analysis method

Info

Publication number
WO2023054661A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
gaze
gaze position
space
object model
Prior art date
Application number
PCT/JP2022/036643
Other languages
French (fr)
Japanese (ja)
Inventor
浩彦 佐川
貴之 藤原
Original Assignee
株式会社日立製作所
Priority date
Filing date
Publication date
Application filed by 株式会社日立製作所 (Hitachi, Ltd.)
Publication of WO2023054661A1 publication Critical patent/WO2023054661A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods

Definitions

  • The present invention relates to a gaze position analysis system and a gaze position analysis method.
  • When analyzing a user's interests and work situation from line-of-sight information obtained with a line-of-sight measurement device worn by the user, the user moves freely in a three-dimensional space in which various objects are arranged. For this reason, it is desirable to be able to confirm where in the three-dimensional space the user was gazing, how the gaze position transitioned, and so on.
  • Techniques for analyzing line-of-sight information, acquired while a user wearing such a line-of-sight measurement device moves freely in a three-dimensional space, as gaze positions in the three-dimensional space are disclosed in Patent Document 1 and Patent Document 2.
  • Patent Document 1 discloses a technique in which a virtual three-dimensional space is generated from images photographed at a plurality of different photographing positions, the photographing positions in the virtual three-dimensional space are calculated, and the user's gaze position and gaze time in the virtual three-dimensional space are calculated from the user's gaze direction obtained at the timing at which each image was photographed.
  • Patent Document 2 discloses a technique for identifying the object the user was focusing on from the position of a display device in the three-dimensional space, the user's line of sight, and the positions of objects in the three-dimensional space.
  • Patent Document 2 assumes that the user's line of sight in the three-dimensional space and the positions of the objects in the three-dimensional space can always be obtained, so the required equipment is large-scale and the scope of application is limited. In addition, the positions of all objects to be analyzed must be clarified in advance, which requires labor for advance preparation.
  • An object of the present invention is to automatically associate the user's gaze position with the three-dimensional model corresponding to each object existing in a three-dimensional space, in a gaze position analysis system.
  • A gaze position analysis system according to one aspect of the present invention includes a space model creation unit, an object position/orientation estimation unit, and a gaze position calculation unit, and automatically associates an object model gaze position, which is the user's gaze position on an object model that is a three-dimensional model of an object existing in space, with the object model. The space model creation unit acquires a first-person video, which is video similar to the user's field of view, acquires a first-person video gaze position, which is the user's gaze position on the first-person video, creates a space model, which is a three-dimensional model of the space within the range to which the user's line of sight is directed, from the first-person video, and calculates the shooting position and orientation of the first-person video in the space model. The object position/orientation estimation unit estimates the position and orientation of the object in the space model by matching the object model with the space model, and places the object model in the space model using the estimated position and orientation. The gaze direction in the space model is calculated using the shooting position and orientation of the first-person video in the space model and the first-person video gaze position, and the gaze position calculation unit calculates the object model gaze position by obtaining the intersection of the gaze direction and the object model in the space model.
  • A gaze position analysis system according to one aspect of the present invention automatically associates the user's gaze position on an object model, which is a three-dimensional model of an object existing in space, with the object model, and includes: a space model creation unit that creates a space model, which is a three-dimensional model of the surrounding space, from a plurality of captured images; an object position/orientation estimation unit that matches the space model with the object model and places the object model on the space model according to the position and orientation obtained by the matching; and a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and the gaze direction in the space model.
  • According to one aspect of the present invention, in the gaze position analysis system, it is possible to automatically associate the user's gaze position with the three-dimensional model corresponding to each object existing in the three-dimensional space.
  • FIG. 1 is a configuration diagram of a computer in the case where the gaze position analysis system of an embodiment of the present invention is executed by a general computer.
  • FIG. 2 is a diagram showing the basic configuration of the line-of-sight measurement device assumed by the present invention.
  • FIG. 3 is a diagram showing an example of the format of first-person video data.
  • FIG. 4 is a diagram showing an example of the format of gaze position data.
  • FIG. 5 is a diagram showing an example of the format of object model data.
  • FIG. 6 is a diagram showing an example of the format of space model data.
  • FIG. 7 is a diagram showing an example of the format of object placement data.
  • FIG. 8 is a diagram explaining the processing performed by the space model creation program.
  • FIG. 9 is a diagram showing an example of a flowchart of the processing executed by the gaze position calculation program.
  • FIG. 10 is a diagram explaining the processing executed by the gaze position calculation program.
  • FIG. 11 is a diagram showing an example of an AR marker placed on an actual object.
  • FIG. 12 is a diagram showing an example of an object model whose shape and size change.
  • FIG. 13 is a diagram showing an example of an object model and gaze positions on the object model.
  • FIG. 14 is a diagram showing an example of a case where each gaze position is displayed so as to face the user.
  • FIG. 15 is a diagram showing an example of displaying gaze positions while restricting how the position and orientation of the object model may be adjusted.
  • FIG. 1 is a configuration diagram of a computer when a gaze position analysis system according to an embodiment of the present invention is executed by a general computer.
  • The line-of-sight measurement device 101 in FIG. 1 is an input device that measures a first-person video, which is video similar to the user's field of view, and the gaze position on the first-person video, and records them in the databases storing the first-person video data 110 and the gaze position data 111, respectively. A device commonly available under the name of "eye tracker" or the like can be used.
  • In particular, the line-of-sight measurement device 101 in the present invention is assumed to be a portable device that can be worn by the user. As a result, the gaze direction can be measured while the user moves freely in space.
  • FIG. 2 shows the basic configuration of the sight line measuring device 101 worn by the user.
  • Reference numeral 201 in FIG. 2 denotes a photographing device for acquiring a first-person image of the user, and a device equivalent to a camera generally used in a personal computer or the like can be used.
  • Reference numeral 202 denotes a spectacle-type device equipped with a sensor for detecting the movement of the user's eyes and measuring the gaze direction.
  • a terminal 203 is used to record the obtained first-person video data 110 and gaze position data 111 in a database. The terminal 203 may transmit data to the information processing device 102 .
  • The line-of-sight measurement device 101 is not limited to a portable type; a stationary device may be used as long as the gaze position can be measured while the user moves freely in space and the gaze position on the first-person video can be acquired.
  • the information processing device 102 in FIG. 1 is an information processing device for executing each program in the gaze position analysis system.
  • the input device 103 includes general computer input devices such as keyboards, buttons, mice, and touch panels for controlling the start and end of the system.
  • the output device 104 is a means for displaying the result of gaze position analysis, the operating status of the system, etc. to the user, and includes the screen of a smartphone or tablet terminal, or a display device for general computers.
  • the storage device 105 is a storage device for storing each program in the gaze position analysis system.
  • the storage device 105 includes a spatial model creation program 106 , an object position/orientation estimation program 107 , a gaze position calculation program 108 and a gaze position display program 109 .
  • Here, the information processing device 102 functions as a space model creation unit by executing processing according to the space model creation program 106.
  • the information processing apparatus 102 also functions as an object position/orientation estimation unit by executing processing according to the object position/orientation estimation program 107 .
  • the information processing apparatus 102 also functions as a gaze position calculation unit by executing processing according to the gaze position calculation program 108 .
  • the information processing apparatus 102 functions as a gaze position display unit by executing processing according to the gaze position display program 109 .
  • The first-person video data 110 database stores first-person video data, which is video similar to the user's field of view. The first-person video data 110 is assumed to be acquired by the line-of-sight measurement device 101 as described above.
  • FIG. 3 shows an example of the format of the first-person video data 110.
  • a data name 301 in FIG. 3 is a name given to the first-person video data, and any string of characters and symbols can be used.
  • The first-person video data 110 contains a plurality of images acquired at predetermined time intervals or at arbitrary timing, in the order in which they were acquired, and the number of data 302 in FIG. 3 represents the number of images included in the data.
  • Time 1 of 303 represents the time when the first image was acquired
  • Image 1 of 304 represents the first acquired image
  • Photographing position and orientation 1 of 305 represents the position and orientation of the camera that captured the first image.
  • Time n 306 represents the time when the n-th image was acquired
  • image n 307 represents the n-th image
  • shooting position and orientation n 308 represents the position and orientation of the camera that shot the n-th image. Note that the shooting positions and orientations 305 and 308 are data calculated by the space model creation program 106, as will be described later, and are blank when each image is acquired.
  • the gaze position data 111 database stores gaze position data acquired by the eye gaze measuring device 101 .
  • FIG. 4 shows an example of the format of the gaze position data 111.
  • The gaze position data 111 has corresponding first-person video data 110. The correspondence between the gaze position data 111 and the first-person video data 110 is therefore represented by writing, in the data name 401, the same name as the data name 301 of the corresponding first-person video data.
  • the gaze position data 111 includes a plurality of position coordinates acquired at predetermined time intervals or arbitrary timing in the order in which they were acquired.
  • the number of data 402 in FIG. 4 represents the number of position coordinates included in the data.
  • 403 is the time at which the position coordinates of the gaze position were first acquired.
  • 404 is the position coordinates of the first acquired gaze position, and 405 describes the name of the object when the gaze position data is associated with an object model by the gaze position calculation program 108.
  • Likewise, 406 is the time at which the position coordinates of the n-th gaze position were acquired, 407 is the position coordinates of the n-th acquired gaze position, and 408 describes the name of the associated object for the n-th entry. Since no object model has yet been associated at the time the gaze position data is acquired, 405 and 408 are initially blank.
  • When the gaze position data is acquired, each gaze position is given as position coordinates on the first-person video, that is, as two-dimensional coordinate data. When a gaze position is associated with an object model by the gaze position calculation program 108, it becomes position coordinates on the object model and is rewritten as three-dimensional coordinate data.
  • the gaze position data 111 described above is based on the premise that the gaze position data acquired by the eye gaze measuring device 101 and the gaze position data associated with the object model are managed with the same data. However, the gaze position data acquired by the eye gaze measuring device 101 and the gaze position data associated with the object model may be managed as separate data.
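  • For illustration only, the record layouts of FIG. 3 and FIG. 4 could be held in memory with structures along the following lines; the field names are hypothetical and merely mirror the reference numerals 301-308 and 401-408 described above, not a prescribed implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

import numpy as np


@dataclass
class FirstPersonFrame:
    """One entry of the first-person video data 110 (FIG. 3)."""
    time: float                                # time the image was acquired (303/306)
    image: np.ndarray                          # the captured frame (304/307)
    camera_pose: Optional[np.ndarray] = None   # 4x4 shooting position/orientation, filled in
                                               # later by the space model creation program (305/308)


@dataclass
class GazeSample:
    """One entry of the gaze position data 111 (FIG. 4)."""
    time: float                                # acquisition time (403/406)
    position: Tuple[float, ...]                # 2-D image coordinates at capture, rewritten as
                                               # 3-D object model coordinates after association (404/407)
    object_name: Optional[str] = None          # set when associated with an object model (405/408)


@dataclass
class FirstPersonVideoData:
    name: str                                  # data name (301), shared with the gaze data (401)
    frames: List[FirstPersonFrame] = field(default_factory=list)
```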
  • the database of object model data 112 stores three-dimensional models of objects that exist in the space within the range where the user's line of sight is directed.
  • the three-dimensional model of each object is represented as a collection of points representing the shape of the object, that is, as point cloud data.
  • Object models created with general three-dimensional CAD tools are often represented as collections of polygons, but a polygon model can easily be converted into a point cloud model. For example, first divide each polygon into triangles by lines connecting one vertex to its non-adjacent vertices, then repeatedly subdivide the triangles, and finally collect the vertices of all resulting triangles. The object model represented by polygons can thereby be represented as point cloud data.
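  • A rough sketch of this conversion is shown below; it fan-triangulates each polygon, subdivides the triangles by their edge midpoints, and collects the resulting vertices. It assumes the polygon model is given as a vertex array plus per-face vertex index lists, which is an assumption for illustration and not the format used in this embodiment.

```python
import numpy as np


def polygons_to_point_cloud(vertices, faces, subdivisions=2):
    """vertices: (N, 3) array; faces: list of vertex-index lists (convex polygons)."""
    # Fan-triangulate every polygon: connect one vertex to its non-adjacent vertices.
    triangles = []
    for face in faces:
        for i in range(1, len(face) - 1):
            triangles.append((vertices[face[0]], vertices[face[i]], vertices[face[i + 1]]))

    # Repeatedly split each triangle into four smaller triangles by its edge midpoints.
    for _ in range(subdivisions):
        finer = []
        for a, b, c in triangles:
            ab, bc, ca = (a + b) / 2, (b + c) / 2, (c + a) / 2
            finer += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
        triangles = finer

    # The point cloud is the set of all (deduplicated) triangle vertices.
    points = np.array([p for tri in triangles for p in tri])
    return np.unique(np.round(points, 6), axis=0)
```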
  • FIG. 5 shows an example of the format of an object model stored in the object model data 112 database.
  • 501 in FIG. 5 is the name given to the object, and any string of letters and symbols can be used.
  • 502 represents the number of points included in the object model
  • 503 represents the positional coordinates of the first point
  • 504 represents the positional coordinates of the n-th point.
  • Although the format shown in FIG. 5 includes only the positional coordinates of the points, it may also include information attached to each point, such as color information.
  • the spatial model data 113 database stores a spatial model, which is a three-dimensional model for the space within the range where the user looks.
  • the spatial model is data created by the spatial model creation program 106 using the first-person video data 110, and is assumed to be represented as point cloud data, like the object model described above.
  • FIG. 6 shows an example of the format of the spatial model stored in the database of the spatial model data 113.
  • a model name 601 in FIG. 6 is a name given to the spatial model, and any string of letters and symbols can be used.
  • 602 is the name of the first-person video data used to create the space model
  • 603 is the number of points included in the space model
  • 604 is the position coordinates of the first point
  • 605 is the position coordinates of the nth point.
  • Although the format shown in FIG. 6 includes only the positional coordinates of the points, it may also include information attached to each point, such as color information.
  • the database of the object placement data 114 stores data on the object model that has been matched with the space model and placed on the space model by the object position/orientation estimation program 107 .
  • FIG. 7 shows an example of the format of the object placement data 114 stored in its database. Assuming that the object placement data 114 is stored for each space model, the model name 701 in FIG. 7 describes the name of the corresponding space model. The number of objects 702 represents the number of object models arranged in the target space model.
  • 703 is the name of the first object placed on the space model
  • 704 is the position and orientation of the first object on the space model
  • 705 is the time when the first object was placed on the space model
  • 706 is the name of the nth object placed on the space model
  • 707 is the position and orientation of the nth object on the space model
  • 708 is the time when the nth object was placed on the space model.
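  • The object model (FIG. 5), space model (FIG. 6), and object placement (FIG. 7) records could likewise be sketched as follows; the field names are illustrative assumptions that simply mirror the numerals 501-504, 601-605, and 701-708.

```python
from dataclasses import dataclass, field
from typing import List

import numpy as np


@dataclass
class ObjectModel:
    name: str               # object name (501)
    points: np.ndarray      # (n, 3) point coordinates (502-504); color etc. could be added


@dataclass
class SpaceModel:
    name: str               # model name (601)
    source_video: str       # name of the first-person video data used (602)
    points: np.ndarray      # (n, 3) point coordinates (603-605)


@dataclass
class PlacedObject:
    name: str               # object name (703/706)
    pose: np.ndarray        # 4x4 position/orientation on the space model (704/707)
    placed_at: float        # time the object was placed (705/708)


@dataclass
class ObjectPlacementData:
    space_model_name: str                                       # model name (701)
    objects: List[PlacedObject] = field(default_factory=list)   # number of objects = 702
```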
  • The present invention uses the first-person video data 110 and the gaze position data 111 acquired from the line-of-sight measurement device 101, and analyzes the gaze position on each object by means of the space model creation program 106, the object position/orientation estimation program 107, and the gaze position calculation program 108. In particular, this embodiment assumes that processing is performed while the first-person video data 110 and the gaze position data 111 are acquired from the line-of-sight measurement device 101 in real time.
  • The space model creation program 106 constantly reads newly stored data from the data acquired from the line-of-sight measurement device 101 and stored in the first-person video data 110 database, and performs processing for creating a space model.
  • As techniques used in the space model creation program 106, the well-known SLAM (Simultaneous Localization and Mapping) and MVS (Multi-View Stereo) methods can be used.
  • SLAM is a technique that creates rough point cloud data by analyzing the correspondence between images, using multiple consecutive images obtained while the camera is moved.
  • The MVS method is a technique that creates more detailed, dense point cloud data by using the analysis results of SLAM.
  • Fig. 8 shows an image of creating a spatial model from point cloud data using the SLAM and MVS methods.
  • 801 represents an object that exists within the range where the user's line of sight moves, and for the sake of simplicity, it is assumed that there are no other objects in the surrounding area.
  • 802 and 804 represent the positions of the camera when the first-person video was captured, and 803 and 805 illustrate the range of the camera's field of view, corresponding to the camera posture, when the first-person video was captured from each camera position.
  • a space model represented by point cloud data as shown in 806 is created by using images of the same object or the same location photographed from multiple locations.
  • the space model 806 can be created in real time according to the movement of the gaze position of the user.
  • any technique may be used in the spatial model creation program 106 as long as it can create a spatial model represented by point cloud data.
  • the position and orientation of the camera are information necessary for calculating the gaze position on the object model. Therefore, when using a space model creation technique that cannot calculate the position and orientation of the camera, it is necessary to separately use means for acquiring the position and orientation of the camera on the space model. For example, any technology that acquires the position and orientation in space, such as using a sensor that acquires the position and orientation, can be used.
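  • As a highly simplified stand-in for such a pipeline (this is only an illustration, not the actual implementation of the space model creation program 106), the sketch below estimates the relative camera pose between two consecutive first-person frames and triangulates a sparse set of points with OpenCV; the intrinsic matrix K is assumed to be known, and a full SLAM/MVS system would extend this to many frames and densify the result.

```python
import cv2
import numpy as np


def two_view_points_and_pose(img1, img2, K):
    """Estimate the relative pose of frame 2 w.r.t. frame 1 and triangulate sparse points."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)

    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])

    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, p1, p2, K, mask=mask)

    # Projection matrices of the two views expressed in the first camera's frame.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, p1.T, p2.T)
    points = (pts4d[:3] / pts4d[3]).T      # rough sparse point cloud (SLAM-like map)
    return points, R, t                    # R, t correspond to a shooting position/orientation
```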
  • The object position/orientation estimation program 107 is a program that matches the point cloud data of an object model against part of the point cloud data in the space model created by the space model creation program 106, that is, adjusts the position and orientation of the object model so that it fits the space model, and thereby calculates the position and orientation of the object model in the space model.
  • As techniques used in the object position/orientation estimation program 107, the well-known ICP (Iterative Closest Point) algorithm or NDT (Normal Distribution Transform) algorithm can be used. Alternatively, any technique may be used as long as it can perform matching between point cloud data and determine the position and orientation of an object model in the space model. The timing at which the object position/orientation estimation program 107 is executed is controlled by the gaze position calculation program 108, which will be described later.
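  • Purely as an illustration of such a matching step (Open3D is used here only as an example library; the embodiment does not prescribe a particular implementation), an object model could be registered to the selected part of the space model as follows.

```python
import numpy as np
import open3d as o3d


def match_object_to_space(object_points, space_points, init_pose=None, max_dist=0.05):
    """Estimate the object model's pose in the space model by point-to-point ICP."""
    if init_pose is None:
        init_pose = np.eye(4)
    src = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(object_points))
    tgt = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(space_points))
    result = o3d.pipelines.registration.registration_icp(
        src, tgt, max_dist, init_pose,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    # result.transformation is the 4x4 pose of the object model in the space model;
    # result.fitness is an overlap ratio that could serve as a "degree of matching".
    return result.transformation, result.fitness
```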
  • The gaze position calculation program 108 is a program that obtains the gaze position on the object model using the gaze position data 111 acquired by the line-of-sight measurement device 101, the object model data 112, and the space model data 113 created by the space model creation program 106.
  • In step 901, newly stored gaze position data is acquired from the gaze position data 111 stored in the gaze position data 111 database.
  • Alternatively, new gaze position data may be acquired directly from the line-of-sight measurement device 101.
  • In step 902, the gaze direction on the space model is calculated using the newly acquired gaze position data 111 and the shooting positions and orientations 305 and 308 that were calculated when the space model was created by the space model creation program 106 and written into the first-person video data 110.
  • the direction of gaze on the space model is represented by a vector that indicates the starting point of the gaze and the line-of-sight direction from that position.
  • Specifically, the shooting position and orientation of the first-person video at the same time as the newly acquired gaze position data 111 are obtained from the first-person video data 110.
  • Alternatively, the shooting positions and orientations of the first-person video data 110 corresponding to the times immediately before and after the time of the gaze position data 111 may be acquired, and a shooting position and orientation obtained by interpolation based on the relation to the time of the gaze position data 111 may be used.
  • Next, the gaze position on the first-person video is coordinate-transformed into a gaze position on the space model.
  • The gaze position data 111 is represented as a gaze position on the first-person video, and the positional relationship between the first-person video and the gaze position can be obtained at the same scale as the actual space.
  • Furthermore, by applying a coordinate transformation using the shooting position and orientation acquired from the first-person video data 110 to the obtained positional relationship, the positional relationship among the camera's shooting position, the first-person video, and the gaze position on the space model can be represented as shown in FIG. 10.
  • 1001 is a space model
  • 1002 is the shooting position of the camera that shot the first-person video on the space model 1001
  • 1003 is the shooting range of the first-person video on the space model 1001, which corresponds to the shooting posture of the camera. .
  • 1004 is the gaze position on the first-person video associated with the spatial model 1001 .
  • the gaze direction on the space model 1001 can be obtained as a vector 1005 that starts at 1002 and passes through the gaze position 1004 on the first-person video in FIG.
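  • The gaze direction vector 1005 can be illustrated with a few lines of linear algebra. The sketch below assumes a pinhole camera with a known intrinsic matrix K and a 4x4 camera-to-world pose; how the shooting position and orientation 305/308 are actually stored is an assumption made only for this example.

```python
import numpy as np


def gaze_ray_on_space_model(gaze_px, K, camera_pose):
    """gaze_px: (u, v) gaze position on the first-person image.
    camera_pose: 4x4 camera-to-world matrix (shooting position and orientation).
    Returns the ray origin (corresponding to 1002) and the unit gaze direction (1005)."""
    u, v = gaze_px
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # viewing direction in camera coordinates
    R, t = camera_pose[:3, :3], camera_pose[:3, 3]
    direction = R @ ray_cam                               # rotate into space model coordinates
    return t, direction / np.linalg.norm(direction)
```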
  • In step 903 of FIG. 9, it is determined whether or not an object model has already been placed on the space model 1001 in the gaze direction on the space model 1001 obtained in step 902. If an object model is placed there, the process proceeds to step 906; otherwise, it proceeds to step 904.
  • For this determination, for example, the distance between the vector representing the gaze direction on the space model 1001 (hereinafter referred to as the gaze direction vector), i.e., the vector 1005 in FIG. 10, and each point of the object models placed on the space model is calculated, and the smallest distance is selected.
  • If the selected distance is equal to or less than a predetermined threshold, it can be determined that the target object model exists in the gaze direction.
  • As the predetermined threshold, for example, the largest distance among the distances between points in the point cloud data constituting the object model, half of that distance, or the like can be selected.
  • Alternatively, the number of points whose distance from the gaze direction vector is equal to or less than the predetermined threshold may be obtained, and the object model may be determined to exist in the gaze direction when this number is equal to or greater than a predetermined number.
  • If the object model also has polygon data, which is data represented by a set of polygons, the polygon data may likewise be placed on the space model in accordance with the placed object model, and the target object model may be determined to exist in the gaze direction if any of the placed polygons intersects the gaze direction vector placed on the space model.
  • In addition, the position on the gaze direction vector of the point closest to the gaze direction vector 1005, or of the polygon intersecting the gaze direction vector 1005, may be obtained, and the object model containing the point or polygon closest to the starting point of the gaze direction vector 1005, that is, closest to the shooting position, may be selected as the object model existing in the gaze direction.
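  • One way to realize this determination for a point cloud object model is the point-to-ray distance test sketched below; this is an illustration only, with the threshold chosen by the caller as described above. When several object models pass the test, the one with the smallest returned distance along the ray can be selected.

```python
import numpy as np


def object_in_gaze_direction(points, origin, direction, threshold):
    """points: (N, 3) object model points already placed on the space model.
    origin, direction: gaze direction vector (ray) on the space model.
    Returns (hit, distance along the ray of the closest qualifying point)."""
    rel = points - origin
    along = rel @ direction                      # position of each point along the gaze ray
    perp = rel - np.outer(along, direction)      # perpendicular offset from the ray
    dist = np.linalg.norm(perp, axis=1)
    hit = (dist <= threshold) & (along > 0)      # keep only points in front of the camera
    if not np.any(hit):
        return False, None
    return True, along[hit].min()                # closest qualifying point to the shooting position
```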
  • In step 904, the object position/orientation estimation program 107 is used to match an object model against the point cloud data on the space model 1001 that exists in the gaze direction on the space model 1001. As shown in FIG. 10, this processing adjusts the position and orientation of the object model 1006 with respect to the space model 1001 in various ways, and thereby makes it possible to replace part of the space model 1001 with the target object model 1006.
  • As the point cloud data on the space model 1001 existing in the gaze direction, for example, point cloud data on the space model whose distance from the gaze direction vector 1005 is equal to or less than a predetermined distance can be selected.
  • As the predetermined distance, the maximum size of the object model to be matched, half of that size, or the like can be used.
  • Alternatively, the selected data can be the point cloud data existing within a range of a specific shape of predetermined size having the gaze direction vector 1005 as its central axis. It is also possible to project each point of the point cloud data onto the gaze direction vector and select the points within a predetermined range centered on the position on the gaze direction vector where the points are most concentrated. Any other method may be used as long as it can select the point cloud data to be matched around the gaze direction vector.
  • The object models to be matched against the point cloud data on the space model can be all object models stored in the object model data 112 database. Alternatively, well-known QR codes (registered trademark), AR markers, or symbols and character strings representing the names of objects may be installed on the actual objects and read using well-known image recognition and character recognition techniques, and the corresponding object models may be selected from the object model data 112 for matching.
  • FIG. 11 shows an example of an AR marker installed on an actual object.
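  • As one concrete possibility for reading such markers (the embodiment does not depend on a specific library), OpenCV's QR code detector could return an object name used to select candidate object models; AR-marker or character recognition would be handled analogously. The decoded string shown in the comment is a hypothetical example.

```python
import cv2


def object_names_from_qr(first_person_image):
    """Decode a QR code in the first-person image and return the encoded object name, if any."""
    detector = cv2.QRCodeDetector()
    text, points, _ = detector.detectAndDecode(first_person_image)
    return [text] if text else []   # e.g. ["pump_unit_A"]: keys into the object model data 112
```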
  • Alternatively, well-known point cloud feature values (three-dimensional feature values) may be extracted from the point cloud data on the space model and from each object model, and the object models whose feature values are similar to those of the point cloud data on the space model may be selected from the object model data 112 and matched.
  • There are also cases where the shape and size of an object existing at a specific location change as a work process progresses.
  • In such cases, object models representing the states of the object and their order relationship are stored in the object model data 112 for each work process, and by matching them with the space model in order, the object model corresponding to the current work process can be matched with the space model.
  • First, the object model corresponding to the first process is selected and matched.
  • If the object model corresponding to one of the processes has already been matched to the point cloud data on the space model existing in the gaze direction, the object model corresponding to the next process is selected and matched against that point cloud data; if the object model corresponding to the next process matches the space model with a higher degree of matching than the already matched object model, the already matched object model can be replaced with the object model corresponding to the next process.
  • As the degree of matching, the same measure as that used in the processing of step 905, described later, can be used. Furthermore, matching of the object model corresponding to the next process against the space model may be performed when the gaze position moves on the already matched object model, or when the space model creation program 106 detects a change in the space model.
  • The change in the space model can be detected by the space model creation program 106, for example, by generating a space model from a newly acquired first-person video image and a predetermined number of first-person video images immediately preceding it, comparing it with the previously generated space model, and determining that the space model has changed if the difference between the two is equal to or greater than a predetermined threshold.
  • The difference between the space models can be calculated, for example, by searching, for each point in one space model, for the closest point in the other space model, calculating the distance to the found point, and averaging these distances over all points in the model. Alternatively, the number of points whose distance is equal to or greater than a predetermined threshold may be used, or any other method capable of calculating the difference between point clouds may be used.
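  • The mean nearest-neighbour distance described above is simple to state; a sketch using SciPy's KD-tree is shown below purely for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree


def space_model_difference(points_a, points_b):
    """Mean distance from each point of model A to its nearest neighbour in model B."""
    distances, _ = cKDTree(points_b).query(points_a)
    return distances.mean()   # compare against a threshold to decide that the space model changed
```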
  • In step 905, it is determined whether or not the object model has been correctly matched with the point cloud data on the space model existing in the gaze direction on the space model. If the matching is correct, the position and orientation of the placed object model are saved in the object placement data 114 database and the process proceeds to step 906; otherwise, it proceeds to step 908.
  • Specifically, if the degree of matching between the point cloud data on the space model and the object model placed on the space model based on the result of the matching processing is greater than a predetermined value, it is determined that the object model has been correctly matched.
  • As the degree of matching, for example, for each point of the object model placed on the space model using the matching result, the closest point in the point cloud data on the space model is searched for, the number of points whose distance to the found point is less than a predetermined threshold is counted, and the ratio of that number to the total number of points in the object model can be used.
  • any index other than the above may be used as long as it is an index that can determine the quality of the matching result between the point cloud data.
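  • The degree of matching described above can likewise be written down directly; the sketch below computes the ratio of placed object model points whose nearest space model point lies within the threshold.

```python
import numpy as np
from scipy.spatial import cKDTree


def matching_degree(object_points_in_space, space_points, threshold):
    """Ratio of placed object-model points that have a space-model point within `threshold`."""
    distances, _ = cKDTree(space_points).query(object_points_in_space)
    return np.count_nonzero(distances < threshold) / len(object_points_in_space)
```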
  • In step 906, the gaze position on the object model placed on the space model is calculated using the matching result of step 904. Since the object model placed on the space model and the gaze direction vector 1005 on the space model are both available, a process similar to the process used in step 903 for determining whether an object model exists in the gaze direction can be used.
  • In step 903, when the object model is point cloud data, attention is paid only to whether or not a point exists whose distance from the gaze direction vector 1005 is equal to or less than a predetermined threshold.
  • Here, the difference is that the point whose distance from the gaze direction vector is equal to or less than the predetermined threshold and which is closest to the shooting position 1002 is selected and set as the gaze position.
  • If there are a plurality of points whose distance from the gaze direction vector 1005 is equal to or less than the predetermined threshold and which lie within a predetermined range of the point closest to the shooting position 1002, the average of the point closest to the shooting position 1002 and the points within the predetermined range from it may be used as the gaze position.
  • When the object model is expressed as a collection of polygons, the intersection points between the gaze direction vector and the polygons of the object model may be obtained, and the intersection point closest to the shooting position 1002 may be used as the gaze position.
  • The obtained gaze position is then coordinate-transformed using the position and orientation of the object model on the space model obtained from the matching result of step 904. As a result, the gaze position 1004 on the space model 1001 is transformed into the gaze position 1007 on the object model 1006, that is, into the object model coordinate system.
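  • Combining the pieces above, the gaze position 1007 in the object model coordinate system could be obtained roughly as follows; this is a sketch under the assumption that `object_pose` is the 4x4 placement transform obtained in step 904 and that the object model is point cloud data.

```python
import numpy as np


def gaze_position_on_object(points_in_space, origin, direction, object_pose, threshold):
    """points_in_space: object model points already transformed onto the space model.
    Returns the gaze position in the object model coordinate system, or None if no hit."""
    rel = points_in_space - origin
    along = rel @ direction
    perp_dist = np.linalg.norm(rel - np.outer(along, direction), axis=1)
    candidates = np.where((perp_dist <= threshold) & (along > 0))[0]
    if candidates.size == 0:
        return None
    hit = points_in_space[candidates[np.argmin(along[candidates])]]   # closest to shooting position 1002

    # Transform the gaze position 1004 on the space model into the object coordinate system (1007).
    hit_h = np.append(hit, 1.0)
    return (np.linalg.inv(object_pose) @ hit_h)[:3]
```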
  • The gaze position on the placed object model may also be calculated for past gaze position data. This is because, in the initial stage of processing by the gaze position calculation program 108 and the like, it is expected that the point cloud data for each object on the space model is still sparse and the object model cannot yet be placed on the space model.
  • The gaze position on the object model obtained in step 906 is then saved in the gaze position data 111. When saving the gaze position on the object model, the shooting position and orientation of the first-person video on the space model, the gaze position in the space model, the position and orientation of the object model on the space model, and the like may be saved together.
  • In step 908, if there is an end instruction from the input device 103 or the like, the process ends; otherwise, the process returns to step 901.
  • The gaze position calculation program 108 described above acquires the first-person video data 110 and the gaze position data 111 in real time, and calculates the gaze position on the object model while generating the space model data 113 and placing the object model data 112 on the space model. However, the processing of the gaze position calculation program 108 may instead be executed after the target first-person video data 110 and gaze position data 111 have been acquired, by sequentially reading the data stored in the first-person video data 110 and the gaze position data 111.
  • In the initial stage of processing, the point cloud data for each object on the space model may still be sparse, so that no object model can be placed on the space model and the gaze position on the object model cannot be calculated for some gaze position data.
  • For such gaze positions, the gaze position calculation program 108 may be executed again using the space model data 113 and the object placement data 114 generated by the earlier execution of the gaze position calculation program 108, so that the gaze position on the object model can also be calculated for them.
  • In that case, the gaze position calculation program 108 is executed with the following changes.
  • the first change is that the acquisition of the gaze position data in step 901 is performed by reading data from the data stored in the gaze position data 111 in the order in which they were saved. Further, it is determined whether or not the gaze position on the object model has been calculated for the acquired gaze position data, and if it has been calculated, the process of proceeding to step 908 is added.
  • the second change is to proceed to step 908 instead of step 904 if, in step 903, the object model is not arranged in the gaze direction on the space model.
  • the third change is to delete steps 904 and 905.
  • By executing the gaze position calculation program 108 with these changes, the gaze position on the object model can be calculated for those gaze positions for which it had not yet been calculated.
  • Alternatively, the above-described processing may be performed after manually placing an object model at a location on the space model where no object model has been placed.
  • The gaze position display program 109 is a program that displays the gaze position data according to instructions from the user. The gaze positions can be displayed by well-known gaze analysis methods, for example a method using a heat map, which shows the frequency distribution of gaze positions, or a method that displays the length of time the gaze position stayed within a certain range.
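  • As a small illustration of a heat-map style display on an object model (one of several display methods mentioned here, not a prescribed one), fixation counts can be accumulated per model point within a radius of each gaze position and then mapped to colors when rendering.

```python
import numpy as np
from scipy.spatial import cKDTree


def gaze_heatmap(model_points, gaze_positions_on_model, radius=0.02):
    """Count, for every object model point, how many gaze positions fell within `radius`."""
    counts = np.zeros(len(model_points), dtype=int)
    tree = cKDTree(model_points)
    for gaze in gaze_positions_on_model:
        counts[tree.query_ball_point(gaze, radius)] += 1
    return counts   # can be mapped to colors when rendering the object model
```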
  • In addition to displaying the gaze position on the first-person video, the program also displays the gaze position on the object model.
  • For example, a virtual space model, which is a virtual space corresponding to the space model, is prepared, the object models are placed in the virtual space model based on the data stored in the object placement data 114, and the gaze positions are displayed on the object models placed in the virtual space model.
  • a method of selecting only a specific object model and displaying the gaze position only on the selected object model can also be used.
  • the virtual space model or the object model may be adjusted and displayed so that the gaze position always faces the user.
  • Reference numeral 1301 in FIG. 13 denotes an example of an object model, and 1302, 1303, 1304, and 1305 denote examples of gaze positions on the object model.
  • FIG. 14 shows an example of the case where the gaze position shown in FIG. 13 is always displayed facing the user.
  • 1401 is an object model displaying a point of gaze 1302 facing the user, and 1402 is the point of gaze displayed on 1401 .
  • Reference numeral 1403 denotes an object model in which the gaze point 1303 is displayed facing the user, and 1404 denotes the gaze point displayed on the 1403 .
  • Reference numeral 1405 denotes an object model in which the gaze point 1304 is displayed facing the user, and 1406 the gaze point displayed on 1405 .
  • Reference numeral 1407 denotes an object model in which the gaze point 1305 is displayed facing the user, and 1408 denotes the gaze point displayed on 1407 .
  • In FIG. 14, the gaze position is displayed by adjusting the object model so that each gaze position completely faces the user.
  • When the object model is point cloud data, it is sufficient to select the points within a predetermined range containing the gaze position, obtain the normal of the plane formed by those points, and adjust the position and orientation of the object model so that the obtained normal faces the user.
  • When the object model is represented by a collection of polygons, the polygon containing the gaze position is selected from the object model, and the position and orientation of the object model are adjusted so that its normal faces the user.
  • When displaying the gaze position, it is also possible to restrict how the position and orientation of the displayed object model may be adjusted. Furthermore, by adjusting the extent to which the gaze position faces the user, a large change in the posture of the object model when displaying the gaze position can be suppressed.
  • For example, the position and orientation of the object model are adjusted so that the first gaze position faces the user, and the subsequent gaze positions are displayed so that the degree to which they face the user does not exceed a predetermined range.
  • As the degree of facing the user, the angle or the inner product between the normal direction of the plane containing the gaze position and the direction facing the user, that is, the direction perpendicular to the screen, can be used.
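  • For illustration, the normal of the region around a gaze position can be estimated by PCA of the neighbouring model points, and the rotation that turns this normal toward the viewer follows from Rodrigues' formula; clamping the rotation angle corresponds to the restriction described above. Treating the viewing direction as the negative z axis is an assumption made only for this sketch.

```python
import numpy as np


def normal_at_gaze(model_points, gaze_position, radius=0.02):
    """Estimate the surface normal near the gaze position via PCA of neighbouring points."""
    nearby = model_points[np.linalg.norm(model_points - gaze_position, axis=1) <= radius]
    _, _, vt = np.linalg.svd(nearby - nearby.mean(axis=0))
    n = vt[-1]                                   # direction of smallest variance = normal
    return n / np.linalg.norm(n)


def rotation_facing_user(normal, view_dir=np.array([0.0, 0.0, -1.0]), max_angle=None):
    """Rotation that turns `normal` toward the viewer, optionally clamped to `max_angle` (radians)."""
    n = normal / np.linalg.norm(normal)
    v = view_dir / np.linalg.norm(view_dir)
    axis = np.cross(n, v)
    angle = np.arccos(np.clip(n @ v, -1.0, 1.0))
    if max_angle is not None:
        angle = min(angle, max_angle)            # suppress large posture changes of the object model
    if np.linalg.norm(axis) < 1e-9:
        return np.eye(3)
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    # Rodrigues' rotation formula.
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
```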
  • FIG. 15 shows a display example when the gaze position is displayed by the method described above.
  • FIG. 15 shows a case where the object model is displayed with its vertical coordinate axis tilted toward the user, and where the adjustment of the position and orientation when displaying each gaze position is restricted to rotation around the vertical coordinate axis of the object model.
  • the position and orientation of the object model 1501 are adjusted so that the first gaze position 1502 faces the user.
  • For the subsequent gaze positions, since the vertical coordinate axis of the object model is tilted and only rotation around it is allowed, the gaze position cannot be made to completely face the user.
  • Therefore, the normal of the plane containing the gaze position is obtained in the same manner as described above, and the position and orientation of the object model are adjusted within the allowed range. The object model 1503 is displayed in a state in which the gaze position 1504 faces the user to a lesser degree than the gaze position 1502, that is, the gaze position can be confirmed but does not completely face the user.
  • a gaze position 1506 on an object model 1505 and a gaze position 1508 on an object model 1507 are also displayed in the same manner as the gaze position 1504.
  • As described above, the gaze position analysis system of this embodiment generates, from the first-person video, which is video similar to the user's field of view, a space model, which is a three-dimensional model of the space to which the user's line of sight is directed, together with the shooting position and orientation of the first-person video in the space model.
  • The object model is placed in the space model according to the position and orientation of the object on the space model estimated by matching the object model with the space model, and the gaze position on the object model is calculated by obtaining the intersection of the gaze direction and the object model on the space model.
  • As a result, from the line-of-sight information of the user, measured while the user moves freely in the space to be measured, the user's gaze positions can be automatically associated with the three-dimensional models of the various objects existing in that space.
  • 101 Line-of-sight measurement device
  • 102 Information processing device
  • 103 Input device
  • 104 Output device
  • 105 Storage device
  • 106 Space model creation program
  • 107 Object position/orientation estimation program
  • 108 Gaze position calculation program
  • 109 Gaze position display program
  • 110 First-person video data
  • 111 Gaze position data
  • 112 Object model data
  • 113 Space model data
  • 114 Object placement data

Abstract

The present invention comprises: a space model creation unit that creates a space model, which is a three-dimensional model of a surrounding space, from a plurality of captured images; an object position/posture estimation unit that matches the space model and an object model, and arranges the object model in the space model in accordance with the position/posture obtained by the matching; and a gaze position calculation unit that calculates the gaze position in the object model on the basis of the arranged object model and the gaze direction in the space model.

Description

Gaze position analysis system and gaze position analysis method
The present invention relates to a gaze position analysis system and a gaze position analysis method.
When analyzing a user's interests and work situation from line-of-sight information obtained with a line-of-sight measurement device worn by the user, the user moves freely in a three-dimensional space in which various objects are arranged. For this reason, it is desirable to be able to confirm where in the three-dimensional space the user was gazing, how the gaze position transitioned, and so on.
Techniques for analyzing line-of-sight information, acquired while a user wearing such a line-of-sight measurement device moves freely in a three-dimensional space, as gaze positions in the three-dimensional space are disclosed in Patent Document 1 and Patent Document 2.
Patent Document 1 discloses a technique in which a virtual three-dimensional space is generated from images photographed at a plurality of different photographing positions, the photographing positions in the virtual three-dimensional space are calculated, and the user's gaze position and gaze time in the virtual three-dimensional space are calculated from the user's gaze direction obtained at the timing at which each image was photographed.
On the other hand, Patent Document 2 discloses a technique for identifying the object the user was focusing on from the position of a display device in the three-dimensional space, the user's line of sight, and the positions of objects in the three-dimensional space.
JP 2020-135737 A
JP 2018-195319 A
In Patent Document 1, the data constituting the virtual three-dimensional space generated from a plurality of images may be coarse, or the data may be generated only partially; in such cases, it becomes difficult to determine what the user was gazing at.
On the other hand, Patent Document 2 assumes that the user's line of sight in the three-dimensional space and the positions of objects in the three-dimensional space can always be obtained, so the required equipment is large-scale and the scope of application is limited. In addition, the positions of all objects to be analyzed must be clarified in advance, which requires labor for advance preparation.
When analyzing gaze positions in a three-dimensional space, in addition to viewing how the gaze position transitioned through the entire three-dimensional space, it is desirable to be able to focus on a specific object and easily view how the gaze position transitioned on that object.
In this case, if the gaze position can be associated with the three-dimensional model of each object existing in the three-dimensional space, a more detailed analysis can be expected. Furthermore, if each object model can be placed in the three-dimensional space automatically, the advance settings required for analysis can be simplified.
An object of the present invention is to automatically associate the user's gaze position with the three-dimensional model corresponding to each object existing in a three-dimensional space, in a gaze position analysis system.
A gaze position analysis system according to one aspect of the present invention includes a space model creation unit, an object position/orientation estimation unit, and a gaze position calculation unit, and automatically associates an object model gaze position, which is the user's gaze position on an object model that is a three-dimensional model of an object existing in space, with the object model. The space model creation unit acquires a first-person video, which is video similar to the user's field of view, acquires a first-person video gaze position, which is the user's gaze position on the first-person video, creates a space model, which is a three-dimensional model of the space within the range to which the user's line of sight is directed, from the first-person video, and calculates the shooting position and orientation of the first-person video in the space model. The object position/orientation estimation unit estimates the position and orientation of the object in the space model by matching the object model with the space model, and places the object model in the space model using the estimated position and orientation. The gaze direction in the space model is calculated using the shooting position and orientation of the first-person video in the space model and the first-person video gaze position, and the gaze position calculation unit calculates the object model gaze position by obtaining the intersection of the gaze direction and the object model in the space model.
A gaze position analysis system according to one aspect of the present invention automatically associates the user's gaze position on an object model, which is a three-dimensional model of an object existing in space, with the object model, and includes: a space model creation unit that creates a space model, which is a three-dimensional model of the surrounding space, from a plurality of captured images; an object position/orientation estimation unit that matches the space model with the object model and places the object model on the space model according to the position and orientation obtained by the matching; and a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and the gaze direction in the space model.
According to one aspect of the present invention, in the gaze position analysis system, the user's gaze position can be automatically associated with the three-dimensional model corresponding to each object existing in the three-dimensional space.
FIG. 1 is a configuration diagram of a computer in the case where the gaze position analysis system of an embodiment of the present invention is executed by a general computer.
FIG. 2 is a diagram showing the basic configuration of the line-of-sight measurement device assumed by the present invention.
FIG. 3 is a diagram showing an example of the format of first-person video data.
FIG. 4 is a diagram showing an example of the format of gaze position data.
FIG. 5 is a diagram showing an example of the format of object model data.
FIG. 6 is a diagram showing an example of the format of space model data.
FIG. 7 is a diagram showing an example of the format of object placement data.
FIG. 8 is a diagram explaining the processing performed by the space model creation program.
FIG. 9 is a diagram showing an example of a flowchart of the processing executed by the gaze position calculation program.
FIG. 10 is a diagram explaining the processing executed by the gaze position calculation program.
FIG. 11 is a diagram showing an example of an AR marker placed on an actual object.
FIG. 12 is a diagram showing an example of an object model whose shape and size change.
FIG. 13 is a diagram showing an example of an object model and gaze positions on the object model.
FIG. 14 is a diagram showing an example of a case where each gaze position is displayed so as to face the user.
FIG. 15 is a diagram showing an example of displaying gaze positions while restricting how the position and orientation of the object model may be adjusted.
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 is a configuration diagram of a computer in the case where the gaze position analysis system according to an embodiment of the present invention is executed by a general computer.
The line-of-sight measurement device 101 in FIG. 1 is an input device that measures a first-person video, which is video similar to the user's field of view, and the gaze position on the first-person video, and records them in the databases storing the first-person video data 110 and the gaze position data 111, respectively. A device commonly available under the name of "eye tracker" or the like can be used.
In particular, the line-of-sight measurement device 101 in the present invention is assumed to be a portable device that can be worn by the user. As a result, the gaze direction can be measured while the user moves freely in space.
FIG. 2 shows the basic configuration of the line-of-sight measurement device 101 worn by the user.
Reference numeral 201 in FIG. 2 denotes a photographing device for acquiring the user's first-person video; a device equivalent to a camera generally used in personal computers and the like can be used. Reference numeral 202 denotes a spectacle-type device equipped with sensors for detecting the movement of the user's eyes and measuring the gaze direction. Reference numeral 203 denotes a terminal for recording the acquired first-person video data 110 and gaze position data 111 in the databases. The terminal 203 may also transmit the data to the information processing device 102.
 視線計測装置101としては、空間中を利用者が自由に移動できる状態での注視位置の計測が可能で、且つ、一人称映像上での注視位置の取得が可能であれば、携帯型に限らず、据え置き型の装置を用いても良い。 The line-of-sight measurement device 101 is not limited to a portable type, as long as it is possible to measure the gaze position in a state in which the user can move freely in space and acquire the gaze position on the first-person video. , a stationary device may be used.
The information processing device 102 in FIG. 1 is an information processing device for executing each program of the gaze position analysis system.
The input device 103 includes input devices of a general computer, such as a keyboard, buttons, a mouse, or a touch panel, for controlling the start and end of the system.
The output device 104 is a means for presenting the results of the gaze position analysis, the operating status of the system, and the like to the user, and includes the screen of a smartphone or tablet terminal, or a display device of a general computer.
Reference numeral 105 denotes a storage device for storing each program of the gaze position analysis system. The storage device 105 contains a space model creation program 106, an object position/orientation estimation program 107, a gaze position calculation program 108, and a gaze position display program 109.
Here, the information processing device 102 functions as a space model creation unit by executing processing according to the space model creation program 106, as an object position/orientation estimation unit by executing processing according to the object position/orientation estimation program 107, as a gaze position calculation unit by executing processing according to the gaze position calculation program 108, and as a gaze position display unit by executing processing according to the gaze position display program 109.
The database of first-person video data 110 stores first-person video data, that is, video corresponding to the user's field of view. As described above, the first-person video data 110 is assumed to be acquired by the line-of-sight measurement device 101.
FIG. 3 shows an example of the format of the first-person video data 110.
The data name 301 in FIG. 3 is a name given to the first-person video data; any string of characters and symbols can be used. The first-person video data 110 contains a plurality of images acquired at a predetermined time interval, or at arbitrary timing, in the order of acquisition, and the data count 302 in FIG. 3 indicates the number of images contained in the data.
Time 1 (303) indicates the time at which the first image was acquired, image 1 (304) is the first acquired image, and shooting position/orientation 1 (305) indicates the position and orientation of the camera that captured the first image. Similarly, time n (306) is the time at which the n-th image was acquired, image n (307) is the n-th image, and shooting position/orientation n (308) is the position and orientation of the camera that captured the n-th image. As described later, the shooting positions/orientations 305 and 308 are data calculated by the space model creation program 106 and are blank at the time each image is acquired.
Any format may be used for the first-person video data 110, such as a commonly used video format, as long as the image at each time in the data can be easily retrieved. The database of gaze position data 111 stores the gaze position data acquired by the line-of-sight measurement device 101.
FIG. 4 shows an example of the format of the gaze position data 111.
As described above, a gaze position is assumed to be expressed as position coordinates on the first-person video data 110, so each set of gaze position data 111 has corresponding first-person video data 110. The correspondence between the gaze position data 111 and the first-person video data 110 is therefore expressed by writing, in the data name 401, the same name as the data name 301 of the corresponding first-person video data.
The gaze position data 111 also contains a plurality of position coordinates acquired at a predetermined time interval, or at arbitrary timing, in the order of acquisition. The data count 402 in FIG. 4 indicates the number of position coordinates contained in the data. Reference numeral 403 denotes the time at which the position coordinates of the first gaze position were acquired, 404 denotes the position coordinates of the first acquired gaze position, and 405 holds the name of the object when the gaze position calculation program 108 has associated the gaze position data with an object model.
Reference numeral 406 denotes the time at which the position coordinates of the n-th gaze position were acquired, 407 denotes the position coordinates of the n-th acquired gaze position, and 408 holds the name of the object when the gaze position calculation program 108 has associated the gaze position data with an object model. At the time the gaze position data is acquired, no object model has yet been associated, so fields 405 and 408 are blank.
At the time the gaze position data 111 is acquired, each gaze position is a set of position coordinates on the first-person video, that is, two-dimensional coordinate data. When the gaze position data 111 is associated with an object model, it becomes a position on the object model and is rewritten as three-dimensional coordinate data.
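As a purely illustrative sketch of the record layout described for FIG. 4, the following Python fragment models one gaze position entry; the class and field names are assumptions introduced here for explanation and are not part of the embodiment itself.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class GazeRecord:
    """One entry of the gaze position data (fields 403/404/405 and 406/407/408)."""
    time: float                          # acquisition time
    position: Tuple[float, ...]          # 2D image coordinates at first; 3D after association
    object_name: Optional[str] = None    # name of the associated object model; None until set

def associate_with_object(record: GazeRecord,
                          model_position: Tuple[float, float, float],
                          object_name: str) -> GazeRecord:
    # Rewrite the 2D image coordinates as 3D coordinates on the object model
    # and record the object's name, as described above.
    return GazeRecord(time=record.time, position=model_position, object_name=object_name)
```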
The gaze position data 111 described above assumes that the gaze position data acquired by the line-of-sight measurement device 101 and the gaze position data associated with object models are managed as the same data. However, the gaze position data acquired by the line-of-sight measurement device 101 and the gaze position data associated with object models may instead be managed as separate data.
The database of object model data 112 stores three-dimensional models of the objects that exist in the space within the range toward which the user directs his or her line of sight. In the present invention, the three-dimensional model of each object is assumed to be represented as a collection of points describing the shape of the object, that is, as point cloud data.
Object models created with general three-dimensional CAD tools and the like are often represented as collections of polygons, but an object model represented by polygons can easily be converted into a point cloud model. For example, each polygon is first divided into triangles by lines connecting one vertex to the vertices not adjacent to it; each triangle is then repeatedly divided by a line connecting a vertex to the midpoint of the opposite edge; finally, by collecting the vertices of all resulting triangles, the object model represented by polygons can be expressed as point cloud data.
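The following is a minimal sketch of this conversion in Python (NumPy), under the assumption that each polygon is convex and given as an ordered list of 3D vertices; the function names, the rotation of the split vertex between recursion levels, and the subdivision depth are illustrative choices, not part of the embodiment.

```python
import numpy as np

def polygon_to_points(vertices, depth=4):
    """Convert one convex polygon (ordered 3D vertices) into sample points.

    The polygon is fan-triangulated from its first vertex; each triangle is
    repeatedly split by joining a vertex to the midpoint of the opposite edge,
    and the vertices of all final triangles are collected.
    """
    vertices = np.asarray(vertices, dtype=float)
    # Fan triangulation from vertex 0.
    triangles = [(vertices[0], vertices[i], vertices[i + 1])
                 for i in range(1, len(vertices) - 1)]

    def subdivide(tri, d):
        if d == 0:
            return [tri]
        a, b, c = tri
        m = (b + c) / 2.0          # midpoint of the edge opposite the first vertex
        # Rotate the vertex order so later splits use different edges and the
        # sample points spread over the whole triangle.
        return subdivide((b, m, a), d - 1) + subdivide((m, c, a), d - 1)

    points = []
    for tri in triangles:
        for t in subdivide(tri, depth):
            points.extend(t)
    # Deduplicate shared vertices.
    return np.unique(np.round(np.asarray(points), 6), axis=0)

# Example: a unit square becomes a small point cloud.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
cloud = polygon_to_points(square, depth=3)
```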
FIG. 5 shows an example of the format of an object model stored in the database of object model data 112.
Reference numeral 501 in FIG. 5 denotes the name given to the object; any string of characters and symbols can be used. Reference numeral 502 denotes the number of points contained in the object model, 503 denotes the position coordinates of the first point, and 504 denotes the position coordinates of the n-th point. Although the format shown in FIG. 5 contains only the position coordinates of the points, information attached to each point, such as color information, may also be included.
The database of space model data 113 stores a space model, which is a three-dimensional model of the space within the range toward which the user directs his or her line of sight. The space model is data created by the space model creation program 106 from the first-person video data 110 and, like the object models described above, is assumed to be represented as point cloud data.
FIG. 6 shows an example of the format of a space model stored in the database of space model data 113.
The model name 601 in FIG. 6 is a name given to the space model; any string of characters and symbols can be used. Reference numeral 602 denotes the name of the first-person video data used to create the space model, 603 denotes the number of points contained in the space model, 604 denotes the position coordinates of the first point, and 605 denotes the position coordinates of the n-th point. Although the format shown in FIG. 6 contains only the position coordinates of the points, information attached to each point, such as color information, may also be included.
The database of object placement data 114 stores data on the object models that have been matched to the space model by the object position/orientation estimation program 107 and placed on the space model.
FIG. 7 shows an example of the format of the object placement data 114 stored in its database.
Since the object placement data 114 is assumed to be stored for each space model, the name of the corresponding space model is written in the model name 701 in FIG. 7. The object count 702 indicates the number of object models placed on the target space model.
In FIG. 7, reference numeral 703 denotes the name of the first object placed on the space model, 704 denotes the position and orientation of the first object on the space model, and 705 denotes the time at which the first object was placed on the space model. Reference numeral 706 denotes the name of the n-th object placed on the space model, 707 denotes the position and orientation of the n-th object on the space model, and 708 denotes the time at which the n-th object was placed on the space model.
In the present invention, the gaze position on each object is analyzed from the first-person video data 110 and the gaze position data 111 acquired from the line-of-sight measurement device 101, by means of the space model creation program 106, the object position/orientation estimation program 107, and the gaze position calculation program 108. In this embodiment, in particular, the processing is assumed to be performed while the first-person video data 110 and the gaze position data 111 are acquired from the line-of-sight measurement device 101 in real time.
To this end, the space model creation program 106 first continuously reads newly stored data from among the data acquired from the line-of-sight measurement device 101 and stored in the database of first-person video data 110, and creates the space model.
As a technique for creating a space model represented by point cloud data from video data, a combination of the well-known SLAM (Simultaneous Localization and Mapping) and MVS (Multi-View Stereo) methods can be used.
SLAM is a technique that creates coarse point cloud data by analyzing the correspondences between a series of consecutive images obtained while the camera is moved. The MVS method, on the other hand, is a technique that creates more detailed, dense point cloud data by using the analysis results of SLAM.
FIG. 8 illustrates how a space model consisting of point cloud data is created using the SLAM and MVS methods.
In FIG. 8, reference numeral 801 denotes an object that exists within the range over which the user's line of sight moves; for simplicity, it is assumed that no other objects exist in the surroundings. Reference numerals 802 and 804 denote the positions of the camera when the first-person video was captured, and 803 and 805 illustrate the camera's field of view and orientation when the first-person video was captured from the respective camera positions.
As shown at 802 and 804, a space model represented by point cloud data, as shown at 806, is created by using images of the same object or the same location captured from multiple positions.
Although only two camera positions are shown in FIG. 8, a larger number of images is generally used in order to create a more accurate space model 806. By using the techniques described above, the space model 806 can be created in real time as the user's gaze position moves. Besides the above, any technique may be used in the space model creation program 106 as long as it can likewise create a space model represented by point cloud data.
Furthermore, by using the SLAM technique, the position and orientation of the camera that captured the first-person video on the created space model can be calculated at the same time. As described later, the camera position and orientation are information required for calculating the gaze position on the object model. Therefore, when a space model creation technique that cannot calculate the camera position and orientation is used, a separate means for obtaining the camera position and orientation on the space model must be employed. Any technique for obtaining a position and orientation in space can be used, for example a sensor that measures position and orientation.
The object position/orientation estimation program 107 is a program for matching the point cloud data of an object model to part of the point cloud data of the space model created by the space model creation program 106, that is, fitting the object model so that its position and orientation agree well with the space model, and thereby determining the position and orientation of the object model in the space model.
As techniques used in the object position/orientation estimation program 107, the well-known ICP (Iterative Closest Point) algorithm or NDT (Normal Distributions Transform) algorithm can be used. Alternatively, any technique may be used as long as it can match point clouds to each other and determine the position and orientation of the object model in the space model. The timing at which the object position/orientation estimation program 107 performs its processing is controlled by the gaze position calculation program 108 described later.
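As one concrete possibility, the following is a minimal point-to-point ICP iteration in Python (NumPy/SciPy). It is a sketch of the general, well-known algorithm under illustrative convergence settings, not the implementation of the embodiment.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(object_pts, space_pts, iterations=50, tol=1e-6):
    """Fit object_pts (N,3) to space_pts (M,3); return rotation R and translation t."""
    R, t = np.eye(3), np.zeros(3)
    src = object_pts.copy()
    tree = cKDTree(space_pts)
    prev_err = np.inf
    for _ in range(iterations):
        # 1. Nearest space-model point for every object-model point.
        dist, idx = tree.query(src)
        tgt = space_pts[idx]
        # 2. Best rigid transform between the matched pairs (Kabsch / SVD).
        src_c, tgt_c = src.mean(axis=0), tgt.mean(axis=0)
        H = (src - src_c).T @ (tgt - tgt_c)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T
        t_step = tgt_c - R_step @ src_c
        # 3. Apply the step and accumulate the overall transform.
        src = (R_step @ src.T).T + t_step
        R, t = R_step @ R, R_step @ t + t_step
        err = dist.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R, t
```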
The gaze position calculation program 108 is a program that determines the gaze position on an object model, using the gaze position data 111 acquired by the line-of-sight measurement device 101, the object model data 112, and the space model data 113 created by the space model creation program 106.
The flow of processing of the gaze position calculation program 108 will be described with reference to the flowchart of FIG. 9.
In step 901 of FIG. 9, new gaze position data 111 is obtained from the gaze position data 111 stored in its database. Alternatively, new gaze position data may be obtained directly from the line-of-sight measurement device 101.
In step 902, the gaze direction on the space model is calculated using the newly obtained gaze position data 111 and the shooting positions/orientations 305 and 308 that were calculated when the space model was created by the space model creation program 106 and written into the first-person video data 110.
The gaze direction on the space model is represented by a vector that indicates the starting point of the gaze and the line-of-sight direction from that point. To calculate the gaze direction on the space model, the shooting position/orientation of the first-person video at the same time as the newly obtained gaze position data 111 is first obtained from the first-person video data 110.
If no shooting position/orientation exists at exactly the same time, the shooting positions/orientations of the first-person video data 110 at the times immediately before and after the time of the gaze position data 111 can, for example, be obtained, and the shooting position/orientation interpolated according to its relation to the time of the gaze position data 111 can be used; a sketch of one such interpolation is given below. Next, using the obtained shooting position/orientation, the gaze position on the first-person video is converted into a gaze position on the space model by coordinate transformation.
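A minimal sketch of such interpolation in Python (NumPy/SciPy), under the assumption that the orientation part of each shooting position/orientation is stored as a quaternion; the record layout is an assumption for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def interpolate_pose(t, t0, pos0, quat0, t1, pos1, quat1):
    """Interpolate a camera pose at time t between the poses at times t0 and t1.

    Positions are interpolated linearly, orientations by spherical linear
    interpolation (slerp) of the two quaternions.
    """
    w = (t - t0) / (t1 - t0)
    position = (1.0 - w) * np.asarray(pos0) + w * np.asarray(pos1)
    slerp = Slerp([t0, t1], Rotation.from_quat([quat0, quat1]))
    orientation = slerp(t)            # Rotation at time t
    return position, orientation.as_quat()
```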
As described above, the gaze position data 111 is expressed as a gaze position on the first-person video, but by applying information such as the camera's viewing angle and focal length to the well-known pinhole camera model, the positional relationship among the shooting position, the first-person video, and the gaze position can be obtained at the same scale as the actual space. Furthermore, by transforming this positional relationship with the shooting position/orientation obtained from the first-person video data 110, the positional relationship among the camera's shooting position, the first-person video, and the gaze position on the space model can be represented, as shown in FIG. 10.
In FIG. 10, reference numeral 1001 denotes the space model, 1002 denotes the shooting position on the space model 1001 of the camera that captured the first-person video, and 1003 denotes the shooting range of the first-person video on the space model 1001, which corresponds to the camera's shooting orientation.
Reference numeral 1004 denotes the gaze position on the first-person video mapped onto the space model 1001. The gaze direction on the space model 1001 can be obtained as a vector 1005 in FIG. 10 that starts at 1002 and passes through the gaze position 1004 on the first-person video.
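The following Python (NumPy) sketch illustrates how such a gaze direction vector can be formed from the 2D gaze position and the camera pose using the pinhole camera model; the intrinsic parameters and the pose representation are assumptions made for illustration.

```python
import numpy as np

def gaze_ray(gaze_px, fx, fy, cx, cy, cam_R, cam_t):
    """Return (origin, unit direction) of the gaze ray in space-model coordinates.

    gaze_px        : (u, v) gaze position in image pixels
    fx, fy, cx, cy : pinhole intrinsics (focal lengths and principal point)
    cam_R          : (3,3) rotation of the camera in the space model
    cam_t          : (3,)  position of the camera in the space model
    """
    u, v = gaze_px
    # Back-project the pixel onto the normalized image plane (camera coordinates).
    d_cam = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    # Rotate into space-model coordinates; the ray starts at the camera position.
    d_world = cam_R @ d_cam
    return np.asarray(cam_t, dtype=float), d_world / np.linalg.norm(d_world)
```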
In step 903 of FIG. 9, it is determined whether an object model placed on the space model 1001 exists in the gaze direction on the space model 1001 obtained in step 902. If it is determined that such an object model exists in the gaze direction, the process proceeds to step 906; otherwise, it proceeds to step 904.
As a method of making this determination, first, information on the object models placed on the target space model 1001 is obtained from the object placement data 114, and the point cloud data of the corresponding object models is read from the object model data 112 and placed on the space model. Next, the distance between a vector representing the gaze direction on the space model 1001 (hereinafter, the gaze direction vector), for example the vector 1005 in FIG. 10, and each point of the object model placed on the space model 1001 is calculated, and the smallest distance to the gaze direction vector is selected.
If the selected distance is equal to or less than a predetermined threshold, it can be determined that the target object model exists in the gaze direction. As the distance threshold, for example, the largest inter-point distance in the point cloud data constituting the object model, or half of that largest distance, can be chosen. As an alternative determination method, the target object model may be determined to exist in the gaze direction when the number of points whose distance from the gaze direction vector on the space model is equal to or less than a predetermined threshold is itself equal to or greater than a predetermined number.
Alternatively, in addition to the point cloud data, polygon data, that is, data represented as a set of polygons, may also be stored as object models in the database of object model data 112; the polygon data of the corresponding object model is likewise placed on the space model 1001, and when there is a point at which any polygon placed on the space model 1001 intersects the vector representing the line-of-sight direction placed on the space model, the target object model may be determined to exist in the gaze direction.
Furthermore, when a plurality of objects are placed on the space model 1001, the position along the gaze direction vector of the point closest to the gaze direction vector 1005, or of the polygon intersected by the gaze direction vector 1005, is determined, and the object model containing the point or polygon closest to the starting point of the gaze direction vector 1005, that is, to the shooting position, may be selected as the object model existing in the gaze direction.
Furthermore, a virtual space capable of representing the same positional relationships as the space model 1001 may be prepared; each time an object is placed on the space model 1001, the object model is also placed in the virtual space, and the determination processing described above is performed in the virtual space.
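A minimal Python (NumPy) sketch of the point cloud variant of this determination, i.e. checking whether any point of a placed object model lies within a threshold distance of the gaze direction vector, and selecting the nearest such object when several are placed; the threshold and array shapes are assumptions for illustration.

```python
import numpy as np

def point_ray_distances(points, origin, direction):
    """Perpendicular distance of each point (N,3) from the ray (origin, unit direction)."""
    v = points - origin
    proj = v @ direction                       # signed distance along the ray
    closest = origin + np.outer(proj, direction)
    return np.linalg.norm(points - closest, axis=1), proj

def object_in_gaze_direction(object_points, origin, direction, threshold):
    """Step-903-style test: is this placed object model hit by the gaze direction vector?"""
    dist, proj = point_ray_distances(object_points, origin, direction)
    hits = (dist <= threshold) & (proj > 0.0)  # only points in front of the camera
    return bool(hits.any())

def nearest_gazed_object(models, origin, direction, threshold):
    """Among several placed object models (name -> points), pick the one nearest the camera."""
    best_name, best_proj = None, np.inf
    for name, pts in models.items():
        dist, proj = point_ray_distances(pts, origin, direction)
        hits = (dist <= threshold) & (proj > 0.0)
        if hits.any() and proj[hits].min() < best_proj:
            best_name, best_proj = name, proj[hits].min()
    return best_name
```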
In step 904, the object position/orientation estimation program 107 is used to match an object model to the point cloud data of the space model 1001 that exists in the gaze direction on the space model 1001. In FIG. 10, this processing makes it possible to replace part of the space model 1001 with the target object model 1006 by adjusting the position and orientation of the object model 1006 with respect to the space model 1001 in various ways.
As the point cloud data of the space model 1001 existing in the line-of-sight direction, for example, the point cloud data of the space model whose distance from the gaze direction vector 1005 is equal to or less than a predetermined distance can be selected. As the predetermined distance, the maximum size of the object model to be matched, half of that maximum size, or the like can be used.
Alternatively, the point cloud data existing within a region of a specific shape of a predetermined size centered on the gaze direction vector 1005 may be used. It is also possible to determine the position of each point of the point cloud data along the gaze direction vector and to select points within a predetermined range centered on the location where the point cloud data is most concentrated along the gaze direction vector. Besides the above, any method may be used as long as it can select the point cloud data to be matched with the gaze direction vector as the reference.
The object models matched to the point cloud data of the space model may be all object models stored in the database of object model data 112. Alternatively, a well-known QR code (registered trademark), an AR marker, or a symbol or character string representing the name of the object may be placed on the actual object and read using well-known image recognition or character recognition techniques, and the corresponding object model may be selected from the object model data 112 and matched.
Reference numeral 1101 in FIG. 11 shows an example of an AR marker placed on an actual object. Alternatively, well-known point cloud features (three-dimensional features) may be extracted from the point cloud data of the space model and the point cloud data of the object models, and an object model having point cloud features similar to those of the point cloud data of the space model may be selected from the object model data 112 and matched.
In a case such as assembly work, the shape and size of the object existing at a particular location change over time. In such a case, as shown at 1201, 1202, and 1203 in FIG. 12, object models representing the state of the object at each step of the work, together with their order relationship, are stored in the object model data 112, and by matching them to the space model in order, the object model corresponding to the current step of the work can be matched to the space model.
Specifically, for example, if no object model has yet been matched to the point cloud data of the space model existing in the gaze direction, the object model corresponding to the first step is selected and matched.
On the other hand, if an object model corresponding to some step has already been matched to the point cloud data of the space model existing in the gaze direction, the object model corresponding to the next step is selected and matched to the point cloud data of the space model; if the object model corresponding to the next step is matched to the space model with a higher degree of agreement than that of the already matched object model, the already matched object model is replaced with the object model corresponding to the next step.
As the degree of agreement, the same measure as that used in the processing of step 905 described later can be used. Furthermore, the matching of the object model corresponding to the next step to the space model may be performed when the gaze position moves onto the already matched object model, or when a change in the space model is detected by the space model creation program 106.
A change in the space model can be detected in the space model creation program 106 by, for example, comparing the space model generated from a newly acquired first-person video image and a predetermined number of immediately preceding first-person video images with the space model generated before that, and determining that the space model has changed if the difference between the two is equal to or greater than a predetermined threshold.
The difference between two space models can be calculated, for example, by searching, for each point of one space model, for the closest point of the other space model, calculating the distance to the found point, and averaging the distances over all points of the one space model. Alternatively, the number of points for which the calculated distance is equal to or greater than a predetermined threshold may be used. Any other method may be used as long as it can calculate a difference between point clouds.
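A minimal Python (NumPy/SciPy) sketch of this difference measure between two point clouds; both variants described above are computed, and the threshold value is an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree

def cloud_difference(cloud_a, cloud_b, threshold=None):
    """Difference between point clouds cloud_a (N,3) and cloud_b (M,3).

    Returns the mean nearest-neighbor distance from cloud_a to cloud_b and,
    if a threshold is given, the number of points of cloud_a farther than it.
    """
    dist, _ = cKDTree(cloud_b).query(cloud_a)      # nearest point of cloud_b for each point of cloud_a
    mean_diff = float(dist.mean())
    count_over = int((dist >= threshold).sum()) if threshold is not None else None
    return mean_diff, count_over
```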
In step 905, it is determined whether an object model has been correctly matched to the point cloud data of the space model existing in the gaze direction on the space model. If it has been correctly matched, the matching result of the object model is stored in the database of object placement data 114 and the process proceeds to step 906; otherwise, the process proceeds to step 908.
As a method of determining whether an object model has been correctly matched to the point cloud data of the space model, the degree of agreement between the point cloud data of the space model and the object model placed on the space model based on the result of the matching processing is calculated, and the object model may be determined to have been correctly matched if the degree of agreement is greater than a predetermined value.
When a plurality of object models are matched to the point cloud data of the space model, the highest degree of agreement is selected; if the selected degree of agreement is greater than the predetermined value, it is determined that an object model has been correctly matched, and the object model corresponding to the selected degree of agreement is judged to have been matched to the point cloud data of the space model.
As the degree of agreement, for example, for each point of the object model placed on the space model using the matching result, the closest point of the point cloud data of the space model is searched for, the number of points for which the found distance is smaller than a predetermined threshold is counted, and the ratio of that number to the total number of points of the object model can be used. Besides the above, any index may be used as the degree of agreement between the point cloud data of the space model and the object model, as long as it can evaluate the quality of the matching result between point clouds.
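A Python (NumPy/SciPy) sketch of this inlier-ratio style degree of agreement; the pose representation (R, t) and the threshold are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree

def matching_agreement(object_points, space_points, R, t, threshold):
    """Ratio of object-model points that land close to the space model after matching.

    object_points : (N,3) object model point cloud (model coordinates)
    space_points  : (M,3) space model point cloud
    R, t          : rotation and translation obtained from the matching step
    """
    placed = (R @ object_points.T).T + t            # object model placed on the space model
    dist, _ = cKDTree(space_points).query(placed)   # nearest space-model point per model point
    return float((dist < threshold).mean())         # fraction of inliers in [0, 1]
```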
In step 906, the gaze position on the object model placed on the space model is calculated using the matching result of step 904. Since the relationship between the object model placed on the space model and the gaze direction vector on the space model is that shown at 1005 in FIG. 10, processing similar to that of step 903 described above, that is, the processing for determining whether an object model placed on the space model exists in the gaze direction on the space model, can be used.
However, whereas step 903, in the case where the object model is point cloud data, focused on whether any point exists whose distance from the gaze direction vector 1005 is equal to or less than a predetermined threshold, the calculation of the gaze position differs in that the point whose distance from the gaze direction vector is equal to or less than the predetermined threshold and which is closest to the shooting position 1002 is selected and taken as the gaze position.
Alternatively, when there are a plurality of points whose distance from the gaze direction vector 1005 is equal to or less than the predetermined threshold and which lie within a predetermined range from the point closest to the shooting position 1002, the average of the point closest to the shooting position 1002 and the points within the predetermined range from it may be used as the gaze position.
Alternatively, when the object model is represented as a collection of polygons, the intersections between the gaze direction vector and each polygon of the object model may be calculated, and the intersection closest to the shooting position 1002 may be taken as the gaze position.
Since the gaze position obtained as described above is a gaze position on the space model, in step 906 the obtained gaze position is coordinate-transformed using the position and orientation of the object model on the space model obtained from the matching result of step 904. As a result, the gaze position 1004 on the space model 1001 is converted into the gaze position 1007 on the object model 1006, that is, into the coordinate system of the object model.
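A Python (NumPy) sketch of this part of step 906 for the point cloud case: the gaze point is taken as the model point nearest the shooting position among the points within the threshold distance of the gaze direction vector, and is then expressed in the object model's coordinate system. The pose (R, t) stands for the object placement obtained from the matching step and is an assumption of the sketch.

```python
import numpy as np

def gaze_point_on_object(placed_points, origin, direction, threshold):
    """Gaze point on an object model already placed on the space model.

    placed_points     : (N,3) object model points in space-model coordinates
    origin, direction : gaze direction vector (origin and unit direction) from the shooting position
    """
    v = placed_points - origin
    proj = v @ direction
    closest = origin + np.outer(proj, direction)
    dist = np.linalg.norm(placed_points - closest, axis=1)
    hits = (dist <= threshold) & (proj > 0.0)
    if not hits.any():
        return None
    # Among the points near the ray, take the one nearest the shooting position.
    idx = np.where(hits)[0][np.argmin(proj[hits])]
    return placed_points[idx]

def to_object_coordinates(gaze_point, R, t):
    """Convert a gaze point from space-model coordinates into the object model's coordinate
    system, where R, t place the object model on the space model (p_space = R @ p_model + t)."""
    return R.T @ (np.asarray(gaze_point) - t)
```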
Furthermore, when an object model is placed on the space model in step 905, the gaze position on the placed object model may also be calculated in step 906 for past gaze position data. This is because, for example in the initial stage of processing of the gaze position calculation program 108, the point cloud data for each object on the space model may still be sparse and the object model may not yet be placeable on the space model.
In step 907, the gaze position on the object model obtained in step 906 is stored in the gaze position data 111. When the gaze position on the object model is stored, the shooting position/orientation of the first-person video on the space model, the gaze position in the space model, the position and orientation of the object model on the space model, and the like may also be stored together.
In step 908, the processing ends if an end instruction has been given from the input device 103 or the like; otherwise, the process returns to step 901.
In the gaze position calculation program 108 described above, the first-person video data 110 and the gaze position data 111 are acquired in real time, the object model data 112 is placed on the space model while the space model data 113 is generated, and the gaze position on the object model is calculated. However, after the acquisition of the target first-person video data 110 and gaze position data 111 has been completed, the processing of the gaze position calculation program 108 may instead be executed while reading, in order, the data stored in the first-person video data 110 and the gaze position data 111.
In that case, the gaze position on the object model may be calculated after the space model has been created, or after the space model has been created and the object models have been placed on the space model. Also, in that case, the processing described above may be performed after object models have been manually placed at locations on the space model where no object model has been placed.
Alternatively, after the processing of the gaze position calculation program 108 has been executed, the gaze position calculation program 108 may be executed again using the space model data 113 and the object placement data 114 generated by the earlier execution. This is because, for example in the initial stage of processing of the gaze position calculation program 108, the point cloud data for each object on the space model may still be sparse and object models may not yet be placeable on the space model.
In such a case, some of the gaze positions stored in the gaze position data 111 may remain without a corresponding gaze position on an object model ever being calculated. In this case, the gaze position calculation program 108 is executed with the following modifications.
The first modification is that in step 901 the gaze position data is obtained by reading the data stored in the gaze position data 111 in the order in which it was stored. In addition, a process is added that determines whether the gaze position on an object model has already been calculated for the obtained gaze position data and, if so, proceeds to step 908.
The second modification is that in step 903, when no object model is placed in the gaze direction on the space model, the process proceeds to step 908 instead of step 904.
The third modification is to remove steps 904 and 905. By making the above three modifications to the gaze position calculation program 108, the gaze position on an object model can be calculated for gaze positions for which this calculation has not yet been performed. Also, in that case, the processing described above may be performed after object models have been manually placed at locations on the space model where no object model has been placed.
The gaze position display program 109 is a program that displays the gaze position data in response to instructions from the user. The gaze positions are displayed by methods well known in gaze analysis, for example a heat map representing the frequency distribution of gaze positions, or a method that visualizes, for example by the size of a circle, the length of time for which the gaze position remained within a certain range.
In addition to displaying the gaze positions on the first-person video, the gaze positions on the object models are also displayed. As a method of displaying gaze positions on an object model, a virtual space model, that is, a virtual space corresponding to the space model, is prepared, the object models are placed in the virtual space model on the basis of the data stored in the object placement data 114, and the gaze positions are displayed on the object models placed in the virtual space model. Alternatively, only a particular object model may be selected and the gaze positions displayed only on the selected object model.
Furthermore, when gaze positions are displayed on an object model, the virtual space model or the object model may be adjusted so that each gaze position always faces the user.
Reference numeral 1301 in FIG. 13 shows an example of an object model, and 1302, 1303, 1304, and 1305 show examples of gaze positions on the object model. FIG. 14 shows an example in which the gaze positions shown in FIG. 13 are displayed so as to always face the user.
In FIG. 14, reference numeral 1401 denotes the object model displayed so that the gaze point 1302 faces the user, and 1402 denotes that gaze point displayed on 1401. Reference numeral 1403 denotes the object model displayed so that the gaze point 1303 faces the user, and 1404 denotes that gaze point displayed on 1403. Reference numeral 1405 denotes the object model displayed so that the gaze point 1304 faces the user, and 1406 denotes that gaze point displayed on 1405. Reference numeral 1407 denotes the object model displayed so that the gaze point 1305 faces the user, and 1408 denotes that gaze point displayed on 1407.
In the display method shown in FIG. 14, the gaze positions are displayed by adjusting the object model so that each gaze position fully faces the user. To do this, for example when the object model is point cloud data, points within a predetermined range containing the gaze position are selected, the direction of the normal of the plane containing the gaze position is determined from the selected points, and the position and orientation of the object model are adjusted so that the obtained normal points toward the user. When the object model is represented as a collection of polygons, the polygon containing the gaze position is selected from the object model, and the position and orientation of the object model are adjusted so that its normal points toward the user.
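A Python (NumPy) sketch of the point cloud variant: the local plane normal around the gaze position is estimated by principal component analysis of the neighboring points, and the rotation that turns this normal toward the viewing direction is computed. The neighborhood radius and the choice of "toward the user" direction are assumptions of the sketch.

```python
import numpy as np

def local_normal(points, gaze_pos, radius):
    """Estimate the normal of the plane through the points near the gaze position (PCA)."""
    nearby = points[np.linalg.norm(points - gaze_pos, axis=1) <= radius]
    centered = nearby - nearby.mean(axis=0)
    # The normal is the direction of least variance: the last right-singular vector.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[-1] / np.linalg.norm(vt[-1])

def rotation_facing_user(normal, view_dir=np.array([0.0, 0.0, -1.0])):
    """Rotation matrix that turns `normal` so it points along `view_dir` (toward the viewer)."""
    a = normal / np.linalg.norm(normal)
    b = view_dir / np.linalg.norm(view_dir)
    v, c = np.cross(a, b), float(a @ b)
    if np.isclose(c, -1.0):
        # Opposite vectors: rotate 180 degrees about any axis orthogonal to a.
        axis = np.cross(a, np.array([1.0, 0.0, 0.0]))
        if np.linalg.norm(axis) < 1e-8:
            axis = np.cross(a, np.array([0.0, 1.0, 0.0]))
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + K + K @ K / (1.0 + c)     # Rodrigues' formula aligning a with b
```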
When displaying the gaze positions, restrictions may also be placed on how the position and orientation of the displayed object model may be adjusted. Furthermore, by adjusting the degree to which each gaze position faces the user, large changes in the orientation of the object model when displaying the gaze positions can be suppressed.
For example, the first gaze position is displayed facing the user, and for subsequent gaze positions the position and orientation of the object model are adjusted so that the degree to which the gaze position faces the user does not exceed a predetermined range. As the degree to which a gaze position faces the user, the angle or the inner product between the normal direction of the plane containing the gaze position and the direction facing the user, that is, the direction perpendicular to the screen, can be used.
FIG. 15 shows a display example in which the gaze positions are displayed by the method described above.
FIG. 15 shows a case in which, when the object model is displayed, its vertical coordinate axis is tilted toward the user, and only rotation about the vertical coordinate axis of the object model is permitted as the method of adjusting the position and orientation when displaying a gaze position. The position and orientation of the object model 1501 are adjusted so that the first gaze position 1502 faces the user.
In FIG. 15, however, the vertical coordinate axis of the object model is tilted, so the gaze positions cannot be made to face the user completely; the position and orientation are therefore adjusted so that they face the user as closely as possible. Specifically, the normal of the plane containing the gaze position is determined in the same manner as described above, and the position and orientation of the object model are adjusted so that the difference between the obtained normal and the direction facing the user, that is, the direction perpendicular to the screen, becomes smallest. In the object model 1503, the gaze position 1504 is displayed facing the user to a lesser degree than the gaze position 1502, that is, the gaze position can be confirmed but does not directly face the user. The gaze position 1506 on the object model 1505 and the gaze position 1508 on the object model 1507 are displayed in the same manner as the gaze position 1504.
As described above, the gaze position analysis system according to the embodiment of the present invention generates, from the first-person video, which is a video corresponding to the user's field of view, a space model, which is a three-dimensional model of the space within the range toward which the user directed his or her line of sight, and calculates the shooting position/orientation of the first-person video in the space model; calculates the gaze direction on the space model using the shooting position/orientation of the first-person video on the space model and the gaze position on the first-person video; places the object model, which is a three-dimensional model of an object, in the space model according to the position and orientation of the object on the space model estimated by matching the object model with the space model; and calculates the gaze position on the object model by determining the intersection of the gaze direction on the space model with the object model.
According to the embodiment of the present invention, for a line-of-sight measurement device worn by the user, the user's gaze positions can be automatically associated with the three-dimensional models of the various objects existing in the space, on the basis of the user's line-of-sight information measured while the user moves freely through the space to be measured.
101 line-of-sight measurement device
102 information processing device
103 input device
104 output device
105 storage device
106 space model creation program
107 object position/orientation estimation program
108 gaze position calculation program
109 gaze position display program
110 first-person video data
111 gaze position data
112 object model data
113 space model data
114 object placement data

Claims (15)

1.  A gaze position analysis system comprising a space model creation unit, an object position/orientation estimation unit, and a gaze position calculation unit, the system automatically associating an object model gaze position, which is a user's gaze position on an object model, with the object model, the object model being a three-dimensional model of an object existing in a space, wherein
    the space model creation unit
    acquires a first-person video, which is a video corresponding to the user's field of view,
    acquires a first-person video gaze position, which is the user's gaze position on the first-person video,
    creates, from the first-person video, a space model, which is a three-dimensional model of the space within the range toward which the user directed his or her line of sight, and
    calculates a shooting position/orientation of the first-person video in the space model;
    the object position/orientation estimation unit
    estimates a position and orientation of the object in the space model by matching the object model with the space model,
    places the object model in the space model using the position and orientation of the object in the space model, and
    calculates a gaze direction in the space model using the shooting position/orientation of the first-person video in the space model and the first-person video gaze position; and
    the gaze position calculation unit
    calculates the object model gaze position by determining an intersection of the gaze direction in the space model with the object model.
2.  The gaze position analysis system according to claim 1, wherein the object position/orientation estimation unit,
    when data to which the object model has not been matched exists in the gaze direction in the space model, estimates the position and orientation of the object in the space model by matching the object model with the data in the space model to which the object model has not been matched.
3.  The gaze position analysis system according to claim 1, wherein the object position/orientation estimation unit
    selects, from the data in the space model, data in the space model existing within a predetermined range with the gaze direction as a reference, and
    estimates the position and orientation of the object in the space model by matching the object model with the selected data in the space model.
4.  The gaze position analysis system according to claim 3, wherein the object position/orientation estimation unit
reads a preset image pattern, character string, or symbol string from the first-person video,
selects the object model corresponding to the image pattern, character string, or symbol string,
calculates the position of the image pattern, character string, or symbol string in the space model from the space model, the shooting position and orientation of the first-person video in the space model, and the position of the image pattern, character string, or symbol string in the first-person video,
selects data in the space model that exists within the predetermined range with respect to the position of the image pattern, character string, or symbol string in the space model, and
estimates the position and orientation of the object in the space model by matching the selected object model against the selected data in the space model.
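One possible, purely illustrative way to read a preset character string from the first-person video and localize it in the space model, as described in claim 4. The OCR library (pytesseract), the label-to-model table, and the availability of a depth map aligned with the frame are assumptions, not part of the claim.

```python
# Sketch only: detect a known label string and back-project it into the space model.
import numpy as np
import pytesseract

LABEL_TO_MODEL = {"PUMP-01": "pump_model.ply", "VALVE-07": "valve_model.ply"}  # hypothetical

def locate_label(frame, depth, cam_pos, cam_rot, fx, fy, cx, cy):
    """Return (object model file, 3D label position in the space model) or None."""
    data = pytesseract.image_to_data(frame, output_type=pytesseract.Output.DICT)
    for text, left, top, w, h in zip(data["text"], data["left"], data["top"],
                                     data["width"], data["height"]):
        label = text.strip()
        if label not in LABEL_TO_MODEL:
            continue
        u, v = left + w / 2.0, top + h / 2.0        # centre of the detected string
        z = float(depth[int(v), int(u)])            # depth sampled from the space model
        p_cam = np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])
        p_world = cam_rot @ p_cam + cam_pos         # position in the space model
        return LABEL_TO_MODEL[label], p_world
    return None
```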
5.  The gaze position analysis system according to claim 1, wherein the object position/orientation estimation unit
calculates a difference between data representing a space model created from the first-person video at a certain time and the data, within the already created space model, of the range corresponding to the first-person video at that time,
replaces, when the calculated difference is larger than a predetermined threshold, the data of the corresponding range in the space model with the data representing the space model created from the first-person video,
performs matching of the object model against the data at the replaced location, and
replaces the object model that had been matched against the data before the replacement with the newly matched object model.
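A minimal sketch of the difference test in claim 5, assuming both the already created space model and the data rebuilt from the current frame are compared as voxel occupancy over a common grid; the voxel size and threshold are illustrative.

```python
# Sketch only: voxel-occupancy difference between old and newly built space-model data.
import numpy as np

def voxel_set(points, voxel=0.05):
    """Quantise a point cloud into a set of occupied voxel indices."""
    return {tuple(idx) for idx in np.floor(points / voxel).astype(int)}

def region_changed(old_points, new_points, voxel=0.05, threshold=0.3):
    """True if the overlapping region differs by more than `threshold`."""
    old_v, new_v = voxel_set(old_points, voxel), voxel_set(new_points, voxel)
    union = old_v | new_v
    if not union:
        return False
    difference = len(old_v ^ new_v) / len(union)     # symmetric-difference ratio
    return difference > threshold
```

When the function returns True, the corresponding range of the space model would be replaced by the newly created data and object-model matching repeated there.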
6.  The gaze position analysis system according to claim 1, further comprising an object model storage unit that stores, for an object whose shape or structure changes, the object models corresponding to the respective stages of the change and the order relationship of the changes, wherein
the object position/orientation estimation unit, for a location to which the object model corresponding to the changing object has been matched, selects from the object model storage unit, each time a change in the space model is detected, the object model corresponding to the stage of change of the object, and matches the selected object model against the space model.
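A minimal sketch of the object model storage unit of claim 6, keeping an ordered list of stage models per changing object and advancing to the next stage model each time a change is detected at the matched location; the object and file names are hypothetical.

```python
# Sketch only: ordered stage models for objects whose shape or structure changes.
class ChangingObjectModelStore:
    def __init__(self):
        # object name -> ordered stage models (the order encodes the change sequence)
        self.stages = {"assembly_A": ["stage0.ply", "stage1.ply", "stage2.ply"]}
        self.current = {name: 0 for name in self.stages}

    def next_stage_model(self, name):
        """Advance to the model of the next change stage and return it."""
        idx = min(self.current[name] + 1, len(self.stages[name]) - 1)
        self.current[name] = idx
        return self.stages[name][idx]
```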
7.  The gaze position analysis system according to claim 1, wherein the gaze position calculation unit, when any object model is newly matched to any location in the space model, determines, for the previously obtained object model gaze positions, whether an object model gaze position lies on the object model newly matched onto the space model.
8.  The gaze position analysis system according to claim 1, wherein
the object position/orientation estimation unit allows the user to manually match the object model to a location in the space model to which the object model has not been matched, and
the gaze position calculation unit determines, for all object model gaze positions that have already been obtained, whether an object model gaze position lies on the object model manually matched onto the space model by the user.
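A rough sketch of the re-association check in claims 7 and 8: previously obtained object model gaze positions (3D points in space-model coordinates) are tested against a newly matched object model, here approximated by distance to the model's transformed vertices with an illustrative tolerance.

```python
# Sketch only: find which stored gaze positions fall on a newly matched model.
import numpy as np

def reassociate(gaze_points, model_vertices, tol=0.02):
    """Return indices of previous gaze positions lying on the new object model."""
    hits = []
    for i, p in enumerate(gaze_points):
        dist = np.min(np.linalg.norm(model_vertices - p, axis=1))
        if dist < tol:                      # close enough to the model surface
            hits.append(i)
    return hits
```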
9.  The gaze position analysis system according to claim 1, wherein the gaze position calculation unit stores at least one of the name of the object model, the shooting position and orientation of the first-person video in the space model, the first-person video gaze position in the space model, and the position and orientation of the object model in the space model, in association with the object model gaze position on the object model.
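As a sketch of the association described in claim 9, each object model gaze position could be stored as a record bundling the listed attributes; the field names are illustrative.

```python
# Sketch only: record stored per object model gaze position.
from dataclasses import dataclass
import numpy as np

@dataclass
class GazeRecord:
    time: float                    # timestamp of the first-person frame
    model_name: str                # name of the gazed object model
    gaze_on_model: np.ndarray      # object model gaze position (model coordinates)
    camera_pose: np.ndarray        # 4x4 shooting position/orientation in the space model
    gaze_in_space: np.ndarray      # first-person video gaze position mapped into the space model
    model_pose: np.ndarray         # 4x4 pose of the object model in the space model
```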
10.  The gaze position analysis system according to claim 1, further comprising a gaze position display unit that displays the object model gaze position on the object model, wherein
the gaze position display unit, when displaying the object model gaze position on the object model, adjusts the position and orientation of the object model so that the object model gaze position at each time is always displayed facing the front.
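A minimal sketch of the display adjustment in claim 10, assuming the display camera looks along -Z: the object model is rotated about its centroid so that the direction from the centroid to the gaze position points toward the viewer, using Rodrigues' formula for the rotation between two unit vectors.

```python
# Sketch only: rotate the model so the gaze position faces the display camera.
import numpy as np

def rotation_between(a, b):
    """Rotation matrix turning unit vector a onto unit vector b."""
    v = np.cross(a, b)
    c = float(np.dot(a, b))
    if np.linalg.norm(v) < 1e-9:
        if c > 0:
            return np.eye(3)                       # already aligned
        axis = np.cross(a, np.array([1.0, 0.0, 0.0]))
        if np.linalg.norm(axis) < 1e-9:
            axis = np.cross(a, np.array([0.0, 1.0, 0.0]))
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)   # 180-degree rotation
    vx = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
    return np.eye(3) + vx + vx @ vx * (1.0 / (1.0 + c))

def face_gaze_to_viewer(vertices, gaze_point, view_dir=np.array([0.0, 0.0, -1.0])):
    """Return vertices rotated about the centroid so the gaze point faces the viewer."""
    centre = vertices.mean(axis=0)
    outward = gaze_point - centre
    outward /= np.linalg.norm(outward)
    R = rotation_between(outward, -view_dir)       # turn the gaze point toward the camera
    return (vertices - centre) @ R.T + centre
```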
11.  The gaze position analysis system according to claim 1, further comprising a line-of-sight measurement device that captures the first-person video of the user and measures the first-person video gaze position while the user is free to move in the space, wherein
the space model creation unit obtains the first-person video and the first-person video gaze position from the line-of-sight measurement device.
12.  The gaze position analysis system according to claim 11, wherein
the line-of-sight measurement device has an imaging device, and
the space model creation unit creates the surrounding space model from a plurality of images captured by the imaging device.
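As an illustrative stand-in for creating the surrounding space model from a plurality of captured images, the sketch below reconstructs a sparse point cloud from just two overlapping frames of the measurement device's camera using OpenCV (ORB matching, essential matrix, triangulation); a practical system would integrate many frames incrementally, and the intrinsics matrix K is assumed known.

```python
# Sketch only: two-view sparse reconstruction as a stand-in for space-model creation.
import cv2
import numpy as np

def two_view_points(img1, img2, K):
    """Return a sparse 3D point cloud (N, 3) reconstructed from two frames."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, p1, p2, K, mask=mask)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # first camera at the origin
    P2 = K @ np.hstack([R, t])                          # second camera pose
    good = mask.ravel() > 0
    pts4 = cv2.triangulatePoints(P1, P2, p1[good].T, p2[good].T)
    return (pts4[:3] / pts4[3]).T                       # homogeneous -> Euclidean
```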
13.  A gaze position analysis system that automatically associates a user's gaze position on an object model with the object model, the object model being a three-dimensional model of an object existing in a space, the system comprising:
a space model creation unit that creates a space model, which is a three-dimensional model of the surrounding space, from a plurality of captured images;
an object position/orientation estimation unit that matches the space model against the object model and places the object model in the space model according to the position and orientation obtained by the matching; and
a gaze position calculation unit that calculates the gaze position on the object model based on the placed object model and a gaze direction in the space model.
14.  The gaze position analysis system according to claim 13, wherein the gaze position calculation unit calculates the gaze position on the object model by obtaining an intersection point between the gaze direction and the object model in the space model.
15.  A gaze position analysis method for automatically associating an object model gaze position, which is a user's gaze position on an object model, with the object model, the object model being a three-dimensional model of an object existing in a space, the method comprising:
a step of obtaining a first-person video, which is a video approximating the user's field of view;
a step of obtaining a first-person video gaze position, which is the user's gaze position on the first-person video;
a step of creating, from the first-person video, a space model, which is a three-dimensional model of the space within the range to which the user's line of sight is directed;
a step of calculating a shooting position and orientation of the first-person video in the space model;
a step of estimating a position and orientation of the object in the space model by matching the object model against the space model;
a step of placing the object model in the space model using the position and orientation of the object in the space model;
a step of calculating a gaze direction in the space model using the shooting position and orientation of the first-person video in the space model and the first-person video gaze position; and
a step of calculating the object model gaze position by obtaining an intersection point between the gaze direction and the object model in the space model.
PCT/JP2022/036643 2021-10-01 2022-09-30 Gaze position analysis system and gaze position analysis method WO2023054661A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-162786 2021-10-01
JP2021162786A JP2023053631A (en) 2021-10-01 2021-10-01 Gazing point analysis system and gazing point analysis method

Publications (1)

Publication Number Publication Date
WO2023054661A1 true WO2023054661A1 (en) 2023-04-06

Family

ID=85782972

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/036643 WO2023054661A1 (en) 2021-10-01 2022-09-30 Gaze position analysis system and gaze position analysis method

Country Status (2)

Country Link
JP (1) JP2023053631A (en)
WO (1) WO2023054661A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015528359A (en) * 2012-09-17 2015-09-28 ゼンゾモトリック インストルメンツ ゲゼルシャフト フュア イノベイティブ ゼンゾリク ミット ベシュレンクテル ハフツング Method and apparatus for determining a point of interest on a three-dimensional object
JP2019121136A (en) * 2017-12-29 2019-07-22 富士通株式会社 Information processing apparatus, information processing system and information processing method

Also Published As

Publication number Publication date
JP2023053631A (en) 2023-04-13

Similar Documents

Publication Publication Date Title
US10872459B2 (en) Scene recognition using volumetric substitution of real world objects
US9495068B2 (en) Three-dimensional user interface apparatus and three-dimensional operation method
US11222471B2 (en) Implementing three-dimensional augmented reality in smart glasses based on two-dimensional data
US8055061B2 (en) Method and apparatus for generating three-dimensional model information
JP5248806B2 (en) Information processing apparatus and information processing method
JP6008397B2 (en) AR system using optical see-through HMD
JP6688088B2 (en) Information processing apparatus and control method thereof
US20150277555A1 (en) Three-dimensional user interface apparatus and three-dimensional operation method
US10769437B2 (en) Adaptive sampling of training views
JP2013050947A (en) Method for object pose estimation, apparatus for object pose estimation, method for object estimation pose refinement and computer readable medium
US11842514B1 (en) Determining a pose of an object from rgb-d images
US20150339819A1 (en) Method for processing local information
US11490062B2 (en) Information processing apparatus, information processing method, and storage medium
JP7379065B2 (en) Information processing device, information processing method, and program
JP6129363B2 (en) Interactive system, remote control and operation method thereof
JP6946087B2 (en) Information processing device, its control method, and program
CN108629799B (en) Method and equipment for realizing augmented reality
CN110070578B (en) Loop detection method
JP6061334B2 (en) AR system using optical see-through HMD
US20230325009A1 (en) Methods, devices, apparatuses, and storage media for mapping mouse models for computer mouses
US20210327160A1 (en) Authoring device, authoring method, and storage medium storing authoring program
JP6719945B2 (en) Information processing apparatus, information processing method, information processing system, and program
WO2023054661A1 (en) Gaze position analysis system and gaze position analysis method
CN115100257A (en) Sleeve alignment method and device, computer equipment and storage medium
US20220207832A1 (en) Method and apparatus for providing virtual contents in virtual space based on common coordinate system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22876524

Country of ref document: EP

Kind code of ref document: A1