WO2023155813A1 - Gaze information determining method and apparatus, eye tracking device, object to be observed, and medium - Google Patents


Info

Publication number
WO2023155813A1
WO2023155813A1 (PCT/CN2023/076256)
Authority
WO
WIPO (PCT)
Prior art keywords
information
coding
coded
observed
gaze
Prior art date
Application number
PCT/CN2023/076256
Other languages
French (fr)
Chinese (zh)
Inventor
苑屹
费文波
Original Assignee
北京七鑫易维信息技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京七鑫易维信息技术有限公司
Publication of WO2023155813A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/10 Image acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/19 Sensors therefor

Definitions

  • Embodiments of the present disclosure relate to the technical field of eye movement tracking, and in particular to a gaze information determination method, device, eye movement equipment, object to be observed, and a medium.
  • Eye tracking, also known as gaze tracking, is a technology that obtains eye movement data by measuring eye movements and then estimates gaze information from those data.
  • In an existing solution, an identification code for identification and a positioning code for positioning are deployed near the device to be observed, and both codes are captured by the foreground camera of the eye-tracking device to determine the gaze point, on the device to be observed, of the user wearing the eye-tracking device, thereby realizing human-computer interaction between the device to be observed and the user.
  • the existing technical solutions have the following defects: first, they place high demands on the resolution of the foreground camera of the eye movement device, which must clearly capture the complete set of identification codes and recognize them in real time to be effective;
  • second, the computational cost of the method is high, placing a heavy load on the hardware.
  • the embodiments of the present disclosure provide a gaze information determination method, device, eye movement equipment, object to be observed, and medium, so as to determine the user's target gaze information on the observed object while improving both the recognition speed of the observed object's device information and the accuracy with which the target gaze information on the object is determined.
  • an embodiment of the present disclosure provides a method for determining gaze information, which is applied to an eye movement device, and the method includes:
  • At least two intersecting coding regions are set on the observation object.
  • comparing the displayed coding information with at least one piece of reference coding information stored in the coding database to determine the device information of the observation object that the user is looking at includes:
  • scaling the abstract coordinates in the coordinate system corresponding to the coding area to determine the target gaze information of the user on the observation object includes:
  • based on the device size information in the device information, the size information of the coding regions, and the angle information between the coding regions, converting the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding regions into the target gaze information of the user on the observed object.
  • any one piece of the reference coding information stored in the coding database is composed of one or more pieces of original coding information, and the original coding information is obtained by coding the device information of the observation object; the device information includes device size information, size information of the coding regions, angle information between the coding regions, device identification information and check code information.
  • when the reference coding information is composed of a plurality of pieces of original coding information,
  • a group number is added before each piece of original coding information, so that the abstract coordinates of the original gaze information in the coordinate system of the coding region can be determined based on the group number.
  • the foreground image is a single frame or multiple frames; the number of frames is determined based on the length of the display period of the reference coding information corresponding to the observed object.
  • the embodiment of the present disclosure also provides a device for determining gaze information, including:
  • An acquisition module configured to acquire a foreground image and the user's original gaze information on the foreground image
  • An identification module configured to identify the encoded information displayed in the encoded area of the foreground image
  • the device information determination module is configured to compare the displayed coded information with at least one reference coded information stored in the coded database, and determine the device information of the observation object that the user is looking at;
  • the target gaze information determination module is configured to, based on the equipment information and the original gaze information, proportionally convert the abstract coordinates in the coordinate system corresponding to the coding area to determine the target gaze information of the user on the observation object;
  • At least two intersecting coding regions are set on the observation object.
  • the embodiment of the present disclosure also provides an eye movement device, including:
  • one or more processors;
  • a storage device configured to store one or more programs
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the gaze information determination method provided by the embodiments of the present disclosure.
  • an embodiment of the present disclosure further provides an object to be observed, the object to be observed includes at least two intersecting coding regions, and the coding regions are used to display reference coding information of the object to be observed.
  • the coding area is at least one of the following: a display screen of the object to be observed; or an invisible light component provided on the object to be observed, where the invisible light component displays the coding information through the brightness and darkness of the emitted invisible light.
  • the coding area is set along the edge of the object to be observed or the coding area is set along the edge of the display screen of the object to be observed.
  • the embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, the method for determining gaze information provided by the embodiments of the present disclosure is implemented.
  • Embodiments of the present disclosure provide a gaze information determination method, device, eye movement device, object to be observed, and a medium. First, the foreground image and the user's original gaze information on the foreground image are obtained; the coding information displayed in the coding area of the foreground image is identified; the displayed coding information is then compared with at least one piece of reference coding information stored in the coding database to determine the device information of the observation object that the user is looking at; finally, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding area are scaled to determine the target gaze information of the user on the observation object; at least two intersecting coding regions are set on the observation object.
  • FIG. 1 is a schematic flowchart of a method for determining gaze information provided by Embodiment 1 of the present disclosure
  • FIG. 2a is a schematic flowchart of a method for determining gaze information provided by Embodiment 2 of the present disclosure
  • Fig. 2b is a schematic diagram of a coded region of an object to be observed provided in Embodiment 2 of the present disclosure
  • FIG. 2c is a schematic diagram of the spatial arrangement of a code provided in Embodiment 2 of the present disclosure.
  • FIG. 3 is a schematic structural diagram of a device for determining gaze information provided by Embodiment 3 of the present disclosure
  • FIG. 4 is a schematic structural diagram of an eye movement device provided in Embodiment 4 of the present disclosure.
  • FIG. 5 is a schematic diagram of an object to be observed provided by Embodiment 5 of the present disclosure.
  • the term “comprise” and its variations are open-ended, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on”.
  • the term “one embodiment” means “at least one embodiment.”
  • when an element is referred to as being “disposed on” another element, it may be directly disposed on the other element or indirectly disposed on the other element.
  • when an element is referred to as being “connected to” another element, it can be directly connected to the other element or indirectly connected to the other element.
  • it may be a fixed connection, a detachable connection, a mechanical connection, or an electrical connection.
  • Fig. 1 is a schematic flow chart of a gaze information determination method provided by Embodiment 1 of the present disclosure. The method is applicable to determining the target gaze information, on the observation object, corresponding to the original gaze information, and can be executed by a gaze information determination device.
  • the device can be implemented by software and/or hardware, and is generally integrated on an eye movement device.
  • the eye movement device includes but is not limited to augmented reality (AR) equipment, such as AR glasses and other mobile eye movement devices.
  • the optical recording method is widely used for eye movement tracking: a camera or video camera records the subject's (i.e. the user's) eye movements to obtain eye images reflecting those movements, and eye features are extracted from the acquired images, for example as eye movement data for building a line-of-sight/gaze-point estimation model.
  • the eye features may include but are not limited to: pupil position, pupil shape, iris position, iris shape, eyelid position, eye corner position, light spot (also called Purkinje image) position, etc.
  • the current mainstream eye tracking method is called the pupil-corneal reflection method.
  • the eye tracking device can be a micro-electro-mechanical system (MEMS), for example including a MEMS infrared scanning mirror, an infrared light source and an infrared receiver.
  • the eye tracking device can also be a contact/non-contact sensor (such as an electrode, a capacitive sensor), which detects eye movement through the capacitance value between the eyeball and the capacitive plate.
  • the eye tracking device can also be a myoelectric detector, for example, by placing electrodes on the bridge of the nose, forehead, ear or earlobe, and detecting eye movement through the detected myoelectric signal pattern.
  • the working principle of the pupil-corneal reflection method can be summarized as follows: acquiring an eye image; estimating a line of sight/gaze point based on the eye image.
  • the hardware requirements for the pupil-cornea reflection method can be:
  • Light source: generally an infrared light source, since infrared light does not affect the eyes' vision; multiple infrared light sources may be used, arranged in a predetermined pattern, such as a triangular or in-line arrangement;
  • Image acquisition equipment: such as an infrared camera device, infrared image sensor, camera or video camera.
  • the specific implementation of the pupil-cornea reflection method can be:
  • the light source shines on the eye, and the image acquisition device photographs the eye, capturing the reflection point of the light source on the cornea, i.e. the light spot (also called Purkinje image), thereby obtaining an eye image with the light spot.
  • as the eyeball rotates, the relative position of the pupil center and the light spot changes, and the several collected eye images with light spots reflect this positional change, from which the line of sight/gaze point is estimated.
  • the gaze information determination method provided by Embodiment 1 of the present disclosure includes the following steps:
  • the foreground image can be considered as an image collected by the foreground camera of the eye movement device.
  • the foreground image is a single frame or multiple frames; the number of frames is determined based on the length of the display period of the reference coding information corresponding to the observation object.
  • the display period can be regarded as one complete cycle: if the reference coding information is displayed over 5 frames, the length of the display period is 5 frames.
  • when the unit of the display period is frames, the length of the display period is the length of the reference coding information divided by the length of coding information that the electronic device can display at one time, rounded up.
  • the number of corresponding multi-frame images is equal to the length of the display period. That is, the number of image frames included in one display period is equal to the number of multi-frame images.
  • when the unit of the display period is time, the number of frames must be determined in combination with the frame rate of the electronic device.
  • the frame rate includes the frame rate of the foreground camera of the eye tracking device, the frame rate of the display of the electronic device and the frame rate of the algorithm.
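  • The frame-count bookkeeping described above can be sketched as follows (the function names and example values are illustrative, not part of the disclosure):

```python
import math

def display_period_frames(code_len_bits: int, bits_per_frame: int) -> int:
    # When the display period is measured in frames: the code length
    # divided by the bits displayable at one time, rounded up.
    return math.ceil(code_len_bits / bits_per_frame)

def frames_for_time_period(period_seconds: float, camera_fps: float) -> int:
    # When the display period is measured in time, the number of
    # foreground images must be derived from the camera frame rate.
    return math.ceil(period_seconds * camera_fps)

# A 16-bit reference code shown 4 bits at a time spans 4 frames;
# a 0.5 s period at 30 fps spans 15 captured frames.
print(display_period_frames(16, 4), frames_for_time_period(0.5, 30.0))
```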
  • the acquisition method of the foreground image can be regarded as real-time shooting and acquisition through the foreground camera of the eye-tracking device.
  • the raw gaze information can be considered as the gaze information on the foreground image. Gaze information includes gaze point and/or line of sight.
  • the original gaze information may be determined through the eye movement data collected by the eye movement device, which is not specifically limited here.
  • one or more frames of foreground images can be used to obtain the coding information, so as to determine the target gaze information based on the coding information.
  • the encoding area can be understood as an area in the foreground image for displaying encoding information.
  • at least two intersecting coding regions are set on the observation object for displaying reference coding information corresponding to the observation object.
  • the coding information may be the data displayed in the coding area. Since the coding information can uniquely determine the observation object, gaze information determination is realized once the observation object is determined. How the coding information is identified is not limited here; different coding methods may correspond to different identification methods.
  • when the reference coding information is displayed by a luminous body in the coding area, the display of the reference coding information can be realized based on the brightness and darkness of the light the luminous body emits.
  • the length of the coding regions and their distribution positions are not limited and can be set based on actual needs, such as setting the length of a coding region to half or one third of the length of the object to be observed; the regions may also be distributed with unequal lengths, and the distribution positions of the coding regions differ with the usage scenario.
  • the length of the original coded information can be divided into two types.
  • One is for home-oriented indoor multi-device applications.
  • the code is only used to distinguish different home devices, and the required data length and information volume are relatively small.
  • the length of the original coding information can be determined according to the actual situation, as long as it is ensured that different home devices can be distinguished.
  • data of a first length can be selected, such as 16-bit data.
  • in this case the requirement on the size of the device to be observed is very low; even a small-sized device like a mobile phone can deploy the code, improving usability and convenience. The other type is a big-data database that includes all devices supporting this technology, generally used for widely deployed advertising displays, eye movement devices, and the like.
  • the encoding at this time must be different from that of all other devices, and the required data length and information volume will be relatively large.
  • the length of the encoding can also be determined according to the actual situation; for example, data of a second length, such as 256 bits, can be selected. In this case there are certain requirements on the size of the device to be observed; otherwise the displayed reference coding information will be too dense to identify.
  • the observation object may be regarded as a device that a user wearing an eye-tracking device is looking at.
  • the reference coded information may be considered to be composed of one or more original coded information, and the original coded information is obtained by coding the equipment information of the observation object.
  • the coding database is constructed from the coding information, including the coding content, device size information, device identification number, angle information between the regions set as coding regions, check code information and so on. This ensures that, after parsing the coding information, the eye movement device can deliver the eye movement data to the device to be observed to form an interaction.
  • the device size information may be understood as information representing the size of the device, and the angle information between regions used for encoding may be an angle between encoding regions.
  • the coding content of the original coding information can be generated by a hash algorithm: the device identification number, such as a unique identification number or SN (Serial Number, i.e. product serial number), is converted into fixed-length coding content, and the coding content is then presented through a square light-and-dark pattern of infrared light (or visible light) or displayed on the display screen.
  • the coding content is only a part of the coding information: the coding information is the full record in the coding database, while the coding content is just the hash code produced by hashing.
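  • A minimal sketch of this hash-based generation, assuming SHA-256 as the hash (the disclosure does not name a specific algorithm; the serial number and bit length below are illustrative):

```python
import hashlib

def generate_coding_content(device_sn: str, code_len_bits: int = 16) -> str:
    # Hash the device identification number (e.g. SN) and truncate the
    # digest to fixed-length bit content for display in the coding area.
    digest = hashlib.sha256(device_sn.encode("utf-8")).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:code_len_bits]

content = generate_coding_content("SN-2023-0001")
print(content)  # a fixed-length string of 0/1 bits
```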
  • the device information may be the size information of the device, the size information of the coding regions, and the angle information between the coding regions, or the device identification number, etc.; it is not limited here and can be selected based on actual requirements. It can be understood that the size information of the device and the identification number of the device are fixed.
  • the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding region are scaled converted into target gaze information of the user on the observed object.
  • the device size information may be screen size information of the device and the like.
  • In the gaze information determination method provided by Embodiment 1 of the present disclosure, firstly, the foreground image and the user's original gaze information on the foreground image are acquired; secondly, the coding information displayed in the coding area of the foreground image is identified; then the displayed coding information is compared with at least one piece of reference coding information stored in the coding database to determine the device information of the observation object that the user is looking at; finally, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding area are scaled to determine the target gaze information of the user on the observation object; at least two intersecting coding regions are set on the observation object.
  • The above method overcomes the shortcomings of the existing technology, such as the high resolution requirement on the foreground camera of the eye movement device and the bulk of the identification codes deployed on the observation equipment. By comparing the displayed coding information with the reference coding information stored in the database, the observed device is determined, which improves the recognition speed, enables determination of the user's target gaze information on the observation object, and improves both the recognition speed of the observation object's device information and the accuracy of the target gaze information determined on it.
  • FIG. 2a is a schematic flowchart of a method for determining gaze information provided by Embodiment 2 of the present disclosure.
  • Embodiment 2 is optimized on the basis of the foregoing embodiments.
  • the step of comparing the displayed coding information with at least one piece of reference coding information stored in the coding database to determine the device information of the observation object that the user is looking at is further specified as:
  • the step of proportionally converting the abstract coordinates in the coordinate system corresponding to the coding area to determine the target gaze information of the user on the observation object includes: based on the device size information in the device information, the size information of the coding regions, and the angle information between the coding regions, converting the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding regions into the target gaze information of the user on the observation object; at least two intersecting coding regions are set on the observation object.
  • For content not exhaustively described in this embodiment, please refer to Embodiment 1.
  • the method for determining gaze information provided by Embodiment 2 of the present disclosure includes the following steps:
  • any one piece of the reference coding information stored in the coding database is composed of one or more pieces of original coding information obtained by coding the device information of the observation object; the device information includes device size information, size information of the coding regions, angle information between the coding regions, device identification information and check code information.
  • the size information of the coding region may represent information about the size of the coding region.
  • the size of the coding area can be characterized by the number of illuminants, and one illuminant can be set to display one bit of coded information.
  • the angle between the encoding intervals can be regarded as the degree of the included angle between the encoding intervals.
  • the device identification information can be considered as information that uniquely identifies a device, such as an SN number.
  • the verification code information can be regarded as information for verification.
  • the coding database can store multiple reference coding information corresponding to objects to be observed.
  • each piece of reference coding information can be composed of one or more pieces of original coding information; how many pieces it comprises can be determined based on the size of the coding area and the length of the original coding information.
  • the original coding information can be split in time sequence.
  • the reference coding information is the original coding information.
  • the reference encoding information may consist of multiple sets of original encoding information.
  • the number of original coding information included in the reference coding information may be determined based on the size of the coding region and the length of the original coding information.
  • when the reference coding information is composed of a plurality of pieces of original coding information, a group number is added before each piece of original coding information, so that the abstract coordinates of the original gaze information in the coordinate system of the coding region are determined based on the group number.
  • adding a group number before each piece of original coding information can be considered as performing a secondary encoding. After secondary encoding, the added group numbers are distributed over the device of the object to be observed; that is, each coding area contains more than one group of original codes together with their group numbers.
  • the eye-tracking device can then determine, from the group number, the specific position it is looking at on the device to be observed. That is, adding the group number improves the accuracy of determining the abstract coordinates of the original gaze information in the coding area.
  • the foreground camera of the eye movement device may not capture the complete code, so the design of adding group numbers to the encoding is more conducive to the user's free interaction; for a device in the network such as a mobile phone, the original coding information may already occupy the entire screen, and secondary encoding may be omitted in that case.
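  • The group-number scheme might be sketched as follows (the group size and the 3-bit group-number width are assumptions for illustration, not values from the disclosure):

```python
def secondary_encode(original_bits: str, group_len: int, id_bits: int = 3) -> list[str]:
    # Split the original coding information into groups and prefix each
    # group with its binary group number, so that a partially captured
    # group still reveals where on the device it sits.
    groups = [original_bits[i:i + group_len]
              for i in range(0, len(original_bits), group_len)]
    return [f"{idx:0{id_bits}b}{g}" for idx, g in enumerate(groups)]

encoded = secondary_encode("1110001101100101", 4)
print(encoded)  # ['0001110', '0010011', '0100110', '0110101']
```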
  • the displayed coded information is compared with at least one reference coded information stored in the coded database.
  • the device identification number in the displayed coding information can be compared with at least one piece of reference coding information stored in the coding database to find out which device is the observation object, after which the subsequent eye movement interaction is performed.
  • bit values can be used for the comparison. For example, if the reference coding information is 1110001101 and the captured coding information is 00110, it can be concluded that the reference coding information displayed by the observation object is 1110001101, and the observation object can thereby be determined. The coding information 00110 is only an example; the captured coding information may differ when users are at different locations.
  • the partial information may be part of the encoding information in the target encoding information.
  • the highest bit of the target coding information can be compared with the highest bit of the captured coding information; if they are the same, the next bits of the two are compared in turn; if not, the next-highest bit of the target coding information is compared with the highest bit of the captured coding information, and so on, until the lowest bit of the captured coding information is reached or until the number of bits between the current position of the target coding information and its lowest bit equals the length of the captured coding information.
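  • The bit-by-bit comparison above amounts to locating the captured bits inside each stored reference code. A sketch (device names and reference codes are invented for illustration):

```python
def match_partial_code(observed: str, coding_db: dict[str, str]):
    # Slide the captured bits over each reference code; a unique hit
    # identifies both the device and the bit offset within its code.
    hits = [(device, ref.find(observed))
            for device, ref in coding_db.items() if observed in ref]
    return hits[0] if len(hits) == 1 else None  # None: ambiguous or no match

db = {"lamp": "1110001101", "speaker": "0101011010"}
# The example from the text: 00110 occurs only in the lamp's code.
print(match_partial_code("00110", db))  # ('lamp', 4)
```

The returned offset tells the eye movement device which part of the reference code it is looking at, which is what the group-number scheme also provides when codes are split into groups.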
  • the present disclosure provides a schematic diagram of a coded area of an object to be observed, including a coded area 20, a coded area 21, and a gaze point 23.
  • the coded area 20 and the coded area 21 are orthogonal to each other.
  • when the foreground camera of the eye movement device captures the coding regions 20 and 21, the ratio relationship between the original gaze information and the coding regions in the captured foreground image, combined with the ratio relationship between the coding regions and the object to be observed, can be used to directly estimate the target gaze information of the user on the observed object.
  • through this ratio conversion, the performance of this embodiment is superior to identification-code-based encoding methods.
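  • For the orthogonal case, the ratio conversion can be sketched as follows (the fractional gaze positions and screen dimensions below are illustrative):

```python
def gaze_to_device(frac_x: float, frac_y: float,
                   device_w: float, device_h: float) -> tuple[float, float]:
    # The two orthogonal coding regions act as rulers along the device
    # edges: the gaze point's fractional position along each region,
    # read from the foreground image, scales directly to device coordinates.
    return frac_x * device_w, frac_y * device_h

# Gaze 40% along the horizontal region and 25% along the vertical one,
# on a 160 mm x 80 mm screen:
print(gaze_to_device(0.4, 0.25, 160.0, 80.0))  # (64.0, 20.0)
```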
  • the confirmation of the target gaze information may be ended, or may continue to return to S210 to determine the next target gaze information.
  • the original gaze information is converted into the target gaze information through the refined device information, which realizes the transformation of the proportional relationship and improves the speed of determining the target gaze information.
  • the object to be observed includes at least two intersecting coding regions, and in this example the two intersecting coding regions are orthogonal. The whole process of determining the user's target gaze information on the observation object for interaction is as follows:
  • the foreground camera of the eye movement device will capture images in real time to obtain a foreground image, and at the same time, the eye movement device can project eye movement data onto the foreground image.
  • a neural network algorithm is used to obtain candidate regions suspected to be coding regions (once a part has been confirmed as a coding region, a tracking algorithm based on the previous frame takes over and the neural network algorithm need not be called); a clustering algorithm is then run over all candidate regions, and the cluster centers meeting the screening conditions (distance from the current eye movement data, compactness of the clustering, and morphological judgment of nearby data meeting certain requirements) are selected as the coding regions for subsequent judgment.
  • code identification is performed in the selected regions, and the image information is converted into 0/1 bit information (that is, coding information).
  • the encoding in the present disclosure is bit information; compared with encoding methods such as two-dimensional codes, it requires lower definition, places fewer demands on hardware, and is faster to identify, so it can be used more effectively in actual application scenarios.
  • the orthogonal coding layout also benefits the conversion accuracy of eye movement information.
  • the size of the code itself can be used as a scale, so the coordinate-system conversion step can be omitted and the conversion done directly by ratio.
  • the conversion of eye movement data is very convenient.
  • for non-orthogonal layouts, the angle needs to be taken into account when calculating the final coordinates, and the conditions for morphological judgment may also be relaxed when there are non-orthogonal devices in the coding database.
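Since the code's own size serves as the scale, abstract gaze coordinates expressed as fractions of the two coding strips can be mapped to device coordinates by direct proportion, with the inter-strip angle entering only in the non-orthogonal case. A minimal sketch (all parameter names are illustrative):

```python
import math

def abstract_to_device(u, v, width_mm, height_mm, angle_deg=90.0):
    """Map abstract coordinates (u, v), expressed as fractions of the
    two coding strips, to device coordinates in millimetres.

    The strips are treated as basis vectors of a (possibly
    non-orthogonal) coordinate system; for angle_deg == 90 this
    reduces to plain proportional scaling."""
    theta = math.radians(angle_deg)
    x = u * width_mm + v * height_mm * math.cos(theta)
    y = v * height_mm * math.sin(theta)
    return x, y
```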
  • the actual working mode of performing original encoding on the equipment to be observed includes the following stages:
  • For home-oriented devices, first add the object to be observed to the home Internet of Things network and encode it according to its device information to generate a corresponding code database. That is, the code database is built from the coded information obtained by encoding the device information of the object to be observed, including the encoding content, device size information, device identification number, angle information between the areas used for encoding, check code information, and so on.
  • deploy the original coded information to the infrared fluorescence area of the equipment to be observed, and adjust its fluorescence frequency according to constraints such as power consumption.
  • the screen size of the device (that is, the device size information) and the identification number of the device are fixed, and a hash algorithm is used to generate the encoded content from this fixed information. Then, based on the encoded content and the shape of the device itself, the angle information, check code information, and so on are generated.
  • When an eye-tracking device in the home Internet of Things network is used, its front-end infrared sensing device can identify whether a coding system already entered into the system is present in the current gaze area.
  • the original gaze information can be converted to the target coding device, and because the original coded information contains the size information of the coding device itself, all gaze information can be projected onto the coding device at the corresponding scale.
  • the eye movement device can be used to interact with the target device to realize various daily applications such as waking up the screen, selecting functions, and switching devices.
  • For eye-tracking devices (that is, eye movement devices), the basic encoding implementation remains the same.
  • the main difference lies in the length of the original coded information. Since the required encoding length is significantly larger, it may be necessary to deploy the code as a sequence that is polled and switched in turn, and the detection process will also consume more time; it is generally necessary to acquire more than two full cycles of the code to determine the coded information.
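The polling acquisition described above can be sketched as frame accumulation that only trusts the code once every segment has been seen at least twice with consistent content (the consistency rule and the reading format are assumptions):

```python
def acquire_code(frames, cycle_len, min_cycles=2):
    """Accumulate per-frame segment readings until all `cycle_len`
    segments have each been observed `min_cycles` times with
    consistent content; returns the assembled code, or None.

    `frames` is an iterable of (segment_index, bits) readings; the
    rule that any repeated reading must match is an assumption."""
    seen = {}    # segment_index -> bits
    counts = {}  # segment_index -> times observed
    for idx, bits in frames:
        if idx in seen and seen[idx] != bits:
            return None  # inconsistent reading; caller restarts acquisition
        seen[idx] = bits
        counts[idx] = counts.get(idx, 0) + 1
        if len(seen) == cycle_len and all(c >= min_cycles for c in counts.values()):
            return [seen[i] for i in range(cycle_len)]
    return None
```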
  • the coding logic of a newly purchased smart device (that is, an object to be observed) that supports this function is as follows:
  • Based on the device SN (Serial Number, that is, the product serial number), a 16-bit hash code is generated.
  • Referring to the width and height information of the smart device, if it can display many groups of 16-bit codes, secondary coding is performed: a group number (whose length is related to the number of household devices) is added to each 16-bit code segment as a data header. Considering the width and height of the smart device, 4-bit data may be used for the group number (the length of this segment is related to the ratio of the device size to the length of the preceding code segment).
  • Each code segment of the smart device is thus 20 bits, and 16 groups of codes are deployed across the entire horizontal and vertical coding space; the spatial arrangement is shown in Fig. 2c.
  • "0000", “0001", “0111” and “1111” shown in Fig. 2c are group numbers, "xxxxxxxxxxxxxxxxxxx” is the original coded information, and the coded information
  • the information may be the reference coding information included in the foreground image, and the reference coding information may be the coding information actually displayed by the observed object.
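The 20-bit segments of Fig. 2c (a 4-bit group-number header followed by the 16-bit code) might be built and parsed as follows; this is a sketch that simply follows the bit widths given in the text:

```python
def build_segments(code16: str, n_groups: int = 16):
    """Prefix the 16-bit code with a 4-bit group number to form the
    20-bit segments deployed around the device, as in Fig. 2c."""
    assert len(code16) == 16
    return [format(g, "04b") + code16 for g in range(n_groups)]

def parse_segment(segment: str):
    """Split a 20-bit segment back into (group_number, code16)."""
    assert len(segment) == 20
    return int(segment[:4], 2), segment[4:]
```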
  • When the foreground camera of the eye movement device captures part of the area of the smart device displaying the coded information shown in Fig. 2c, a foreground image is obtained, and there is an area displaying coded information in the foreground image.
  • the content of the foreground image can be obtained.
  • the specific position on the smart device at which the eye-tracking device is gazing is found according to the group number in the coded information, and through it the overall interaction process described above is completed.
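Locating the gazed position from the decoded group number can be sketched as follows, assuming, for illustration only, that the first half of the group numbers tile the horizontal strip in order and the second half tile the vertical strip (the actual arrangement is given by Fig. 2c):

```python
def locate_from_group(group, n_per_strip=8):
    """Infer which strip, and which fractional span of it, the camera
    is seeing from the decoded group number.

    Assumes groups 0..n-1 tile the horizontal strip and n..2n-1 the
    vertical strip in order -- an illustrative layout."""
    strip = "horizontal" if group < n_per_strip else "vertical"
    pos = group % n_per_strip
    start = pos / n_per_strip        # fractional start of the visible span
    end = (pos + 1) / n_per_strip    # fractional end of the visible span
    return strip, start, end
```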
  • The gaze information determination method provided by Embodiment 2 of the present disclosure is specifically optimized to compare the displayed coded information with at least one actually displayed reference coded information stored in the coding database to determine the observation object that the user is gazing at. It improves the recognition speed of device information and the confirmation accuracy of the target gaze information on the observation object, giving users a better human-computer interaction experience.
  • FIG. 3 is a schematic structural diagram of a device for determining gaze information provided by Embodiment 3 of the present disclosure.
  • the device can be adapted to convert a user's original gaze information into target gaze information on an observation object; the device can be implemented in software and/or hardware, and is generally integrated on an eye tracking device.
  • the device includes: an acquisition module 31, an identification module 32, a device information determination module 33 and a target gaze information determination module 34;
  • the acquisition module 31 is configured to acquire the foreground image and the user's original gaze information on the foreground image;
  • an identification module 32 configured to identify the coded information displayed in the coding area of the foreground image;
  • the device information determination module 33 is configured to compare the displayed coded information with at least one reference coded information stored in the coding database, and determine the device information of the observation object that the user is gazing at;
  • the target gaze information determination module 34 is configured to perform scale conversion on the abstract coordinates in the coordinate system corresponding to the coding area based on the device information and the original gaze information, and determine the user's target gaze information on the observation object;
  • At least two intersecting coding regions are set on the observation object.
  • the device first obtains the foreground image and the user's original gaze information on the foreground image through the acquisition module 31; secondly, it identifies the coded information displayed in the coding area of the foreground image through the identification module 32; then, through the device information determination module 33, it compares the displayed coded information with at least one reference coded information stored in the coding database to determine the device information of the observation object that the user is gazing at; finally, the target gaze information determination module 34 performs scale conversion on the abstract coordinates in the coordinate system corresponding to the coding area based on the device information and the original gaze information to determine the user's target gaze information on the observation object, wherein at least two intersecting coding areas are set on the observation object.
  • This embodiment provides a device for determining gaze information, which can determine the user's target gaze information on the observation object, and improve the recognition speed of the equipment information of the observation object and the confirmation accuracy of the target gaze information on the observation object.
  • the device information determination module 33 is specifically set to:
  • target gaze information determination module 34 is specifically set to:
  • Based on the device size information in the device information, the size information of the coding areas, and the angle information between the coding areas, convert the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding area proportionally into the user's target gaze information on the observation object.
  • any reference coded information stored in the coding database is composed of one or more pieces of original coded information, the original coded information being obtained by encoding the device information of an observation object; the device information includes device size information, size information of the coding areas, angle information between the coding areas, device identification information, and check code information.
  • when the reference coded information is composed of multiple pieces of original coded information, a group number is added before each piece of original coded information, so that the abstract coordinates of the original gaze information in the coordinate system of the coding area can be determined based on the group number.
  • the foreground image in the acquisition module 31 is one frame or multiple frames; the number of frames is determined based on the length of the display period of the reference coded information corresponding to the observation object.
  • the above gaze information determining device can execute the gaze information determining method provided by any embodiment of the present disclosure, and has corresponding functional modules and beneficial effects for executing the method.
  • FIG. 4 is a schematic structural diagram of an eye movement device provided by Embodiment 4 of the present disclosure.
  • the eye movement device provided by Embodiment 4 of the present disclosure includes one or more processors 41 and a storage device 42; there may be one or more processors 41 in the eye movement device, with one processor 41 taken as an example. The storage device 42 is configured to store one or more programs; when the one or more programs are executed by the one or more processors 41, the one or more processors 41 implement the gaze information determination method described in any one of the embodiments of the present disclosure.
  • the eye movement device may further include: an input device 43 and an output device 44 .
  • the processor 41, the storage device 42, the input device 43 and the output device 44 in the eye movement device may be connected via a bus or in other ways.
  • connection via a bus is taken as an example.
  • the storage device 42 in the eye movement device can be set to store one or more programs, which may be software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the gaze information determination method provided in Embodiment 1 or Embodiment 2 of the present disclosure (for example, the modules in the gaze information determination device shown in FIG. 3, including the acquisition module 31, the identification module 32, the device information determination module 33, and the target gaze information determination module 34).
  • the processor 41 executes various functional applications and data processing of the eye movement device by running the software programs, instructions, and modules stored in the storage device 42, that is, realizes the gaze information determination method in the above method embodiments.
  • the storage device 42 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and at least one application required by a function; the data storage area may store data created according to the use of the eye movement device, and the like.
  • the storage device 42 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device or other non-volatile solid-state storage devices.
  • the storage device 42 may further include memories remotely located relative to the device, which can be connected to the device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the input device 43 can be configured to receive input numbers or character information, and generate key signal input related to user settings and function control of the eye movement device.
  • the output device 44 may include a display device such as a display screen.
  • At least two intersecting coding regions are set on the observation object.
  • FIG. 5 is a schematic diagram of an object to be observed provided in Embodiment 5 of the present disclosure. As shown in FIG. 5, the object to be observed includes at least two intersecting coding areas 50 and 51, which are used to display the reference coded information of the object to be observed described in the embodiments of the present disclosure.
  • the object to be observed provided by the embodiments of the present disclosure can be used by an eye movement device to determine the user's target gaze information on the observed object, and improve the recognition speed of the device information of the observed object and the confirmation accuracy of the target gaze information on the observed object.
  • the coding areas 50 and 51 are at least one of the following: a display screen of the object to be observed; an invisible light component set on the object to be observed, the invisible light component displaying coded information through the brightness and darkness of the invisible light it emits.
  • the coding regions 50 and 51 are arranged along the edge of the object to be observed or the coding regions are arranged along the edge of the display screen of the object to be observed.
  • for a device with a display screen, the code can be displayed on two orthogonal thin strips on its screen; for direct display by a light emitter, the emitter contains a plurality of infrared lamp groups, and the infrared lamps can be deployed in a square array to meet this requirement without affecting the user's perception.
  • the two intersecting coding sequences of the code may also be at an angle other than 90°.
  • Embodiment 6 of the present disclosure provides a computer-readable storage medium, on which a computer program is stored, and when the program is executed by a processor, it is set to perform a gaze information determination method, the method comprising:
  • At least two intersecting coding regions are set on the observation object.
  • when executed by the processor, the program may also be configured to execute the gaze information determination method provided by any embodiment of the present disclosure.
  • the computer storage medium in the embodiments of the present disclosure may use any combination of one or more computer-readable media.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more conductors, a portable computer disk, a hard disk, Random Access Memory (RAM), Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM), flash memory, optical fiber, portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer readable signal medium may include a data signal carrying computer readable program code in baseband or as part of a carrier wave. Such propagated data signals may take many forms, including but not limited to: electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wires, optical cables, radio frequency (Radio Frequency, RF), etc., or any suitable combination of the above.
  • Computer program code for carrying out the operations of the present disclosure can be written in one or more programming languages, or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
  • The method provided by the embodiments of the present disclosure can be applied to eye movement devices. Using the device information of the observation object the user is gazing at and the user's original gaze information on the foreground image, the abstract coordinates in the coordinate system corresponding to the coding area in the foreground image are converted by direct proportion, which improves conversion accuracy; realizing gaze information conversion through the coding area reduces the requirements on the equipment and improves the recognition speed of the observed object, thereby solving the problems in related technologies of high resolution requirements on the foreground camera of the eye movement device and limited use scenarios of eye-tracking equipment.

Abstract

The present application discloses a gaze information determining method and apparatus, an eye tracking device, an object to be observed, and a medium. The method comprises: acquiring a foreground image and original gaze information of a user on the foreground image; identifying code information displayed in a code area in the foreground image; comparing the displayed code information with at least one piece of reference code information stored in a code database, and determining device information of an observation object that the user is gazing at; and performing, on the basis of the device information and the original gaze information, scale conversion on abstract coordinates in a coordinate system corresponding to the code area, and determining target gaze information of the user on the observation object, wherein at least two intersecting code areas are provided on the observation object.

Description

Gaze information determination method, device, eye movement equipment, object to be observed, and medium

Cross reference

This disclosure claims priority to the Chinese patent application filed with the China Patent Office on February 18, 2022, with application number 202210152550.X and entitled "Gaze information determination method, device, eye movement equipment, object to be observed and medium", the entire contents of which are incorporated herein by reference.

Technical field

Embodiments of the present disclosure relate to the technical field of eye tracking, and in particular to a gaze information determination method and device, eye movement equipment, an object to be observed, and a medium.

Background

With the rapid development of computer vision, artificial intelligence, and digital technology, eye tracking has become a hot research field with a wide range of applications in human-computer interaction. Eye tracking, also known as gaze tracking, is a technology that obtains eye movement data by measuring eye movement and then estimates the eye's gaze information from that data.

At present, when a device to be observed is identified, an identification code for identification and a positioning code for positioning are deployed near the device; the foreground camera of the eye movement device captures the identification code and positioning code to determine the gaze point of the user wearing the eye movement device on the device, realizing human-computer interaction between the device to be observed and the user.

However, the existing technical solutions have the following defects. First, they place high demands on the resolution of the foreground camera of the eye movement device: the camera must be able to clearly capture multiple complete identification codes for the solution to work, and the computation required for real-time capture and recognition is high, placing a heavy load on the hardware. Second, there are requirements on the size of the deployed identification codes, making them difficult to apply effectively on some small devices (such as mobile phones).
Summary

Embodiments of the present disclosure provide a gaze information determination method and device, eye movement equipment, an object to be observed, and a medium, so as to determine the user's target gaze information on an observation object and improve the recognition speed of the observation object's device information and the confirmation accuracy of the target gaze information on the observation object.

In a first aspect, an embodiment of the present disclosure provides a gaze information determination method applied to an eye movement device, the method including:

acquiring a foreground image and the user's original gaze information on the foreground image;

identifying the coded information displayed in a coding area within the foreground image;

comparing the displayed coded information with at least one reference coded information stored in a coding database to determine the device information of the observation object that the user is gazing at;

performing, based on the device information and the original gaze information, scale conversion on the abstract coordinates in the coordinate system corresponding to the coding area to determine the user's target gaze information on the observation object;

wherein at least two intersecting coding areas are provided on the observation object.
Optionally, comparing the displayed coded information with at least one reference coded information stored in the coding database to determine the device information of the observation object that the user is gazing at includes:

selecting one reference coded information from the at least one reference coded information stored in the coding database as target coded information;

comparing the displayed coded information with the target coded information to determine whether the displayed coded information is the target coded information or part of the target coded information;

if so, determining the device information corresponding to the target coded information as the device information of the observation object that the user is gazing at;

if not, continuing to select the next target coded information until the device information of the gazed-at observation object is determined.

Optionally, performing, based on the device information and the original gaze information, scale conversion on the abstract coordinates in the coordinate system corresponding to the coding area to determine the user's target gaze information on the observation object includes:

based on the device size information in the device information, the size information of the coding areas, and the angle information between the coding areas, converting the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding area proportionally into the user's target gaze information on the observation object.

Optionally, any reference coded information stored in the coding database is composed of one or more pieces of original coded information, the original coded information being obtained by encoding the device information of an observation object; the device information includes device size information, size information of the coding areas, angle information between the coding areas, device identification information, and check code information.

Optionally, when the reference coded information is composed of multiple pieces of original coded information, a group number is added before each piece of original coded information, so that the abstract coordinates of the original gaze information in the coordinate system of the coding area can be determined based on the group number.

Optionally, the foreground image is one frame or multiple frames; the number of frames is determined based on the length of the display period of the reference coded information corresponding to the observation object.
In a second aspect, an embodiment of the present disclosure further provides a gaze information determination device, including:

an acquisition module configured to acquire a foreground image and the user's original gaze information on the foreground image;

an identification module configured to identify the coded information displayed in a coding area within the foreground image;

a device information determination module configured to compare the displayed coded information with at least one reference coded information stored in a coding database to determine the device information of the observation object that the user is gazing at;

a target gaze information determination module configured to perform, based on the device information and the original gaze information, scale conversion on the abstract coordinates in the coordinate system corresponding to the coding area to determine the user's target gaze information on the observation object;

wherein at least two intersecting coding areas are provided on the observation object.
In a third aspect, an embodiment of the present disclosure further provides an eye movement device, including:

one or more processors;

a storage device configured to store one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors implement the gaze information determination method provided by the embodiments of the present disclosure.

In a fourth aspect, an embodiment of the present disclosure further provides an object to be observed, the object including at least two intersecting coding areas, the coding areas being used to display reference coded information of the object to be observed.

Optionally, the coding area is at least one of the following: a display screen of the object to be observed; an invisible light component provided on the object to be observed, the invisible light component displaying coded information through the brightness and darkness of the invisible light it emits.

Optionally, the coding areas are arranged along the edge of the object to be observed, or along the edge of the display screen of the object to be observed.

In a fifth aspect, an embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the gaze information determination method provided by the embodiments of the present disclosure.
Embodiments of the present disclosure provide a gaze information determination method and device, eye movement equipment, an object to be observed, and a medium. First, a foreground image and the user's original gaze information on the foreground image are acquired, and the coded information displayed in a coding area within the foreground image is identified; the displayed coded information is then compared with at least one reference coded information stored in a coding database to determine the device information of the observation object that the user is gazing at; finally, based on the device information and the original gaze information, scale conversion is performed on the abstract coordinates in the coordinate system corresponding to the coding area to determine the user's target gaze information on the observation object, wherein at least two intersecting coding areas are provided on the observation object. With this technical approach, the user's target gaze information on the observation object can be determined, and both the recognition speed of the observation object's device information and the confirmation accuracy of the target gaze information on the observation object are improved.
Description of Drawings
FIG. 1 is a schematic flowchart of a gaze information determination method provided by Embodiment 1 of the present disclosure;
FIG. 2a is a schematic flowchart of a gaze information determination method provided by Embodiment 2 of the present disclosure;
FIG. 2b is a schematic diagram of coding regions of an object to be observed provided by Embodiment 2 of the present disclosure;
FIG. 2c is a schematic diagram of the spatial arrangement of a code provided by Embodiment 2 of the present disclosure;
FIG. 3 is a schematic structural diagram of a gaze information determination apparatus provided by Embodiment 3 of the present disclosure;
FIG. 4 is a schematic structural diagram of an eye tracking device provided by Embodiment 4 of the present disclosure;
FIG. 5 is a schematic diagram of an object to be observed provided by Embodiment 5 of the present disclosure.
Detailed Description
The present disclosure is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the present disclosure and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present disclosure rather than the complete structure.
Before discussing the exemplary embodiments in more detail, it should be mentioned that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations (or steps) as sequential processing, many of the operations may be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be rearranged. A process may be terminated when its operations are completed, but it may also have additional steps not included in the figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on. Furthermore, where no conflict arises, the embodiments of the present disclosure and the features within them may be combined with one another.
As used in the present disclosure, the term "include" and its variants are open-ended, i.e., "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment".
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are used only to distinguish the corresponding items and do not limit their order or interdependence.
It should be noted that the modifiers "a/an" and "a plurality of" in the present disclosure are illustrative rather than restrictive; those skilled in the art should understand them as "one or more" unless the context clearly indicates otherwise.
In the description of the present disclosure, when an element is referred to as being "arranged on" another element, it may be disposed on the other element directly or indirectly. When an element is referred to as being "connected to" another element, it may be connected to the other element directly or indirectly, for example by a fixed connection, a detachable connection, a mechanical connection, or an electrical connection.
In the description of the present disclosure, it should be noted that orientations or positional relationships indicated by terms such as "center", "upper", "lower", "left", and "right" are based on the orientations or positional relationships shown in the drawings; they are used only for ease of description and do not indicate that the device or element referred to must have a particular orientation, and therefore should not be construed as limiting the present disclosure. In addition, the technical features involved in the different embodiments of the present disclosure described below may be combined with one another as long as they do not conflict.
Embodiment 1
FIG. 1 is a schematic flowchart of a gaze information determination method provided by Embodiment 1 of the present disclosure. The method is applicable to determining, from original gaze information, the target gaze information on an observed object, and may be performed by a gaze information determination apparatus, which may be implemented in software and/or hardware and is generally integrated into an eye tracking device. In this embodiment, eye tracking devices include, but are not limited to, augmented reality (AR) devices, such as glasses-type devices and AR eye tracking devices.
The present disclosure is applicable to the field of eye tracking. At present, the widely used approach to eye tracking is the optical recording method: a camera or video camera records the eye movements of the subject, i.e., the user, capturing eye images that reflect the eye movement, and eye features are extracted from the captured eye images as eye movement data for building a gaze/fixation point estimation model. The eye features may include, but are not limited to, pupil position, pupil shape, iris position, iris shape, eyelid position, eye corner position, and glint (also called Purkinje image) position.
Among optical recording methods, the current mainstream eye tracking technique is known as the pupil-corneal reflection method.
Besides the optical recording method, there are other ways to implement gaze tracking, including but not limited to the following:
1. The eye tracking apparatus may be a MEMS micro-electro-mechanical system, for example comprising a MEMS infrared scanning mirror, an infrared light source, and an infrared receiver.
2. In another embodiment, the eye tracking apparatus may be a contact or non-contact sensor (for example, an electrode or a capacitive sensor), which detects eye movement through the capacitance between the eyeball and a capacitive plate.
3. In yet another embodiment, the eye tracking apparatus may be a myoelectric current detector that detects eye movement from detected myoelectric signal patterns, for example by placing electrodes at the bridge of the nose, the forehead, the ears, or the earlobes.
The working principle of the pupil-corneal reflection method can be summarized as: acquiring an eye image, and estimating the gaze/fixation point from the eye image.
The hardware requirements of the pupil-corneal reflection method may be:
(1) Light source: generally an infrared light source, since infrared light does not affect the vision of the eyes; multiple infrared light sources may be used, arranged in a predetermined pattern, for example a triangular ("品"-shaped) or in-line arrangement;
(2) Image acquisition device: for example, an infrared camera device, an infrared image sensor, a camera, or a video camera.
A specific implementation of the pupil-corneal reflection method may be:
1. Eye image acquisition:
The light source illuminates the eye, and the image acquisition device photographs the eye, capturing the reflection point of the light source on the cornea, i.e., the glint (also called the Purkinje image), thereby obtaining an eye image with glints.
2. Gaze/fixation point estimation:
As the eyeball rotates, the relative position of the pupil center and the glint changes accordingly, and the several captured eye images with glints reflect this positional change; the gaze/fixation point is estimated from this positional change.
On this basis, to solve the technical problems in the related art of the demanding hardware requirements on the foreground camera of the eye tracking device and the limited usage scenarios of eye tracking devices, as shown in FIG. 1, the gaze information determination method provided by Embodiment 1 of the present disclosure includes the following steps:
S110. Acquire a foreground image and the user's original gaze information on the foreground image.
In the present disclosure, the foreground image can be regarded as an image captured by the foreground camera of the eye tracking device.
The foreground image is a single frame or multiple frames; the number of frames is determined based on the length of the display period of the reference coded information corresponding to the observed object.
The display period can be regarded as one full cycle, i.e., a complete loop; for example, if the reference coded information can be displayed over 5 frames, the display period length is 5 frames.
When the display period is measured in frames, its length is the length of the reference coded information divided by the length of coded information the electronic device can display at one time, rounded up. The number of frames acquired correspondingly equals the length of the display period; that is, the number of image frames included in one display period equals the number of frames in the multi-frame image.
When the display period is measured in time, so that its length is a duration, the number of frames must additionally be determined in combination with the frame rates involved, including the frame rate of the foreground camera of the eye tracking device, the display frame rate of the electronic device, and the algorithm frame rate.
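The period-length arithmetic above can be sketched as follows. This is an illustrative sketch only: the function names, the example bit lengths, and the 30 fps camera rate are assumptions for demonstration, not values from the disclosure.

```python
import math

def display_period_frames(reference_bits: int, bits_per_frame: int) -> int:
    """Frame-based period: length of the reference coded information divided by
    the number of bits displayable at one time, rounded up."""
    return math.ceil(reference_bits / bits_per_frame)

def frames_to_capture(period_seconds: float, camera_fps: float) -> int:
    """Time-based period: the frame count additionally depends on the frame rate."""
    return math.ceil(period_seconds * camera_fps)

# A 256-bit reference code shown 64 bits at a time needs a 4-frame display period.
print(display_period_frames(256, 64))  # → 4
# A 1-second display period seen by a 30 fps foreground camera spans 30 frames.
print(frames_to_capture(1.0, 30.0))    # → 30
```

In practice the smallest of the camera, display, and algorithm frame rates would bound how quickly one full period can be captured.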
The foreground image may be acquired by real-time capture with the foreground camera of the eye tracking device. The original gaze information can be regarded as the gaze information on the foreground image. Gaze information includes a fixation point and/or a line of sight.
In this step, the original gaze information may be determined from eye movement data collected by the eye tracking device, which is not specifically limited here.
Further, in this step the coded information may be obtained from one or more frames of the foreground image, so that the target gaze information can be determined based on the coded information.
S120. Identify the coded information displayed in the coding region within the foreground image.
The coding region can be understood as the area of the foreground image used to display coded information. In the present disclosure, at least two intersecting coding regions are arranged on the observed object to display the reference coded information corresponding to the observed object. The coded information may be the data displayed in the coding region; the coded information uniquely identifies the observed object, thereby enabling the determination of gaze information. How the coded information is recognized is not limited here, and different coding schemes may have different recognition methods. When the coding region displays the reference coded information by means of light emitters, the display can be realized through the brightness and darkness of the emitted light.
In this embodiment, neither the length of the coding regions nor their distribution positions is limited; they can be set based on actual needs, for example setting the length of a coding region to one half or one third of the length of the object to be observed. The coding regions may also have unequal lengths, and their distribution positions may differ depending on the usage scenario.
Optionally, the length of the original coded information falls into two categories. The first is home-oriented indoor multi-device applications, where the code serves only to distinguish different household devices, so the required data length and amount of information are small; the length of the original coded information can be determined according to the actual situation, as long as different household devices can be distinguished. Exemplarily, data of a first length, such as 16-bit data, may be chosen; the requirement on the size of the device to be observed is then very low, and even a small device such as a mobile phone can deploy it, improving usability and convenience. The second is a large data repository containing all devices that support this technology, generally used for advertising kiosks, eye-movement-interactive devices, and the like found everywhere; here the code must be guaranteed to differ from the codes of all other devices, so the required data length and amount of information are larger. The code length can likewise be determined according to the actual situation; exemplarily, data of a second length, such as 256-bit data, may be chosen, which places certain requirements on the size of the device to be observed, since otherwise the displayed reference coded information would be too dense to recognize.
It should be noted that, if data of 256-bit length really cannot fit because of the size of the device to be observed, the reference coded information can be split temporally, for example displaying 64 bits of reference coded information per second on a 1-second cycle, so that the reference coded information takes effect through temporal concatenation, e.g., setting 4 frames of images as one display period.
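The temporal splitting just described can be sketched as follows; the chunking and zero-padding policy is an illustrative assumption, not a scheme specified by the disclosure.

```python
def split_for_display(reference_code: str, chunk_bits: int) -> list[str]:
    """Split a reference code bit string into per-frame chunks; the last chunk
    is zero-padded so every frame displays the same number of bits."""
    chunks = [reference_code[i:i + chunk_bits]
              for i in range(0, len(reference_code), chunk_bits)]
    chunks[-1] = chunks[-1].ljust(chunk_bits, "0")
    return chunks

code = "1011" * 64                     # a 256-bit code (illustrative content)
frames = split_for_display(code, 64)   # 64 bits shown per frame
print(len(frames))                     # → 4, i.e. a 4-frame display period
print("".join(frames) == code)         # → True: concatenation restores the code
```

The receiving side would concatenate the chunks captured over one full period before comparing against the coding database.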
S130. Compare the displayed coded information with at least one piece of reference coded information stored in a coding database to determine the device information of the observed object at which the user is gazing.
In the present disclosure, the observed object can be regarded as the device at which the user wearing the eye tracking device is gazing. The reference coded information can be regarded as consisting of one or more pieces of original coded information, the original coded information being obtained by encoding the device information of the observed object.
The coding database is constructed from the coded information and includes the code content, device size information, device identification number, angle information between the regions used for coding, check code information, and so on. This ensures that, after parsing the coded information, the eye tracking device can deliver the eye movement data to the device to be observed, forming an interaction.
The device size information can be understood as information characterizing the size of the device, and the angle information between the regions used for coding may be the angles between the coding regions.
Optionally, the code content of the original coded information may be generated by a hash algorithm: the device identification number of the device, such as a unique identification number or SN (Serial Number, i.e., product serial number), is hashed into code content of fixed length, which is then presented by the square-shaped light and dark states of infrared lamps (or visible light) or shown on a display screen.
It can be understood that the code content is one part of the coded information: the coded information is the complete record in the coding database, while the code content is only a hashed segment.
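One way such fixed-length code content could be derived from a serial number is sketched below. The choice of SHA-256, the 16-bit truncation, and the sample SN are assumptions for illustration; the disclosure does not specify a particular hash algorithm.

```python
import hashlib

def code_content_from_sn(serial_number: str, n_bits: int = 16) -> str:
    """Hash a device SN into a fixed-length bit string, suitable for display
    as a row of light/dark square blocks (one emitter per bit)."""
    digest = hashlib.sha256(serial_number.encode("utf-8")).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:n_bits]

content = code_content_from_sn("SN-2023-000123", n_bits=16)
print(len(content))                # → 16
print(set(content) <= {"0", "1"})  # → True: only light/dark states
```

Because the hash is deterministic, the same SN always yields the same code content, which is what lets a database record map a captured bit pattern back to a device.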
S140. Based on the device information and the original gaze information, proportionally convert the abstract coordinates in the coordinate system corresponding to the coding regions to determine the user's target gaze information on the observed object, where at least two intersecting coding regions are arranged on the observed object.
Exemplarily, the device information may be device size information, size information of the coding regions, and angle information between the coding regions, or it may be the device identification number and so on; this is not limited here and can be selected based on actual needs. It can be understood that the size information of the device and the identification number of the device are fixed.
Optionally, based on the device size information, the size information of the coding regions, and the angle information between the coding regions in the device information, the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding regions are converted proportionally into the user's target gaze information on the observed object.
Exemplarily, the device size information may be the screen size of the device, among others.
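A minimal sketch of the proportional conversion, under the simplifying assumption that the two coding regions are axis-aligned (a 90-degree included angle) so that each abstract axis maps linearly onto one screen axis; handling an arbitrary angle between regions would require a full affine transform, which the sketch omits. All names and numbers here are illustrative.

```python
def to_target_gaze(abstract_xy: tuple[float, float],
                   region_size: tuple[float, float],
                   screen_size: tuple[float, float]) -> tuple[float, float]:
    """Proportionally map abstract coordinates expressed in the coding-region
    coordinate system onto screen coordinates of the observed device."""
    ax, ay = abstract_xy
    rw, rh = region_size   # lengths of the two coding regions
    sw, sh = screen_size   # device screen dimensions
    return (ax / rw * sw, ay / rh * sh)

# A gaze point halfway along both coding regions maps to the screen centre.
print(to_target_gaze((32.0, 16.0), (64.0, 32.0), (1920.0, 1080.0)))  # → (960.0, 540.0)
```

The region and screen sizes would come from the matched database record, so only the abstract coordinates need to be recovered from the foreground image.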
In the gaze information determination method provided by Embodiment 1 of the present disclosure, a foreground image and the user's original gaze information on the foreground image are first acquired; next, the coded information displayed in the coding region within the foreground image is identified; the displayed coded information is then compared with at least one piece of reference coded information stored in a coding database to determine the device information of the observed object at which the user is gazing; finally, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding regions are proportionally converted to determine the user's target gaze information on the observed object, where at least two intersecting coding regions are arranged on the observed object. This method overcomes drawbacks of the prior art, such as the high resolution required of the eye tracking device's front camera and the size requirements on the identification code deployed on the device to be observed. By comparing the displayed coded information with the reference coded information stored in the database, the observed device is determined, improving recognition speed; the user's target gaze information on the observed object can be determined, improving both the speed of identifying the device information of the observed object and the accuracy of determining the target gaze information on it.
Embodiment 2
FIG. 2a is a schematic flowchart of a gaze information determination method provided by Embodiment 2 of the present disclosure. Embodiment 2 is optimized on the basis of the above embodiments. In this embodiment, comparing the displayed coded information with at least one piece of reference coded information stored in the coding database to determine the device information of the observed object at which the user is gazing is further embodied as:
selecting one piece of reference coded information from the at least one piece of reference coded information stored in the coding database as target coded information;
comparing the displayed coded information with the target coded information to determine whether the displayed coded information is the target coded information or part of the target coded information;
if so, determining the device information corresponding to the target coded information as the device information of the observed object at which the user is gazing;
if not, continuing to select the next piece of target coded information until the device information of the observed object being gazed at is determined.
Further, determining the user's target gaze information on the observed object by proportionally converting, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding regions includes:
based on the device size information, the size information of the coding regions, and the angle information between the coding regions in the device information, converting the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding regions proportionally into the user's target gaze information on the observed object, where at least two intersecting coding regions are arranged on the observed object.
For details not exhaustively described in this embodiment, refer to Embodiment 1.
As shown in FIG. 2a, the gaze information determination method provided by Embodiment 2 of the present disclosure includes the following steps:
S210. Acquire a foreground image and the user's original gaze information on the foreground image.
S220. Identify the coded information displayed in the coding region within the foreground image.
S230. Select one piece of reference coded information from the at least one piece of reference coded information stored in the coding database as target coded information.
Further, in this embodiment, any piece of reference coded information stored in the coding database consists of one or more pieces of original coded information, the original coded information being obtained by encoding the device information of the observed object; the device information includes device size information, size information of the coding regions, angle information between the coding regions, device identification information, and check code information.
The size information of a coding region may characterize how large the coding region is. The size of a coding region can be characterized by the number of light emitters, and one light emitter can be set to display one bit of coded information.
The angle between coding regions can be regarded as the degree of the included angle between them. The device identification information can be regarded as information that uniquely identifies the device, such as an SN number. The check code information can be regarded as information used for verification.
The coding database may store reference coded information corresponding to multiple objects to be observed. Each piece of reference coded information may consist of one or more pieces of original coded information, and the number of pieces it consists of can be determined based on the size of the coding region and the length of the original coded information.
If the coding region cannot display the complete original coded information, the original coded information can be split temporally; in this case the reference coded information is the original coded information.
If the coding region can display multiple groups of original coded information, the reference coded information may consist of multiple pieces of original coded information, the number of which can be determined based on the size of the coding region and the length of the original coded information.
Further, when the reference coded information consists of multiple pieces of original coded information, a group number is added before each piece of original coded information, so that the abstract coordinates of the original gaze information in the coordinate system of the coding region can be determined based on the group number.
Exemplarily, adding a group number before each piece of original coded information can be regarded as secondary encoding, with the added group numbers distributed over the device of the object to be observed; that is, each coding region contains more than one group of original codes together with their group numbers. During actual recognition, the group number tells the eye tracking device the specific position on the device of the object to be observed at which it is currently gazing. In other words, adding group numbers improves the accuracy of determining the abstract coordinates of the original gaze information within the coding region. When the user is close to the observed object, the front camera of the eye tracking device most likely cannot capture the complete code, so encoding with group numbers benefits the user's free interaction; for a smaller smart device in the network (such as a mobile phone), the original coded information may already occupy the entire screen, in which case secondary encoding may be omitted.
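The group-number idea can be sketched as follows. The 4-bit group prefix and the sample code contents are assumptions for illustration; the disclosure does not fix a group-number width.

```python
def add_group_numbers(original_codes: list[str], group_bits: int = 4) -> list[str]:
    """Secondary encoding: prefix each original code with its group number."""
    return [f"{i:0{group_bits}b}" + code for i, code in enumerate(original_codes)]

def locate_group(visible_chunk: str, group_bits: int = 4) -> int:
    """From a partially captured chunk that begins at a group boundary,
    recover which group of the coding region the camera is seeing."""
    return int(visible_chunk[:group_bits], 2)

groups = add_group_numbers(["1010101010101010", "0101010101010101"])
print(groups[1][:4])            # → 0001, the group number of the second code
print(locate_group(groups[1]))  # → 1
```

Knowing the group index tells the device which stretch of the coding region is in view, which is how a partial capture can still yield abstract coordinates.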
S240. Compare the displayed coded information with at least one piece of reference coded information stored in the coding database.
It can be understood that, when performing the comparison, the eye tracking device compares the displayed coded information with at least one piece of reference coded information stored in the coding database. Exemplarily, the device identification number in the displayed coded information may be compared with the at least one piece of reference coded information stored in the coding database; once the observed device is found, the subsequent eye movement interaction proceeds.
The comparison can be performed on bit values. For example, if the reference coded information is 1110001101 and the displayed coded information is 00110, the reference coded information displayed by the observed object can be considered to be 1110001101, and the observed object is thereby determined. The coded information 00110 is only an example; the captured coded information may differ when the user is at a different position.
S250、确定所显示的编码信息是否为所述目标编码信息或者为所述目标编码信息中的部分信息,若是,则执行S260;若否,则执行S280。S250. Determine whether the displayed coded information is the target coded information or part of the target coded information. If yes, execute S260; if not, execute S280.
部分信息可以为目标编码信息中部分的编码信息。The partial information may be part of the encoding information in the target encoding information.
When the displayed coding information and the target coding information have the same length, the bit values at corresponding positions of the two can be compared directly; if they all match, execute S260; if not, execute S280.
When the lengths differ, the highest bit of the target coding information is compared with the highest bit of the displayed coding information; if they match, the comparison continues with the next bit of each; if not, the next-highest bit of the target coding information is compared with the highest bit of the displayed coding information, and so on, until the comparison reaches the lowest bit of the displayed coding information or a set bit of the target coding information, where the number of bits between that set bit and the lowest bit of the target coding information equals the length of the displayed coding information.
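The comparison procedure above can be sketched as follows. This is an illustrative sketch only: the function name `match_code` and the representation of codes as Python bit strings are assumptions made for the example, not part of the disclosure.

```python
def match_code(displayed: str, target: str) -> bool:
    """Check whether the displayed bit string is the target coding
    information or a contiguous part of it (step S250)."""
    if len(displayed) == len(target):
        # Equal lengths: compare bit values position by position.
        return displayed == target
    if len(displayed) > len(target):
        return False
    # Different lengths: slide the displayed bits along the target,
    # starting from the highest bit, stopping once the window would
    # run past the lowest bit of the target.
    for start in range(len(target) - len(displayed) + 1):
        if target[start:start + len(displayed)] == displayed:
            return True
    return False

# Example from the text: 00110 occurs inside the reference code 1110001101.
print(match_code("00110", "1110001101"))  # True
print(match_code("01010", "1110001101"))  # False
```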
S260. Determine the device information corresponding to the target coding information as the device information of the observed object that the user is gazing at.
S270. Based on the device size information in the device information, the size information of the coding regions, and the angle information between the coding regions, convert the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding regions, according to the proportional relationship, into the target gaze information of the user on the observed object; wherein at least two intersecting coding regions are provided on the observed object.
Exemplarily, as shown in FIG. 2b, the present disclosure provides a schematic diagram of the coding regions of an object to be observed, including coding region 20, coding region 21, and gaze point 23, where coding region 20 and coding region 21 are orthogonal to each other. When the front camera of the eye-tracking device captures coding regions 20 and 21, the proportional relationship between the original gaze information and the coding regions in the foreground image captured by the front camera, combined with the proportional relationship between the coding regions and the object to be observed, can be used to directly compute the user's target gaze information on the observed object. Through this proportional conversion, this embodiment outperforms identification-code-based encoding schemes.
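The proportional conversion described above can be illustrated with a minimal sketch. All names and the strip layout here are hypothetical; it simply shows how the two orthogonal coding regions act as rulers when they cover the device's x and y extents, as in FIG. 2b.

```python
def gaze_to_device_coords(gaze_px, strip_x_px, strip_y_px,
                          strip_x_len_mm, strip_y_len_mm):
    """Map a gaze point from foreground-image pixels to device coordinates,
    using the two orthogonal coding strips as rulers.

    gaze_px         -- (x, y) gaze point in the foreground image, pixels
    strip_x_px      -- (start, end) pixel x-coordinates of the horizontal strip
    strip_y_px      -- (start, end) pixel y-coordinates of the vertical strip
    strip_*_len_mm  -- physical length each strip covers on the device
    """
    gx, gy = gaze_px
    # Fraction of each coding strip the gaze point has covered ...
    fx = (gx - strip_x_px[0]) / (strip_x_px[1] - strip_x_px[0])
    fy = (gy - strip_y_px[0]) / (strip_y_px[1] - strip_y_px[0])
    # ... scaled by the strip's physical length on the device.
    return fx * strip_x_len_mm, fy * strip_y_len_mm

# A gaze point halfway along both strips, on strips spanning 300 mm and
# 200 mm of the device, maps to the device centre.
print(gaze_to_device_coords((600, 350), (200, 1000), (100, 600), 300, 200))
# (150.0, 100.0)
```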
After S270 is executed, the confirmation of the target gaze information may end, or the flow may return to S210 to determine the next target gaze information.
S280. Select the next target coding information and execute S230.
Converting the original gaze information into the target gaze information through the refined device information realizes the proportional conversion and improves the speed of determining the target gaze information.
On the basis of the technical solutions of the foregoing embodiments, the embodiments of the present disclosure provide several specific implementations.
As a specific implementation of this embodiment, exemplarily, when the object to be observed includes at least two intersecting coding regions and the two intersecting coding regions are orthogonal, the whole interaction process, from acquiring the foreground image to determining the user's target gaze information on the observed object, is as follows:
The foreground camera of the eye-tracking device captures images in real time to obtain a foreground image, and at the same time the eye-tracking device projects the eye-movement data onto the foreground image.
In the foreground image, a neural network algorithm is used to obtain candidate regions suspected of being coding regions (once a part has been confirmed as a coding region, a tracking algorithm based on the previous frame takes over, and the neural network algorithm no longer needs to be invoked), and a clustering algorithm is applied to all candidate regions. Cluster centres that satisfy the screening conditions (the distance from the current eye-movement data, the compactness of the clustering, and the morphological judgment of nearby data meet certain requirements) are selected as coding regions for subsequent judgment.
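The screening of cluster centres can be sketched roughly as follows, assuming candidate regions have already been grouped into clusters of points. The function name, the thresholds, and the compactness measure (maximum distance of a point from the cluster centre) are illustrative choices, not the disclosure's exact algorithm, and the morphological judgment is omitted.

```python
import math

def select_coding_region(clusters, gaze, max_dist, max_spread):
    """Pick the cluster centre that best satisfies the screening conditions:
    close to the current eye-movement data and sufficiently compact.

    clusters -- list of clusters, each a list of (x, y) candidate points
    gaze     -- current gaze point (x, y) in the foreground image
    """
    best, best_dist = None, float("inf")
    for cluster in clusters:
        cx = sum(p[0] for p in cluster) / len(cluster)
        cy = sum(p[1] for p in cluster) / len(cluster)
        spread = max(math.dist((cx, cy), p) for p in cluster)  # compactness
        dist = math.dist((cx, cy), gaze)  # distance to current eye data
        if dist <= max_dist and spread <= max_spread and dist < best_dist:
            best, best_dist = (cx, cy), dist
    return best  # None if no cluster passes the screening

clusters = [[(0, 0), (2, 0), (1, 2)], [(50, 50), (52, 50)]]
print(select_coding_region(clusters, (51, 50), max_dist=5, max_spread=5))
# (51.0, 50.0)
```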
Code recognition is performed in that central region, converting the image information into 0/1-bit information (i.e., the coding information).
The coding information is compared with at least one piece of reference coding information stored in the coding database. Because only part of the content may be captured, secondary encoding is used to improve the recognizability of local-area codes, ensuring that the corresponding object to be observed can be identified correctly.
In the foreground image, the proportional positional relationship between the coding information and the current eye-movement data is found, the pixel position of the part to be interacted with is determined, and the corresponding human-computer interaction operation is performed.
It can be understood that, compared with existing solutions, since the encoding of the present disclosure is bit information, it requires lower image definition than encoding schemes such as two-dimensional codes, places smaller demands on hardware, and is recognized more quickly, so it can be applied more effectively in practical scenarios. The orthogonal coding layout also benefits the conversion accuracy of eye-movement information: for a device whose coding fully covers the x and y directions, the size of the code itself can serve as a scale, the coordinate-system conversion step can even be omitted entirely, and the eye-movement data can be converted directly by proportion, which is very convenient.
In addition, when the coding regions are not orthogonal, the angle must be taken into account when calculating the final coordinates; when non-orthogonal devices exist in the coding database, the conditions for morphological judgment may also be relaxed.
Exemplarily, the actual working mode of applying the original encoding to the device to be observed includes the following stages:
For home-oriented devices, first, the object to be observed is added to the home Internet-of-Things network, and a corresponding coding database is generated by encoding the device information of the object to be observed. That is, the coding database is built from the coding information obtained by encoding the device information of the object to be observed, including the coded content, device size information, device identification number, angle information between the regions used for encoding, check-code information, and so on. Then, original coding information of a corresponding length is generated according to the coding database and a hash algorithm. The original coding information is then deployed on the infrared fluorescent region of the device to be observed, and its fluorescence frequency is adjusted according to constraints such as power consumption.
It can be understood that the screen size of the device (i.e., the device size information) and the device identification number are fixed; a hash algorithm is then used to generate the coded content from this fixed information. Based on the coded content and the form factor of the device itself, the angle information, check-code information, and so on are generated.
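As a rough sketch of this stage, the fixed device information can be hashed down to a short bit string. The disclosure does not name a specific hash algorithm; SHA-256 and the field layout used below are stand-ins chosen for the example, and the function name is hypothetical.

```python
import hashlib

def make_code(device_id: str, screen_w_mm: int, screen_h_mm: int,
              n_bits: int = 16) -> str:
    """Generate an n-bit code string from the fixed device information
    (identification number and screen size)."""
    # Field layout and separator are illustrative assumptions.
    fixed = f"{device_id}:{screen_w_mm}x{screen_h_mm}".encode()
    digest = hashlib.sha256(fixed).digest()
    bits = "".join(f"{byte:08b}" for byte in digest)
    return bits[:n_bits]  # truncate to the required code length

code = make_code("SN-12345", 300, 200)
print(len(code))  # 16
```

Because the inputs are fixed, the same device always yields the same code, which is what lets the coding database be built once when the device joins the network.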
When an eye-tracking device within this home IoT network is in use, its front-end infrared sensing component can identify whether the current gaze area contains a coding system that has already been registered in the system.
Through the mutual confirmation and interaction between the coding information displayed by the device to be observed and the reference coding information in the coding database, the original gaze information can be converted onto the target coded device; and because the original coding information contains the size information of the coded device itself, all gaze information can be projected onto the coded device by the corresponding proportion.
At this point, the eye-tracking device can interact with the target device to realize various everyday applications such as waking the screen, selecting functions, and switching devices on and off. Thus, in a home IoT network in which several smart devices are deployed, one or more users wearing eye-tracking devices can interact with any of the smart devices within the network.
It can be understood that different households correspond to different home IoT networks. When the above encoding scheme is applied to a given home IoT network, the number of devices within the same network is controllable, so a very long code length is unnecessary when generating codes, which in turn improves the availability and recognition efficiency of the codes.
For a big-data information repository, the basic encoding implementation remains the same; the main difference lies in the length of the original coding information. Since the required code length grows significantly, a time-sequenced polling and switching scheme may need to be deployed, and the detection process consumes more time: generally, the code must be captured for more than two cycles before the coding information can be determined.
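The multi-cycle confirmation can be sketched as follows; the function name and the use of a counter over decoded readings are assumptions made for illustration.

```python
from collections import Counter

def confirm_code(readings, min_cycles=2):
    """Accept a code only after it has been decoded identically for at
    least min_cycles polling cycles (the text suggests two or more).

    readings -- iterable of bit strings, one decoded read per cycle
    """
    seen = Counter()
    for reading in readings:
        seen[reading] += 1
        if seen[reading] >= min_cycles:
            return reading  # confirmed coding information
    return None  # not enough consistent cycles yet

print(confirm_code(["0110", "0111", "0110"]))  # 0110
print(confirm_code(["0110", "0111"]))          # None
```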
Further, exemplarily, for a newly purchased smart device supporting this function (i.e., an object to be observed) in a given household, the encoding logic is as follows:
First, a 16-bit hash code is generated based on data such as the smart device's MAC address and device SN (Serial Number, i.e., the product serial number). Then, according to the width and height of the smart device, if it can display many groups of the 16-bit code, secondary encoding is performed: a group number is added as a data header to each 16-bit segment (this segment length is related to the number of household devices). With reference to the width and height of the smart device, 4-bit data is considered for the group number (this length is related to the ratio of the device size to the length of the preceding code segment).
Each code segment of this smart device is thus 20 bits, and 16 groups of codes are deployed across the horizontal and vertical space of the whole code. Their spatial arrangement can be seen in FIG. 2c. Exemplarily, the "0000", "0001", "0111", and "1111" shown in FIG. 2c are group numbers, and "xxxxxxxxxxxxxxxx" is the original coding information. The coding information may be the reference coding information included in the foreground image, and the reference coding information may be the coding information actually displayed by the observed object; the coding information shown in FIG. 2c is the reference coding information, i.e., 0000xxxxxxxxxxxxxxxx, 0001xxxxxxxxxxxxxxxx, and so on up to 1111xxxxxxxxxxxxxxxx. When the foreground camera of the eye-tracking device captures a partial area of the smart device, a foreground image is obtained; the foreground image contains an area displaying coding information, and by recognizing the coding region in the foreground image, the coding information displayed in that region is obtained. Based on the group number in the coding information, the specific position on the smart device at which the eye-tracking device is gazing is found, and the whole interaction process described above is completed accordingly.
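The 20-bit segment layout of FIG. 2c can be sketched as follows, where `build_segments` and `locate` are hypothetical helper names: a 4-bit group number is prefixed to the 16-bit original code, and the group number recovered from a captured segment indicates which part of the device was seen.

```python
def build_segments(original_code: str, n_groups: int = 16, group_bits: int = 4):
    """Prefix the 16-bit original code with a group number, producing the
    20-bit segments laid out across the device (as in FIG. 2c)."""
    assert len(original_code) == 16
    return [f"{g:0{group_bits}b}" + original_code for g in range(n_groups)]

def locate(segment: str, group_bits: int = 4) -> int:
    """Recover which part of the device a captured segment came from."""
    return int(segment[:group_bits], 2)

segments = build_segments("1010110010101100")
print(segments[0])          # 00001010110010101100  (group 0000)
print(locate(segments[7]))  # 7  -> group 0111
```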
The gaze information determination method provided in Embodiment 2 of the present disclosure specifically optimizes the operation of comparing the displayed coding information with at least one piece of actually displayed reference coding information stored in the coding database to determine the device information of the observed object that the user is gazing at, as well as the specific encoding operation. This method overcomes the shortcomings of the encoding schemes of existing electronic devices or devices to be observed that interact with eye-tracking devices; it enables the determination of the user's target gaze information on the observed object and improves the recognition speed of the observed object's device information and the confirmation accuracy of the target gaze information on the observed object, giving the user a better human-computer interaction experience.
Embodiment 3
FIG. 3 is a schematic structural diagram of a gaze information determining apparatus provided in Embodiment 3 of the present disclosure. The apparatus is applicable to determining the target gaze information of the original gaze information on the observed object, where the apparatus can be implemented in software and/or hardware and is generally integrated on an eye-tracking device.
As shown in FIG. 3, the apparatus includes an acquisition module 31, a recognition module 32, a device information determination module 33, and a target gaze information determination module 34.
The acquisition module 31 is configured to acquire a foreground image and the user's original gaze information on the foreground image;
the recognition module 32 is configured to recognize the coding information displayed in the coding regions within the foreground image;
the device information determination module 33 is configured to compare the displayed coding information with at least one piece of reference coding information stored in a coding database and determine the device information of the observed object that the user is gazing at;
the target gaze information determination module 34 is configured to proportionally convert, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding regions, and determine the user's target gaze information on the observed object;
wherein at least two intersecting coding regions are provided on the observed object.
In this embodiment, the apparatus first acquires, through the acquisition module 31, a foreground image and the user's original gaze information on the foreground image; then recognizes, through the recognition module 32, the coding information displayed in the coding regions within the foreground image; then compares, through the device information determination module 33, the displayed coding information with at least one piece of reference coding information stored in the coding database to determine the device information of the observed object that the user is gazing at; and finally, through the target gaze information determination module 34, proportionally converts, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding regions, determining the user's target gaze information on the observed object; wherein at least two intersecting coding regions are provided on the observed object.
This embodiment provides a gaze information determining apparatus that can determine the user's target gaze information on the observed object and improve the recognition speed of the observed object's device information and the confirmation accuracy of the target gaze information on the observed object.
Further, the device information determination module 33 is specifically configured to:
select one piece of actually displayed reference coding information from the at least one piece of actually displayed reference coding information stored in the coding database as the target coding information;
compare the displayed coding information with the target coding information to determine whether the displayed coding information is the target coding information or part of the target coding information;
if so, determine the device information corresponding to the target coding information as the device information of the observed object that the user is gazing at;
if not, continue to select the next target coding information until the device information of the observed object being gazed at is determined.
Further, the target gaze information determination module 34 is specifically configured to:
convert, based on the device size information in the device information, the size information of the coding regions, and the angle information between the coding regions, the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding regions into the user's target gaze information on the observed object according to the proportional relationship.
On the basis of the above optimization, any piece of reference coding information stored in the coding database is composed of one or more pieces of original coding information, where the original coding information is obtained by encoding the device information of the observed object, and the device information includes device size information, size information of the coding regions, angle information between the coding regions, device identification information, and check-code information.
On the basis of the above optimization, when the reference coding information is composed of multiple pieces of original coding information, a group number is added before each piece of original coding information, so that the abstract coordinates of the original gaze information in the coordinate system of the coding region are determined based on the group number.
Further, the foreground image in the acquisition module 31 is a single frame or multiple frames; the number of the multiple frames is determined based on the length of the display period of the corresponding reference coding information on the observed object.
The above gaze information determining apparatus can execute the gaze information determination method provided by any embodiment of the present disclosure, and has the functional modules and beneficial effects corresponding to executing the method.
Embodiment 4
FIG. 4 is a schematic structural diagram of an eye-tracking device provided in Embodiment 4 of the present disclosure. As shown in FIG. 4, the eye-tracking device provided in Embodiment 4 of the present disclosure includes one or more processors 41 and a storage apparatus 42. There may be one or more processors 41 in the eye-tracking device; one processor 41 is taken as an example in FIG. 4. The storage apparatus 42 is configured to store one or more programs; the one or more programs are executed by the one or more processors 41, so that the one or more processors 41 implement the gaze information determination method described in any embodiment of the present disclosure.
The eye-tracking device may further include an input apparatus 43 and an output apparatus 44.
The processor 41, storage apparatus 42, input apparatus 43, and output apparatus 44 in the eye-tracking device may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 4.
The storage apparatus 42 in the eye-tracking device, as a computer-readable storage medium, can be configured to store one or more programs, which may be software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the gaze information determination method provided in Embodiment 1 or 2 of the present disclosure (for example, the modules in the gaze information determining apparatus shown in FIG. 3, including the acquisition module 31, the recognition module 32, the device information determination module 33, and the target gaze information determination module 34). By running the software programs, instructions, and modules stored in the storage apparatus 42, the processor 41 executes the various functional applications and data processing of the eye-tracking device, that is, implements the gaze information determination method in the above method embodiments.
The storage apparatus 42 may include a program storage area and a data storage area, where the program storage area may store the operating system and the application programs required by at least one function, and the data storage area may store data created according to the use of the eye-tracking device, and the like. In addition, the storage apparatus 42 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the storage apparatus 42 may further include memories located remotely from the processor 41, and these remote memories may be connected to the device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input apparatus 43 can be configured to receive input digit or character information and to generate key signal inputs related to user settings and function control of the eye-tracking device. The output apparatus 44 may include a display device such as a display screen.
Moreover, when the one or more programs included in the above eye-tracking device are executed by the one or more processors 41, the programs perform the following operations:
acquiring a foreground image and the user's original gaze information on the foreground image;
recognizing the coding information displayed in the coding regions within the foreground image;
comparing the displayed coding information with at least one piece of reference coding information stored in a coding database to determine the device information of the observed object that the user is gazing at;
proportionally converting, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding regions to determine the user's target gaze information on the observed object;
wherein at least two intersecting coding regions are provided on the observed object.
Embodiment 5
FIG. 5 is a schematic diagram of an object to be observed provided in Embodiment 5 of the present disclosure. As shown in FIG. 5, the object to be observed includes at least two intersecting coding regions 50 and 51, and the coding regions 50 and 51 are configured to display the reference coding information of the object to be observed described in the embodiments of the present disclosure.
The object to be observed provided by the embodiments of the present disclosure enables an eye-tracking device to determine the user's target gaze information on the observed object, improving the recognition speed of the observed object's device information and the confirmation accuracy of the target gaze information on the observed object. Further, the coding regions 50 and 51 are at least one of the following: the display screen of the object to be observed; an invisible-light component provided on the object to be observed, where the invisible-light component displays the coding information through the brightness and darkness of the invisible light it emits.
Further, the coding regions 50 and 51 are arranged along the edge of the object to be observed, or the coding regions are arranged along the edge of the display screen of the object to be observed.
Exemplarily, for any smart device with a display screen, a portion of its screen area can be taken as the coding regions, or infrared fluorescent components can be deployed under the screen, emitting invisible light from beneath it (the area is unchanged, and normal use of the display is not affected). For a device to be observed on which no new hardware is additionally deployed, the code can be displayed on two orthogonal thin strips on its screen; for an object to be observed on which new hardware may be additionally deployed, the code can be displayed directly by the light emitters under its screen. The light emitter contains multiple infrared lamp groups, and the infrared lamps can be deployed in a square array to meet this requirement without affecting the user's perception. In addition, for some objects to be observed with abstract shapes (non-rectangular, or lacking the space to deploy orthogonal codes), the two "mutually orthogonal" coding sequences of the code may be at an angle other than 90°.
Embodiment 6
Embodiment 6 of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program is configured to perform a gaze information determination method, the method comprising:
acquiring a foreground image and the user's original gaze information on the foreground image;
recognizing the coding information displayed in the coding regions within the foreground image;
comparing the displayed coding information with at least one piece of reference coding information stored in a coding database to determine the device information of the observed object that the user is gazing at;
proportionally converting, based on the device information and the original gaze information, the abstract coordinates in the coordinate system corresponding to the coding regions to determine the user's target gaze information on the observed object;
wherein at least two intersecting coding regions are provided on the observed object.
Optionally, when executed by a processor, the program may also be configured to perform the gaze information determination method provided by any embodiment of the present disclosure.
本公开实施例的计算机存储介质,可以采用一个或多个计算机可读的介质的任意组合。计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质。计算机可读存储介质例如可以是但不限于电、磁、光、电磁、红外线或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机存取存储器(Random Access Memory,RAM)、只读存储器(Read Only Memory,ROM)、可擦式可编程只读存储器(Erasable Programmable Read Only Memory,EPROM)、闪存、光纤、便携式CD-ROM、光存储器件、磁存储器件、或者上述的任意合适的组合。计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。The computer storage medium in the embodiments of the present disclosure may use any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or any combination thereof. More specific examples (non-exhaustive list) of computer-readable storage media include: electrical connections with one or more conductors, portable computer disks, hard disks, Random Access Memory (RAM), read-only memory (Read Only Memory, ROM), Erasable Programmable Read Only Memory (EPROM), flash memory, optical fiber, portable CD-ROM, optical storage device, magnetic storage device, or any suitable combination of the above . A computer readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical cable, Radio Frequency (RF), etc., or any suitable combination of the foregoing.
Computer program code for carrying out the operations of the present disclosure may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above are only preferred embodiments of the present disclosure and the technical principles applied. Those skilled in the art will understand that the present disclosure is not limited to the specific embodiments described herein, and that various obvious changes, rearrangements, and substitutions can be made without departing from the protection scope of the present disclosure. Therefore, although the present disclosure has been described in detail through the above embodiments, it is not limited to those embodiments and may include further equivalent embodiments without departing from the concept of the present disclosure; the scope of the present disclosure is determined by the scope of the appended claims.
Industrial Applicability
The method provided by the embodiments of the present disclosure can be applied to eye tracking devices. Using the device information of the observed object at which the user is gazing and the user's original gaze information on the foreground image, the abstract coordinates in the coordinate system corresponding to the coding region within the foreground image are directly scale-converted, improving conversion accuracy. Performing gaze information conversion through coding regions lowers the hardware requirements and speeds up recognition of the observed object, thereby solving the problems in the related art of high resolution requirements on the foreground camera of the eye tracking device and of the limited usage scenarios of eye tracking devices.

Claims (12)

  1. A gaze information determination method, applied to an eye tracking device, the method comprising:
    acquiring a foreground image and a user's original gaze information on the foreground image;
    identifying coded information displayed in a coding region within the foreground image;
    comparing the displayed coded information with at least one piece of reference coded information stored in a coding database to determine device information of an observed object at which the user is gazing;
    performing scale conversion on abstract coordinates in a coordinate system corresponding to the coding region based on the device information and the original gaze information to determine the user's target gaze information on the observed object;
    wherein at least two intersecting coding regions are provided on the observed object.
  2. The method according to claim 1, wherein comparing the displayed coded information with the at least one piece of reference coded information stored in the coding database to determine the device information of the observed object at which the user is gazing comprises:
    selecting one piece of reference coded information from the at least one piece of reference coded information stored in the coding database as target coded information;
    comparing the displayed coded information with the target coded information to determine whether the displayed coded information is the target coded information or a part of the target coded information;
    if so, determining the device information corresponding to the target coded information as the device information of the observed object at which the user is gazing;
    if not, continuing to select a next piece of target coded information until the device information of the observed object being gazed at is determined.
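Claim 2's selection-and-comparison loop can be sketched in a few lines. This is only an illustration: the database layout `{reference_code: device_info}` and the substring test for "a part of the target coded information" are my assumptions, not the patent's actual data format.

```python
def match_device(displayed, reference_db):
    """Select target codes one by one until the displayed code is found
    to be a target code, or a part of one; return that code's device
    info, or None when no reference code matches."""
    for target_code, device_info in reference_db.items():
        if displayed == target_code or displayed in target_code:
            return device_info
    return None  # exhausted the database without a match

db = {"001122": {"id": "screen-A"}, "334455": {"id": "screen-B"}}
print(match_device("1122", db))   # partial match against "001122"
print(match_device("9999", db))   # no reference code matches
```

Because a camera may capture only part of a displayed code, accepting a partial (substring) match is what lets recognition succeed before a full display period has elapsed.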
  3. The method according to claim 1, wherein performing scale conversion on the abstract coordinates in the coordinate system corresponding to the coding region based on the device information and the original gaze information to determine the user's target gaze information on the observed object comprises:
    converting, according to a proportional relationship and based on device size information in the device information, size information of the coding regions, and angle information between the coding regions, the abstract coordinates of the original gaze information in the coordinate system corresponding to the coding region into the user's target gaze information on the observed object.
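For the common case of two coding regions meeting at a right angle along the device edges, the proportional relationship of claim 3 reduces to a pair of linear scalings. A minimal sketch follows; the function and parameter names are illustrative, and the right-angle assumption is mine rather than the patent's.

```python
def to_target_gaze(abstract_xy, region_lengths, device_size_mm):
    """Scale abstract coordinates, expressed in coding-region units,
    into physical coordinates on the observed device.

    abstract_xy    -- gaze point in coding-region units, e.g. (5.0, 2.5)
    region_lengths -- length of each coding region in those same units
    device_size_mm -- physical width and height of the observed device
    """
    ax, ay = abstract_xy
    lx, ly = region_lengths
    w, h = device_size_mm
    # Position along each coding region maps linearly onto the
    # corresponding device edge.
    return (ax / lx * w, ay / ly * h)

# A 400 mm x 250 mm screen with 10-unit coding regions on two edges:
print(to_target_gaze((5.0, 2.5), (10.0, 10.0), (400.0, 250.0)))  # (200.0, 62.5)
```

When the regions are not perpendicular, the angle information from the device information would enter as a shear term in the mapping; the linear form above is the degenerate 90-degree case.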
  4. The method according to claim 1, wherein any piece of reference coded information stored in the coding database consists of one or more pieces of original coded information, the original coded information being obtained by encoding the device information of the observed object, and the device information comprises device size information, size information of the coding regions, angle information between the coding regions, device identification information, and check code information.
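As an illustration of the kind of encoding claim 4 describes, the device fields could be packed into one string with an appended check code. The field layout and the modulo-256 checksum below are assumptions made for the sketch, not the patent's actual scheme.

```python
def encode_device_info(device_id, size_mm, region_mm, angle_deg):
    # Pack the device fields and append a simple checksum as the
    # check code (illustrative only).
    payload = f"{device_id}|{size_mm[0]}x{size_mm[1]}|{region_mm}|{angle_deg}"
    check = sum(payload.encode()) % 256
    return f"{payload}|{check:02X}"

def verify(coded):
    # Recompute the checksum over everything before the last separator.
    payload, _, check = coded.rpartition("|")
    return sum(payload.encode()) % 256 == int(check, 16)

code = encode_device_info("DEV42", (300, 200), 25, 90)
print(verify(code))  # an intact code verifies as True
```

The check code is what lets the device reject a code that was misread from the foreground image before it is ever compared against the database.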
  5. The method according to claim 4, wherein, when a piece of reference coded information consists of a plurality of pieces of original coded information, a group number is added before each piece of original coded information, so that the abstract coordinates of the original gaze information in the coordinate system of the coding region are determined based on the group number.
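A sketch of how the group-number prefix of claim 5 might be used; the single-digit prefix format and the fixed segment length are assumptions for illustration only.

```python
def split_group(original_code):
    """Split off an assumed one-digit group number, returning
    (group, payload). The group number identifies which segment of the
    full reference code this piece is, anchoring the abstract
    coordinate of the gaze point along the coding region."""
    return int(original_code[0]), original_code[1:]

def abstract_offset(group, segment_len):
    # Each group covers one equal-length segment of the coding region,
    # so the group number fixes the coarse offset of the observed piece.
    return group * segment_len

group, payload = split_group("2AB")
print(group, payload, abstract_offset(group, 8.0))  # 2 AB 16.0
```

In other words, even when the camera sees only one piece of a long code, the prefix tells the device where along the coding region that piece sits, which is what makes the subsequent scale conversion possible.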
  6. The method according to claim 1, wherein the foreground image is a single frame image or a plurality of frame images, and the number of the plurality of frame images is determined based on the length of the display period of the reference coded information corresponding to the observed object.
  7. A gaze information determination apparatus, comprising:
    an acquisition module configured to acquire a foreground image and a user's original gaze information on the foreground image;
    an identification module configured to identify coded information displayed in a coding region within the foreground image;
    a device information determination module configured to compare the displayed coded information with at least one piece of reference coded information stored in a coding database to determine device information of an observed object at which the user is gazing;
    a target gaze information determination module configured to perform scale conversion on abstract coordinates in a coordinate system corresponding to the coding region based on the device information and the original gaze information to determine the user's target gaze information on the observed object;
    wherein at least two intersecting coding regions are provided on the observed object.
  8. An eye tracking device, comprising:
    one or more processors; and
    a storage apparatus configured to store one or more programs,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-6.
  9. An object to be observed, comprising at least two intersecting coding regions, wherein the coding regions are used to display reference coded information of the object to be observed.
  10. The object to be observed according to claim 9, wherein the coding region is at least one of the following:
    a display screen of the object to be observed;
    an invisible-light component provided on the object to be observed, the invisible-light component displaying coded information through the brightness and darkness of the invisible light it emits.
  11. The object to be observed according to claim 9, wherein the coding regions are arranged along an edge of the object to be observed or along an edge of a display screen of the object to be observed.
  12. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
PCT/CN2023/076256 2022-02-18 2023-02-15 Gaze information determining method and apparatus, eye tracking device, object to be observed, and medium WO2023155813A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210152550.X 2022-02-18
CN202210152550.XA CN116665292A (en) 2022-02-18 2022-02-18 Gaze information determination method, device, eye movement equipment, object to be observed and medium

Publications (1)

Publication Number Publication Date
WO2023155813A1 true WO2023155813A1 (en) 2023-08-24

Family

ID=87577567

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/076256 WO2023155813A1 (en) 2022-02-18 2023-02-15 Gaze information determining method and apparatus, eye tracking device, object to be observed, and medium

Country Status (2)

Country Link
CN (1) CN116665292A (en)
WO (1) WO2023155813A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014067203A (en) * 2012-09-26 2014-04-17 Kyocera Corp Electronic apparatus, gazing point detection program, and gazing point detection method
CN105205799A (en) * 2014-06-10 2015-12-30 北京七鑫易维信息技术有限公司 Device with omnibearing feature face and augmented reality three-dimensional registration system
CN110647790A (en) * 2019-04-26 2020-01-03 北京七鑫易维信息技术有限公司 Method and device for determining gazing information
CN112651270A (en) * 2019-10-12 2021-04-13 北京七鑫易维信息技术有限公司 Gaze information determination method and apparatus, terminal device and display object

Also Published As

Publication number Publication date
CN116665292A (en) 2023-08-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23755815

Country of ref document: EP

Kind code of ref document: A1