CN117406851A - Equipment control method and device, electronic equipment and storage medium - Google Patents

Equipment control method and device, electronic equipment and storage medium

Info

Publication number
CN117406851A
CN117406851A (Application CN202210806813.4A)
Authority
CN
China
Prior art keywords: eye, display screen, determining, coordinate, image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210806813.4A
Other languages
Chinese (zh)
Inventor
易军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd, Beijing Xiaomi Pinecone Electronic Co Ltd filed Critical Beijing Xiaomi Mobile Software Co Ltd
Priority to CN202210806813.4A priority Critical patent/CN117406851A/en
Publication of CN117406851A publication Critical patent/CN117406851A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 - Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 - Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F 3/013 - Eye tracking input arrangements
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161 - Detection; Localisation; Normalisation
    • G06V 40/165 - Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/18 - Eye characteristics, e.g. of the iris
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20 - Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Ophthalmology & Optometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The disclosure relates to a device control method and apparatus, an electronic device, and a storage medium. The device control method includes the following steps: determining the line of sight direction of the eyes of an acquisition object according to an acquired image containing an eye imaging area; determining the three-dimensional coordinates of the eyes in an image coordinate system according to the focal length used by an image acquisition module to generate the acquired image; determining the intersection point position between the line of sight direction and the display plane where the display screen of a controlled device is located, according to the line of sight direction and the three-dimensional coordinates; and controlling the controlled device according to the intersection point position.

Description

Equipment control method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of electronic technology, and in particular, to a device control method and apparatus, an electronic device, and a storage medium.
Background
Screen control and human-computer interaction realized through the line of sight work as follows:
a face image is collected, the line of sight direction of the human eyes and the gaze point on the screen are estimated, and the result is used to light the screen or to drive interactions such as play, pause, previous track, and next track during music playback.
Related technical solutions are as follows:
(1) An eye tracker can accurately estimate the line of sight direction and the gaze point, but the equipment is complex, expensive, and heavy, must be used with dedicated software, is generally limited to laboratory research, and cannot be used in consumer-grade scenarios.
(2) A depth camera is adopted, including: a binocular color camera, a Time-of-Flight (TOF) module, or a structured-light camera, etc. The depth of the human eyes is estimated through the depth camera, and the gaze point can be estimated by combining the position of the human eyes with the line of sight direction. In short, this approach requires a depth camera, and depth estimation requires heavy computation or a dedicated chip, consumes considerable power, and has limited applicability.
Whether an eye tracker or a depth camera is used, additional equipment is required to assist in determining the gaze point of the human eyes.
Disclosure of Invention
The embodiment of the disclosure provides a device control method and device, electronic equipment and a storage medium.
A first aspect of an embodiment of the present disclosure provides a device control method, including:
determining the line of sight direction of the eyes of an acquisition object according to an acquired image containing an eye imaging area;
determining the three-dimensional coordinates of the eyes in an image coordinate system according to the focal length used by the image acquisition module to generate the acquired image;
determining the intersection point position between the line of sight direction and the display plane where the display screen of the controlled device is located, according to the line of sight direction and the three-dimensional coordinates;
and controlling the controlled device according to the intersection point position.
Based on the above scheme, the determining the three-dimensional coordinates of the eyes in the image coordinate system according to the focal length used to generate the acquired image includes:
determining a first coordinate value of the eyes on a first coordinate axis of the image coordinate system according to the imaging diameter of the eyes on the acquired image, the iris diameter of the acquisition object, and the focal length;
determining a second coordinate value of the eyes on a second coordinate axis of the image coordinate system and a third coordinate value on a third coordinate axis according to the coordinates of the eye center point on the acquired image and the coordinates of the center point of the acquired image; any two of the first coordinate axis, the second coordinate axis, and the third coordinate axis are perpendicular to each other; the first coordinate value, the second coordinate value, and the third coordinate value together form the three-dimensional coordinates.
Based on the above scheme, the determining, according to the imaging diameter of the eyes on the acquired image, the iris diameter of the acquisition object, and the focal length, a first coordinate value of the eyes on a first coordinate axis of the image coordinate system includes:
determining a first product between the iris diameter and the focal length;
determining a second product between the imaging diameter and the size of a single pixel on the image acquisition module;
and determining the first coordinate value according to the quotient between the first product and the second product.
Based on the above scheme, the determining, according to the coordinates of the eye center point on the acquired image and the coordinates of the center point of the acquired image, the second coordinate value of the eyes on the second coordinate axis of the image coordinate system and the third coordinate value on the third coordinate axis includes:
determining the second coordinate value according to a first value of the eye center point coordinates on the second coordinate axis, a second value of the center point of the acquired image on the second coordinate axis, the first coordinate value, and the focal length;
and determining the third coordinate value according to a third value of the eye center point coordinates on the third coordinate axis, a fourth value of the center point of the acquired image on the third coordinate axis, the first coordinate value, and the focal length.
Based on the above scheme, the determining the line of sight direction of the eyes of the acquisition object according to the acquired image containing the eye imaging area includes:
performing face detection on the acquired image to obtain face key points;
obtaining the eye imaging area according to the eye key points among the face key points;
determining the head posture features of the acquisition object according to the face key points;
and determining the line of sight direction of the eyes of the acquisition object according to the eye imaging area and the head posture features.
Based on the above scheme, the controlling the controlled device according to the intersection point position includes:
determining whether the line of sight of the eyes is projected on the display screen according to the intersection point position, to obtain a determination result;
and controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment.
Based on the above scheme, the controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment includes:
controlling the on-off state of the display screen according to the duration, indicated by the determination result, for which the line of sight of the eyes is or is not projected on the display screen, and the on-off state of the display screen at the current moment;
or,
when the display screen is currently in the bright screen state, determining the position change of the line of sight of the eyes projected on the display screen according to the determination result, and controlling the output content of the controlled device.
Based on the above scheme, the controlling the on-off state of the display screen according to the duration, indicated by the determination result, for which the line of sight of the eyes is or is not projected on the display screen, and the on-off state of the display screen at the current moment includes:
when the display screen is in the bright screen state, if the determination result indicates that the duration for which the line of sight of the eyes is projected outside the display screen is longer than or equal to a first duration, controlling the display screen to turn off;
or,
when the display screen is in the off-screen state, if the determination result indicates that the duration for which the line of sight of the eyes is projected within the display screen is longer than or equal to a second duration, controlling the display screen to light up.
Based on the above scheme, the determining, when the display screen is currently in the bright screen state, the position change of the line of sight of the eyes projected on the display screen according to the determination result, and controlling the output content of the controlled device, includes:
if the display screen is in the bright screen state and the intersection point position is located within the display screen, determining the position change of the line of sight of the eyes projected on the display screen according to the intersection point position;
determining the target content to which the controlled device is to switch according to the position change;
and controlling the controlled device to output the target content.
A second aspect of an embodiment of the present disclosure provides a device control apparatus, including:
a first determining module, configured to determine the line of sight direction of the eyes of an acquisition object according to an acquired image containing an eye imaging area;
a second determining module, configured to determine the three-dimensional coordinates of the eyes in an image coordinate system according to the focal length used by the image acquisition module to generate the acquired image;
a third determining module, configured to determine the intersection point position between the line of sight direction and the display plane where the display screen of the controlled device is located, according to the line of sight direction and the three-dimensional coordinates;
and a control module, configured to control the controlled device according to the intersection point position.
A third aspect of an embodiment of the present disclosure provides an electronic device, including:
a memory for storing processor-executable instructions;
a processor connected to the memory;
wherein the processor is configured to perform the device control method as provided in any one of the technical solutions of the first aspect.
A fourth aspect of the disclosed embodiments provides a non-transitory computer-readable storage medium storing instructions which, when executed by a processor of a computer, enable the computer to perform the device control method as provided in any one of the technical solutions of the first aspect.
The technical scheme provided by the embodiment of the disclosure can comprise the following beneficial effects:
According to the technical scheme provided by the embodiments of the disclosure, after an acquired image containing an eye imaging area is detected, the line of sight direction of the eyes of the acquisition object is determined directly from the acquired image, and the three-dimensional coordinates of the eyes in the image coordinate system are determined; therefore, the intersection point between the line of sight direction and the plane of the display screen of the controlled device can be determined, and the controlled device can be controlled directly based on the intersection point. The line of sight of the user can thus be detected without the assistance of other devices such as an eye tracker or a depth camera, which simplifies line-of-sight control and reduces hardware cost.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flowchart illustrating a device control method according to an exemplary embodiment;
FIG. 2 is a flowchart illustrating a device control method according to an exemplary embodiment;
FIG. 3 is a flowchart illustrating a device control method according to an exemplary embodiment;
Fig. 4A is a schematic diagram showing the intersection point position of the line of sight direction and the plane on which the display screen of the controlled device is located, according to an exemplary embodiment;
Fig. 4B is a schematic diagram showing the intersection point position of the line of sight direction and the plane on which the display screen of the controlled device is located, according to an exemplary embodiment;
Fig. 5 is a schematic structural diagram of a device control apparatus according to an exemplary embodiment;
Fig. 6 is a schematic structural diagram of an electronic device according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the disclosure as detailed in the appended claims.
As shown in fig. 1, an embodiment of the present disclosure provides a device control method, including:
S110: determining the line of sight direction of the eyes of the acquisition object according to an acquired image containing an eye imaging area;
S120: determining the three-dimensional coordinates of the eyes in an image coordinate system according to the focal length used by the image acquisition module to generate the acquired image;
S130: determining the intersection point position between the line of sight direction and the display plane where the display screen of the controlled device is located, according to the line of sight direction and the three-dimensional coordinates;
S140: controlling the controlled device according to the intersection point position.
In the embodiment of the disclosure, the device control method may be executed by the electronic device itself. The electronic device may include a mobile device and/or a stationary device.
The mobile device includes, but is not limited to: foldable mobile phones, foldable tablet computers, tablet-and-notebook two-in-one devices, foldable electronic readers, wearable devices, and the like.
The electronic device may include an image acquisition module. The image acquisition module can be a common red, green and blue (RGB) camera. The camera may be a monocular camera. The image acquisition module may be a front camera or a rear camera, for example.
The image acquisition module may include: an image sensor and a lens assembly. The lens assembly includes one or more optics, including but not limited to lenses and/or mirrors. The lens assembly is positioned in front of the image sensor and directs light onto the image sensor.
The acquisition object may be a target user. The line of sight direction of the eyes of the acquisition object can be determined by processing the acquired image with a software algorithm.
Illustratively, the acquired image is processed using one or more neural networks or other deep learning models, which output the line of sight direction.
The image coordinate system is a three-dimensional coordinate system. The XoY plane of the image coordinate system may be the plane where the light-sensing surface of the image sensor of the image acquisition module is located, and the Z axis may be the coordinate axis perpendicular to the XoY plane, typically coinciding with the line connecting the optical center and the focal point, i.e., the optical axis.
In embodiments of the present disclosure, the focal length of the image acquisition module is adjustable. The distance from the eyes to the image acquisition module can be estimated by combining the imaging of the eyes on the acquired image with the focal length used when the image acquisition module generated the acquired image; this distance can serve as the Z-axis coordinate in the image coordinate system.
The imaging of the eyes on the image reflects the position of the eyes of the acquisition object relative to the image acquisition module within the XoY plane. Thus, in the disclosed embodiments, the position of the user's eyes within the image coordinate system may be determined directly by an ordinary image acquisition module (e.g., a color camera) without using an eye tracker or a depth camera.
After the position of the eyes in the image coordinate system is determined, it is combined with the line of sight direction of the eyes. According to the relative position between the image acquisition module and the display screen, the intersection point of the user's line of sight with the plane where the display screen is located can be obtained.
The controlled device may be controlled to perform various operations according to the intersection point position. The controlled device may be any of the electronic devices described above. The device control method can be executed by the controlled device itself or by the control device of the controlled device.
In summary, the device control method provided by the embodiments of the present disclosure may, after detecting an acquired image containing an eye imaging area, directly determine the line of sight direction of the eyes of the acquisition object from the acquired image and determine the three-dimensional coordinates of the eyes in the image coordinate system; therefore, the intersection point between the line of sight direction and the plane of the display screen of the controlled device can be determined, and the controlled device can be controlled directly based on the intersection point. The line of sight of the user is detected without the assistance of other devices such as an eye tracker or a depth camera, which simplifies line-of-sight control and reduces hardware cost.
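For orientation, the following is a minimal sketch of the four steps S110 to S140 in Python; the two estimator functions are illustrative stubs standing in for the processing described later, and all numeric values are assumptions rather than values from the disclosure.

```python
# Minimal end-to-end sketch of S110-S140. The two estimators are
# illustrative stubs, not models or values from the disclosure.
import numpy as np

def estimate_gaze_direction(frame):
    # Stub: a real system runs the keypoint and gaze networks here (S110).
    return np.array([0.0, 0.0, -1.0])        # looking straight at the camera

def estimate_eye_position(frame):
    # Stub: a real system derives this from the iris size and the focal
    # length, as detailed below (S120).
    return np.array([0.0, 0.0, 400.0])       # assumed 400 mm from the camera

def control_once(frame, screen_z_mm=0.0):
    gaze = estimate_gaze_direction(frame)     # S110: line of sight direction
    eye = estimate_eye_position(frame)        # S120: 3D eye coordinates
    t = (screen_z_mm - eye[2]) / gaze[2]      # S130: ray/plane intersection
    sx, sy = eye[:2] + t * gaze[:2]           # intersection point position
    return sx, sy                             # S140 acts on this point
```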
In some embodiments, as shown in fig. 2, the S120 may include:
S121: determining a first coordinate value of the eyes on a first coordinate axis of the image coordinate system according to the imaging diameter of the eyes on the acquired image, the iris diameter of the acquisition object, and the focal length;
S122: determining a second coordinate value of the eyes on a second coordinate axis of the image coordinate system and a third coordinate value on a third coordinate axis according to the coordinates of the eye center point on the acquired image and the coordinates of the center point of the acquired image; any two of the first coordinate axis, the second coordinate axis, and the third coordinate axis are perpendicular to each other; the first coordinate value, the second coordinate value, and the third coordinate value together form the three-dimensional coordinates.
The first coordinate value on the first coordinate axis, the second coordinate value on the second coordinate axis, and the third coordinate value on the third coordinate axis together form the three-dimensional coordinates; the iris diameter of the acquisition object is obtained in advance.
The pre-acquiring of the iris diameter of the acquisition object may include:
displaying an iris parameter input interface;
detecting input data on the input interface;
and determining the iris diameter according to the input data.
Still further exemplarily, the pre-acquiring of the iris diameter of the acquisition object may include:
displaying prompt information of eye detection data;
and when the confirmation indication of the prompt information is detected, determining the iris diameter according to the eye detection data.
Still further exemplarily, the pre-acquiring of the iris diameter of the acquisition object may include:
entering an iris diameter measurement interface, and displaying a frame icon for the face image on the interface, where the frame icon is used to prompt the user to adjust the distance to the image acquisition module;
capturing a face image when the overlap ratio between the face image and the frame icon reaches a preset value;
and calculating the iris diameter according to the focal length used to capture the face image and the imaging diameter of the iris of the eye in the face image.
The above are just a few alternative ways of obtaining the actual iris diameter; the specific implementation is not limited to these.
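As an illustration of the third option, a short sketch follows; it assumes the frame icon pins the face at a known distance (the assumed_distance_mm parameter, a hypothetical name), and recovers the iris diameter by inverting the pinhole relation D/F = R/(P*S) introduced later in the text.

```python
# Hedged sketch: calibrate the physical iris diameter R from one framed
# face image, assuming the frame alignment fixes the subject at a known
# distance. Units: mm throughout, except iris_pixels (a pixel count).
def calibrate_iris_diameter(iris_pixels, focal_length_mm, pixel_size_mm,
                            assumed_distance_mm):
    # Invert D = F*R/(P*S):  R = D*P*S/F
    return assumed_distance_mm * iris_pixels * pixel_size_mm / focal_length_mm
```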
The ratio between the imaging diameter and the actual iris diameter, combined with the focal length, allows the distance between the acquisition object and the image acquisition module to be derived, and the coordinate values of the acquisition object in the image coordinate system can then be calculated from the imaging position of the eyes in the acquired image.
The first coordinate axis may be the Z axis, i.e., the coordinate axis coinciding with the line between the optical center and the focal point of the image sensor.
The second coordinate axis and the third coordinate axis may be an X axis and a Y axis, respectively, or the second coordinate axis and the third coordinate axis may be a Y axis and an X axis, respectively.
The second and third coordinate axes may together form a plane XoY.
The XoY plane is a plane in which the photosensitive surface of the image sensor is located.
For example, if the image coordinate system takes the center point of the image sensor in the XoY plane as the origin, the second coordinate value and the third coordinate value of the acquisition object in the image coordinate system can be calculated from the difference between the eye center point coordinates and the center point coordinates of the acquired image.
Illustratively, on the second coordinate axis, the second coordinate value may be calculated from the number of pixels between the eye center point coordinates and the center point coordinates of the acquired image along that axis, multiplied by the imaging scale value.
On the third coordinate axis, the third coordinate value may likewise be calculated from the number of pixels between the eye center point coordinates and the center point coordinates of the acquired image along that axis, multiplied by the imaging scale value.
The first coordinate value, the second coordinate value and the third coordinate value jointly form a three-dimensional coordinate of the eye of the acquisition object in an image coordinate system.
This process of solving the three-dimensional coordinates of the eyes in the image coordinate system relies only on the imaging of the eyes in the acquired image and the focal length used by the image acquisition module for image acquisition, so the three-dimensional coordinates are determined simply, without introducing peripheral equipment.
Illustratively, the determining a first coordinate value of the eyes on a first coordinate axis of the image coordinate system according to the imaging diameter of the eyes on the acquired image, the iris diameter of the acquisition object, and the focal length includes:
determining a first product between the iris diameter and the focal length;
determining a second product between the imaging diameter and the size of a single pixel on the image acquisition module;
and determining the first coordinate value according to the quotient between the first product and the second product.
For example, the first coordinate value may be calculated using the following functional relationships:
D/F=R/(P*S);
Ez=D=F*R/(P*S).
Here F is the focal length used by the image acquisition module to acquire the current image; D is the distance from the acquisition object to the image acquisition module; Ez is the first coordinate value of the eyes on the first coordinate axis within the image coordinate system.
R is the actual iris diameter; S is the size of a single pixel in the image sensor included in the image acquisition module; P is the number of pixels occupied by the imaging diameter of the iris.
Of course, the above is merely an example of calculating the first coordinate value.
In this way, the first coordinate value can be obtained directly and simply from the focal length, which keeps the computation simple.
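A minimal sketch of this depth computation, spelling out the two products and their quotient; the example values in the final comment are assumptions for illustration only.

```python
# First coordinate value (eye depth): Ez = D = F*R/(P*S).
def eye_depth_mm(focal_length_mm, iris_diameter_mm, iris_pixels, pixel_size_mm):
    first_product = iris_diameter_mm * focal_length_mm   # R * F
    second_product = iris_pixels * pixel_size_mm         # P * S
    return first_product / second_product                # quotient = Ez

# Example (assumed values): F = 4 mm, R = 11.7 mm (a commonly cited average
# iris diameter), P = 50 px, S = 0.0014 mm  ->  about 669 mm from the camera.
```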
In some embodiments, the S122 may include:
determining the second coordinate value according to a first value of the eye center point coordinates on the second coordinate axis, a second value of the center point of the acquired image on the second coordinate axis, the first coordinate value, and the focal length;
and determining the third coordinate value according to a third value of the eye center point coordinates on the third coordinate axis, a fourth value of the center point of the acquired image on the third coordinate axis, the first coordinate value, and the focal length.
Illustratively, assume the eye center coordinates are (Ix, Iy), and the center point of the acquired image, which generally corresponds to the optical center of the image sensor, is represented by the coordinates (Cx, Cy). The second coordinate value and the third coordinate value are then calculated using the following functional relationships:
Ex=(Ix-Cx)*S*D/F;
Ey=(Iy-Cy)*S*D/F.
Ex is the second coordinate value; Ey is the third coordinate value.
F is the focal length used by the image acquisition module to acquire the current image; D is the distance from the acquisition object to the image acquisition module; S is the size of a single pixel in the image sensor.
The above is merely a detailed example of how the second coordinate value and the third coordinate value are calculated.
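A corresponding sketch of the back-projection above; the inputs (Ix, Iy) and (Cx, Cy) are pixel coordinates, S the single-pixel size, D the eye depth from the previous step, and F the focal length (all lengths in mm).

```python
# Second and third coordinate values by back-projecting the pixel offset
# from the image center into camera space at depth D.
def eye_xy_mm(Ix, Iy, Cx, Cy, S, D, F):
    Ex = (Ix - Cx) * S * D / F   # second coordinate value
    Ey = (Iy - Cy) * S * D / F   # third coordinate value
    return Ex, Ey
```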
In some embodiments, as shown in fig. 3, the determining the line of sight direction of the eyes of the acquisition object according to the acquired image containing the eye imaging area includes:
S111: performing face detection on the acquired image to obtain face key points;
S112: obtaining the eye imaging area according to the eye key points among the face key points;
S113: determining the head posture features of the acquisition object according to the face key points;
S114: determining the line of sight direction of the eyes of the acquisition object according to the eye imaging area and the head posture features.
In some embodiments, a deep learning model is used to detect keypoints in the acquired image, resulting in at least facial keypoints.
The face key points include, but are not limited to: facial contour keypoints, and/or facial feature keypoints.
After the face key points are detected, the eye key points can be determined according to the distribution relation among the key points.
After the eye key points are determined, the eye imaging area can be intercepted from the acquired image according to the image coordinates of the eye key points on the acquired image.
In a specific implementation, while the eye imaging area is being cropped, the head posture features of the user can also be determined from the face key points.
Illustratively, the head posture features may include at least:
a facial orientation feature, where the facial orientation feature indicates at least one of the following: the face is directed straight at the display screen of the controlled device, to the left of the display screen, to the right of the display screen, upward, or downward.
Illustratively, the face orientation may be obtained from the lines connecting the facial contour points among the face key points.
Still further exemplarily, the position of the iris within the orbit is determined based on the position of the iris center and the positions of the eye corner key points.
Features of the user's eyes may further be extracted from the eye imaging area, for example a location feature of the iris within the orbit, where the location feature indicates at least one of the following: the iris is located in the middle, to the left, or to the right of the orbit. The position of the iris within the orbit also affects the user's line of sight direction.
After the head posture features are determined, a neural network can further process the head posture features together with the eye imaging area, thereby estimating the line of sight direction of the acquisition object.
The line of sight direction may be a line of sight vector in the image coordinate system. The starting point of the line of sight vector can be the position of the iris center point in the image coordinate system; the direction of the vector represents the viewing direction of the user's line of sight.
Illustratively, the acquired image is processed by a first deep learning model such as a first neural network, which separates out the eye imaging area and obtains the head posture features from the acquired image. The eye imaging area and the head posture features are then input into a second deep learning model such as a second neural network for processing, and the line of sight direction is obtained.
Still further illustratively, the acquired image is processed by an end-to-end neural network: an image separation branch obtains the eye imaging area, a posture extraction branch extracts the head posture features, and a decoding branch located behind the image separation branch and the posture extraction branch determines the line of sight direction according to the eye imaging area and the head posture features.
When multiple deep models each execute a different task, each individual deep learning model converges quickly during training and the overall training cost of the models remains small; adopting a single end-to-end model, on the other hand, can reduce error accumulation between models and improve the accuracy of the determined line of sight direction.
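A hedged sketch of the two-stage variant follows. The three callables (face_landmarks, head_pose, gaze_net) stand in for the detection, posture, and gaze models; their names, the 68-point landmark layout, and the eye index range 36:48 are assumptions for illustration, not components named in the disclosure.

```python
# Hedged sketch of the two-stage gaze pipeline; the callables are
# caller-supplied stand-ins for the first and second deep learning models.
import numpy as np

def estimate_gaze(frame, face_landmarks, head_pose, gaze_net):
    pts = face_landmarks(frame)               # (N, 2) face key points
    eye_pts = pts[36:48]                      # eye key points (68-point layout assumed)
    x0, y0 = eye_pts.min(axis=0).astype(int)
    x1, y1 = eye_pts.max(axis=0).astype(int)
    eye_patch = frame[y0:y1 + 1, x0:x1 + 1]   # cropped eye imaging area
    pose = head_pose(pts)                     # head posture features from key points
    v = gaze_net(eye_patch, pose)             # line of sight vector (Vx, Vy, Vz)
    return v / np.linalg.norm(v)              # normalized direction
```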
In some embodiments, the controlling the controlled device according to the intersection point position includes:
determining whether the line of sight of the eyes is projected on the display screen according to the intersection point position, to obtain a determination result;
and controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment.
If the intersection point is located within the display screen, the line of sight of the acquisition object (i.e., the user) is looking at the display screen. As shown in fig. 4A, the intersection point between the line of sight direction of the user's eyes and the plane on which the display screen of the controlled device is located is A, and the position of A is clearly within the display screen.
If the intersection point is located outside the display screen, the probability is high that the user's line of sight is directed outside the display screen. As shown in fig. 4B, the intersection point between the line of sight direction of the user's eyes and the plane on which the display screen of the controlled device is located is B; the position of B is clearly outside the display screen.
The dashed lines in fig. 4A and 4B represent the line of sight direction of the user.
The determination result may indicate that the intersection point position is located within the display screen or outside the display screen.
In embodiments of the disclosure, the turning on/off of the display screen and/or the displayed content are controlled at least according to the intersection point position. Thus, without introducing hardware such as an eye tracker and/or a depth camera, whether an intersection point exists between the user's line of sight direction and the display screen is determined based purely on software processing of one or more acquired images, realizing line-of-sight control of the display screen.
In some embodiments, the controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment includes:
controlling the on-off state of the display screen according to the duration, indicated by the determination result, for which the line of sight of the eyes is or is not projected on the display screen, and the on-off state of the display screen at the current moment;
or,
when the display screen is currently in the bright screen state, determining the position change of the line of sight of the eyes projected on the display screen according to the determination result, and controlling the output content of the controlled device.
In embodiments of the disclosure, by combining the determination result with the on-off state of the display screen at the current moment, it can be determined whether the screen needs to be turned on or off, which reduces the power consumption of keeping the screen lit for long periods, or spares the user from having to turn the screen on manually.
Meanwhile, if the screen is lit at the current moment, the displayed content and/or audio content output by the controlled device can be controlled according to the amount of position change, or the changed position, of the line of sight projected on the display screen.
Illustratively, the controlled device is controlled to switch the displayed content and/or to switch to the next audio track, the previous audio track, or a particular audio track, etc.
In some embodiments, the controlling the on-off state of the display screen according to the duration, indicated by the determination result, for which the line of sight of the eyes is or is not projected on the display screen, and the on-off state of the display screen at the current moment includes:
when the display screen is in the bright screen state, if the determination result indicates that the duration for which the line of sight of the eyes is projected outside the display screen is longer than or equal to a first duration, controlling the display screen to turn off;
or,
when the display screen is in the off-screen state, if the determination result indicates that the duration for which the line of sight of the eyes is projected within the display screen is longer than or equal to a second duration, controlling the display screen to light up.
The first duration and the second duration may be equal or different. The first duration and the second duration may be determined from statistics of user operations controlling the display screen on or off with the line of sight, or from experimental data. Illustratively, the first duration and the second duration may both be 0.5 s or 1 s.
In this way, embodiments of the disclosure control the lighting and turning off of the display screen using the line of sight.
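A minimal sketch of this duration rule, assuming the caller invokes update() once per processed frame; the class name and the 0.5 s defaults (taken from the example values above) are illustrative assumptions.

```python
import time

class ScreenGazeController:
    T_OFF = 0.5   # first duration: gaze off-screen while the screen is lit
    T_ON = 0.5    # second duration: gaze on-screen while the screen is dark

    def __init__(self):
        self.screen_on = False
        self.last_on_screen = None
        self.since = time.monotonic()   # start of the current gaze condition

    def update(self, gaze_on_screen: bool) -> bool:
        now = time.monotonic()
        if gaze_on_screen != self.last_on_screen:
            self.last_on_screen = gaze_on_screen
            self.since = now            # condition changed: restart the timer
        held = now - self.since
        if self.screen_on and not gaze_on_screen and held >= self.T_OFF:
            self.screen_on = False      # looked away long enough: turn off
        elif not self.screen_on and gaze_on_screen and held >= self.T_ON:
            self.screen_on = True       # looked long enough: light up
        return self.screen_on
```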
In some embodiments, the determining, when the display screen is currently in the bright screen state, the position change of the line of sight of the eyes projected on the display screen according to the determination result, and controlling the output content of the controlled device, includes:
if the display screen is in the bright screen state and the intersection point position is located within the display screen, determining the position change of the line of sight of the eyes projected on the display screen according to the intersection point position;
determining the target content to which the controlled device is to switch according to the position change;
and controlling the controlled device to output the target content.
In some embodiments, the position change may correspond to a touch operation; for example, different position changes correspond to sliding operations in different sliding directions. As another example, the display content currently gazed at can be determined from the position change and the position at the previous moment, so that the target content to which the controlled device is to switch is determined by the position change. One possible mapping is sketched below.
The target content includes, but is not limited to, display content and/or audio content.
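As one possible mapping, a sketch that treats a horizontal gaze displacement like a swipe; the 0.3-screen-width threshold and the action names are assumptions, not values from the disclosure.

```python
# Hedged sketch: map the position change of the gaze point to a content
# switch, analogous to a sliding operation.
def action_from_gaze_motion(prev_pt, cur_pt, screen_width):
    dx = cur_pt[0] - prev_pt[0]          # horizontal position change
    if dx >= 0.3 * screen_width:
        return "next"                    # e.g. next track / next page
    if dx <= -0.3 * screen_width:
        return "previous"
    return None                          # displacement too small: no action
```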
A specific implementation of the technical scheme of the embodiments of the disclosure is as follows:
An image is captured to obtain an acquired image, and face detection is performed on the acquired image. Key point localization and head pose estimation are performed on the detected face. The eye imaging area is cropped out according to the eye key points, the eye imaging area and the head pose are input into a deep convolutional neural network, and the line of sight direction is estimated.
Based on the located iris of the eye, the size of the iris in pixels on the image can be calculated. Since the iris diameter varies little between different people, a statistical average can be used to represent it. The actual (i.e., physical) size of the iris may be obtained either as the measured iris size of the subject or as an average iris size derived from big data.
Combining the real size and the imaging size of the iris with the camera's intrinsic parameters, namely the focal length and the size of a single pixel, the distance from the eyes to the camera can be calculated. Assume the iris diameter occupies P pixels in the image, the real diameter of the iris is R mm, the focal length of the camera is F mm, the pixel size is S mm, and the depth from the eyes to the camera is D. According to the camera imaging model, the following relationship holds:
D/F=R/(P*S); the depth of the eyes, i.e., the Z coordinate Ez of the eyes in camera space, is Ez=D=F*R/(P*S).
Then, from the position Ix, Iy of the eyes on the image and the center position Cx, Cy of the image, the X coordinate Ex and the Y coordinate Ey of the eyes in camera space can be calculated:
Ex=(Ix-Cx)*S*D/F
Ey=(Iy-Cy)*S*D/F
Based on the position Ex, Ey, Ez of the eyes in camera space and the line of sight direction Vx, Vy, Vz, the intersection point of the line of sight with the plane of the screen, i.e., the gaze point of the eyes on the screen, can be calculated. If the plane of the screen is Z=Sz, the X coordinate Sx and the Y coordinate Sy of the gaze point on the screen are respectively:
Sx=(Sz-Ez)/Vz*Vx+Ex
Sy=(Sz-Ez)/Vz*Vy+Ey
According to whether the gaze point is on the screen and its duration, a screen-lighting function can be realized; and according to whether the gaze point is on a corresponding user interface element or a specific position of the screen and its duration, human-computer interaction functions can be realized.
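Putting the worked example together, a consolidated sketch follows; the default camera parameters (F = 4 mm, S = 0.0014 mm, R = 11.7 mm, screen plane at Sz = 0) are illustrative assumptions, and the line of sight direction V must have a non-zero Z component.

```python
# Hedged end-to-end computation: eye depth from the iris, eye position by
# back-projection, gaze point as the line/plane intersection at Z = Sz.
def gaze_point_on_screen(Ix, Iy, P, V, Cx, Cy,
                         F=4.0, S=0.0014, R=11.7, Sz=0.0):
    Vx, Vy, Vz = V                      # line of sight direction (Vz != 0)
    D = F * R / (P * S)                 # eye depth:  D/F = R/(P*S)
    Ez = D
    Ex = (Ix - Cx) * S * D / F          # eye position in camera space
    Ey = (Iy - Cy) * S * D / F
    t = (Sz - Ez) / Vz                  # ray parameter at the screen plane
    Sx = Ex + t * Vx                    # gaze point on the screen
    Sy = Ey + t * Vy
    return Sx, Sy
```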
The technical scheme provided by the embodiments of the disclosure does not depend on special equipment or a depth camera and is realized by a pure software algorithm; it can run wherever a camera and computing capability are available, so it can be deployed on existing consumer electronic devices and improve their user experience. The algorithm provided by the embodiments of the disclosure is very efficient to implement, achieves eye depth estimation quickly, requires little computing power, and is relatively cheap to implement.
As shown in fig. 5, an embodiment of the present disclosure provides a device control apparatus, including:
a first determining module 110, configured to determine the line of sight direction of the eyes of an acquisition object according to an acquired image containing an eye imaging area;
a second determining module 120, configured to determine the three-dimensional coordinates of the eyes in an image coordinate system according to the focal length used by the image acquisition module to generate the acquired image;
a third determining module 130, configured to determine the intersection point position between the line of sight direction and the display plane where the display screen of the controlled device is located, according to the line of sight direction and the three-dimensional coordinates;
and a control module 140, configured to control the controlled device according to the intersection point position.
In some embodiments, the first determination module 110, the second determination module 120, the third determination module 130, and the control module 140 may comprise program modules; the program modules may implement the functions of any of the modules described above when executed by a processor.
In some embodiments, the first determination module 110, the second determination module 120, the third determination module 130, and the control module 140 may include combined software-hardware modules; the combined software-hardware modules include programmable arrays; the programmable arrays include, but are not limited to: complex programmable logic devices and/or field programmable gate arrays.
In some embodiments, the first determination module 110, the second determination module 120, the third determination module 130, and the control module 140 may comprise pure hardware modules; the pure hardware modules include, but are not limited to, application specific integrated circuits.
In some embodiments, the second determining module 120 is specifically configured to: determine a first coordinate value of the eyes on a first coordinate axis of the image coordinate system according to the imaging diameter of the eyes on the acquired image, the iris diameter of the acquisition object, and the focal length; and determine a second coordinate value of the eyes on a second coordinate axis of the image coordinate system and a third coordinate value on a third coordinate axis according to the coordinates of the eye center point on the acquired image and the coordinates of the center point of the acquired image; any two of the first coordinate axis, the second coordinate axis, and the third coordinate axis are perpendicular to each other; the first coordinate value, the second coordinate value, and the third coordinate value together form the three-dimensional coordinates.
In some embodiments, the second determining module 120 is specifically configured to: determine a first product between the iris diameter and the focal length; determine a second product between the imaging diameter and the size of a single pixel on the image acquisition module; and determine the first coordinate value according to the quotient between the first product and the second product.
In some embodiments, the second determining module 120 is specifically configured to: determine the second coordinate value according to a first value of the eye center point coordinates on the second coordinate axis, a second value of the center point of the acquired image on the second coordinate axis, the first coordinate value, and the focal length; and determine the third coordinate value according to a third value of the eye center point coordinates on the third coordinate axis, a fourth value of the center point of the acquired image on the third coordinate axis, the first coordinate value, and the focal length.
In some embodiments, the first determining module 110 is specifically configured to: perform face detection on the acquired image to obtain face key points; obtain the eye imaging area according to the eye key points among the face key points; determine the head posture features of the acquisition object according to the face key points; and determine the line of sight direction of the eyes of the acquisition object according to the eye imaging area and the head posture features.
In some embodiments, the control module 140 is specifically configured to: determine whether the line of sight of the eyes is projected on the display screen according to the intersection point position, to obtain a determination result; and control the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment.
In some embodiments, the control module 140 is specifically configured to: control the on-off state of the display screen according to the duration, indicated by the determination result, for which the line of sight of the eyes is or is not projected on the display screen, and the on-off state of the display screen at the current moment; or, when the display screen is currently in the bright screen state, determine the position change of the line of sight of the eyes projected on the display screen according to the determination result, and control the output content of the controlled device.
In some embodiments, the control module 140 is specifically configured to: when the display screen is in the bright screen state, control the display screen to turn off if the determination result indicates that the duration for which the line of sight of the eyes is projected outside the display screen is longer than or equal to a first duration; or, when the display screen is in the off-screen state, control the display screen to light up if the determination result indicates that the duration for which the line of sight of the eyes is projected within the display screen is longer than or equal to a second duration.
In some embodiments, the control module 140 is specifically configured to: if the display screen is in the bright screen state and the intersection point position is located within the display screen, determine the position change of the line of sight of the eyes projected on the display screen according to the intersection point position; determine the target content to which the controlled device is to switch according to the position change; and control the controlled device to output the target content.
An embodiment of the present disclosure provides an electronic device, including:
a memory for storing processor-executable instructions;
a processor connected with the memory;
wherein the processor is configured to execute the device control method provided by any of the foregoing technical solutions.
The memory may include various types of storage media; these are non-transitory computer storage media that can retain the information stored on them after the communication device is powered down.
The electronic device includes, but is not limited to: various mobile terminals such as mobile phones, tablets, or projection devices, as well as fixed terminals such as smart televisions.
The processor may be connected to the memory via a bus or the like, and reads executable programs stored in the memory; for example, it can execute at least one of the device control methods shown in any of figs. 1 to 3.
Fig. 6 is a block diagram of an electronic device 800, according to an example embodiment. For example, the electronic device 800 may be a mobile phone or a mobile computer, etc.
Referring to fig. 6, an electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 806 provides power to the various components of the electronic device 800. Power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for electronic device 800.
The multimedia component 808 includes a screen that provides an output interface between the electronic device 800 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 800 is in an operational state, such as a photographing state or a video state. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational state, such as a call state, a recording state, and a speech recognition state. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of the electronic device 800. For example, the sensor assembly 814 may detect the on/off state of the device 800 and the relative positioning of components such as the display and keypad of the electronic device 800; the sensor assembly 814 may also detect a change in position of the electronic device 800 or of one of its components, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate communication, wired or wireless, between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as Wi-Fi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements, for executing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as the memory 804 including instructions executable by the processor 820 of the electronic device 800 to perform the above-described method. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
The presently disclosed embodiments provide a non-transitory computer readable storage medium having stored thereon instructions that, when executed by a processor of a mobile terminal, enable the mobile terminal to perform the device control method provided by any of the foregoing embodiments, for example, at least one of the methods shown in any of FIG. 1 to FIG. 3.
The device control method may include: determining the line of sight direction of the eye of the acquisition object according to an acquired image containing an eye imaging region; determining the three-dimensional coordinate of the eye in an image coordinate system according to the focal length used by the image acquisition module to generate the acquired image; determining the intersection point position between the line of sight direction and the display plane where the display screen of the controlled device is located according to the line of sight direction and the three-dimensional coordinate; and controlling the controlled device according to the intersection point position.
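The last two steps amount to a standard ray-plane intersection. The following Python sketch illustrates the geometry only; the function name, NumPy usage, and millimeter units are illustrative assumptions, not part of the disclosed method, and the display plane is assumed to be known as a point and normal expressed in the same coordinate system as the eye position.

```python
import numpy as np

def gaze_screen_intersection(eye_pos, gaze_dir, plane_point, plane_normal):
    """Intersect the eye's line of sight with the display plane.

    eye_pos: 3D eye coordinate in the image coordinate system (e.g. mm).
    gaze_dir: line of sight direction vector in the same system.
    plane_point, plane_normal: a point on the display plane and its normal.
    Returns the intersection point, or None if the line of sight is
    parallel to the plane or the plane lies behind the eye.
    """
    denom = np.dot(plane_normal, gaze_dir)
    if abs(denom) < 1e-9:      # line of sight parallel to the display plane
        return None
    t = np.dot(plane_normal, plane_point - eye_pos) / denom
    if t <= 0:                 # display plane is behind the eye
        return None
    return eye_pos + t * gaze_dir

# Example: an eye 400 mm in front of a screen at z = 0 with normal (0, 0, 1):
# gaze_screen_intersection(np.array([0.0, 50.0, 400.0]),
#                          np.array([0.0, -0.12, -0.99]),
#                          np.zeros(3), np.array([0.0, 0.0, 1.0]))
# -> approximately (0.0, 1.5, 0.0)
```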
It can be understood that the determining the three-dimensional coordinate of the eye in the image coordinate system according to the focal length used to generate the acquired image includes:
determining a first coordinate value of the eye on a first coordinate axis of the image coordinate system according to the imaging diameter of the eye on the acquired image, the iris diameter of the acquisition object, and the focal length;
determining a second coordinate value of the eye on a second coordinate axis of the image coordinate system and a third coordinate value of the eye on a third coordinate axis of the image coordinate system according to the coordinates of the eye center point of the eye on the acquired image and the coordinates of the center point of the acquired image; any two of the first coordinate axis, the second coordinate axis and the third coordinate axis are perpendicular to each other; the first coordinate value, the second coordinate value and the third coordinate value together form the three-dimensional coordinate.
It can be understood that the determining a first coordinate value of the eye on a first coordinate axis of the image coordinate system according to the imaging diameter of the eye on the acquired image, the iris diameter of the acquisition object, and the focal length includes:
determining a first product between the iris diameter and the focal length;
determining a second product between the imaging diameter and the size of a single pixel on the image acquisition module;
and determining the first coordinate value according to the quotient between the first product and the second product.
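In other words, the first coordinate value is a pinhole-camera depth estimate obtained by similar triangles. A minimal sketch, assuming the iris diameter and focal length are given in millimeters, the imaging diameter in pixels, and the pixel size in millimeters per pixel; the names and example values are illustrative:

```python
def eye_depth(iris_diameter_mm, focal_length_mm, imaging_diameter_px, pixel_size_mm):
    """First coordinate value: distance of the eye along the optical axis."""
    first_product = iris_diameter_mm * focal_length_mm     # physical size x focal length
    second_product = imaging_diameter_px * pixel_size_mm   # imaged size on the sensor
    return first_product / second_product                  # quotient -> depth in mm

# With a commonly cited average human iris diameter of about 11.7 mm, a 4 mm
# lens, 2 um pixels, and an iris imaged over 100 px:
# eye_depth(11.7, 4.0, 100, 0.002) -> 234.0 (mm)
```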
It can be understood that the determining the second coordinate value of the eye on the second coordinate axis of the image coordinate system and the third coordinate value of the eye on the third coordinate axis according to the coordinates of the eye center point on the acquired image and the coordinates of the center point of the acquired image includes:
determining the second coordinate value according to a first value of the eye center point coordinate on the second coordinate axis, a second value of the center point of the acquired image on the second coordinate axis, the first coordinate value, and the focal length;
and determining the third coordinate value according to a third value of the eye center point coordinate on the third coordinate axis, a fourth value of the center point of the acquired image on the third coordinate axis, the first coordinate value, and the focal length.
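This is the usual pinhole back-projection of the eye center pixel, scaled by the depth obtained above. A sketch under the same illustrative naming and unit assumptions:

```python
def eye_xy(eye_center_px, image_center_px, depth_mm, focal_length_mm, pixel_size_mm):
    """Second and third coordinate values from the pixel offset to the image center."""
    du = eye_center_px[0] - image_center_px[0]   # first value minus second value
    dv = eye_center_px[1] - image_center_px[1]   # third value minus fourth value
    x = du * pixel_size_mm * depth_mm / focal_length_mm   # second coordinate value
    y = dv * pixel_size_mm * depth_mm / focal_length_mm   # third coordinate value
    return x, y
```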
It can be understood that the determining the line of sight direction of the eye of the acquisition object according to the acquired image containing the eye imaging region includes:
performing face detection on the acquired image to obtain face key points;
obtaining the eye imaging region according to the eye key points among the face key points;
determining the head posture characteristics of the acquisition object according to the face key points;
and determining the line of sight direction of the eye of the acquisition object according to the eye imaging region and the head posture characteristics.
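A sketch of this pipeline is given below. Every callee (detect_face_landmarks, crop_eye_region, head_pose_from_landmarks, gaze_net) is a hypothetical placeholder for whichever detector or network an implementation actually uses; the disclosure does not name specific models.

```python
def estimate_gaze(image):
    """Line-of-sight estimation; all callees are hypothetical placeholders."""
    landmarks = detect_face_landmarks(image)          # face detection -> face key points
    if landmarks is None:
        return None                                   # no face in the acquired image
    eye_patch = crop_eye_region(image, landmarks)     # eye imaging region from eye key points
    head_pose = head_pose_from_landmarks(landmarks)   # head posture characteristics
    return gaze_net(eye_patch, head_pose)             # line of sight direction vector
```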
It is understood that the controlling the controlled device according to the intersection point position includes:
determining whether the line of sight of the eye is projected on the display screen according to the intersection point position, to obtain a determination result;
and controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment.
It can be understood that the controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment includes:
controlling the on-off state of the display screen according to the on-off state of the display screen at the current moment and the duration, indicated by the determination result, for which the line of sight of the eye is or is not projected on the display screen;
or,
when the display screen is currently in a bright screen state, determining, according to the determination result, the position change of the line of sight of the eye projected on the display screen, and controlling the output content of the controlled device.
It can be understood that the controlling the on-off state of the display screen according to the on-off state of the display screen at the current moment and the duration, indicated by the determination result, for which the line of sight of the eye is or is not projected on the display screen includes:
when the display screen is in a bright screen state, if the determination result indicates that the duration for which the line of sight of the eye is projected outside the display screen is longer than or equal to a first duration, controlling the display screen to turn off;
or,
when the display screen is in a screen-off state, if the determination result indicates that the duration for which the line of sight of the eye is projected within the display screen is longer than or equal to a second duration, controlling the display screen to turn on.
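One way to realize this timing logic is a small state machine that restarts a timer whenever the determination result changes. The sketch below assumes a hypothetical screen handle exposing is_on, turn_on(), and turn_off(), and illustrative values for the first and second durations:

```python
import time

FIRST_DURATION_S = 5.0    # assumed look-away time before turning the screen off
SECOND_DURATION_S = 1.0   # assumed dwell time before turning the screen on

class ScreenController:
    def __init__(self, screen):
        self.screen = screen            # hypothetical handle: is_on, turn_on(), turn_off()
        self.gaze_on_screen = None
        self.state_since = time.monotonic()

    def update(self, gaze_on_screen):
        now = time.monotonic()
        if gaze_on_screen != self.gaze_on_screen:   # determination result changed,
            self.gaze_on_screen = gaze_on_screen    # so restart the duration timer
            self.state_since = now
        elapsed = now - self.state_since
        # Bright screen, gaze outside for at least the first duration: turn off.
        if self.screen.is_on and not gaze_on_screen and elapsed >= FIRST_DURATION_S:
            self.screen.turn_off()
        # Screen off, gaze inside for at least the second duration: turn on.
        elif not self.screen.is_on and gaze_on_screen and elapsed >= SECOND_DURATION_S:
            self.screen.turn_on()
```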
It can be understood that, when the display screen is currently in a bright screen state, the determining, according to the determination result, the position change of the line of sight of the eye projected on the display screen, and controlling the output content of the controlled device includes:
if the display screen is in a bright screen state and the intersection point position is located within the display screen, determining, according to the intersection point position, the position change of the line of sight of the eye projected on the display screen;
determining, according to the position change, the target content to which the controlled device is to switch;
and controlling the controlled device to output the target content.
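As an illustration, the content switch can compare successive intersection positions and page forward or backward once the displacement exceeds a threshold. The device handle and the threshold below are assumptions for the sketch, not part of the claims:

```python
def update_output_content(prev_point, new_point, device, threshold_px=80):
    """Switch the output content when the gaze point has moved far enough.

    prev_point, new_point: successive intersection positions on the display
    screen, in pixels; device: hypothetical handle exposing show(),
    next_content(), and previous_content().
    """
    dx = new_point[0] - prev_point[0]
    if dx >= threshold_px:
        device.show(device.next_content())        # gaze swept right: next item
    elif dx <= -threshold_px:
        device.show(device.previous_content())    # gaze swept left: previous item
```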
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles thereof and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (12)

1. A method of controlling a device, the method comprising:
determining the line of sight direction of the eye of the acquisition object according to an acquired image containing an eye imaging region;
according to the focal length used by the image acquisition module for generating the acquired image, determining the three-dimensional coordinate of the eye in an image coordinate system;
determining the intersection point position between the line of sight direction and the display plane where the display screen of the controlled device is located according to the line of sight direction and the three-dimensional coordinate;
and controlling the controlled equipment according to the intersection point position.
2. The method of claim 1, wherein the determining the three-dimensional coordinate of the eye in the image coordinate system according to the focal length used to generate the acquired image comprises:
determining a first coordinate value of the eye on a first coordinate axis of the image coordinate system according to the imaging diameter of the eye on the acquired image, the iris diameter of the acquired object and the focal length;
determining a second coordinate value of the eye on a second coordinate axis of the image coordinate system and a third coordinate value of the eye on a third coordinate axis of the image coordinate system according to the coordinates of the eye center point of the eye on the acquired image and the coordinates of the center point of the acquired image; any two of the first coordinate axis, the second coordinate axis and the third coordinate axis are perpendicular to each other; the first coordinate value, the second coordinate value and the third coordinate value together form the three-dimensional coordinate.
3. The method of claim 2, wherein the determining a first coordinate value of the eye on a first coordinate axis of the image coordinate system according to the imaging diameter of the eye on the acquired image, the iris diameter of the acquisition object, and the focal length comprises:
determining a first product between the iris diameter and the focal length;
determining a second product between the imaging diameter and the size of a single pixel on the image acquisition module;
and determining the first coordinate value according to the quotient between the first product and the second product.
4. A method according to claim 2 or 3, wherein the determining a second coordinate value of the eye on a second coordinate axis of the image coordinate system and a third coordinate value of the eye on a third coordinate axis according to the coordinates of the eye center point of the eye on the acquired image and the coordinates of the center point of the acquired image comprises:
determining the second coordinate value according to a first value of the eye center point coordinate on the second coordinate axis, a second value of the center point of the acquired image on the second coordinate axis, the first coordinate value, and the focal length;
and determining the third coordinate value according to a third value of the eye center point coordinate on the third coordinate axis, a fourth value of the center point of the acquired image on the third coordinate axis, the first coordinate value, and the focal length.
5. The method of claim 1, wherein the determining the line of sight direction of the eye of the acquisition object according to the acquired image containing the eye imaging region comprises:
performing face detection on the acquired image to obtain face key points;
obtaining the eye imaging region according to the eye key points among the face key points;
determining the head posture characteristics of the acquisition object according to the face key points;
and determining the line of sight direction of the eye of the acquisition object according to the eye imaging region and the head posture characteristics.
6. The method according to claim 1 or 2, wherein said controlling the controlled device according to the intersection point position comprises:
determining whether the line of sight of the eye is projected on the display screen according to the intersection point position, to obtain a determination result;
and controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment.
7. The method according to claim 6, wherein the controlling the display screen of the controlled device according to the determination result and the on-off state of the display screen at the current moment comprises:
controlling the on-off state of the display screen according to the on-off state of the display screen at the current moment and the duration, indicated by the determination result, for which the line of sight of the eye is or is not projected on the display screen;
or,
when the display screen is currently in a bright screen state, determining, according to the determination result, the position change of the line of sight of the eye projected on the display screen, and controlling the output content of the controlled device.
8. The method according to claim 7, wherein the controlling the on-off state of the display screen according to the on-off state of the display screen at the current moment and the duration, indicated by the determination result, for which the line of sight of the eye is or is not projected on the display screen comprises:
when the display screen is in a bright screen state, if the determination result indicates that the duration for which the line of sight of the eye is projected outside the display screen is longer than or equal to a first duration, controlling the display screen to turn off;
or,
when the display screen is in a screen-off state, if the determination result indicates that the duration for which the line of sight of the eye is projected within the display screen is longer than or equal to a second duration, controlling the display screen to turn on.
9. The method according to claim 7, wherein the determining, when the display screen is currently in a bright screen state, the position change of the line of sight of the eye projected on the display screen according to the determination result, and controlling the output content of the controlled device comprises:
if the display screen is in a bright screen state and the intersection point position is located within the display screen, determining, according to the intersection point position, the position change of the line of sight of the eye projected on the display screen;
determining, according to the position change, the target content to which the controlled device is to switch;
and controlling the controlled device to output the target content.
10. A device control apparatus, characterized in that the apparatus comprises:
a first determining module, configured to determine a line of sight direction of an eye of a collection object according to a collection image including an eye imaging region;
the second determining module is used for determining three-dimensional coordinates of the eyes in an image coordinate system according to the focal length used by the image acquisition module for generating the acquired image;
A third determining module, configured to determine an intersection point position between the line of sight direction and a display plane where a display screen of the controlled device is located according to the line of sight direction and the three-dimensional coordinate;
and the control module is used for controlling the controlled equipment according to the intersection point position.
11. An electronic device, comprising:
a memory for storing processor-executable instructions;
a processor connected to the memory;
wherein the processor is configured to perform the device control method of any one of claims 1 to 9.
12. A non-transitory computer readable storage medium having stored thereon instructions which, when executed by a processor of a computer, enable the computer to perform the device control method provided in any one of claims 1 to 9.
CN202210806813.4A 2022-07-08 2022-07-08 Equipment control method and device, electronic equipment and storage medium Pending CN117406851A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210806813.4A CN117406851A (en) 2022-07-08 2022-07-08 Equipment control method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117406851A (en) 2024-01-16

Family

ID=89493194

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210806813.4A Pending CN117406851A (en) 2022-07-08 2022-07-08 Equipment control method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117406851A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination