CN112416126B - Page scrolling control method and device, storage medium and electronic equipment


Info

Publication number
CN112416126B
Authority
CN
China
Prior art keywords
target object
target
head
screen
determining
Prior art date
Legal status
Active
Application number
CN202011296482.1A
Other languages
Chinese (zh)
Other versions
CN112416126A
Inventor
郭凯
Current Assignee
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Original Assignee
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Priority date
Filing date
Publication date
Application filed by Qingdao Haier Technology Co Ltd and Haier Smart Home Co Ltd
Priority to CN202011296482.1A
Publication of CN112416126A
Application granted
Publication of CN112416126B
Legal status: Active
Anticipated expiration

Classifications

    • G06F3/013 Eye tracking input arrangements
    • G06F3/04847 Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
    • G06F3/0485 Scrolling or panning
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V40/165 Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G06V40/18 Eye characteristics, e.g. of the iris

Abstract

The invention discloses a page scrolling control method and device, a storage medium, and an electronic device. The method comprises the following steps: in the process of displaying a target page on a terminal device, invoking a camera in the terminal device to acquire a head image of a target object browsing the target page; processing the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose; processing the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye; inputting the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object; and controlling the target page to scroll automatically according to the gaze area. The invention solves the technical problem of cumbersome page scrolling operation in the related art.

Description

Page scrolling control method and device, storage medium and electronic equipment
Technical Field
The present invention relates to the field of mobile terminal display technologies, and in particular to a page scrolling control method and apparatus, a storage medium, and an electronic device.
Background
With the development of Internet technology, the amount of content such as web pages, electronic books, documents, or messages displayed on a mobile terminal has grown considerably. Such content (for example, a web page) often cannot be displayed completely on the screen of the mobile terminal at one time, and the page must be scrolled up and down to display all of it.
Currently, when using a mobile terminal, a user typically moves the screen content by manually sliding a scrollable ScrollView control up and down, that is, by touching the screen and pulling down or up with a finger to scroll the content to be read. In some scenarios (for example, when the user is wearing cotton gloves), the user cannot operate the screen with a finger, and the up-and-down scrolling of the ScrollView control cannot be controlled at all. In other scenarios, manually operating the screen to scroll while reading, for example through a long text, requires the user to slide on the screen frequently, which easily causes finger fatigue and results in a poor user experience.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiments of the present invention provide a page scrolling control method and device, a storage medium, and an electronic device, so as to at least solve the technical problem that the page scrolling operation is cumbersome in page scrolling control methods provided in the related art.
According to an aspect of the embodiments of the present invention, a page scrolling control method is provided, including: in the process of displaying a target page on a terminal device, invoking a camera in the terminal device to acquire a head image of a target object browsing the target page; processing the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose; processing the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye; inputting the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object; and controlling the target page to scroll automatically according to the gaze area.
According to another aspect of the embodiments of the present invention, a page scrolling control apparatus is further provided, including: a first acquisition unit, configured to invoke a camera in a terminal device to acquire a head image of a target object browsing a target page while the target page is displayed on the terminal device; a first processing unit, configured to process the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose; a second processing unit, configured to process the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye; a first determining unit, configured to input the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object; and a first control unit, configured to control the target page to scroll automatically according to the gaze area.
According to still another aspect of the embodiments of the present invention, there is also provided an electronic device, including a communication bus, a memory, and a processor, where: the communication bus is used to realize the communication connection between the processor and the memory; the memory is used to store executable instructions; and the processor is configured to execute the page scrolling control program in the memory to implement the following steps: in the process of displaying a target page on a terminal device, invoking a camera in the terminal device to acquire a head image of a target object browsing the target page; processing the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose; processing the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye; inputting the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object; and controlling the target page to scroll automatically according to the gaze area.
According to still another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to execute the above-described page scrolling control method when run.
In the embodiments of the present invention, in the process of displaying a target page on a terminal device, a camera in the terminal device is invoked to acquire a head image of a target object browsing the target page; the head image is processed based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose; the head image is processed based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye; the first processing result and the second processing result are input into a decision tree model to determine the gaze area of the target object; and the target page is controlled to scroll automatically according to the gaze area. Since the gaze area of the target object is determined from the rotation vector of the head in its current pose and the target pupil position of the pupil within the eye, and the target page is controlled to scroll automatically according to the gaze area, the automatic scrolling of the target page can be controlled flexibly instead of by manually operating a scroll control. This solves the technical problem that the page scrolling operation is cumbersome in page scrolling control methods in the related art, and achieves the technical effects of controlling page scrolling flexibly and conveniently and reducing the complexity of the page scrolling operation.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a schematic illustration of an application environment of an alternative page scrolling control method according to an embodiment of the invention;
FIG. 2 is a flow diagram of an alternative page scrolling control method according to an embodiment of the invention;
FIG. 3 is a schematic diagram of head pose rotation in a three-dimensional space coordinate system in another alternative page scrolling control method according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a decision tree algorithm process flow for yet another alternative page scrolling control method in accordance with an embodiment of the invention;
FIG. 5 is a flow chart of yet another alternative page scrolling control method according to an embodiment of the invention;
FIG. 6 is a schematic diagram of an alternative page scrolling control device according to an embodiment of the invention;
FIG. 7 is a schematic structural diagram of an alternative electronic device according to an embodiment of the present invention.
Detailed Description
In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In FIG. 1, a user 102 and a terminal device 104 may engage in human-computer interaction. The terminal device 104 includes a memory 106 for storing interaction data and a processor 108 for processing the interaction data. The terminal device 104 may interact with a background server 114 via a network 112. The background server 114 includes a database 116 for storing interaction data and a processing engine 118 for processing the interaction data. The above page scrolling control method may be executed by the terminal device 104 or by the background server 114. For example, taking the terminal device 104 as the executing device, the terminal device 104 acquires the gaze area of the user 102 and controls the target page displayed on the display 110 of the terminal device 104 to scroll automatically according to the gaze area.
Alternatively, the terminal device 104 may be, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a PC, or the like, and the network 112 may be, but is not limited to, a wireless network or a wired network. The wireless network includes Wi-Fi and other networks that enable wireless communication; the wired network may include, but is not limited to, a wide area network, a metropolitan area network, or a local area network. The background server 114 may be, but is not limited to, any hardware device capable of performing computations.
The embodiment of the present invention provides a page scrolling control method. As shown in FIG. 2, the method includes the following steps:
S202: in the process of displaying a target page on the terminal device, invoking a camera in the terminal device to acquire a head image of a target object browsing the target page;
S204: processing the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose;
S206: processing the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye;
S208: inputting the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object;
S210: controlling the target page to scroll automatically according to the gaze area.
In step S202, in practical applications, the terminal device may include, but is not limited to, at least one of the following: a mobile phone (e.g., an Android phone, an iOS phone, etc.), a notebook computer, a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, a desktop computer, a smart television, etc. The target page may be a web page, an electronic document, an email, or the like displayed on the screen of the terminal device; the target object is the user currently operating the terminal device; and the head image of the target object may be a two-dimensional or three-dimensional image.
In step S204, in practical applications, the head pose estimation algorithm may be a deep-learning method or a face key point projection method, which is not limited here. Head pose estimation mainly obtains the angle information of the face orientation, which can generally be expressed as a rotation matrix, a rotation vector, a quaternion, or Euler angles (these four representations can be converted into one another). In general, Euler angles are more readable and more widely used, and the face pose information is represented by three Euler angles (Yaw, Pitch, Roll). Head pose estimation thus identifies the pose parameters of the head in a spatial coordinate system, namely the head position parameters (x, y, z) and the orientation angle parameters (Yaw, Pitch, Roll). As shown in FIG. 3, which is a schematic diagram of head pose rotation in a three-dimensional space coordinate system for the page scrolling control method according to an embodiment of the present invention, the direction parameters of the three rotational degrees of freedom of the head pose of the target object are the horizontal rotation Euler angle (Yaw) 301, the vertical rotation Euler angle (Pitch) 302, and the left-right rotation Euler angle (Roll) 303. In general, the range of head motion of an adult is: the left-right deflection angle ranges from -40.9° to 36.3°, the vertical rotation angle ranges from -60.4° to 69.6°, and the horizontal rotation angle ranges from -79.8° to 75.3°.
The face key point projection approach includes the Head Pose Estimation algorithm, which solves for the transformation matrix between the coordinates of several points of the target object in a three-dimensional space coordinate system and the corresponding set of points obtained by projecting those coordinates into a two-dimensional image coordinate system, thereby obtaining an estimate of the head pose.
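For illustration only (this sketch is not part of the claimed method), such a 3D-to-2D solve can be written with OpenCV's solvePnP. The generic 3D model points, the approximate camera intrinsics, and the choice of six landmarks below are assumptions made for this example:

```python
import cv2
import numpy as np

# Hypothetical 3D reference points of a generic face model (nose tip, chin,
# eye corners, mouth corners), in an arbitrary model coordinate system.
MODEL_POINTS_3D = np.array([
    (0.0, 0.0, 0.0),           # nose tip
    (0.0, -330.0, -65.0),      # chin
    (-225.0, 170.0, -135.0),   # left eye outer corner
    (225.0, 170.0, -135.0),    # right eye outer corner
    (-150.0, -150.0, -125.0),  # left mouth corner
    (150.0, -150.0, -125.0),   # right mouth corner
], dtype=np.float64)

def estimate_head_rotation(image_points_2d, frame_size):
    """Solve the 3D-to-2D projection for the head rotation vector.

    image_points_2d: 6x2 float array of detected 2D landmarks matching
    MODEL_POINTS_3D; frame_size: (height, width) of the camera frame.
    """
    h, w = frame_size
    # Approximate intrinsics: focal length ~ frame width, principal
    # point at the frame center, no lens distortion.
    camera_matrix = np.array([[w, 0, w / 2],
                              [0, w, h / 2],
                              [0, 0, 1]], dtype=np.float64)
    dist_coeffs = np.zeros((4, 1))
    ok, rotation_vec, translation_vec = cv2.solvePnP(
        MODEL_POINTS_3D, image_points_2d, camera_matrix, dist_coeffs,
        flags=cv2.SOLVEPNP_ITERATIVE)
    return rotation_vec if ok else None
```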
The first processing result indicates the rotation vector of the head of the target object in its current pose, and may include the rotation vector of the horizontal rotation Euler angle (Yaw) 301, the rotation vector of the vertical rotation Euler angle (Pitch) 302, and the rotation vector of the left-right rotation Euler angle (Roll) 303.
In step S206, in practical applications, gaze tracking, also called gaze estimation or eye tracking, is a technique that acquires the current gaze direction or gaze point of the target object using various detection means, such as electrical or optical sensing: an eye image is captured by a camera, the eyeball image is analyzed, and the gaze direction is estimated. The embodiment of the present invention may adopt, but is not limited to, a pupil localization method. The pupil localization method works on RGB images acquired by a charge-coupled device (CCD) camera or a complementary metal-oxide-semiconductor (CMOS) camera and is based on facial geometric features: first, the positions of facial feature points (such as the facial contour, eyes, eyebrows, nose, and mouth) are detected; then the geometric relationship between the head pose and specific feature points is established using the positional constraints among the feature points, and the head pose angle is obtained by solving inverse trigonometric functions. The coordinates of all facial feature points, including the eye coordinates, are thereby obtained. An RGB image of the eye is extracted based on the eye coordinates, and an eye image containing the pupil is obtained through grayscale processing and binarization. The position of the pupil relative to the eye is then calculated geometrically, and the gaze direction is estimated.
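A minimal sketch of the grayscale-plus-binarization pupil localization just described, using OpenCV; the fixed threshold and the assumption that the pupil is the largest dark region in the eye crop are illustrative choices, not parameters fixed by the patent:

```python
import cv2

def locate_pupil(eye_bgr):
    """Estimate the pupil center inside a cropped eye image (BGR).

    Returns (x, y) in the crop's pixel coordinates, or None.
    """
    gray = cv2.cvtColor(eye_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (5, 5), 0)
    # Binarize so that dark pixels (the pupil) become foreground;
    # the threshold value 40 is purely illustrative.
    _, binary = cv2.threshold(gray, 40, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    # Assume the largest dark region is the pupil and take its centroid.
    pupil = max(contours, key=cv2.contourArea)
    m = cv2.moments(pupil)
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])
```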
In steps S208 to S210, in practical applications, the decision tree model may be a Random Forest (RF) algorithm, a Concept Learning System (CLS) algorithm, or the like, which is not limited here. The first processing result and the second processing result are analyzed to determine the area at which the target object is gazing. For example, when the target object is gazing at the lower part of the screen of the terminal device, i.e., the target object is about to finish reading the currently displayed content, the page on the terminal screen automatically moves upward, automatically bringing the next portion of the current page into view.
According to the embodiment of the present invention, the gaze area of the target object is determined based on the rotation vector of the head of the target object in its current pose and the target pupil position of the pupil within the eye, and the target page is controlled to scroll automatically according to the gaze area, so that the automatic scrolling of the target page can be controlled flexibly instead of by manually operating a scroll control. This solves the technical problem that the page scrolling operation is cumbersome in page scrolling control methods in the related art, and achieves the technical effects of controlling page scrolling flexibly and conveniently and reducing the complexity of the page scrolling operation.
In an embodiment, inputting the first processing result and the second processing result into the decision tree model to determine the gaze area of the target object includes: analyzing the first processing result and the second processing result through the decision tree model to obtain an analysis result; and determining the gaze area of the target object according to the analysis result, where the gaze area indicates the position of the current line of sight of the target object on the screen of the terminal device. In this embodiment, the analysis result may be one of three output categories: the middle, the upper part, and the lower part of the screen. When the output category is the middle of the screen, the terminal device has detected that the user's gaze is at the middle of the screen, and the screen of the terminal device does not scroll; when the output category is the upper part of the screen, the terminal device has detected that the user's gaze is at the upper part of the screen, and the terminal device controls the screen page to scroll downward; when the output category is the lower part of the screen, the terminal device has detected that the user's gaze is at the lower part of the screen, and the terminal device controls the screen page to scroll upward.
In an embodiment, controlling the target page to scroll automatically according to the gaze area may include: controlling a scroll control in the target page to perform a downward scrolling operation when the gaze area indicates that the current line of sight is at a first position on the screen; controlling the scroll control in the target page not to scroll when the gaze area indicates that the current line of sight is at a second position on the screen; and controlling the scroll control in the target page to perform an upward scrolling operation when the gaze area indicates that the current line of sight is at a third position on the screen, where the first position is higher than the second position, and the second position is higher than the third position. In this embodiment, the first position may be the upper part of the screen of the terminal device, the second position may be the middle of the screen, and the third position may be the lower part of the screen. By assigning different positions, the page scrolling direction can be controlled flexibly, which is convenient for the user to read. A sketch of this dispatch is shown below.
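A minimal sketch of the mapping from gaze position to scroll action; the scroll_by callback and the step size are assumptions for the example (on Android, the callback could wrap a ScrollView method such as smoothScrollBy):

```python
SCROLL_STEP_PX = 120  # illustrative scroll distance per decision

def dispatch_scroll(region, scroll_by):
    """Map a gaze region ('upper' | 'middle' | 'lower') to a scroll action.

    scroll_by is a hypothetical callback taking a signed pixel offset.
    """
    if region == "upper":      # first position: page scrolls downward
        scroll_by(-SCROLL_STEP_PX)
    elif region == "lower":    # third position: page scrolls upward
        scroll_by(SCROLL_STEP_PX)
    # region == "middle": second position, no scrolling
```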
In an embodiment, determining the gaze area of the target object according to the analysis result includes: determining that the current line of sight of the target object is at the second position on the screen when the analysis result indicates that the target object is blinking; determining that the current line of sight is at the first position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is greater than a first threshold; determining that the current line of sight is at the first position when the analysis result indicates that the target object is not blinking, the vertical coordinate of the target pupil position is less than or equal to the first threshold, and the vertical rotation vector of the head of the target object is greater than a second threshold; determining that the current line of sight is at the third position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is less than a third threshold; and determining that the current line of sight is at the third position when the analysis result indicates that the target object is not blinking, the vertical coordinate of the target pupil position is greater than or equal to the third threshold, and the vertical rotation vector of the head of the target object is less than a fourth threshold.
In this embodiment, for example, the first and second thresholds are positive numbers and the third and fourth thresholds are negative numbers. That is, when the vertical coordinate of the target pupil position is greater than or equal to the first threshold, the coordinate lies in the positive direction of the Y axis and the line of sight of the target object is moving upward; when it is less than or equal to the third threshold, the coordinate lies in the negative direction of the Y axis and the line of sight is moving downward. When the vertical rotation vector of the head of the target object is greater than or equal to the second threshold, the target object is raising its head; when it is less than or equal to the fourth threshold, the target object is lowering its head. The target object may be the current user of the terminal device.
In this embodiment, the gaze area of the target object is determined by the decision tree model algorithm shown in FIG. 4. First, step S402 is executed to determine whether the target object is blinking. If yes, step S404 is executed to output the middle of the screen as the gaze area, i.e., the current page does not scroll. If no, step S406 is executed to determine whether the vertical coordinate of the pupil of the target object is greater than a preset value n; if yes, step S408 is executed to output the upper part of the screen as the gaze area. If no, step S410 is executed to determine whether the head Pitch parameter of the target object, i.e., its vertical rotation vector, is greater than a preset value k, which would indicate that the target object is raising its head; if yes, step S412 is executed to output the upper part of the screen as the gaze area. If no, step S414 is executed to determine whether the vertical coordinate of the pupil is less than a preset value m; if yes, step S416 is executed to output the lower part of the screen as the gaze area. If no, step S418 is executed to determine whether the head Pitch parameter is less than a preset value j, which would indicate that the target object is lowering its head; if yes, step S420 is executed to output the lower part of the screen as the gaze area. If no, step S422 is executed to output the middle of the screen as the gaze area. By tracking the changes in the head pose and pupil position of the target object with the decision tree model, the user's screen scrolling operation can be simplified, which is convenient for reading. A sketch of this decision cascade follows.
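The sketch below follows the FIG. 4 cascade; the default values of the thresholds n, k, m, and j are placeholders rather than values disclosed by the patent, and the returned strings match the regions accepted by the dispatch_scroll sketch above:

```python
def classify_gaze_region(is_blinking, pupil_y, head_pitch,
                         n=0.2, k=10.0, m=-0.2, j=-10.0):
    """Decision cascade of FIG. 4.

    pupil_y: vertical pupil coordinate relative to the eye center;
    head_pitch: vertical rotation (Pitch) of the head;
    n, k, m, j: the preset thresholds of the flowchart (placeholder
    defaults here).
    """
    if is_blinking:        # S402 -> S404
        return "middle"
    if pupil_y > n:        # S406 -> S408
        return "upper"
    if head_pitch > k:     # S410 -> S412 (head raised)
        return "upper"
    if pupil_y < m:        # S414 -> S416
        return "lower"
    if head_pitch < j:     # S418 -> S420 (head lowered)
        return "lower"
    return "middle"        # S422
```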
In an embodiment, step S204 may include the following steps: acquiring a face model containing a plurality of key points from the head image, and determining the three-dimensional coordinates of these key points in a three-dimensional space coordinate system; and determining, based on the head pose estimation algorithm, the rotation Euler angles of the head of the target object from the three-dimensional coordinates, where the rotation Euler angles represent the rotation vector of the head of the target object. For example, a 3D face model with n key points is used, where n may be chosen according to the actual accuracy requirement; a common value for n is 68.
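A common way to obtain such a key point model is a pretrained landmark detector; the sketch below uses dlib's 68-point shape predictor as one concrete, assumed choice (the model file must be downloaded separately):

```python
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def detect_landmarks(frame_bgr):
    """Return a 68x2 array of (x, y) face key points, or None."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    return np.array([(shape.part(i).x, shape.part(i).y)
                     for i in range(68)], dtype=np.float64)
```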
In an embodiment, determining the rotation Euler angles of the head of the target object from the three-dimensional coordinates based on the head pose estimation algorithm includes: converting the three-dimensional coordinates into two-dimensional plane coordinates in three dimensions; and determining the vertical rotation Euler angle of the head of the target object from the two-dimensional plane coordinates of the vertical dimension, where the vertical rotation Euler angle indicates the vertical rotation vector of the head of the target object, and the rotation vector includes the vertical rotation vector. As shown in FIG. 3, by examining the rotation vector of the vertical rotation Euler angle (Pitch) 302, it can be judged whether the target object is raising or lowering its head.
In an embodiment, step S206 may include the following steps: acquiring an eye model containing a plurality of key points from the head image; determining the target pupil position of the pupil of the target object using the eye model; judging the blink state of the target object; and obtaining the second processing result based on the target pupil position and the blink state. In this embodiment, the eye model is obtained from the key points in the head image; for example, 20 of the 68 key points set in the head image are selected as the key points of the eye model, and the target pupil position is then obtained from these eye key points. For example, a two-dimensional plane coordinate system is established centered on the eye center of the target object, and the target pupil position may be any point in that coordinate system. In this embodiment, the second processing result includes the blink state information of the user and the target pupil position.
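A simple blink check consistent with the criterion used later (the upper and lower eyelid key points nearly coinciding) can be sketched as follows; the landmark indices follow dlib's 68-point layout and the pixel tolerance is an assumption for the example:

```python
import numpy as np

# In dlib's 68-point layout, points 36-41 outline the left eye and
# points 42-47 the right eye (assumed convention for this sketch).
LEFT_EYE = list(range(36, 42))
RIGHT_EYE = list(range(42, 48))

def is_blinking(landmarks, tol=2.0):
    """Blink if the upper and lower eyelid key points nearly coincide.

    landmarks: 68x2 array of (x, y); tol: illustrative pixel tolerance.
    """
    def eyelid_gap(indices):
        pts = np.asarray(landmarks)[indices]
        # Vertical gaps between the two upper/lower eyelid point pairs.
        gap1 = abs(pts[1][1] - pts[5][1])
        gap2 = abs(pts[2][1] - pts[4][1])
        return (gap1 + gap2) / 2.0
    return (eyelid_gap(LEFT_EYE) + eyelid_gap(RIGHT_EYE)) / 2.0 < tol
```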
In an embodiment, determining the target pupil position of the pupil of the target object using the eye model includes: acquiring the RGB image corresponding to the eye model; performing image processing on the RGB image to obtain an eye image containing the pupil of the target object; and determining the target pupil position from the eye image. In this embodiment, processing the RGB image of the eye model may mean binarizing the image and removing the white of the eye of the target object, so as to obtain the pupil position of the target object.
According to this embodiment, the gaze area of the target object is determined based on the rotation vector of the head of the target object in its current pose and the target pupil position of the pupil within the eye, and the target page is controlled to scroll automatically according to the gaze area, so that the automatic scrolling of the target page can be controlled flexibly instead of by manually operating a scroll control. This solves the technical problem that the page scrolling operation is cumbersome in page scrolling control methods in the related art, and achieves the technical effects of controlling page scrolling flexibly and conveniently and reducing the complexity of the page scrolling operation.
In an application embodiment, as shown in FIG. 5, when the target object performs a page scrolling control operation, step S502 is first executed to start the front camera of the terminal device and collect the head image of the user. Step S504 is then executed to perform head pose estimation based on the head image, and step S506 is executed to perform gaze tracking estimation based on the head image. Step S508 is then executed to input the results of steps S504 and S506 into the decision tree algorithm model for processing, and step S510 is executed to determine the gaze position of the target object. When the gaze position is at the upper part of the screen, step S512 is executed to control the page to scroll downward; when the gaze position is at the lower part of the screen, step S514 is executed to control the page to scroll upward; and when the gaze position is at the middle of the screen, step S516 is executed to control the page not to scroll.
In step S504, the head pose estimation algorithm may include the following steps:
1) First, a 3D face model with n key points is set, where n determines the accuracy of the page scrolling operation for the target object. For example, n is set to the common value 68.
2) The 2D face key points corresponding to the 3D face model are determined through face detection and face key point detection.
3) The rotation vector of the 3D face model of the target object is analyzed, i.e., the change vectors in the three dimensional directions: the horizontal rotation Euler angle (Yaw), the vertical rotation Euler angle (Pitch), and the left-right rotation Euler angle (Roll).
4) The three-dimensional rotation vector is converted into the corresponding Euler angles.
5) The vertical rotation Euler angle (Pitch) of the head of the target object is obtained (steps 4 and 5 are sketched in code after this list).
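Steps 4) and 5) can be sketched by expanding the rotation vector (e.g., from the solvePnP sketch above) into a rotation matrix with cv2.Rodrigues and decomposing it into Euler angles. The ZYX decomposition below is one common convention; which of the three angles corresponds to Pitch depends on the chosen axis layout, so the naming here is an assumption:

```python
import cv2
import numpy as np

def rotation_vector_to_euler(rotation_vec):
    """Convert an OpenCV rotation vector to (pitch, yaw, roll) in degrees.

    Uses a ZYX Euler decomposition of the rotation matrix; the
    pitch/yaw/roll naming assumes a particular camera axis layout.
    """
    rot_mat, _ = cv2.Rodrigues(rotation_vec)
    sy = np.hypot(rot_mat[0, 0], rot_mat[1, 0])
    if sy > 1e-6:
        pitch = np.arctan2(rot_mat[2, 1], rot_mat[2, 2])
        yaw = np.arctan2(-rot_mat[2, 0], sy)
        roll = np.arctan2(rot_mat[1, 0], rot_mat[0, 0])
    else:  # gimbal lock: roll is not uniquely defined
        pitch = np.arctan2(-rot_mat[1, 2], rot_mat[1, 1])
        yaw = np.arctan2(-rot_mat[2, 0], sy)
        roll = 0.0
    return np.degrees([pitch, yaw, roll])
```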
In step S506, the gaze tracking algorithm may include the steps of:
a) First, the left-eye and right-eye coordinates of the 2D face key points corresponding to the target object in the two-dimensional coordinate system are calculated from the key points of the 3D face model in step S504.
b) The RGB images of the left and right eyes of the target object are acquired.
c) Grayscale processing is performed on the eye RGB images.
d) Binarization is performed on the grayscaled images.
e) Whether the target object is blinking is judged according to whether the key points of the upper and lower eyelids of the target object coincide and whether the pupil key point is visible.
f) The position of the pupil key point of the target object in the two-dimensional coordinate system is calculated.
g) The vertical coordinate of the pupil of the target object relative to its eye is calculated from the position obtained in step f) (see the sketch following this list).
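Steps f) and g) amount to expressing the pupil center relative to the eye center. A minimal sketch, with the assumed sign convention that positive values mean the pupil sits above the eye center (image y grows downward, hence the subtraction order):

```python
import numpy as np

def pupil_vertical_offset(pupil_xy, eye_landmarks):
    """Vertical coordinate of the pupil relative to the eye center.

    pupil_xy: pupil center in image coordinates (e.g., from the
    locate_pupil sketch above); eye_landmarks: Nx2 eye key points.
    """
    eye_center = np.asarray(eye_landmarks, dtype=float).mean(axis=0)
    return float(eye_center[1] - pupil_xy[1])
```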
According to the embodiment of the present invention, the gaze area of the target object is determined from the rotation vector of the head of the target object in its current pose and the target pupil position of the pupil within the eye, and the target page is controlled to scroll automatically according to the gaze area, so that the automatic scrolling of the target page can be controlled flexibly instead of by manually operating a scroll control. This solves the technical problem that the page scrolling operation is cumbersome in page scrolling control methods in the related art, and achieves the technical effects of controlling page scrolling flexibly and conveniently and reducing the complexity of the page scrolling operation.
Based on the foregoing embodiments, the present invention further provides a page scrolling control device. As shown in FIG. 6, the apparatus includes:
a first acquisition unit 602, configured to invoke a camera in a terminal device to acquire a head image of a target object browsing a target page while the target page is displayed on the terminal device;
a first processing unit 604, configured to process the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose;
a second processing unit 606, configured to process the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye;
a first determining unit 608, configured to input the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object; and
a first control unit 610, configured to control the target page to scroll automatically according to the gaze area.
In an embodiment, the first determining unit 608 is specifically configured to analyze the first processing result and the second processing result through the decision tree model to obtain an analysis result, and to determine the gaze area of the target object according to the analysis result, where the gaze area indicates the position of the current line of sight of the target object on the screen of the terminal device.
In an embodiment, the first control unit 610 is specifically configured to control the scroll control in the target page to perform a downward scrolling operation when the gaze area indicates that the current line of sight is at the first position on the screen; to control the scroll control in the target page not to scroll when the gaze area indicates that the current line of sight is at the second position; and to control the scroll control in the target page to perform an upward scrolling operation when the gaze area indicates that the current line of sight is at the third position, where the first position is higher than the second position, and the second position is higher than the third position.
In an embodiment, the first determining unit 608 is specifically configured to: determine that the current line of sight of the target object is at the second position on the screen when the analysis result indicates that the target object is blinking;
determine that the current line of sight is at the first position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is greater than a first threshold;
determine that the current line of sight is at the first position when the analysis result indicates that the vertical coordinate of the target pupil position is less than or equal to the first threshold and the vertical rotation vector of the head of the target object is greater than a second threshold;
determine that the current line of sight is at the third position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is less than a third threshold;
determine that the current line of sight is at the third position when the analysis result indicates that the vertical coordinate of the target pupil position is greater than or equal to the third threshold and the vertical rotation vector of the head of the target object is less than a fourth threshold;
and determine that the current line of sight is at the second position when the analysis result indicates that the vertical rotation vector of the head of the target object is less than or equal to the second threshold and greater than or equal to the fourth threshold.
In an embodiment, the first processing unit 604 is specifically configured to acquire a face model containing a plurality of key points from the head image, determine the three-dimensional coordinates of these key points in a three-dimensional space coordinate system, and determine, based on the head pose estimation algorithm, the rotation Euler angles of the head of the target object from the three-dimensional coordinates, where the rotation Euler angles represent the rotation vector of the head of the target object.
In an embodiment, the first processing unit 604 is further specifically configured to convert the three-dimensional coordinates into two-dimensional plane coordinates in three dimensions, and to determine the vertical rotation Euler angle of the head of the target object from the two-dimensional plane coordinates of the vertical dimension, where the vertical rotation Euler angle indicates the vertical rotation vector of the head of the target object, and the rotation vector includes the vertical rotation vector.
In an embodiment, the second processing unit 606 is specifically configured to acquire an eye model containing a plurality of key points from the head image, determine the target pupil position of the pupil of the target object using the eye model, judge the blink state of the target object, and obtain the second processing result based on the target pupil position and the blink state.
In an embodiment, the second processing unit 606 is specifically configured to acquire the RGB image corresponding to the eye model, perform image processing on the RGB image to obtain an eye image containing the pupil of the target object, and determine the target pupil position from the eye image.
According to the embodiment of the present invention, the gaze area of the target object is determined from the rotation vector of the head of the target object in its current pose and the target pupil position of the pupil within the eye, and the target page is controlled to scroll automatically according to the gaze area, so that the automatic scrolling of the target page can be controlled flexibly instead of by manually operating a scroll control. This solves the technical problem that the page scrolling operation is cumbersome in page scrolling control methods in the related art, and achieves the technical effects of controlling page scrolling flexibly and conveniently and reducing the complexity of the page scrolling operation.
According to still another aspect of the embodiments of the present invention, there is also provided an electronic device for implementing the above page scrolling control method. As shown in FIG. 7, the electronic device includes a memory 702 and a processor 704; the memory 702 stores a computer program, and the processor 704 is configured to perform the steps of any of the above method embodiments by means of the computer program.
Alternatively, in this embodiment, the electronic device may be located in at least one network device of a plurality of network devices of the computer network.
Alternatively, in the present embodiment, the processor 704 may be configured to execute the following steps by a computer program:
S1, in the process of displaying a target page on a terminal device, invoking a camera in the terminal device to acquire a head image of a target object browsing the target page;
S2, processing the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose;
S3, processing the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye;
S4, inputting the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object;
and S5, controlling the target page to scroll automatically according to the gaze area.
Alternatively, as will be appreciated by those skilled in the art, the structure shown in FIG. 7 is merely illustrative, and the electronic device may be a smart phone (such as an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a Mobile Internet Device (MID), a PAD, or another terminal device. FIG. 7 does not limit the structure of the electronic device. For example, the electronic device may include more or fewer components (such as a network interface) than shown in FIG. 7, or have a configuration different from that shown in FIG. 7.
The memory 702 may be used to store software programs and modules, such as the program instructions/modules corresponding to the page scrolling control method and device in the embodiments of the present invention; the processor 704 executes the software programs and modules stored in the memory 702, thereby performing various functional applications and data processing, i.e., implementing the page scrolling control method described above. The memory 702 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 702 may further include memory remotely located relative to the processor 704, which may be connected to the terminal via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof. The memory 702 may specifically store, but is not limited to, information such as the head image of the target object. As an example, as shown in FIG. 7, the memory 702 may include, but is not limited to, the first acquisition unit 602, the first processing unit 604, the second processing unit 606, the first determining unit 608, and the first control unit 610 of the above page scrolling control device. In addition, it may also include other module units of the above page scrolling control device, which are not described in detail in this example.
Optionally, the transmission device 706 is used to receive or transmit data via a network. Specific examples of the network described above may include wired networks and wireless networks. In one example, the transmission device 706 includes a network adapter (Network Interface Controller, NIC) that may be connected to other network devices and routers via a network cable to communicate with the internet or a local area network. In one example, the transmission device 706 is a Radio Frequency (RF) module that is configured to communicate wirelessly with the internet.
In addition, the electronic device further includes: a display 708 for displaying a scroll state of the page; and a connection bus 710 for connecting the respective module parts in the above-described electronic device.
In other embodiments, the terminal may be a node in a distributed system, where the distributed system may be a blockchain system formed by a plurality of nodes connected through network communication. The nodes may form a peer-to-peer (P2P) network, and any type of computing device, such as a server or a terminal, may become a node in the blockchain system by joining the peer-to-peer network.
According to the embodiment of the present invention, the gaze area of the target object is determined from the rotation vector of the head of the target object in its current pose and the target pupil position of the pupil within the eye, and the target page is controlled to scroll automatically according to the gaze area, so that the automatic scrolling of the target page can be controlled flexibly instead of by manually operating a scroll control. This solves the technical problem that the page scrolling operation is cumbersome in page scrolling control methods in the related art, and achieves the technical effects of controlling page scrolling flexibly and conveniently and reducing the complexity of the page scrolling operation.
Based on the above embodiments, the present invention further provides a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, causing the computer device to execute the page scrolling control method provided in one or more of the foregoing technical solutions.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store a computer program for performing the steps of:
S1, in the process of displaying a target page on a terminal device, invoking a camera in the terminal device to acquire a head image of a target object browsing the target page;
S2, processing the head image based on a head pose estimation algorithm to obtain a first processing result, where the first processing result indicates a rotation vector of the head of the target object in its current pose;
S3, processing the head image based on a gaze tracking algorithm to obtain a second processing result, where the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye;
S4, inputting the first processing result and the second processing result into a decision tree model to determine the gaze area of the target object;
and S5, controlling the target page to scroll automatically according to the gaze area.
Alternatively, in this embodiment, it will be understood by those skilled in the art that all or part of the steps in the methods of the above embodiments may be performed by a program for instructing a terminal device to execute the steps, where the program may be stored in a computer readable storage medium, and the storage medium may include: flash disk, read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), magnetic or optical disk, and the like.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
The integrated units in the above embodiments may be stored in the above-described computer-readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in essence or a part contributing to the prior art or all or part of the technical solution in the form of a software product stored in a storage medium, comprising several instructions for causing one or more computer devices (which may be personal computers, servers or network devices, etc.) to perform all or part of the steps of the method described in the embodiments of the present invention.
In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis; for portions not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed client may be implemented in other manners. The above-described apparatus embodiments are merely exemplary; the division of the units is merely a logical function division, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make several improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also fall within the protection scope of the present invention.

Claims (9)

1. A method for controlling scrolling of a page, the method comprising:
in the process of displaying a target page on a terminal device, calling a camera in the terminal device to acquire a head image of a target object browsing the target page;
processing the head image based on a head pose estimation algorithm to obtain a first processing result, wherein the first processing result indicates a rotation vector of the head of the target object in its current pose;
processing the head image based on a gaze tracking algorithm to obtain a second processing result, wherein the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye;
inputting the first processing result and the second processing result into a decision tree model to determine a gaze area of the target object, which comprises: analyzing the first processing result and the second processing result through the decision tree model to obtain an analysis result; and determining the gaze area of the target object according to the analysis result, wherein the gaze area indicates the position of the current line of sight of the target object on the screen of the terminal device;
controlling the target page to scroll automatically according to the gaze area;
the determining the gazing area of the target object according to the analysis result comprises: determining that the position of the current sight line of the target object in the screen is a second position under the condition that the analysis result indicates that the target object blinks; determining that the position of the current line of sight of the target object in the screen is a first position if the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position of the target object is greater than a first threshold; determining that the current sight line of the target object is at the first position in the screen when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position of the target object is less than or equal to the first threshold and the vertical rotation vector of the head of the target object is greater than a second threshold; determining that the position of the current line of sight of the target object in the screen is a third position if the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position of the target object is less than a third threshold; determining that the current sight line of the target object is at the third position in the screen when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position of the target object is greater than or equal to the third threshold and the vertical rotation vector of the head of the target object is less than a fourth threshold; and determining the position of the current sight line of the target object in the screen as the second position when the analysis result indicates that the target object is not blinking and the vertical rotation vector of the head of the target object is smaller than or equal to the second threshold and larger than or equal to the fourth threshold, wherein the first position is the upper part of the screen of the terminal device, the second position is the middle part of the screen of the terminal device, and the third position is the lower part of the screen of the terminal device.
2. The method of claim 1, wherein controlling the target page to automatically scroll according to the gaze area comprises:
controlling a scrolling control in the target page to perform a downward scrolling operation when the gaze area indicates that the position of the current line of sight on the screen is a first position;
controlling the scrolling control in the target page not to scroll when the gaze area indicates that the position of the current line of sight on the screen is a second position;
controlling the scrolling control in the target page to perform an upward scrolling operation when the gaze area indicates that the position of the current line of sight on the screen is a third position;
wherein the first position is higher than the second position, and the second position is higher than the third position.
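A minimal mapping from the three screen regions to scroll actions might look like the sketch below (Python; page.scroll_by and the step size are hypothetical stand-ins for whatever scrolling control the terminal device exposes):

    SCROLL_STEP = 120  # pixels per tick; illustrative value

    def scroll_for(page, region):
        if region == "top":          # first position: gaze at upper screen
            page.scroll_by(SCROLL_STEP)   # downward scrolling operation
        elif region == "bottom":     # third position: gaze at lower screen
            page.scroll_by(-SCROLL_STEP)  # upward scrolling operation
        # "middle" (second position): no scrolling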
3. The method of claim 1, wherein the processing the head image based on the head pose estimation algorithm to obtain a first processing result comprises:
acquiring a face model containing a plurality of key points from the head image, and determining three-dimensional coordinates of the key points in a three-dimensional space coordinate system;
and determining a rotation Euler angle of the head of the target object from the three-dimensional coordinates based on the head pose estimation algorithm, wherein the rotation Euler angle represents the rotation vector of the head of the target object.
4. The method according to claim 3, wherein the determining a rotation Euler angle of the head of the target object from the three-dimensional coordinates based on the head pose estimation algorithm comprises:
converting the three-dimensional coordinates into three two-dimensional plane coordinates;
and determining a vertical rotation Euler angle of the head of the target object according to the two-dimensional plane coordinates in the vertical dimension, wherein the vertical rotation Euler angle indicates a vertical rotation vector of the head of the target object, and the rotation vector comprises the vertical rotation vector.
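The claims do not fix a particular pose solver; one common way to obtain the rotation vector and its Euler-angle decomposition from facial key points is OpenCV's solvePnP, sketched below. The 3D reference model and the pinhole camera intrinsics are illustrative assumptions, not values from this disclosure:

    import cv2
    import numpy as np

    # Generic 3D reference points for six facial key points
    # (nose tip, chin, eye corners, mouth corners); illustrative values.
    MODEL_POINTS = np.array([
        (0.0, 0.0, 0.0),           # nose tip
        (0.0, -330.0, -65.0),      # chin
        (-225.0, 170.0, -135.0),   # left eye outer corner
        (225.0, 170.0, -135.0),    # right eye outer corner
        (-150.0, -150.0, -125.0),  # left mouth corner
        (150.0, -150.0, -125.0),   # right mouth corner
    ], dtype=np.float64)

    def head_euler_angles(image_points, frame_w, frame_h):
        """Return (pitch, yaw, roll) in degrees from six 2D key points.

        image_points: float64 array of shape (6, 2), same order as above.
        """
        focal = frame_w  # crude pinhole approximation of the focal length
        camera_matrix = np.array([[focal, 0, frame_w / 2],
                                  [0, focal, frame_h / 2],
                                  [0, 0, 1]], dtype=np.float64)
        dist = np.zeros((4, 1))  # assume negligible lens distortion
        ok, rvec, _ = cv2.solvePnP(MODEL_POINTS, image_points,
                                   camera_matrix, dist,
                                   flags=cv2.SOLVEPNP_ITERATIVE)
        if not ok:
            raise RuntimeError("pose estimation failed")
        rot, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 matrix
        sy = np.hypot(rot[0, 0], rot[1, 0])
        pitch = np.degrees(np.arctan2(rot[2, 1], rot[2, 2]))  # vertical rotation
        yaw = np.degrees(np.arctan2(-rot[2, 0], sy))
        roll = np.degrees(np.arctan2(rot[1, 0], rot[0, 0]))
        return pitch, yaw, roll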
5. The method according to claim 1, wherein the processing the head image based on the gaze tracking algorithm to obtain the second processing result comprises:
acquiring an eye model containing a plurality of key points from the head image;
determining the target pupil position of the pupil of the target object by using the eye model;
determining the blink state of the target object;
and obtaining the second processing result based on the target pupil position and the blink state.
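The claim leaves the blink test open; a frequently used proxy is the eye aspect ratio (EAR) computed over six eye-contour key points, as in this sketch (the threshold value is an illustrative assumption, not part of the claim):

    import numpy as np

    def eye_aspect_ratio(eye):
        """eye: array of six (x, y) landmarks ordered around the eye contour."""
        a = np.linalg.norm(eye[1] - eye[5])   # first vertical distance
        b = np.linalg.norm(eye[2] - eye[4])   # second vertical distance
        c = np.linalg.norm(eye[0] - eye[3])   # horizontal distance
        return (a + b) / (2.0 * c)

    EAR_BLINK_THRESHOLD = 0.2  # illustrative; tuned per camera and user

    def blink_state(left_eye, right_eye):
        ear = (eye_aspect_ratio(left_eye) + eye_aspect_ratio(right_eye)) / 2.0
        return ear < EAR_BLINK_THRESHOLD  # True while the eyes are closed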
6. The method of claim 5, wherein the determining the target pupil location at which the pupil of the target subject is located using the eye model comprises:
acquiring an RGB image corresponding to the eye model;
performing image processing on the RGB image to obtain an eye image containing the pupil of the target object;
and determining the target pupil position where the pupil is located based on the eye image.
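As one possible instance of the image processing recited in claim 6, the pupil can be located as the darkest blob in the cropped eye image (OpenCV sketch; the threshold value is an illustrative assumption):

    import cv2

    def locate_pupil(eye_bgr):
        """Return the (x, y) pupil centre in a cropped eye image, or None."""
        gray = cv2.cvtColor(eye_bgr, cv2.COLOR_BGR2GRAY)
        gray = cv2.GaussianBlur(gray, (7, 7), 0)
        # The pupil is usually the darkest region of the eye.
        _, mask = cv2.threshold(gray, 40, 255, cv2.THRESH_BINARY_INV)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        largest = max(contours, key=cv2.contourArea)
        (cx, cy), _ = cv2.minEnclosingCircle(largest)
        return int(cx), int(cy)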
7. A page scroll control apparatus, comprising:
a first acquisition unit, configured to call a camera in a terminal device to acquire a head image of a target object browsing a target page while the target page is displayed on the terminal device;
a first processing unit, configured to process the head image based on a head pose estimation algorithm to obtain a first processing result, wherein the first processing result indicates a rotation vector of the head of the target object in its current pose;
a second processing unit, configured to process the head image based on a gaze tracking algorithm to obtain a second processing result, wherein the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye;
a first determining unit, configured to input the first processing result and the second processing result into a decision tree model to determine a gaze area of the target object, wherein the first determining unit is configured to analyze the first processing result and the second processing result through the decision tree model to obtain an analysis result, and to determine the gaze area of the target object according to the analysis result by:
determining that the position of the current line of sight of the target object on the screen is a second position when the analysis result indicates that the target object is blinking;
determining that the position is a first position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is greater than a first threshold;
determining that the position is the first position when the analysis result indicates that the target object is not blinking, the vertical coordinate of the target pupil position is less than or equal to the first threshold, and the vertical rotation vector of the head of the target object is greater than a second threshold;
determining that the position is a third position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is less than a third threshold;
determining that the position is the third position when the analysis result indicates that the target object is not blinking, the vertical coordinate of the target pupil position is greater than or equal to the third threshold, and the vertical rotation vector of the head of the target object is less than a fourth threshold;
and determining that the position is the second position when the analysis result indicates that the target object is not blinking and the vertical rotation vector of the head of the target object is less than or equal to the second threshold and greater than or equal to the fourth threshold;
wherein the gaze area indicates the position of the current line of sight of the target object on the screen of the terminal device, the first position is an upper part of the screen, the second position is a middle part of the screen, and the third position is a lower part of the screen;
and a first control unit, configured to control the target page to scroll automatically according to the gaze area.
8. An electronic device, comprising a communication bus, a memory, and a processor, wherein:
the communication bus is used for realizing communication connection between the processor and the memory;
the memory is used for storing executable instructions;
the processor is configured to execute a page scroll control program in the memory, so as to implement the following steps:
in the process of displaying a target page on a terminal device, calling a camera in the terminal device to acquire a head image of a target object browsing the target page;
processing the head image based on a head pose estimation algorithm to obtain a first processing result, wherein the first processing result indicates a rotation vector of the head of the target object in its current pose;
processing the head image based on a gaze tracking algorithm to obtain a second processing result, wherein the second processing result includes a target pupil position indicating where the pupil of the target object is located within the eye;
inputting the first processing result and the second processing result into a decision tree model to determine a gaze area of the target object, including: analyzing the first processing result and the second processing result through the decision tree model to obtain an analysis result; and determining the gaze area of the target object according to the analysis result by:
determining that the position of the current line of sight of the target object on the screen is a second position when the analysis result indicates that the target object is blinking;
determining that the position is a first position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is greater than a first threshold;
determining that the position is the first position when the analysis result indicates that the target object is not blinking, the vertical coordinate of the target pupil position is less than or equal to the first threshold, and the vertical rotation vector of the head of the target object is greater than a second threshold;
determining that the position is a third position when the analysis result indicates that the target object is not blinking and the vertical coordinate of the target pupil position is less than a third threshold;
determining that the position is the third position when the analysis result indicates that the target object is not blinking, the vertical coordinate of the target pupil position is greater than or equal to the third threshold, and the vertical rotation vector of the head of the target object is less than a fourth threshold;
and determining that the position is the second position when the analysis result indicates that the target object is not blinking and the vertical rotation vector of the head of the target object is less than or equal to the second threshold and greater than or equal to the fourth threshold;
wherein the gaze area indicates the position of the current line of sight of the target object on the screen of the terminal device, the first position is an upper part of the screen, the second position is a middle part of the screen, and the third position is a lower part of the screen;
and controlling the target page to scroll automatically according to the gaze area.
9. A computer-readable storage medium storing one or more programs for execution by one or more processors to implement the steps of the page scroll control method of any one of claims 1 to 6.
CN202011296482.1A 2020-11-18 2020-11-18 Page scrolling control method and device, storage medium and electronic equipment Active CN112416126B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011296482.1A CN112416126B (en) 2020-11-18 2020-11-18 Page scrolling control method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011296482.1A CN112416126B (en) 2020-11-18 2020-11-18 Page scrolling control method and device, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN112416126A CN112416126A (en) 2021-02-26
CN112416126B (en) 2023-07-28

Family

ID=74773480

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011296482.1A Active CN112416126B (en) 2020-11-18 2020-11-18 Page scrolling control method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112416126B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113434043A (en) * 2021-06-30 2021-09-24 青岛海尔科技有限公司 Page turning control method and device, storage medium and electronic device
CN115953813B (en) * 2022-12-19 2024-01-30 北京字跳网络技术有限公司 Expression driving method, device, equipment and storage medium

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103064520B (en) * 2013-01-31 2016-03-09 东莞宇龙通信科技有限公司 The method of mobile terminal and control page scroll thereof
JP2016173313A (en) * 2015-03-17 2016-09-29 国立大学法人鳥取大学 Visual line direction estimation system, visual line direction estimation method and visual line direction estimation program
CN106355147A (en) * 2016-08-26 2017-01-25 张艳 Acquiring method and detecting method of live face head pose detection regression apparatus
CN106598221B (en) * 2016-11-17 2019-03-15 电子科技大学 3D direction of visual lines estimation method based on eye critical point detection
CN110333779B (en) * 2019-06-04 2022-06-21 Oppo广东移动通信有限公司 Control method, terminal and storage medium
CN110231871A (en) * 2019-06-14 2019-09-13 腾讯科技(深圳)有限公司 Page reading method, device, storage medium and electronic equipment
CN110532933A (en) * 2019-08-26 2019-12-03 淮北师范大学 A kind of living body faces detection head pose returns the acquisition methods and detection method of device
CN110909611B (en) * 2019-10-29 2021-03-05 深圳云天励飞技术有限公司 Method and device for detecting attention area, readable storage medium and terminal equipment
CN111046744B (en) * 2019-11-21 2023-04-18 深圳云天励飞技术股份有限公司 Method and device for detecting attention area, readable storage medium and terminal equipment
CN111586459B (en) * 2020-05-22 2022-10-14 北京百度网讯科技有限公司 Method and device for controlling video playing, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112416126A (en) 2021-02-26

Similar Documents

Publication Publication Date Title
CN107251096B (en) Image capturing apparatus and method
JP5871345B2 (en) 3D user interface device and 3D operation method
CN108525305B (en) Image processing method, image processing device, storage medium and electronic equipment
JP5936155B2 (en) 3D user interface device and 3D operation method
WO2018025831A1 (en) People flow estimation device, display control device, people flow estimation method, and recording medium
WO2019245768A1 (en) System for predicting articulated object feature location
KR101612605B1 (en) Method for extracting face feature and apparatus for perforimg the method
CN110083202A (en) With the multi-module interactive of near-eye display
CN113227878A (en) Method and system for gaze estimation
CN112416126B (en) Page scrolling control method and device, storage medium and electronic equipment
KR102461232B1 (en) Image processing method and apparatus, electronic device, and storage medium
CN109215131B (en) Virtual face driving method and device
CN112083801A (en) Gesture recognition system and method based on VR virtual office
CN113190109A (en) Input control method and device of head-mounted display equipment and head-mounted display equipment
Shukran et al. Kinect-based gesture password recognition
JP7241004B2 (en) Body motion analysis device, body motion analysis system, body motion analysis method, and program
CN105022480A (en) Input method and terminal
US11328187B2 (en) Information processing apparatus and information processing method
KR101447958B1 (en) Method and apparatus for recognizing body point
CN114510142B (en) Gesture recognition method based on two-dimensional image, gesture recognition system based on two-dimensional image and electronic equipment
CN114546103A (en) Operation method through gestures in augmented reality and head-mounted display system
CN117115321B (en) Method, device, equipment and storage medium for adjusting eye gestures of virtual character
CN111796924A (en) Service processing method, device, storage medium and electronic equipment
CN115294623B (en) Human body whole body motion capturing method, device, storage medium and terminal
US11380071B2 (en) Augmented reality system and display method for anchoring virtual object thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant