WO2021098454A1 - Region of concern detection method and apparatus, and readable storage medium and terminal device - Google Patents


Info

Publication number
WO2021098454A1
Authority
WO
WIPO (PCT)
Prior art keywords: eye, coordinates, point, interest, image
Application number
PCT/CN2020/124098
Other languages
French (fr)
Chinese (zh)
Inventor
王杉杉
胡文泽
王孝宇
Original Assignee
深圳云天励飞技术股份有限公司
Application filed by 深圳云天励飞技术股份有限公司 filed Critical 深圳云天励飞技术股份有限公司
Publication of WO2021098454A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/193 Preprocessing; Feature extraction
    • G06V40/19 Sensors therefor
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • The present invention relates to the field of image processing technology, and in particular to a method, a device, a computer-readable storage medium and a terminal device for detecting a region of interest.
  • Offline screens mainly include televisions, building screens, cinema ticket machines, cinema LED displays, supermarket and convenience-store POS screens, taxi-mounted screens, projectors, and express-locker screens, and they touch almost every aspect of users' lives. Offline advertising based on such screens has a natural advantage in attracting consumers' attention, but advertisers cannot quickly learn whether the design and content of an offline advertisement actually appeal to consumers, so feedback on offline advertising is usually slower and less accurate than online feedback. As a result, some advertisements are placed imprecisely and inefficiently. In the prior art, precision instruments such as eye trackers can be used to track eye movement and determine the area of interest, but such instruments are very expensive and difficult to apply widely.
  • In view of this, the embodiments of the present application provide a method, a device, a computer-readable storage medium, and a terminal device for detecting an area of interest, to solve the problem that the existing area-of-interest detection methods are very expensive and therefore difficult to apply widely.
  • The first aspect of the embodiments of the present application provides a method for detecting a region of interest, which may include:
  • the eye area of interest is determined according to the attention direction of the line of sight, the coordinates of the center point of the left eye pupil, and the coordinates of the center point of the right eye pupil.
  • the determining the eye area of interest according to the line of sight attention direction, the coordinates of the center point of the left eye pupil, and the coordinates of the center point of the right eye pupil includes:
  • the eye area of interest is determined according to the coordinates of the eye point of interest.
  • the determining the eye area of interest according to the coordinates of the eye point of interest includes:
  • the screen area where the pixel position is located is determined as the eye focus area.
  • the converting the coordinates of the eye focus point according to the coordinates of the preset reference pixel point includes:
  • the first distance and the second distance are respectively calculated according to the coordinates of the reference pixel point and the coordinates of the eye point of interest, where the first distance is the distance between the reference pixel point and the eye point of interest in the preset first coordinate axis direction, and the second distance is the distance between the reference pixel point and the eye point of interest in the preset second coordinate axis direction.
  • the determining the attention direction of the line of sight according to the left-eye image, the right-eye image, and the head posture includes:
  • the processing process of the line-of-sight prediction model includes:
  • the binocular feature information and the head posture are fused to obtain the attention direction of the line of sight.
  • the method may further include:
  • each training sample includes a pre-collected left-eye image, right-eye image and head posture of a subject, the label corresponding to each training sample is a pre-calibrated line-of-sight attention direction, and SN is a positive integer;
  • the method may further include:
  • a second aspect of the embodiments of the present application provides a device for detecting a region of interest, which may include:
  • the face image acquisition module is used to acquire the target face image to be detected
  • a head posture calculation module for calculating the head posture of the target face image
  • An eye image extraction module for extracting left eye images and right eye images in the target face image
  • a sight attention direction determining module configured to determine the sight attention direction according to the left-eye image, the right-eye image, and the head posture
  • An eye key point detection module configured to detect key eye points in the left eye image and the right eye image to obtain the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil;
  • the eye area of interest determination module is configured to determine the eye area of interest according to the eye-focusing direction, the coordinates of the center point of the left eye pupil, and the coordinates of the right eye pupil center point.
  • the eye attention area determination module may include:
  • a center point coordinate calculation sub-module configured to calculate the coordinates of the center point of the pupils of both eyes according to the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil;
  • a point-to-surface distance calculation sub-module for calculating the point-to-surface distance between the center point of the pupils of the two eyes and a preset screen according to the coordinates of the center points of the pupils of the two eyes;
  • An eye point of interest coordinate calculation sub-module configured to calculate the coordinates of the eye point of interest according to the attention direction of the line of sight, the coordinates of the center points of the pupils of the two eyes, and the distance between the points;
  • the eye area of interest determination sub-module is used to determine the eye area of interest according to the coordinates of the eye point of interest.
  • Further, the eye area of interest determination sub-module may include:
  • a coordinate conversion unit configured to convert the coordinates of the eye point of interest according to the coordinates of a preset reference pixel point to obtain the pixel position of the eye point of interest on the screen;
  • the pixel position determining unit is configured to determine whether the pixel position is within the range of the screen according to a preset screen resolution
  • a screen area determining unit configured to determine the screen area where the pixel position is located according to a preset screen area division rule if the pixel position is within the range of the screen;
  • the eye attention area determining unit is configured to determine the screen area where the pixel position is located as the eye attention area.
  • the coordinate conversion unit may include:
  • the distance calculation subunit is configured to calculate a first distance and a second distance respectively according to the coordinates of the reference pixel point and the coordinates of the eye point of interest, where the first distance is the reference pixel point and the eye The distance of the point of interest in the preset first coordinate axis direction, where the second distance is the distance between the reference pixel point and the eye point of interest in the preset second coordinate axis direction;
  • a first pixel position calculation subunit configured to calculate the pixel position of the eye point of interest in the direction of the first coordinate axis according to the first distance and a preset first conversion coefficient
  • the second pixel position calculation subunit is configured to calculate the pixel position of the eye point of interest in the direction of the second coordinate axis according to the second distance and a preset second conversion coefficient.
  • the line-of-sight attention direction determining module is specifically configured to input the left-eye image, the right-eye image, and the head posture into a pre-trained line-of-sight prediction model for processing to obtain the line-of-sight attention direction ;
  • the sight attention direction determining module may include:
  • the feature information extraction sub-module is configured to extract feature information from the left-eye image and the right-eye image to obtain left-eye feature information and right-eye feature information;
  • the binocular feature information determining sub-module is configured to perform fusion processing on the left eye feature information and the right eye feature information to obtain binocular feature information;
  • the gaze attention direction determination sub-module is used to perform fusion processing on the binocular feature information and the head posture to obtain the gaze attention direction.
  • the device for detecting a region of interest may further include:
  • the sample set construction module is used to construct a training sample set, wherein the training sample set includes SN training samples, and each training sample includes pre-collected left-eye images, right-eye images and head poses of the subject, And the label corresponding to each training sample is the pre-calibrated line of sight attention direction, and SN is a positive integer;
  • the model training module is used to train the line-of-sight prediction model in the initial state by using the training sample set to obtain the pre-trained line-of-sight prediction model.
  • the device for detecting a region of interest may further include:
  • the facial feature information extraction module is used to extract the facial feature information in the target face image
  • a user information determining module configured to determine user information corresponding to the target face image according to the facial feature information
  • the screen display information determining module is used to determine the screen display information corresponding to the eye focus area
  • the correspondence relationship establishment module is used to establish the correspondence relationship between the user information and the screen display information.
  • The third aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of any of the above region-of-interest detection methods are implemented.
  • The fourth aspect of the embodiments of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor; when the processor executes the computer program, the steps of any of the above region-of-interest detection methods are implemented.
  • The fifth aspect of the embodiments of the present application provides a computer program product which, when run on a terminal device, causes the terminal device to execute the steps of any of the above region-of-interest detection methods.
  • The embodiments of the application have the following beneficial effects: the embodiments obtain a target face image to be detected; calculate the head posture of the target face image; extract the left-eye image and the right-eye image from the target face image; determine the attention direction of the line of sight according to the left-eye image, the right-eye image and the head posture; perform eye key point detection in the left-eye image and the right-eye image respectively, to obtain the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil; and determine the eye focus area according to the attention direction of the line of sight and the two pupil center point coordinates. No expensive precision instrument is required: the eye focus area is determined by analyzing and processing the face image, which greatly reduces cost and makes wide application possible.
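  • For illustration only, the following is a minimal Python sketch of how the six steps might be orchestrated; every helper name (detect_face, estimate_head_pose, and so on) is a hypothetical placeholder for the corresponding step, not an API defined by this application:

```python
def detect_attention_region(frame, steps):
    """Hypothetical orchestration of steps S101-S106; `steps` is any object
    supplying the six step implementations, and every name here is a
    placeholder rather than an API defined by this application."""
    face = steps.detect_face(frame)                        # S101
    if face is None:
        return None
    head_pose = steps.estimate_head_pose(face)             # S102 (ICP)
    left_eye, right_eye = steps.extract_eyes(face)         # S103
    gaze = steps.predict_gaze(left_eye, right_eye, head_pose)    # S104
    l_pupil, r_pupil = steps.detect_pupils(left_eye, right_eye)  # S105
    return steps.locate_region(gaze, l_pupil, r_pupil)     # S106
```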
  • FIG. 1 is a flowchart of an embodiment of a method for detecting a region of interest in an embodiment of the application
  • Figure 2 is a schematic diagram of a 3D coordinate system established in an embodiment of the application
  • Figure 3 is a schematic diagram of the network structure of the line-of-sight prediction model
  • Figure 4 is a schematic diagram of the attention direction of the line of sight
  • Fig. 5 is a schematic diagram of the center point of the pupil
  • Fig. 6 is a schematic flow chart of determining the eye area of interest according to the attention direction of the line of sight, the coordinates of the center point of the left eye pupil, and the coordinates of the right eye pupil center point;
  • Figure 7 is a schematic diagram of calculating the coordinates of the eye point of interest
  • FIG. 8 is a schematic flowchart of determining the eye area of interest according to the coordinates of the eye point of interest
  • FIG. 9 is a structural diagram of an embodiment of a device for detecting a region of interest in an embodiment of the application.
  • FIG. 10 is a schematic block diagram of a terminal device in an embodiment of this application.
  • an embodiment of a method for detecting a region of interest in an embodiment of the present application may include:
  • Step S101 Obtain a target face image to be detected.
  • In order to determine the user's attention area on the screen, a depth camera may be configured for the screen.
  • the depth camera may be built into the screen or used as an external device of the screen.
  • Specifically, the camera coordinate system of the depth camera can be used to establish a 3D coordinate system as shown in FIG. 2, and the four corner points of the screen, including the upper-left corner (that is, left_up in FIG. 2) and the upper-right corner, can be pre-calibrated in this coordinate system.
  • The execution subject of the embodiments of this application may be a terminal device connected to the screen in a wired or wireless manner, including but not limited to desktop computers, notebooks, palmtop computers, smart phones, servers, and other terminal equipment with data processing functions.
  • If the screen is a smart screen with data processing functions, the screen itself can also serve as the terminal device that executes the embodiments of the present application, without relying on other external terminal devices.
  • In a specific implementation, an image around the screen can be collected by the depth camera, and face detection can be performed on the image. If a face is detected, the current face image, that is, the target face image, can be captured.
  • Step S102 Calculate the head posture of the target face image.
  • Specifically, 3D key points of the face can be detected in the target face image, and the head pose of the target face image can be calculated according to these 3D key points.
  • an Iterative Closest Point (ICP) algorithm may be used to calculate the head posture.
  • In the ICP algorithm, a reference point cloud is preset as the comparison benchmark, and it contains each 3D key point used as a reference. The detected 3D key points are then assembled into the point cloud of the target face image. Corresponding points between the two point clouds are determined using the nearest-neighbor criterion, the transformation between them is solved by the least-squares method, and this transformation is used to rotate the point cloud of the target face image, yielding an updated point cloud. The above process is repeated until a preset termination condition is reached and the iteration stops. The rotation angles of all iterations are superimposed, and the result is the head pose of the target face image.
  • The calculated head posture is denoted as headpose[theta, phi], where theta represents the upward or downward (pitch) angle of the head, and phi represents the deflection (yaw) angle of the head in the horizontal direction.
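  • As an illustrative sketch (not part of the original disclosure), the ICP loop described above can be written in numpy roughly as follows; the point-to-point variant, the rigid-update details, and the Euler-angle convention R = Ry(phi) @ Rx(theta) are assumptions:

```python
import numpy as np

def icp_head_pose(src, ref, iters=20, tol=1e-6):
    """Estimate the head pose by aligning the detected 3D face key points
    (src, Nx3) to a preset reference point cloud (ref, Mx3) with ICP."""
    src = np.asarray(src, dtype=float)
    ref = np.asarray(ref, dtype=float)
    R_total = np.eye(3)
    prev_err = np.inf
    for _ in range(iters):
        # nearest-neighbour correspondences (brute force, small point sets)
        d = np.linalg.norm(src[:, None, :] - ref[None, :, :], axis=2)
        matched = ref[np.argmin(d, axis=1)]
        # least-squares rotation between the centred point sets (Kabsch/SVD)
        sc, mc = src - src.mean(0), matched - matched.mean(0)
        U, _, Vt = np.linalg.svd(sc.T @ mc)
        S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ S @ U.T
        src = sc @ R.T + matched.mean(0)      # rotate and re-centre the cloud
        R_total = R @ R_total                 # superimpose the rotations
        err = np.mean(np.linalg.norm(src - matched, axis=1))
        if abs(prev_err - err) < tol:         # preset termination condition
            break
        prev_err = err
    # one possible Euler extraction, assuming R_total = Ry(phi) @ Rx(theta)
    theta = np.degrees(np.arcsin(-R_total[1, 2]))   # pitch: looking up/down
    phi = np.degrees(np.arctan2(-R_total[2, 0], R_total[0, 0]))  # yaw
    return theta, phi
```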
  • Step S103 Extract a left eye image and a right eye image in the target face image.
  • Specifically, the left-eye key points can be filtered out from the detected 3D key points. Denote the minimum and maximum abscissas of these left-eye key points as left_x_min and left_x_max, and the minimum and maximum ordinates as left_y_min and left_y_max. A rectangular area LA1 is formed by the following four coordinate points: (left_x_min, left_y_max), (left_x_min, left_y_min), (left_x_max, left_y_max), (left_x_max, left_y_min). The image in LA1 can be intercepted as the left-eye image using this minimum and maximum value information; alternatively, LA1 may be enlarged into a region LA2 and the image in LA2 used as the left-eye image. The extraction process of the right-eye image is similar to that of the left-eye image and is not repeated here.
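  • A minimal numpy sketch of this cropping step, assuming the eye key points have been projected to pixel coordinates; the optional margin (playing the role of the enlarged region LA2) is an assumption:

```python
import numpy as np

def crop_eye(image, eye_points, margin=0):
    """Crop an eye region from the face image given the eye key points in
    pixel coordinates (Kx2 array of (x, y)). margin=0 yields the tight box
    LA1; a positive margin yields an enlarged box in the role of LA2."""
    x_min, y_min = eye_points.min(axis=0).astype(int) - margin
    x_max, y_max = eye_points.max(axis=0).astype(int) + margin
    h, w = image.shape[:2]
    x_min, y_min = max(x_min, 0), max(y_min, 0)           # clamp to the image
    x_max, y_max = min(x_max, w - 1), min(y_max, h - 1)
    return image[y_min:y_max + 1, x_min:x_max + 1]
```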
  • Step S104 Determine the attention direction of the line of sight according to the left-eye image, the right-eye image and the head posture.
  • the left-eye image, the right-eye image, and the head posture may be input into a pre-trained line of sight prediction model for processing, so as to obtain the attention direction of the line of sight.
  • the line of sight prediction model uses a multi-input neural network structure.
  • As shown in FIG. 3, the line-of-sight prediction model first extracts feature information from the left-eye image and the right-eye image to obtain the left-eye feature information and the right-eye feature information, then fuses the left-eye feature information with the right-eye feature information to obtain binocular feature information, and finally fuses the binocular feature information with the head posture to obtain the attention direction of the line of sight, as shown in FIG. 4.
  • Specifically, a ResNet18 block (ie, the ResNet18 Block in Figure 3) is used to extract feature information from the left eye image (ie, the Left eye in Figure 3), and the extracted feature information is then sequentially subjected to average pooling processing (ie, Avg_pooling in Figure 3), fully connected layer processing (ie, FC_Left in Figure 3), batch normalization processing (ie, BN_Left in Figure 3), and activation function processing (ie, Relu_Left in Figure 3) to obtain the left-eye feature information.
  • Similarly, a ResNet18 block (ie, ResNet18 Block in Figure 3) is used to extract feature information from the right eye image (ie, Right eye in Figure 3), and the extracted feature information is sequentially subjected to average pooling processing (ie, Avg_pooling in Figure 3), fully connected layer processing (ie, FC_Right in Figure 3), batch normalization processing (ie, BN_Right in Figure 3), and activation function processing (ie, Relu_Right in Figure 3) to obtain the right-eye feature information.
  • the left-eye feature information and the right-eye feature information are obtained separately, the left-eye feature information and the right-eye feature information are spliced (ie EyesConcat in FIG. 3), and the two are merged, Then, the fused information is processed by the fully connected layer (ie EyesFc1 in FIG. 3) to obtain the binocular feature information.
  • the fully connected layer ie EyesFc1 in FIG. 3
  • Next, the binocular feature information and the head pose are spliced (ie, HeadConcat in Figure 3) and merged, and the fused information is subjected to batch normalization processing (ie, BN_Head in Figure 3), activation function processing (ie, Relu_Head in Figure 3), and fully connected layer processing (ie, Fc_Head in Figure 3) to obtain the attention direction of the line of sight (ie, Gaze in Figure 3).
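  • The following PyTorch sketch mirrors the structure of Figure 3 under several assumptions: a full torchvision resnet18 backbone stands in for the "ResNet18 Block", the 128-dimensional feature size is arbitrary, and the output is the two gaze angles [theta, phi]:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class GazeNet(nn.Module):
    """Sketch of the multi-input network of Figure 3: two ResNet18 branches,
    per-eye FC + BN + ReLU, eye-feature fusion, then head-pose fusion."""

    def __init__(self, feat_dim=128):  # feature size is an assumption
        super().__init__()
        self.left_branch = resnet18()
        self.right_branch = resnet18()
        self.left_branch.fc = nn.Identity()   # keep the 512-d pooled features
        self.right_branch.fc = nn.Identity()  # (Avg_pooling is inside resnet18)
        self.fc_left = nn.Sequential(         # FC_Left -> BN_Left -> Relu_Left
            nn.Linear(512, feat_dim), nn.BatchNorm1d(feat_dim), nn.ReLU())
        self.fc_right = nn.Sequential(        # FC_Right -> BN_Right -> Relu_Right
            nn.Linear(512, feat_dim), nn.BatchNorm1d(feat_dim), nn.ReLU())
        self.eyes_fc1 = nn.Linear(2 * feat_dim, feat_dim)  # EyesFc1
        self.head = nn.Sequential(            # BN_Head -> Relu_Head -> Fc_Head
            nn.BatchNorm1d(feat_dim + 2), nn.ReLU(), nn.Linear(feat_dim + 2, 2))

    def forward(self, left_eye, right_eye, head_pose):
        l = self.fc_left(self.left_branch(left_eye))
        r = self.fc_right(self.right_branch(right_eye))
        eyes = self.eyes_fc1(torch.cat([l, r], dim=1))     # EyesConcat
        fused = torch.cat([eyes, head_pose], dim=1)        # HeadConcat
        return self.head(fused)                            # Gaze: [theta, phi]
```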
  • The attention direction of the line of sight output in angle form can be converted into vector form, denoted init_vector, whose components on the x-axis, y-axis, and z-axis are vectorx, vectory, and vectorz, respectively. The vector is then normalized according to the following formula to obtain the normalized vector of the attention direction of the sight line:

  gaze_vector = init_vector / norm

  where norm is the modulus of init_vector, and gaze_vector is the normalized vector of the attention direction of the line of sight, whose components on the x-axis, y-axis, and z-axis are gaze_vector[x], gaze_vector[y], and gaze_vector[z], respectively.
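  • The exact angle-to-vector formula is not reproduced above; the sketch below uses one common spherical convention (an assumption) and then applies the normalization formula given in the text:

```python
import numpy as np

def gaze_angles_to_vector(theta, phi):
    """Convert the gaze angles (radians) to a direction vector and normalize
    it. The spherical convention below is an assumption; only the
    normalization gaze_vector = init_vector / norm is given in the text."""
    init_vector = np.array([np.cos(theta) * np.sin(phi),   # vectorx
                            np.sin(theta),                 # vectory
                            np.cos(theta) * np.cos(phi)])  # vectorz
    norm = np.linalg.norm(init_vector)   # modulus of the direction
    gaze_vector = init_vector / norm     # already unit-length under this
    return gaze_vector                   # convention, but applied as in text
```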
  • In this way, the left-eye feature information, the right-eye feature information, and the head posture information are fused together and considered comprehensively, and the attention direction of the line of sight is predicted on this basis, which greatly improves the accuracy of the final prediction result.
  • Specifically, a training sample set may be constructed in advance, and the line-of-sight prediction model in its initial state can be trained using the training sample set to obtain the pre-trained line-of-sight prediction model. The training sample set includes SN training samples; each training sample includes a pre-collected left-eye image, right-eye image and head posture of a subject; the label corresponding to each training sample is a pre-calibrated line-of-sight attention direction; and SN is a positive integer.
  • The training process of the neural network is a common technology in the prior art; for details, reference may be made to any existing neural network training method, which is not repeated in the embodiments of the present application.
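  • For completeness, a generic supervised training loop over the SN samples might look as follows; the L1 loss, optimizer, and hyper-parameters are illustrative assumptions, not taken from the patent:

```python
import torch
from torch.utils.data import DataLoader

def train_gaze_model(model, dataset, epochs=10, lr=1e-3):
    """Generic training over the SN samples. The dataset is assumed to yield
    (left_eye, right_eye, head_pose, gaze_label) tuples; the loss and
    hyper-parameters are illustrative only."""
    loader = DataLoader(dataset, batch_size=64, shuffle=True)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.L1Loss()
    model.train()
    for _ in range(epochs):
        for left_eye, right_eye, head_pose, gaze_label in loader:
            opt.zero_grad()
            pred = model(left_eye, right_eye, head_pose)
            loss = loss_fn(pred, gaze_label)  # label: pre-calibrated direction
            loss.backward()
            opt.step()
```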
  • Step S105 Perform eye key point detection in the left-eye image and the right-eye image, respectively, to obtain the coordinates of the center point of the left eye pupil and the coordinates of the right eye pupil center point.
  • For example, a preset eye fixed-point model (ELM) may be used to perform the eye key point detection. The obtained coordinates of the center point of the left eye pupil and of the center point of the right eye pupil are both three-dimensional coordinates, which can be denoted as (x_left, y_left, z_left) and (x_right, y_right, z_right), where x_left, y_left, and z_left are the coordinates of the left eye pupil center point on the x-axis, y-axis, and z-axis, and x_right, y_right, and z_right are the coordinates of the right eye pupil center point on the x-axis, y-axis, and z-axis, respectively.
  • Step S106 Determine an eye area of interest according to the line of sight attention direction, the coordinates of the center point of the left eye pupil, and the coordinates of the right eye pupil center point.
  • step S106 may specifically include the process shown in FIG. 6:
  • Step S1061 Calculate the coordinates of the center points of the pupils of both eyes according to the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil.
  • Specifically, the coordinates of the center point of the pupils of both eyes can be calculated according to the following formula:

  middle_pos = (left_iris_center + right_iris_center) / 2

  where left_iris_center and right_iris_center are the coordinates of the two pupil center points, middle_pos = (x_middle, y_middle, z_middle) is the coordinates of the center point of the pupils of both eyes, and x_middle, y_middle, and z_middle are its coordinates on the x-axis, y-axis, and z-axis, respectively.
  • Step S1062 Calculate the point-to-plane distance between the center point of the pupils of the eyes and the preset screen according to the coordinates of the center points of the pupils of the eyes.
  • Denote the normal vector of the screen plane as n = (A, B, C). The point-to-plane distance can then be calculated as:

  iris_distance = (A*x_middle + B*y_middle + C*z_middle) / sqrt(A² + B² + C²)

  where sqrt is the square root function and iris_distance is the distance between the center point of the pupils of both eyes and the screen.
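  • A small numpy sketch of steps S1061 and S1062, matching the two formulas above (the plane is taken through the camera origin, since the formula as given contains no offset term):

```python
import numpy as np

def pupil_midpoint_and_distance(left_iris_center, right_iris_center, n):
    """middle_pos = (left_iris_center + right_iris_center) / 2, followed by
    the point-to-plane distance to the screen plane with normal n = (A, B, C),
    exactly as in the formulas above."""
    middle_pos = (np.asarray(left_iris_center, dtype=float) +
                  np.asarray(right_iris_center, dtype=float)) / 2.0
    A, B, C = n
    iris_distance = (A * middle_pos[0] + B * middle_pos[1] +
                     C * middle_pos[2]) / np.sqrt(A**2 + B**2 + C**2)
    return middle_pos, iris_distance
```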
  • Step S1063 Calculate the coordinates of the eye-focused point according to the eye-focusing direction, the coordinates of the center points of the pupils of the two eyes, and the point-to-surface distance.
  • the eye focus point is the projection point of the line of sight on the screen.
  • The coordinates of the eye point of interest, denoted project_3d, can then be calculated from the attention direction of the line of sight, the coordinates middle_pos, and the distance iris_distance.
  • FIG. 7 shows a schematic diagram of calculating the coordinates of the eye point of interest.
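  • The text does not reproduce the formula for project_3d, so the sketch below uses the standard ray-plane intersection as an assumption; sign conventions depend on the calibrated coordinate system:

```python
import numpy as np

def project_gaze_to_screen(middle_pos, gaze_vector, n, iris_distance):
    """Compute project_3d by intersecting the line of sight with the screen
    plane: starting from middle_pos, travel along gaze_vector until the
    point-to-plane distance iris_distance is covered (an assumption, since
    the patent text does not reproduce the formula)."""
    n_unit = np.asarray(n, dtype=float) / np.linalg.norm(n)
    cos_angle = abs(np.dot(gaze_vector, n_unit))  # gaze vs. plane normal
    t = iris_distance / cos_angle                 # travel length along the ray
    return np.asarray(middle_pos) + t * np.asarray(gaze_vector)
```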
  • Step S1064 Determine the eye focus area according to the coordinates of the eye focus point.
  • step S1064 may specifically include the process shown in FIG. 8:
  • Step S10641 Transform the coordinates of the eye point of interest according to the coordinates of the preset reference pixel point to obtain the pixel position of the eye point of interest on the screen.
  • the reference pixel point may be any one of the four corner points shown in FIG. 2, and here, the upper left corner point is preferably determined as the reference pixel point.
  • Specifically, the first distance and the second distance may be calculated according to the coordinates of the reference pixel point and the coordinates of the eye point of interest; the pixel position of the eye point of interest in the direction of the first coordinate axis is then calculated according to the first distance and a preset first conversion coefficient, and the pixel position in the direction of the second coordinate axis is calculated according to the second distance and a preset second conversion coefficient.
  • The first distance is the distance between the reference pixel point and the eye focus point in the direction of the preset first coordinate axis (that is, the x-axis in FIG. 2); the second distance is the distance between the reference pixel point and the eye focus point in the direction of the preset second coordinate axis (that is, the y-axis in FIG. 2).
  • The first conversion coefficient is the number of pixels included in each unit length in the direction of the first coordinate axis; the second conversion coefficient is the number of pixels included in each unit length in the direction of the second coordinate axis.
  • Specifically, the pixel position may be calculated as:

  project_pixel[x] = (project_3d[x] - left_up[x]) * scalex
  project_pixel[y] = (project_3d[y] - left_up[y]) * scaley

  where project_3d[x] and project_3d[y] are the coordinates of the eye focus point in the directions of the first and second coordinate axes, left_up[x] and left_up[y] are the coordinates of the reference pixel point in those directions, scalex is the first conversion coefficient, scaley is the second conversion coefficient, and project_pixel[x] and project_pixel[y] are the pixel positions of the eye point of interest in the directions of the first and second coordinate axes, respectively.
  • In this way, the pixel position of the eye focus point on the screen is obtained, and the eye focus area can be accurately determined on this basis.
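  • A sketch of the conversion just described, using the reconstructed formulas above; left_up and the conversion coefficients are assumed to come from the pre-calibration of FIG. 2:

```python
def to_pixel_position(project_3d, left_up, scale_x, scale_y):
    """Convert the 3D focus point to a pixel position on the screen using
    the calibrated reference corner left_up and the conversion coefficients
    scalex/scaley (pixels per unit length along each axis)."""
    px = (project_3d[0] - left_up[0]) * scale_x   # first coordinate axis (x)
    py = (project_3d[1] - left_up[1]) * scale_y   # second coordinate axis (y)
    return px, py
```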
  • Step S10642 Determine whether the pixel position is within the range of the screen according to the preset screen resolution.
  • Specifically, denote the screen resolution of the screen as MaxX * MaxY. If the pixel position satisfies 0 ≤ project_pixel[x] ≤ MaxX and 0 ≤ project_pixel[y] ≤ MaxY, it can be determined that the pixel position is within the range of the screen; otherwise, it can be determined that the pixel position is not within the range of the screen.
  • If the pixel position is not within the range of the screen, the user is not paying attention to the content on the screen, and no subsequent processing is required; if the pixel position is within the range of the screen, the following steps continue to be executed.
  • Step S10643 Determine the screen area where the pixel position is located according to a preset screen area division rule.
  • Step S10644 Determine the screen area where the pixel position is located as the eye focus area.
  • Specifically, the screen can be divided in advance into KN screen areas (KN is an integer greater than 1), recorded in order from top to bottom and from left to right as screen area 1, screen area 2, ..., screen area k, ..., screen area KN, where 1 ≤ k ≤ KN. If the pixel position falls within the range of screen area k, then screen area k is determined as the eye focus area.
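  • A sketch of steps S10642 to S10644, assuming (as an illustration) that the preset division rule is a uniform grid of rows * cols = KN areas numbered row by row:

```python
def locate_screen_region(px, py, max_x, max_y, rows, cols):
    """Check 0 <= px <= MaxX and 0 <= py <= MaxY, then map the pixel position
    to one of the KN = rows * cols areas, numbered 1..KN from top to bottom
    and left to right (row-major numbering is an assumption)."""
    if not (0 <= px <= max_x and 0 <= py <= max_y):
        return None   # the user is not paying attention to the screen
    col = min(int(px * cols / max_x), cols - 1)
    row = min(int(py * rows / max_y), rows - 1)
    return row * cols + col + 1   # screen area k, the eye focus area
```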
  • Further, facial feature information in the target face image can be extracted, and user information corresponding to the target face image can be determined according to the facial feature information.
  • This user information includes but is not limited to age, gender, etc.
  • each divided screen area can be used to display different information, including but not limited to advertisements, news, announcements, and so on.
  • the screen display information corresponding to the eye focus area can be further determined, and the corresponding relationship between the user information and the screen display information can be established.
  • In summary, the embodiments of the application obtain a target face image to be detected; calculate the head posture of the target face image; extract the left-eye image and the right-eye image from the target face image; determine the attention direction of the line of sight according to the left-eye image, the right-eye image, and the head posture; perform eye key point detection in the left-eye image and the right-eye image respectively, to obtain the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil; and determine the eye area of interest according to the attention direction of the line of sight and the two pupil center point coordinates.
  • FIG. 9 shows a structural diagram of an embodiment of a device for detecting a region of interest provided in an embodiment of the present application.
  • a device for detecting a region of interest may include:
  • the face image acquisition module 901 is used to acquire the target face image to be detected
  • the head posture calculation module 902 is used to calculate the head posture of the target face image
  • the eye image extraction module 903 is configured to extract the left eye image and the right eye image in the target face image
  • the sight attention direction determining module 904 is configured to determine the sight attention direction according to the left-eye image, the right-eye image, and the head posture;
  • the eye key point detection module 905 is configured to perform eye key point detection in the left eye image and the right eye image to obtain the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil;
  • the eye area of interest determination module 906 is configured to determine the eye area of interest according to the eye-focusing direction, the coordinates of the center point of the left eye pupil, and the coordinates of the center point of the right eye pupil.
  • the eye attention area determination module may include:
  • a center point coordinate calculation sub-module configured to calculate the coordinates of the center point of the pupils of both eyes according to the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil;
  • a point-to-surface distance calculation sub-module for calculating the point-to-surface distance between the center point of the pupils of the two eyes and a preset screen according to the coordinates of the center points of the pupils of the two eyes;
  • An eye point of interest coordinate calculation sub-module configured to calculate the coordinates of the eye point of interest according to the attention direction of the line of sight, the coordinates of the center points of the pupils of the two eyes, and the distance between the points;
  • the eye area of interest determination sub-module is used to determine the eye area of interest according to the coordinates of the eye point of interest.
  • Further, the eye area of interest determination sub-module may include:
  • a coordinate conversion unit configured to convert the coordinates of the eye point of interest according to the coordinates of a preset reference pixel point to obtain the pixel position of the eye point of interest on the screen;
  • the pixel position determining unit is configured to determine whether the pixel position is within the range of the screen according to a preset screen resolution
  • a screen area determining unit configured to determine the screen area where the pixel position is located according to a preset screen area division rule if the pixel position is within the range of the screen;
  • the eye attention area determining unit is configured to determine the screen area where the pixel position is located as the eye attention area.
  • the coordinate conversion unit may include:
  • the distance calculation subunit is configured to calculate a first distance and a second distance respectively according to the coordinates of the reference pixel point and the coordinates of the eye point of interest, where the first distance is the reference pixel point and the eye The distance of the point of interest in the preset first coordinate axis direction, where the second distance is the distance between the reference pixel point and the eye point of interest in the preset second coordinate axis direction;
  • a first pixel position calculation subunit configured to calculate the pixel position of the eye point of interest in the direction of the first coordinate axis according to the first distance and a preset first conversion coefficient
  • the second pixel position calculation subunit is configured to calculate the pixel position of the eye point of interest in the direction of the second coordinate axis according to the second distance and a preset second conversion coefficient.
  • the line-of-sight attention direction determining module is specifically configured to input the left-eye image, the right-eye image, and the head posture into a pre-trained line-of-sight prediction model for processing to obtain the line-of-sight attention direction ;
  • the sight attention direction determining module may include:
  • the feature information extraction sub-module is configured to extract feature information from the left-eye image and the right-eye image to obtain left-eye feature information and right-eye feature information;
  • the binocular feature information determining sub-module is configured to perform fusion processing on the left eye feature information and the right eye feature information to obtain binocular feature information;
  • the gaze attention direction determination sub-module is used to perform fusion processing on the binocular feature information and the head posture to obtain the gaze attention direction.
  • the device for detecting a region of interest may further include:
  • the sample set construction module is used to construct a training sample set, wherein the training sample set includes SN training samples, and each training sample includes pre-collected left-eye images, right-eye images and head poses of the subject, And the label corresponding to each training sample is the pre-calibrated line of sight attention direction, and SN is a positive integer;
  • the model training module is used to train the line-of-sight prediction model in the initial state by using the training sample set to obtain the pre-trained line-of-sight prediction model.
  • the device for detecting a region of interest may further include:
  • the facial feature information extraction module is used to extract the facial feature information in the target face image
  • a user information determining module configured to determine user information corresponding to the target face image according to the facial feature information
  • the screen display information determining module is used to determine the screen display information corresponding to the eye focus area
  • the correspondence relationship establishment module is used to establish the correspondence relationship between the user information and the screen display information.
  • FIG. 10 shows a schematic block diagram of a terminal device provided by an embodiment of the present application. For ease of description, only parts related to the embodiment of the present application are shown.
  • the terminal device 10 of this embodiment includes: a processor 100, a memory 101, and a computer program 102 stored in the memory 101 and running on the processor 100.
  • the processor 100 executes the computer program 102, the steps in the foregoing embodiments of the region of interest detection method are implemented, for example, step S101 to step S106 shown in FIG. 1.
  • the processor 100 executes the computer program 102, the functions of the modules/units in the foregoing device embodiments, for example, the functions of the modules 901 to 906 shown in FIG. 9 are realized.
  • The computer program 102 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 101 and executed by the processor 100 to complete this application.
  • the one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 102 in the terminal device 10.
  • the terminal device 10 may be a computing device such as a desktop computer, a notebook, a palmtop computer, a smart phone, a server, and a smart screen.
  • FIG. 10 is only an example of the terminal device 10 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown in the figure, may combine certain components, or may have different components.
  • the terminal device 10 may also include an input/output device, a network access device, a bus, and the like.
  • The processor 100 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the processor 100 may be the nerve center and command center of the terminal device 10, and the processor 100 may generate operation control signals according to instruction operation codes and timing signals, and complete the control of fetching instructions and executing instructions.
  • the memory 101 may be an internal storage unit of the terminal device 10, such as a hard disk or a memory of the terminal device 10.
  • The memory 101 may also be an external storage device of the terminal device 10, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 10.
  • the memory 101 may also include both an internal storage unit of the terminal device 10 and an external storage device.
  • the memory 101 is used to store the computer program and other programs and data required by the terminal device 10.
  • the memory 101 can also be used to temporarily store data that has been output or will be output.
  • The terminal device 10 may also include a communication module, and the communication module may provide communication solutions applied on the device, including wireless local area networks (WLAN, such as Wi-Fi networks), Bluetooth, Zigbee, mobile communication networks, the Global Navigation Satellite System (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like.
  • the communication module may be one or more devices integrating at least one communication processing module.
  • the communication module may include an antenna, and the antenna may have only one array element or an antenna array including multiple array elements.
  • the communication module can receive electromagnetic waves through an antenna, frequency-modulate and filter the electromagnetic wave signals, and send the processed signals to the processor.
  • the communication module can also receive the signal to be sent from the processor, perform frequency modulation and amplification, and convert it into electromagnetic waves to radiate through the antenna.
  • the terminal device 10 may also include a power management module, which can receive input from an external power source, a battery, and/or a charger, and supply power to the processor, the memory, the communication module, and the like.
  • the terminal device 10 may also include a display module, which may be used to display information input by the user or information provided to the user.
  • The display module may include a display panel, and the display panel may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like.
  • Further, a touch panel may cover the display panel. When the touch panel detects a touch operation on or near it, the operation is transmitted to the processor to determine the type of the touch event, and the processor then provides a corresponding visual output on the display panel according to the type of the touch event.
  • the disclosed device/terminal device and method may be implemented in other ways.
  • the device/terminal device embodiments described above are merely illustrative.
  • The division of the modules or units is only a logical function division; in actual implementation there may be other division manners, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • The embodiments of the present application also provide a computer program product which, when run on a terminal device, causes the terminal device to implement the steps in the foregoing method embodiments.
  • If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • When the present application implements all or part of the processes in the methods of the above embodiments, this can also be completed by a computer program instructing relevant hardware.
  • the computer program can be stored in a computer-readable storage medium. When the program is executed by the processor, it can implement the steps of the foregoing method embodiments.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a mobile hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on.

Abstract

A region of concern detection method and apparatus, and a computer readable storage medium and a terminal device. The method comprises: obtaining a target face image to be detected (S101); calculating a head posture of the target face image (S102); extracting a left eye image and a right eye image from the target face image (S103); determining a sight line concern direction according to the left eye image, the right eye image, and the head posture (S104); respectively performing eye key point detection in the left eye image and the right eye image to obtain a coordinate of a left eye pupil center point and a coordinate of a right eye pupil center point (S105); and determining a region of eye concern according to the sight line concern direction, the coordinate of the left eye pupil center point, and the coordinate of the right eye pupil center point (S106). An expensive precision instrument does not need to be used, and a region of eye concern is determined by analyzing and processing a face image, so that costs are greatly reduced, and a wide range of applications can be carried out.

Description

Method, device, readable storage medium and terminal device for detecting a region of interest

Technical field

The present invention relates to the field of image processing technology, and in particular to a method, device, computer-readable storage medium and terminal device for detecting a region of interest.

Background

Offline screens mainly include televisions, building screens, cinema ticket machines, cinema LED displays, supermarket and convenience-store POS screens, taxi-mounted screens, projectors, and express-locker screens, and they touch almost every aspect of users' lives. Offline advertising based on such screens has a natural advantage in attracting consumers' attention, but advertisers cannot quickly learn whether the design and content of an offline advertisement actually appeal to consumers, so feedback on offline advertising is usually slower and less accurate than online feedback. As a result, some advertisements are placed imprecisely and inefficiently. In the prior art, precision instruments such as eye trackers can be used to track eye movement and determine the area of interest, but such instruments are very expensive and difficult to apply widely.

Technical solutions

In view of this, the embodiments of the present application provide a method, a device, a computer-readable storage medium, and a terminal device for detecting an area of interest, to solve the problem that the existing area-of-interest detection methods are very expensive and therefore difficult to apply widely.
The first aspect of the embodiments of the present application provides a method for detecting a region of interest, which may include:

acquiring a target face image to be detected;

calculating the head posture of the target face image;

extracting a left-eye image and a right-eye image from the target face image;

determining the attention direction of the line of sight according to the left-eye image, the right-eye image, and the head posture;

performing eye key point detection in the left-eye image and the right-eye image respectively, to obtain the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil;

determining the eye area of interest according to the attention direction of the line of sight, the coordinates of the center point of the left eye pupil, and the coordinates of the center point of the right eye pupil.
Further, the determining the eye area of interest according to the attention direction of the line of sight, the coordinates of the center point of the left eye pupil, and the coordinates of the center point of the right eye pupil includes:

calculating the coordinates of the center point of the pupils of both eyes according to the coordinates of the center point of the left eye pupil and the coordinates of the center point of the right eye pupil;

calculating the point-to-plane distance between the center point of the pupils of both eyes and a preset screen according to the coordinates of the center point of the pupils of both eyes;

calculating the coordinates of the eye point of interest according to the attention direction of the line of sight, the coordinates of the center point of the pupils of both eyes, and the point-to-plane distance;

determining the eye area of interest according to the coordinates of the eye point of interest.
Further, the determining the eye area of interest according to the coordinates of the eye point of interest includes:

converting the coordinates of the eye point of interest according to the coordinates of a preset reference pixel point to obtain the pixel position of the eye point of interest on the screen;

judging whether the pixel position is within the range of the screen according to a preset screen resolution;

if the pixel position is within the range of the screen, determining the screen area where the pixel position is located according to a preset screen area division rule;

determining the screen area where the pixel position is located as the eye focus area.
Further, the converting the coordinates of the eye point of interest according to the coordinates of the preset reference pixel point includes:

calculating a first distance and a second distance respectively according to the coordinates of the reference pixel point and the coordinates of the eye point of interest, where the first distance is the distance between the reference pixel point and the eye point of interest in the preset first coordinate axis direction, and the second distance is the distance between the reference pixel point and the eye point of interest in the preset second coordinate axis direction;

calculating the pixel position of the eye point of interest in the direction of the first coordinate axis according to the first distance and a preset first conversion coefficient;

calculating the pixel position of the eye point of interest in the direction of the second coordinate axis according to the second distance and a preset second conversion coefficient.
Further, the determining the attention direction of the line of sight according to the left-eye image, the right-eye image, and the head posture includes:

inputting the left-eye image, the right-eye image, and the head posture into a pre-trained line-of-sight prediction model for processing, to obtain the attention direction of the line of sight;

the processing of the line-of-sight prediction model includes:

extracting feature information from the left-eye image and the right-eye image respectively, to obtain left-eye feature information and right-eye feature information;

fusing the left-eye feature information and the right-eye feature information to obtain binocular feature information;

fusing the binocular feature information and the head posture to obtain the attention direction of the line of sight.
Further, before inputting the left-eye image, the right-eye image, and the head posture into the pre-trained line-of-sight prediction model for processing, the method may further include:

constructing a training sample set, where the training sample set includes SN training samples, each training sample includes a pre-collected left-eye image, right-eye image and head posture of a subject, the label corresponding to each training sample is a pre-calibrated line-of-sight attention direction, and SN is a positive integer;

training the line-of-sight prediction model in its initial state using the training sample set, to obtain the pre-trained line-of-sight prediction model.
Further, after the eye region of interest is determined according to the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point, the method may further include:
extracting facial feature information from the target face image;
determining user information corresponding to the target face image according to the facial feature information;
determining screen display information corresponding to the eye region of interest;
establishing a correspondence between the user information and the screen display information.
A second aspect of the embodiments of the present application provides a region-of-interest detection apparatus, which may include:
a face image acquisition module, configured to acquire a target face image to be detected;
a head pose calculation module, configured to calculate the head pose of the target face image;
an eye image extraction module, configured to extract a left-eye image and a right-eye image from the target face image;
a gaze direction determination module, configured to determine a gaze direction according to the left-eye image, the right-eye image, and the head pose;
an eye key point detection module, configured to perform eye key point detection in the left-eye image and the right-eye image respectively to obtain the coordinates of the left pupil center point and the coordinates of the right pupil center point;
an eye region-of-interest determination module, configured to determine an eye region of interest according to the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point.
Further, the eye region-of-interest determination module may include:
a center point coordinate calculation submodule, configured to calculate the coordinates of the binocular pupil center point according to the coordinates of the left pupil center point and the coordinates of the right pupil center point;
a point-to-plane distance calculation submodule, configured to calculate the point-to-plane distance between the binocular pupil center point and a preset screen according to the coordinates of the binocular pupil center point;
an eye point-of-interest coordinate calculation submodule, configured to calculate the coordinates of the eye point of interest according to the gaze direction, the coordinates of the binocular pupil center point, and the point-to-plane distance;
an eye region-of-interest determination submodule, configured to determine the eye region of interest according to the coordinates of the eye point of interest.
Further, the eye region-of-interest determination submodule may include:
a coordinate conversion unit, configured to convert the coordinates of the eye point of interest according to the coordinates of a preset reference pixel to obtain the pixel position of the eye point of interest on the screen;
a pixel position judgment unit, configured to judge, according to a preset screen resolution, whether the pixel position is within the range of the screen;
a screen region determination unit, configured to determine, if the pixel position is within the range of the screen, the screen region where the pixel position is located according to a preset screen region division rule;
an eye region-of-interest determination unit, configured to determine the screen region where the pixel position is located as the eye region of interest.
Further, the coordinate conversion unit may include:
a distance calculation subunit, configured to calculate a first distance and a second distance according to the coordinates of the reference pixel and the coordinates of the eye point of interest, where the first distance is the distance between the reference pixel and the eye point of interest in the direction of a preset first coordinate axis, and the second distance is the distance between the reference pixel and the eye point of interest in the direction of a preset second coordinate axis;
a first pixel position calculation subunit, configured to calculate the pixel position of the eye point of interest in the direction of the first coordinate axis according to the first distance and a preset first conversion coefficient;
a second pixel position calculation subunit, configured to calculate the pixel position of the eye point of interest in the direction of the second coordinate axis according to the second distance and a preset second conversion coefficient.
Further, the gaze direction determination module is specifically configured to input the left-eye image, the right-eye image, and the head pose into a pre-trained gaze prediction model for processing to obtain the gaze direction;
the gaze direction determination module may include:
a feature information extraction submodule, configured to extract feature information from the left-eye image and the right-eye image respectively to obtain left-eye feature information and right-eye feature information;
a binocular feature information determination submodule, configured to fuse the left-eye feature information and the right-eye feature information to obtain binocular feature information;
a gaze direction determination submodule, configured to fuse the binocular feature information and the head pose to obtain the gaze direction.
Further, the region-of-interest detection apparatus may further include:
a sample set construction module, configured to construct a training sample set, where the training sample set includes SN training samples, each training sample includes a pre-collected left-eye image, right-eye image, and head pose of a subject, the label of each training sample is a pre-calibrated gaze direction, and SN is a positive integer;
a model training module, configured to train an initial gaze prediction model using the training sample set to obtain the pre-trained gaze prediction model.
Further, the region-of-interest detection apparatus may further include:
a facial feature information extraction module, configured to extract facial feature information from the target face image;
a user information determination module, configured to determine user information corresponding to the target face image according to the facial feature information;
a screen display information determination module, configured to determine screen display information corresponding to the eye region of interest;
a correspondence establishment module, configured to establish a correspondence between the user information and the screen display information.
A third aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the steps of any of the above region-of-interest detection methods.
A fourth aspect of the embodiments of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of any of the above region-of-interest detection methods.
A fifth aspect of the embodiments of the present application provides a computer program product which, when run on a terminal device, causes the terminal device to execute the steps of any of the above region-of-interest detection methods.
Compared with the prior art, the embodiments of the present application have the following beneficial effects: an embodiment of the present application acquires a target face image to be detected; calculates the head pose of the target face image; extracts a left-eye image and a right-eye image from the target face image; determines a gaze direction according to the left-eye image, the right-eye image, and the head pose; performs eye key point detection in the left-eye image and the right-eye image respectively to obtain the coordinates of the left pupil center point and the coordinates of the right pupil center point; and determines an eye region of interest according to the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point. In the embodiments of the present application, no expensive precision instrument is required; instead, the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point are obtained through image analysis of a face image, and the eye region of interest is determined from them, which greatly reduces cost and enables much wider application.
Description of the drawings
In order to describe the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a flowchart of an embodiment of a region-of-interest detection method in an embodiment of the present application;
Fig. 2 is a schematic diagram of the 3D coordinate system established in an embodiment of the present application;
Fig. 3 is a schematic diagram of the network structure of the gaze prediction model;
Fig. 4 is a schematic diagram of the gaze direction;
Fig. 5 is a schematic diagram of the pupil center points;
Fig. 6 is a schematic flowchart of determining the eye region of interest according to the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point;
Fig. 7 is a schematic diagram of calculating the coordinates of the eye point of interest;
Fig. 8 is a schematic flowchart of determining the eye region of interest according to the coordinates of the eye point of interest;
Fig. 9 is a structural diagram of an embodiment of a region-of-interest detection apparatus in an embodiment of the present application;
Fig. 10 is a schematic block diagram of a terminal device in an embodiment of the present application.
Embodiments of the present invention
Referring to Fig. 1, an embodiment of a region-of-interest detection method in an embodiment of the present application may include:
Step S101: acquire a target face image to be detected.
In the embodiments of the present application, in order to determine the user's region of interest on a screen, a depth camera may be configured for the screen. The depth camera may be built into the screen or serve as an external device of the screen.
In the embodiments of the present application, a 3D coordinate system as shown in Fig. 2 may be established from the camera coordinate system of the depth camera, and the coordinates of the four corner points of the screen, namely the upper-left (left_up in Fig. 2), upper-right (right_up in Fig. 2), lower-left (left_bottom in Fig. 2), and lower-right (right_bottom in Fig. 2) corners, may be calibrated in this 3D coordinate system in advance.
The execution subject of the embodiments of the present application may be a terminal device connected to the screen in a wired or wireless manner, including but not limited to a desktop computer, a notebook, a palmtop computer, a smartphone, a server, or any other terminal device with data processing capability. In particular, if the screen is a smart screen with data processing capability, it may itself serve as the terminal device executing the embodiments of the present application, without relying on any other external terminal device.
In a specific implementation of the embodiments of the present application, an image of the area around the screen may be captured by the depth camera and face detection performed on that image; if a face is detected, the current face image, that is, the target face image, may be cropped out.
Step S102: calculate the head pose of the target face image.
After the target face image is acquired, 3D facial key point detection may be performed on it, and the head pose of the target face image may be calculated from these 3D key points.
In a specific implementation of the embodiments of the present application, an Iterative Closest Point (ICP) algorithm may be used to calculate the head pose. Specifically, a reference point cloud serving as the comparison baseline is preset, containing the 3D key points used as the baseline. The detected 3D key points are then assembled into the point cloud of the target face image; corresponding points between the two point clouds are determined by the nearest-neighbor criterion, the transformation between them is solved by least squares, and this transformation is applied to rotate the point cloud of the target face image, yielding an updated point cloud. The above process is repeated until a preset termination condition is reached, at which point the iteration stops. Finally, the rotation angles from all iterations are accumulated, and the result is the head pose of the target face image.
It should be noted that the above head pose calculation process is only an example; in practical applications, any head pose calculation method in the prior art may be selected according to the specific situation, which is not specifically limited in the embodiments of the present application.
Here the calculated head pose is denoted as headpose[theta, phi], where theta is the pitch angle of the head (looking up or down) and phi is the yaw angle of the head in the horizontal direction.
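As a rough illustration of the ICP procedure described above (not the exact implementation of the embodiments), the following Python sketch aligns the detected key points to a reference point cloud using nearest-neighbor correspondences and an SVD-based least-squares rigid fit; the convergence threshold and the Euler-angle extraction convention at the end are assumptions made for illustration.

```python
# Minimal ICP sketch (illustrative only): aligns detected 3D face key points
# to a preset reference point cloud and accumulates the rotation as a head pose.
import numpy as np
from scipy.spatial import cKDTree

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst, via SVD."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp_head_pose(face_pts, ref_pts, max_iter=50, tol=1e-6):
    """Iterate nearest-neighbor matching + rigid fit; return pitch/yaw angles."""
    tree = cKDTree(ref_pts)
    pts = face_pts.copy()
    R_total = np.eye(3)
    prev_err = np.inf
    for _ in range(max_iter):
        _, idx = tree.query(pts)          # nearest-neighbor correspondences
        R, t = best_fit_transform(pts, ref_pts[idx])
        pts = pts @ R.T + t               # rotate the face point cloud
        R_total = R @ R_total             # accumulate rotation across iterations
        err = np.mean(np.linalg.norm(pts - ref_pts[idx], axis=1))
        if abs(prev_err - err) < tol:     # preset termination condition
            break
        prev_err = err
    # One possible convention for extracting pitch (theta) and yaw (phi);
    # the embodiments do not fix a specific Euler convention.
    theta = np.degrees(np.arcsin(-R_total[2, 1]))
    phi = np.degrees(np.arctan2(R_total[2, 0], R_total[2, 2]))
    return theta, phi
```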
Step S103: extract the left-eye image and the right-eye image from the target face image.
Taking the extraction of the left-eye image as an example, in a specific implementation of the embodiments of the present application, the left-eye key points may first be selected from the detected 3D key points. Denoting the minimum and maximum abscissas of these left-eye key points as left_x_min and left_x_max, and the minimum and maximum ordinates as left_y_min and left_y_max, the image within the rectangular region (denoted LA1) formed by the following four corner points may be taken as the left-eye image: (left_x_min, left_y_max), (left_x_min, left_y_min), (left_x_max, left_y_max), (left_x_max, left_y_min). Further, since cropping the left-eye image directly from these extreme values may lose edge information, LA1 may be expanded outward to obtain a new rectangular region LA2, and the image within LA2 taken as the left-eye image. The extraction of the right-eye image is similar to that of the left-eye image and is not repeated here.
It should be noted that the above eye image extraction process is only an example; in practical applications, any eye image extraction method in the prior art may be selected according to the specific situation, which is not specifically limited in the embodiments of the present application.
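A minimal sketch of this bounding-box crop, assuming the key points are given as pixel coordinates and using an illustrative margin ratio for the outward expansion of LA1 into LA2:

```python
import numpy as np

def crop_eye(image, eye_points, margin=0.25):
    """Crop an eye region from `image` given its key points (N x 2 pixel coords).

    The tight box LA1 spans the min/max coordinates of the key points; it is
    expanded by `margin` (a fraction of the box size) to form LA2 so that no
    edge information is lost. The value of `margin` is an illustrative choice,
    not one specified by the embodiments.
    """
    x_min, y_min = eye_points.min(axis=0)
    x_max, y_max = eye_points.max(axis=0)
    dx, dy = (x_max - x_min) * margin, (y_max - y_min) * margin
    h, w = image.shape[:2]
    x0 = max(int(x_min - dx), 0)
    y0 = max(int(y_min - dy), 0)
    x1 = min(int(x_max + dx), w)
    y1 = min(int(y_max + dy), h)
    return image[y0:y1, x0:x1]
```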
Step S104: determine the gaze direction according to the left-eye image, the right-eye image, and the head pose.
In the embodiments of the present application, the left-eye image, the right-eye image, and the head pose may be input into a pre-trained gaze prediction model for processing to obtain the gaze direction.
As shown in Fig. 3, the gaze prediction model uses a multi-input neural network structure. The model first extracts feature information from the left-eye image and the right-eye image respectively to obtain left-eye feature information and right-eye feature information, then fuses the left-eye feature information and the right-eye feature information to obtain binocular feature information, and finally fuses the binocular feature information and the head pose to obtain the gaze direction, as shown in Fig. 4.
It should be noted that, in the embodiments of the present application, the two eyes are assumed to share the same gaze direction; the case of crossed eyes is not considered.
The specific processing of the gaze prediction model is now described with reference to Fig. 3:
For the left-eye image, a ResNet18 block (ResNet18 Block in Fig. 3) is used to extract feature information from the left-eye image (Left eye in Fig. 3); the extracted feature information is then passed in sequence through average pooling (Avg_pooling in Fig. 3), a fully connected layer (FC_Left in Fig. 3), batch normalization (BN_Left in Fig. 3), and an activation function (Relu_Left in Fig. 3) to obtain the left-eye feature information.
For the right-eye image, a ResNet18 block (ResNet18 Block in Fig. 3) is used to extract feature information from the right-eye image (Right eye in Fig. 3); the extracted feature information is then passed in sequence through average pooling (Avg_pooling in Fig. 3), a fully connected layer (FC_Right in Fig. 3), batch normalization (BN_Right in Fig. 3), and an activation function (Relu_Right in Fig. 3) to obtain the right-eye feature information.
After the left-eye feature information and the right-eye feature information are obtained, they are concatenated (EyesConcat in Fig. 3) so that the two are fused, and the fused information is passed through a fully connected layer (EyesFc1 in Fig. 3) to obtain the binocular feature information.
After the binocular feature information is obtained, it is concatenated with the head pose (HeadPose in Fig. 3) via HeadConcat in Fig. 3 so that the two are fused, and the fused information is passed through batch normalization (BN_Head in Fig. 3), an activation function (Relu_Head in Fig. 3), and a fully connected layer (Fc_Head in Fig. 3) to obtain the gaze direction (Gaze in Fig. 3).
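A compact PyTorch sketch of a network with this shape is given below. It follows the block sequence just described, but the layer widths, the use of separate (rather than shared) ResNet18 trunks, and the two-angle output head are assumptions made for illustration, not the exact architecture of the embodiments.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class GazeNet(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # ResNet18 trunks up to global average pooling (512-d features).
        self.left_trunk = nn.Sequential(*list(resnet18(weights=None).children())[:-1])
        self.right_trunk = nn.Sequential(*list(resnet18(weights=None).children())[:-1])
        # Per-eye FC + BN + ReLU heads (FC_Left/BN_Left/Relu_Left and right-eye twins).
        self.left_head = nn.Sequential(nn.Linear(512, feat_dim),
                                       nn.BatchNorm1d(feat_dim), nn.ReLU())
        self.right_head = nn.Sequential(nn.Linear(512, feat_dim),
                                        nn.BatchNorm1d(feat_dim), nn.ReLU())
        self.eyes_fc = nn.Linear(2 * feat_dim, feat_dim)   # EyesFc1 after EyesConcat
        # BN + ReLU + FC after concatenating the 2-d head pose (HeadConcat).
        self.head_bn = nn.BatchNorm1d(feat_dim + 2)
        self.out_fc = nn.Linear(feat_dim + 2, 2)           # Gaze: [gaze_theta, gaze_phi]

    def forward(self, left_eye, right_eye, head_pose):
        l = self.left_head(self.left_trunk(left_eye).flatten(1))
        r = self.right_head(self.right_trunk(right_eye).flatten(1))
        eyes = self.eyes_fc(torch.cat([l, r], dim=1))      # binocular features
        x = torch.cat([eyes, head_pose], dim=1)            # fuse with head pose
        return self.out_fc(torch.relu(self.head_bn(x)))
```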
Here the calculated gaze direction is denoted as Gaze[gaze_theta, gaze_phi], where gaze_theta is the pitch angle of the gaze (looking up or down) and gaze_phi is the yaw angle of the gaze in the horizontal direction.
For convenience in subsequent calculations, the gaze direction may be converted from angle form to vector form according to the following formulas:
vectorx = cos(gaze_theta)*sin(gaze_phi)
vectory = sin(gaze_theta)
vectorz = cos(gaze_theta)*cos(gaze_phi)
init_vector = (vectorx, vectory, vectorz)
where init_vector is the gaze direction in vector form, and its components on the x-axis, y-axis, and z-axis are vectorx, vectory, and vectorz respectively.
Preferably, the gaze direction in vector form may also be normalized according to the following formulas to obtain the normalized gaze direction vector:
norm = sqrt(vectorx^2 + vectory^2 + vectorz^2)
gaze_vector = init_vector / norm
where norm is the magnitude of the gaze direction vector, and gaze_vector is the normalized gaze direction vector, whose components on the x-axis, y-axis, and z-axis are gaze_vector[x], gaze_vector[y], and gaze_vector[z] respectively.
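These two conversions translate directly into code; a minimal sketch (the function name is assumed for illustration):

```python
import numpy as np

def gaze_angles_to_vector(gaze_theta, gaze_phi):
    """Convert gaze pitch/yaw angles (radians) to a normalized 3D direction."""
    init_vector = np.array([
        np.cos(gaze_theta) * np.sin(gaze_phi),   # vectorx
        np.sin(gaze_theta),                      # vectory
        np.cos(gaze_theta) * np.cos(gaze_phi),   # vectorz
    ])
    return init_vector / np.linalg.norm(init_vector)   # gaze_vector
```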
Through the processing shown in Fig. 3, the left-eye feature information, the right-eye feature information, and the head pose are fused and considered jointly when predicting the gaze direction, which greatly improves the accuracy of the final prediction.
Preferably, before step S104, a training sample set may be constructed in advance and used to train an initial gaze prediction model to obtain the pre-trained gaze prediction model.
The training sample set includes SN training samples, each training sample includes a pre-collected left-eye image, right-eye image, and head pose of a subject, the label of each training sample is a pre-calibrated gaze direction, and SN is a positive integer.
The training of a neural network is a well-established technique; any neural network training method in the prior art may be referred to for details, which are not repeated in the embodiments of the present application.
Through this training process, a large number of samples from actual tests are collected in advance to construct the training sample set, and the gaze prediction model is trained on this measured data, so that the resulting model better matches real conditions and the gaze direction detected on its basis is more accurate.
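For illustration, a bare-bones supervised training loop over such a sample set might look as follows, using the GazeNet sketch above; the dataset object, the loss choice, and the optimizer settings are assumptions, not details specified by the embodiments.

```python
import torch
from torch.utils.data import DataLoader

# `train_set` is assumed to yield (left_eye, right_eye, head_pose, gaze_label)
# tuples, with gaze_label = [gaze_theta, gaze_phi] calibrated for each sample.
model = GazeNet()
loader = DataLoader(train_set, batch_size=64, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.MSELoss()   # regression of the two gaze angles

for epoch in range(20):
    for left_eye, right_eye, head_pose, gaze_label in loader:
        optimizer.zero_grad()
        pred = model(left_eye, right_eye, head_pose)
        loss = criterion(pred, gaze_label)
        loss.backward()
        optimizer.step()
```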
Step S105: perform eye key point detection in the left-eye image and the right-eye image respectively to obtain the coordinates of the left pupil center point and the coordinates of the right pupil center point.
In the embodiments of the present application, an eye landmark model (EyeLandMarkModel, ELM) is preferably used to perform eye key point detection in the left-eye image and the right-eye image respectively, so as to obtain the coordinates of the left pupil center point (denoted left_iris_center) and the coordinates of the right pupil center point (denoted right_iris_center), as shown in Fig. 5. It should be noted that, since a depth camera is used in the present application, both sets of coordinates are three-dimensional and may be denoted as (x_left, y_left, z_left) and (x_right, y_right, z_right), where x_left, y_left, z_left are the coordinates of the left pupil center point on the x-axis, y-axis, and z-axis, and x_right, y_right, z_right are the coordinates of the right pupil center point on the x-axis, y-axis, and z-axis.
Step S106: determine the eye region of interest according to the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point.
In a specific implementation of the embodiments of the present application, step S106 may specifically include the process shown in Fig. 6:
Step S1061: calculate the coordinates of the binocular pupil center point according to the coordinates of the left pupil center point and the coordinates of the right pupil center point.
Specifically, the coordinates of the binocular pupil center point may be calculated according to the following formula:
middle_pos = (left_iris_center + right_iris_center) / 2
where middle_pos is the coordinates of the binocular pupil center point, middle_pos = (x_middle, y_middle, z_middle), and x_middle, y_middle, z_middle are its coordinates on the x-axis, y-axis, and z-axis respectively.
Step S1062: calculate the point-to-plane distance between the binocular pupil center point and the preset screen according to the coordinates of the binocular pupil center point.
Denoting the normal vector of the plane in which the screen lies as n = (A, B, C), the point-to-plane distance between the binocular pupil center point and the screen may be calculated according to the following formula:
iris_distance = (A*x_middle + B*y_middle + C*z_middle) / sqrt(A^2 + B^2 + C^2)
where sqrt is the square root function and iris_distance is the point-to-plane distance between the binocular pupil center point and the screen.
Step S1063: calculate the coordinates of the eye point of interest according to the gaze direction, the coordinates of the binocular pupil center point, and the point-to-plane distance.
The eye point of interest is the projection of the gaze onto the screen. Specifically, its coordinates may be calculated according to the following formula:
project_3d = middle_pos + gaze_vector*(iris_distance/gaze_vector[z])
where project_3d is the coordinates of the eye point of interest.
Fig. 7 is a schematic diagram of calculating the coordinates of the eye point of interest. Through this process, the coordinates of the eye point of interest are computed precisely from the geometric spatial relationships, and the eye region of interest determined on this basis is accordingly more accurate.
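Steps S1061 to S1063 reduce to a few lines of vector arithmetic. The sketch below expresses the screen plane as A*x + B*y + C*z + D = 0 in the camera frame; the formulas above correspond to the case D = 0 (a plane through the camera origin), so D is included here only for generality.

```python
import numpy as np

def eye_point_of_interest(left_iris_center, right_iris_center,
                          gaze_vector, plane):
    """Project the gaze ray onto the screen plane.

    `plane` = (A, B, C, D) describes the screen plane A*x + B*y + C*z + D = 0;
    the embodiments' formulas use D = 0. `gaze_vector` must be normalized.
    """
    A, B, C, D = plane
    n = np.array([A, B, C])
    # Step S1061: binocular pupil center point.
    middle_pos = (np.asarray(left_iris_center) + np.asarray(right_iris_center)) / 2
    # Step S1062: point-to-plane distance from the pupil center to the screen.
    iris_distance = (n @ middle_pos + D) / np.linalg.norm(n)
    # Step S1063: walk along the gaze direction until the screen plane is hit.
    project_3d = middle_pos + gaze_vector * (iris_distance / gaze_vector[2])
    return project_3d
```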
Step S1064: determine the eye region of interest according to the coordinates of the eye point of interest.
In a specific implementation of the embodiments of the present application, step S1064 may specifically include the process shown in Fig. 8:
Step S10641: convert the coordinates of the eye point of interest according to the coordinates of the preset reference pixel to obtain the pixel position of the eye point of interest on the screen.
The reference pixel may be any one of the four corner points shown in Fig. 2; here the upper-left corner point is preferably chosen as the reference pixel.
When performing the coordinate conversion, the first distance and the second distance may first be calculated according to the coordinates of the reference pixel and the coordinates of the eye point of interest; the pixel position of the eye point of interest in the direction of the first coordinate axis is then calculated from the first distance and a preset first conversion coefficient, and the pixel position of the eye point of interest in the direction of the second coordinate axis is calculated from the second distance and a preset second conversion coefficient.
The first distance is the distance between the reference pixel and the eye point of interest in the direction of the preset first coordinate axis (the x-axis in Fig. 2); the second distance is the distance between the reference pixel and the eye point of interest in the direction of the preset second coordinate axis (the y-axis in Fig. 2). The first conversion coefficient is the number of pixels per unit length in the direction of the first coordinate axis; the second conversion coefficient is the number of pixels per unit length in the direction of the second coordinate axis.
The specific coordinate conversion formulas are as follows:
project_pixel[x] = (project_3d[x] - left_up[x]) * scalex
project_pixel[y] = (project_3d[y] - left_up[y]) * scaley
where project_3d[x] is the coordinate of the eye point of interest in the direction of the first coordinate axis, left_up[x] is the coordinate of the reference pixel in the direction of the first coordinate axis, scalex is the first conversion coefficient, project_pixel[x] is the pixel position of the eye point of interest in the direction of the first coordinate axis, project_3d[y] is the coordinate of the eye point of interest in the direction of the second coordinate axis, left_up[y] is the coordinate of the reference pixel in the direction of the second coordinate axis, scaley is the second conversion coefficient, and project_pixel[y] is the pixel position of the eye point of interest in the direction of the second coordinate axis.
Through the above coordinate conversion, the pixel position of the eye point of interest on the screen is obtained, from which the eye region of interest can be determined accurately.
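In code, the conversion is a subtraction and a scale. The conversion coefficients can be derived from the calibrated corner points and the screen resolution; the derivation of scalex and scaley below is an assumption consistent with their definition as pixels per unit length, since the embodiments simply treat them as preset values.

```python
def to_pixel(project_3d, left_up, right_bottom, resolution):
    """Map a 3D point on the screen plane to a pixel position.

    `left_up` / `right_bottom` are the calibrated 3D corner points and
    `resolution` = (MaxX, MaxY) is the screen resolution.
    """
    max_x, max_y = resolution
    scalex = max_x / (right_bottom[0] - left_up[0])   # pixels per unit length, x
    scaley = max_y / (right_bottom[1] - left_up[1])   # pixels per unit length, y
    px = (project_3d[0] - left_up[0]) * scalex        # project_pixel[x]
    py = (project_3d[1] - left_up[1]) * scaley        # project_pixel[y]
    return px, py
```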
Step S10642: judge, according to the preset screen resolution, whether the pixel position is within the range of the screen.
Denoting the screen resolution as MaxX × MaxY, if the pixel position satisfies 0 < project_pixel[x] < MaxX and 0 < project_pixel[y] < MaxY, it can be judged to be within the range of the screen; otherwise, it can be judged to be outside the range of the screen.
If the pixel position is not within the range of the screen, the user is not paying attention to the content on the screen and no further processing is needed; if the pixel position is within the range of the screen, the subsequent steps are carried out.
Step S10643: determine the screen region where the pixel position is located according to the preset screen region division rule.
Step S10644: determine the screen region where the pixel position is located as the eye region of interest.
In the embodiments of the present application, the screen may be divided in advance into KN screen regions (KN being an integer greater than 1), numbered from top to bottom and left to right as screen region 1, screen region 2, ..., screen region k, ..., screen region KN, where 1 ≤ k ≤ KN. If the pixel position falls within screen region 1, screen region 1 may be determined as the eye region of interest; if it falls within screen region 2, screen region 2 may be determined as the eye region of interest; ...; if it falls within screen region k, screen region k may be determined as the eye region of interest; ...; and if it falls within screen region KN, screen region KN may be determined as the eye region of interest.
With the range of each screen region set in advance, once the pixel position of the eye point of interest on the screen has been computed, it is only necessary to determine which screen region that pixel position falls in to obtain the corresponding eye region of interest. The computation involved is minimal, which greatly improves the efficiency of region-of-interest detection.
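Assuming the division rule is a regular grid of rows × cols cells (one possible rule; the embodiments only require that the region ranges be preset), the bounds check of step S10642 and the region lookup of steps S10643 and S10644 reduce to:

```python
def screen_region(px, py, resolution, rows, cols):
    """Return the 1-based screen region index for a pixel position,
    or None if the position lies outside the screen."""
    max_x, max_y = resolution
    if not (0 < px < max_x and 0 < py < max_y):
        return None                      # gaze is off-screen; nothing to do
    col = int(px * cols / max_x)         # 0-based column of the grid cell
    row = int(py * rows / max_y)         # 0-based row of the grid cell
    return row * cols + col + 1          # numbered top-to-bottom, left-to-right
```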
Further, after the eye region of interest is determined, facial feature information may also be extracted from the target face image, and user information corresponding to the target face image determined from the facial feature information; such user information includes but is not limited to age, gender, and the like.
In a specific application of the embodiments of the present application, the divided screen regions may be used to display different pieces of information, including but not limited to advertisements, news, announcements, and so on. After the eye region of interest is determined, the screen display information corresponding to the eye region of interest can be further determined, and a correspondence established between the user information and the screen display information.
In this way, a large amount of statistical data can be collected, for example, how many times each piece of screen display information has been viewed and by what type of user (user types may be divided by age, gender, etc.). Using these statistics as the basis for replacing and placing screen display information greatly improves the accuracy and efficiency of its placement.
In summary, the embodiments of the present application acquire a target face image to be detected; calculate the head pose of the target face image; extract a left-eye image and a right-eye image from the target face image; determine a gaze direction according to the left-eye image, the right-eye image, and the head pose; perform eye key point detection in the left-eye image and the right-eye image respectively to obtain the coordinates of the left pupil center point and the coordinates of the right pupil center point; and determine an eye region of interest according to the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point. No expensive precision instrument is required; instead, the gaze direction and the pupil center coordinates are obtained through image analysis of a face image and the eye region of interest is determined from them, which greatly reduces cost and enables much wider application.
It should be understood that the numbering of the steps in the above embodiments does not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Corresponding to the region-of-interest detection method described in the above embodiments, Fig. 9 shows a structural diagram of an embodiment of a region-of-interest detection apparatus provided in an embodiment of the present application.
In this embodiment, a region-of-interest detection apparatus may include:
a face image acquisition module 901, configured to acquire a target face image to be detected;
a head pose calculation module 902, configured to calculate the head pose of the target face image;
an eye image extraction module 903, configured to extract a left-eye image and a right-eye image from the target face image;
a gaze direction determination module 904, configured to determine a gaze direction according to the left-eye image, the right-eye image, and the head pose;
an eye key point detection module 905, configured to perform eye key point detection in the left-eye image and the right-eye image respectively to obtain the coordinates of the left pupil center point and the coordinates of the right pupil center point;
an eye region-of-interest determination module 906, configured to determine an eye region of interest according to the gaze direction, the coordinates of the left pupil center point, and the coordinates of the right pupil center point.
Further, the eye region-of-interest determination module may include:
a center point coordinate calculation submodule, configured to calculate the coordinates of the binocular pupil center point according to the coordinates of the left pupil center point and the coordinates of the right pupil center point;
a point-to-plane distance calculation submodule, configured to calculate the point-to-plane distance between the binocular pupil center point and a preset screen according to the coordinates of the binocular pupil center point;
an eye point-of-interest coordinate calculation submodule, configured to calculate the coordinates of the eye point of interest according to the gaze direction, the coordinates of the binocular pupil center point, and the point-to-plane distance;
an eye region-of-interest determination submodule, configured to determine the eye region of interest according to the coordinates of the eye point of interest.
Further, the eye region-of-interest determination submodule may include:
a coordinate conversion unit, configured to convert the coordinates of the eye point of interest according to the coordinates of a preset reference pixel to obtain the pixel position of the eye point of interest on the screen;
a pixel position judgment unit, configured to judge, according to a preset screen resolution, whether the pixel position is within the range of the screen;
a screen region determination unit, configured to determine, if the pixel position is within the range of the screen, the screen region where the pixel position is located according to a preset screen region division rule;
an eye region-of-interest determination unit, configured to determine the screen region where the pixel position is located as the eye region of interest.
Further, the coordinate conversion unit may include:
a distance calculation subunit, configured to calculate a first distance and a second distance according to the coordinates of the reference pixel and the coordinates of the eye point of interest, where the first distance is the distance between the reference pixel and the eye point of interest in the direction of a preset first coordinate axis, and the second distance is the distance between the reference pixel and the eye point of interest in the direction of a preset second coordinate axis;
a first pixel position calculation subunit, configured to calculate the pixel position of the eye point of interest in the direction of the first coordinate axis according to the first distance and a preset first conversion coefficient;
a second pixel position calculation subunit, configured to calculate the pixel position of the eye point of interest in the direction of the second coordinate axis according to the second distance and a preset second conversion coefficient.
Further, the gaze direction determination module is specifically configured to input the left-eye image, the right-eye image, and the head pose into a pre-trained gaze prediction model for processing to obtain the gaze direction;
the gaze direction determination module may include:
a feature information extraction submodule, configured to extract feature information from the left-eye image and the right-eye image respectively to obtain left-eye feature information and right-eye feature information;
a binocular feature information determination submodule, configured to fuse the left-eye feature information and the right-eye feature information to obtain binocular feature information;
a gaze direction determination submodule, configured to fuse the binocular feature information and the head pose to obtain the gaze direction.
Further, the region-of-interest detection apparatus may further include:
a sample set construction module, configured to construct a training sample set, where the training sample set includes SN training samples, each training sample includes a pre-collected left-eye image, right-eye image, and head pose of a subject, the label of each training sample is a pre-calibrated gaze direction, and SN is a positive integer;
a model training module, configured to train an initial gaze prediction model using the training sample set to obtain the pre-trained gaze prediction model.
Further, the region-of-interest detection apparatus may further include:
a facial feature information extraction module, configured to extract facial feature information from the target face image;
a user information determination module, configured to determine user information corresponding to the target face image according to the facial feature information;
a screen display information determination module, configured to determine screen display information corresponding to the eye region of interest;
a correspondence establishment module, configured to establish a correspondence between the user information and the screen display information.
Those skilled in the art can clearly understand that, for convenience and brevity of description, reference may be made to the corresponding processes in the foregoing method embodiments for the specific working processes of the apparatus, modules, and units described above, which are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts not detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
Fig. 10 shows a schematic block diagram of a terminal device provided in an embodiment of the present application; for ease of description, only the parts related to the embodiments of the present application are shown.
As shown in Fig. 10, the terminal device 10 of this embodiment includes a processor 100, a memory 101, and a computer program 102 stored in the memory 101 and executable on the processor 100. When the processor 100 executes the computer program 102, the steps in the above embodiments of the region-of-interest detection method are implemented, for example steps S101 to S106 shown in Fig. 1. Alternatively, when the processor 100 executes the computer program 102, the functions of the modules/units in the above apparatus embodiments are implemented, for example the functions of modules 901 to 906 shown in Fig. 9.
Exemplarily, the computer program 102 may be divided into one or more modules/units, which are stored in the memory 101 and executed by the processor 100 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments being used to describe the execution process of the computer program 102 in the terminal device 10.
The terminal device 10 may be a computing device such as a desktop computer, a notebook, a palmtop computer, a smartphone, a server, or a smart screen. Those skilled in the art can understand that Fig. 10 is only an example of the terminal device 10 and does not constitute a limitation on it; the terminal device may include more or fewer components than shown, combine certain components, or have different components. For example, the terminal device 10 may further include input/output devices, network access devices, buses, and the like.
所述处理器100可以是中央处理单元(Central Processing Unit,CPU),还可以是其它通用处理器、数字信号处理器 (Digital Signal Processor,DSP)、专用集成电路 (Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA) 或者其它可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。所述处理器100可以是所述终端设备10的神经中枢和指挥中心,所述处理器100可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。The processor 100 may be a central processing unit (Central Processing Unit, CPU), it can also be other general-purpose processors, digital signal processors (Digital Signal Processors, DSPs), and application-specific integrated circuits (Application Specific Integrated Circuit (ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like. The processor 100 may be the nerve center and command center of the terminal device 10, and the processor 100 may generate operation control signals according to instruction operation codes and timing signals, and complete the control of fetching instructions and executing instructions.
The memory 101 may be an internal storage unit of the terminal device 10, such as a hard disk or an internal memory of the terminal device 10. The memory 101 may also be an external storage device of the terminal device 10, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card equipped on the terminal device 10. Further, the memory 101 may include both an internal storage unit and an external storage device of the terminal device 10. The memory 101 is used to store the computer program and other programs and data required by the terminal device 10, and may also be used to temporarily store data that has been output or is to be output.
The terminal device 10 may further include a communication module, which may provide communication solutions applied to network devices, including wireless local area networks (WLAN, such as Wi-Fi networks), Bluetooth, ZigBee, mobile communication networks, global navigation satellite systems (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR) and the like. The communication module may be one or more devices integrating at least one communication processing module. The communication module may include an antenna, which may have a single element or be an antenna array with multiple elements. The communication module may receive electromagnetic waves through the antenna, frequency-modulate and filter the electromagnetic wave signals, and send the processed signals to the processor; it may also receive signals to be sent from the processor, frequency-modulate and amplify them, and convert them into electromagnetic waves radiated through the antenna.
The terminal device 10 may further include a power management module, which may receive input from an external power supply, a battery and/or a charger, and supply power to the processor, the memory, the communication module and the like.
The terminal device 10 may further include a display module, which may be used to display information input by the user or information provided to the user. The display module may include a display panel, which may optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, a touch panel may cover the display panel; when the touch panel detects a touch operation on or near it, the operation is transmitted to the processor to determine the type of touch event, and the processor then provides a corresponding visual output on the display panel according to that type.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the above division of functional units and modules is used as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may physically exist alone, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only intended to distinguish them from one another, and are not intended to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding process in the foregoing method embodiments, which will not be repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not detailed or recorded in one embodiment, reference may be made to the related descriptions of other embodiments.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are merely illustrative; the division of the modules or units is only a logical function division, and there may be other divisions in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may physically exist alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
An embodiment of the present application provides a computer program product which, when run on the terminal device, enables the terminal device to implement the steps in the foregoing method embodiments.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the foregoing method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium may be appropriately added or deleted according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of the technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (10)

  1. A method for detecting a region of interest, characterized by comprising:
    acquiring a target face image to be detected;
    calculating a head pose of the target face image;
    extracting a left-eye image and a right-eye image from the target face image;
    determining a gaze direction according to the left-eye image, the right-eye image and the head pose;
    performing eye key point detection in the left-eye image and the right-eye image respectively, to obtain coordinates of a left-eye pupil center point and coordinates of a right-eye pupil center point;
    determining an eye region of interest according to the gaze direction, the coordinates of the left-eye pupil center point and the coordinates of the right-eye pupil center point.
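By way of illustration only, the six steps of claim 1 can be read as a straightforward pipeline. The Python sketch below shows one possible wiring; every component (face_detector, pose_estimator, gaze_model, landmark_model, screen) is a hypothetical placeholder introduced here for clarity, not a module named in the application.

```python
def detect_region_of_interest(frame, face_detector, pose_estimator,
                              gaze_model, landmark_model, screen):
    """Hypothetical end-to-end sketch of the six claimed steps."""
    # Step 1: acquire the target face image to be detected.
    face_img = face_detector.crop_face(frame)
    # Step 2: calculate the head pose of the target face image.
    head_pose = pose_estimator.estimate(face_img)
    # Step 3: extract the left-eye and right-eye images.
    left_eye, right_eye = face_detector.crop_eyes(face_img)
    # Step 4: determine the gaze direction from both eyes and the head pose.
    gaze_dir = gaze_model.predict(left_eye, right_eye, head_pose)
    # Step 5: detect eye key points to obtain both pupil center coordinates.
    left_pupil = landmark_model.pupil_center(left_eye)
    right_pupil = landmark_model.pupil_center(right_eye)
    # Step 6: determine the eye region of interest from the gaze direction
    # and the two pupil centers.
    return screen.region_for(gaze_dir, left_pupil, right_pupil)
```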
  2. The method for detecting a region of interest according to claim 1, wherein determining the eye region of interest according to the gaze direction, the coordinates of the left-eye pupil center point and the coordinates of the right-eye pupil center point comprises:
    calculating coordinates of a binocular pupil center point according to the coordinates of the left-eye pupil center point and the coordinates of the right-eye pupil center point;
    calculating a point-to-plane distance between the binocular pupil center point and a preset screen according to the coordinates of the binocular pupil center point;
    calculating coordinates of an eye point of interest according to the gaze direction, the coordinates of the binocular pupil center point and the point-to-plane distance;
    determining the eye region of interest according to the coordinates of the eye point of interest.
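A minimal numeric sketch of the geometry recited in claim 2, assuming (as a simplification not stated in the claim) that coordinates are expressed in a camera-centered system whose z = 0 plane coincides with the screen, and that the gaze direction is a 3-D vector pointing toward the screen:

```python
import numpy as np

def eye_point_of_interest(left_pupil, right_pupil, gaze_dir):
    """Intersect the gaze ray with the screen plane z = 0 (illustrative)."""
    left_pupil = np.asarray(left_pupil, dtype=float)
    right_pupil = np.asarray(right_pupil, dtype=float)
    gaze_dir = np.asarray(gaze_dir, dtype=float)

    # Binocular pupil center: midpoint of the left and right pupil centers.
    center = (left_pupil + right_pupil) / 2.0
    # Point-to-plane distance to the screen; with the plane z = 0 it is |z|.
    distance = abs(center[2])
    # Walk from the binocular center along the gaze direction until the
    # z-component has covered that distance (the gaze must not be parallel
    # to the screen, i.e. gaze_dir[2] != 0).
    t = distance / abs(gaze_dir[2])
    return center + t * gaze_dir

# Example: eyes 500 mm in front of the screen, looking slightly down-right.
print(eye_point_of_interest([-30, 0, 500], [30, 0, 500], [0.1, -0.2, -1.0]))
# -> [  50. -100.    0.]  (x/y land in the screen plane)
```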
  3. The method for detecting a region of interest according to claim 2, wherein determining the eye region of interest according to the coordinates of the eye point of interest comprises:
    converting the coordinates of the eye point of interest according to coordinates of a preset reference pixel point, to obtain a pixel position of the eye point of interest on the screen;
    judging, according to a preset screen resolution, whether the pixel position is within the range of the screen;
    if the pixel position is within the range of the screen, determining, according to a preset screen region division rule, the screen region in which the pixel position is located;
    determining the screen region in which the pixel position is located as the eye region of interest.
  4. The method for detecting a region of interest according to claim 3, wherein converting the coordinates of the eye point of interest according to the coordinates of the preset reference pixel point comprises:
    calculating a first distance and a second distance according to the coordinates of the reference pixel point and the coordinates of the eye point of interest, the first distance being the distance between the reference pixel point and the eye point of interest along a preset first coordinate axis, and the second distance being the distance between the reference pixel point and the eye point of interest along a preset second coordinate axis;
    calculating the pixel position of the eye point of interest along the first coordinate axis according to the first distance and a preset first conversion coefficient;
    calculating the pixel position of the eye point of interest along the second coordinate axis according to the second distance and a preset second conversion coefficient.
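Claims 3 and 4 amount to a linear map from physical screen coordinates to pixels, followed by a region lookup. In the sketch below, taking the reference pixel point as the screen's top-left corner at pixel (0, 0), using pixels-per-millimetre conversion coefficients, and dividing the screen into a 3 x 3 grid are all assumptions made for illustration, not limitations of the claims:

```python
def to_pixel_and_region(poi_xy, ref_xy, px_per_mm_x, px_per_mm_y,
                        resolution=(1920, 1080), grid=(3, 3)):
    """Map an eye point of interest to a pixel position and a screen region."""
    # First and second distances along the two preset coordinate axes.
    dx = poi_xy[0] - ref_xy[0]
    dy = poi_xy[1] - ref_xy[1]
    # First and second conversion coefficients turn distance into pixels.
    px = int(round(dx * px_per_mm_x))
    py = int(round(dy * px_per_mm_y))
    # Outside the preset screen resolution: no region of interest.
    width, height = resolution
    if not (0 <= px < width and 0 <= py < height):
        return (px, py), None
    # Preset region-division rule, here an illustrative 3 x 3 grid,
    # numbered row by row from the top-left.
    cols, rows = grid
    region = (py * rows // height) * cols + (px * cols // width)
    return (px, py), region
```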
  5. The method for detecting a region of interest according to claim 1, wherein determining the gaze direction according to the left-eye image, the right-eye image and the head pose comprises:
    inputting the left-eye image, the right-eye image and the head pose into a pre-trained gaze prediction model for processing, to obtain the gaze direction;
    wherein the processing of the gaze prediction model comprises:
    extracting feature information from the left-eye image and the right-eye image respectively, to obtain left-eye feature information and right-eye feature information;
    fusing the left-eye feature information and the right-eye feature information to obtain binocular feature information;
    fusing the binocular feature information and the head pose to obtain the gaze direction.
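The two-stream fusion described in claim 5 could take many forms; the application does not prescribe a specific network. Purely as an illustration, a compact PyTorch sketch with arbitrary layer sizes (and a weight-shared eye extractor, which is an editorial choice) might look like this:

```python
import torch
import torch.nn as nn

class GazeNet(nn.Module):
    """Illustrative gaze prediction model: per-eye feature extraction,
    binocular fusion, then fusion with the head pose."""

    def __init__(self):
        super().__init__()
        # Feature extractor applied to each single-channel eye image.
        self.eye_net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fuse_eyes = nn.Linear(32 * 2, 64)  # binocular feature fusion
        self.fuse_pose = nn.Linear(64 + 3, 3)   # fuse with pitch/yaw/roll

    def forward(self, left_eye, right_eye, head_pose):
        left = self.eye_net(left_eye)    # left-eye feature information
        right = self.eye_net(right_eye)  # right-eye feature information
        both = torch.relu(self.fuse_eyes(torch.cat([left, right], dim=1)))
        gaze = self.fuse_pose(torch.cat([both, head_pose], dim=1))
        # Normalize to a unit gaze direction vector.
        return gaze / gaze.norm(dim=1, keepdim=True)
```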
  6. The method for detecting a region of interest according to any one of claims 1 to 5, wherein after determining the eye region of interest according to the gaze direction, the coordinates of the left-eye pupil center point and the coordinates of the right-eye pupil center point, the method further comprises:
    extracting face feature information from the target face image;
    determining user information corresponding to the target face image according to the face feature information;
    determining screen display information corresponding to the eye region of interest;
    establishing a correspondence between the user information and the screen display information.
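Claim 6 is essentially bookkeeping: link the recognized viewer to whatever content occupied the watched region. A toy sketch of that record-keeping, with the recognizer and the content lookup both hypothetical, could be:

```python
from collections import defaultdict

view_log = defaultdict(list)  # user_id -> list of content items viewed

def log_attention(face_img, region, recognizer, screen_content):
    """Record which on-screen content a recognized user looked at."""
    features = recognizer.extract(face_img)   # face feature information
    user_id = recognizer.identify(features)   # user matching those features
    content = screen_content[region]          # e.g. the ad shown in that region
    view_log[user_id].append(content)         # user <-> display correspondence
```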
  7. An apparatus for detecting a region of interest, characterized by comprising:
    a face image acquisition module, configured to acquire a target face image to be detected;
    a head pose calculation module, configured to calculate a head pose of the target face image;
    an eye image extraction module, configured to extract a left-eye image and a right-eye image from the target face image;
    a gaze direction determination module, configured to determine a gaze direction according to the left-eye image, the right-eye image and the head pose;
    an eye key point detection module, configured to perform eye key point detection in the left-eye image and the right-eye image respectively, to obtain coordinates of a left-eye pupil center point and coordinates of a right-eye pupil center point;
    an eye region of interest determination module, configured to determine an eye region of interest according to the gaze direction, the coordinates of the left-eye pupil center point and the coordinates of the right-eye pupil center point.
  8. The apparatus for detecting a region of interest according to claim 7, wherein the eye region of interest determination module comprises:
    a center point coordinate calculation sub-module, configured to calculate coordinates of a binocular pupil center point according to the coordinates of the left-eye pupil center point and the coordinates of the right-eye pupil center point;
    a point-to-plane distance calculation sub-module, configured to calculate a point-to-plane distance between the binocular pupil center point and a preset screen according to the coordinates of the binocular pupil center point;
    an eye point of interest coordinate calculation sub-module, configured to calculate coordinates of an eye point of interest according to the gaze direction, the coordinates of the binocular pupil center point and the point-to-plane distance;
    an eye region of interest determination sub-module, configured to determine the eye region of interest according to the coordinates of the eye point of interest.
  9. A computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the steps of the method for detecting a region of interest according to any one of claims 1 to 6 are implemented.
  10. A terminal device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein, when the processor executes the computer program, the steps of the method for detecting a region of interest according to any one of claims 1 to 6 are implemented.
PCT/CN2020/124098 2019-11-21 2020-10-27 Region of concern detection method and apparatus, and readable storage medium and terminal device WO2021098454A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911151904.3 2019-11-21
CN201911151904.3A CN111046744B (en) 2019-11-21 2019-11-21 Method and device for detecting attention area, readable storage medium and terminal equipment

Publications (1)

Publication Number Publication Date
WO2021098454A1 true

Family

ID=70232071

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124098 WO2021098454A1 (en) 2019-11-21 2020-10-27 Region of concern detection method and apparatus, and readable storage medium and terminal device

Country Status (2)

Country Link
CN (1) CN111046744B (en)
WO (1) WO2021098454A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113516074A (en) * 2021-07-08 2021-10-19 西安邮电大学 Online examination system anti-cheating method based on pupil tracking
CN116820246A (en) * 2023-07-06 2023-09-29 上海仙视电子科技有限公司 Screen adjustment control method and device with self-adaptive visual angle

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110909611B (en) * 2019-10-29 2021-03-05 深圳云天励飞技术有限公司 Method and device for detecting attention area, readable storage medium and terminal equipment
CN111046744B (en) * 2019-11-21 2023-04-18 深圳云天励飞技术股份有限公司 Method and device for detecting attention area, readable storage medium and terminal equipment
CN111680546A (en) * 2020-04-26 2020-09-18 北京三快在线科技有限公司 Attention detection method, attention detection device, electronic equipment and storage medium
CN111626240B (en) * 2020-05-29 2023-04-07 歌尔科技有限公司 Face image recognition method, device and equipment and readable storage medium
CN111767820A (en) * 2020-06-23 2020-10-13 京东数字科技控股有限公司 Method, device, equipment and storage medium for identifying object concerned
CN111767821B (en) * 2020-06-23 2024-04-09 京东科技控股股份有限公司 Method, device, equipment and storage medium for identifying focused object
CN111796874A (en) * 2020-06-28 2020-10-20 北京百度网讯科技有限公司 Equipment awakening method and device, computer equipment and storage medium
CN111881763A (en) 2020-06-30 2020-11-03 北京小米移动软件有限公司 Method and device for determining user gaze position, storage medium and electronic equipment
CN112317362A (en) * 2020-09-24 2021-02-05 赣州好朋友科技有限公司 Method and device for sorting quartz associated gold ore and readable storage medium
CN112308932B (en) * 2020-11-04 2023-12-08 中国科学院上海微系统与信息技术研究所 Gaze detection method, device, equipment and storage medium
CN112416126B (en) * 2020-11-18 2023-07-28 青岛海尔科技有限公司 Page scrolling control method and device, storage medium and electronic equipment
CN112527103B (en) * 2020-11-24 2022-07-22 安徽鸿程光电有限公司 Remote control method and device for display equipment, equipment and computer readable storage medium
CN112711982A (en) * 2020-12-04 2021-04-27 科大讯飞股份有限公司 Visual detection method, equipment, system and storage device
CN112804504B (en) * 2020-12-31 2022-10-04 成都极米科技股份有限公司 Image quality adjusting method, image quality adjusting device, projector and computer readable storage medium
CN113052064B (en) * 2021-03-23 2024-04-02 北京思图场景数据科技服务有限公司 Attention detection method based on face orientation, facial expression and pupil tracking
CN113115086B (en) * 2021-04-16 2023-09-19 浙江闪链科技有限公司 Method for collecting elevator media viewing information based on video line-of-sight identification
CN113128417B (en) * 2021-04-23 2023-04-07 南开大学 Double-region eye movement tracking method based on head posture
CN113918007B (en) * 2021-04-27 2022-07-05 广州市保伦电子有限公司 Video interactive operation method based on eyeball tracking
CN113849142A (en) * 2021-09-26 2021-12-28 深圳市火乐科技发展有限公司 Image display method and device, electronic equipment and computer readable storage medium
CN114390267A (en) * 2022-01-11 2022-04-22 宁波视睿迪光电有限公司 Method and device for synthesizing stereo image data, electronic equipment and storage medium
CN115294320B (en) * 2022-10-08 2022-12-20 平安银行股份有限公司 Method and device for determining image rotation angle, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1700242A (en) * 2005-06-15 2005-11-23 北京中星微电子有限公司 Method and apparatus for distinguishing direction of visual lines
CN108345848A (en) * 2018-01-31 2018-07-31 广东欧珀移动通信有限公司 The recognition methods of user's direction of gaze and Related product
US20190043218A1 (en) * 2018-06-28 2019-02-07 Matthew Hiltner Multiple subject attention tracking
CN111046744A (en) * 2019-11-21 2020-04-21 深圳云天励飞技术有限公司 Method and device for detecting attention area, readable storage medium and terminal equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105704369B (en) * 2016-01-20 2019-02-15 努比亚技术有限公司 A kind of information processing method and device, electronic equipment
CN109716268B (en) * 2016-09-22 2022-05-17 苹果公司 Eye and head tracking
JP2018205819A (en) * 2017-05-30 2018-12-27 富士通株式会社 Gazing position detection computer program, gazing position detection device, and gazing position detection method
CN109271914B (en) * 2018-09-07 2020-04-17 百度在线网络技术(北京)有限公司 Method, device, storage medium and terminal equipment for detecting sight line drop point

Also Published As

Publication number Publication date
CN111046744B (en) 2023-04-18
CN111046744A (en) 2020-04-21

Similar Documents

Publication Publication Date Title
WO2021098454A1 (en) Region of concern detection method and apparatus, and readable storage medium and terminal device
US20200387698A1 (en) Hand key point recognition model training method, hand key point recognition method and device
WO2021082635A1 (en) Region of interest detection method and apparatus, readable storage medium and terminal device
CN110348543B (en) Fundus image recognition method and device, computer equipment and storage medium
US20200058156A1 (en) Dense three-dimensional correspondence estimation with multi-level metric learning and hierarchical matching
US20150117725A1 (en) Method and electronic equipment for identifying facial features
CN111914812B (en) Image processing model training method, device, equipment and storage medium
US20210272306A1 (en) Method for training image depth estimation model and method for processing image depth information
CN104169965A (en) Systems, methods, and computer program products for runtime adjustment of image warping parameters in a multi-camera system
CN110544272A (en) face tracking method and device, computer equipment and storage medium
WO2020042968A1 (en) Method for acquiring object information, device, and storage medium
EP4116462A2 (en) Method and apparatus of processing image, electronic device, storage medium and program product
CN112036331A (en) Training method, device and equipment of living body detection model and storage medium
CN111914180B (en) User characteristic determining method, device, equipment and medium based on graph structure
KR20220062460A (en) Method and apparatus for recognizing parking violation of vehicle, electronic device, storage medium, and computer program
CN114279433A (en) Map data automatic production method, related device and computer program product
KR102496334B1 (en) Method and device for detecting body temperature, electronic apparatus and storage medium
WO2021082636A1 (en) Region of interest detection method and apparatus, readable storage medium and terminal device
CN113628239A (en) Display optimization method, related device and computer program product
CN112818979A (en) Text recognition method, device, equipment and storage medium
CN113743186B (en) Medical image processing method, device, equipment and storage medium
CN113269730B (en) Image processing method, image processing device, computer equipment and storage medium
CN114140839B (en) Image transmission method, device, equipment and storage medium for face recognition
CN114663929A (en) Face recognition method, device, equipment and storage medium based on artificial intelligence
CN116524160B (en) Product consistency auxiliary verification system and method based on AR identification

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20890665

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20890665

Country of ref document: EP

Kind code of ref document: A1