WO2019144710A1 - Method and apparatus for determining position of pupil - Google Patents

Method and apparatus for determining position of pupil

Info

Publication number
WO2019144710A1
Authority
WO
WIPO (PCT)
Prior art keywords
network model
training set
loss function
type
pupil
Prior art date
Application number
PCT/CN2018/119882
Other languages
French (fr)
Chinese (zh)
Inventor
聂凤梅
刘伟
任冬淳
王健
杨孟
宫小虎
Original Assignee
北京七鑫易维信息技术有限公司
Application filed by 北京七鑫易维信息技术有限公司
Priority to US16/349,799 (US10949991B2)
Publication of WO2019144710A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/19 Sensors therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013 Eye tracking input arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/047 Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/66 Analysis of geometric attributes of image moments or centre of gravity
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/193 Preprocessing; Feature extraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30041 Eye; Retina; Ophthalmic

Definitions

  • Embodiments of the present invention relate to the field of image processing, and in particular, to a method and apparatus for determining a pupil position.
  • Virtual Reality (VR)
  • A VR device can estimate the line of sight to a distant fixation point from the pupil center coordinates and the corneal reflection, based on a 3D approximate-sphere model of the eyeball.
  • the unsupervised learning method is adopted, that is, the model is trained using the unlabeled data.
  • the method can only roughly determine the position of the pupil center, and the accuracy is poor.
  • Embodiments of the present invention provide a method and apparatus for determining a pupil position to at least solve the technical problem that the prior art cannot accurately position the pupil center.
  • A method for determining a pupil position includes: acquiring an image to be detected that includes a pupil; acquiring, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, wherein the preset region is the region where the pupil is located in the image to be detected; acquiring the centroid of the binary image; and determining the center position of the pupil according to the centroid of the binary image.
  • An apparatus for determining a pupil position includes: a first acquisition module configured to acquire an image to be detected that includes a pupil; a second acquisition module configured to acquire, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, wherein the preset region is the region where the pupil is located in the image to be detected; a third acquisition module configured to acquire the centroid of the binary image; and a determining module configured to determine the center position of the pupil according to the centroid of the binary image.
  • a storage medium comprising a stored program, wherein the program performs a method of determining a pupil position.
  • a processor for running a program wherein a method of determining a pupil position is performed while the program is running.
  • Using a semi-supervised learning algorithm, the image to be detected that includes the pupil is acquired; the binary image corresponding to the preset region and the centroid of the binary image are then acquired based on the preset model of semi-supervised learning; and the center position of the pupil is determined according to the centroid of the binary image, wherein the preset region is the region where the pupil is located in the image to be detected. This achieves the purpose of locating the pupil center. Because semi-supervised learning comprises both an unsupervised and a supervised learning process, a preset model obtained by combining the two avoids the problem that the pupil cannot be located precisely when only unsupervised learning or only supervised learning is used.
  • The image to be detected that includes the pupil is converted into a binary image that is relatively simple to process, and the position of the pupil center can be determined accurately from the centroid of the binary image. This achieves the technical effect of accurately determining the pupil center position and thereby solves the technical problem that the prior art cannot accurately locate the pupil center.
  • FIG. 1 is a flow chart of a method of determining a pupil position according to an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of an optional binary image according to an embodiment of the present invention.
  • FIG. 3(a) is a schematic diagram of an optional unlabeled training set in accordance with an embodiment of the present invention.
  • FIG. 3(b) is a schematic diagram of an optional labeled training set in accordance with an embodiment of the present invention.
  • FIG. 4 is a flow chart showing the construction of an optional preset model according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural view of an apparatus for determining a pupil position according to an embodiment of the present invention.
  • An embodiment of a method for determining a pupil position is provided. It is noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in a different order from the one described herein.
  • FIG. 1 is a flow chart of a method for determining a pupil position according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
  • Step S102: Acquire an image to be detected that includes a pupil.
  • The image acquisition device can capture an image that includes the pupil, that is, obtain the image to be detected.
  • The image to be detected may be a single image or multiple images.
  • the image acquisition device collects a set of images to be detected including the pupil.
  • the processor connected to the image acquisition device may process the image to be detected to determine the center position of the pupil in the image to be detected.
  • Step S104: Acquire a binary image corresponding to the preset area based on the preset model of semi-supervised learning.
  • the preset area of this embodiment is the area where the pupil is located in the image to be detected.
  • Semi-supervised learning is a machine learning method that combines supervised learning with unsupervised learning. Training the preset model with semi-supervised learning both simplifies the model and yields relatively high-precision results.
  • the processor uses the image to be detected as an input of the preset model to process the detected image, and the output of the corresponding preset model is a binary image of the region where the pupil is located in the image to be detected.
  • A binary image is an image in which each pixel takes one of only two possible values or gray levels. Because a binary image occupies less memory and has high contrast, the present application converts the image to be detected, which contains multiple gray levels or colors, into a binary image with few values and gray levels. This improves the precision of the pupil center and also speeds up data processing.
  • The output of the preset model is a binary image set comprising a plurality of binary images, each of which corresponds to an image in the set of images to be detected.
  • Step S106: Acquire the centroid of the binary image.
  • After the binary image is obtained, the coordinates of the pixels in the pupil region of the binary image are acquired, and a weighted sum of these coordinates gives the centroid of the binary image.
  • FIG. 2 is a schematic diagram of an optional binary image, in which the black circle indicates the pupil region. Because the image is binary, it suffices to find the coordinates of the pixels whose gray level is 0 to obtain the coordinates of the pixels in the pupil region. The centroid of the pupil region in the binary image is then given by:
  • $$x = \frac{1}{M}\sum_{i=1}^{M} x_i, \qquad y = \frac{1}{M}\sum_{i=1}^{M} y_i$$
  • $M$ represents the total number of pixels in the pupil region
  • $i$ represents the index of a pixel in the pupil region
  • $x_i$ and $y_i$ represent the coordinates of the $i$-th pixel in the pupil region
  • $x$ and $y$ represent the centroid coordinates
  • Step S108: Determine the center position of the pupil according to the centroid of the binary image.
  • centroid of the binary image is the center position of the pupil.
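The centroid step (S106 to S108) can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `pupil_centroid`, the nested-list image representation, the convention that x is the column index and y the row index, and the use of an equal-weight average over the pupil pixels are all assumptions made for the example. Only the idea of averaging the coordinates of the gray-level-0 pixels comes from the description.

```python
from typing import List, Tuple


def pupil_centroid(binary: List[List[int]], pupil_level: int = 0) -> Tuple[float, float]:
    """Return the (x, y) centroid of all pixels whose value equals pupil_level.

    Assumed convention: x is the column index, y is the row index.
    """
    xs: List[int] = []
    ys: List[int] = []
    for y, row in enumerate(binary):
        for x, value in enumerate(row):
            if value == pupil_level:
                xs.append(x)
                ys.append(y)
    m = len(xs)  # M: total number of pixels in the pupil region
    if m == 0:
        raise ValueError("no pupil pixels found in the binary image")
    # Equal-weight average of the pupil pixel coordinates (an assumption).
    return sum(xs) / m, sum(ys) / m


# A 5x5 toy binary image with a 3x3 pupil region (value 0) in the middle.
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
center = pupil_centroid(image)  # (2.0, 2.0)
```

Under these assumptions the returned centroid is taken directly as the pupil center, which is what step S108 describes.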
  • The image to be detected that includes the pupil is acquired; the binary image corresponding to the preset region and the centroid of the binary image are then acquired based on the preset model of semi-supervised learning; and the center position of the pupil is determined according to the centroid of the binary image, wherein the preset region is the region where the pupil is located in the image to be detected.
  • The preset model obtained by combining supervised learning and unsupervised learning overcomes the problem in the related art that the pupil cannot be located accurately when only unsupervised learning or only supervised learning is used.
  • Using the preset model, the image to be detected that includes the pupil is converted into a binary image that is relatively simple to process, and the position of the pupil center can be determined accurately from the centroid of the binary image. The calculation is also simple, which increases the speed of precise pupil-center positioning.
  • The embodiments provided by the present application thus achieve the purpose of locating the pupil center, realize the technical effect of accurately determining the pupil center position, and solve the technical problem that the prior art cannot accurately locate the pupil center.
  • Before the binary image corresponding to the preset region is acquired based on the preset model of semi-supervised learning, the preset model needs to be constructed. The specific steps are as follows:
  • Step S10: acquiring a first type of training set and a second type of training set, wherein the first type of training set and the second type of training set each include one or more images to be trained;
  • Step S12: acquiring a network model, wherein the network model is used to convert a plurality of images to be trained from original images to binary images;
  • Step S14: constructing a loss function of the network model;
  • Step S16: constructing the preset model according to the first type of training set, the second type of training set, and the loss function of the network model.
  • A plurality of images to be trained constitute a to-be-trained image set, which includes a first type of training set and a second type of training set.
  • The first type of training set is an unlabeled training set, that is, the original images are not in one-to-one correspondence with binary images; FIG. 3(a) is an illustration of an optional unlabeled training set.
  • The second type of training set is a labeled training set, that is, the original images and the binary images are in one-to-one correspondence; FIG. 3(b) is an illustration of an optional labeled training set.
  • x represents an original image
  • y represents a binary image.
  • The foregoing network model is a Generative Adversarial Network (GAN) model, which may include two GAN networks: one converts an image from an original image to a binary image, and the other converts the binary image back to an original image.
  • the loss function of the network model can be constructed based on the network model, and the specific steps are as follows:
  • Step S140: acquiring a hyperparameter of the network model;
  • Step S142: in the case that the network model performs unsupervised learning, determining, based on the hyperparameter, that the loss functions of the network model are a first loss function and a second loss function;
  • Step S144: in the case that the network model performs supervised learning, determining, based on the hyperparameter, that the loss functions of the network model are a third loss function and a fourth loss function.
  • In the context of machine learning, a hyperparameter of the network model is a parameter whose value is set before the learning process starts.
  • The hyperparameters of the network model include at least one of the following: the learning rate, the ratio of unsupervised learning to supervised learning, the number of images per batch, and the number of training rounds for training the network model.
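As a rough illustration, the four hyperparameters listed above could be grouped into one immutable configuration object that is fixed before training begins. The class name `HyperParams`, the field names, the validation rules, and the numeric values below are all hypothetical; the source names the hyperparameters but gives no concrete values or data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """Values fixed before training starts (all field names are hypothetical)."""

    learning_rate: float       # step size used by gradient descent
    unsup_to_sup_ratio: float  # ratio of unsupervised to supervised learning
    batch_size: int            # number of images per batch
    num_rounds: int            # number of training rounds

    def __post_init__(self) -> None:
        # Reject values that cannot describe a valid training run.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.unsup_to_sup_ratio <= 0:
            raise ValueError("unsup_to_sup_ratio must be positive")
        if self.batch_size < 1 or self.num_rounds < 1:
            raise ValueError("batch_size and num_rounds must be at least 1")


# Illustrative values only; the source does not give concrete numbers.
params = HyperParams(
    learning_rate=1e-4,
    unsup_to_sup_ratio=4.0,
    batch_size=16,
    num_rounds=100,
)
```

Freezing the dataclass reflects the point that these values are set once, before the learning process, and are not changed by training.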
  • the first loss function is a loss function of the generator
  • the second loss function is a loss function of the discriminator, wherein the first loss function is:
  • the second loss function is:
  • the third loss function is the loss function of the generator
  • the fourth loss function is the loss function of the discriminator, wherein the fourth loss function is the same as the second loss function; that is, the discriminator is updated in the same way under supervised learning as under unsupervised learning.
  • the third loss function is:
  • $\lambda_Y$ and $\lambda_X$ represent hyperparameters, which can be determined empirically;
  • $G_A$ denotes generator A
  • $G_B$ denotes generator B
  • $D_B$ denotes discriminator B
  • $D_A$ denotes discriminator A.
  • X and Y respectively represent the original image domain and the binary image domain
  • x and y respectively represent images of the X and Y domains.
  • the preset model may be constructed, that is, the loss function of the preset model is constructed, and the specific method includes the following steps:
  • Step S160: updating the parameters of the discriminator and the generator of the network model based on the first type of training set and the second type of training set, to obtain an updated network model;
  • Step S162: in the case that the number of updates to the network model reaches a first threshold, constructing the preset model according to the updated network model.
  • Updating the parameters of the discriminator and the generator of the network model based on the first type of training set and the second type of training set to obtain the updated network model includes the following steps:
  • Step S1602: updating the parameters of the discriminator according to the second loss function based on the first type of training set;
  • Step S1604: updating the parameters of the generator according to the first loss function based on the first type of training set;
  • Step S1606: in the case that the number of times the parameters of the discriminator and the generator have been updated reaches a second threshold, updating the parameters of the generator according to the third loss function based on the second type of training set;
  • Step S1608: updating the parameters of the discriminator according to the fourth loss function based on the second type of training set.
  • In the case that the number of times the parameters of the discriminator and the generator have been updated reaches a third threshold, the number of updates of the network model is incremented by one, until the number of updates of the network model reaches the first threshold.
  • The first threshold is the maximum number of updates for training the network model. The second threshold is the maximum number of updates of the generator parameters and the discriminator parameters in the unsupervised learning mode. The third threshold is the maximum number of updates of the generator parameters and the discriminator parameters in the supervised learning mode.
  • FIG. 4 is a flow chart of the construction of an optional preset model, in which the first threshold is n, the second threshold is n1, and the third threshold is n2.
  • First, the parameters of the network model are initialized, including the weight parameters and the hyperparameters of the network model.
  • Next, the parameters of the generator and of the discriminator are updated by unsupervised learning, using the unlabeled training set (that is, the first type of training set) and gradient descent.
  • When the number of updates of the generator and discriminator parameters reaches the second threshold (that is, n1), updating switches to the supervised learning mode: the parameters of the generator and of the discriminator are updated by supervised learning, using the labeled training set (that is, the second type of training set) and gradient descent.
  • When the number of updates of the generator and discriminator parameters reaches the third threshold (that is, n2), one update of the network model is complete. When the number of updates of the network model reaches the first threshold (that is, n), training of the network model stops, and the preset model is constructed from the generator and the discriminator obtained at that point.
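The alternating schedule governed by the thresholds n, n1, and n2 can be sketched as a nested loop. This is only an outline under stated assumptions: the function `run_schedule` and the `step` callback are hypothetical stand-ins for the real gradient-descent updates, one discriminator-plus-generator pair is counted as a single parameter update, and the phase counters are assumed to restart for every network-model update. The source does not spell out these counting details.

```python
from typing import Callable, List


def run_schedule(n: int, n1: int, n2: int, step: Callable[[str], None]) -> None:
    """Call step(kind) in the order of the alternating training schedule.

    n  : first threshold, number of network-model updates
    n1 : second threshold, parameter updates per unsupervised phase
    n2 : third threshold, parameter updates per supervised phase
    """
    for _ in range(n):
        for _ in range(n1):
            step("D_unsup")  # S1602: discriminator, second loss, unlabeled set
            step("G_unsup")  # S1604: generator, first loss, unlabeled set
        for _ in range(n2):
            step("G_sup")    # S1606: generator, third loss, labeled set
            step("D_sup")    # S1608: discriminator, fourth loss, labeled set


# Record the order of updates instead of training a real model.
calls: List[str] = []
run_schedule(n=2, n1=2, n2=1, step=calls.append)
```

In a real training loop, `step` would compute the corresponding loss on a batch and apply one gradient-descent update; here it only logs which update would run, which makes the ordering easy to inspect.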
  • An apparatus embodiment for determining a pupil position is also provided. The apparatus includes one or more processors and one or more memories storing program units, wherein the program units are executed by the processor and include a first acquisition module, a second acquisition module, a third acquisition module, and a determining module. FIG. 5 is a schematic structural diagram of an apparatus for determining a pupil position according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes a first acquisition module 501, a second acquisition module 503, a third acquisition module 505, and a determining module 507.
  • The first acquisition module 501 is configured to acquire an image to be detected that includes a pupil; the second acquisition module 503 is configured to acquire a binary image corresponding to a preset region, wherein the preset region is the region where the pupil is located in the image to be detected; the third acquisition module 505 is configured to acquire the centroid of the binary image; and the determining module 507 is configured to determine the center position of the pupil according to the centroid of the binary image.
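One way to picture how the four modules fit together is a small class that chains three injected callables and returns the result. The sketch below is hypothetical throughout: `PupilLocator`, `threshold_model`, and `mean_of_zero_pixels` are invented names, and the fixed-threshold function is only a stand-in for the trained preset model, which in the source is a GAN-based generator rather than a threshold.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]
Point = Tuple[float, float]


class PupilLocator:
    """Chains the four modules; every name here is hypothetical."""

    def __init__(
        self,
        acquire_image: Callable[[], Image],
        to_binary: Callable[[Image], Image],
        centroid: Callable[[Image], Point],
    ) -> None:
        self.acquire_image = acquire_image  # first acquisition module (S102)
        self.to_binary = to_binary          # second acquisition module (S104)
        self.centroid = centroid            # third acquisition module (S106)

    def locate(self) -> Point:
        image = self.acquire_image()
        binary = self.to_binary(image)
        # Determining module (S108): the centroid is taken as the pupil center.
        return self.centroid(binary)


def threshold_model(image: Image, cutoff: int = 128) -> Image:
    """Stand-in for the preset model: dark pixels become 0, others 1."""
    return [[0 if value < cutoff else 1 for value in row] for row in image]


def mean_of_zero_pixels(binary: Image) -> Point:
    """Equal-weight centroid of the pixels with value 0 (x = column, y = row)."""
    points = [(x, y) for y, row in enumerate(binary)
              for x, value in enumerate(row) if value == 0]
    if not points:
        raise ValueError("no pupil pixels in binary image")
    m = len(points)
    return sum(p[0] for p in points) / m, sum(p[1] for p in points) / m


# Toy 3x3 grayscale frame with one dark pixel in the middle.
frame = [[200, 200, 200], [200, 30, 200], [200, 200, 200]]
locator = PupilLocator(lambda: frame, threshold_model, mean_of_zero_pixels)
center = locator.locate()  # (1.0, 1.0)
```

Passing the modules in as callables keeps the wiring independent of how each step is implemented, so the stand-in threshold could be swapped for a trained model without changing `locate`.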
  • The first acquisition module 501, the second acquisition module 503, the third acquisition module 505, and the determining module 507 correspond to steps S102 to S108 in Embodiment 1. The four modules share the examples and application scenarios of the corresponding steps, but are not limited to the content disclosed in Embodiment 1.
  • The first acquisition module 501, the second acquisition module 503, the third acquisition module 505, and the determining module 507 may run in a terminal as part of the apparatus, and the functions implemented by these modules may be executed by a processor in the terminal. The terminal may be a terminal device such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), or a PAD.
  • The apparatus for determining a pupil position further includes: a fifth acquisition module, a sixth acquisition module, a first building module, and a second building module.
  • The fifth acquisition module is configured to acquire the first type of training set and the second type of training set, wherein the first type of training set and the second type of training set each include one or more images to be trained;
  • the plurality of images to be trained include a first training image set and a second training image set;
  • the sixth acquisition module is configured to acquire a network model, wherein the network model is used to convert the plurality of images to be trained from original images to binary images;
  • the first building module is configured to construct a loss function of the network model; and the second building module is configured to construct the preset model according to the first type of training set, the second type of training set, and the loss function of the network model.
  • The fifth acquisition module, the sixth acquisition module, the first building module, and the second building module may run in a terminal as part of the apparatus, and the functions implemented by these modules may be executed by a processor in the terminal.
  • The first building module includes: a seventh acquisition module, a first determining module, and a second determining module.
  • The seventh acquisition module is configured to acquire a hyperparameter of the network model;
  • the first determining module is configured to determine, based on the hyperparameter, that the loss functions of the network model are the first loss function and the second loss function in the case that the network model performs unsupervised learning;
  • the second determining module is configured to determine, based on the hyperparameter, that the loss functions of the network model are the third loss function and the fourth loss function in the case that the network model performs supervised learning.
  • The seventh acquisition module, the first determining module, and the second determining module correspond to steps S140 to S144 in Embodiment 1. The three modules share the examples and application scenarios of the corresponding steps, but are not limited to the content disclosed in Embodiment 1.
  • The seventh acquisition module, the first determining module, and the second determining module may run in a terminal as part of the apparatus, and the functions implemented by these modules may be executed by a processor in the terminal.
  • The second building module includes: a first update module and a third building module.
  • The first update module is configured to update the parameters of the discriminator and the generator of the network model based on the first type of training set and the second type of training set, to obtain an updated network model; and the third building module is configured to construct the preset model according to the updated network model in the case that the number of updates to the network model reaches the first threshold.
  • The first update module and the third building module correspond to steps S160 to S162 in Embodiment 1. The two modules share the examples and application scenarios of the corresponding steps, but are not limited to the content disclosed in Embodiment 1.
  • The first update module and the third building module may run in a terminal as part of the apparatus, and the functions implemented by these modules may be executed by a processor in the terminal.
  • The first update module includes: a second update module, a third update module, a fourth update module, and a fifth update module.
  • The second update module is configured to update the parameters of the discriminator according to the second loss function based on the first type of training set;
  • the third update module is configured to update the parameters of the generator according to the first loss function based on the first type of training set;
  • the fourth update module is configured to update the parameters of the generator according to the third loss function based on the second type of training set in the case that the number of times the parameters of the discriminator and the generator have been updated reaches the second threshold;
  • the fifth update module is configured to update the parameters of the discriminator according to the fourth loss function based on the second type of training set; wherein, in the case that the number of times the parameters of the discriminator and the generator have been updated reaches the third threshold, the number of updates of the network model is incremented by one, until the number of updates of the network model reaches the first threshold.
  • The second update module, the third update module, the fourth update module, and the fifth update module may run in a terminal as part of the apparatus, and the functions implemented by these modules may be executed by a processor in the terminal.
  • a storage medium comprising a stored program, wherein the program executes the method of determining a pupil position in Embodiment 1.
  • the various functional modules provided by the embodiments of the present application may be operated in a device or a similar computing device that determines the position of the pupil, or may be stored as part of the storage medium.
  • A computer program is stored in the storage medium, wherein the computer program is configured to perform a data processing method when run.
  • The storage medium is configured to store program code for performing the following steps: acquiring an image to be detected that includes the pupil; acquiring, based on the preset model of semi-supervised learning, a binary image corresponding to the preset region, wherein the preset region is the region where the pupil is located in the image to be detected; acquiring the centroid of the binary image; and determining the center position of the pupil according to the centroid of the binary image.
  • The storage medium may also be configured to store program code for the various preferred or optional method steps of the method for determining the pupil position.
  • a processor for executing a program, wherein the method of determining a pupil position in Embodiment 1 is executed while the program is running.
  • the processor can run the program of the method of determining the pupil position.
  • the processor may be configured to perform the following steps: acquiring an image to be detected that includes the pupil; acquiring, based on the preset model of semi-supervised learning, a binary image corresponding to the preset region, where the preset region is the region of the image to be detected in which the pupil is located; obtaining the centroid of the binary image; and determining the center position of the pupil according to the centroid of the binary image.
  • the above processor can execute various functional applications and data processing by running software programs and modules stored in the memory, that is, implementing the above-described method of determining the pupil position.
  • the storage medium may include a flash disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
  • the disclosed technical contents may be implemented in other manners.
  • the device embodiments described above are only schematic.
  • the division of the unit may be a logical function division.
  • there may be other ways of dividing the units; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, unit or module, and may be electrical or otherwise.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the technical solution of the present invention, or the part of it that contributes to the related art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • a number of instructions are included to cause a computer device (which may be a personal computer, server or network device, etc.) to perform all or part of the steps of the methods described in various embodiments of the present invention.
  • the foregoing storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Ophthalmology & Optometry (AREA)
  • Medical Informatics (AREA)
  • Geometry (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

A method and apparatus for determining a position of a pupil. The method comprises: acquiring an image to be detected that includes a pupil (S102); acquiring, on the basis of a preset model of semi-supervised learning, a binary image corresponding to a preset region, the preset region being the region where the pupil is located in the image to be detected (S104); acquiring the centroid of the binary image (S106); and determining the center position of the pupil according to the centroid of the binary image (S108).

Description

Method and apparatus for determining pupil position
The present application claims priority to Chinese Patent Application No. 201810064311.2, entitled "Method and Apparatus for Determining Pupil Position", filed with the Chinese Patent Office on January 23, 2018, the entire contents of which are incorporated herein by reference.
Technical field
Embodiments of the present invention relate to the field of image processing, and in particular to a method and apparatus for determining a pupil position.
Background
Virtual Reality (VR) technology is a computer technology for creating and experiencing virtual worlds, and it has been widely applied in the field of gaze tracking.
In practical applications, a VR device can estimate the line of sight toward a distant device at the fixation point from the pupil center coordinates and the corneal reflection in an eyeball-based 3D approximate sphere model. At present, the pupil center is mostly located by unsupervised learning, that is, the model is trained on unlabeled data. However, this approach can only roughly determine the position of the pupil center, and its accuracy is poor.
No effective solution has yet been proposed for the problem that the related art cannot accurately locate the center position of the pupil.
Summary of the invention
Embodiments of the present invention provide a method and apparatus for determining a pupil position, so as to at least solve the technical problem that the prior art cannot accurately locate the pupil center.
According to one aspect of the embodiments of the present invention, a method for determining a pupil position is provided, including: acquiring an image to be detected that includes a pupil; acquiring, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, where the preset region is the region of the image to be detected in which the pupil is located; obtaining the centroid of the binary image; and determining the center position of the pupil according to the centroid of the binary image.
According to another aspect of the embodiments of the present invention, an apparatus for determining a pupil position is also provided, including: a first acquisition module configured to acquire an image to be detected that includes a pupil; a second acquisition module configured to acquire, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, where the preset region is the region of the image to be detected in which the pupil is located; a third acquisition module configured to obtain the centroid of the binary image; and a determining module configured to determine the center position of the pupil according to the centroid of the binary image.
According to another aspect of the embodiments of the present invention, a storage medium is also provided. The storage medium includes a stored program, and the program performs the method for determining a pupil position.
According to another aspect of the embodiments of the present invention, a processor is also provided. The processor is configured to run a program, and the method for determining a pupil position is performed when the program runs.
In the embodiments of the present invention, a semi-supervised learning algorithm is used: an image to be detected that includes a pupil is acquired; a binary image corresponding to a preset region, and the centroid of that binary image, are then obtained based on a preset model of semi-supervised learning; and the center position of the pupil is determined according to the centroid of the binary image, where the preset region is the region of the image to be detected in which the pupil is located. This achieves the purpose of locating the pupil center. Because semi-supervised learning includes both an unsupervised and a supervised learning process, the preset model obtained by combining the two avoids the problem that the pupil cannot be accurately located when only unsupervised learning or only supervised learning is used. In addition, the preset model converts the image to be detected into a binary image that is simpler to process, and the position of the pupil center can then be accurately determined from the centroid of the binary image. This achieves the technical effect of accurately determining the position of the pupil center, and thus solves the technical problem that the prior art cannot accurately locate the pupil center.
Brief description of the drawings
The drawings described here are provided for a further understanding of the embodiments of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of a method for determining a pupil position according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an optional binary image according to an embodiment of the present invention;
FIG. 3(a) is a schematic diagram of an optional unlabeled training set according to an embodiment of the present invention;
FIG. 3(b) is a schematic diagram of an optional labeled training set according to an embodiment of the present invention;
FIG. 4 is a flowchart for constructing an optional preset model according to an embodiment of the present invention; and
FIG. 5 is a schematic structural diagram of an apparatus for determining a pupil position according to an embodiment of the present invention.
Detailed description
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the invention described here can be implemented in orders other than those illustrated or described here. In addition, the terms "include" and "have", and any variations of them, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to such a process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, an embodiment of a method for determining a pupil position is provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, such as one running a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
FIG. 1 is a flowchart of a method for determining a pupil position according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step S102: acquire an image to be detected that includes a pupil.
It should be noted that an image acquisition device can capture an image that includes the pupil, which gives the image to be detected. There may be one image to be detected or several. When there are several, the image acquisition device captures a set of images to be detected that include the pupil. After the image acquisition device captures the image to be detected, a processor connected to the device can process the image to determine the center position of the pupil in it.
Step S104: acquire, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region.
In this embodiment, the preset region is the region of the image to be detected in which the pupil is located.
It should be noted that semi-supervised learning is a machine learning method that combines supervised learning with unsupervised learning. Training the preset model with semi-supervised learning not only simplifies the model but also yields relatively accurate results. After obtaining the image to be detected, the processor feeds it to the preset model as input, and the corresponding output of the preset model is the binary image of the region in which the pupil is located.
It should also be noted that a binary image is an image in which each pixel has only two possible values or gray levels. Because a binary image occupies little memory and has high contrast, this application converts the image to be detected, which contains many gray levels or colors, into a binary image with few values and few gray levels. This still yields a pupil center of relatively high accuracy and also speeds up data processing.
In addition, if the input of the preset model is a set of images to be detected, the output of the preset model is a set of binary images, in which each binary image corresponds to one image in the set of images to be detected.
Step S106: obtain the centroid of the binary image.
It should be noted that, after the binary image of the image to be detected is obtained, the coordinates of the pixels in the pupil region of the binary image are obtained, and a weighted sum of those coordinates gives the centroid of the binary image.
In an optional embodiment, FIG. 2 is a schematic structural diagram of an optional binary image, in which the black circle represents the pupil region. Because the image is binary, it is only necessary to find the coordinates of the pixels whose gray level is 0 to obtain the coordinates of the pixels in the pupil region. The centroid of the pupil region in the binary image is then given by the following formulas:
$$x = \frac{1}{M}\sum_{i=1}^{M} x_i$$
$$y = \frac{1}{M}\sum_{i=1}^{M} y_i$$
In the above formulas, M is the total number of pixels in the pupil region, i is the index of a pixel in the pupil region, $x_i$ and $y_i$ are the coordinates of the i-th pixel in the pupil region, and x and y are the coordinates of the centroid.
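The centroid computation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `pupil_centroid`, the nested-list image representation, and the convention that pupil pixels have gray level 0 (as described for FIG. 2) are all choices made for the example.

```python
from typing import List, Optional, Tuple


def pupil_centroid(binary: List[List[int]],
                   pupil_value: int = 0) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centroid of all pixels equal to pupil_value.

    Returns None when the image contains no pupil pixels.
    """
    sum_x = 0
    sum_y = 0
    count = 0  # M in the formulas
    for y, row in enumerate(binary):
        for x, value in enumerate(row):
            if value == pupil_value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count


# Hypothetical 4x5 binary image: 0 marks the pupil, 255 the background.
image = [
    [255, 255, 255, 255, 255],
    [255,   0,   0,   0, 255],
    [255,   0,   0,   0, 255],
    [255, 255, 255, 255, 255],
]
center = pupil_centroid(image)
```

Every pupil pixel carries the same weight here, so the weighted sum reduces to the mean of the pixel coordinates.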
Step S108: determine the center position of the pupil according to the centroid of the binary image.
It should be noted that once the centroid of the binary image has been obtained, that centroid is taken as the center position of the pupil.
From the solution defined by steps S102 to S108, it can be seen that an image to be detected that includes a pupil is acquired; a binary image corresponding to a preset region, and the centroid of that binary image, are then obtained based on a preset model of semi-supervised learning; and the center position of the pupil is determined according to the centroid of the binary image, where the preset region is the region of the image to be detected in which the pupil is located.
It is easy to see that, because semi-supervised learning includes both an unsupervised and a supervised learning process, the preset model obtained by combining the two overcomes the problem in the related art that the pupil cannot be accurately located when only unsupervised learning or only supervised learning is used. In addition, the preset model converts the image to be detected into a binary image that is simpler to process, and the position of the pupil center can then be accurately determined from the centroid of the binary image. The calculation is also simple, which speeds up accurate location of the pupil center.
It can be seen from the above that the embodiments provided by this application achieve the purpose of locating the pupil center, thereby achieving the technical effect of accurately determining the position of the pupil center and solving the technical problem that the prior art cannot accurately locate the pupil center.
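As a rough sketch of how steps S104 to S108 chain together for one or several acquired images, the following Python treats the preset model as an opaque callable. The names `locate_pupils` and `placeholder_model` are hypothetical, and the placeholder is a fixed gray-level threshold used only so the example runs; it is not the semi-supervised model described in this document.

```python
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]
Point = Tuple[float, float]


def centroid(binary: Image, pupil_value: int = 0) -> Optional[Point]:
    """S106: mean coordinates of the pixels marked as pupil."""
    points = [(x, y) for y, row in enumerate(binary)
              for x, value in enumerate(row) if value == pupil_value]
    if not points:
        return None
    m = len(points)
    return sum(p[0] for p in points) / m, sum(p[1] for p in points) / m


def locate_pupils(images: List[Image],
                  model: Callable[[Image], Image]) -> List[Optional[Point]]:
    """S104 to S108 for each acquired image; S102 is left to the caller."""
    centers = []
    for image in images:
        binary = model(image)             # S104: model outputs a binary image
        centers.append(centroid(binary))  # S106 and S108: centroid = center
    return centers


def placeholder_model(image: Image) -> Image:
    """Hypothetical stand-in for the trained model: dark pixels become 0."""
    return [[0 if value < 50 else 255 for value in row] for row in image]


eye = [
    [200, 190, 180, 200],
    [210,  20,  30, 190],
    [205,  25,  10, 200],
    [220, 210, 200, 215],
]
no_pupil = [[200, 200], [200, 200]]
centers = locate_pupils([eye, no_pupil], placeholder_model)
```

Passing a list keeps the single-image and image-set cases on one code path: one input image yields one center, and an image with no detected pupil yields `None`.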
It should be noted that the preset model must be constructed before the binary image corresponding to the preset region is acquired based on the preset model of semi-supervised learning. The specific steps are as follows:
Step S10: acquire a first type of training set and a second type of training set, where each of the two training sets includes one or more images to be trained;
Step S12: acquire a network model, where the network model is used to convert the images to be trained from original images into binary images;
Step S14: construct the loss function of the network model;
Step S16: construct the preset model according to the first type of training set, the second type of training set, and the loss function of the network model.
It should be noted that the multiple images to be trained form a set of images to be trained, which comprises the first type of training set and the second type of training set. The first type is an unlabeled training set, in which there is no correspondence between original images and binary images; FIG. 3(a) is a schematic diagram of an optional unlabeled training set. The second type is a labeled training set, in which original images and binary images correspond one to one; FIG. 3(b) is a schematic diagram of an optional labeled training set. In FIG. 3(a) and FIG. 3(b), x denotes an original image and y denotes a binary image.
It should also be noted that the network model is a generative adversarial network (GAN). The network model may include two GANs: one converts an image from an original image into a binary image, and the other converts a binary image back into an original image. After the network model is obtained, its loss function can be constructed as follows:
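The difference between the two kinds of training set can be made concrete with a small data-structure sketch. The class names `UnlabeledSet` and `LabeledSet` and the tiny 2x2 images are hypothetical; the only property taken from the text is that the unlabeled set holds original and binary images with no pairing between them, while the labeled set holds one-to-one pairs.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[int]]


@dataclass
class UnlabeledSet:
    """First type of training set: no correspondence between x and y."""
    originals: List[Image]  # samples x from the original image domain X
    binaries: List[Image]   # samples y from the binary domain Y, unpaired


@dataclass
class LabeledSet:
    """Second type of training set: each original has its own binary image."""
    pairs: List[Tuple[Image, Image]]  # (x, y), where y is the label of x


x1: Image = [[120, 30], [40, 200]]
x2: Image = [[90, 10], [15, 180]]
y1: Image = [[255, 0], [0, 255]]       # binary image that belongs to x1
y_other: Image = [[0, 0], [255, 255]]  # binary image of some other eye

unlabeled = UnlabeledSet(originals=[x1, x2], binaries=[y_other])
labeled = LabeledSet(pairs=[(x1, y1)])
```

Because the unlabeled set carries no pairing, its two lists may even differ in length, which a list of pairs cannot express.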
Step S140: obtain the hyperparameters of the network model;
Step S142: when the network model performs unsupervised learning, determine, based on the hyperparameters, that the loss functions of the network model are the first loss function and the second loss function;
Step S144: when the network model performs supervised learning, determine, based on the hyperparameters, that the loss functions of the network model are the third loss function and the fourth loss function.
It should be noted that, in the context of machine learning, a hyperparameter of the network model is a parameter whose value is set before the learning process begins. In this application, the hyperparameters of the network model include at least one or more of the following: the learning rate, the ratio of the number of unsupervised learning passes to the number of supervised learning passes, the number of images per batch, and the number of training rounds for the network model.
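One way to keep the four listed hyperparameters together is a small immutable configuration object. This is only a sketch: the class name `HyperParams`, the field names, the validation rules, and the numeric values are all assumptions for illustration and are not taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HyperParams:
    """Values fixed before training starts (all names are illustrative)."""
    learning_rate: float
    unsup_to_sup_ratio: int  # unsupervised passes per supervised pass
    batch_size: int          # number of images per batch
    training_rounds: int     # number of training rounds

    def __post_init__(self) -> None:
        # Reject values that could never describe a valid training run.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if min(self.unsup_to_sup_ratio, self.batch_size,
               self.training_rounds) < 1:
            raise ValueError("counts must be at least 1")


# Example values only; the source does not specify any of them.
params = HyperParams(learning_rate=1e-4, unsup_to_sup_ratio=5,
                     batch_size=16, training_rounds=100)
```

Freezing the dataclass reflects the definition above: hyperparameters are set before learning begins and are not changed by it.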
Optionally, when unsupervised learning is applied to the network model, the first loss function is the loss function of the generator and the second loss function is the loss function of the discriminator. The first loss function is:
$$l_g = \lambda_Y \lVert y - G_A(G_B(y)) \rVert + \lambda_X \lVert x - G_B(G_A(x)) \rVert - D_B(G_B(y)) - D_A(G_A(x))$$
The second loss function is:
$$l_{D_A} = D_A(G_A(x)) - D_A(y)$$
$$l_{D_B} = D_B(G_B(y)) - D_B(x)$$
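The unsupervised losses can be written out on toy data to show how the terms fit together. This is a sketch under stated assumptions: images are flattened to lists of floats, the norm is taken to be the L1 distance (the source does not say which norm is meant), the fake sample scored by discriminator B is taken to be G_B(y) on the assumption that generator B maps binary images back to the original domain as in the generator loss, and the generators and discriminators are trivial stand-in functions rather than neural networks.

```python
from typing import Callable, List, Sequence, Tuple

Vec = Sequence[float]
Gen = Callable[[Vec], List[float]]
Disc = Callable[[Vec], float]


def l1(a: Vec, b: Vec) -> float:
    """L1 distance, an assumed choice for the norm in the formulas."""
    return sum(abs(p - q) for p, q in zip(a, b))


def generator_loss_unsup(x: Vec, y: Vec, g_a: Gen, g_b: Gen,
                         d_a: Disc, d_b: Disc,
                         lam_x: float, lam_y: float) -> float:
    """First loss function: two cycle terms minus two discriminator scores."""
    cycle_y = l1(y, g_a(g_b(y)))  # y -> X domain -> back to Y domain
    cycle_x = l1(x, g_b(g_a(x)))  # x -> Y domain -> back to X domain
    return lam_y * cycle_y + lam_x * cycle_x - d_b(g_b(y)) - d_a(g_a(x))


def discriminator_losses(x: Vec, y: Vec, g_a: Gen, g_b: Gen,
                         d_a: Disc, d_b: Disc) -> Tuple[float, float]:
    """Second loss function: score of the fake minus score of the real."""
    l_da = d_a(g_a(x)) - d_a(y)
    l_db = d_b(g_b(y)) - d_b(x)
    return l_da, l_db


# Toy stand-ins: G_A doubles, G_B halves, both discriminators sum the input.
g_a: Gen = lambda v: [2 * t for t in v]
g_b: Gen = lambda v: [t / 2 for t in v]
d_a: Disc = lambda v: sum(v)
d_b: Disc = lambda v: sum(v)

x, y = [1.0, 2.0], [2.0, 4.0]
loss_g = generator_loss_unsup(x, y, g_a, g_b, d_a, d_b, lam_x=10.0, lam_y=10.0)
loss_da, loss_db = discriminator_losses(x, y, g_a, g_b, d_a, d_b)
```

In this toy setup the two generators invert each other exactly, so both cycle terms vanish and only the discriminator scores contribute to the generator loss.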
When supervised learning is applied to the network model, the third loss function is the loss function of the generator and the fourth loss function is the loss function of the discriminator. The fourth loss function is the same as the second loss function; that is, the discriminator is updated in the same way under supervised and unsupervised learning. The third loss function is:
$$l_g = \lambda_Y \lVert y - G_A(x) \rVert + \lambda_X \lVert x - G_B(y) \rVert - D_B(G_B(y)) - D_A(G_A(x))$$
In the above formulas, $\lambda_Y$ and $\lambda_X$ are hyperparameters that can be determined empirically; $G_A$ denotes generator A, $G_B$ denotes generator B, $D_B$ denotes discriminator B, and $D_A$ denotes discriminator A. X and Y denote the original image domain and the binary image domain respectively, and x and y denote images in the X and Y domains respectively.
It should be noted that, once the network model and the first and second types of training sets have been obtained, the preset model can be constructed, that is, the loss function of the preset model is constructed. The specific method includes the following steps:
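The supervised generator loss differs from the unsupervised one only in its first two terms: each generator output is compared directly with its paired target instead of being cycled back. The sketch below assumes an L1 norm and uses trivial stand-in functions for the generators and discriminators; none of these choices come from the source.

```python
from typing import Callable, List, Sequence

Vec = Sequence[float]
Gen = Callable[[Vec], List[float]]
Disc = Callable[[Vec], float]


def l1(a: Vec, b: Vec) -> float:
    """L1 distance, an assumed choice for the norm in the formulas."""
    return sum(abs(p - q) for p, q in zip(a, b))


def generator_loss_sup(x: Vec, y: Vec, g_a: Gen, g_b: Gen,
                       d_a: Disc, d_b: Disc,
                       lam_x: float, lam_y: float) -> float:
    """Third loss function: paired reconstruction terms replace cycle terms."""
    direct_y = l1(y, g_a(x))  # G_A(x) should match the labeled binary image
    direct_x = l1(x, g_b(y))  # G_B(y) should match the paired original
    return lam_y * direct_y + lam_x * direct_x - d_b(g_b(y)) - d_a(g_a(x))


# Toy stand-ins: G_A adds 1, G_B subtracts 1, discriminators sum the input.
g_a: Gen = lambda v: [t + 1.0 for t in v]
g_b: Gen = lambda v: [t - 1.0 for t in v]
d_a: Disc = lambda v: sum(v)
d_b: Disc = lambda v: sum(v)

x, y = [1.0, 2.0], [2.0, 4.0]  # a labeled pair (x, y)
loss_g_sup = generator_loss_sup(x, y, g_a, g_b, d_a, d_b,
                                lam_x=10.0, lam_y=10.0)
```

These two toy generators invert each other, so the cycle terms of the unsupervised loss would both be zero for them, yet neither maps x to y correctly. The paired terms expose that mismatch, which illustrates what the labeled training set adds.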
Step S160: update the parameters of the discriminator and the generator of the network model based on the first type of training set and the second type of training set, to obtain an updated network model;
Step S162: when the number of updates to the network model reaches a first threshold, construct the preset model according to the updated network model.
Optionally, updating the parameters of the discriminator and the generator of the network model based on the first type of training set and the second type of training set to obtain the updated network model includes the following steps:
Step S1602: update the parameters of the discriminator according to the second loss function, based on the first type of training set;
Step S1604: update the parameters of the generator according to the first loss function, based on the first type of training set;
Step S1606: when the number of updates to the parameters of the discriminator and the generator reaches a second threshold, update the parameters of the generator according to the third loss function, based on the second type of training set;
Step S1608: update the parameters of the discriminator according to the fourth loss function, based on the second type of training set;
When the number of updates to the parameters of the discriminator and the generator reaches a third threshold, the update count of the network model is incremented by one, until the update count of the network model reaches the first threshold.
It should be noted that the first threshold is the maximum number of updates used to train the network model; the second threshold is the maximum number of updates to the generator parameters and discriminator parameters under unsupervised learning; and the third threshold is the maximum number of updates to the generator parameters and discriminator parameters under supervised learning.
In an optional embodiment, FIG. 4 shows a flowchart for constructing the preset model. In FIG. 4, the first threshold is n, the second threshold is n1, and the third threshold is n2. Optionally, after the training data sets are acquired, that is, after the first type of training set and the second type of training set are acquired, the parameters of the network model are initialized, which specifically includes initializing the weight parameters and the hyperparameters of the network model. After initialization, the generator parameters and the discriminator parameters are updated by unsupervised learning, using the unlabeled training set (the first type of training set) and gradient descent. When the number of these updates reaches the second threshold (n1), training switches to supervised learning: the generator parameters and the discriminator parameters are updated using the labeled training set (the second type of training set) and gradient descent. When the number of updates to the generator parameters and the discriminator parameters reaches the third threshold (n2), one update of the network model is complete and training of the network model stops. When the number of updates to the parameters of the generator and the discriminator reaches the first threshold, the preset model is constructed from the generator and the discriminator obtained at that point.
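The schedule described above can be sketched as a nested loop. This is a minimal illustration under one stated assumption: that n counts outer updates of the network model, each consisting of n1 unsupervised steps followed by n2 supervised steps. The function name `run_training_schedule` and the step callbacks are hypothetical placeholders, not names from the patent, and the actual gradient-descent updates are left out.

```python
from typing import Callable, List, Tuple


def run_training_schedule(
    n: int,
    n1: int,
    n2: int,
    unsupervised_step: Callable[[], None],
    supervised_step: Callable[[], None],
) -> List[Tuple[int, str]]:
    """Run n outer updates; each runs n1 unsupervised then n2 supervised steps.

    Returns a log of (outer_update_index, phase) entries, one per inner step.
    """
    log: List[Tuple[int, str]] = []
    for outer in range(n):          # first threshold: updates of the network model
        for _ in range(n1):         # second threshold: unlabeled (first-type) set
            unsupervised_step()
            log.append((outer, "unsupervised"))
        for _ in range(n2):         # third threshold: labeled (second-type) set
            supervised_step()
            log.append((outer, "supervised"))
    return log


# Placeholder steps that only count calls; real steps would apply gradient
# descent to the generator and discriminator parameters.
counts = {"unsupervised": 0, "supervised": 0}


def _unsupervised_step() -> None:
    counts["unsupervised"] += 1


def _supervised_step() -> None:
    counts["supervised"] += 1


schedule = run_training_schedule(
    n=2, n1=3, n2=1,
    unsupervised_step=_unsupervised_step,
    supervised_step=_supervised_step,
)
```

With n=2, n1=3 and n2=1, the log holds eight entries: three unsupervised steps and one supervised step per outer update.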
Embodiment 2
According to an embodiment of the present invention, an embodiment of an apparatus for determining a pupil position is also provided. The apparatus includes one or more processors and one or more memories storing program units, where the program units are executed by the processor and include a first acquisition module, a second acquisition module, a third acquisition module, and a determination module. FIG. 5 is a schematic structural diagram of the apparatus for determining a pupil position according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes a first acquisition module 501, a second acquisition module 503, a third acquisition module 505, and a determination module 507.
The first acquisition module 501 is configured to acquire an image to be detected that contains a pupil. The second acquisition module 503 is configured to acquire, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, where the preset region is the region of the image to be detected in which the pupil is located. The third acquisition module 505 is configured to acquire the centroid of the binary image. The determination module 507 is configured to determine the center position of the pupil according to the centroid of the binary image.
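The four modules form a short pipeline, which the following sketch illustrates. It makes two assumptions: the pupil center is taken to be the centroid itself, and the preset model is replaced by a hypothetical stand-in (`dark_threshold`, a fixed intensity cutoff), because the trained semi-supervised model is not reproduced here. All function names are illustrative.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Image = Sequence[Sequence[float]]
Mask = List[List[int]]


def binary_centroid(mask: Sequence[Sequence[int]]) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centroid of the nonzero pixels, or None if there are none."""
    total = 0
    sum_x = 0
    sum_y = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                total += 1
                sum_x += x
                sum_y += y
    if total == 0:
        return None
    return (sum_x / total, sum_y / total)


def locate_pupil(
    image: Image, segment: Callable[[Image], Mask]
) -> Optional[Tuple[float, float]]:
    """Segment the pupil region into a binary image, then take its centroid."""
    binary = segment(image)          # role of the second acquisition module
    return binary_centroid(binary)   # roles of the third acquisition and determination modules


def dark_threshold(image: Image, cutoff: float = 50) -> Mask:
    """Hypothetical stand-in for the preset model: mark dark pixels as pupil."""
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in image]


toy_image = [
    [200, 200, 200, 200],
    [200, 10, 10, 200],
    [200, 10, 10, 200],
    [200, 200, 200, 200],
]
center = locate_pupil(toy_image, dark_threshold)
```

Passing the segmentation step in as a callable keeps the centroid logic independent of how the binary image is produced, so a trained model could replace the threshold without changing `locate_pupil`.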
It should be noted that the first acquisition module 501, the second acquisition module 503, the third acquisition module 505, and the determination module 507 correspond to steps S102 to S108 in Embodiment 1. The four modules share the examples and application scenarios of their corresponding steps, but are not limited to the content disclosed in Embodiment 1.
It should also be noted that the first acquisition module 501, the second acquisition module 503, the third acquisition module 505, and the determination module 507 may run in a terminal as part of the apparatus, and their functions may be performed by a processor in the terminal. The terminal may be a smartphone (such as an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet device (MID), a PAD, or a similar terminal device.
In an optional embodiment, the apparatus for determining a pupil position further includes a fifth acquisition module, a sixth acquisition module, a first construction module, and a second construction module. The fifth acquisition module is configured to acquire a first type of training set and a second type of training set, where the first type of training set and the second type of training set each include one or more images to be trained, and the multiple images to be trained include a first training image set and a second training image set. The sixth acquisition module is configured to acquire a network model, where the network model is used to convert the multiple images to be trained from original images into binary images. The first construction module is configured to construct the loss function of the network model. The second construction module is configured to construct the preset model according to the first type of training set, the second type of training set, and the loss function of the network model.
It should be noted that the fifth acquisition module, the sixth acquisition module, the first construction module, and the second construction module correspond to steps S10 to S16 in Embodiment 1. The four modules share the examples and application scenarios of their corresponding steps, but are not limited to the content disclosed in Embodiment 1.
It should also be noted that the fifth acquisition module, the sixth acquisition module, the first construction module, and the second construction module may run in a terminal as part of the apparatus, and their functions may be performed by a processor in the terminal.
In an optional embodiment, the first construction module includes a seventh acquisition module, a first determination module, and a second determination module. The seventh acquisition module is configured to acquire the hyperparameters of the network model. The first determination module is configured to determine, based on the hyperparameters, that the loss functions of the network model are a first loss function and a second loss function when the network model performs unsupervised learning. The second determination module is configured to determine, based on the hyperparameters, that the loss functions of the network model are a third loss function and a fourth loss function when the network model performs supervised learning.
It should be noted that the seventh acquisition module, the first determination module, and the second determination module correspond to steps S140 to S144 in Embodiment 1. The three modules share the examples and application scenarios of their corresponding steps, but are not limited to the content disclosed in Embodiment 1.
It should also be noted that the seventh acquisition module, the first determination module, and the second determination module may run in a terminal as part of the apparatus, and their functions may be performed by a processor in the terminal.
In an optional embodiment, the second construction module includes a first update module and a third construction module. The first update module is configured to update the parameters of the discriminator and the generator of the network model based on the first type of training set and the second type of training set, to obtain an updated network model. The third construction module is configured to construct the preset model according to the updated network model when the number of updates to the network model reaches the first threshold.
It should be noted that the first update module and the third construction module correspond to steps S160 to S162 in Embodiment 1. The two modules share the examples and application scenarios of their corresponding steps, but are not limited to the content disclosed in Embodiment 1.
It should also be noted that the first update module and the third construction module may run in a terminal as part of the apparatus, and their functions may be performed by a processor in the terminal.
In an optional embodiment, the first update module includes a second update module, a third update module, a fourth update module, and a fifth update module. The second update module is configured to update the parameters of the discriminator according to the second loss function, based on the first type of training set. The third update module is configured to update the parameters of the generator according to the first loss function, based on the first type of training set. The fourth update module is configured to update the parameters of the generator according to the third loss function, based on the second type of training set, when the number of updates to the parameters of the discriminator and the generator reaches the second threshold. The fifth update module is configured to update the parameters of the discriminator according to the fourth loss function, based on the second type of training set. When the number of updates to the parameters of the discriminator and the generator reaches the third threshold, the update count of the network model is incremented by one, until the update count of the network model reaches the first threshold.
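The pairing of loss functions, training sets, and updated components described above can be written down as a small lookup table. This is only an illustrative sketch: the enum names, the string labels such as `"second_loss"`, and the helper `loss_name` are hypothetical, and the loss formulas themselves are not reproduced here.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Mode(Enum):
    UNSUPERVISED = "unsupervised"   # first-type (unlabeled) training set
    SUPERVISED = "supervised"       # second-type (labeled) training set


class Component(Enum):
    GENERATOR = "generator"
    DISCRIMINATOR = "discriminator"


# Entries are listed in the order the update modules are described:
# second, third, fourth, and fifth update module.
LOSS_FOR: Dict[Tuple[Mode, Component], str] = {
    (Mode.UNSUPERVISED, Component.DISCRIMINATOR): "second_loss",
    (Mode.UNSUPERVISED, Component.GENERATOR): "first_loss",
    (Mode.SUPERVISED, Component.GENERATOR): "third_loss",
    (Mode.SUPERVISED, Component.DISCRIMINATOR): "fourth_loss",
}


def loss_name(mode: Mode, component: Component) -> str:
    """Look up which loss drives the update of a component in a given mode."""
    return LOSS_FOR[(mode, component)]


# Python dicts preserve insertion order, so this reproduces the listed order.
update_order: List[Tuple[Mode, Component]] = list(LOSS_FOR)
```

For example, `loss_name(Mode.SUPERVISED, Component.GENERATOR)` returns `"third_loss"`, matching the fourth update module.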
It should be noted that the second update module, the third update module, the fourth update module, and the fifth update module correspond to steps S1602 to S1608 in Embodiment 1. The four modules share the examples and application scenarios of their corresponding steps, but are not limited to the content disclosed in Embodiment 1.
It should also be noted that the second update module, the third update module, the fourth update module, and the fifth update module may run in a terminal as part of the apparatus, and their functions may be performed by a processor in the terminal.
Embodiment 3
According to another aspect of the embodiments of the present invention, a storage medium is also provided. The storage medium includes a stored program, where the program performs the method for determining a pupil position in Embodiment 1.
The functional modules provided by the embodiments of the present application may run in the apparatus for determining a pupil position or in a similar computing device, and may also be stored as part of the storage medium.
Optionally, in this embodiment, a computer program is stored in the storage medium, where the computer program is configured to perform a data processing method when run.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps: acquiring an image to be detected that contains a pupil; acquiring, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, where the preset region is the region of the image to be detected in which the pupil is located; acquiring the centroid of the binary image; and determining the center position of the pupil according to the centroid of the binary image.
Optionally, in this embodiment, the storage medium may further be configured to store program code for the various preferred or optional method steps provided by the method for determining a pupil position.
Embodiment 4
According to another aspect of the embodiments of the present invention, a processor is also provided. The processor is used to run a program, where the program, when run, performs the method for determining a pupil position in Embodiment 1.
In this embodiment of the present invention, the processor can run the program of the method for determining a pupil position.
Optionally, in this embodiment, the processor may be configured to perform the following steps: acquiring an image to be detected that contains a pupil; acquiring, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, where the preset region is the region of the image to be detected in which the pupil is located; acquiring the centroid of the binary image; and determining the center position of the pupil according to the centroid of the binary image.
By running software programs and modules stored in a memory, the processor performs various functional applications and data processing, that is, implements the method for determining a pupil position described above.
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing hardware associated with the apparatus for determining a pupil position. The program may be stored in a storage medium readable by the apparatus for determining a pupil position, and the storage medium may include a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.
The method and apparatus for determining a pupil position according to the present invention have been described above by way of example with reference to the accompanying drawings. However, those skilled in the art should understand that various improvements may be made to the method and apparatus for determining a pupil position proposed by the present invention without departing from the content of the present invention. Therefore, the scope of protection of the present invention should be determined by the content of the appended claims.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed technical content may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the related art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present invention, and these improvements and refinements shall also be regarded as falling within the scope of protection of the present invention.

Claims (12)

  1. A method for determining a pupil position, comprising:
    acquiring an image to be detected that contains a pupil;
    acquiring, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, wherein the preset region is the region of the image to be detected in which the pupil is located;
    acquiring a centroid of the binary image; and
    determining a center position of the pupil according to the centroid of the binary image.
  2. The method according to claim 1, wherein, before the binary image corresponding to the preset region is acquired based on the preset model of semi-supervised learning, the method further comprises:
    acquiring a first type of training set and a second type of training set, wherein the first type of training set and the second type of training set each include one or more images to be trained;
    acquiring a network model, wherein the network model is used to convert the plurality of images to be trained from original images into the binary image;
    constructing a loss function of the network model; and
    constructing the preset model according to the first type of training set, the second type of training set, and the loss function of the network model.
  3. The method according to claim 2, wherein constructing the loss function of the network model comprises:
    acquiring hyperparameters of the network model;
    when the network model performs unsupervised learning, determining, based on the hyperparameters, that the loss functions of the network model are a first loss function and a second loss function; and
    when the network model performs supervised learning, determining, based on the hyperparameters, that the loss functions of the network model are a third loss function and a fourth loss function.
  4. The method according to claim 3, wherein constructing the preset model according to the first type of training set, the second type of training set, and the loss function of the network model comprises:
    updating parameters of a discriminator and a generator of the network model based on the first type of training set and the second type of training set, to obtain an updated network model; and
    when the number of updates to the network model reaches a first threshold, constructing the preset model according to the updated network model.
  5. The method according to claim 4, wherein updating the parameters of the discriminator and the generator of the network model based on the first type of training set and the second type of training set, to obtain the updated network model, comprises:
    updating the parameters of the discriminator according to the second loss function, based on the first type of training set;
    updating the parameters of the generator according to the first loss function, based on the first type of training set;
    when the number of updates to the parameters of the discriminator and the generator reaches a second threshold, updating the parameters of the generator according to the third loss function, based on the second type of training set; and
    updating the parameters of the discriminator according to the fourth loss function, based on the second type of training set;
    wherein, when the number of updates to the parameters of the discriminator and the generator reaches a third threshold, the update count of the network model is incremented by one, until the update count of the network model reaches the first threshold.
  6. An apparatus for determining a pupil position, comprising one or more processors and one or more memories storing program units, wherein the program units are executed by the processor, and the program units comprise:
    a first acquisition module, configured to acquire an image to be detected that contains a pupil;
    a second acquisition module, configured to acquire, based on a preset model of semi-supervised learning, a binary image corresponding to a preset region, wherein the preset region is the region of the image to be detected in which the pupil is located;
    a third acquisition module, configured to acquire a centroid of the binary image; and
    a determination module, configured to determine a center position of the pupil according to the centroid of the binary image.
  7. The apparatus according to claim 6, wherein the apparatus further comprises:
    a fifth acquisition module, configured to acquire a first type of training set and a second type of training set, wherein the first type of training set and the second type of training set each include one or more images to be trained;
    a sixth acquisition module, configured to acquire a network model, wherein the network model is used to convert the plurality of images to be trained from original images into the binary image;
    a first construction module, configured to construct a loss function of the network model; and
    a second construction module, configured to construct the preset model according to the first type of training set, the second type of training set, and the loss function of the network model.
  8. The apparatus according to claim 7, wherein the first construction module comprises:
    a seventh acquisition module, configured to acquire hyperparameters of the network model;
    a first determination module, configured to determine, based on the hyperparameters, that the loss functions of the network model are a first loss function and a second loss function when the network model performs unsupervised learning; and
    a second determination module, configured to determine, based on the hyperparameters, that the loss functions of the network model are a third loss function and a fourth loss function when the network model performs supervised learning.
  9. The apparatus according to claim 8, wherein the second construction module comprises:
    a first update module, configured to update parameters of a discriminator and a generator of the network model based on the first type of training set and the second type of training set, to obtain an updated network model; and
    a third construction module, configured to construct the preset model according to the updated network model when the number of updates to the network model reaches a first threshold.
  10. The apparatus according to claim 9, wherein the first update module comprises:
    a second update module, configured to update the parameters of the discriminator according to the second loss function, based on the first type of training set;
    a third update module, configured to update the parameters of the generator according to the first loss function, based on the first type of training set;
    a fourth update module, configured to update the parameters of the generator according to the third loss function, based on the second type of training set, when the number of updates to the parameters of the discriminator and the generator reaches a second threshold; and
    a fifth update module, configured to update the parameters of the discriminator according to the fourth loss function, based on the second type of training set;
    wherein, when the number of updates to the parameters of the discriminator and the generator reaches a third threshold, the update count of the network model is incremented by one, until the update count of the network model reaches the first threshold.
  11. A storage medium, comprising a stored program, wherein the program performs the method for determining a pupil position according to any one of claims 1 to 5.
  12. A processor, configured to run a program, wherein the program, when run, performs the method for determining a pupil position according to any one of claims 1 to 5.
PCT/CN2018/119882 2018-01-23 2018-12-07 Method and apparatus for determining position of pupil WO2019144710A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/349,799 US10949991B2 (en) 2018-01-23 2018-12-07 Method and apparatus for determining position of pupil

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810064311.2 2018-01-23
CN201810064311.2A CN108197594B (en) 2018-01-23 2018-01-23 Method and device for determining pupil position

Publications (1)

Publication Number Publication Date
WO2019144710A1 true WO2019144710A1 (en) 2019-08-01

Family

ID=62590429

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/119882 WO2019144710A1 (en) 2018-01-23 2018-12-07 Method and apparatus for determining position of pupil

Country Status (4)

Country Link
US (1) US10949991B2 (en)
CN (1) CN108197594B (en)
TW (1) TWI714952B (en)
WO (1) WO2019144710A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308014A (en) * 2020-11-18 2021-02-02 成都集思鸣智科技有限公司 High-speed accurate searching and positioning method for reflective points of pupils and cornea of eyes
US10949991B2 (en) 2018-01-23 2021-03-16 Beijing 7Invensun Technology Co., Ltd. Method and apparatus for determining position of pupil
CN113762393A (en) * 2021-09-08 2021-12-07 杭州网易智企科技有限公司 Model training method, gaze point detection method, medium, device, and computing device

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111222374A (en) * 2018-11-26 2020-06-02 广州慧睿思通信息科技有限公司 Lie detection data processing method and device, computer equipment and storage medium
CN113815623B (en) * 2020-06-11 2023-08-08 广州汽车集团股份有限公司 Method for visually tracking eye point of gaze of human eye, vehicle early warning method and device
CN116524581B (en) * 2023-07-05 2023-09-12 南昌虚拟现实研究院股份有限公司 Human eye image facula classification method, system, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129553A (en) * 2011-03-16 2011-07-20 上海交通大学 Method for eye detection based on single infrared light supply
CN103425970A (en) * 2013-08-29 2013-12-04 大连理工大学 Human-computer interaction method based on head postures
CN104732202A (en) * 2015-02-12 2015-06-24 杭州电子科技大学 Method for eliminating influence of glasses frame during human eye detection
US9104908B1 (en) * 2012-05-22 2015-08-11 Image Metrics Limited Building systems for adaptive tracking of facial features across individuals and groups
CN105205453A (en) * 2015-08-28 2015-12-30 中国科学院自动化研究所 Depth-auto-encoder-based human eye detection and positioning method
CN108197594A (en) * 2018-01-23 2018-06-22 北京七鑫易维信息技术有限公司 The method and apparatus for determining pupil position

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5091965A (en) * 1990-07-16 1992-02-25 Sony Corporation Video image processing apparatus
US6812688B2 (en) * 2001-12-12 2004-11-02 Tektronix, Inc. Signal acquisition method and apparatus using integrated phase locked loop
JP2008039596A (en) * 2006-08-07 2008-02-21 Pioneer Electronic Corp System, method, program for providing information and memory medium
JP2010142428A (en) * 2008-12-18 2010-07-01 Canon Inc Photographing apparatus, photographing method, program and recording medium
JP5436076B2 (en) * 2009-07-14 2014-03-05 キヤノン株式会社 Image processing apparatus, image processing method, and program
JP5836634B2 (en) * 2011-05-10 2015-12-24 キヤノン株式会社 Image processing apparatus and method
US8824779B1 (en) * 2011-12-20 2014-09-02 Christopher Charles Smyth Apparatus and method for determining eye gaze from stereo-optic views
US10048749B2 (en) * 2015-01-09 2018-08-14 Microsoft Technology Licensing, Llc Gaze detection offset for gaze tracking models
CN105303185A (en) * 2015-11-27 2016-02-03 中国科学院深圳先进技术研究院 Iris positioning method and device
CN106845425A (en) * 2017-01-25 2017-06-13 迈吉客科技(北京)有限公司 Visual tracking method and tracking device
CN107273978B (en) * 2017-05-25 2019-11-12 清华大学 A kind of method for building up and device of the production confrontation network model of three models game

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10949991B2 (en) 2018-01-23 2021-03-16 Beijing 7Invensun Technology Co., Ltd. Method and apparatus for determining position of pupil
CN112308014A (en) * 2020-11-18 2021-02-02 成都集思鸣智科技有限公司 High-speed accurate searching and positioning method for reflective points of pupils and cornea of eyes
CN112308014B (en) * 2020-11-18 2024-05-14 成都集思鸣智科技有限公司 High-speed accurate searching and positioning method for pupil and cornea reflecting spot of two eyes
CN113762393A (en) * 2021-09-08 2021-12-07 杭州网易智企科技有限公司 Model training method, gaze point detection method, medium, device, and computing device
CN113762393B (en) * 2021-09-08 2024-04-30 杭州网易智企科技有限公司 Model training method, gaze point detection method, medium, device and computing equipment

Also Published As

Publication number Publication date
US20200273198A1 (en) 2020-08-27
TWI714952B (en) 2021-01-01
CN108197594B (en) 2020-12-11
CN108197594A (en) 2018-06-22
US10949991B2 (en) 2021-03-16
TW201933050A (en) 2019-08-16

Similar Documents

Publication Publication Date Title
WO2019144710A1 (en) Method and apparatus for determining position of pupil
US11783199B2 (en) Image description information generation method and apparatus, and electronic device
WO2020006961A1 (en) Image extraction method and device
WO2019196633A1 (en) Training method for image semantic segmentation model and server
CN110942154A (en) Data processing method, device, equipment and storage medium based on federal learning
CN110765882B (en) Video tag determination method, device, server and storage medium
CN112889108A (en) Speech classification using audiovisual data
WO2019128676A1 (en) Light spot filtering method and apparatus
US11380131B2 (en) Method and device for face recognition, storage medium, and electronic device
CN112748941B (en) Method and device for updating target application program based on feedback information
US20230237630A1 (en) Image processing method and apparatus
CN117274491B (en) Training method, device, equipment and medium for three-dimensional reconstruction model
CN110929041A (en) Entity alignment method and system based on layered attention mechanism
US11899823B2 (en) Privacy safe anonymized identity matching
WO2024088111A1 (en) Image processing method and apparatus, device, medium, and program product
WO2022126921A1 (en) Panoramic picture detection method and device, terminal, and storage medium
CN110866866B (en) Image color imitation processing method and device, electronic equipment and storage medium
CN117972766A (en) Inversion attack method based on multi-mode federal learning
CN112528978A (en) Face key point detection method and device, electronic equipment and storage medium
CN114758130B (en) Image processing and model training method, device, equipment and storage medium
WO2019095596A1 (en) Object detection method, device, storage medium and processor
CN114373034B (en) Image processing method, apparatus, device, storage medium, and computer program
CN113793252B (en) Image processing method, device, chip and module equipment thereof
CN116152368A (en) Font generation method, training method, device and equipment of font generation model
CN111461228B (en) Image recommendation method and device and storage medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18902947; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN EP: public notification in the EP bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 23/11/2020))
122 EP: PCT application non-entry in European phase (Ref document number: 18902947; Country of ref document: EP; Kind code of ref document: A1)