CN114694236B - Eyeball motion segmentation positioning method based on cyclic residual convolution neural network - Google Patents

Eyeball motion segmentation positioning method based on cyclic residual convolution neural network

Info

Publication number
CN114694236B
CN114694236B (application CN202210220173.9A)
Authority
CN
China
Prior art keywords
cornea
eyelid
eye position
neural network
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210220173.9A
Other languages
Chinese (zh)
Other versions
CN114694236A (en)
Inventor
Lixia Lou
Juan Ye
Yaqi Wang
Xingru Huang
Yiming Sun
Zehua Yang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University (ZJU)
Priority to CN202210220173.9A
Publication of CN114694236A
Application granted
Publication of CN114694236B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06N 3/088 Non-supervised learning, e.g. competitive learning
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. ICT SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H 40/20 ICT for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms

Abstract

The invention discloses an eyeball movement segmentation and positioning method based on a cyclic residual convolution neural network. Eye-position photographs of a subject are acquired and input into a first-stage cyclic residual convolution neural network model, which locates the two eyes so that each photograph can be rotated to level them; the rotated photograph is input into the first-stage model again to obtain a binary mask, from which left-eye and right-eye detection-area pictures are cropped and input into a second-stage cyclic residual convolution neural network model that detects the eyelid and cornea masks of both eyes. A scale is obtained from the circular marker on the forehead; the eye-position photographs are then processed with the eyelid and cornea masks to measure eye movement, yielding the pixel distances of the six extraocular muscle functions, which the scale converts into actual size values. The invention has stable technical performance and high segmentation accuracy, detects eyeball movement quickly and automatically, and avoids manual-measurement error through computer-aided image processing, thereby improving the accuracy and objectivity of eye-muscle disease diagnosis. The invention also opens possibilities for physical-examination screening, telemedicine, and the like.

Description

Eyeball motion segmentation positioning method based on cyclic residual convolution neural network
Technical Field
The invention relates to an eye image processing method, in particular to an eyeball motion segmentation positioning method based on a cyclic residual convolution neural network.
Background
Clinically, accurate assessment of eye movements is important for diagnosing eye-muscle-related diseases, especially incomitant strabismus. Extraocular muscle movement is complex, with synergistic and antagonistic interactions between the muscles. For diagnosis, an eye position that displays the major action of a particular extraocular muscle without involving its minor actions is defined as a diagnostic eye position. The six diagnostic eye positions correspond to the six extraocular muscles (two horizontal rectus muscles, two vertical rectus muscles, and two oblique muscles). In clinical practice, physicians usually judge qualitatively and subjectively whether extraocular muscle function is increased or decreased. Although simple, this method lacks accuracy and objectivity and depends heavily on the physician's experience. Several instruments quantitatively assess eye movement (such as the Hess screen and the Lancaster screen), but measurement with them is time-consuming and laborious, and they are typically unavailable to non-specialized medical institutions.
Eyeball movement can be collected and recorded as images. The prior art represents vertical rectus and oblique muscle function by manually matching two eye-position photographs and measuring the distance the corneal limbus has moved. However, this technique cannot avoid image-matching errors, which hinders clinical application. With the rapid development of computer technology, deep learning has been widely applied in medicine and plays an important role in disease diagnosis and treatment. Deep learning therefore offers a direction for developing an eyeball movement measurement method.
Disclosure of Invention
In order to solve the problems in the background art, the invention aims to provide an eyeball movement segmentation and positioning method based on a semi-supervised cyclic residual convolution neural network. Ocular structural features are segmented by a deep learning algorithm, and the functions of the six extraocular muscles are measured automatically to assist the diagnosis and treatment of eye-muscle diseases.
The technical scheme adopted by the invention comprises the following steps:
Step 1: construct a first-stage cyclic residual convolution neural network model D1. Face images from a face attribute dataset are input into the first-stage model D1 to obtain a binary mask of the two eyes in each face image, and the position coordinates of the two eyes are obtained from the binary mask.
The first-stage cyclic residual convolution neural network model D1 is trained in advance on the face attribute dataset.
Step 2: construct a second-stage cyclic residual convolution neural network model D2 for detecting the eyelid contour and the cornea contour.
Step 3: collect eye-position photographs I_original of a number of subjects and input them into the first-stage model D1 to detect a binary mask. The positions of the two eyes are extracted from I_original via the binary mask, giving the pupil-center coordinates (X1, Y1) and (X2, Y2) of the two eyes; I_original is then rotated about the midpoint between the two pupil centers so that both eyes lie on the same horizontal line, yielding the rotated photograph I_rotated.
Each set of eye-position photographs I_original consists of frontal photographs of the subject's eyes in nine different gaze directions.
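A minimal sketch of the rotation in step 3, assuming OpenCV is used (the patent does not name a library): the tilt of the inter-pupil line is measured and the photograph is rotated about the midpoint of the two pupil centers until the line is horizontal.

```python
import cv2
import numpy as np

def rotate_to_horizontal(img, p1, p2):
    """p1, p2: pupil-center coordinates (X1, Y1) and (X2, Y2)."""
    (x1, y1), (x2, y2) = p1, p2
    angle = np.degrees(np.arctan2(y2 - y1, x2 - x1))  # tilt of the inter-pupil line
    center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)       # midpoint between the pupils
    M = cv2.getRotationMatrix2D(center, angle, 1.0)   # rotation that levels the line
    return cv2.warpAffine(img, M, (img.shape[1], img.shape[0]))
```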
Step 4: input the rotated photograph I_rotated again into the first-stage model D1 to detect a binary mask, and use the mask to extract and crop I_rotated into left-eye and right-eye detection-area pictures. Both pictures are scaled to a fixed pixel size, specifically 256 × 256 pixels, and input separately into the second-stage model D2 to detect the eyelid mask and cornea mask of each eye; the eyelid and cornea masks are finally restored to their original sizes before cropping.
Step 5: apply adaptive thresholding to I_rotated, filtering out colors other than the forehead marker with an adaptive mechanism based on a median color threshold, to obtain a binary mask of the circular marker in I_rotated, and measure the horizontal and vertical widths of this mask. Combining the actual horizontal and vertical widths of the circular marker with the widths of the binary mask, the scale of I_rotated, i.e., the actual width and height of each pixel, is obtained by conversion.
Step 6: repeat steps 3 to 5 on the eye-position photographs of all subjects, performing initial training on a small portion of the data followed by cyclic semi-supervised training, to obtain the eyelid mask, cornea mask, and scale of every photograph.
Semi-supervised training means training a classifier on a small amount of labeled data and then using it to predict unlabeled data. These predictions may be better than random guessing, and the predictions on unlabeled data can be adopted as "pseudo-labels" in subsequent classifier iterations.
The eye-position photographs are then processed with the eyelid mask and the cornea mask to measure eye movement, yielding the pixel distances of the six extraocular muscle functions; these pixel values are converted by the scale to obtain the actual size values of the six extraocular muscle functions.
In the eye-position photographs, a circular marker is affixed to the forehead of the face.
In a specific implementation, the circular marker is red with an actual diameter of 10 mm.
The normal subjects have no strabismus or eyelid disease.
The first-stage model D1 and the second-stage model D2 have the same topology, specifically comprising five consecutive cyclic convolution modules, four attention modules, three cross feature fusion modules, and a result output module.
The input image is processed by the five cyclic convolution modules and four 2 × 2 max-pooling operations, with one 2 × 2 max-pooling between each pair of adjacent cyclic convolution modules; that is, the output of each cyclic convolution module undergoes 2 × 2 max-pooling before entering the next.
The output of the fifth cyclic convolution module after one sampling convolution operation and the skip-connected output of the fourth cyclic convolution module are input together into the fourth attention module; the output of the fifth cyclic convolution module after one sampling convolution operation is spliced with the skip-connected output of the fourth attention module to obtain the fourth combined feature map; the output of the fifth cyclic convolution module after one sampling convolution operation serves as the fourth decoder feature map, and the fourth decoder feature map and the fourth combined feature map are input together into the third cross feature fusion module.
The fourth combined feature map after one sampling convolution operation and the output of the third cyclic convolution module are input together into the third attention module; the output of the third cross feature fusion module after one sampling convolution operation is spliced with the skip-connected output of the third attention module to obtain the third combined feature map; the skip-connected output of the third cross feature fusion module after one sampling convolution operation serves as the third decoder feature map, and the third decoder feature map and the third combined feature map are input together into the second cross feature fusion module for processing.
The third combined feature map after one sampling convolution operation and the skip-connected output of the second cyclic convolution module are input together into the second attention module; the output of the second cross feature fusion module after one sampling convolution operation is spliced with the skip-connected output of the second attention module to obtain the second combined feature map; the output of the second cross feature fusion module after one sampling convolution operation serves as the second decoder feature map, and the second decoder feature map and the second combined feature map are input together into the first cross feature fusion module for processing.
The second combined feature map after one sampling convolution operation and the skip-connected output of the first cyclic convolution module are input together into the first attention module; the output of the first cross feature fusion module after one sampling convolution operation is spliced with the skip-connected output of the first attention module to obtain the first combined feature map; the output of the first cross feature fusion module after one sampling convolution operation serves as the first decoder feature map, and the first decoder feature map and the first combined feature map are input together into the result output module for processing.
The result output module applies a 1 × 1 convolution, and its output is the output of the cyclic residual convolution neural network model.
The output of the first-stage model D1 is the binary mask of the two eyes; the output of the second-stage model D2 is the eyelid mask and cornea mask of a single eye.
The sampling convolution operation consists of 2× upsampling followed by a 3 × 3 convolution.
The cyclic convolution module consists of a convolution layer, a normalization layer, and an activation function arranged in sequence and applied cyclically several times, as shown in fig. 5.
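A PyTorch sketch of such a recurrent convolution module, in the style of R2U-Net (the cycle count t and the equal input/output channel count are assumptions):

```python
import torch.nn as nn

class RecurrentConvBlock(nn.Module):
    """Conv + batch norm + activation, applied cyclically: the block's output
    is added back to its input and re-convolved for t cycles."""
    def __init__(self, channels: int, t: int = 2):
        super().__init__()
        self.t = t
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),  # normalization layer
            nn.ReLU(inplace=True),     # activation function
        )

    def forward(self, x):
        out = self.conv(x)
        for _ in range(self.t):        # cyclic processing
            out = self.conv(x + out)
        return out
```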
The skip connection is actually a copy operation.
The topology of the attention module is shown in fig. 6. The decoder feature map and the encoder feature map are each passed through convolution and batch normalization in sequence, and the two results are added to obtain a fused feature; the fused feature is then passed through a rectified linear function, convolution, batch normalization, and a sigmoid gating function in sequence, and the result is multiplied with the encoder feature map to give the output of the attention module.
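A PyTorch sketch of this gate in the style of Attention U-Net, which matches the description above (channel sizes, and the reading of the gating function as a sigmoid, are assumptions):

```python
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, ch: int, inter_ch: int):
        super().__init__()
        # Convolution + batch normalization for each input branch.
        self.w_dec = nn.Sequential(nn.Conv2d(ch, inter_ch, 1), nn.BatchNorm2d(inter_ch))
        self.w_enc = nn.Sequential(nn.Conv2d(ch, inter_ch, 1), nn.BatchNorm2d(inter_ch))
        # ReLU -> convolution -> batch normalization -> sigmoid gating.
        self.psi = nn.Sequential(
            nn.ReLU(inplace=True),
            nn.Conv2d(inter_ch, 1, 1),
            nn.BatchNorm2d(1),
            nn.Sigmoid(),
        )

    def forward(self, dec_feat, enc_feat):
        fused = self.w_dec(dec_feat) + self.w_enc(enc_feat)  # additive fusion
        return enc_feat * self.psi(fused)  # re-weight the encoder feature map
```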
The cross feature fusion module specifically operates as follows: the input decoder feature map and the input combined feature map are spliced to form an intermediate spliced feature map; on one hand, the intermediate spliced feature map is passed through a fully connected layer and matrix-multiplied with the input combined feature map to obtain the output combined feature map, and on the other hand, it is passed through another fully connected layer and matrix-multiplied with the input decoder feature map to obtain the output decoder feature map.
The convolution module mainly consists of a 1 × 1 convolution, batch normalization, and a rectified linear function connected in sequence.
The second-stage model D2 is trained in advance as follows: normal face images of different people are collected and their eyelid contours and cornea contours are labeled; the face images are cropped to obtain binocular image regions, which are scaled to a fixed pixel size, specifically 256 × 256 pixels; all binocular image regions, with their corresponding eyelid and cornea contours as labels, are input as training data into the second-stage model D2 for training.
The pixel distances of the six extraocular muscle functions are obtained by the following processing:
Each photograph of the eye-position set is extracted with the eyelid mask and the cornea mask to obtain an eyelid region and a cornea region, the eyelid region comprising an upper eyelid region and a lower eyelid region. The pixel in the cornea region closest to the temporal side is taken as the outer edge of the cornea, the pixel closest to the nasal side as the inner edge of the cornea, and the intersection of the upper and lower eyelid regions on the nasal side as the inner canthus.
The difference between the corneal-outer-edge-to-inner-canthus distance in the primary-gaze photograph and that in the adduction photograph is taken as the distance L1 that the corneal outer edge moves inward, representing medial rectus function.
The difference between the corneal-inner-edge-to-inner-canthus distance in the abduction photograph and that in the primary-gaze photograph is taken as the distance L2 that the corneal inner edge moves outward, representing lateral rectus function.
In the up-and-out (temporal elevation) gaze photograph, straight lines are drawn along the superotemporal 45-degree direction, and the intersection pixels of each line with the cornea region and the upper eyelid region are taken as the corneal limbus and the upper eyelid margin, respectively; the maximum limbus-to-lid-margin distance over all such lines is taken as the limbus-eyelid distance L3 of the up-and-out gaze, representing superior rectus function.
The superotemporal 45-degree direction points toward the temporal side, inclined 45 degrees upward.
In the up-and-in (nasal elevation) gaze photograph, straight lines are drawn along the superonasal 45-degree direction, and the intersection pixels of each line with the cornea region and the upper eyelid region are taken as the corneal limbus and the upper eyelid margin, respectively; the maximum limbus-to-lid-margin distance over all such lines is taken as the limbus-eyelid distance L4 of the up-and-in gaze, representing inferior oblique function.
In the down-and-out (temporal depression) gaze photograph, straight lines are drawn along the inferotemporal 45-degree direction, and the intersection pixels of each line with the cornea region and the lower eyelid region are taken as the corneal limbus and the lower eyelid margin, respectively; the maximum limbus-to-lid-margin distance over all such lines is taken as the limbus-eyelid distance L5 of the down-and-out gaze, representing inferior rectus function.
In the down-and-in (nasal depression) gaze photograph, straight lines are drawn along the inferonasal 45-degree direction, and the intersection pixels of each line with the cornea region and the lower eyelid region are taken as the corneal limbus and the lower eyelid margin, respectively; the maximum limbus-to-lid-margin distance over all such lines is taken as the limbus-eyelid distance L6 of the down-and-in gaze, representing superior oblique function.
Eyeball movement, that is, the function of the six extraocular muscles, can thus be measured as distances between pairs of points, as sketched below.
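The following sketch illustrates the 45-degree limbus-eyelid measurement used for L3 to L6 (an unoptimized illustration with assumed mask conventions, not the patent's code): the parallel 45-degree lines are enumerated, each is intersected with the cornea and eyelid masks, and the largest limbus-to-lid-margin pixel distance is kept.

```python
import numpy as np

def limbus_lid_distance(cornea: np.ndarray, eyelid: np.ndarray,
                        direction=(1, -1)) -> float:
    """cornea, eyelid: boolean masks of one eye; direction: (dx, dy) of the
    45-degree line in image coordinates, e.g. (1, -1) for up-and-temporal
    on a right eye (assumed convention)."""
    h, w = cornea.shape
    dx, dy = direction
    ys, xs = np.mgrid[0:h, 0:w]
    line_id = xs * dy - ys * dx              # constant along lines parallel to (dx, dy)
    best = 0.0
    for c in np.unique(line_id[cornea]):     # consider only lines crossing the cornea
        on_line = line_id == c
        cor = np.argwhere(cornea & on_line)  # limbus candidates on this line
        lid = np.argwhere(eyelid & on_line)  # lid-margin candidates on this line
        if len(cor) and len(lid):
            # Largest pixel distance between a corneal point and an eyelid point.
            d = np.linalg.norm(cor[:, None, :] - lid[None, :, :], axis=-1).max()
            best = max(best, float(d))
    return best
```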
The invention obtains eyeball movement results by computer image processing, avoiding the subjectivity and error of traditional manual measurement.
The invention evaluates vertical rectus and oblique muscle function via the limbus-eyelid distance on a single eye-position photograph; the moving distance of the limbus need not be measured by matching two photographs, so the extra image-matching step is avoided and the detection results for the vertical rectus and oblique muscles are obtained directly from an effective measurement of the limbus-eyelid distance.
Compared with the prior art, the technical scheme of the invention has the following advantages:
the invention obtains the eye position picture in a simple mode, can measure the function of the external muscles of six eyes only through the picture, and has high identification accuracy. According to the method, a cyclic residual convolution neural network R2U-Net is adopted, and a cyclic and residual network is added on the basis of the U-Net, so that the network structure is deepened and gradient disappearance is avoided; meanwhile, an attention mechanism is added in the model, so that the model can know the importance of different local information in the image and learn more useful information. The invention has stable technical performance, is not limited by age and skin color, and can accurately identify and detect the face photos with different brightness, contrast and saturation. Meanwhile, the difficulty that the eyes cannot be identified when the face images are incomplete in the past is solved, the positions of the eyes can be identified no matter how far and how far the eyes are shot, and the whole identification process can be completed within 3-5 seconds. The computer measurement result of the invention has high consistency with the clinical actual measurement result, and can be used for diagnosis and treatment of eye muscle related diseases. By automatically detecting the eyelid contour and the cornea contour, more parameters can be further measured, and the method is expected to be applied to diagnosis and treatment of more eye diseases. Due to the simplicity of image data storage and transmission, the invention can be used in the fields of physical examination screening, remote medical treatment and the like, and is beneficial to saving and optimizing medical resources. The invention has low requirement on technical equipment, and the whole set of algorithm can be operated on a home computer and a notebook computer.
Drawings
FIG. 1 is a schematic representation of eye movement measurement in the method of the present invention. Taking the right eye as an example, medial rectus function is represented by D1-D2; lateral rectus function by D4-D3; superior rectus function by D5; inferior oblique function by D6; inferior rectus function by D7; superior oblique function by D8.
FIG. 2 is a schematic representation of an embodiment of the method of the present invention.
FIG. 3 is a schematic diagram of a convolutional neural network structure with cyclic residuals.
FIG. 4 is a schematic diagram of a cross feature fusion module in a cyclic residual convolutional neural network model adopted in the present invention.
FIG. 5 is a schematic diagram of a cyclic convolution module in a cyclic residual convolution neural network structure employed in the method.
FIG. 6 is a schematic diagram of an attention module in a circular residual convolutional neural network structure employed in the method.
FIG. 7 shows scatter plots of computer versus manual measurements of the six extraocular muscle functions of 414 eyes.
Detailed Description
The present invention will be further described with reference to the following examples.
The embodiment of the invention and the implementation process thereof are as follows:
Step 1: eye-position photographs of 207 normal subjects (414 eyes) who visited a hospital ophthalmic center between November 2020 and 2021 were collected.
The subjects comprised 88 men and 119 women aged 5 to 60 years, with a mean age of 23.2 ± 12.9 years. A circular marker (red, 10 mm in diameter) was affixed to each subject's forehead. Under identical illumination, a Canon 1500D camera photographed the subjects in nine gaze positions (resolution 6000 × 4000 pixels): the first eye position (looking straight ahead), the second eye positions (looking horizontally left or right and vertically up or down), and the third eye positions (looking toward the upper right, upper left, lower right, and lower left). The eye positions are shown in FIG. 1.
As the reference standard, the eye movements of both eyes of each subject were measured directly with a ruler in advance, following the method for grading extraocular muscle overaction or underaction. As shown in FIG. 1, taking the right eye as an example, medial rectus function is represented by D1-D2; lateral rectus function by D4-D3; superior rectus function by D5; inferior oblique function by D6; inferior rectus function by D7; superior oblique function by D8.
Step 2: based on 30,000 face images from the CelebFaces Attributes Dataset, the first-stage cyclic residual convolution neural network model D1 was constructed to obtain the binary mask of the two eyes in an eye-position photograph. The position coordinates of the eyes are obtained from the mask.
Network parameters of model D1: epochs = 200; batch size = 4; input image size = 512 × 512 pixels; loss function: BCE loss; optimizer: Adam (lr = 0.00001). Randomly scaled pictures were generated from the data so that the positions and sizes of the eye regions vary randomly, and data augmentation by scaling, rotation, mirroring, and added image noise was used to improve the stability and performance of the network.
Step 3: for the primary-gaze face images of 1862 normal volunteers in the ophthalmic center database, physicians manually labeled the eyelid and cornea contours. The 3724 binocular-region images were cropped, scaled to 256 × 256 pixels, and used as training data to construct the second-stage cyclic residual convolution neural network model D2 for detecting eyelid and cornea contours.
Network parameters of model D2: epochs = 200; batch size = 4; input image size = 256 × 256 pixels; loss function: L1 loss; optimizer: Adam (lr = 0.00001). Data augmentation by scaling, rotation, mirroring, and added image noise was used to improve the stability and performance of the network.
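A sketch of the reported training configuration, assuming PyTorch (the patent does not state the framework):

```python
import torch
import torch.nn as nn

def training_setup(model, stage: int):
    """Stage 1 (D1, 512 x 512 inputs): binary mask output, BCE loss.
    Stage 2 (D2, 256 x 256 inputs): contour mask output, L1 loss.
    Both stages: 200 epochs, batch size 4, Adam with lr = 0.00001."""
    criterion = nn.BCELoss() if stage == 1 else nn.L1Loss()  # BCE assumes sigmoid outputs
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
    return criterion, optimizer
```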
Step 4: eye-position photographs I_original of the subjects were collected and input into the first-stage model D1 trained in step 2 to detect a binary mask. The positions of the two eyes were extracted from I_original via the mask, giving the pupil-center coordinates (X1, Y1) and (X2, Y2); I_original was then rotated about the midpoint between the two pupil centers so that both eyes lie on the same horizontal line, yielding the rotated photograph I_rotated.
Step 5: I_rotated was input again into the first-stage model D1 trained in step 2 to detect a binary mask, which was used to extract and crop I_rotated into left-eye and right-eye detection-area pictures. These were scaled to a fixed size, specifically 256 × 256 pixels, and input separately into the second-stage model D2 to detect the eyelid mask and cornea mask of each eye; the masks were finally restored to their original sizes before cropping.
Step 6: adaptive thresholding was applied to I_rotated, filtering out colors other than the forehead marker with an adaptive mechanism based on a median color threshold to obtain a binary mask of the circular marker in I_rotated, and the horizontal and vertical widths of the mask were measured. Combining the actual horizontal and vertical widths of the circular marker (actual diameter 10 mm) with the widths of the binary mask, the scale of I_rotated, i.e., the actual width and height of each pixel, was obtained by conversion.
Step 7: steps 4 to 6 were repeated for all subjects' eye-position photographs, with initial training on a small portion of the data followed by cyclic semi-supervised training, to obtain the eyelid and cornea masks of all photographs for measuring eyeball movement.
Taking the right eye as an example (FIG. 1), on the basis of the recognized eyelid and cornea contours, the pixel lengths of the eight indices D1 to D8 were measured automatically by the algorithm. Medial rectus function is represented by D1-D2; lateral rectus function by D4-D3; superior rectus function by D5; inferior oblique function by D6; inferior rectus function by D7; superior oblique function by D8.
Step 8: the pixel lengths of the eight indices D1 to D8 were converted by the scale obtained in step 6 into actual lengths, giving the actual values (mm) of the six extraocular muscle functions.
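The step 8 conversion is plain arithmetic, as the sketch below shows (names are illustrative assumptions):

```python
def muscle_functions_mm(d: dict, mm_per_px: float) -> dict:
    """d: pixel lengths of the indices keyed 'D1'..'D8';
    returns the six extraocular muscle functions in millimeters."""
    px = {
        "medial rectus": d["D1"] - d["D2"],
        "lateral rectus": d["D4"] - d["D3"],
        "superior rectus": d["D5"],
        "inferior oblique": d["D6"],
        "inferior rectus": d["D7"],
        "superior oblique": d["D8"],
    }
    return {name: v * mm_per_px for name, v in px.items()}
```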
Step 9: the accuracy of the automatic segmentation of the eyelid and cornea contours was evaluated with the Dice coefficient: Dice_eyelid = 0.947 and Dice_cornea = 0.952, indicating high segmentation accuracy. Scatter plots of computer versus manual measurements of the six extraocular muscle functions of 414 eyes were drawn, as shown in FIG. 7, where A: superior rectus; B: inferior oblique; C: lateral rectus; D: medial rectus; E: inferior rectus; F: superior oblique.
From the computer and manual measurements of the six extraocular muscle functions of 414 eyes, the Pearson correlation coefficient (r) and the intraclass correlation coefficient (ICC) of the two methods were calculated; Bland-Altman analysis was used to plot the difference (bias) against the mean of the computer and manual measurements, and the limits of agreement (LoA) were calculated from the differences. The statistical results are shown in the following table.
Table: consistency analysis of computer and manual measurements of the six extraocular muscle functions
[Table content was rendered as an image in the original: Figure GDA0003676874830000081]
Note: a, b: mean ± standard deviation (mm); ***: P < 0.001
The comparison data show that measuring eyeball movement with this technical scheme yields results highly consistent with measurements by experienced clinicians.
This implementation therefore shows that the invention, using a cyclic residual convolution neural network model with an added attention mechanism, can accurately recognize and detect face photographs. The technical performance is stable and unrestricted by skin color or shooting conditions; the automatically measured eyeball movements agree closely with manual measurements, the computer-aided image processing avoids manual-measurement error, and the method can assist the diagnosis and treatment of eye-muscle-related diseases. With automatic detection of the eyelid and cornea contours, more parameters can be measured, with prospective application to the diagnosis and treatment of more eye diseases. Because image data are easy to store and transmit, the invention can serve physical-examination screening, telemedicine, and similar fields, helping to save and optimize medical resources.

Claims (5)

1. An eyeball motion segmentation positioning method based on a cyclic residual convolution neural network, characterized by comprising the following steps:
Step 1: construct a first-stage cyclic residual convolution neural network model D1, and input the face images of a face attribute dataset into the first-stage model D1 to obtain a binary mask of the two eyes in each face image;
Step 2: construct a second-stage cyclic residual convolution neural network model D2 for detecting the eyelid contour and the cornea contour;
Step 3: collect eye-position photographs I_original of the subject and input them into the first-stage model D1 to detect a binary mask; extract the positions of the two eyes from I_original via the binary mask to obtain the pupil-center coordinates (X1, Y1) and (X2, Y2) of the two eyes; rotate I_original about the midpoint between the two pupil centers so that both eyes lie on the same horizontal line, obtaining the rotated photograph I_rotated;
Step 4: input I_rotated again into the first-stage model D1 to detect a binary mask, and use the mask to extract and crop I_rotated into left-eye and right-eye detection-area pictures; scale both pictures to a fixed pixel size and input them separately into the second-stage model D2 to detect the eyelid mask and cornea mask of each eye; finally restore the eyelid and cornea masks to their original sizes before cropping;
Step 5: filter I_rotated with an adaptive mechanism based on a median color threshold to remove colors other than the forehead marker, obtaining a binary mask of the circular marker in I_rotated, and measure the horizontal and vertical widths of the binary mask; combine the actual horizontal and vertical widths of the circular marker with the widths of the binary mask to obtain, by conversion, the scale of I_rotated;
Step 6: repeat steps 3 to 5 on the eye-position photographs of all subjects to obtain the eyelid mask, cornea mask, and scale of every photograph;
process the eye-position photographs with the eyelid mask and the cornea mask to measure eyeball movement and obtain the pixel distances of the six extraocular muscle functions, and convert the pixel values of the six extraocular muscle functions by the scale into their actual size values;
the pixel distances of the six extraocular muscle functions are obtained by the following processing:
extracting each eye-position photograph with the eyelid mask and the cornea mask to obtain an eyelid region and a cornea region, the eyelid region comprising an upper eyelid region and a lower eyelid region; taking the pixel in the cornea region closest to the temporal side as the outer edge of the cornea, the pixel closest to the nasal side as the inner edge of the cornea, and the intersection of the upper and lower eyelid regions on the nasal side as the inner canthus;
taking the difference between the corneal-outer-edge-to-inner-canthus distance in the primary-gaze photograph and that in the adduction photograph as the distance L1 that the corneal outer edge moves inward, representing medial rectus function;
taking the difference between the corneal-inner-edge-to-inner-canthus distance in the abduction photograph and that in the primary-gaze photograph as the distance L2 that the corneal inner edge moves outward, representing lateral rectus function;
in the up-and-out (temporal elevation) gaze photograph, drawing straight lines along the superotemporal 45-degree direction and taking the intersection pixels of each line with the cornea region and the upper eyelid region as the corneal limbus and the upper eyelid margin, respectively, the maximum limbus-to-lid-margin distance over all such lines being the limbus-eyelid distance L3 of the up-and-out gaze, representing superior rectus function;
in the up-and-in (nasal elevation) gaze photograph, drawing straight lines along the superonasal 45-degree direction and taking the intersection pixels of each line with the cornea region and the upper eyelid region as the corneal limbus and the upper eyelid margin, respectively, the maximum limbus-to-lid-margin distance over all such lines being the limbus-eyelid distance L4 of the up-and-in gaze, representing inferior oblique function;
in the down-and-out (temporal depression) gaze photograph, drawing straight lines along the inferotemporal 45-degree direction and taking the intersection pixels of each line with the cornea region and the lower eyelid region as the corneal limbus and the lower eyelid margin, respectively, the maximum limbus-to-lid-margin distance over all such lines being the limbus-eyelid distance L5 of the down-and-out gaze, representing inferior rectus function;
in the down-and-in (nasal depression) gaze photograph, drawing straight lines along the inferonasal 45-degree direction and taking the intersection pixels of each line with the cornea region and the lower eyelid region as the corneal limbus and the lower eyelid margin, respectively, the maximum limbus-to-lid-margin distance over all such lines being the limbus-eyelid distance L6 of the down-and-in gaze, representing superior oblique function.
2. The eyeball motion segmentation positioning method based on the cyclic residual convolution neural network as claimed in claim 1, wherein: in the eye-position photographs, a circular marker is affixed to the forehead of the face.
3. The eyeball motion segmentation positioning method based on the cyclic residual convolution neural network as claimed in claim 1, wherein: the first-stage cyclic residual convolution neural network model D1 and the second-stage cyclic residual convolution neural network model D2 have the same topology, each comprising five consecutive cyclic convolution modules, four attention modules, three cross feature fusion modules, and a result output module;
the input image is processed by the five cyclic convolution modules and four max-pooling operations, with one max-pooling between each pair of adjacent cyclic convolution modules;
the output of the fifth cyclic convolution module after one sampling convolution operation and the output of the fourth cyclic convolution module are input together into the fourth attention module; the output of the fifth cyclic convolution module after one sampling convolution operation is spliced with the output of the fourth attention module to obtain the fourth combined feature map; the output of the fifth cyclic convolution module after one sampling convolution operation serves as the fourth decoder feature map, and the fourth decoder feature map and the fourth combined feature map are input together into the third cross feature fusion module;
the fourth combined feature map after one sampling convolution operation and the output of the third cyclic convolution module are input together into the third attention module; the output of the third cross feature fusion module after one sampling convolution operation is spliced with the output of the third attention module to obtain the third combined feature map; the output of the third cross feature fusion module after one sampling convolution operation serves as the third decoder feature map, and the third decoder feature map and the third combined feature map are input together into the second cross feature fusion module for processing;
the third combined feature map after one sampling convolution operation and the output of the second cyclic convolution module are input together into the second attention module; the output of the second cross feature fusion module after one sampling convolution operation is spliced with the output of the second attention module to obtain the second combined feature map; the output of the second cross feature fusion module after one sampling convolution operation serves as the second decoder feature map, and the second decoder feature map and the second combined feature map are input together into the first cross feature fusion module for processing;
the second combined feature map after one sampling convolution operation and the skip-connected output of the first cyclic convolution module are input together into the first attention module; the output of the first cross feature fusion module after one sampling convolution operation is spliced with the output of the first attention module to obtain the first combined feature map; the output of the first cross feature fusion module after one sampling convolution operation serves as the first decoder feature map, and the first decoder feature map and the first combined feature map are input together into the result output module for processing;
the result output module applies a 1 × 1 convolution, and the output of the result output module is the output of the cyclic residual convolution neural network model.
4. The eyeball motion segmentation positioning method based on the cyclic residual convolution neural network as claimed in claim 3, wherein the cross feature fusion module specifically operates as follows: the input decoder feature map and the input combined feature map are spliced to form an intermediate spliced feature map; on one hand, the intermediate spliced feature map is passed through a fully connected layer and multiplied with the input combined feature map to obtain the output combined feature map, and on the other hand, it is passed through another fully connected layer and multiplied with the input decoder feature map to obtain the output decoder feature map.
5. The eyeball motion segmentation positioning method based on the cyclic residual convolution neural network as claimed in claim 1, wherein the second-stage cyclic residual convolution neural network model D2 is trained in advance as follows: collect normal face images and label the eyelid contours and cornea contours; crop the normal face images to obtain binocular image regions and scale them to a fixed pixel size; input all binocular image regions, with their corresponding eyelid and cornea contours, as training data into the second-stage model D2 for training.
CN202210220173.9A 2022-03-08 2022-03-08 Eyeball motion segmentation positioning method based on cyclic residual convolution neural network Active CN114694236B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210220173.9A CN114694236B (en) 2022-03-08 2022-03-08 Eyeball motion segmentation positioning method based on cyclic residual convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210220173.9A CN114694236B (en) 2022-03-08 2022-03-08 Eyeball motion segmentation positioning method based on cyclic residual convolution neural network

Publications (2)

Publication Number Publication Date
CN114694236A CN114694236A (en) 2022-07-01
CN114694236B true CN114694236B (en) 2023-03-24

Family

ID=82137635

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210220173.9A Active CN114694236B (en) 2022-03-08 2022-03-08 Eyeball motion segmentation positioning method based on cyclic residual convolution neural network

Country Status (1)

Country Link
CN (1) CN114694236B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115886717B (en) * 2022-08-18 2023-09-29 上海佰翊医疗科技有限公司 Eye crack width measuring method, device and storage medium
CN115762787B (en) * 2022-11-24 2023-07-07 浙江大学 Eyelid disease operation curative effect evaluation method and system
CN115909470B (en) * 2022-11-24 2023-07-07 浙江大学 Deep learning-based full-automatic eyelid disease postoperative appearance prediction system and method
CN116725563B (en) * 2023-01-13 2024-02-09 深圳市眼科医院(深圳市眼病防治研究所) Eyeball salience measuring device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020042345A1 (en) * 2018-08-28 2020-03-05 初速度(苏州)科技有限公司 Method and system for acquiring line-of-sight direction of human eyes by means of single camera
CN112686855A (en) * 2020-12-28 2021-04-20 博奥生物集团有限公司 Information correlation method for elephant and symptom information

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108665447B (en) * 2018-04-20 2021-07-30 浙江大学 Glaucoma image detection method based on fundus photography deep learning
CN111667490B (en) * 2020-05-07 2023-06-30 清华大学深圳国际研究生院 Fundus picture cup optic disc segmentation method
CN112837805B (en) * 2021-01-12 2024-03-29 浙江大学 Eyelid topological morphology feature extraction method based on deep learning
CN113486709B (en) * 2021-05-26 2022-05-27 南京泛智信息技术有限公司 Intelligent education platform and method based on virtual reality multi-source deep interaction
CN113378794A (en) * 2021-07-09 2021-09-10 博奥生物集团有限公司 Information correlation method for elephant and symptom information
CN113706542A (en) * 2021-07-14 2021-11-26 温州医科大学附属眼视光医院 Eyeball segmentation method and device based on convolutional neural network and mixed loss function

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020042345A1 (en) * 2018-08-28 2020-03-05 初速度(苏州)科技有限公司 Method and system for acquiring line-of-sight direction of human eyes by means of single camera
CN112686855A (en) * 2020-12-28 2021-04-20 博奥生物集团有限公司 Information correlation method for elephant and symptom information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Lixia Lou et al., "Deep learning-based image analysis for automated measurement of eyelid morphology before and after blepharoptosis surgery," Annals of Medicine, vol. 53, no. 1 (Nov. 30, 2021), pp. 2278-2285. *

Also Published As

Publication number Publication date
CN114694236A (en) 2022-07-01

Similar Documents

Publication Publication Date Title
CN114694236B (en) Eyeball motion segmentation positioning method based on cyclic residual convolution neural network
US11645748B2 (en) Three-dimensional automatic location system for epileptogenic focus based on deep learning
CN113768461B (en) Fundus image analysis method, fundus image analysis system and electronic equipment
WO2020211530A1 (en) Model training method and apparatus for detection on fundus image, method and apparatus for detection on fundus image, computer device, and medium
CN112837805A (en) Deep learning-based eyelid topological morphology feature extraction method
CN112233087A (en) Artificial intelligence-based ophthalmic ultrasonic disease diagnosis method and system
Ye et al. Medical image diagnosis of prostate tumor based on PSP-Net+ VGG16 deep learning network
CN112869697A (en) Judgment method for simultaneously identifying stage and pathological change characteristics of diabetic retinopathy
CN115512110A (en) Medical image tumor segmentation method related to cross-modal attention mechanism
CN116258726A (en) Temporal-mandibular joint MRI image important structure segmentation method based on deep learning
Williams et al. Automatic extraction of hiatal dimensions in 3-D transperineal pelvic ultrasound recordings
CN113053517B (en) Facial paralysis grade evaluation method based on dynamic region quantitative indexes
CN106446805A (en) Segmentation method and system for optic cup in eye ground photo
Miao et al. Classification of Diabetic Retinopathy Based on Multiscale Hybrid Attention Mechanism and Residual Algorithm
WO2023155488A1 (en) Fundus image quality evaluation method and device based on multi-source multi-scale feature fusion
CN116543455A (en) Method, equipment and medium for establishing parkinsonism gait damage assessment model and using same
CN115762787A (en) Eyelid disease surgery curative effect evaluation method and system based on eyelid topological morphology analysis
CN114400086A (en) Articular disc forward movement auxiliary diagnosis system and method based on deep learning
Kumari et al. Automated process for retinal image segmentation and classification via deep learning based cnn model
CN115909470B (en) Deep learning-based full-automatic eyelid disease postoperative appearance prediction system and method
Shanthakumari et al. Glaucoma Detection using Fundus Images using Deep Learning
Doan et al. Implementation of complete glaucoma diagnostic system using machine learning and retinal fundus image processing
Zhong et al. CeCNN: Copula-enhanced convolutional neural networks in joint prediction of refraction error and axial length based on ultra-widefield fundus images
Girard et al. Statistical atlas-based descriptor for an early detection of optic disc abnormalities
Hussein et al. Convolutional Neural Network in Classifying Three Stages of Age-Related Macula Degeneration

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant