CN115984203A - Eyeball protrusion measuring method, system, terminal and medium - Google Patents


Info

Publication number
CN115984203A
CN115984203A (application number CN202211658086.8A)
Authority
CN
China
Prior art keywords
eye, points, point, pupil center, key
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211658086.8A
Other languages
Chinese (zh)
Inventor
马超
陈恺
Current Assignee
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN202211658086.8A
Publication of CN115984203A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a method, a system, a terminal and a medium for measuring eyeball protrusion, comprising the following steps: capture images of the left and right viewing angles of the eye region with a binocular camera; perform key point detection on the captured images with a deep neural network to obtain the two-dimensional pixel coordinates of the pupil center points and canthus points of the left and right eyes; use the features of each key point to find its matching point in the other view, and calculate the disparity from the matching point; from the two-dimensional pixel coordinates and the disparity of the pupil center points and canthus points, obtain their three-dimensional world coordinates using binocular stereo vision; and from those three-dimensional world coordinates, calculate the distance in space from each pupil center point to the straight line formed by the canthus points, obtaining the patient's eyeball protrusion distance. The invention realizes automatic measurement of the patient's eyeball protrusion distance; the equipment is simple and inexpensive, the method is harmless to health, the measurement is fast, and the results are highly accurate.

Description

Eyeball protrusion measuring method, system, terminal and medium
Technical Field
The invention relates to the technical field of medical image processing, and in particular to a method, a system, a terminal and a medium for measuring eyeball protrusion.
Background
Eyeball protrusion (exophthalmos) is the most common clinical sign of orbital disease, appearing in more than 80 percent of orbital disease cases. In addition, many endocrine diseases, such as hyperthyroidism, can also cause eyeball protrusion; moreover, these diseases may present only with protrusion and no other orbital abnormality, in which case determining the presence or absence of protrusion is the sole basis for a definitive diagnosis. Determining whether the eyeball protrudes, and measuring the degree of protrusion, is therefore of great clinical significance.
Currently, two methods are mainly used in medicine to measure the degree of eyeball protrusion: Hertel exophthalmometry and CT measurement. In Hertel exophthalmometry, the examiner visually aligns a scale line with the corneal apex and reads off the protrusion of the eyeball relative to the lateral orbital rim. The method is simple and easy to perform, but its accuracy and precision are low, it is strongly affected by human factors, repeated measurements of the same patient at different times often differ considerably, and the patient experience is poor. In the CT measurement method, the slice showing the largest axial section of the eyeball is selected from the CT scan, the anterior edges of the two lateral orbital bone walls are connected by a line, and the distance from the posterior pole of the eye ring to this line is measured. The method is intuitive and highly accurate and precise, but the equipment is expensive, the test cost is high, it is time-consuming, and it exposes the patient to a certain radiation dose. Given the shortcomings of the existing methods, new techniques and schemes are urgently needed.
In recent years, with the continuous development of computer science and great improvements in hardware capability, computer vision technology has been widely applied in many fields. Traditional computer vision methods such as binocular stereo vision rest on a rigorous mathematical foundation and achieve high precision, and are therefore widely used in aerospace, navigation and positioning, industrial production, three-dimensional reconstruction and other fields. With the continuing development of deep learning theory and applications, deep learning has outperformed traditional methods in computer vision, natural language processing and other fields, opening new possibilities for innovation in each of them. In medicine, diagnoses generally must be made from images, and computers hold clear advantages in image processing speed and precision, so computer vision technology has seen increasingly wide medical application in recent years.
A search of the prior art found Chinese patent publication CN111803024A, which discloses a system and method for measuring the degree of eyeball protrusion based on a deep learning algorithm. It comprises a CT/MRI orbit image cutting module, an eye ring contour labeling module, an eyeball protrusion labeling module, an eyeball protrusion degree calculation module and a terminal, all communicatively connected, and uses these modules to calculate the ratio of the volume of the eyeball protruding beyond the orbit to the total eyeball volume. By cutting the orbital images and accurately labeling the eye ring in each one, that patent obtains a volume ratio — a three-dimensional parameter with higher accuracy than the usual two-dimensional linear and planar area measurements — laying a foundation for moving orbital decompression surgery from subjective to quantitative prediction. However, that patent requires a CT/MRI machine to scan the patient's head to obtain volumetric images, so the equipment is expensive, the test cost is high, the procedure is time-consuming, and it involves a certain radiation dose. It also requires a large number of self-labeled orbital images to train the eye ring contour labeling module and the eyeball protrusion labeling module, so the labeling cost is high. In addition, because the results of previous measurement methods are two-dimensional, the results of that invention cannot be compared with them directly, which makes verifying its accuracy inconvenient.
Disclosure of Invention
In view of the above defects in the prior art, the invention aims to provide a method, a system, a terminal and a medium for measuring eyeball protrusion based on deep learning and binocular stereo vision, realizing fully automatic, high-precision, low-cost and radiation-free measurement.
In a first aspect of the present invention, there is provided an eyeball protrusion measurement method comprising:
capturing images of the left and right viewing angles of the eye region with a binocular camera;
performing key point detection on the captured images with a deep neural network to obtain the two-dimensional pixel coordinates of the key points, the key points being the pupil center points and canthus points of the left and right eyes;
finding, for each key point, its matching point in the other view using the key point's features, and calculating the disparity from the matching point;
obtaining, from the two-dimensional pixel coordinates and the disparity, the three-dimensional world coordinates of the pupil center points and canthus points of the left and right eyes using binocular stereo vision;
and calculating, from those three-dimensional world coordinates, the distance in space from each pupil center point to the straight line formed by the canthus points, obtaining the patient's eyeball protrusion distance.
Optionally, performing key point detection on a captured image with a deep neural network comprises:
predicting the key points of the face with a deep learning algorithm;
segmenting the eye region out of the whole face based on the facial key points to obtain an eye image;
and feeding the segmented eye image into a neural network to predict each key point of the eye.
Optionally, obtaining the three-dimensional world coordinates of the pupil center points and canthus points of the left and right eyes using binocular stereo vision comprises:
calculating, from the matching points, the disparity of each pupil center point and canthus point between the left and right views;
and obtaining the intrinsic and extrinsic parameters of the binocular camera, and converting the two-dimensional pixel coordinates of the pupil center points and canthus points into three-dimensional world coordinates using a coordinate transformation matrix.
Optionally, finding a key point's matching point in the other view using the key point's features comprises:
for a key point in one of the left- and right-view images, calculating its 128-dimensional SIFT feature;
then calculating the SIFT feature of each candidate point in the other view and finding the point whose feature is closest to that of the key point, which is the matching point.
In a second aspect of the present invention, there is provided an eyeball protrusion measurement system comprising:
an image acquisition module, which captures images of the left and right viewing angles of the eye region with a binocular camera;
a key point detection module, which performs key point detection on the captured images with a deep neural network to obtain the two-dimensional pixel coordinates of the pupil center points and canthus points of the left and right eyes;
a disparity calculation module, which finds each key point's matching point in the other view using its features and calculates the disparity from the matching point;
a coordinate conversion module, which obtains the three-dimensional world coordinates of the pupil center points and canthus points from their two-dimensional pixel coordinates and disparity using binocular stereo vision;
and a distance calculation module, which calculates, from those three-dimensional world coordinates, the distance in space from each pupil center point to the straight line formed by the canthus points, obtaining the eyeball protrusion distance.
According to a third aspect of the present invention, there is provided a terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, performs the eyeball protrusion measurement method or operates the eyeball protrusion measurement system.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the eyeball protrusion measurement method or operates the eyeball protrusion measurement system.
Compared with the prior art, the embodiment of the invention has at least one of the following beneficial effects:
the method and the system of the invention realize the automatic measurement of the distance of the eyeball of the patient, and the whole measurement process does not need manual participation, so the result is not influenced by human factors; the pupil center and the canthus point which need to be extracted are the key points commonly used in academia and industry, a plurality of public data sets can be directly used or finely adjusted, and the labeling cost is greatly reduced.
The method and the system of the invention have the advantages of simple and cheap required equipment, realization by extremely low cost and no harm to the health of patients.
The method and the system of the invention utilize the computer vision technology, the measuring speed is very fast, the measuring result is relatively stable and has higher precision: one measurement can be completed within 10s, and the measurement precision can be controlled to be about 2mm by a method of averaging multiple measurements.
Because the past measuring method and measuring result are both based on two dimensions, the measuring result of the distance between the eyeball and the eyeball is also based on two dimensions, and the measuring result can be directly compared to verify the accuracy of the method.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is a flowchart illustrating a method for measuring an eyeball protrusion degree according to an embodiment of the invention;
FIG. 2 is a block diagram of an exemplary system for measuring eye prominence;
FIG. 3 is a flowchart of the protrusion measurement according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of the measured eyeball protrusion distance in a preferred embodiment of the invention;
FIG. 5 is a diagram of MediaPipe eye keypoint detection in a preferred embodiment of the invention;
FIG. 6 is a HRNet network architecture in a preferred embodiment of the present invention;
FIG. 7 is a diagram illustrating the deviation between the corner of the eye detected by the pre-trained model and the desired corner of the eye in a preferred embodiment of the present invention;
FIG. 8 is a plot of labeled points of the AFLW dataset in accordance with a preferred embodiment of the present invention;
FIG. 9 illustrates the SIFT rationale in a preferred embodiment of the present invention;
FIG. 10 is a schematic view of epipolar constraint in a preferred embodiment of the present invention;
FIG. 11 is a diagram illustrating depth calculation by disparity in a preferred embodiment of the present invention;
fig. 12 is a comparison of the MediaPipe canthus localization results and the canthus localization results after HRNet fine-tuning in a preferred embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments. The following embodiments will assist those skilled in the art in further understanding the invention, but do not limit it in any way. It should be noted that persons skilled in the art can make variations and modifications without departing from the spirit of the invention, all of which fall within its scope.
Referring to fig. 1, in an embodiment of the present invention, there is provided an eyeball protrusion measurement method comprising:
S100, capturing images of the left and right viewing angles of the eye region with a binocular camera;
S200, performing key point detection on the captured images with a deep neural network to obtain the two-dimensional pixel coordinates of the key points, the key points being the pupil center points and canthus points of the left and right eyes;
S300, finding, for each detected key point, its matching point in the other view using the key point's features, and calculating the disparity from the matching point;
S400, obtaining, from the two-dimensional pixel coordinates and the disparity, the three-dimensional world coordinates of the pupil center points and canthus points of the left and right eyes using binocular stereo vision;
and S500, calculating, from those three-dimensional world coordinates, the distance in space from each pupil center point to the straight line formed by the canthus points, obtaining the patient's eyeball protrusion distance.
To improve detection precision and realize fully automatic measurement of the eyeball protrusion distance, this embodiment of the invention combines deep learning with binocular stereo vision: key points are detected and matched, the two-dimensional pixel coordinates of the pupil centers and canthus points are converted into three-dimensional world coordinates, and finally the distance in space from each pupil center to the straight line formed by the canthus points is calculated, yielding the patient's eyeball protrusion distance. The protrusion distance can thus be measured quickly and accurately without manual intervention.
In some embodiments, key point detection on a captured image with a deep neural network in S200 may proceed as follows:
S201, predicting the key points of the face with a deep learning algorithm;
S202, segmenting the eye region out of the whole face based on the facial key points to obtain an eye image;
S203, feeding the segmented eye image into a neural network to predict each key point of the eye.
The key point detection method adopted in this embodiment achieves high detection accuracy for the pupil center and can be deployed easily on mobile, embedded and similar platforms.
In some embodiments, when S300 is executed, the key points detected in the left- and right-view images that correspond to the same three-dimensional world point form a matching point pair, and the disparity is obtained from that pair. In this step, the matching points of the left and right views can be obtained simply and directly by letting the deep neural network detect the key points in both views.
To obtain the matching point pairs of key points in the two images more accurately, in some preferred embodiments a SIFT operator may be used, specifically:
S3011, obtaining the intrinsic and extrinsic parameters of the binocular camera, and constraining the matching points to the same horizontal line with the epipolar constraint;
S3012, after computing the feature of a key point in one view, calculating the SIFT feature of each point on the same horizontal line in the other view and taking the point whose feature is closest to the key point's feature as the matching point; the key point and its matching point in the other view form a matching point pair.
In these steps, because the accuracy of key points detected by the deep learning algorithm may be insufficient, the key points obtained in the left and right views are not necessarily true matching points. The matching points obtained with the SIFT operator are more accurate, so the computed disparity and depth information are more accurate, and the resulting three-dimensional world coordinates are more accurate.
Further, in the feature matching of the above embodiment, 128-dimensional SIFT features may be chosen to describe the key points: for a key point in one view, its 128-dimensional SIFT feature is calculated; the SIFT features of the candidate points in the other view are then calculated, and the point closest to the key point's feature is taken as the matching point. The 128-dimensional SIFT feature gives better results, making the matching more accurate.
In some embodiments, performing S400 to obtain the three-dimensional world coordinates of the pupil center points and canthus points of the left and right eyes with binocular stereo vision may include: obtaining the intrinsic and extrinsic parameters of the binocular camera, and converting the two-dimensional pixel coordinates of the pupil center points and canthus points into three-dimensional world coordinates according to the disparity of the key points.
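For a rectified stereo pair, the pixel-to-world conversion in S400 follows the standard pinhole stereo model, depth Z = f·B/d. Below is a minimal sketch under that assumption; the function name and all parameter values in the example are illustrative, not taken from the patent:

```python
def pixel_to_world(u, v, disparity, fx, fy, cx, cy, baseline):
    """Back-project a pixel in the rectified left view to 3-D camera coordinates.

    Standard stereo model: depth Z = fx * baseline / disparity,
    then X = (u - cx) * Z / fx and Y = (v - cy) * Z / fy.
    fx, fy are focal lengths in pixels; (cx, cy) is the principal point;
    baseline is the distance between the two camera centers.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid depth")
    Z = fx * baseline / disparity
    X = (u - cx) * Z / fx
    Y = (v - cy) * Z / fy
    return (X, Y, Z)
```

With an assumed 800 px focal length, a 6 cm baseline and a 240 px disparity, this yields a depth of 0.2 m, consistent with the ~20 cm working distance described later in the embodiment.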
As shown in fig. 2, another embodiment of the present invention further provides an eyeball protrusion measurement system for implementing the above eyeball protrusion measurement method. Specifically, the system comprises:
an image acquisition module, which captures images of the left and right viewing angles of the eye region with a binocular camera;
a key point detection module, which performs key point detection on the captured images with a deep neural network to obtain the two-dimensional pixel coordinates of the pupil center points and canthus points of the left and right eyes;
a disparity calculation module, which finds each key point's matching point in the other view using its features and calculates the disparity from the matching point;
a coordinate conversion module, which obtains the three-dimensional world coordinates of the pupil center points and canthus points from their two-dimensional pixel coordinates and disparity using binocular stereo vision;
and a distance calculation module, which calculates, from those three-dimensional world coordinates, the distance in space from each pupil center point to the straight line formed by the canthus points, obtaining the eyeball protrusion distance.
For the implementation of each module of the eyeball protrusion measurement system in the above embodiment, reference may be made to the corresponding steps of the eyeball protrusion measurement method, which are not repeated here.
In another embodiment of the present invention, there is further provided a terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, performs the eyeball protrusion measurement method or operates the eyeball protrusion measurement system.
In another embodiment of the present invention, there is further provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the eyeball protrusion measurement method or operates the eyeball protrusion measurement system.
For a better understanding of the technical solution of the present invention, preferred embodiments are described below with reference to a specific application; it should be noted, however, that the invention is not limited to these preferred embodiments.
In the whole embodiment, the only devices used are an ordinary RGB binocular camera and a small edge computing device: the binocular camera captures images of the left and right viewing angles of the eye region, and the edge computing device performs the image processing and computation of the other steps. FIG. 3 is a flowchart of the protrusion measurement in a preferred embodiment of the present invention.
Before measurement begins, the binocular camera formed by two monocular cameras is calibrated to obtain the intrinsic and extrinsic parameters of the two cameras.
During formal measurement, the binocular camera captures images of the left and right viewing angles of the patient's eye region. Specifically, the patient faces the binocular camera directly; the two cameras are not parallel but angled slightly inward, in a '\ /' shape, arranged as shown in fig. 4, with about 6 cm between the two cameras and about 20 cm between the cameras and the patient's face. The image captured by the left camera is called the left view and the image captured by the right camera the right view; together they are the images of the left and right viewing angles.
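As a back-of-envelope check of this camera geometry: with a baseline of about 6 cm and a working distance of about 20 cm, the expected disparity follows d = f·B/Z for a rectified parallel pair. The focal length below is an assumed value for illustration only, not from the patent:

```python
# Back-of-envelope disparity estimate for the geometry described above.
focal_px = 800.0     # assumed focal length of each camera, in pixels
baseline_m = 0.06    # ~6 cm between the two cameras (from the embodiment)
distance_m = 0.20    # ~20 cm from the cameras to the patient's face

# Parallel-camera stereo model: disparity = f * B / Z.
disparity_px = focal_px * baseline_m / distance_m
print(disparity_px)  # 240.0 pixels
```

The slight inward toe-in of the cameras is removed by the stereo rectification described below, after which this parallel-camera approximation applies.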
The captured images are then corrected using the intrinsic and extrinsic camera parameters; this comprises distortion correction and stereo rectification.
The corrected images are given any necessary preprocessing, and then the key points (the pupil center points and canthus points in the left and right views) are detected. The purpose of the preprocessing is to bring the pictures into the format required by the subsequent key point detection model; in this embodiment, for example, the captured picture's attribute information and shooting metadata are supplemented, and the single-channel grayscale image is converted into a three-channel grayscale image.
For key point detection, a MediaPipe model is used to obtain the pixel coordinates of the pupil centers from the image, and HRNet is used to obtain the pixel coordinates of the canthus points. Of course, other networks or models may be used to detect the key points in other embodiments; MediaPipe and HRNet are only one preferred embodiment of the invention, which is not limited to these two.
After the pixel coordinates of the key points (pupil centers and canthus points) in one view are obtained, the SIFT feature descriptor is used to find the matching points of those key points in the other view and the disparity is calculated, from which the world coordinates of the key points are obtained by binocular stereo vision.
Finally, the distance in space from each pupil center point to the line connecting the canthus points of the left and right eyes is calculated, giving the patient's eyeball protrusion distance. The protrusion distance measured by this scheme is the length AP in fig. 4, where points A and A' are the pupil centers of the two eyes and points B and B' are the canthus points of the two eyes.
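Reading fig. 4 as the distance from a pupil center A to the straight line through the canthus points B and B' (with P the foot of the perpendicular — our interpretation of the figure), the final distance computation can be sketched with the standard cross-product point-to-line formula:

```python
import math

def point_to_line_distance(p, a, b):
    """Distance in 3-D from point p to the infinite line through points a and b.

    Uses the cross-product formula |(p - a) x (b - a)| / |b - a|.
    Here p would be a pupil center and a, b the two canthus points,
    all in the three-dimensional world coordinates recovered by stereo.
    """
    d = [b[i] - a[i] for i in range(3)]   # line direction vector
    w = [p[i] - a[i] for i in range(3)]   # vector from a to the point
    cross = [d[1] * w[2] - d[2] * w[1],
             d[2] * w[0] - d[0] * w[2],
             d[0] * w[1] - d[1] * w[0]]
    return (math.sqrt(sum(c * c for c in cross))
            / math.sqrt(sum(c * c for c in d)))
```

For example, a pupil center at (0, 0, 5) and canthus points at (-1, 0, 0) and (1, 0, 0) give a protrusion distance of 5 in the same units as the input coordinates.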
The following describes in detail the specific implementation of each part in the above embodiments:
1) Key point detection models MediaPipe and HRNet
MediaPipe is an open-source deep learning application development framework developed by Google. It supports all major platforms — desktop, mobile, cloud and embedded devices — supports GPU acceleration, and allows rapid development, verification and inference.
This embodiment uses the eye key point detection function in MediaPipe, which stably and accurately outputs the image coordinates of the pupil center. The framework predicts the key points of the face with a deep learning algorithm, segments the eye region out of the whole face, and feeds the segmented eye image into a small neural network that predicts each key point of the eye; the effect is shown in fig. 5. In this embodiment, the pixel coordinates of the pupil centers are obtained with MediaPipe. As for the canthus points, although MediaPipe can detect their two-dimensional pixel coordinates, the results are not very stable, so more accurate canthus localization is needed. This embodiment obtains the canthus points with HRNet, which performs well on tasks such as image classification, segmentation, object detection, facial key point detection and pose estimation; the canthus points are therefore localized accurately with this network.
The structure of HRNet is shown in fig. 6. It starts with a high-resolution subnetwork, gradually adds lower-resolution subnetworks, and connects the subnetworks of multiple resolutions in parallel. Through repeated multi-scale fusion, each representation, from high resolution to low, receives information from the other parallel representations again and again, producing a rich high-resolution representation. The network therefore predicts key point heat maps that are more accurate and spatially precise.
The HRNet pre-trained model was trained on public datasets in which the labeled canthus position lies on the orbital bone, whereas the position at which eyeball protrusion is to be measured in embodiments of the present invention dictates that the desired canthus point is the junction of the white of the eye and the skin of the eye corner. The canthus localization of the pre-trained HRNet model therefore deviates somewhat from the canthus point desired here, as shown in fig. 7: the canthus points detected by the pre-trained model (circles) lie further to the outside, because the dataset's canthus labels were made for common facial key point detection, while the points closer to the inside (squares) are the ones required. The pre-trained HRNet model is therefore fine-tuned.
Specifically, the AFLW (Annotated Facial Landmarks in the Wild) dataset, containing about 25,000 manually labeled face images with varying poses, expressions, lighting and so on, is used to train and validate the HRNet pre-trained model. Each image is labeled with 21 key points, as shown in fig. 8.
On the basis of the AFLW face images, the canthus points of 240 faces were re-labeled, with 200 faces used as the fine-tuning training set and 40 as the fine-tuning validation set. The original 19 key points used in training were reduced to 2 (the left and right canthus points), the learning rate was set to 0.0001 and the batch size to 4, and the gradients of the feature extraction layers were frozen. Other settings kept HRNet's original configuration: the loss was MSELoss, the optimizer was Adam, and training ran for 60 epochs. The resulting HRNet model detects the canthus points required for exophthalmos measurement well.
2) Finding matching points by utilizing SIFT operator
SIFT is a local image feature descriptor, based on scale space, that is invariant to image scaling and rotation and even to affine transformation. SIFT extracts local image features by searching for extreme points in scale space and recording their position, scale and orientation. Once the feature of a key point has been obtained, feature matching is performed in the other image to find the matching point.
The basic principle of the SIFT descriptor is shown in fig. 9. First, the pixel region around the keypoint is divided into blocks: an 8 × 8 pixel image block can be divided into 2 × 2 sub-regions of 4 × 4 pixels each; the gradients of all pixels in each sub-region are Gaussian-weighted and accumulated into 8 orientation bins, yielding a 2 × 2 × 8 = 32-dimensional vector that serves as the mathematical description of the central keypoint. Experiments by Prof. David G. Lowe show that, for each keypoint, the best results are obtained by characterizing it with a 4 × 4 × 8 = 128-dimensional descriptor computed on the surrounding 16 × 16 pixel image block; this embodiment therefore also uses 128-dimensional SIFT feature descriptors.
Specifically, for each keypoint detected in one view in the previous step, its 128-dimensional SIFT feature is computed; SIFT features are then computed for candidate points in the other view, and the point whose feature is closest to that of the keypoint is taken as the matching point.
Before this, the two cameras must be calibrated to obtain each camera's intrinsic parameters and the relative position and orientation between the two cameras, i.e. the internal and external camera parameters. With these parameters, distortion correction and stereo rectification are applied to the images: distortion correction removes lens distortion, while stereo rectification uses the epipolar constraint to bring matching points to the same height (as shown in fig. 10), so that the search for matching points becomes a one-dimensional search along the same horizontal line rather than a two-dimensional search over the whole image.
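The nearest-descriptor search restricted to the same rectified row can be sketched as follows. This is an illustrative sketch, not the patent's code: the function names and the one-pixel row tolerance are assumptions.

```python
import numpy as np

def find_match(desc_left, keypoint_row, rows_right, descs_right):
    """Find the matching point for one left-view keypoint in the right view.

    After stereo rectification the epipolar constraint places the match on
    (approximately) the same image row, so only candidates on that row are
    searched; among them the one with the smallest descriptor distance wins.
    Returns the index of the matching candidate, or None if no candidate
    lies on the row.
    """
    rows_right = np.asarray(rows_right)
    descs_right = np.asarray(descs_right, dtype=float)
    # Epipolar constraint: one-dimensional search along the same row.
    on_row = np.abs(rows_right - keypoint_row) <= 1
    if not on_row.any():
        return None
    candidates = np.flatnonzero(on_row)
    dists = np.linalg.norm(descs_right[candidates] - desc_left, axis=1)
    return int(candidates[np.argmin(dists)])
```

Restricting the search to one row both speeds up matching and removes many false candidates compared with a full two-dimensional search.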
In this embodiment, after the deep learning algorithms (MediaPipe and HRNet here) locate a keypoint in one view, its matching point in the other view is found and the disparity is computed from it, allowing the subsequent step to convert two-dimensional coordinates to three-dimensional ones. The matching point could be taken directly as the keypoint detected in the other view by the deep learning algorithm, but experiments showed this to be less accurate; searching for the matching point with the SIFT operator locates it more precisely. This embodiment therefore does not use the deep-learning keypoints detected in the other view as matching points, but obtains the matching points via the SIFT operator, improving matching accuracy. The matching points are then used to compute the disparity, from which the depth of the keypoints is subsequently calculated.
3) Binocular stereo vision
To obtain the three-dimensional coordinates of the key points, the conversion from pixel coordinates to three-dimensional coordinates needs to be solved, and the conversion can be completed according to the following formula.
$$ Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} R & t \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} $$
where $Z_c$ is the depth, $(u, v)$ are the pixel coordinates, $(X_w, Y_w, Z_w)$ are the world coordinates, and the first two matrices after the equals sign are the intrinsic and extrinsic camera matrices. According to the formula, the camera's internal and external parameters and the depth of a keypoint are needed to obtain the conversion from two-dimensional to three-dimensional coordinates.
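For a pixel whose depth is known, the projection relation can be inverted to recover the three-dimensional point. The following is a minimal numpy sketch, under the assumption that $R$ and $t$ map world coordinates to camera coordinates (with $R = I$, $t = 0$ the world frame coincides with the camera frame); the function name is illustrative.

```python
import numpy as np

def pixel_to_world(u, v, depth, K, R=np.eye(3), t=np.zeros(3)):
    """Invert Z_c * [u, v, 1]^T = K [R|t] [X_w, Y_w, Z_w, 1]^T
    for a pixel (u, v) with known depth Z_c."""
    # Pixel -> camera coordinates: back-project through K, scale by depth.
    p_cam = depth * (np.linalg.inv(K) @ np.array([u, v, 1.0]))
    # Camera -> world coordinates: X_w = R^T (X_c - t).
    return R.T @ (p_cam - t)
```

For example, with focal length 800 px and principal point (320, 240), the principal-point pixel at depth 2 m back-projects to the point (0, 0, 2) on the optical axis.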
Just as humans perceive depth with two eyes, computer vision can recover depth information from the disparity of the same object in the left and right views. The depth calculation is illustrated in fig. 11, where $B$ is the distance between the optical centers of the two cameras (the baseline), $f$ is the focal length of the cameras, and $x_l$ and $x_r$ are the pixel abscissas of the matching points $P$ and $P'$ in the left and right views. After the matching points $P$ and $P'$ are found in the left and right views, the depth of point $P$ is calculated according to the following formula:
$$ Z_c = \frac{fB}{x_l - x_r} $$
After the depth of a keypoint has been calculated, the three-dimensional coordinates of the pupil centers $A$ and $A'$ and the eye corner points $B$ and $B'$ are obtained through the coordinate transformation described above.
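The depth-from-disparity relation $Z_c = fB/(x_l - x_r)$ is a one-liner; the sketch below (function name assumed) makes the unit conventions explicit.

```python
def depth_from_disparity(x_left, x_right, f, baseline):
    """Z_c = f * B / (x_l - x_r) for rectified views.

    Disparity and focal length f are in pixels; the result has the same
    unit as the baseline B (e.g. meters).
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("matched point must have positive disparity")
    return f * baseline / disparity
```

For instance, a 20-pixel disparity with f = 800 px and a 6 cm baseline gives a depth of 0.06 × 800 / 20 = 2.4 m.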
4) Calculating the degree of the patient's eyeball protrusion
After the three-dimensional coordinates of the pupil centers $A$ and $A'$ and the eye corner points $B$ and $B'$ are obtained, the left and right eye corner points are connected to form the straight line $BB'$, and the distance $d$ from a pupil center to this line is calculated with the point-to-line distance formula in space.
$$ d = \frac{\left| \overrightarrow{BA} \times \overrightarrow{BB'} \right|}{\left| \overrightarrow{BB'} \right|} $$
This distance $d$ is the degree of eyeball protrusion.
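The point-to-line distance $d = |\overrightarrow{BA} \times \overrightarrow{BB'}| / |\overrightarrow{BB'}|$ maps directly onto a cross product in numpy; the sketch below (function name assumed) computes it from the three reconstructed points.

```python
import numpy as np

def protrusion_distance(A, B, B_prime):
    """Distance from pupil center A to the line through corner points B, B'.

    Uses the 3-D point-to-line formula d = |BA x BB'| / |BB'|.
    """
    A, B, B_prime = map(np.asarray, (A, B, B_prime))
    v = B_prime - B                      # direction of the corner line
    return float(np.linalg.norm(np.cross(A - B, v)) / np.linalg.norm(v))
```

As a sanity check, a pupil center 3 mm off a corner line lying along the x-axis yields d = 3.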
The implementation effect is as follows:
The AFLW data are divided into 4 groups for training, with 50, 100, 150 and 200 re-annotated face images respectively; the normalized mean error (NME) of each group on the validation set is then compared, with results shown in Table 1. As the amount of training data increases, localization of the eye corner points becomes increasingly accurate, reaching its minimum at 150 training images.
TABLE 1 comparison of prediction errors on validation set for HRNet model trained with different training data volumes
Number of training images | 0      | 50     | 100    | 150    | 200
NME                       | 0.0215 | 0.0090 | 0.0087 | 0.0080 | 0.0087
An example is shown in fig. 12: the first row is the eye corner localization result of MediaPipe, the second row that of the fine-tuned HRNet, the third row the matching points found by SIFT in the other view for the MediaPipe corners, and the fourth row the matching points found by SIFT for the fine-tuned HRNet corners. The fine-tuned HRNet localizes the eye corner points much better than before fine-tuning, and slightly better than MediaPipe. The matching points found in the other view from the SIFT descriptors can also be observed to be more accurate.
Three subjects (A, B and C) were randomly selected and their eyeball protrusion measured; the results are shown in Table 2. There is some deviation between repeated measurements of the same subject, and also between the left- and right-eye data of the same subject, mainly because the fine-tuned HRNet's localization of the eye corner points is not yet fully stable. After averaging multiple measurements of the same subject, however, the results improve markedly, and each group of tests takes no more than 10 seconds to complete. Compared with the ground-truth data, the measurement error of this scheme is about 2 mm.
Subsequently, if the measurement accuracy of the method needs to be improved further, more real photographs can be collected and annotated with eye corner points, and the model then trained on them from scratch.
TABLE 2 distance of eyeball protrusion measured by three subjects
(Table 2 is provided as an image in the original publication and its values are not reproduced here.)
The embodiment of the present invention addresses the problems of existing clinical methods for measuring eyeball protrusion, in which the result is strongly affected by human factors, testing is expensive, and radiation exposure may be involved. The entire process requires no manual intervention, so the result is not influenced by human factors; the measurement equipment is simple and inexpensive; the measurement process is fast; and the results are relatively stable, accurate and easy to verify.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, and the like in the system, and those skilled in the art may implement the step flow of the method with reference to the technical solution of the system, that is, the embodiment in the system may be understood as a preferred example for implementing the method, and details are not described here.
Those skilled in the art will appreciate that, besides implementing the system and its various devices provided by the present invention purely as computer-readable program code, the method steps can equally be implemented by embodying the system and its devices in logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. The system and its various devices provided by the present invention may therefore be regarded as hardware components, and the devices they contain for realizing various functions may also be regarded as structures within a hardware component; devices for realizing various functions may even be regarded both as software modules implementing the method and as structures within a hardware component.
The foregoing description has described specific embodiments of the present invention. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes and modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The above-described preferred features may be used in any combination without conflict with each other.

Claims (10)

1. An eyeball protrusion measuring method, comprising:
shooting images of left and right visual angles of the eye region by using a binocular camera;
detecting key points on the shot image by using a deep neural network to obtain two-dimensional pixel coordinates of the key points, wherein the key points are pupil center points and corner points of the left eye and the right eye;
searching a matching point of the key point in another view by using the characteristics of the key point, and calculating parallax according to the matching point;
based on the two-dimensional pixel coordinates and the parallax of the pupil center points and the eye corner points of the left eye and the right eye, three-dimensional world coordinates of the pupil center points and the eye corner points of the left eye and the right eye are obtained by using binocular stereo vision;
and connecting the eye corner points in space to form a straight line according to the three-dimensional world coordinates of the pupil center points and eye corner points of the left and right eyes, and calculating the distance from the pupil center point to the straight line to obtain the degree of the patient's eyeball protrusion.
2. The method according to claim 1, wherein the performing key point detection on the captured image using the deep neural network comprises:
predicting key points of the face through a deep learning algorithm;
based on the key points of the face, dividing the eye region from the whole face to obtain an eye image;
and sending the segmented eye images into a neural network, thereby predicting each key point of the eye.
3. The method for measuring the degree of eyeball protrusion according to claim 1, wherein the obtaining of the three-dimensional world coordinates of the pupil center points and the eye angle points of the left and right eyes by using binocular stereovision comprises:
calculating the parallax of the pupil center point and the eye corner point in the left view and the right view according to the matching points;
and acquiring internal and external parameters of the binocular camera, and converting two-dimensional pixel coordinates of pupil center points and corner points of the left eye and the right eye into three-dimensional world coordinates by using a coordinate conversion matrix.
4. The method according to claim 3, wherein the finding a matching point of the key point in another view using the feature of the key point comprises:
for a key point in an image in the left and right visual angle images, calculating 128-dimensional SIFT characteristics of the key point;
and then calculating SIFT features of each point in the other image of the image at the left and right visual angles, and finding out the point closest to the key point features as a matching point.
5. The method according to claim 3, wherein the acquiring internal and external parameters of the binocular camera includes:
calibrating the two cameras of the binocular camera so as to obtain respective intrinsic parameters of the cameras and relative positions and orientations between the two cameras, namely internal and external parameters of the cameras.
6. The method according to claim 5, wherein the image is subjected to distortion correction and stereo correction after obtaining the internal and external parameters of the camera, the distortion correction is used for eliminating the distortion of the image, the stereo correction is performed by using an epipolar constraint method to constrain the matching points to the same height, and only one-dimensional search is performed on the same horizontal line when the matching points are found.
7. The eyeball protrusion degree measurement method according to any one of claims 3 to 5, wherein the two-dimensional pixel coordinates of the pupil center points and the corner points of the left and right eyes are converted to three-dimensional world coordinates by using a coordinate conversion matrix, wherein:
and calculating the depths of the pupil center points and the eye corner points of the left eye and the right eye according to the parallax of the images of the left visual angle and the right visual angle and the internal and external parameters of the camera, and then obtaining the three-dimensional world coordinates of the pupil center points and the eye corner points by adopting the coordinate transformation matrix.
8. An ocular prominence measurement system, comprising:
an image acquisition module: shooting images of left and right visual angles of an eye region by using a binocular camera;
the key point detection module: performing key point detection on the shot image by using a deep neural network to obtain two-dimensional pixel coordinates of pupil center points and canthus points of the left eye and the right eye;
a parallax calculation module: searching a matching point in another view by using the characteristics of the key points, and calculating parallax according to the matching point;
a coordinate conversion module: based on the two-dimensional pixel coordinates and the parallax of the pupil center points and the eye corner points of the left eye and the right eye, three-dimensional world coordinates of the pupil center points and the eye corner points of the left eye and the right eye are obtained by using binocular stereo vision;
a distance calculation module: connecting the eye corner points in space to form a straight line according to the three-dimensional world coordinates of the pupil center points and eye corner points, and calculating the distance from the pupil center point to the straight line to obtain the degree of eyeball protrusion.
9. A terminal comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, is operable to perform the method of any one of claims 1 to 7 or to run the system of claim 8.
10. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, is adapted to carry out the method of any one of claims 1 to 7 or to run the system of claim 8.
CN202211658086.8A 2022-12-22 2022-12-22 Eyeball protrusion measuring method, system, terminal and medium Pending CN115984203A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211658086.8A CN115984203A (en) 2022-12-22 2022-12-22 Eyeball protrusion measuring method, system, terminal and medium


Publications (1)

Publication Number Publication Date
CN115984203A true CN115984203A (en) 2023-04-18

Family

ID=85966041

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211658086.8A Pending CN115984203A (en) 2022-12-22 2022-12-22 Eyeball protrusion measuring method, system, terminal and medium

Country Status (1)

Country Link
CN (1) CN115984203A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116974370A (en) * 2023-07-18 2023-10-31 深圳市本顿科技有限公司 Anti-addiction child learning tablet computer control method and system
CN116974370B (en) * 2023-07-18 2024-04-16 深圳市本顿科技有限公司 Anti-addiction child learning tablet computer control method and system


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination