Disclosure of Invention
The technical problem solved by the embodiment of the invention is how to realize automatic supervision of amblyopia training.
In order to solve the above technical problem, an embodiment of the present invention provides a method for supervising amblyopia training, including: acquiring a facial image of an amblyopia training object, and performing face recognition on the facial image to obtain a face recognition result; determining the position of each face key point according to the face recognition result; and estimating the concentration index of the amblyopia training object according to the position of each face key point, wherein the concentration index is used for representing the concentration degree.
Optionally, the method for supervising amblyopia training further comprises: when the concentration index does not meet a preset concentration range, determining a head posture adjustment strategy according to the position of each face key point; and outputting posture adjustment reminding information, wherein the posture adjustment reminding information comprises the head posture adjustment strategy.
Optionally, the estimating the concentration index of the amblyopia training object according to the position of each face key point includes: estimating the concentration index of the amblyopia training object according to the positions of the eye key points and the nose key points, wherein the face key points comprise the eye key points and the nose key points.
Optionally, the estimating the concentration index of the amblyopia training object according to the positions of the eye key points and the nose key points comprises: calculating a left-eye distance and a right-eye distance according to the positions of the eye key points, wherein the left-eye distance refers to the distance between the left eye center point and the nose center point in the horizontal direction, and the right-eye distance refers to the distance between the right eye center point and the nose center point in the horizontal direction; and estimating the concentration index according to the absolute value of the difference between the left-eye distance and the right-eye distance.
Optionally, the estimating the concentration index of the amblyopia training object according to the absolute value of the difference between the left-eye distance and the right-eye distance includes: calculating the eye distance according to the position of the left eye center point and the position of the right eye center point; and calculating the ratio of the absolute value of the difference to the eye distance, and taking the ratio as the concentration index.
Optionally, the estimating the concentration index of the amblyopia training object according to the position of each face key point includes: calculating the eye distance according to the position of the center point of the left eye and the position of the center point of the right eye; calculating the height of a left eye according to the positions of the highest point and the lowest point of the left eye, and estimating the concentration index of the amblyopia training object according to the ratio of the height of the left eye to the eye distance, wherein the height of the left eye refers to the distance between the highest point and the lowest point of the left eye in the vertical direction; or calculating the height of the right eye according to the positions of the highest point and the lowest point of the right eye, and estimating the concentration index of the amblyopia training object according to the ratio of the height of the right eye to the distance between the eyes, wherein the height of the right eye refers to the distance between the highest point and the lowest point of the right eye in the vertical direction.
Optionally, the method for supervising amblyopia training further comprises: estimating the distance between the eyes of the amblyopia training object and the amblyopia training screen according to the position of each face key point; if the distance between the eyes of the amblyopia training object and the amblyopia training screen exceeds a set distance range, determining a distance adjustment strategy of the amblyopia training object, and outputting distance adjustment reminding information, wherein the distance adjustment reminding information comprises the distance adjustment strategy of the amblyopia training object; or if the distance between the eyes of the amblyopia training object and the amblyopia training screen exceeds the set distance range, determining a distance adjustment strategy of the amblyopia training screen, and adjusting the position of the amblyopia training screen according to the distance adjustment strategy of the amblyopia training screen.
Optionally, the method for supervising amblyopia training further comprises: and if the distance between the eyes of the amblyopia training object and the amblyopia training screen is smaller than a first threshold value, suspending amblyopia training, wherein the set distance range comprises the first threshold value.
Optionally, the estimating, according to the position of each face key point, a distance between the eyes of the amblyopia training object and the amblyopia training screen includes: selecting N facial images, and respectively calculating the eye distance of the amblyopia training object in each facial image; calculating the eye distance of the amblyopia training object during the amblyopia training according to the eye distance corresponding to the N facial images respectively; and estimating the distance between the eyes of the amblyopia training object and the amblyopia training screen according to the eye distance of the amblyopia training object in the amblyopia training and a preset distance threshold.
Optionally, the method for supervising amblyopia training further comprises: performing eye mask recognition on the facial image; determining the eye mask wearing condition of the amblyopia training object according to the eye mask recognition result, wherein the eye mask is used for shielding the eye of the amblyopia training object that does not currently need amblyopia training; acquiring amblyopia correction information of the amblyopia training object, wherein the amblyopia correction information comprises: the position of the amblyopic eye and whether the eye mask needs to be worn; and judging whether the amblyopia correction of the amblyopia training object is standard according to the eye mask wearing condition of the amblyopia training object.
Optionally, the method for supervising amblyopia training further comprises: when the amblyopia training object wears the eye mask properly, determining the size of an eye mask identifier according to the eye mask recognition result, wherein the eye mask identifier is arranged on the eye mask; estimating the distance between the eyes of the amblyopia training object and the amblyopia training screen according to the size of the eye mask identifier; if the distance between the eyes of the amblyopia training object and the amblyopia training screen exceeds a set distance range, determining a distance adjustment strategy of the amblyopia training object, and outputting distance adjustment reminding information, wherein the distance adjustment reminding information comprises the distance adjustment strategy of the amblyopia training object; or if the distance between the eyes of the amblyopia training object and the amblyopia training screen exceeds the set distance range, determining a distance adjustment strategy of the amblyopia training screen, and adjusting the position of the amblyopia training screen according to the distance adjustment strategy of the amblyopia training screen.
Optionally, the performing eye mask recognition on the facial image includes: determining a face region according to the face recognition result obtained by performing face recognition on the facial image; and performing eye mask recognition on the face region.
Optionally, the eye mask recognition of the face region includes performing eye mask recognition on the face region by using an eye mask recognition model, wherein the eye mask recognition model is obtained by training based on a YOLO-Tiny target detection algorithm, the number of recognition target classes is configured to be 1, and the number of filters is configured to be 18.
Optionally, the model data of the eye mask recognition model is set in an SD card, and the eye mask recognition model is loaded in the following manner: when triggering of loading of the eye mask recognition model is detected, copying the model data of the eye mask recognition model from the SD card to an SD card directory, and acquiring a model file path, so as to read the model data of the eye mask recognition model from the SD card directory based on the model file path.
Optionally, the performing eye mask recognition on the facial image includes: opening a first thread, and performing eye mask recognition on the facial image on the first thread.
Optionally, the acquiring a facial image of the amblyopia training object includes: acquiring the facial image of the amblyopia training object through an image acquisition device, and putting the acquired facial image into a first buffer queue; and obtaining the facial image from the first buffer queue.
Optionally, the method for supervising amblyopia training further comprises: when the number of frames of facial images in the first buffer queue exceeds a preset number of frames, discarding one frame of facial image in the first buffer queue.
Optionally, the performing face recognition on the facial image includes: opening a second thread, and performing face recognition on the facial image on the second thread.
The embodiment of the invention also provides a device for supervising amblyopia training, which comprises: the face recognition unit is used for acquiring a face image of the amblyopia training object and carrying out face recognition on the face image to obtain a face recognition result; the determining unit is used for determining the positions of key points of each face according to the face recognition result; and the estimation unit is used for estimating the concentration index of the amblyopia training object according to the position of each face key point, and the concentration index is used for representing the concentration degree.
An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium is a non-volatile storage medium or a non-transitory storage medium, and a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of any of the above-mentioned methods for supervising amblyopia training are performed.
The embodiment of the invention also provides a terminal, which comprises a memory and a processor, wherein the memory stores a computer program capable of running on the processor, and the processor performs the steps of any of the above-mentioned methods for supervising amblyopia training when running the computer program.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
by obtaining a facial image of the amblyopia training object and performing face recognition on the facial image, the position of each face key point is determined according to the face recognition result; the concentration index of the amblyopia training object, which is used for representing the concentration degree, is then estimated according to the position of each face key point. Automatic supervision of the amblyopia training object is thereby realized, and labor cost is saved.
Detailed Description
As mentioned above, in order to achieve a better amblyopia training effect, people with relatively poor self-control (such as children) usually need manual supervision by parents or others to ensure that high concentration is maintained during amblyopia training, and such manual supervision consumes a lot of manpower.
In order to solve the above problems, in the embodiment of the present invention, a facial image of an amblyopia training object is obtained, face recognition is performed on the facial image, the position of each face key point is determined according to the face recognition result, and then a concentration index of the amblyopia training object is estimated according to the position of each face key point, where the concentration index is used for representing the concentration degree, so that automatic supervision of the amblyopia training object is realized and labor cost is saved.
In order to make the aforementioned objects, features and advantages of the embodiments of the present invention more comprehensible, specific embodiments accompanied with figures are described in detail below.
Referring to fig. 1, a flowchart of a method for supervising amblyopia training in the embodiment of the present invention is shown, which specifically includes the following steps:
step S11, obtaining the face image of the amblyopia training object, and carrying out face recognition on the face image to obtain a face recognition result.
And step S12, determining the position of each face key point according to the face recognition result.
And step S13, estimating concentration indexes of the amblyopia training objects according to the positions of key points of each face, wherein the concentration indexes are used for representing the concentration degree.
According to research, in the process of amblyopia training, the amblyopia training object needs to pay attention to the pictures presented on the amblyopia training screen, or cooperate with the pictures to carry out certain game operations, and therefore needs to maintain good concentration to achieve the corresponding training effect. When the amblyopia training object is distracted or concentration is low, the gaze moves away from or deviates from the amblyopia training screen, and different facial postures are exhibited; when the facial posture changes, the relative positions of the face key points also change. Therefore, based on these findings, the concentration index of the amblyopia training object can be estimated based on the positions of the face key points, and the concentration condition of the amblyopia training object can be further judged.
In specific implementation, in step S13, when the concentration index of the amblyopia training object is estimated according to the positions of the face key points, the estimation method of the concentration index differs depending on which face key points are selected, as described below by way of example.
In a non-limiting embodiment of the present invention, the face key points include eye key points and nose key points, and the concentration index of the amblyopia training object can be estimated according to the positions of the eye key points and the nose key points.
Further, estimating the concentration index of the amblyopia training object according to the positions of the eye key points and the nose key points can be specifically realized by the following steps: calculating a left-eye distance and a right-eye distance according to the positions of the eye key points, wherein the left-eye distance refers to the distance between the left eye center point and the nose center point in the horizontal direction, and the right-eye distance refers to the distance between the right eye center point and the nose center point in the horizontal direction; and estimating the concentration index according to the absolute value of the difference between the left-eye distance and the right-eye distance. The eye key points are used for characterizing the contour of the eyes; the eye key points for the left eye may include the left eye center point, and correspondingly, the eye key points for the right eye may include the right eye center point. The nose key points are used for characterizing the contour of the nose and may include the nose center point.
Further, estimating the concentration index of the amblyopia training object according to the absolute value of the difference between the left-eye distance and the right-eye distance can be specifically realized as follows: calculating the eye distance according to the position of the left eye center point and the position of the right eye center point; and calculating the ratio of the absolute value of the difference to the eye distance, and taking the ratio as the concentration index.
For example, the concentration index may be calculated using the following formula (1):

P = |L - R| / D    (1)

wherein P is the concentration index, L is the left-eye distance, R is the right-eye distance, and D is the eye distance.

In specific implementation, the concentration degree of the amblyopia training object can be judged according to the concentration index. When the concentration index does not satisfy the preset concentration range, it can be judged that the concentration of the amblyopia training object is insufficient. For example, the preset concentration range may be [0, 0.4]; that is, when the concentration index is larger than 0.4, it can be judged that the concentration is insufficient, that is, the object is not concentrating. It should be noted that this example is given only for ease of understanding; in practice, other values may be selected according to actual requirements, and no limitation is imposed here.
Further, the head posture of the amblyopia training object may also be determined based on the difference between the left-eye distance and the right-eye distance. For example, if the difference between the left-eye distance and the right-eye distance is greater than zero, the head of the amblyopia training object is turned to the right; if the difference is less than zero, the head of the amblyopia training object is turned to the left.
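The calculation described above can be sketched in code. This is a minimal illustration only; the function name is hypothetical, and key points are assumed to be given as (x, y) pixel coordinates obtained from the face recognition result.

```python
def concentration_index(left_eye_center, right_eye_center, nose_center):
    """Estimate the concentration index from eye and nose key points.

    Mirrors formula (1): P = |L - R| / D, where L and R are the
    horizontal distances from each eye center to the nose center and
    D is the inter-ocular distance. Also infers the head-turn
    direction from the sign of (L - R), as described in the text.
    """
    # Horizontal distances from each eye center to the nose center.
    left_dist = abs(left_eye_center[0] - nose_center[0])    # L
    right_dist = abs(right_eye_center[0] - nose_center[0])  # R
    # Inter-ocular distance D between the two eye centers.
    eye_dist = ((left_eye_center[0] - right_eye_center[0]) ** 2
                + (left_eye_center[1] - right_eye_center[1]) ** 2) ** 0.5
    index = abs(left_dist - right_dist) / eye_dist
    diff = left_dist - right_dist
    direction = "right" if diff > 0 else ("left" if diff < 0 else "frontal")
    return index, direction
```

With the example threshold above, an index greater than 0.4 would be judged as insufficient concentration.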
In another embodiment of the present invention, estimating the concentration index of the amblyopic training object according to the position of each face key point can be implemented by the following steps: calculating the eye distance according to the position of the center point of the left eye and the position of the center point of the right eye; calculating the height of a left eye according to the positions of the highest point and the lowest point of the left eye, and estimating the concentration index of the amblyopia training object according to the ratio of the height of the left eye to the eye distance, wherein the height of the left eye refers to the distance between the highest point and the lowest point of the left eye in the vertical direction; or calculating the height of the right eye according to the positions of the highest point and the lowest point of the right eye, and estimating the concentration index of the amblyopia training object according to the ratio of the height of the right eye to the distance between the eyes, wherein the height of the right eye refers to the distance between the highest point and the lowest point of the right eye in the vertical direction.
When the concentration index is estimated according to the left eye height or the right eye height, the concentration degree of the amblyopia training object can likewise be judged according to the relation between the concentration index and a preset concentration range. For example, when the concentration index is less than 0.07, it can be judged that the concentration is insufficient, that is, the object is not concentrating. It should be noted that 0.07 is only an example for ease of understanding and does not limit the protection scope of the embodiment of the present invention; in practice, other values may be selected according to actual requirements, and no limitation is imposed here.
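The eye-height variant can be sketched similarly; again the function name is hypothetical and key points are assumed to be (x, y) pixel coordinates.

```python
def eye_openness_index(eye_top, eye_bottom, left_eye_center, right_eye_center):
    """Concentration index based on eye height relative to eye distance.

    eye_top and eye_bottom are the highest and lowest key points of
    one eye (left or right); the index is the vertical eye height
    divided by the inter-ocular distance.
    """
    # Vertical distance between the highest and lowest eye key points.
    eye_height = abs(eye_top[1] - eye_bottom[1])
    # Inter-ocular distance between the two eye centers.
    eye_dist = ((left_eye_center[0] - right_eye_center[0]) ** 2
                + (left_eye_center[1] - right_eye_center[1]) ** 2) ** 0.5
    return eye_height / eye_dist
```

With the example threshold above, an index below 0.07 would be judged as insufficient concentration (e.g. eyes nearly closed or looking down).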
In specific implementation, when the concentration index does not meet the preset concentration range, a head posture adjustment strategy is determined according to the position of each face key point, and posture adjustment reminding information is output, wherein the posture adjustment reminding information comprises the head posture adjustment strategy. Therefore, when the concentration degree of the amblyopia training object is low, the amblyopia training object is reminded to adjust the head posture, so that better concentration is maintained and the amblyopia training effect is improved.
In specific implementation, the distance between the eyes of the amblyopia training object and the amblyopia training screen can be estimated according to the position of each face key point; and if the distance between the eyes of the amblyopia training object and the amblyopia training screen exceeds a set distance range, a distance adjustment strategy of the amblyopia training object is determined, and distance adjustment reminding information is output, wherein the distance adjustment reminding information comprises the distance adjustment strategy of the amblyopia training object. The amblyopia training object is thus reminded to adjust its posture in time, which is beneficial to ensuring the amblyopia training effect.
And if the distance between the eyes of the amblyopia training object and the amblyopia training screen exceeds a set distance range, determining a distance adjusting strategy of the amblyopia training screen, and adjusting the position of the amblyopia training screen according to the distance adjusting strategy of the amblyopia training screen. The distance between the amblyopia training screen and the eyes of the amblyopia training object is adjusted by adjusting the position of the amblyopia training screen, so that the distance between the adjusted amblyopia training screen and the eyes of the amblyopia training object is in a set distance range, and the amblyopia training effect is ensured.
Further, if the distance between the eyes of the amblyopia training object and the amblyopia training screen is smaller than a first threshold value, the amblyopia training is suspended, where the set distance range comprises the first threshold value. This protects the eyesight of the amblyopia training object and avoids the eyes of the amblyopia training object being so close to the amblyopia training screen that eyesight is damaged.
In some non-limiting embodiments, the distance between the eyes of the amblyopia training subject and the amblyopia training screen is estimated according to the position of each face key point, which may be specifically implemented as follows: selecting N facial images, and respectively calculating the eye distance of the amblyopia training object in each facial image; calculating the eye distance of the amblyopia training object during the amblyopia training according to the eye distance corresponding to the N facial images respectively; and estimating the distance between the eyes of the amblyopia training object and the amblyopia training screen according to the eye distance of the amblyopia training object in the amblyopia training and a preset distance threshold.
The preset distance threshold may be obtained according to the eye distance of the amblyopia training object. For example, when the amblyopia training object is registered, a facial image of the amblyopia training object may be collected, the eye distance may be calculated based on the collected facial image, and the calculated eye distance may be used as the preset distance threshold. Therefore, the error in distance estimation caused by different eye distances of different amblyopia training objects can be effectively avoided, and the accuracy of the distance estimation is improved.
In some embodiments, when the eye distance of the amblyopia training object during amblyopia training is calculated according to the eye distances corresponding to the N facial images, the following manner may be adopted: corresponding weights may be configured for the N facial images respectively, the eye distances corresponding to the N facial images are weighted using these weights, and the weighted result is used as the eye distance of the amblyopia training object during the amblyopia training; alternatively, the average value of the eye distances corresponding to the N facial images may be used as the eye distance of the amblyopia training object during the amblyopia training.
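The averaging step and the subsequent distance estimation might look as follows in code. The inverse-proportionality (pinhole-style) model and all names here are assumptions for illustration, not prescribed by the embodiment.

```python
def training_eye_distance(eye_dists, weights=None):
    """Combine per-image eye distances (in pixels) over N facial images.

    With weights configured, a weighted average is used; otherwise a
    plain mean over the N images.
    """
    if weights is None:
        return sum(eye_dists) / len(eye_dists)
    return sum(d * w for d, w in zip(eye_dists, weights)) / sum(weights)

def estimate_screen_distance(eye_dist_px, ref_eye_dist_px, ref_distance_cm):
    """Estimate the eye-to-screen distance from the measured eye distance.

    Assumes a simple pinhole-camera model in which the apparent eye
    distance in pixels is inversely proportional to the real distance:
    ref_eye_dist_px is the preset threshold captured at registration
    at a known reference distance ref_distance_cm.
    """
    return ref_distance_cm * ref_eye_dist_px / eye_dist_px
```

For example, if the eyes appear twice as far apart as at the reference distance, the estimated distance is half the reference distance.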
During amblyopia training, an amblyopia training object with mild amblyopia may perform the training either with or without wearing an eye mask, whereas an amblyopia training object with relatively severe amblyopia needs to wear an eye mask to cover the eye with better vision, so that the amblyopic eye is trained. In practice, however, some amblyopia training objects may not wear the eye mask correctly, or may not wear it at all, which affects the amblyopia training effect in the absence of supervision by parents or dedicated personnel. For this reason, in the embodiment of the present invention, eye mask recognition is performed on the facial image; the eye mask wearing condition of the amblyopia training object is determined according to the eye mask recognition result, wherein the eye mask is used for shielding the eye of the amblyopia training object that does not currently need amblyopia training; amblyopia correction information of the amblyopia training object is acquired, wherein the amblyopia correction information comprises the position of the amblyopic eye and whether the eye mask needs to be worn; and whether the amblyopia correction of the amblyopia training object is standard is judged according to the eye mask wearing condition of the amblyopia training object.
In some non-limiting embodiments, when the amblyopia training object wears the eye mask properly, the size of the eye mask identifier, which is arranged on the eye mask, may be determined according to the eye mask recognition result. The distance between the eyes of the amblyopia training object and the amblyopia training screen is then estimated according to the size of the eye mask identifier. If the distance between the eyes of the amblyopia training object and the amblyopia training screen exceeds the set distance range, a distance adjustment strategy of the amblyopia training object is determined, and distance adjustment reminding information is output, wherein the distance adjustment reminding information comprises the distance adjustment strategy of the amblyopia training object. The eye mask identifier can be a preset identification symbol, a preset identification frame, a preset identification picture, and the like.
The relative position of the image acquisition device that acquires the facial image of the amblyopia training object and the amblyopia training screen may be configured to be fixed. When the amblyopia training object wears the eye mask during amblyopia training, the size of the eye mask identifier in the image differs as the distance between the amblyopia training object and the amblyopia training screen differs, so that this distance can be estimated according to the size of the eye mask identifier. If the distance between the amblyopia training object and the amblyopia training screen exceeds the set distance range, then: if the distance is too short, the distance adjustment strategy of the amblyopia training object is determined to be moving away from the amblyopia training screen; and if the distance is too far, the distance adjustment strategy of the amblyopia training object is determined to be moving closer to the amblyopia training screen, so that the distance between the amblyopia training object and the amblyopia training screen falls within the set distance range. The set distance range may be set according to the size of the amblyopia training screen, the size of the pictures or fonts displayed on the screen, and the like.
For example, the facial image of the amblyopia training object may be acquired by a camera of the device to which the amblyopia training screen belongs, or a separate image acquisition device may be used together with the amblyopia training screen while keeping the relative position of the image acquisition device and the amblyopia training screen fixed.
In other non-limiting embodiments, if the distance between the eye of the amblyopia training object and the amblyopia training screen exceeds a set distance range, determining a distance adjustment strategy of the amblyopia training screen, and adjusting the position of the amblyopia training screen according to the distance adjustment strategy of the amblyopia training screen. After the position of the amblyopia training screen is adjusted, the distance between the eyes of the amblyopia training object and the amblyopia training screen is in a preset distance range.
Further, the distance between the amblyopia training object and the amblyopia training screen can be estimated according to the size of the eye mask identifier; an adjustment distance is then determined according to the relation between this distance and the preset distance range, and the position of the amblyopia training screen is adjusted according to the adjustment distance.
In some embodiments, a mapping relationship between the distance between the amblyopia training object and the amblyopia training screen and the size of the eye mask identifier can be configured in advance, and the distance between the amblyopia training object and the amblyopia training screen can then be determined according to the size of the eye mask identifier determined from the eye mask recognition result.
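Such a pre-configured mapping might be realized, for example, as a small lookup table with linear interpolation between configured sample points; the sample values below are purely illustrative.

```python
# Hypothetical pre-configured mapping between eye mask identifier size
# (in pixels) and the distance to the amblyopia training screen (in cm).
SIZE_TO_DISTANCE = [(120, 25), (80, 40), (60, 55), (40, 80)]  # (px, cm)

def distance_from_marker_size(size_px):
    """Look up the eye-to-screen distance for a measured identifier size,
    interpolating linearly between configured sample points and clamping
    outside the configured range."""
    pts = sorted(SIZE_TO_DISTANCE)  # ascending by identifier size
    if size_px <= pts[0][0]:
        return pts[0][1]
    if size_px >= pts[-1][0]:
        return pts[-1][1]
    for (s0, d0), (s1, d1) in zip(pts, pts[1:]):
        if s0 <= size_px <= s1:
            t = (size_px - s0) / (s1 - s0)
            return d0 + t * (d1 - d0)
```

A larger identifier in the image corresponds to a shorter distance; the configured points would in practice come from a calibration of the actual device and eye mask.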
In order to further improve the accuracy of the eye mask recognition result when performing eye mask recognition on a facial image, in some non-limiting embodiments of the present invention, a face region is determined based on the face recognition result obtained by performing face recognition on the facial image, and eye mask recognition is performed on the face region. Determining the face region from the face recognition result and performing eye mask recognition on that specific region, based on where the eye mask is worn, can improve the accuracy of the eye mask recognition result while reducing the recognition workload.
In some embodiments, face recognition may be performed in real time using Artificial Intelligence (AI) models, or using conventional image processing algorithms.
An eye mask recognition model can be used for eye mask recognition of the face region. In some embodiments, the eye mask recognition model can be placed at a server side, and eye mask recognition is performed at the server side so as to reduce performance requirements on the terminal device. In other embodiments, the eye mask recognition model may be placed on the terminal device. The terminal device can be a computer, a mobile phone, a tablet, or another suitable terminal.
In a specific implementation, the eye mask recognition model can be configured in a lightweight manner. Specifically, the eye mask recognition model is obtained by training based on the YOLO-Tiny target detection algorithm, the number of recognition target classes is configured to be 1, and the number of filters is configured to be 18. By configuring the eye mask recognition model in a lightweight manner, the performance requirements on the terminal device can be reduced, so that the eye mask recognition model can run on some low-configuration devices.
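The filter setting of 18 is consistent with the usual YOLO output-layer convention, in which each of the 3 anchor boxes predicted per grid cell carries 4 box coordinates, 1 objectness score, and one score per class, giving 3 × (1 + 5) = 18 filters for a single class. As a small sanity check (this arithmetic is general YOLO background, not taken from the embodiment):

```python
def yolo_output_filters(num_classes, anchors_per_scale=3):
    """Number of filters in the convolutional layer feeding each YOLO
    detection layer: anchors * (x, y, w, h, objectness, class scores)."""
    return anchors_per_scale * (num_classes + 5)
```

With one class (the eye mask) this yields 18, matching the configuration above; the standard 80-class COCO configuration would yield 255.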
Further, in order to reduce the storage space requirement of the eye mask recognition model on the terminal device, in some non-limiting embodiments, the model data of the eye mask recognition model is stored on the SD card, and the eye mask recognition model is loaded in the following manner: when a trigger for loading the eye mask recognition model is detected, the model data of the eye mask recognition model is copied to the SD card directory and the model file path is obtained, so that the model data of the eye mask recognition model can be read from the SD card directory based on the model file path.
In a specific implementation, a first thread may be opened on which eye-mask recognition is performed on the facial image.
In a specific implementation, the facial image of the amblyopia training object is acquired through an image acquisition device, the acquired facial image is placed in a first buffer queue, and the facial image is then obtained from the first buffer queue.
In a specific implementation, when it is detected that the number of frames of facial images in the first buffer queue exceeds a preset frame count, one frame of facial image in the first buffer queue is discarded to ensure that the images in the buffer queue are the latest images.
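The first buffer queue described above can be sketched as a bounded latest-frame queue. The class and names below are illustrative, not from the patent; a capacity of 3 frames is assumed, matching the figures given later in this embodiment.

```python
from collections import deque
import threading

class LatestFrameQueue:
    """Bounded frame buffer that keeps only the newest frames.

    When more than `max_frames` frames accumulate, the oldest frame is
    dropped so consumers always see near-latest images.
    """
    def __init__(self, max_frames=3):
        self.max_frames = max_frames
        self._frames = deque()
        self._lock = threading.Lock()

    def put(self, frame):
        with self._lock:
            self._frames.append(frame)
            # Drop one stale frame once the preset frame count is exceeded.
            if len(self._frames) > self.max_frames:
                self._frames.popleft()

    def get(self):
        with self._lock:
            return self._frames.popleft() if self._frames else None

q = LatestFrameQueue(max_frames=3)
for i in range(5):
    q.put(i)
# Frames 0 and 1 were discarded; frame 2 is now the oldest in the queue.
```

The lock makes the queue safe to share between the image-acquisition thread and the recognition threads described below.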
In a specific implementation, performing face recognition on the facial image includes: opening a second thread and performing face recognition on the facial image in the second thread. Setting a dedicated second thread for face recognition of the facial image reduces the impact on eye mask recognition, image processing, and other tasks.
Furthermore, a first thread is set for eye mask recognition, a third thread is set for image processing, and a second thread is set for face recognition of the facial image, so that multi-thread concurrency can be achieved, the threads do not affect one another, and the smoothness of the amblyopia training supervision method is improved.
Further, it may be determined, according to the eye mask recognition result and the amblyopia correction information, that the amblyopia training object needs to wear the eye mask; if it is determined according to the eye mask recognition result that the duration for which the amblyopia training object has not worn the eye mask reaches a set duration, eye mask wearing reminding information is output to remind the amblyopia training object to wear the eye mask in time. The amblyopia correction information may be preset and stored; for example, the corresponding amblyopia correction information may be collected or configured when the amblyopia training object is registered.
In addition, when the concentration index of the amblyopia training object is obtained through estimation, an amblyopia training supervision report can be output. The amblyopia training supervision report may include a concentration status determined based on the concentration index, and may also include information such as the eye mask wearing condition and the distance from the eyes of the amblyopia training object to the amblyopia training screen. The amblyopia training supervision report enables parents or other people to learn about the amblyopia training condition of the amblyopia training object.
Therefore, the facial images of the amblyopia training object are acquired, face recognition is performed on the facial images, the positions of the face key points are determined according to the face recognition results, and the concentration index of the amblyopia training object is estimated according to the positions of the face key points, wherein the concentration index is used for representing the concentration degree. Automatic supervision of the amblyopia training object is thus achieved, and labor cost is saved.
In order to better understand and implement the method for supervising amblyopia training provided by the above embodiments of the present invention, the following describes a specific workflow and working principle of the method for supervising amblyopia training in combination with an embodiment.
Referring to fig. 2, a schematic diagram of a supervision model for amblyopia training for implementing the supervision method for amblyopia training is given. FIG. 3 is a flowchart illustrating a method for supervising amblyopia training according to an embodiment of the present invention; fig. 4 is a working schematic diagram of a method for supervising amblyopia training in the embodiment of the present invention. The method for supervising amblyopia training in a terminal device is described below with reference to fig. 1 to 4 as an example.
The supervision model for implementing the supervision method of amblyopia training may specifically comprise the following units: the eye patch identification data set YOLO model making unit 10, the terminal device environment initialization unit 20, the concentration recognition unit 30, the eye patch identification recognition unit 40, and the feedback guidance unit 50.
The eye patch identification data set YOLO model making unit 10 is used to make the eye patch recognition model used for eye patch recognition in the above-described embodiments. Specifically, the eye patch identification data set YOLO model making unit 10 may include the following modules: an eye patch identification data acquisition module 11, an eye patch identification data marking module 12, a data model Tiny training configuration module 13, and an eye patch identification data model training module 14.
The eye patch identification data acquisition module 11 is mainly used for acquiring pictures of the eye patch identifier on eye patches; the eye patch identifier can be a distinctive feature picture. In order to ensure the eye patch recognition effect, a plurality of pictures with different viewing angles, different backgrounds, and different lighting can be taken, for example, about 2000 pictures, and the taken pictures are used as the training material for the eye patch identification data model training module 14.
The eye patch identification data marking module 12 is used for labeling the pictures. For example, label area annotation is performed with the labelImg tool: open the labelImg tool, use Open Dir to open the folder of pictures to be labeled, label each picture, enter the name of the eye patch label, and save. The labeled pictures are split in a proportion of 8:1:1 into training samples, test samples, and validation samples. The associated region coordinate configuration file is then generated for use in the training configuration of the eye patch identification data model training module 14.
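The 8:1:1 split above can be sketched as follows. The function and file names are illustrative assumptions; the patent does not specify how the labeled pictures are partitioned beyond the ratio.

```python
import random

def split_dataset(samples, ratios=(8, 1, 1), seed=42):
    """Split labeled pictures into training/test/validation sets at 8:1:1."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)          # shuffle so each split is representative
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_test = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    valid = shuffled[n_train + n_test:]
    return train, test, valid

# About 2000 labeled pictures, as suggested in the text above.
train, test, valid = split_dataset([f"img_{i:04d}.jpg" for i in range(2000)])
```

With 2000 pictures this yields 1600 training, 200 test, and 200 validation samples.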
The data model Tiny training configuration module 13 performs the YOLO-Tiny configuration. First, the number of target classes (classes) is set to 1; that is, only one target class, the eye patch, is configured. Second, the filter (filters) level is set according to the formula (5 + classes) × 3, i.e., set to 18. Third, the maximum number of training iterations (max_batches) is set to 500200, with which the error rate can be reduced to about one in a thousand in testing; it should be understood that other training iteration counts can be configured. With this configuration, a lightweight eye patch recognition model can be achieved to reduce the requirements on the performance of the terminal device.
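The three settings above would appear in a Darknet-style configuration file roughly as in the hypothetical excerpt below (only the lines relevant to the text are shown; a real YOLOv3-tiny cfg contains many more sections):

```ini
[net]
# maximum number of training iterations, as stated above
max_batches=500200

[convolutional]
# filters before each [yolo] layer = (5 + classes) * 3 = (5 + 1) * 3 = 18
filters=18

[yolo]
# a single target class: the eye patch
classes=1
```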
The eye patch identification data model training module 14 is used for performing YOLO data model training on the pictures using the Darknet deep learning framework after the labeled data and configuration files are prepared. The Tiny model is adopted for the training configuration; it is a lightweight model with only about 6 million parameters, roughly one tenth of the original model, so the detection speed is greatly improved. Finally, after training, a YOLO model weight file (yolov3.weights) is generated; the subsequent eye patch identification recognition unit 40 performs eye patch recognition using the trained model.
The terminal device environment initialization unit 20 mainly includes: a loading eye patch identification data model module 21, a loading open source face model data module 22, a Camera2 environment initialization module 23, and an Opencv environment initialization module 24.
When the eye patch identification data model is loaded by module 21 on an Android device, since data on an Android device is read from the sdcard directory, the model data produced by the eye patch identification data set YOLO model making unit 10 needs to be copied to the sdcard directory; after copying is completed, the path of the model yolov3.weights file is transmitted to the eye patch identification recognition unit 40.
The loading open source face model data module 22 copies the facetypefaces_detector_for_cpu and pyramidbox_lite_for_cpu face model data to the sdcard directory; after copying is completed, the model file paths are transmitted to the concentration recognition unit 30.
Regarding the Camera2 environment initialization module 23, taking Camera2 as an example: Camera2 improves the performance of new hardware on Android devices and can capture images at faster intervals. Initialization involves several classes, including the camera management (CameraManager) class, the camera device (CameraDevice) class, and the camera characteristics (CameraCharacteristics) class. The CameraManager acts as the camera system service for managing and connecting camera devices; the CameraCharacteristics class is mainly used to acquire camera information and carries a great deal of camera information internally, including the facing direction, rotation direction, and so on of a camera. A capture request (CaptureRequest) is the camera's settings request for capturing an image; through this class, the image format is set to YUV420, because the native format of the data acquired by the camera is YUV420. OpenGL on the graphics processing unit (GPU) is then used to convert YUV420 into RGB format data with extremely low, almost imperceptible delay, and the acquired data required by the concentration recognition unit 30 and the eye patch identification recognition unit 40 is output via a callback from the image reader (ImageReader) class interface.
The concentration recognition unit 30 mainly includes: a face recognition subunit 31, a concentration analysis algorithm module 32, and a human eye-to-screen distance calculation module 33. The face recognition subunit 31 includes a Camera2 module 310, a data-to-be-recognized receiving module 311, a recognition data processing and converting module 312, and a Paddle-Lite face recognition module 313.
The Camera2 module 310 collects camera data and outputs YUV data through the ImageReader callback function set in the Camera2 environment initialization module 23; the YUV data is then uniformly converted into RGB data through an OpenGL rendering pipeline and output to the data-to-be-recognized receiving module 311 of the face recognition subunit 31 and the data-to-be-recognized receiving module 41 of the eye patch identification recognition unit 40. The OpenGL conversion method defines a texture for each of the Y, U, and V components, and then converts the texture data of each component into the corresponding R, G, and B values using the following formulas: R = Y + 1.4022 × (V - 128); G = Y - 0.3456 × (U - 128) - 0.7145 × (V - 128); B = Y + 1.771 × (U - 128). The benefit of converting with OpenGL is that GPU performance is faster: the conversion takes approximately 1 millisecond, versus approximately 30 milliseconds on the CPU. YUV is a color encoding method: Y represents luminance (Luma), i.e., the grayscale value, while U and V represent chrominance (Chroma) and saturation, which describe the color and saturation of the image and specify the color of a pixel. RGB is an industry color standard in which various colors are obtained by varying the three color channels red (R), green (G), and blue (B) and superimposing them on one another; RGB represents the colors of the red, green, and blue channels.
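A minimal per-pixel CPU sketch of the YUV-to-RGB formulas above (the actual conversion in the text runs as an OpenGL shader on the GPU; the clamping to the 0..255 byte range is an added assumption not spelled out in the text):

```python
def yuv_to_rgb(y, u, v):
    """Convert one YUV420 pixel to RGB using the formulas in the text."""
    r = y + 1.4022 * (v - 128)
    g = y - 0.3456 * (u - 128) - 0.7145 * (v - 128)
    b = y + 1.771 * (u - 128)
    # Clamp each channel to a valid byte value.
    clamp = lambda x: max(0, min(255, int(round(x))))
    return clamp(r), clamp(g), clamp(b)

# A neutral pixel (U = V = 128) has no chroma offset, so R = G = B = Y.
print(yuv_to_rgb(128, 128, 128))  # (128, 128, 128)
```

In the GPU version, the same arithmetic is applied per fragment to the Y, U, and V textures, which is why the conversion finishes in about a millisecond.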
The data-to-be-recognized receiving module 311 creates a buffer queue (capacity 3 frames) for the data to be received. Each frame of RGB data transmitted from the Camera2 module 310 is placed into the created RGB data buffer queue (i.e., the first buffer queue mentioned above); when the first buffer queue holds more than 3 frames, one frame of RGB data in the queue is discarded, ensuring that the received RGB data is the latest. The data-to-be-recognized receiving module 311 separately starts a second thread for face recognition so that the camera rendering thread is not blocked. The flow in the second thread is as follows: when the RGB data buffer queue has data, the RGB data is taken out and transmitted to the face recognition subunit 31 for face recognition analysis, and the memory of the taken-out buffered data is stored in a recovery queue to serve as the memory for a new frame of data from the Camera2 module 310, which avoids frequent garbage collection (GC), saves memory, and improves performance; when the RGB buffer queue has no data, the second thread stops working and sleeps.
The face recognition subunit 31 may adopt Baidu Paddle-Lite face recognition. The recognition process sets the path of the face model, which is obtained from the loading open source face model data module 22; sets the CPU power mode to LITE_POWER_NO_BIND so that the device does not overheat; and sets the number of worker threads so that concurrency reduces recognition delay. A predictor object is created according to the above settings, the data taken out of the RGB buffer queue is set on the predictor object, the predictor then performs prediction, and finally the face recognition result is called back and output to the concentration analysis algorithm module 32 and the human eye-to-screen distance calculation module 33.
The concentration analysis algorithm module 32 analyzes concentration as follows. Paddle-Lite performs face recognition on the RGB data, and the obtained face recognition result may include the region coordinates of the recognized face and 68 face key points; reference may be made to the schematic diagram of face key point positions in the embodiment of the present invention shown in FIG. 5. Calculation proceeds from the 68 identified face key points (numbered 1 to 68): the left eye center point is computed between the lowest point of the left eye (key point 41) and the highest point of the left eye (key point 37), and the horizontal distance from the left eye center point to the nose center point (for example, key point 30) is computed and recorded as the left eye distance. Similarly, the right eye center point is computed between the lowest point of the right eye (key point 47) and the highest point of the right eye (key point 43), and the horizontal distance from the right eye center point to the nose center point (key point 30) is computed and recorded as the right eye distance. The distance from the left eye center point to the right eye center point is recorded as the interocular distance.
abs(left eye distance - right eye distance) / interocular distance > 0.4 is defined as inattention, where abs(·) denotes the absolute value.
The vertical distance between the lowest point of the left eye (key point 41) and the highest point of the left eye (key point 37) is recorded as the left eye height, and the vertical distance between the lowest point of the right eye (key point 47) and the highest point of the right eye (key point 43) is recorded as the right eye height. Left eye height / interocular distance < 0.07 is defined as inattention, and likewise right eye height / interocular distance < 0.07 is defined as inattention. The concentration recognition results of these calculations are passed to the feedback guidance unit 50.
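The concentration rules above can be sketched as a single function over the 68-point landmarks. The point numbering (37/41 for the left eye, 43/47 for the right eye, 30 for the nose) follows the text; the function itself is an illustrative sketch, not the patented implementation.

```python
def is_attentive(landmarks):
    """Apply the two inattention rules in the text to 68-point landmarks.

    `landmarks` maps key point numbers (1..68) to (x, y) pixel coordinates.
    """
    def midpoint(a, b):
        return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

    left_center = midpoint(landmarks[37], landmarks[41])
    right_center = midpoint(landmarks[43], landmarks[47])
    nose = landmarks[30]

    interocular = ((left_center[0] - right_center[0]) ** 2 +
                   (left_center[1] - right_center[1]) ** 2) ** 0.5
    left_dist = abs(left_center[0] - nose[0])    # horizontal direction only
    right_dist = abs(right_center[0] - nose[0])
    left_height = abs(landmarks[37][1] - landmarks[41][1])
    right_height = abs(landmarks[43][1] - landmarks[47][1])

    if abs(left_dist - right_dist) / interocular > 0.4:
        return False   # head turned away from the screen
    if left_height / interocular < 0.07 or right_height / interocular < 0.07:
        return False   # eyes (nearly) closed
    return True

# Synthetic frontal face with open eyes (coordinates are made up).
frontal = {37: (80, 100), 41: (80, 110), 43: (160, 100), 47: (160, 110),
           30: (120, 130)}
```

Dividing by the interocular distance makes both rules independent of how far the face is from the camera.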
It should be noted that the key point with the serial number of 41 is used as the lowest point of the left eye, and the key point with the serial number of 37 is used as the highest point of the left eye, which is only an exemplary description, and in practice, other key points in the serial numbers 37 to 42 of the key points of the left eye may also be used, and it is only necessary that the relative position between the highest point of the left eye and the lowest point of the left eye is satisfied, and the highest point of the left eye is higher than the lowest point of the left eye. Correspondingly, the key point with the serial number of 47 is used as the lowest point of the right eye, and the key point with the serial number of 43 is used as the highest point of the right eye, which is only an exemplary illustration, and in practice, other key points in the serial numbers 43 to 48 of the key points of the right eye may also be used, and it is only required that the relative position between the highest point of the right eye and the lowest point of the right eye is satisfied, and the highest point of the right eye is higher than the lowest point of the right eye.
The human eye-to-screen distance calculation module 33 is used to protect the user's eyes by preventing the eyes from coming too close to the screen and being damaged. Since the amblyopia training object is usually a child, the interocular distance acquired when the child registers his or her face is used as the optimal distance, and subsequently acquired interocular distances are calculated with this value as the reference; the algorithm is simple and efficient and avoids calculation differences caused by each child having a different interocular distance. The specific algorithm is as follows: the interocular distance at face registration is defined as the interocular distance constant (i.e., the preset distance threshold mentioned above), the average of the interocular distance over the latest 5 frames is defined as the interocular distance average, and (interocular distance average - interocular distance constant) / interocular distance constant gives the relative distance from the human eyes to the screen.
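The relative-distance formula above can be sketched as follows. The class name and the sign convention (a positive value meaning the face appears closer than at registration, since the apparent interocular distance grows) are illustrative assumptions:

```python
from collections import deque

class EyeScreenDistanceEstimator:
    """Relative eye-to-screen distance per the formula in the text:
    (mean interocular distance over the latest 5 frames - baseline) / baseline,
    where the baseline is the interocular distance captured at registration.
    """
    def __init__(self, baseline_eye_distance, window=5):
        self.baseline = baseline_eye_distance
        self.frames = deque(maxlen=window)   # keeps only the latest 5 frames

    def update(self, eye_distance):
        self.frames.append(eye_distance)
        mean = sum(self.frames) / len(self.frames)
        return (mean - self.baseline) / self.baseline

est = EyeScreenDistanceEstimator(baseline_eye_distance=80.0)
for d in [80.0] * 5:
    ratio = est.update(d)
# ratio is 0.0 while the child sits at the registration distance
```

Averaging over 5 frames smooths out single-frame landmark jitter before the product decides whether to warn the child.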
The eye patch identification recognition unit 40 is mainly used for performing target detection analysis based on the open-source Darknet AI framework. The eye patch identification recognition unit 40 may include a data-to-be-recognized receiving module 41, a face image format conversion module 42, a Darknet initialization module 43, a Darknet target detection module 44, and a Darknet recognition result processing module 45.
The data processed by the eye patch identification recognition unit 40 is the face recognition result received from the concentration recognition unit 30; the face recognition result includes the RGB data and the region coordinates of the face within the RGB data. The face region can be determined according to the face recognition result and eye patch identification recognition performed on the face region, so the amount of data to be recognized is small and both performance and accuracy are better. When the buffer queue holds more than 3 frames, one frame of data in the queue is discarded to ensure that the latest data is received.
The data-to-be-recognized receiving module 41 creates a buffer queue (capacity 3 frames) for the data to be received. When the recognition result output by the concentration recognition unit 30 is received, the RGB data and face region coordinates are packaged together and placed into the data buffer queue. The data-to-be-recognized receiving module 41 independently opens a first thread for the eye patch identification work so that the second thread for face recognition is not blocked. The flow in the first thread is as follows: when the data buffer queue has data, the data is taken out and transmitted to the face image format conversion module 42 for cropping, where the face RGB data is cropped with OpenCV's copyTo function according to the face region coordinates; the cropped data is transmitted to the Darknet target detection module 44 for detection, and the memory of the taken-out buffered data is stored in the recovery queue to serve as the memory for a new frame of data from the face recognition subunit 31, which avoids frequent GC, saves memory, and improves performance. When the buffer queue has no data, the first thread stops working and sleeps.
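The face-region cropping step above can be sketched as a slice-and-copy over pixel rows. Here the frame is represented as a plain list of rows and `box` as (x, y, w, h); the copy plays the role of OpenCV's copyTo in the text, and the boundary clipping is an added assumption:

```python
def crop_face(frame, box):
    """Crop the face region from a frame before eye patch detection.

    `frame` is a list of pixel rows; `box` is (x, y, w, h) from the face
    recognition result. Coordinates are clipped to the frame bounds.
    """
    frame_h, frame_w = len(frame), len(frame[0])
    x, y, w, h = box
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(frame_w, x + w), min(frame_h, y + h)
    # Slicing each row produces an independent copy of the region.
    return [row[x0:x1] for row in frame[y0:y1]]

frame = [[0] * 640 for _ in range(480)]
face = crop_face(frame, (200, 100, 120, 150))
# face spans 150 rows of 120 pixels
```

Detecting on the cropped region rather than the whole frame is what keeps the data volume small, as noted above.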
The Darknet initialization module 43 initializes the recognition model and detection rule environment. Specifically: 1. Configuration file reading initializes the coco.data training data set description, the yolov3.cfg training configuration, and the yolov3.weights training weights, respectively. 2. Picture file reading and writing. 3. Network model file loading: loading the label strings from coco.names is completed by the get_labels function; loading the network model is completed overall by the load_network function, in which the parse_network_cfg function reads the yolov3.cfg file and completes the memory allocation of the network according to the rules in the file, and the load_weights function then loads the yolov3.weights weight file. The coco.data / yolov3.cfg / yolov3.weights files are obtained from the loading eye patch identification data model module 21.
The Darknet target detection module 44 implements the following: the data output by the face image format conversion module 42 is loaded as a picture and normalized (the function called is load_image_color); prediction is then performed (the function called is network_predict); prediction boxes are then output, boxes with similarity less than 0.5 are filtered out, and if a prediction box with similarity greater than 0.5 exists, this indicates that the child (the amblyopia training object) is wearing the eye patch.
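The post-prediction filtering step above reduces to a threshold over the detector's boxes. The function, the (confidence, box) tuple layout, and the inclusive threshold are illustrative assumptions:

```python
def eye_mask_detected(predictions, threshold=0.5):
    """Filter prediction boxes as described in the text.

    `predictions` is a list of (confidence, box) pairs from the detector.
    Boxes below the threshold are discarded; any surviving box means the
    child is wearing the eye patch.
    """
    kept = [(conf, box) for conf, box in predictions if conf >= threshold]
    return len(kept) > 0, kept

# One confident detection and one weak false positive.
worn, boxes = eye_mask_detected([(0.92, (10, 20, 50, 30)),
                                 (0.31, (200, 40, 45, 28))])
```

The boolean result is what ultimately feeds the eye-patch-wearing reminder described earlier.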
The feedback guidance unit 50 is the product-facing end of the scheme. According to the concentration given by the concentration recognition unit 30 and the eye patch identification recognition unit 40, the distance from the human eyes to the screen, whether the eye patch is worn, and the like, it analyzes the situation and gives adjustment guidance, such as reminding the amblyopia training object to adjust the head posture, to adjust the distance between the eyes and the amblyopia training screen, or to adjust the position of the amblyopia training screen.
The feedback guidance unit 50 mainly includes a big data service product module 51. The big data service product module 51 stores the time, the child's concentration value, the interocular distance, and whether the eye patch is worn in a big data database, and reads a period of database values for analysis. If attention is not concentrated or the eyes are too close to the screen, this is fed back to the product, which can pause and issue a prohibition warning: the child must improve concentration before the product is unlocked and the experience can continue. If it is detected that the eye patch has not been worn for a long time, the product reminds the child to wear the eye patch over the healthy eye. If the child remains concentrated for a long time and meets the standard well, a certain reward is given. Finally, the relevant concentration, the distance from the human eyes to the screen, and whether the eye patch is worn are fed back to the parents in a timely manner, completing the final supervision.
Referring to fig. 6, which shows an amblyopia training supervision apparatus in an embodiment of the present invention, the amblyopia training supervision apparatus 700 may include:
a face recognition unit 71, configured to acquire a face image of an amblyopia training object, and perform face recognition on the face image to obtain a face recognition result;
a determining unit 72, configured to determine positions of key points of each face according to the face recognition result;
an estimating unit 73, configured to estimate a concentration index of the amblyopia training object according to the position of each face key point, wherein the concentration index is used for representing the concentration degree.
In a specific implementation, for the specific working principle and working process of the amblyopia training supervision apparatus 700, reference may be made to the description of the method for supervising amblyopia training provided in any of the above embodiments, and details are not described here again.
An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium is a non-volatile storage medium or a non-transitory storage medium, and a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs any of the above steps of the method for supervising amblyopia training.
The embodiment of the invention also provides a terminal, which comprises a memory and a processor, wherein the memory is stored with a computer program capable of running on the processor, and the processor executes any step of the monitoring method for amblyopia training when running the computer program.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in any computer readable storage medium, and the storage medium may include: ROM, RAM, magnetic or optical disks, and the like.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.