CN115171028A - Intelligent glasses adjustment control method and system based on image processing - Google Patents

Intelligent glasses adjustment control method and system based on image processing

Info

Publication number
CN115171028A
Authority
CN
China
Prior art keywords
video frame
frame
face
video
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211082278.9A
Other languages
Chinese (zh)
Other versions
CN115171028B (en)
Inventor
王洁 (Wang Jie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Tianqu Xingkong Technology Co ltd
Original Assignee
Shenzhen Tianqu Xingkong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Tianqu Xingkong Technology Co ltd
Priority to CN202211082278.9A
Publication of CN115171028A
Application granted
Publication of CN115171028B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G02B 27/017 Head-up displays; head mounted
    • G06V 10/761 Proximity, similarity or dissimilarity measures
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G06V 20/48 Matching video sequences
    • G06V 20/49 Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/18 Eye characteristics, e.g. of the iris
    • G08B 6/00 Tactile signalling systems, e.g. personal calling systems
    • G02B 2027/0178 Eyeglass type

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Optics & Photonics (AREA)
  • Ophthalmology & Optometry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an intelligent glasses adjustment control method and system based on image processing, and relates to the technical field of image processing. In the invention, the face area of a target user is subjected to video monitoring processing through target intelligent glasses so as to output a face area monitoring video corresponding to the target user; carrying out user state recognition processing on the face region monitoring video by utilizing a user state recognition neural network formed by pre-training so as to output target user state recognition information corresponding to the target user, wherein the target user state recognition information is used for reflecting the attention concentration of the target user; and adjusting and controlling the target intelligent glasses based on the target user state identification information so as to realize warning processing of the target user. Based on the above, the invention improves the reliability of the adjustment control of the intelligent glasses.

Description

Intelligent glasses adjustment control method and system based on image processing
Technical Field
The invention relates to the technical field of image processing, in particular to an intelligent glasses adjustment control method and system based on image processing.
Background
Intelligent glasses are now widely used. For example, a user can be warned through the intelligent glasses, that is, a corresponding warning is issued when the user's current state is poor, so that harm is avoided. In the prior art, the intelligent glasses warn the user when the user is in a poor state because the viewing time is too long; however, in this whole process there is a problem that the warning is unreliable because the adjustment and control of the intelligent glasses are not performed properly, that is, the reliability of adjusting and controlling the intelligent glasses is poor.
Disclosure of Invention
In view of the above, the present invention provides a method and a system for adjusting and controlling smart glasses based on image processing, so as to improve the reliability of adjusting and controlling the smart glasses to a certain extent.
In order to achieve the above purpose, the embodiment of the invention adopts the following technical scheme:
an intelligent glasses adjusting control method based on image processing comprises the following steps:
carrying out video monitoring processing on a face area of a target user through target intelligent glasses so as to output a face area monitoring video corresponding to the target user, wherein the face area monitoring video comprises a plurality of face area monitoring video frames, and the face area monitoring video frames at least comprise eye information;
carrying out user state recognition processing on the face region monitoring video by utilizing a user state recognition neural network formed by pre-training so as to output target user state recognition information corresponding to the target user, wherein the target user state recognition information is used for reflecting the attention concentration of the target user;
and adjusting and controlling the target intelligent glasses based on the state identification information of the target user so as to realize the warning processing of the target user.
In some preferred embodiments, in the above method for adjusting and controlling smart glasses based on image processing, the step of performing video monitoring processing on the face area of the target user through the target smart glasses to output a face area monitoring video corresponding to the target user includes:
judging whether an intelligent glasses adjusting control instruction sent by target intelligent glasses or target user terminal equipment corresponding to the target intelligent glasses is received or not;
if an intelligent glasses adjusting control instruction sent by the target intelligent glasses or a target user terminal device corresponding to the target intelligent glasses is received, sending a video monitoring instruction to the target intelligent glasses to control the target intelligent glasses to perform video monitoring processing on a face area of a target user, so as to output a face area monitoring video corresponding to the target user.
In some preferred embodiments, in the method for adjusting and controlling smart glasses based on image processing, the step of performing user state recognition processing on the face area surveillance video by using a user state recognition neural network formed through pre-training to output target user state recognition information corresponding to the target user includes:
performing video frame region interception processing on each face region monitoring video frame in a plurality of face region monitoring video frames included in the face region monitoring video so as to output a face sub-region monitoring video frame corresponding to each face region monitoring video frame, wherein the face sub-region monitoring video frame is a sub-region where eyes are located in the face region monitoring video frame;
sequencing according to the face subregion monitoring video frame corresponding to each frame of the face region monitoring video frame and the video frame time sequence of the face region monitoring video frame to form a face subregion monitoring video corresponding to the face region monitoring video;
utilizing a user state recognition neural network formed by pre-training to perform recognition matching processing on the face sub-region monitoring video and each frame of standard face sub-region monitoring video in a plurality of pre-configured standard face sub-region monitoring videos so as to output the video matching degree between the face sub-region monitoring video and each standard face sub-region monitoring video;
and according to the video matching degree between the face sub-region monitoring video and each standard face sub-region monitoring video, fusing user state standard information corresponding to each standard face sub-region monitoring video to output target user state identification information corresponding to the target user.
In some preferred embodiments, in the above intelligent glasses adjustment control method based on image processing, the step of performing recognition matching processing on the face sub-region monitoring video and each of a plurality of standard face sub-region monitoring videos configured in advance by using a user state recognition neural network formed by training in advance to output a video matching degree between the face sub-region monitoring video and each of the standard face sub-region monitoring videos includes:
extracting video frame characteristic data corresponding to each frame of face sub-region monitoring video frame included in the face sub-region monitoring video from the face sub-region monitoring video, and extracting video frame characteristic data corresponding to each frame of standard face sub-region monitoring video frame included in the standard face sub-region monitoring video from the standard face sub-region monitoring video;
loading video frame characteristic data corresponding to each frame of the face sub-region monitoring video frame and video frame characteristic data corresponding to each frame of the standard face sub-region monitoring video frame into a user state recognition neural network formed by pre-training, so as to analyze and recognize video frame matching coefficients between each frame of the face sub-region monitoring video frame and each frame of target video frame in a target video frame cluster by using the user state recognition neural network, and analyze and recognize standard video frame matching coefficients between each frame of the standard face sub-region monitoring video frame and each frame of target video frame in the target video frame cluster, wherein the multi-frame target video frames in the target video frame cluster comprise the face sub-region monitoring video frame and the standard face sub-region monitoring video frame;
and analyzing and determining the video matching degree between the face sub-region monitoring video and the standard face sub-region monitoring video based on the video frame matching coefficient between each frame of the face sub-region monitoring video frame and each frame of the target video frame and the standard video frame matching coefficient between each frame of the standard face sub-region monitoring video frame and each frame of the target video frame.
In some preferred embodiments, in the above intelligent glasses adjustment control method based on image processing, the step of extracting, from the face sub-region surveillance video, video frame feature data corresponding to each frame of face sub-region surveillance video frame included in the face sub-region surveillance video, and extracting, from the standard face sub-region surveillance video, video frame feature data corresponding to each frame of standard face sub-region surveillance video frame included in the standard face sub-region surveillance video includes:
extracting video frame pixel distribution data corresponding to each frame of the face sub-region monitoring video frame from the face sub-region monitoring video, and extracting video frame pixel distribution data corresponding to each frame of the standard face sub-region monitoring video frame from the standard face sub-region monitoring video;
respectively extracting video frame time sequence data corresponding to each frame of the face sub-region monitoring video from the face sub-region monitoring video, and extracting video frame time sequence data corresponding to each frame of the standard face sub-region monitoring video from the standard face sub-region monitoring video;
and constructing and forming video frame characteristic data corresponding to each frame of the face subregion surveillance video frame based on the video frame pixel distribution data and the video frame time sequence data corresponding to each frame of the face subregion surveillance video frame, and constructing and forming video frame characteristic data corresponding to each frame of the standard face subregion surveillance video frame based on the video frame pixel distribution data and the video frame time sequence data corresponding to each frame of the standard face subregion surveillance video frame.
In some preferred embodiments, in the above intelligent glasses adjustment control method based on image processing, the step of loading the video frame feature data corresponding to each frame of the face sub-region surveillance video frame and the video frame feature data corresponding to each frame of the standard face sub-region surveillance video frame into a user state recognition neural network formed by pre-training so as to use the user state recognition neural network to analyze and recognize the video frame matching coefficients between each frame of the face sub-region surveillance video frame and each frame of the target video frames in the target video frame cluster, and analyze and recognize the standard video frame matching coefficients between each frame of the standard face sub-region surveillance video frame and each frame of the target video frames in the target video frame cluster includes:
loading video frame characteristic data corresponding to each frame of the face sub-region monitoring video frame and video frame characteristic data corresponding to each frame of the standard face sub-region monitoring video frame into a user state recognition neural network formed by pre-training;
analyzing and outputting a video frame time sequence data difference value between each face sub-region monitoring video frame and each frame target video frame in a target video frame cluster and analyzing and outputting a video frame time sequence data difference value between each standard face sub-region monitoring video frame and each frame target video frame in the target video frame cluster on the basis of video frame time sequence data corresponding to the face sub-region monitoring video frame and video frame time sequence data corresponding to the standard face sub-region monitoring video frame through the user state recognition neural network;
and analyzing and outputting the video frame matching coefficients between each frame of the face sub-region monitoring video frame and each frame of target video frame, and analyzing and outputting the standard video frame matching coefficients between each frame of the standard face sub-region monitoring video frame and each frame of target video frame, based on the video frame pixel distribution data corresponding to the face sub-region monitoring video frame, the video frame time sequence data difference values between the face sub-region monitoring video frame and each frame of target video frame, the video frame pixel distribution data corresponding to the standard face sub-region monitoring video frame, and the video frame time sequence data difference values between the standard face sub-region monitoring video frame and each frame of target video frame.
In some preferred embodiments, in the above intelligent glasses adjustment control method based on image processing, the step of loading the video frame feature data corresponding to each frame of the face sub-region surveillance video frame and the video frame feature data corresponding to each frame of the standard face sub-region surveillance video frame into a user state recognition neural network formed by pre-training so as to use the user state recognition neural network to analyze and recognize the video frame matching coefficients between each frame of the face sub-region surveillance video frame and each frame of the target video frames in the target video frame cluster, and analyze and recognize the standard video frame matching coefficients between each frame of the standard face sub-region surveillance video frame and each frame of the target video frames in the target video frame cluster includes:
loading video frame characteristic data corresponding to each frame of the face subregion monitoring video frame and video frame characteristic data corresponding to each frame of the standard face subregion monitoring video frame into a video frame characteristic identification mining model included in the user state identification neural network;
and analyzing and identifying the video frame matching coefficient between the monitoring video frame of the face subregion and each frame of target video frame by utilizing a video frame characteristic weight analysis submodel included in the video frame characteristic identification mining model, and analyzing and identifying the standard video frame matching coefficient between the monitoring video frame of the standard face subregion and each frame of target video frame.
In some preferred embodiments, in the above-mentioned image processing-based smart glasses adjustment control method, the image processing-based smart glasses adjustment control method further includes a step of training the formed user state recognition neural network, the step including:
extracting video frame characteristic data corresponding to each frame of an exemplary face sub-region monitoring video frame included in the exemplary face sub-region monitoring video from the configured exemplary face sub-region monitoring video, and extracting video frame characteristic data corresponding to each frame of an exemplary standard face sub-region monitoring video frame included in the exemplary standard face sub-region monitoring video from the configured exemplary standard face sub-region monitoring video;
loading video frame characteristic data corresponding to each exemplary face subregion monitoring video frame and video frame characteristic data corresponding to each exemplary standard face subregion monitoring video frame into a pre-established user state identification neural network to be updated;
analyzing and identifying an exemplary video frame matching coefficient between each frame of the exemplary facial sub-region surveillance video frame and each frame of an exemplary target video frame in an exemplary target video frame cluster by using the to-be-updated user state recognition neural network, and analyzing and identifying an exemplary standard video frame matching coefficient between each frame of the exemplary standard facial sub-region surveillance video frame and each frame of the exemplary target video frame in the exemplary target video frame cluster, wherein the frames of the exemplary target video frame in the exemplary target video frame cluster comprise the exemplary facial sub-region surveillance video frame and the exemplary standard facial sub-region surveillance video frame;
analyzing and determining an exemplary video matching degree between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video based on an exemplary video frame matching coefficient between each exemplary face sub-region monitoring video frame and each exemplary target video frame respectively and an exemplary standard video frame matching coefficient between each exemplary standard face sub-region monitoring video frame and each exemplary target video frame respectively;
and updating the to-be-updated user state recognition neural network based on the real value of the video matching degree between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video and by combining the exemplary video matching degree, so as to form the user state recognition neural network corresponding to the to-be-updated user state recognition neural network.
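For illustration only, the following is a minimal training sketch of the update step described above. It assumes a PyTorch implementation and a hypothetical UserStateNet stand-in for the user state recognition neural network to be updated; the network maps the per-frame feature data of the exemplary face sub-region frames and the exemplary standard face sub-region frames to a predicted video matching degree, which is then regressed against the real value of the video matching degree.

```python
import torch
import torch.nn as nn

class UserStateNet(nn.Module):
    """Toy stand-in for the user state recognition neural network to be updated."""
    def __init__(self, feat_dim: int = 128, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Linear(feat_dim, hidden)
        self.scorer = nn.Linear(hidden, 1)

    def forward(self, face_feats: torch.Tensor, std_feats: torch.Tensor) -> torch.Tensor:
        # face_feats: (num_face_frames, feat_dim); std_feats: (num_standard_frames, feat_dim)
        cluster = torch.cat([face_feats, std_feats], dim=0)   # exemplary target video frame cluster
        enc = torch.tanh(self.encoder(cluster))
        match = enc @ enc.t()                                 # pairwise matching coefficients (toy version)
        pooled = match.mean()                                 # collapse the coefficients to one value
        return torch.sigmoid(self.scorer(enc.mean(0, keepdim=True)).squeeze() + pooled)

model = UserStateNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(face_feats, std_feats, true_matching_degree):
    """One update of the network, driven by the real value of the video matching degree."""
    optimizer.zero_grad()
    predicted_matching_degree = model(face_feats, std_feats)
    loss = loss_fn(predicted_matching_degree, true_matching_degree)
    loss.backward()
    optimizer.step()
    return loss.item()

# exemplary features: 5 face sub-region frames and 3 standard frames, ground-truth degree 0.8
print(train_step(torch.rand(5, 128), torch.rand(3, 128), torch.tensor(0.8)))
```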
In some preferred embodiments, in the above method for controlling adjustment of smart glasses based on image processing, the step of performing adjustment control on the target smart glasses based on the target user state identification information to implement warning processing for the target user includes:
comparing the attention concentration of the target user reflected by the target user state identification information with a pre-configured attention concentration reference value;
if the attention concentration of the target user reflected by the target user state identification information is less than or equal to the attention concentration reference value, calculating a ratio of the attention concentration reference value to the attention concentration to output a corresponding concentration ratio;
determining a target vibration parameter with positive correlation according to the concentration ratio, and adjusting and controlling the target intelligent glasses according to the target vibration parameter so that the target intelligent glasses vibrate based on the target vibration parameter to prompt the target user, wherein the target vibration parameter at least comprises one of a target vibration amplitude and a target vibration frequency.
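A minimal sketch of this adjustment rule is given below; it assumes the attention concentration and the reference value are plain floating-point numbers, and the base amplitude and frequency are arbitrary illustrative values rather than parameters taken from the patent.

```python
def compute_vibration_parameters(concentration: float, reference: float,
                                 base_amplitude: float = 0.2, base_frequency_hz: float = 2.0):
    """Hypothetical rule: the further the attention concentration falls below the
    pre-configured reference value, the stronger and faster the vibration."""
    if concentration > reference:
        return None                                   # attention is sufficient, no warning needed
    ratio = reference / max(concentration, 1e-6)      # concentration ratio (>= 1)
    return {
        "target_vibration_amplitude": min(1.0, base_amplitude * ratio),  # positively correlated
        "target_vibration_frequency_hz": base_frequency_hz * ratio,      # positively correlated
    }

# reference 0.8, measured concentration 0.4 -> ratio 2.0, i.e. doubled amplitude and frequency
print(compute_vibration_parameters(0.4, 0.8))
```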
The embodiment of the invention also provides an intelligent glasses adjustment control system based on image processing, which comprises a processor and a memory, wherein the memory is used for storing a computer program, and the processor is used for executing the computer program so as to realize the intelligent glasses adjustment control method based on image processing.
According to the method and the system for adjusting and controlling the intelligent glasses based on the image processing, the face area of the target user is subjected to video monitoring processing through the target intelligent glasses, so that the face area monitoring video corresponding to the target user is output. And carrying out user state recognition processing on the face region monitoring video by utilizing a user state recognition neural network formed by pre-training so as to output target user state recognition information corresponding to a target user, wherein the target user state recognition information is used for reflecting the attention concentration of the target user. And adjusting and controlling the target intelligent glasses based on the state identification information of the target user so as to realize the warning processing of the target user. Based on the steps, the face area monitoring video can be identified through the neural network with higher data processing capacity to output corresponding target user state identification information, so that the target user state identification information can reflect the attention concentration of a target user with higher reliability, and therefore, when the target intelligent glasses are adjusted and controlled based on the target user state identification information, the target intelligent glasses can have higher reliability, namely, the reliability of adjusting and controlling the intelligent glasses is improved to a certain extent, and the defects in the prior art are overcome.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
Fig. 1 is a block diagram of a smart glasses adjustment control system based on image processing according to an embodiment of the present invention.
Fig. 2 is a flowchart illustrating steps included in an image processing-based smart glasses adjustment control method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of modules included in an image processing-based smart glasses adjustment control apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, an embodiment of the present invention provides an intelligent glasses adjustment control system based on image processing. Wherein the image processing based smart eyewear adjustment control system may include a memory and a processor.
Illustratively, in some possible embodiments, the memory and the processor are electrically connected, directly or indirectly, to enable transmission or interaction of data. For example, they may be electrically connected to each other via one or more communication buses or signal lines. The memory can have at least one software functional module (computer program) stored therein, which can be in the form of software or firmware. The processor may be configured to execute the executable computer program stored in the memory, so as to implement the method for controlling smart glasses adjustment based on image processing according to the embodiment of the present invention.
By way of example, in some possible embodiments, the Memory may be, but is not limited to, a Random Access Memory (RAM), a Read Only Memory (ROM), a Programmable Read Only Memory (PROM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), and the like. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), a System on Chip (SoC), and the like; but may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
For example, in some possible embodiments, the structure shown in fig. 1 is only an illustration, and the smart glasses adjustment control system based on image processing may further include more or less components than those shown in fig. 1, or have a different configuration from that shown in fig. 1, for example, may include a communication unit for information interaction with other devices (e.g., user terminal devices such as smart glasses, mobile phones, etc.).
For example, in some possible embodiments, the image processing-based smart eyewear adjustment control system may be a server with data processing capabilities.
With reference to fig. 2, an embodiment of the present invention further provides an image processing-based smart glasses adjustment control method, which is applicable to the image processing-based smart glasses adjustment control system. The method steps defined by the flow related to the intelligent glasses adjustment control method based on image processing can be realized by the intelligent glasses adjustment control system based on image processing.
The specific process shown in FIG. 2 will be described in detail below.
Step S110, carrying out video monitoring processing on the face area of the target user through the target intelligent glasses so as to output a face area monitoring video corresponding to the target user.
In the embodiment of the invention, the intelligent glasses adjustment control system based on image processing can perform video monitoring processing on the face area of the target user through the target intelligent glasses so as to output the face area monitoring video corresponding to the target user. The face region monitoring video comprises a plurality of face region monitoring video frames, and the face region monitoring video frames at least comprise information of eyes.
And step S120, carrying out user state identification processing on the face region monitoring video by using a user state identification neural network formed by pre-training so as to output target user state identification information corresponding to the target user.
In the embodiment of the invention, the intelligent glasses adjustment control system based on image processing can utilize a user state recognition neural network formed by pre-training to perform user state recognition processing on the face region monitoring video so as to output target user state recognition information corresponding to the target user. The target user state identification information is used for reflecting the attention concentration degree of the target user.
And step S130, adjusting and controlling the target intelligent glasses based on the state identification information of the target user so as to realize the warning processing of the target user.
In the embodiment of the present invention, the image processing-based smart glasses adjustment control system may perform adjustment control on the target smart glasses based on the target user state identification information, so as to implement warning processing on the target user.
Based on the foregoing steps, the facial region surveillance video may be identified by the neural network with higher data processing capability to output corresponding target user state identification information, so that the target user state identification information may reflect the concentration of the attention of the target user with higher reliability, and thus, when the target smart glasses are adjusted and controlled based on the target user state identification information, the target smart glasses may have higher reliability, that is, the reliability of adjusting and controlling the smart glasses is improved to a certain extent, and the defects in the prior art (that is, the reliability of adjusting and controlling the smart glasses is not high) are overcome.
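For illustration, a high-level sketch of steps S110 to S130 follows. The SmartGlasses class and the recognize_user_state function are hypothetical stand-ins for the device interface and the pre-trained user state recognition neural network, and the vibration rule mirrors the positively correlated vibration parameters described earlier.

```python
class SmartGlasses:
    """Hypothetical stand-in for the target smart glasses."""
    def capture_face_region_video(self):
        # placeholder face region monitoring video (step S110 output)
        return ["face_region_frame_%d" % i for i in range(5)]

    def vibrate(self, amplitude: float, frequency_hz: float):
        print(f"warning vibration: amplitude={amplitude:.2f}, frequency={frequency_hz:.1f} Hz")

def recognize_user_state(face_region_video) -> float:
    """Stand-in for the pre-trained user state recognition neural network (step S120)."""
    return 0.4  # pretend the inferred attention concentration is low

def run_adjustment_control(glasses: SmartGlasses, reference: float = 0.8):
    face_video = glasses.capture_face_region_video()        # step S110
    concentration = recognize_user_state(face_video)        # step S120
    if concentration <= reference:                           # step S130: warn the user
        ratio = reference / max(concentration, 1e-6)         # concentration ratio
        glasses.vibrate(amplitude=min(1.0, 0.2 * ratio), frequency_hz=2.0 * ratio)

run_adjustment_control(SmartGlasses())
```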
For example, in some possible embodiments, in order to implement the step of performing video monitoring processing on a face area of a target user through target smart glasses to output a face area monitoring video corresponding to the target user, the following may be performed in a specific execution process:
judging whether an intelligent glasses adjusting control instruction sent by target intelligent glasses or target user terminal equipment (corresponding to a target user) corresponding to the target intelligent glasses is received or not;
if an intelligent glasses adjusting control instruction sent by the target intelligent glasses or a target user terminal device corresponding to the target intelligent glasses is received, sending a video monitoring instruction to the target intelligent glasses to control the target intelligent glasses to perform video monitoring processing on a face area of a target user, so as to output a face area monitoring video corresponding to the target user.
For example, in some possible embodiments, in order to implement the step "performing video monitoring processing on a face area of a target user through target smart glasses to output a face area monitoring video corresponding to the target user", in a specific execution process, the following may be further performed:
performing video frame similarity calculation processing on every two adjacent face area monitoring video frames in the face area monitoring video corresponding to the target user to output video frame similarity between every two adjacent face area monitoring video frames; and, based on the video frame similarity, screening the face region surveillance video to remove duplicate or highly similar video frames, so as to output a final face region surveillance video (the final face region surveillance video is used for the subsequent steps, such as step S120).
In some possible embodiments, the step of performing a video frame similarity calculation process on every two adjacent face area monitoring video frames in the face area monitoring video corresponding to the target user to output a video frame similarity between every two adjacent face area monitoring video frames may further include the following (calculating only for any two adjacent frames):
performing face feature point recognition processing on the two adjacent frames of face region surveillance video frames to respectively output a corresponding first face feature point set and a corresponding second face feature point set, wherein the first face feature point set comprises a plurality of first face feature points from one of the two adjacent frames of face region surveillance video frames, and the second face feature point set comprises a plurality of second face feature points from the other one of the two adjacent frames of face region surveillance video frames;
for each first face feature point in the first face feature point set, determining a feature point straight line passing through the first face feature point and an adjacent first face feature point closest to the first face feature point according to the first face feature point and the adjacent first face feature point, and segmenting the corresponding face region surveillance video frame according to the feature point straight line corresponding to each first face feature point in the first face feature point set to output a plurality of corresponding first face region surveillance video frame blocks;
for each second face feature point in the second face feature point set, determining a feature point straight line passing through the second face feature point and an adjacent second face feature point according to the second face feature point and the adjacent second face feature point closest to the second face feature point, and segmenting the belonged face region monitoring video frame according to the feature point straight line corresponding to each second face feature point in the second face feature point set so as to output a plurality of corresponding second face region monitoring video frame blocks;
respectively calculating a first dimension similarity coefficient and a second dimension similarity coefficient between each first face region surveillance video frame block in the plurality of first face region surveillance video frame blocks and each second face region surveillance video frame block in the plurality of second face region surveillance video frame blocks, wherein the first dimension similarity coefficient is used for reflecting the shape similarity (such as the similarity of region outlines) between the corresponding first face region surveillance video frame block and the corresponding second face region surveillance video frame block, and the second dimension similarity coefficient is used for reflecting the pixel similarity between the corresponding first face region surveillance video frame block and the corresponding second face region surveillance video frame block (for example, the pixel values of the pixel points are read into sequences along a preset path, such as first from left to right and then from top to bottom, and the sequence similarity between the obtained pixel value sequences is then calculated);
performing fusion calculation (such as mean calculation or weighted mean calculation) on the corresponding first dimension similarity coefficient and second dimension similarity coefficient to output a fusion dimension similarity coefficient between the corresponding first face region surveillance video frame block and second face region surveillance video frame block;
associating the plurality of first face region surveillance video frame blocks and the plurality of second face region surveillance video frame blocks according to the corresponding fusion dimension similarity coefficients, such that a mean value of the fusion dimension similarity coefficients between the associated first face region surveillance video frame blocks and second face region surveillance video frame blocks is maximized, wherein if one first face region surveillance video frame block is associated with a plurality of second face region surveillance video frame blocks, each of the plurality of second face region surveillance video frame blocks is associated with only the first face region surveillance video frame block, whereas if one second face region surveillance video frame block is associated with a plurality of first face region surveillance video frame blocks, each of the plurality of first face region surveillance video frame blocks is associated with only the second face region surveillance video frame block (i.e., the association relationship may be one-to-one or one-to-many, but not many-to-many);
for each first face region monitoring video frame block, calculating an area ratio between the area of the first face region monitoring video frame block and a reference area corresponding to the first face region monitoring video frame block, and determining a first fusion weight coefficient corresponding to the first face region monitoring video frame block according to the area ratio, wherein the first fusion weight coefficient is negatively related to the area ratio, and the reference area corresponding to the first face region monitoring video frame block is used for reflecting the area of the minimum circumscribed circle of the first face region monitoring video frame block;
for each second face region monitoring video frame block, calculating an area ratio between the area of the second face region monitoring video frame block and a reference area corresponding to the second face region monitoring video frame block, and determining a second fusion weight coefficient corresponding to the second face region monitoring video frame block according to the area ratio, wherein the second fusion weight coefficient is negatively related to the area ratio, and the reference area corresponding to the second face region monitoring video frame block is used for reflecting the area of the minimum circumscribed circle of the second face region monitoring video frame block;
and performing fusion processing on a fusion dimension similarity coefficient between the first face region surveillance video frame block and the second face region surveillance video frame block associated with each group according to the corresponding first fusion weight coefficient and the second fusion weight coefficient (for example, an average value of the first fusion weight coefficient and the second fusion weight coefficient may be used as a weight value, and performing weighted summation calculation on the fusion dimension similarity coefficient) to output the video frame similarity between the two adjacent face region surveillance video frames.
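The sketch below illustrates, under simplifying assumptions, how the per-block similarity coefficients might be fused into one video frame similarity: a crude sequence similarity stands in for the first-dimension (shape) and second-dimension (pixel) coefficients, a greedy best-match association replaces the mean-maximizing association described above, and 1 minus the area ratio is used as one possible negatively correlated fusion weight; normalizing by the total weight is also an added assumption.

```python
import numpy as np

def fused_frame_similarity(first_blocks, second_blocks):
    """Sketch of the adjacent-frame similarity calculation described above.

    Each block is a dict with 'pixels' (pixel values read along a preset path),
    'contour' (a 1-D contour descriptor), 'area' and 'circle_area' (area of the
    minimum circumscribed circle). All similarity measures are simple stand-ins."""

    def seq_sim(a, b):
        # crude similarity of two 1-D sequences, truncated to the common length
        n = min(len(a), len(b))
        a, b = np.asarray(a[:n], dtype=float), np.asarray(b[:n], dtype=float)
        return 1.0 / (1.0 + float(np.mean(np.abs(a - b))))

    fused = np.zeros((len(first_blocks), len(second_blocks)))
    for i, fb in enumerate(first_blocks):
        for j, sb in enumerate(second_blocks):
            shape_sim = seq_sim(fb["contour"], sb["contour"])  # first dimension similarity
            pixel_sim = seq_sim(fb["pixels"], sb["pixels"])    # second dimension similarity
            fused[i, j] = 0.5 * (shape_sim + pixel_sim)        # mean fusion

    # greedy association (simplification of the mean-maximizing association above)
    total, weight_sum = 0.0, 0.0
    for i, fb in enumerate(first_blocks):
        j = int(np.argmax(fused[i]))
        sb = second_blocks[j]
        # fusion weight negatively correlated with area / circumscribed-circle area
        w = 0.5 * ((1.0 - fb["area"] / fb["circle_area"]) + (1.0 - sb["area"] / sb["circle_area"]))
        total += w * fused[i, j]
        weight_sum += w
    return total / max(weight_sum, 1e-6)  # video frame similarity of the two adjacent frames

# toy example: identical blocks give a similarity of 1.0
toy = {"pixels": list(range(10)), "contour": [3.0, 4.0, 5.0], "area": 8.0, "circle_area": 12.0}
print(fused_frame_similarity([dict(toy), dict(toy)], [dict(toy)]))
```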
For example, in some possible embodiments, in order to implement the step "using a pre-trained and formed user state recognition neural network to perform user state recognition processing on the face region surveillance video to output target user state recognition information corresponding to the target user", the following may be performed in a specific execution process:
performing video frame region clipping processing on each face region surveillance video frame in a plurality of face region surveillance video frames included in the face region surveillance video (for example, an eye region in the face region surveillance video frame may be recognized first, and then the eye region is cropped out, where the technology for recognizing the eye region may refer to the related prior art) to output a face sub-region surveillance video frame corresponding to each face region surveillance video frame, where the face sub-region surveillance video frame is a sub-region where eyes are located in the face region surveillance video frame;
sequencing according to the face subregion monitoring video frame corresponding to each frame of the face region monitoring video frame and the video frame time sequence of the face region monitoring video frame to form a face subregion monitoring video corresponding to the face region monitoring video;
utilizing a user state recognition neural network formed by pre-training to perform recognition matching processing on the face sub-region monitoring video and each frame of standard face sub-region monitoring video in a plurality of pre-configured standard face sub-region monitoring videos so as to output the video matching degree between the face sub-region monitoring video and each standard face sub-region monitoring video;
according to the video matching degree between the face sub-region monitoring video and each standard face sub-region monitoring video, merging user state standard information (for example, the user state standard information may be pre-configured to reflect a corresponding standard user concentration degree, and a user corresponding to the standard face sub-region monitoring video may be the target user, or may be another user other than the target user) corresponding to each standard face sub-region monitoring video (for example, the merging may be performed by taking the video matching degree as a weighting coefficient to perform weighted summation calculation on the user state standard information to obtain target user state identification information), so as to output the target user state identification information corresponding to the target user.
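A minimal sketch of the weighted-summation fusion mentioned above is given below; it assumes each standard face sub-region monitoring video carries a single standard concentration value, and normalizing by the sum of the matching degrees is an added assumption.

```python
def fuse_user_state(matching_degrees, standard_concentrations):
    """Weighted-summation fusion: each pre-configured standard face sub-region
    monitoring video contributes its standard concentration value, weighted by
    its video matching degree (normalization is an added assumption)."""
    total_weight = sum(matching_degrees)
    if total_weight == 0:
        return 0.0
    weighted = sum(m * c for m, c in zip(matching_degrees, standard_concentrations))
    return weighted / total_weight  # target user attention concentration

# three standard videos with concentrations 0.9, 0.5, 0.2 and matching degrees 0.7, 0.2, 0.1
print(fuse_user_state([0.7, 0.2, 0.1], [0.9, 0.5, 0.2]))  # -> 0.75
```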
For example, in some possible embodiments, in order to implement the step of performing "using a pre-trained user state recognition neural network to perform recognition matching processing on the face sub-region monitoring video and each standard face sub-region monitoring video in a plurality of standard face sub-region monitoring videos configured in advance to output a video matching degree between the face sub-region monitoring video and each standard face sub-region monitoring video", the following may be performed in a specific execution process:
extracting video frame characteristic data corresponding to each frame of face sub-region monitoring video frame included in the face sub-region monitoring video from the face sub-region monitoring video, and extracting video frame characteristic data corresponding to each frame of standard face sub-region monitoring video frame included in the standard face sub-region monitoring video from the standard face sub-region monitoring video;
loading video frame characteristic data corresponding to each frame of the face sub-region monitoring video frame and video frame characteristic data corresponding to each frame of the standard face sub-region monitoring video frame into a user state recognition neural network formed by pre-training, so as to analyze and recognize video frame matching coefficients between each frame of the face sub-region monitoring video frame and each frame of target video frame in a target video frame cluster by using the user state recognition neural network, and analyze and recognize standard video frame matching coefficients between each frame of the standard face sub-region monitoring video frame and each frame of target video frame in the target video frame cluster, wherein the multi-frame target video frames in the target video frame cluster comprise the face sub-region monitoring video frame and the standard face sub-region monitoring video frame;
and analyzing and determining the video matching degree between the face sub-region monitoring video and the standard face sub-region monitoring video based on the video frame matching coefficient between each frame of the face sub-region monitoring video frame and each frame of the target video frame and the standard video frame matching coefficient between each frame of the standard face sub-region monitoring video frame and each frame of the target video frame.
For example, in some possible embodiments, in order to implement the steps of extracting, from the face sub-region monitoring video, video frame feature data corresponding to each frame of face sub-region monitoring video frame included in the face sub-region monitoring video, and extracting, from the standard face sub-region monitoring video, video frame feature data corresponding to each frame of standard face sub-region monitoring video frame included in the standard face sub-region monitoring video, "the following may be performed in a specific execution process:
extracting video frame pixel distribution data corresponding to each frame of the face subregion monitoring video frame from the face subregion monitoring video, and extracting video frame pixel distribution data corresponding to each frame of the standard face subregion monitoring video frame from the standard face subregion monitoring video;
respectively extracting video frame time sequence data corresponding to each frame of the face sub-region monitoring video from the face sub-region monitoring video, and extracting video frame time sequence data corresponding to each frame of the standard face sub-region monitoring video from the standard face sub-region monitoring video;
and constructing and forming video frame characteristic data corresponding to each frame of the face subregion surveillance video frame based on the video frame pixel distribution data and the video frame time sequence data corresponding to each frame of the face subregion surveillance video frame, and constructing and forming video frame characteristic data corresponding to each frame of the standard face subregion surveillance video frame based on the video frame pixel distribution data and the video frame time sequence data corresponding to each frame of the standard face subregion surveillance video frame.
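As one possible concretization of this feature construction, the sketch below uses a grayscale histogram as the video frame pixel distribution data and the frame timestamp as the video frame time sequence data; both choices are illustrative assumptions rather than requirements of the method.

```python
import numpy as np

def build_frame_features(frames, timestamps, bins: int = 16):
    """Build per-frame feature vectors by concatenating pixel distribution data
    (here, a grayscale histogram) with time sequence data (here, the timestamp)."""
    features = []
    for frame, t in zip(frames, timestamps):
        hist, _ = np.histogram(frame, bins=bins, range=(0, 255), density=True)
        features.append(np.concatenate([hist, [t]]))   # pixel distribution + timing
    return np.stack(features)

# example with two random 64x64 grayscale eye-region frames taken at t=0.0 s and t=0.04 s
frames = [np.random.randint(0, 256, (64, 64)) for _ in range(2)]
print(build_frame_features(frames, [0.0, 0.04]).shape)   # (2, 17)
```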
For example, in some possible embodiments, in order to implement the steps of loading the video frame feature data corresponding to each frame of the face subregion surveillance video frame and the video frame feature data corresponding to each frame of the standard face subregion surveillance video frame into a user state recognition neural network formed by pre-training so as to utilize the user state recognition neural network to analyze and recognize the video frame matching coefficients between each frame of the face subregion surveillance video frame and each frame of the target video frame in the target video frame cluster, and analyze and recognize the standard video frame matching coefficients between each frame of the standard face subregion surveillance video frame and each frame of the target video frame in the target video frame cluster, the following may be performed in a specific execution process:
loading video frame characteristic data corresponding to each frame of the face subregion monitoring video frame and video frame characteristic data corresponding to each frame of the standard face subregion monitoring video frame into a user state recognition neural network formed by pre-training; analyzing and outputting a video frame time sequence data difference value between each face sub-region monitoring video frame and each frame target video frame in a target video frame cluster and analyzing and outputting a video frame time sequence data difference value between each standard face sub-region monitoring video frame and each frame target video frame in the target video frame cluster on the basis of video frame time sequence data corresponding to the face sub-region monitoring video frame and video frame time sequence data corresponding to the standard face sub-region monitoring video frame through the user state recognition neural network; analyzing and outputting the video frame matching coefficients of the face subregion surveillance video frame and each frame of target video frame, and analyzing and outputting the standard video frame matching coefficients of the standard face subregion surveillance video frame and each frame of target video frame based on the video frame pixel distribution data corresponding to the face subregion surveillance video frame, the video frame time sequence data difference between the face subregion surveillance video frame and each frame of target video frame, the video frame pixel distribution data corresponding to the standard face subregion surveillance video frame and the video frame time sequence data difference between the standard face subregion surveillance video frame and each frame of target video frame.
It should be noted that the matching value between the face subregion surveillance video frame and each frame target video frame in the target video frame cluster and the matching value between the standard subregion surveillance video frame and each frame target video frame in the target video frame cluster can be generated according to the video frame pixel distribution data of the face subregion surveillance video frame in the video frame characteristic data of the face subregion surveillance video frame, the video frame pixel distribution data of the standard face subregion surveillance video frame in the video frame characteristic data of the standard face subregion surveillance video frame, the video frame timing data difference between the face subregion surveillance video frame and each frame target video frame in the target video frame cluster, and the video frame timing data difference between the standard face subregion surveillance video frame and each frame target video frame in the target video frame cluster. The matching value between the face subregion surveillance video frame and each frame target video frame in the target video frame cluster can be used as a video frame matching coefficient between the face subregion surveillance video frame and each frame target video frame in the target video frame cluster. Similarly, a matching value may be corresponding between a standard face sub-region surveillance video frame and a frame of target video frame in the target video frame cluster, and the matching value represents a matching degree or a similarity degree between the standard face sub-region surveillance video frame and a corresponding target video frame in the target video frame cluster, so that the matching value between the standard face sub-region surveillance video frame and each frame of target video frame in the target video frame cluster may be used as a standard video frame matching coefficient between the standard face sub-region surveillance video frame and each frame of target video frame in the target video frame cluster.
For example, the video frame matching coefficient and the standard video frame matching coefficient may exist in the form of a queue, and a matching value queue may be generated according to video frame pixel distribution data of the face sub-region surveillance video frame, video frame pixel distribution data of the standard face sub-region surveillance video frame, a video frame timing data difference between the face sub-region surveillance video frame and each frame target video frame in the target video frame cluster, and a video frame timing data difference between the standard face sub-region surveillance video frame and each frame target video frame in the target video frame cluster, where the matching value queue includes a video frame matching coefficient between the face sub-region surveillance video frame and each frame target video frame in the target video frame cluster, and a standard video frame matching coefficient between the standard face sub-region surveillance video frame and each frame target video frame in the target video frame cluster.
Illustratively, suppose there are a face sub-region monitoring video frame 1, a face sub-region monitoring video frame 2, a face sub-region monitoring video frame 3, a face sub-region monitoring video frame 4, and a face sub-region monitoring video frame 5, i.e., 5 frames of face sub-region monitoring video frames, and a standard face sub-region monitoring video frame 1, a standard face sub-region monitoring video frame 2, and a standard face sub-region monitoring video frame 3, i.e., 3 frames of standard face sub-region monitoring video frames. Therefore, the multiple frames of target video frames in the target video frame cluster include the face sub-region monitoring video frame 1, the face sub-region monitoring video frame 2, the face sub-region monitoring video frame 3, the face sub-region monitoring video frame 4, the face sub-region monitoring video frame 5, the standard face sub-region monitoring video frame 1, the standard face sub-region monitoring video frame 2, and the standard face sub-region monitoring video frame 3. Based on this, the matching value queue output by the user state recognition neural network may include 8 rows and 8 columns, and one row in the matching value queue may be regarded as a matching coefficient distribution feature, so, in the matching value queue, the first 5 rows may be matching coefficient distribution features respectively corresponding to each frame of face sub-region surveillance video frame, and the matching coefficient distribution feature corresponding to one frame of face sub-region surveillance video frame includes a video frame matching coefficient between the face sub-region surveillance video frame and each frame of target video frame in the target video frame cluster; the next 3 rows may be matching coefficient distribution features corresponding to each frame of standard face subregion surveillance video frame, where the matching coefficient distribution features corresponding to one frame of standard face subregion surveillance video frame include the standard video frame matching coefficients between the standard face subregion surveillance video frame and each frame of target video frame in the target video frame cluster.
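The sketch below reproduces the 8 x 8 matching value queue of this example, under the assumption that a simple cosine similarity stands in for the matching values actually produced by the user state recognition neural network; the first 5 rows correspond to the face sub-region monitoring video frames and the last 3 rows to the standard face sub-region monitoring video frames.

```python
import numpy as np

def matching_value_queue(face_feats, std_feats):
    """Build a toy matching value queue for the target video frame cluster
    (face sub-region frames stacked above standard face sub-region frames)."""
    cluster = np.vstack([face_feats, std_feats])              # target video frame cluster
    norm = cluster / np.linalg.norm(cluster, axis=1, keepdims=True)
    queue = norm @ norm.T                                     # one row per cluster member
    face_rows = queue[: len(face_feats)]       # video frame matching coefficients
    std_rows = queue[len(face_feats):]         # standard video frame matching coefficients
    return queue, face_rows, std_rows

face_feats = np.random.rand(5, 17)   # e.g. features from the histogram sketch above
std_feats = np.random.rand(3, 17)
queue, _, _ = matching_value_queue(face_feats, std_feats)
print(queue.shape)   # (8, 8)
```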
For example, in some possible embodiments, in order to implement the steps of loading the video frame feature data corresponding to each frame of the face sub-region surveillance video frame and the video frame feature data corresponding to each frame of the standard face sub-region surveillance video frame into a user state recognition neural network formed by pre-training, so as to analyze and recognize the video frame matching coefficients between each frame of the face sub-region surveillance video frame and each frame of target video frame in the target video frame cluster by using the user state recognition neural network, and analyze and recognize the standard video frame matching coefficients between each frame of the standard face sub-region surveillance video frame and each frame of target video frame in the target video frame cluster, the following may also be performed in a specific execution process:
loading the video frame feature data corresponding to each frame of the face sub-region surveillance video frame and the video frame feature data corresponding to each frame of the standard face sub-region surveillance video frame into a video frame feature recognition mining model included in the user state recognition neural network; and analyzing and recognizing, by using a video frame feature weight analysis sub-model included in the video frame feature recognition mining model (for example, the video frame feature weight analysis sub-model may be an attention-aware network that analyzes and determines the matching coefficients), the video frame matching coefficient between each frame of the face sub-region surveillance video frame and each frame of target video frame, and the standard video frame matching coefficient between each frame of the standard face sub-region surveillance video frame and each frame of target video frame.
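As an illustration of how such a video frame feature weight analysis sub-model could derive matching coefficients, the following sketch treats it as a plain scaled dot-product attention layer over per-frame feature vectors. The layer choice, the feature shape, and the row normalisation are assumptions introduced only for illustration; the patent states only that an attention-aware network analyzes and determines the coefficients.

```python
import numpy as np

def attention_matching_coefficients(frame_features, d_k=None):
    """Minimal sketch of the video frame feature weight analysis sub-model,
    treated here as scaled dot-product attention. `frame_features` is an (N, D)
    matrix with one feature row per frame in the target video frame cluster
    (face sub-region frames followed by standard frames). The returned (N, N)
    matrix plays the role of the video frame / standard video frame matching
    coefficients. Using the raw features as both queries and keys is an
    assumption, not the patent's specified architecture.
    """
    d_k = d_k or frame_features.shape[1]
    scores = frame_features @ frame_features.T / np.sqrt(d_k)   # pairwise attention scores
    scores -= scores.max(axis=1, keepdims=True)                 # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)         # row-normalised coefficients
```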
For example, in some possible embodiments, in order to implement the step of analyzing and determining the video matching degree between the face sub-region surveillance video and the standard face sub-region surveillance video based on the video frame matching coefficients between each frame of the face sub-region surveillance video frame and each frame of target video frame and the standard video frame matching coefficients between each frame of the standard face sub-region surveillance video frame and each frame of target video frame, the following may be performed:
performing data compression and screening processing on the video frame matching coefficients between each frame of the face sub-region surveillance video frame and each frame of target video frame, and on the standard video frame matching coefficients between each frame of the standard face sub-region surveillance video frame and each frame of target video frame, so as to output a corresponding video matching data feature distribution; and analyzing and determining, based on the video matching data feature distribution, the video matching degree between the face sub-region surveillance video and the standard face sub-region surveillance video (for example, the user state recognition neural network may perform pooling processing on the video frame matching coefficients corresponding to the face sub-region surveillance video frames and the standard video frame matching coefficients corresponding to the standard face sub-region surveillance video frames through a pooling network, and then process the output of the pooling network through a linear network to output the corresponding video matching data feature distribution; the video matching data feature distribution serves as the basis on which the user state recognition neural network analyzes the video matching degree between the face sub-region surveillance video and the standard face sub-region surveillance video, so that the corresponding video matching degree can be analyzed and determined, for example, by means of a softmax function).
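A minimal sketch of the pooling, linear, and softmax stages described above might look as follows; mean pooling and a randomly initialised projection stand in for the trained pooling network and linear network, which the patent does not specify further, so this is an assumption-laden illustration rather than the method itself.

```python
import numpy as np

def video_matching_degree(matching_queue, proj=None):
    """Hedged sketch of the 'data compression and screening' step: pool the
    matching-coefficient queue, pass the pooled vector through a linear layer,
    and apply softmax to obtain a video matching degree between the face
    sub-region surveillance video and the standard face sub-region surveillance
    video. `matching_queue` is the (N, N) matrix of matching coefficients.
    """
    pooled = matching_queue.mean(axis=0)                 # pooling network (assumed mean-pool)
    if proj is None:
        rng = np.random.default_rng(0)
        proj = rng.standard_normal((pooled.size, 2))     # linear network -> 2 logits (match / no match)
    logits = pooled @ proj
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                 # softmax
    return float(probs[1])                               # probability taken as the video matching degree
```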
In some possible embodiments, the intelligent glasses adjustment control method based on image processing further includes a step of training the formed user state recognition neural network, and in a specific implementation process, the following steps can be executed:
extracting video frame characteristic data corresponding to each frame of exemplary face sub-region monitoring video frame included in the exemplary face sub-region monitoring video from the configured exemplary face sub-region monitoring video, and extracting video frame characteristic data corresponding to each frame of exemplary standard face sub-region monitoring video frame included in the exemplary standard face sub-region monitoring video from the configured exemplary standard face sub-region monitoring video;
loading video frame characteristic data corresponding to each exemplary face subregion monitoring video frame and video frame characteristic data corresponding to each exemplary standard face subregion monitoring video frame into a pre-established user state identification neural network to be updated;
analyzing and identifying an exemplary video frame matching coefficient between each frame of the exemplary face sub-region surveillance video frame and each frame of exemplary target video frame in an exemplary target video frame cluster by using the to-be-updated user state recognition neural network, and analyzing and identifying an exemplary standard video frame matching coefficient between each frame of the exemplary standard face sub-region surveillance video frame and each frame of exemplary target video frame in the exemplary target video frame cluster, wherein the multiple frames of exemplary target video frames in the exemplary target video frame cluster comprise the exemplary face sub-region surveillance video frames and the exemplary standard face sub-region surveillance video frames;
analyzing and determining an exemplary video matching degree between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video based on an exemplary video frame matching coefficient between each exemplary face sub-region monitoring video frame and each exemplary target video frame respectively and an exemplary standard video frame matching coefficient between each exemplary standard face sub-region monitoring video frame and each exemplary target video frame respectively;
and updating the to-be-updated user state recognition neural network based on a real value of the video matching degree (which can be formed by labeling) between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video and by combining the exemplary video matching degree, so as to form the user state recognition neural network corresponding to the to-be-updated user state recognition neural network.
For example, in some possible embodiments, in order to implement the step "update the to-be-updated user state recognition neural network based on the video matching degree truth value between the exemplary face sub-region surveillance video and the exemplary standard face sub-region surveillance video, and in combination with the exemplary video matching degree, so as to form a user state recognition neural network corresponding to the to-be-updated user state recognition neural network", the following may be performed in a specific execution process:
updating the network weight of the to-be-updated user state recognition neural network based on the real value of the video matching degree between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video and in combination with the exemplary video matching degree; analyzing and identifying an updated video matching degree between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video by using the to-be-updated user state recognition neural network whose network weight has been updated; and, in a case where the matching degree difference between the updated video matching degree and the real value of the video matching degree does not exceed a matching degree difference reference value (that is, a convergence condition is reached), marking the to-be-updated user state recognition neural network whose network weight has been updated as the user state recognition neural network corresponding to the to-be-updated user state recognition neural network.
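Putting the training steps above together, a hedged training-loop sketch could look as follows. The `predict`/`step` interface of the to-be-updated network, the error-driven update signal, and the concrete tolerance are assumptions introduced only to illustrate the loop and the convergence check against the matching degree difference reference value.

```python
def train_user_state_network(network, samples, lr=1e-3, tolerance=0.05, max_epochs=100):
    """Illustrative training loop for the to-be-updated user state recognition
    neural network. `network` is assumed to expose predict(face_feats, std_feats)
    returning an exemplary video matching degree, and step(error, lr) applying a
    weight update; `samples` is a list of (face_feats, std_feats, true_degree)
    tuples, where `true_degree` is the labelled video matching degree truth value.
    Both the interface and the update rule are assumptions for illustration only.
    """
    for _ in range(max_epochs):
        worst_gap = 0.0
        for face_feats, std_feats, true_degree in samples:
            predicted = network.predict(face_feats, std_feats)
            error = predicted - true_degree
            network.step(error, lr)                       # update the network weight
            # re-evaluate with the updated weights and track the matching degree difference
            updated = network.predict(face_feats, std_feats)
            worst_gap = max(worst_gap, abs(updated - true_degree))
        if worst_gap <= tolerance:                        # matching degree difference reference value
            break
    return network                                        # the formed user state recognition neural network
```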
For example, in some possible embodiments, in order to implement the step of "performing adjustment control on the target smart glasses to implement the alert processing for the target user based on the target user state identification information", the following may be performed in a specific implementation:
comparing the attention concentration of the target user reflected by the target user state identification information with a preset attention concentration reference value (the attention concentration reference value can be configured according to actual application requirements and is not specifically limited);
if the attention concentration of the target user reflected by the target user state identification information is less than or equal to the attention concentration reference value, calculating a ratio of the attention concentration reference value to the attention concentration to output a corresponding concentration ratio;
determining a target vibration parameter that is positively correlated with the concentration ratio, and adjusting and controlling the target intelligent glasses according to the target vibration parameter, so that the target intelligent glasses vibrate based on the target vibration parameter to prompt the target user, wherein the target vibration parameter at least comprises one of a target vibration amplitude and a target vibration frequency.
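The three alerting steps above can be illustrated with the following sketch. The reference value, the base amplitude and frequency, and the linear scaling with the concentration ratio are placeholder assumptions; the method only requires the target vibration parameter to be positively correlated with the concentration ratio.

```python
def adjust_smart_glasses(attention_concentration, reference_value=0.6,
                         base_amplitude=0.2, base_frequency=2.0):
    """Sketch of the alert step: compare the recognised attention concentration
    with the preset reference value and, if it is too low, derive a positively
    correlated target vibration parameter from the concentration ratio.
    All numeric values here are illustrative assumptions.
    """
    if attention_concentration > reference_value:
        return None                                       # attention is sufficient, no vibration prompt
    ratio = reference_value / max(attention_concentration, 1e-6)   # concentration ratio
    return {
        'target_vibration_amplitude': base_amplitude * ratio,      # grows as attention drops
        'target_vibration_frequency': base_frequency * ratio,
    }

# Example: attention 0.3 against reference 0.6 -> ratio 2.0 -> stronger vibration prompt
print(adjust_smart_glasses(0.3))
```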
With reference to fig. 3, an embodiment of the present invention further provides an intelligent glasses adjustment control device based on image processing, which is applicable to the above intelligent glasses adjustment control system based on image processing. The intelligent glasses adjustment control device based on image processing may include:
the first software function module is used for performing video monitoring processing on a face area of a target user through target intelligent glasses so as to output a face area monitoring video corresponding to the target user, wherein the face area monitoring video comprises a plurality of face area monitoring video frames, and the face area monitoring video frames at least comprise eye information;
the second software function module is used for carrying out user state recognition processing on the face region monitoring video by utilizing a user state recognition neural network formed by pre-training so as to output target user state recognition information corresponding to the target user, wherein the target user state recognition information is used for reflecting the attention concentration of the target user;
and the third software function module is used for adjusting and controlling the target intelligent glasses based on the state identification information of the target user so as to realize the warning processing of the target user.
In summary, according to the intelligent glasses adjustment control method and system based on image processing provided by the present invention, video monitoring processing is performed on the face area of the target user through the target intelligent glasses to output the face area monitoring video corresponding to the target user. User state recognition processing is then performed on the face area monitoring video by using the user state recognition neural network formed by pre-training to output the target user state identification information corresponding to the target user, the target user state identification information being used for reflecting the attention concentration of the target user. Finally, the target intelligent glasses are adjusted and controlled based on the target user state identification information so as to realize the warning processing of the target user. Based on the above steps, the face area monitoring video can be recognized by a neural network with strong data processing capability to output the corresponding target user state identification information, so that the target user state identification information can reflect the attention concentration of the target user with high reliability. Therefore, when the target intelligent glasses are adjusted and controlled based on the target user state identification information, the adjustment and control also have high reliability; that is, the reliability of adjusting and controlling the intelligent glasses is improved to a certain extent, thereby overcoming the defects in the prior art.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (9)

1. An intelligent glasses adjusting control method based on image processing is characterized by comprising the following steps:
performing video monitoring processing on a face area of a target user through target intelligent glasses to output a face area monitoring video corresponding to the target user, wherein the face area monitoring video comprises a plurality of face area monitoring video frames, and the face area monitoring video frames at least comprise eye information;
performing user state recognition processing on the face region monitoring video by using a user state recognition neural network formed by pre-training so as to output target user state recognition information corresponding to the target user, wherein the target user state recognition information is used for reflecting the attention concentration of the target user;
comparing the attention concentration of the target user reflected by the target user state identification information with a preset attention concentration reference value;
if the attention concentration of the target user reflected by the target user state identification information is less than or equal to the attention concentration reference value, calculating a ratio of the attention concentration reference value to the attention concentration to output a corresponding concentration ratio;
and determining a target vibration parameter with positive correlation according to the concentration ratio, and adjusting and controlling the target intelligent glasses according to the target vibration parameter so that the target intelligent glasses vibrate based on the target vibration parameter to prompt the target user, wherein the target vibration parameter at least comprises one of a target vibration amplitude and a target vibration frequency.
2. The image-processing-based smart glasses adjustment control method according to claim 1, wherein the step of performing video monitoring processing on the face area of the target user through the target smart glasses to output the face area monitoring video corresponding to the target user comprises:
judging whether an intelligent glasses adjusting control instruction sent by target intelligent glasses or target user terminal equipment corresponding to the target intelligent glasses is received or not;
if an intelligent glasses adjusting control instruction sent by the target intelligent glasses or a target user terminal device corresponding to the target intelligent glasses is received, sending a video monitoring instruction to the target intelligent glasses to control the target intelligent glasses to perform video monitoring processing on a face area of a target user, so as to output a face area monitoring video corresponding to the target user.
3. The intelligent glasses adjustment control method based on image processing as claimed in claim 1, wherein the step of performing the user state recognition processing on the face area surveillance video by using the user state recognition neural network formed by pre-training to output the target user state recognition information corresponding to the target user comprises:
performing video frame region interception processing on each face region monitoring video frame in a plurality of face region monitoring video frames included in the face region monitoring video so as to output a face sub-region monitoring video frame corresponding to each face region monitoring video frame, wherein the face sub-region monitoring video frame is a sub-region where eyes are located in the face region monitoring video frame;
sequencing according to the face subregion monitoring video frame corresponding to each frame of the face region monitoring video frame and the video frame time sequence of the face region monitoring video frame to form a face subregion monitoring video corresponding to the face region monitoring video;
utilizing a user state recognition neural network formed by pre-training to perform recognition matching processing on the face sub-region monitoring video and each frame of standard face sub-region monitoring video in a plurality of pre-configured standard face sub-region monitoring videos so as to output the video matching degree between the face sub-region monitoring video and each standard face sub-region monitoring video;
and according to the video matching degree between the face sub-region monitoring video and each standard face sub-region monitoring video, fusing the user state standard information corresponding to each standard face sub-region monitoring video so as to output the target user state identification information corresponding to the target user.
4. The image-processing-based smart eyewear adjustment control method of claim 3, wherein the step of performing recognition matching processing on the face sub-region surveillance video and each of a plurality of standard face sub-region surveillance videos configured in advance by using a user state recognition neural network formed by pre-training to output a video matching degree between the face sub-region surveillance video and each of the standard face sub-region surveillance videos comprises:
extracting video frame characteristic data corresponding to each frame of face sub-region monitoring video frame included in the face sub-region monitoring video from the face sub-region monitoring video, and extracting video frame characteristic data corresponding to each frame of standard face sub-region monitoring video frame included in the standard face sub-region monitoring video from the standard face sub-region monitoring video;
loading video frame characteristic data corresponding to each frame of the face sub-region monitoring video frame and video frame characteristic data corresponding to each frame of the standard face sub-region monitoring video frame into a user state recognition neural network formed by pre-training, so as to analyze and recognize video frame matching coefficients between each frame of the face sub-region monitoring video frame and each frame of target video frame in a target video frame cluster by using the user state recognition neural network, and analyze and recognize standard video frame matching coefficients between each frame of the standard face sub-region monitoring video frame and each frame of target video frame in the target video frame cluster, wherein the multi-frame target video frames in the target video frame cluster comprise the face sub-region monitoring video frame and the standard face sub-region monitoring video frame;
and analyzing and determining the video matching degree between the face sub-region monitoring video and the standard face sub-region monitoring video based on the video frame matching coefficient between each frame of the face sub-region monitoring video frame and each frame of the target video frame and the standard video frame matching coefficient between each frame of the standard face sub-region monitoring video frame and each frame of the target video frame.
5. The image-processing-based intelligent glasses adjustment control method according to claim 4, wherein the step of extracting video frame feature data corresponding to each frame of face sub-region surveillance video frame included in the face sub-region surveillance video from the face sub-region surveillance video, and extracting video frame feature data corresponding to each frame of standard face sub-region surveillance video frame included in the standard face sub-region surveillance video from the standard face sub-region surveillance video comprises:
extracting video frame pixel distribution data corresponding to each frame of the face subregion monitoring video frame from the face subregion monitoring video, and extracting video frame pixel distribution data corresponding to each frame of the standard face subregion monitoring video frame from the standard face subregion monitoring video;
respectively extracting video frame time sequence data corresponding to each frame of the face sub-region monitoring video from the face sub-region monitoring video, and extracting video frame time sequence data corresponding to each frame of the standard face sub-region monitoring video from the standard face sub-region monitoring video;
and constructing and forming video frame characteristic data corresponding to each frame of the face subregion surveillance video frame based on the video frame pixel distribution data and the video frame time sequence data corresponding to each frame of the face subregion surveillance video frame, and constructing and forming video frame characteristic data corresponding to each frame of the standard face subregion surveillance video frame based on the video frame pixel distribution data and the video frame time sequence data corresponding to each frame of the standard face subregion surveillance video frame.
6. The image-processing-based intelligent glasses adjustment control method as claimed in claim 5, wherein the step of loading the video frame feature data corresponding to each of the facial sub-region surveillance video frames and the video frame feature data corresponding to each of the standard facial sub-region surveillance video frames into a user state recognition neural network formed by pre-training so as to analyze and recognize the video frame matching coefficients between each of the facial sub-region surveillance video frames and each of the target video frames in the target video frame cluster by using the user state recognition neural network, and analyze and recognize the standard video frame matching coefficients between each of the standard facial sub-region surveillance video frames and each of the target video frames in the target video frame cluster comprises:
loading video frame characteristic data corresponding to each frame of the face sub-region monitoring video frame and video frame characteristic data corresponding to each frame of the standard face sub-region monitoring video frame into a user state recognition neural network formed by pre-training;
analyzing and outputting a video frame time sequence data difference value between the face subregion monitoring video frame and each frame target video frame in a target video frame cluster respectively and analyzing and outputting a video frame time sequence data difference value between the standard face subregion monitoring video frame and each frame target video frame in the target video frame cluster respectively based on the video frame time sequence data corresponding to the face subregion monitoring video frame and the video frame time sequence data corresponding to the standard face subregion monitoring video frame through the user state recognition neural network;
and analyzing and outputting the video frame matching coefficients between the face sub-region monitoring video frame and each frame of target video frame respectively, and analyzing and outputting the standard video frame matching coefficients between the standard face sub-region monitoring video frame and each frame of target video frame respectively.
7. The image processing-based intelligent glasses adjustment and control method according to claim 4, wherein the step of loading the video frame feature data corresponding to each of the facial sub-region surveillance video frames and the video frame feature data corresponding to each of the standard facial sub-region surveillance video frames into a user state recognition neural network formed by pre-training so as to analyze and recognize the video frame matching coefficients between each of the facial sub-region surveillance video frames and each of the target video frames in the target video frame cluster by using the user state recognition neural network, and analyze and recognize the standard video frame matching coefficients between each of the standard facial sub-region surveillance video frames and each of the target video frames in the target video frame cluster comprises:
loading video frame characteristic data corresponding to each frame of the face subregion monitoring video frame and video frame characteristic data corresponding to each frame of the standard face subregion monitoring video frame into a video frame characteristic identification mining model included in the user state identification neural network;
and analyzing and recognizing a video frame matching coefficient between each face sub-region monitoring video frame and each frame target video frame by using a video frame feature weight analysis sub-model included in the video frame feature recognition mining model, and analyzing and recognizing a standard video frame matching coefficient between each standard face sub-region monitoring video frame and each frame target video frame.
8. The image-processing-based smart eyewear adjustment control method of claim 4 further comprising the step of training the formed user state-recognition neural network, comprising:
extracting video frame characteristic data corresponding to each frame of exemplary face sub-region monitoring video frame included in the exemplary face sub-region monitoring video from the configured exemplary face sub-region monitoring video, and extracting video frame characteristic data corresponding to each frame of exemplary standard face sub-region monitoring video frame included in the exemplary standard face sub-region monitoring video from the configured exemplary standard face sub-region monitoring video;
loading video frame characteristic data corresponding to each exemplary face subregion monitoring video frame and video frame characteristic data corresponding to each exemplary standard face subregion monitoring video frame into a pre-established user state identification neural network to be updated;
analyzing and identifying an exemplary video frame matching coefficient between each frame of the exemplary facial sub-region surveillance video frame and each frame of an exemplary target video frame in an exemplary target video frame cluster by using the to-be-updated user state recognition neural network, and analyzing and identifying an exemplary standard video frame matching coefficient between each frame of the exemplary standard facial sub-region surveillance video frame and each frame of the exemplary target video frame in the exemplary target video frame cluster, wherein the frames of the exemplary target video frame in the exemplary target video frame cluster comprise the exemplary facial sub-region surveillance video frame and the exemplary standard facial sub-region surveillance video frame;
analyzing and determining an exemplary video matching degree between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video based on an exemplary video frame matching coefficient between each exemplary face sub-region monitoring video frame and each exemplary target video frame respectively and an exemplary standard video frame matching coefficient between each exemplary standard face sub-region monitoring video frame and each exemplary target video frame respectively;
and updating the to-be-updated user state recognition neural network based on the real value of the video matching degree between the exemplary face sub-region monitoring video and the exemplary standard face sub-region monitoring video and by combining the exemplary video matching degree, so as to form the user state recognition neural network corresponding to the to-be-updated user state recognition neural network.
9. An image processing based smart eyewear adjustment control system comprising a processor and a memory, the memory storing a computer program, the processor being configured to execute the computer program to implement the method of any one of claims 1 to 8.
CN202211082278.9A 2022-09-06 2022-09-06 Intelligent glasses adjustment control method and system based on image processing Active CN115171028B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211082278.9A CN115171028B (en) 2022-09-06 2022-09-06 Intelligent glasses adjustment control method and system based on image processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211082278.9A CN115171028B (en) 2022-09-06 2022-09-06 Intelligent glasses adjustment control method and system based on image processing

Publications (2)

Publication Number Publication Date
CN115171028A true CN115171028A (en) 2022-10-11
CN115171028B CN115171028B (en) 2022-11-18

Family

ID=83480499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211082278.9A Active CN115171028B (en) 2022-09-06 2022-09-06 Intelligent glasses adjustment control method and system based on image processing

Country Status (1)

Country Link
CN (1) CN115171028B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160055371A1 (en) * 2014-08-21 2016-02-25 Coretronic Corporation Smart glasses and method for recognizing and prompting face using smart glasses
CN111240018A (en) * 2020-03-12 2020-06-05 深圳捷径观察科技有限公司 VR glasses with vague nerve prevention reminding function and vague nerve prevention reminding method
US20220072380A1 (en) * 2020-09-04 2022-03-10 Rajiv Trehan Method and system for analysing activity performance of users through smart mirror
CN113744499A (en) * 2021-08-12 2021-12-03 科大讯飞股份有限公司 Fatigue early warning method, glasses, system and computer readable storage medium
CN114022841A (en) * 2021-10-22 2022-02-08 深圳市中博科创信息技术有限公司 Personnel monitoring and identifying method and device, electronic equipment and readable storage medium
CN114049679A (en) * 2021-11-18 2022-02-15 中国银行股份有限公司 Intelligent glasses and eye fatigue detection method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WU Di et al.: "Online Attention Detection and Judgment Based on Facial Features", Journal of Shenyang Normal University (Natural Science Edition) *

Also Published As

Publication number Publication date
CN115171028B (en) 2022-11-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant