CN109299641B

CN109299641B - Train dispatcher fatigue monitoring image adaptive processing algorithm

Info

Publication number: CN109299641B
Application number: CN201810354996.4A
Authority: CN
Inventors: 杨奎; 彭其渊; 张晓梅; 胡雨欣
Original assignee: Southwest Jiaotong University; China Railway Corp
Current assignee: Southwest Jiaotong University; China Railway Corp
Priority date: 2018-04-19
Filing date: 2018-04-19
Publication date: 2020-10-16
Anticipated expiration: 2038-04-19
Also published as: CN109299641A

Abstract

The invention discloses a train dispatcher fatigue monitoring image self-adaptive processing algorithm, which belongs to the technical field of pattern recognition based on biological characteristics, optimizes the detection parameters of the current frame image according to the detection result of the previous frame image by a self-adaptive face detection algorithm, reduces the detection range to the maximum extent, reduces the detection times in the process and improves the image detection efficiency; according to the human face eye relationship and the double eye position relationship, the human eye detection range is further narrowed through an adaptive rapid human eye detection and intelligent estimation algorithm, meanwhile, the effective inference and data verification of the eye position are carried out, and the data accuracy and completeness are effectively improved; and evaluating the quality of the continuous image detection result in a subsequent period of time according to an interval face recognition and frame skipping fast processing algorithm, performing differentiated frame skipping processing, and improving the image processing efficiency, so that the aim of performing self-adaptive adjustment on the subsequent processing process according to the data obtained by current processing through a self-adaptive detection technology can be fulfilled, and the image processing quality and efficiency can be improved.

Description

Train dispatcher fatigue monitoring image adaptive processing algorithm

Technical Field

The invention belongs to the technical field of pattern recognition based on biological characteristics, relates to various theories and technologies such as image processing, pattern recognition, computer vision, human physiology and the like, and particularly relates to a train dispatcher fatigue monitoring image adaptive processing algorithm.

Background

The human face detection, the human face recognition and the human eye detection are to determine human faces and human eye areas in images according to facial and eye characteristics and identify object identities corresponding to the human faces, and relate to various theories and technologies such as image processing, mode recognition, computer vision, human physiology and the like.

Opencv (open Source Computer Vision library) is a cross-platform Computer Vision library initiated and involved in development by intel corporation, and is composed of a series of C functions and a small amount of C + + classes, so that many general algorithms in the aspects of image processing and Computer Vision are realized. OpenCV can run on Linux, Windows and Mac OS operating systems, provides language interfaces such as Python, Ruby and MATLAB, has the characteristics of cross-platform, light weight, high efficiency, independence on other external libraries and free open source, and is an ideal tool for image processing, mode recognition and secondary development in the field of computer vision.

OpenCV offers a basic dll library of many image processing, pattern recognition and basic functions in the computer vision field, but has a significant disadvantage in that it provides few GUI interfaces and is difficult to directly meet the needs of application development.

The EmguCV is a cross-platform OpenCV.Net encapsulation, which allows direct invocation of OpenCV functions in the Net language to connect C # and OpenCV well, thereby making up for the GUI deficiencies of OpenCV.

The human face detection and the human eye detection both belong to the field of target detection, the cascade AdaBoost algorithm is a target detection algorithm which is supported by OpenCV and widely applied, an AdaBoost cascade classifier is obtained by utilizing sample Haar characteristic training, and a Haar detection function is called to realize target detection. The AdaBoost algorithm core idea is to train different weak classifiers aiming at the same training set, adaptively promote the weak classifiers into strong classifiers, and finally converge to be stable through iterative weighting.

Image detection and image recognition technologies have been developed rapidly, and face detection and face recognition technologies have been widely used in various industries. The non-contact PERCLOS method based on the human eye closing degree is widely accepted by the industry, and starts to be applied to fatigue monitoring of automobile drivers and pilots, so that a good effect is achieved. Meanwhile, the EMGUCV provides a convenient and efficient system interface, can realize the basic functions of face detection and face recognition, and meets the basic requirements.

The dispatching and commanding working environment of the train dispatcher has obvious openness, wide administration range, wide integrated information and more equipment and systems, and the technical equipment arrangement usually adopts a multi-row and multi-column mode. Different from other industrial personnel who mainly focus on the sight in the front direction, the train dispatcher can focus on different areas in different time periods according to working requirements in the dispatching and commanding process, actions such as head raising, head lowering or left and right side looking can occur, and the sight focusing has the obvious dispersion characteristic.

According to the dispatching command work scene characteristics, the human body physiological characteristics and the fatigue monitoring requirements, the performance requirements of the train dispatcher fatigue monitoring system are mainly embodied in five aspects of non-invasiveness, parallelism, continuity, high efficiency and accuracy. The image processing is the most time-consuming link of fatigue monitoring, the data processing efficiency and the performance of a fatigue monitoring system are concerned, the parallel fatigue monitoring of a train dispatcher puts higher requirements on the technical efficiency of the image processing, and the prior art is difficult to meet the requirements.

Disclosure of Invention

In order to solve the above problems in the prior art, the present invention aims to provide an adaptive processing algorithm for fatigue monitoring images of train dispatchers, so as to achieve the purpose of realizing the functions of different modules in image processing, meeting the performance requirements of the system in terms of quality and speed, and performing adaptive adjustment on the subsequent processing process according to the data obtained by current processing through an adaptive detection technology, so as to improve the image processing quality and efficiency to the maximum extent, and meet the system performance requirements while realizing the image processing function.

The technical scheme adopted by the invention is as follows: a train dispatcher fatigue monitoring image self-adaptive processing algorithm is a development platform based on VS2010, and utilizes C # language to call EmguCV for secondary development, and mainly comprises the following steps:

face detection and eye detection

Obtaining FaceHaar by loading a face classifier haarcascade _ frontage _ alt2.xml, obtaining EyeHaar by loading a face classifier haarcascade _ mcs _ righteye. xml, and calling a DetectMultiScale function to respectively obtain the following formulas:

Faces＝FaceHaar.DetectMultiScale(Image1,SF1,MinNH1,MinSize1,MaxSize1) (2-1)

Eyes＝EyeHaar.DetectMultiScale(Image2,SF2,MinNH2,MinSize2,MaxSize2) (2-2)

wherein, the DetectMultiScale is a multiscale detection method of CascadeClassiifier class, and a region set of a specific target object in an input image is obtained;

image1 and Image2 represent Image objects for face detection and eye detection, respectively, and are of the type Image < Gray, byte >;

SF1 and SF2 represent scaling factors for face detection and eye detection, respectively;

MinNH1 and MinNH2 represent the minimum number of adjacent rectangles constituting the human face detection and human eye detection targets, respectively;

MinSize1 and MaxSize1 respectively represent the minimum size and the maximum size of a rectangular area obtained by face detection;

MinSize2 and MaxSize2 respectively represent the minimum size and the maximum size of a rectangular area obtained by human eye detection;

(II) face recognition

The face recognition is realized by calling a Recognize method of EigenObjectRecognizer class in EmguCV, the identity of a target object is identified through face features based on a face region obtained by face detection, the face region obtained by face detection of a current frame image is traversed in the face recognition process until the face region belonging to the target object is found, and then subsequent eye detection and eyelid distance calculation are carried out, wherein the key process formula is as follows:

Recognizer＝newEigenObjectRecognizer(Images,Labels,DistanceThreshold,termCrit) (2-3)

Name＝recognizer.Recognize(result).Label (2-4)

images are face recognition training Image arrays, and the type is Image < Gray, byte >;

labels is an identification number array corresponding to the face recognition image array, and the type is string;

DistanceThreshold is a characteristic distance threshold;

TerrmCrit is a face recognition training standard and is MCvTerrmCriteria in type;

the Name is an object identity obtained by face recognition and belongs to elements in Labels.

Further, the face detection adopts a fast self-adaptive face detection algorithm based on interframe constraint, in a face detection area, a face detection search window is sequentially detected from the size of MinSize1, if the face cannot be detected, the search window is expanded by SF1 times, and the steps are circularly carried out until the face is detected or the size of the search window reaches MaxSize 1; let i be the frame variable for image processing, PR_iFor rectangular areas of the image, DR_iDetecting a target region, FR, for the face of the image_iIf the detected face rectangular area is the face rectangular area of the image, then:

MinSize1_i≤FR_i.Size≤MaxSize1_i(2-6)

taking the face detection target area of the next frame of image as DR_i+1Window size MinSize1_i+1And MaxSize1_i+1Let f₁、f₂And f₃Respectively represent DR_i+1、MinSize1_i+1And MaxSize1_i+1And FR_iAdaptive functional relationship between:

DR_i+1＝f₁(FR_i)1≤i≤M-1,i∈N (2-7)

MinSize1_i+1＝f₂(FR_i)1≤i≤M-1,i∈N (2-8)

MaxSize1_i+1＝f₃(FR_i)1≤i≤M-1,i∈N (2-9)

wherein M is the image frame number of the current video file.

Further, let λ be the search region expansion coefficient, then the face detection target region DR in the i +1 th frame image_i+1The position parameters of (1) are X and Y, and the size parameters are Width and Height; f is₁The adaptive function is represented by the following formula:

let α andβ denotes MinSize1_i+1And MaxSize1_i+1Relative to FR_iScaling of the dimensions, then function f₂And f₃Can be expressed by equations (2-11) and (2-12), respectively:

further, DR may occur when actually detecting_i+1Out of PR_i+1In the case of (1), wherein PR_i+1For the image rectangular area of the next frame image, the human face detection target area DR is needed to be carried out according to the actual situation_i+1DR modified to be feasible_i′₊₁(ii) a Get DR_i+1And PR_i+1The intersection is used as the face detection target area of the (i + 1) th frame image, then DR_i′₊₁＝DR_i+1∩PR_i+1。

Furthermore, the eye detection adopts an adaptive rapid eye detection algorithm, so that the eye detection is ER_iFR-based video file frame i image_iAnd a human eye detection target area (ER) determined by the rule of 'three-family five-eye' of the human face in a self-adaptive manner_iWith the position parameters X and Y and the size parameters Width and Height, determining ER_iAnd FR_iThe adaptive functional relationship between the following:

then according to the human eye detection area ER_iAdaptively determining a human eye detection minimum search window MinSize2_iAnd a maximum search window MaxSize2_iAnd MinSize2_iAnd MaxSize2_iAnd ER_iThe adaptive functional relationship between the following formulas:

further, eye position inference and data verification are carried out under specific conditions through an adaptive intelligent algorithm, which is as follows: let LER_iAnd RER_iAre respectively in ER_iThe method comprises the following steps of detecting a left eye image region and a right eye image region obtained in the atmosphere, wherein the q frame image human eye adaptive intelligent inference and verification reference information comprises: LER_p、RER_p、FR_pAnd FR_qWherein p is the maximum value of frame number variables of complete human eye information and human face information detected before the q frame image, and p is less than or equal to q-1;

let ERN_pIs ER_pDirectly detecting the number of the obtained human eyes in the range, intelligently deducing and checking the content factor ERN of the q frame image human eyes_pThe values are different, and the following three scenarios are specifically included:

(1) if ERN_pNot less than 2, according to LER_p、RER_p、FR_pAnd FR_qChecking and detecting the obtained eye regions one by one, eliminating redundant eye regions, retaining two optimal eye regions, and respectively determining left eye image region LER according to relative position relationship_qAnd a right eye image region RER_q；

(2) If ERN _p1, according to LER_p、RER_p、FR_pAnd FR_qChecking the eye region of the information, and determining the eye region obtained by direct detection as the left eye region LER_qOr right eye region RER_qAfter passing the test, based on the eye area in the ER_qInferring the other eye region within range;

(3) if ERN _p0, according to LER_p、RER_p、FR_pAnd FR_qInformation directly in ER_qInferring left eye region LER within range_qAnd right eye region RER_q。

Further, according to LER_p、RER_p、FR_pAnd FR_qAdaptive intelligently inferred binocular region LER_q' and RER_q' by the following formulas, respectively:

wherein s is_q,pThe scaling factor of the human eye region in the image of the q frame relative to the human eye region in the image of the p frame is determined,

s_q,p＝(FR_q.Width/FR_p.Width+FR_q.Height/FR_p.Height)/2(2-18)。

further, the identity of the target object is checked through an interval face recognition algorithm according to the change conditions of the face position parameters and the size parameters, a Recognize method of an EigenObjectRecognizer class in EmguCV is called to determine the personnel identity corresponding to the face area after the condition is triggered, and the triggering condition for face recognition after the face image is detected in the (i + 1) th frame image comprises the following steps:

(1) the last frame of image of the current frame of image fails to detect the face region, i.e. DR_i+1＝PR_iIndicating that the detected face area is the face area of a person newly entering the image range;

(2) face rectangular region FR detected in i +1 th frame image_i+1The following formulas cannot be satisfied simultaneously:

wherein, omega is more than 0 and less than or equal to 0.4, sigma is more than 0 and less than or equal to 0.15, FR is selected_iFor the rectangular region of the face detected in the i-th frame image, FR_i+1The face rectangle area detected in the i +1 th frame image.

Further, when the target object leaves the video recording range, the face detection frame skipping processing can be performed in the image processing process, the triggering condition of the face detection frame skipping processing is set to be that the face region cannot be detected in the continuous K frames of images, and the value range of the parameter K is [5,25 ].

The invention has the beneficial effects that:

1. the human face detection and the human eye detection based on the EmguCV have good robustness, and the human face and the human eye area can be still accurately detected when the posture slightly deviates and the target is partially shielded;

2. by adopting a rapid self-adaptive face detection algorithm, the detection area and related parameters of the face detection are self-adaptively adjusted according to the face position and size data detected in the continuous frame images, and the detection rate is improved to the maximum extent on the basis of ensuring the detection accuracy;

3. in the human eye search area determined based on the human face area, the adaptive rapid human eye detection and intelligent inference algorithm can robustly find the corresponding double-eye area under different situations, the double-eye image area is processed to obtain the distance between the eyelids of the two eyes, and the accuracy and the robustness of the human eye detection are improved by complementing the double-eye area in the image;

4. the problem of object identity checking after face detection is differentially solved by adopting an interval face recognition technology, and the image processing efficiency can be improved to the maximum extent while the accuracy of an image processing object is ensured;

5. by adopting a frame skipping fast processing algorithm, the frame skipping processing is carried out by the face detection when the target object leaves the video recording range, and the overall image processing efficiency can be effectively improved by not carrying out the face detection when the target object leaves the video recording range.

Drawings

FIG. 1 is a schematic diagram showing the relationship among an image rectangular region, a face detection target region and a face rectangular region in an adaptive processing algorithm for a train dispatcher fatigue monitoring image provided by the invention;

FIG. 2 is a schematic diagram showing changes of a rectangular region of a face of a target object moving along the X-axis direction in the adaptive processing algorithm for the fatigue monitoring image of the train dispatcher;

FIG. 3 is a schematic diagram of a change of a rectangular area of a face of a target object moving along a Z-axis direction in an adaptive processing algorithm for a fatigue monitoring image of a train dispatcher, provided by the invention;

FIG. 4 is a schematic diagram showing changes of a rectangular area of a face of a target object moving along X-axis and Z-axis directions in an adaptive processing algorithm for a fatigue monitoring image of a train dispatcher;

FIG. 5 is a schematic diagram showing changes of a human face region moving forward and backward in the Y-axis direction of a target object in the adaptive processing algorithm for the fatigue monitoring image of the train dispatcher;

FIG. 6 is a schematic diagram showing the differentiation of the same size and displacement in an image at different distances in the adaptive processing algorithm for the fatigue monitoring image of the train dispatcher provided by the invention;

FIG. 7 is a schematic diagram of the comprehensive time-consuming lateral comparison of face detection in different modes in the adaptive processing algorithm for the fatigue monitoring image of the train dispatcher provided by the invention;

FIG. 8 is a schematic diagram of the distribution of "three-family five eyes" of a face image in the adaptive processing algorithm for a train dispatcher fatigue monitoring image provided by the invention;

fig. 9 is a schematic diagram of the process of triggering, continuously triggering and recovering from normal of the face detection frame skipping processing in the adaptive processing algorithm for the fatigue monitoring image of the train dispatcher provided by the invention.

1--FR_iRectangular regions of the face detected for the image, 2-DR_iDetecting a target region, 3-PR, for the image face_iFor the image rectangular area, O-video capture device, 4-FF mode, 5-AF mode, 6-FA mode, 7-AA mode, 8-BS mode, 9-ear, 10-eye, 11-nose, 12-mouth.

Detailed Description

The invention is further described with reference to the following figures and specific embodiments.

The invention provides a train dispatcher fatigue monitoring image self-adaptive processing algorithm, which is based on a VS2010 development platform, calls EmguCV by using C # language for secondary development and mainly comprises the following steps:

face detection and eye detection

Obtaining FaceHaar by loading a face classifier haarcascade _ frontage _ alt2.xml, wherein the FaceHaar is a face detection example of an AdaBoost cascade classifier CascadeClassifier, obtaining Eyehaar by loading a face classifier haarcascade _ mcs _ righteye. xml, the EyeHaar is a human eye detection example of the AdaBoost cascade classifier CascadeClassifier, and calling a DetectMutlyscale function to respectively obtain the following formulas:

Faces＝FaceHaar.DetectMultiScale(Image1,SF1,MinNH1,MinSize1,MaxSize1)(2-1)

Eyes＝EyeHaar.DetectMultiScale(Image2,SF2,MinNH2,MinSize2,MaxSize2)(2-2)

wherein, Faces is a face area rectangular array returned by the image face detection, the type is Rectangle, and the elements of the array contain the position and size information of a face;

eye is a human eye area rectangular array returned by image human eye detection, the type is Rectangle, and the element of eye comprises the position and size information of one eye;

DetectMultiScale is a multiscale detection method of CascadeClassiifier class, and a region set of a specific target object in an input image is obtained;

image1 and Image2 respectively represent Image objects of face detection and eye detection, and are of the type Image < Gray, byte >, and according to the geometric inclusion relationship between eyes and a face region, the eye detection is performed in a face region obtained by the face detection, namely Image2 is an Image region corresponding to a certain element of Faces, and if the face cannot be detected, the subsequent eye detection is not performed;

SF1 and SF2 respectively represent scaling factors of human face detection and human eye detection, represent a search window proportion coefficient in two adjacent scans, the default value is 1.1, the search window is sequentially expanded by 10% each time, and the scaling factors can be set according to needs;

MinNH1 and MinNH2 respectively represent the minimum number min _ neighbors of adjacent rectangles constituting the human face detection target and the human eye detection target, and the number of the small rectangles constituting the detection target and the number of the small rectangles smaller than min _ neighbors are both excluded, and the default values are 3. If the value is 0, the function returns all the detected candidate rectangles without any operation, and the setting is usually used for a user to self-define a combined program of the detection result;

MinSize1 and MaxSize1 respectively represent the minimum Size and the maximum Size of a rectangular area obtained by face detection, the minimum Size and the maximum Size jointly cooperate to limit the range of the face area, and the types are all Size;

MinSize2 and MaxSize2 respectively represent the minimum Size and the maximum Size of a rectangular area obtained by human eye detection, the minimum Size and the maximum Size jointly cooperate to limit the range of the human eye area, and the types are all Size;

(II) face recognition

Name＝recognizer.Recognize(result).Label (2-4)

an EigenObjectRecognizer adopts a PCA target recognizer;

the Recognizer is an example of an EigenObjectRecognizer class, and the Recognize method of the Recognizer can obtain specific object identification information;

images are face recognition training Image arrays, the type is Image < Gray, byte >, the size of each Image is the same, histogram normalization is carried out, and the Images are obtained through artificial training in advance;

labels is an identification number array corresponding to the face recognition image array, the type is string, elements of the array and Images in the training image array have sequentially corresponding mapping relations, and the Labels is correspondingly specified during image training;

DistanceThreshold is a characteristic distance threshold, and the larger the value is, the harder the face recognition is but the higher the recognition accuracy is;

When the target object workplace has an open characteristic, the target object may leave the video capture area for a short time during the work process, and other target objects may appear in the video capture area. Therefore, for a certain frame of image of the captured video, three results may occur in face detection: (1) an area without a human face; (2) a face region belonging to a target object or other target objects; (3) and the plurality of face regions comprise the target object or do not comprise the target object.

The face recognition is based on a face area obtained by face detection, and the identity of a target object is identified through face features. And traversing the face region obtained by face detection of the current frame image in the face recognition process until the face region belonging to the target object is found, and then performing subsequent eye detection and eyelid distance calculation.

The face recognition can ensure that eyelid distance data obtained by image processing belongs to a specific target object so as not to influence the data accuracy of the fatigue degree development.

After the basic function of the face detection is realized, the performance requirements of fatigue monitoring in the aspects of speed, accuracy and the like need to be further met, the face detection adopts a fast self-adaptive face detection algorithm based on interframe constraint, namely, the detection area and related parameters of the face detection are self-adaptively adjusted according to the face position and size data detected in the continuous frame images, and the detection speed is improved to the maximum extent on the basis of ensuring the detection accuracy.

In the human face detection area, sequentially detecting a human face detection search window from the size of MinSize1, if the human face cannot be detected, expanding the search window by SF1 times, and circularly performing the steps until the human face is detected or the size of the search window reaches MaxSize 1; let i be the frame variable for image processing, PR_iFor rectangular areas of the image, DR_iDetecting a target region, FR, for the face of the image_iFor rectangular regions of the face detected for this image, PR_iRectangular region of image, DR_iFace detection target area and FR_iThe relationship between the face rectangular regions is shown in fig. 1, then:

MinSize1_i≤FR_i.Size≤MaxSize1_i(2-6)

the human face video acquisition frame frequency is 25 frames/s, the time interval of adjacent frame images is 0.04s, the changes of human face position parameters and size parameters have gradual changes, and the changes can be finely depicted through continuous frame image recording. The single-frame image face detection result can directly reflect face position and size information, and the face detection result of the continuous frame image further contains the change trend of the face position and size, so that effective reference is provided for the face detection of the next frame image.

Self-adaptively determining the face detection region DR of the next frame image by using the face position parameter and size parameter information obtained by continuous frame image detection_i+1Position and size parameters of (2), MinSize1_i+1And MaxSize1_i+1Minimizing DR by accurately locating the face detection area_i+1Size and MaxSize1_i+1Maximize MinSize1_i+1The human face detection area is reduced to the maximum extent, the detection times are reduced, and the human face detection rate is further improved.

Taking the face detection target area of the next frame of image as DR_i+1Window size MinSize1_i+1And MaxSize1_i+1When the face is not detected in the ith frame image, DR_i+1、MinSize1_i+1And MaxSize1_i+1Default values of initial values are all taken; when the face is detected in the ith frame image, the face detection parameters of the (i + 1) th frame image are detected according to the FR_iAdaptively determine, let f₁、f₂And f₃Respectively represent DR_i+1、MinSize1_i+1And MaxSize1_i+1And FR_iAdaptive functional relationship between:

DR_i+1＝f₁(FR_i)1≤i≤M-1,i∈N (2-7)

MinSize1_i+1＝f₂(FR_i)1≤i≤M-1,i∈N (2-8)

MaxSize1_i+1＝f₃(FR_i)1≤i≤M-1,i∈N (2-9)

wherein M is the image frame number of the current video file.

Let lambda be the search region expansion coefficient, then the face detection target region DR in the i +1 th frame image_i+1The position parameters of (1) are X and Y, and the size parameters are Width and Height; f is₁The adaptive function is represented by the following formula:

when DR may occur during the actual detection process_i+1Out of PR_i+1In the case of (1), wherein PR_i+1For the image rectangular area of the next frame image, the human face detection target area DR is needed to be carried out according to the actual situation_i+1DR modified to be feasible_i′₊₁I.e. by

Get DR_i+1And PR_i+1The intersection is used as the face detection target area of the (i + 1) th frame image, then DR_i′₊₁＝DR_i+1∩PR_i+1。

Preferably, λ is 0.4 in the above formula, and the specific analysis is as follows:

the target object position may be displaced leftwards and rightwards (X direction), forwards and backwards (Y axis direction) or upwards and downwards (Z axis direction) according to work requirements, and the displacement may occur in one of three directions, or may occur in two or three directions. The position of the video acquisition equipment is kept fixed and unchanged in the video acquisition process, and the displacement of the target object can cause the position or the size of the acquired face to be correspondingly changed. The changes in the X-axis and Z-axis directions may cause corresponding changes in the face position, and the corresponding face regions may appear in all ranges of positive and negative maximum displacements in two directions, as shown in fig. 2, 3, and 4, respectively.

The target object moves forwards and backwards in the Y-axis direction, the position and the size of a face area in the image are influenced at the same time, the target object moves forwards, and the size of the corresponding face image area is larger as the face is closer to the camera; on the contrary, the farther the face is away from the camera, the smaller the corresponding face image size is. As shown in fig. 5, the schematic diagram of the forward and backward movement of the target object in the Y-axis direction shows that the distance between the face of the target object and the camera is changed to N times, and the length and width of the detected rectangular region of the face image are both changed to 1/N times.

Normally, the average moving speed of the human is about 1m/s, the position change process within 1s is recorded in the continuous 25 frames of images, and the maximum value of the actual average displacement of the face in the two adjacent frames of images is about 4 cm. For the same target object, the actual size of the face is kept unchanged, the farther the face is away from the camera, the smaller the face area obtained by face detection is, and the smaller the change of the actual displacement reflected in the image at the same degree is; conversely, the larger the face region obtained by face detection is, the larger the change of the actual displacement reflected in the image of the same degree is. As shown in fig. 6, OE ═ 2OA, and the areas of equal size at ABCD and EFGH appear as a in the image, respectively₁B₁C₁D₁And E₁F₁G₁H₁The equivalent scale displacement is respectively represented as A in the image₁A₂And E₁E₂Wherein A is₁B₁＝2E₁F₁，A₁A₂＝2E₁E₂。

The actual size and the displacement of the face of the target object exist objectively, the size of the face region and the size of the displacement in the image are synchronously enlarged or reduced, and the face position and the size detected by the current frame image are utilized to determine the next frame face detection region, so that the method has the characteristics of obvious self-adaption and high efficiency. Influenced by the height and the width of the working table of the target object, the distance between the target object and the video acquisition equipment is not less than 40cm, so that the size change rate of the face area detected in two adjacent frames of images caused by the forward and backward movement in the Y-axis direction is not more than 10%. The average size of the face is about 11cm X18 cm, and the displacements of the face in the X-axis direction and the Z-axis direction in two adjacent frames of images are usually less than 40% of the width of the face. By combining three displacement directions, the face region of the current frame can be used as the face detection region of the next frame image under the normal condition that the face width is respectively expanded outwards by 40% in four directions.

Analysis and calculation based on the Y-axis direction position and the forward-backward movement speed of the train dispatcher find that the size change of the same face in the adjacent frame images basically does not exceed 10 percent, so the FR is_iAdaptively determining DR when available_i+1While can also be FR_i+1Size provides a reference, by minimum search window MinSize1_i+1And a maximum search window MaxSize1_i+1Describing FR in face detection process respectively_i+1Lower and upper limits of size, using FR_iAdaptively maximizing MinSize1_i+1And minimize MaxSize1_i+1Can minimize FR_i+1The size is feasible, and the detection speed can be effectively improved.

Let α and β denote MinSize1, respectively_i+1And MaxSize1_i+1Relative to FR_iScaling of the dimensions, then function f₂And f₃Can be expressed by equations (2-11) and (2-12), respectively:

on the basis of comprehensively considering the size change range of the face of the adjacent frames, the more the alpha and the beta approach to 1 at the same time, the faster the face detection speed of the image of the (i + 1) th frame is, and on the basis of 10% scaling, the margin of 5% is considered, preferably, the values are 0.85 and 1.15 respectively.

Face detection time experiments were performed in different modes: randomly selecting a sample video to perform a face detection test, equally dividing 650 frames of video images into 13 groups, sequentially performing image capture, preprocessing and face detection on 50 frames of each group, wherein the face detection time under different modes is shown in fig. 7.

Wherein AA represents DR_i+1、MinSize1_i+1And MaxSize1_i+1Are all based on FR_iSelf-adaptive determination; AF denotes MinSize1_i+1And MaxSize1_i+1Based on FR_iAdaptive determination of DR_i+1Is the global Domain (DR)_i+1＝PR_i+1) (ii) a FF represents MinSize1_i+1And MaxSize1_i+1Is a fixed value, DR_i+1Is the global range; FA stands for MinSize1_i+1And MaxSize1_i+1Is a fixed value, DR_i+1Based on FR_iSelf-adaptive determination; BS denotes a basic processing mode, and only frame image capture and preprocessing are performed.

AA. The average time difference of face detection in AF, FF, FA and BS modes is huge, and the average time of processing 50 frames of images is 317ms, 435ms, 724ms, 405ms and 217ms respectively. The comprehensive rate of the face detection in the AA mode is more than twice of that in the FF mode, basic processing work (BS mode content) such as image capture, image preprocessing, analysis and calculation is eliminated, and the rate of the single face detection part in the AA mode is more than 5 times of that in the FF mode, so that the self-adaptive face detection efficiency is remarkably improved.

Human eye detection in the image processing process is a basic premise for judging the subsequent eye closing degree, and the human eye detection range is usually determined by a detected human face region, so that the human eye detection rate is improved by reducing the detection range. The spatial distribution of facial organs of the human face usually meets the specific proportional relation of 'three-court five-eye', wherein three-court is the length proportion of the face, the length of the face is divided into three equal parts, namely, the forehead hairline to the eyebrow bone, the eyebrow bone to the nasal base and the nasal base to the chin; the five eyes are the width proportion of the face, the width of the face from the left hairline to the right hairline is equally divided into five eye-shaped lengths, and the transverse positions of the two eyes are respectively positioned at the second and fourth eye-shaped length positions, as shown in fig. 8, wherein, the ears 9 are positioned above the eyebrows and below the pen point; eyes 10 are at 1/2 of the face; the bottom of the nose 11 is at 1/2 midway between the eyes and chin, the width is the width of the space between the two eyes, and the mouth 12 is at 1/3 between the nose 11 and chin.

Based on the space constraint relation of three groups and five eyes of the human face, the human eye detection can be further reduced in a self-adaptive manner within the range of the human faceThe detection range, the human eye detection areas at different positions, different postures and different scales are all subjected to self-adaptive change along with the human face area and completely contain the two eye areas obtained by detection, and the human eye detection adopts a self-adaptive rapid human eye detection algorithm, namely ER_iFR-based video file frame i image_iAnd a human eye detection target area (ER) determined by the rule of 'three-family five-eye' of the human face in a self-adaptive manner_iWith the position parameters X and Y and the size parameters Width and Height, determining ER_iAnd FR_iThe adaptive functional relationship between the following:

based on the size relationship between human eyes and human faces, the sample video is used for verifying a human eye detection search window, and then the human eye detection area ER is used_iAdaptively determining a human eye detection minimum search window MinSize2_iAnd a maximum search window MaxSize2_iThe detection rate can be improved to the maximum extent while the detection accuracy is ensured, and MinSize2_iAnd MaxSize2_iAnd ER_iThe adaptive functional relationship between the following formulas:

the self-adaptive intelligent inference and verification algorithm for human eye detection specifically comprises the following steps of carrying out eye position inference and data verification under specific conditions by using a self-adaptive intelligent algorithm, and taking physiological characteristics of human eyes and human faces as a theoretical basis:

(1) the stability of the relative relationship between the positions and the scales of the human eyes and the human faces;

(2) substantial consistency of binocular size;

(3) the synchronicity of the change of the human face and the human eyes in position and scale;

(4) the degree of closure of the eyes and the synchronism of time;

the human eye and human face physiological characteristic rules contain the acquired video images, and human face information and human eye information obtained through detection are reflected, so that the human face and human eye information (position and scale) obtained through direct detection is a direct basis for self-adaptive intelligent inference and verification of subsequent human eye detection.

The method comprises the following specific steps: let LER_iAnd RER_iAre respectively in ER_iThe method comprises the following steps of detecting a left eye image region and a right eye image region obtained in the atmosphere, wherein the q frame image human eye adaptive intelligent inference and verification reference information comprises: LER_p、RER_p、FR_pAnd FR_qWherein p is the maximum value of frame number variables of complete human eye information and human face information detected before the q frame image, and p is less than or equal to q-1;

Different scenesThe self-adaptive intelligent inference and verification contents of the lower frame are different, but the intelligent inference and verification principles are essentially the same, and the LER in the double-eye area is calculated by taking the effective face and double-eye information of the previous frame as reference_q′.X、LER_q′.Y、LER_q′.Width、LER_q'. Height and RER_q′.X、RER_q′.Y、RER_q′.Width、RER_q'. Height, using the above parameter data to estimate the two-eye area in the face area of the current frame image, so as to perform checksum inference on the directly detected human-eye area.

The concrete steps are as follows:

according to LER_p、RER_p、FR_pAnd FR_qAdaptive intelligently inferred binocular region LER_q' and RER_q' by the following formulas, respectively:

s_q,p＝(FR_q.Width/FR_p.Width+FR_q.Height/FR_p.Height)/2 (2-18)。

according to the above, in ER_qAfter eye detection in range, by comparing the detected eye area with the LER_q' and RER_qThe position and scale relation between the eye detection system and the eye detection method is verified, the position of an eye region which is not directly detected is deduced, and the accuracy and robustness of human eye detection are improved by completing the two eye regions in the image.

The identity of a target object is checked according to the change conditions of face position parameters and size parameters through an interval face recognition algorithm, a train dispatcher is used as the target object, the openness of a train dispatching command place causes faces of a plurality of train dispatchers or non-target train dispatchers to possibly appear in a frame image, and the face recognition technology can check the identity of personnel through face features and determine the current-class dispatcher of the train dispatching desk from a detected face area so as to eliminate the interference of other train dispatchers. The face recognition is carried out on the face area obtained by the face detection of each frame of image, the identity of the object can be determined most accurately, but the workload of image processing is obviously increased, and the sorting processing rate of the fatigue monitoring system is reduced.

The positions and sizes of the face areas of the adjacent frame images have gradual change characteristics, and the face areas have corresponding upper limit of change ratio. After the object identity is determined by face recognition, for the subsequent continuous frame images capable of detecting the face, the object identity can be determined according to the face position and size change condition. Therefore, the face recognition for determining the identity of the object is not necessary frame by frame in the image processing process, and the recognition is performed when the identity of the object cannot be checked according to the change situation of the face position and the size.

The interval face recognition algorithm is to set a face recognition triggering condition, call Recognize method of EigenObjectR ecognizer class in EmguCV to determine the identity of people corresponding to the face area after the condition is triggered, and the triggering condition for face recognition after the face image is detected in the (i + 1) th frame image comprises the following steps:

wherein, omega is more than 0 and less than or equal to 0.4, sigma is more than 0 and less than or equal to 0.15, FR is selected_iFor the rectangular region of the face detected in the i-th frame image, FR_i+1For the rectangular region of the face detected in the i +1 th frame image, the formula (2-19) is the above-mentioned inter-frame constraint conditionBased on the adaptive face detection analysis, the face region passes through an identity checking conditional formula based on the face position and size change condition. In the value range, the smaller the values of omega and sigma are, the stricter the condition for checking the identity according to the position and size change of the face is, and the more the number of frame images required to be subjected to face recognition is.

In the daily dispatching and commanding work process, only on-duty dispatching and commanding personnel are in front of a train dispatching desk in most of time, the problem of checking the identity of an object after face detection is solved in a differentiated mode through a face recognition technology, and the image processing efficiency can be improved to the maximum extent while the accuracy of the image processing object is ensured.

In addition, a train dispatcher may leave a video recording range for a short time in the dispatching and commanding process, so that the frame images in a continuous period of time cannot detect the face. According to the self-adaptive face detection algorithm under the interframe constraint, when the face cannot be detected in the current frame, the face detection range of the next frame is expanded to the whole image area, so that the single-frame face detection time is obviously increased. The situation that the train dispatcher leaves the video recording range usually lasts for a certain time, the face detection of the acquired image in the period belongs to invalid processing, and meanwhile, the processing time of a single frame image is much longer than the time when the train dispatcher appears in the frame image range.

The video recording frame rate of the train dispatcher is 25 frames/second, when the train dispatcher leaves a video recording range, the whole image processing efficiency can be effectively improved by not carrying out face detection, when a target object leaves the video recording range, the face detection frame skipping processing can be carried out in the image processing process, the triggering condition of the face detection frame skipping processing is set to be that a face area cannot be detected by continuous K frames of images, and the value range of the parameter K is [5,25 ].

When the human face detection triggers frame skipping processing, the continuous frame skipping number can be determined according to the image processing requirement, a fixed integral multiple of 25 can be taken within the range of [100,250], the corresponding actual time length is 4-10 s, and the fatigue degree of a train dispatcher can not be influenced. When the face detection frame skipping processing is triggered continuously for multiple times, the number of continuously skipped frames can be sequentially increased gradually, but is not easy to exceed 1000 frames, and the triggering, continuous triggering and normal recovery processes of the face detection frame skipping are shown in fig. 9.

The invention is not limited to the above alternative embodiments, and any other various forms of products can be obtained by anyone in the light of the present invention, but any changes in shape or structure thereof, which fall within the scope of the present invention as defined in the claims, fall within the scope of the present invention.

Claims

1. The train dispatcher fatigue monitoring image self-adaptive processing algorithm is characterized in that a development platform based on VS2010 calls EmguCV through C # language for secondary development, and mainly comprises the following steps:

face detection and eye detection

Faces＝FaceHaar.DetectMultiScale(Image1,SF1,MinNH1,MinSize1,MaxSize1) (2-1)

Eyes＝EyeHaar.DetectMultiScale(Image2,SF2,MinNH2,MinSize2,MaxSize2) (2-2)

(II) face recognition

Recognizer＝new EigenObjectRecognizer(Images,Labels,DistanceThreshold,termCrit) (2-3)

Name＝recognizer.Recognize(result).Label (2-4)

DistanceThreshold is a characteristic distance threshold;

the Name is an object identity obtained by face recognition and belongs to elements in Labels;

the face detection adopts a fast self-adaptive face detection algorithm based on interframe constraint, in a face detection area, a face detection search window is sequentially detected from the size of MinSize1, if the face cannot be detected, the search window is expanded by SF1 times, and the steps are circularly carried out until the face is detected or the size of the search window reaches MaxSize 1; let i be the frame variable for image processing, PR_iFor rectangular areas of the image, DR_iDetecting a target region, FR, for the face of the image_iIf the detected face rectangular area is the face rectangular area of the image, then:

MinSize1_i≤FR_i.Size≤MaxSize1_i(2-6)

take the next frameThe human face detection target area of the image is DR_i+1Window size MinSize1_i+1And MaxSize1_i+1Let f₁、f₂And f₃Respectively represent DR_i+1、MinSize1_i+1And MaxSize1_i+1And FR_iAdaptive functional relationship between:

DR_i+1＝f₁(FR_i) 1≤i≤M-1,i∈N (2-7)

MinSize1_i+1＝f₂(FR_i) 1≤i≤M-1,i∈N (2-8)

MaxSize1_i+1＝f₃(FR_i) 1≤i≤M-1,i∈N (2-9)

wherein M is the image frame number of the current video file.

2. The adaptive processing algorithm for the train dispatcher fatigue monitoring image as recited in claim 1, wherein let λ be a search region expansion coefficient, then the face detection target region DR in the i +1 th frame image_i+1The position parameters of (1) are X and Y, and the size parameters are Width and Height; f is₁The adaptive function is represented by the following formula:

3. the train dispatcher of claim 2The self-adaptive processing algorithm of the fatigue monitoring image is characterized in that DR can occur in the actual detection process_i+1Out of PR_i+1In the case of (1), wherein PR_i+1For the image rectangular area of the next frame image, the human face detection target area DR is needed to be carried out according to the actual situation_i+1Corrected to feasible DR'_i+1(ii) a Get DR_i+1And PR_i+1Taking the intersection as the face detection target area of the (i + 1) th frame image, and then obtaining DR'_i+1＝DR_i+1∩PR_i+1。

4. The train dispatcher fatigue monitoring image adaptive processing algorithm as claimed in claim 1, wherein the eye detection adopts an adaptive fast eye detection algorithm, let ER_iFR-based video file frame i image_iAnd a human eye detection target area (ER) determined by the rule of 'three-family five-eye' of the human face in a self-adaptive manner_iWith the position parameters X and Y and the size parameters Width and Height, determining ER_iAnd FR_iThe adaptive functional relationship between the following:

5. the adaptive processing algorithm for the image of train dispatcher fatigue monitoring according to claim 4The method is characterized in that the eye position inference and data verification are carried out under specific conditions through an adaptive intelligent algorithm, and the method specifically comprises the following steps: let LER_iAnd RER_iAre respectively in ER_iThe method comprises the following steps of detecting a left eye image region and a right eye image region obtained in the atmosphere, wherein the q frame image human eye adaptive intelligent inference and verification reference information comprises: LER_p、RER_p、FR_pAnd FR_qWherein p is the maximum value of frame number variables of complete human eye information and human face information detected before the q frame image, and p is less than or equal to q-1;

(2) If ERN_p1, according to LER_p、RER_p、FR_pAnd FR_qChecking the eye region of the information, and determining the eye region obtained by direct detection as the left eye region LER_qOr right eye region RER_qAfter passing the test, based on the eye area in the ER_qInferring the other eye region within range;

(3) if ERN_p0, according to LER_p、RER_p、FR_pAnd FR_qInformation directly in ER_qInferring left eye region LER within range_qAnd right eye region RER_q。

6. The adaptive processing algorithm for train dispatcher fatigue monitoring images as claimed in claim 5, wherein LER is a function of LER_p、RER_p、FR_pAnd FR_qAdaptive intelligently inferred binocular region LER_q' and RER_q' respectivelyBy the following formula:

s_q,p＝(FR_q.Width/FR_p.Width+FR_q.Height/FR_p.Height)/2 (2-18)。

7. the train dispatcher fatigue monitoring image adaptive processing algorithm as claimed in claim 1, wherein the identity of the target object is checked by an interval face recognition algorithm according to the change condition of the face position parameter and the size parameter, the Recognize method of EigenObjectRecognizer class in EMGUCV is called to determine the personnel identity corresponding to the face area after the condition is triggered, and the triggering condition for face recognition after the face image is detected in the (i + 1) th frame image comprises:

wherein, take 0<ω≤0.4，0<σ≤0.15，FR_iFor the rectangular region of the face detected in the i-th frame image, FR_i+1The face rectangle area detected in the i +1 th frame image.

8. The adaptive processing algorithm for the train dispatcher fatigue monitoring image according to claim 1, wherein when the target object leaves the video recording range, the face detection frame skipping processing is performed in the image processing process, the triggering condition of the face detection frame skipping processing is set to be that no face region is detected in the continuous K frames of images, and the value range of the parameter K is [5,25 ].