CN106845357B - A multichannel-network-based video face detection and recognition method - Google Patents
A multichannel-network-based video face detection and recognition method
- Publication number
- CN106845357B (application CN201611214990.4A)
- Authority
- CN
- China
- Prior art keywords
- face
- image
- video
- posture
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/165—Detection; Localisation; Normalisation using facial parts and geometric relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Geometry (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
A multichannel-network-based video face detection and recognition method comprises the following steps. S1: video preprocessing, attaching time information to each frame image. S2: target face detection and pose-coefficient calculation. S3: face pose correction, adjusting the pose of the m faces obtained in S2. S4: face feature extraction based on a deep neural network. S5: face feature comparison. For an input face, its feature vector is obtained via step S4, and cosine distance is then used to measure how well the input face's feature vector matches the vectors in the feature database: classes whose cosine distance to the face to be identified exceeds a set threshold are added to the candidate set, and if the cosine distances between the feature of the face to be identified and the central features of all classes are all below the threshold, the database is deemed not to contain that person's information and identification ends. The present invention provides a multichannel-network-based video face detection and recognition method with higher accuracy.
Description
Technical field
The present invention relates to the field of video face detection and recognition, and more particularly to a deep-learning-based video face recognition method.
Background technique
Video surveillance is an important component of security systems. With the development of video sensor technology and its supporting technologies, from the earliest analog surveillance systems through later digital-analog hybrid systems to today's IP surveillance systems, the range of applications of video surveillance keeps growing. In particular, public security agencies have deployed surveillance systems on a large scale in fields such as public order management and suspect tracking.
Rapidly developing video surveillance systems produce massive amounts of surveillance video data. In public order management and suspect tracking, a major task over these video data is to find, from one or several pictures of a person, the video files in which that person appears and the corresponding frame numbers. The traditional approach searches and sorts these video data manually; its main drawbacks are low search efficiency, high error rates, the need for repeated checks to rule out mistakes, and long turnaround times. As video surveillance systems keep developing and surveillance video data keep growing, traditional manual search becomes less and less practical.
Automatically processing and recognizing collected video information with video image processing and pattern recognition technology has reached very mature application in the field of traffic management. However, existing image-processing-based recognition methods achieve high accuracy only under ideal conditions; when lighting varies in complex ways, image quality is low, or the pose of the object to be identified changes, miss rates and false detection rates rise sharply, so automatic face recognition and analysis in video has not yet reached the level of practical application.
With the development of big data and deep learning, deep learning has been applied to face detection systems with good results. Several deep-learning-based video face recognition patents already exist. For example, patent CN201511033733.6 treats a video as a set of pictures and recognizes faces on several of the better-quality pictures. This method ignores the links between different frames and loses a large amount of information, so that when video quality degrades, the final recognition accuracy drops.
Patent CN201510471210.3 proposes a real-time face recognition method based on a deep neural network. The method extracts face feature vectors with a deep neural network, first computes Hamming distances between feature vectors, then selects the faces below a threshold as objects of a second-stage recognition, computes Euclidean distances between the second-stage feature vectors, and uses the Euclidean distance to decide which face in the face database the face to be identified belongs to, so as to improve efficiency when the face database is large.
Seen from existing methods, they mainly extract face information from certain frames of the video image and use deep learning to train, detect, and recognize; the spatio-temporal links between the frames of the video image have not yet been considered, which keeps accuracy low.
Summary of the invention
To overcome the low accuracy of existing video face detection and recognition methods, the present invention provides a multichannel-network-based video face detection and recognition method with higher accuracy.
The technical solution adopted by the present invention to solve this technical problem is as follows:
A multichannel-network-based video face detection and recognition method comprises the following steps:
S1: video preprocessing
Receive the video data collected by the monitoring device, decompose it into individual images, and attach time information to each frame image;
S2: target face detection and pose-coefficient calculation, as follows:
Extract the face locations and the corresponding facial-feature positions in the video image, compute the distances between the facial features of the face in the video image and those of the standard-pose face, compute the pose coefficient, and group images with similar poses: faces in adjacent frames that are close in position and have the smallest pose-coefficient difference are regarded as belonging to the same face family. Define a threshold φ and, for each face family, choose m faces with p < φ; if the number of face images with p < φ in the family is m_{p<φ} < m, duplicate the face image with the smallest pose coefficient p in the family (m − m_{p<φ}) times so that, together with the other images, m images are formed and input to S3;
S3: face pose correction: adjust the pose of the m faces obtained in S2;
S4: face feature extraction based on a deep neural network, as follows:
S4.1 Training of the face-feature extraction network
Before extracting face features from video images, a feature model is trained in advance on a face database. From each person's images in the face database at different angles and under different illumination, randomly select m images, perform pose correction on them, and combine them into a w' × h' × 3m face image, where w' is the width of the training picture, h' is its height, and 3m is the 3 RGB channels multiplied by the number of images m. Perform the above operation for every person in the face database, assign labels, and input the results into the neural network for training;
S4.2 Video face feature extraction
From S3, m corrected w × h × 3 color face images are obtained, each with 3 channels. Fuse the different images as different channels, i.e., merge them into one w × h × 3m face image with 3 × m channels;
Input this w × h × 3m face image into the feature-extraction network trained in S4.1, and finally obtain the feature vector representing the face;
S5 Face feature comparison
For an input face, obtain its feature vector via step S4, then use cosine distance to measure how well the input face's feature vector matches the vectors in the feature database, as follows:
S5.1 Preliminary screening
Compute the cosine distance between the feature of the face to be identified and the central feature of each class, as shown in formula (10), where the norm operation denotes the two-norm of a vector, i.e., its length, and cos θ is the cosine distance between the two vectors:
Classes whose cosine distance to the face to be identified exceeds the set threshold are added to the candidate set; if the cosine distances between the feature of the face to be identified and the central features of all classes are all below the threshold, the database is deemed not to contain that person's information and identification ends.
Further, step S5 also comprises the following step:
S5.2 Fine screening
For each face in each candidate class, compute the cosine distance between its feature vector and the feature vector of the face to be identified, choose the faces whose cosine distance exceeds the set threshold ρ as the recognition result, and output the video images in which the results appear; if the cosine distance between every face in every candidate class and the face to be identified is below ρ, the database is deemed not to contain that person's information.
Further, in step S1, the first frame image of the received video is image 1, and in chronological order the t-th frame of the video is image t, denoted I_t; the set of frame images of the same video is denoted I. After video preprocessing is complete, the decomposed images are passed in chronological order to the face target detection module.
Further, in step S2, target face detection and pose-coefficient calculation proceed as follows:
S2.1 Extract face locations and corresponding facial-feature positions in the video image
For each frame image I_t, use Haar features to find the faces present in the frame and the coordinates of the corresponding facial features, denoted F1(x1, y1), F2(x2, y2), F3(x3, y3), F4(x4, y4), F5(x5, y5);
S2.2 Compute the distances between facial features of the face in the video image and those of the standard-pose face
Let the facial-feature coordinates in the standard pose image I' be F1'(x'1, y'1), F2'(x'2, y'2), F3'(x'3, y'3), F4'(x'4, y'4), F5'(x'5, y'5). Use formulas (1) and (2) to compute the pairwise distances between facial features in the video image I_t and in the standard pose image I',
where (xi, yi), (xj, yj) are the coordinates of different facial features of the face to be found, (x'i, y'i), (x'j, y'j) are the coordinates of different facial features in the standard pose image, d_ij is the pairwise distance between facial features of the face to be identified, and d'_ij is the pairwise distance between facial features in the standard pose image;
S2.3 Compute the pose coefficient and group images with similar poses
Define the pose coefficient p of a face and compute it with formula (3):
where λ is a zoom factor used to avoid errors caused by the scale of the face image to be identified differing from that of the standard pose image; the value of λ can be computed by formula (4), i.e., λ takes the value that minimizes the pose coefficient;
In step S3, pose adjustment proceeds as follows:
S3.1 Compute the face rotation vector
From the coordinates of the five feature points in the known standard face model and in the video, use the POSIT algorithm to obtain the pose information of the face in the image, i.e., the rotation vector R of the face;
S3.2 Compute the mapping between the corrected image and the original image
From the rotation vector of the face, obtain the mapping from each pixel in the corrected face image to a pixel in the original face image. In the corrected image, take the central axis of the face as the y-axis and the line connecting the two eyes as the x-axis to construct a coordinate system. Let (x, y) = f(x', y') denote the mapping from a point (x', y') in the corrected image to a point (x, y) in the original image, as follows:
S3.3 Pose correction
Let rgb'(x, y) be the rgb value at (x, y) in the corrected image and rgb(x, y) the rgb value at (x, y) in the original face image; then the rgb value at a point (x, y) in the corrected face image is obtained by formula (7), where
G is a Gaussian probability matrix. In practice, because the actual three-dimensional model of a given face differs somewhat from the standard three-dimensional model, the mapping between a point in the corrected image and the corresponding point in the original image carries some error; therefore the rgb value of a point in the corrected image is obtained jointly from the rgb values of the 9 points near the corresponding position in the original image, i.e., the expected rgb value at the point is computed via the Gaussian probability matrix G and used as the point's rgb value. In formula (7), k is a preset ratio value;
After pose correction is performed on every face in the same face family, m face images of size w × h × 3 are obtained, i.e., color images of w × h pixels with 3 RGB channels.
In step S4.1, the neural network is trained with a gradient descent algorithm: after each batch of pictures is input and the loss is computed, the weights of the neural network are updated. The 512-dimensional vector output by fully connected layer 3 of the network represents the probability of which person the input face is; softmax regression is applied to it to obtain the corresponding loss function, as shown in formula (9), where k is the class to which the input picture belongs and z_k is the k-th value of the 512-dimensional vector output by fully connected layer 3:
Loss = Σ -log f(z_k)   (9)
After the loss is computed, forward computation and backward gradient computation are performed to obtain the update for each layer of the neural network, and the weights of each layer are updated;
Cluster analysis is performed in advance on the face feature set in the database to build a spatial index, with the following steps:
S4.1.1 Apply a clustering algorithm to the features in the face feature library to group the face features into several classes;
S4.1.2 For each class, compute the mean of the feature vectors of all faces in the class and record it as the central feature of the class.
Using the correlations between the frames of a surveillance video together with deep learning, the present invention can find, among massive video files, the video files in which a target face appears and the frame numbers where it appears. Compared with other methods, the present invention gathers image information of the same person across different frames and considers it jointly, making the fullest possible use of the video data and thereby improving recognition accuracy.
The beneficial effects of the present invention are mainly as follows:
1. Previous methods for detecting faces in video consider the images in different frames individually and fail to make effective use of the information in the video. This method treats different images as different channels, fusing the images of the same face in different frames into one multichannel image, and uses a multilayer convolutional network to implicitly extract the joint features of these images. On the one hand, using the information in the video improves accuracy as much as possible; on the other hand, inputting multiple images into the neural network at once avoids the time wasted on repeated inputs;
2. Because multiple face images must be fused into one image, the features at the same positions in different face images must remain roughly consistent, i.e., in the different images the eyes, nose, and so on must lie at roughly the same positions; otherwise the feature-extraction network is hard to train to convergence. Thus, before feature extraction, the face images must undergo pose correction. In the pose correction of this method, the symmetry of the face about its central axis is exploited: the rgb value of a point on the corrected face is obtained by combining the rgb value at the corresponding position in the original image with the rgb value at the point mirrored across the central axis. This largely avoids the information loss of common face pose correction methods and improves the final recognition accuracy;
3. The selection of particular frames in a video must consider frame quality, and the face pose in a frame is a main factor determining frame quality; but in practice, running pose estimation for every face in every frame is too costly. This method therefore defines a pose coefficient to judge the pose of a face: it reflects the face pose well while requiring little computation, so the several best-quality frames in a face family can be found quickly and accurately.
Detailed description of the invention
Fig. 1 is the overall flowchart of the invention.
Fig. 2 is the deep learning network structure used in the embodiment of the present invention.
Specific embodiment
The invention will be further described below in conjunction with the accompanying drawings.
Referring to Figures 1 and 2, a multichannel-network-based video face detection and recognition method comprises the following steps:
S1: video preprocessing
Receive the video data collected by the monitoring device, decompose it into individual images, and attach time information to each frame image. Specifically: the first frame image of the received video is image 1, and in chronological order the t-th frame of the video is image t. In the following, I_t denotes the t-th frame image and I denotes the set of frame images of the same video. After video preprocessing is complete, the decomposed images are passed in chronological order to the face target detection module.
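The frame numbering of S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation; the dictionary layout and the fps-based timestamp are assumptions (the patent only says time information is attached to each frame).

```python
def attach_time_info(frames, fps=25.0):
    """S1 sketch: number the decoded frames chronologically and attach a timestamp.

    `frames` is the list of decoded frame images of one video (the set I);
    the first frame becomes image 1, the t-th frame image t (denoted I_t).
    The fps-derived timestamp is an assumed form of the "time information".
    """
    return [
        {"t": t, "time_s": (t - 1) / fps, "image": img}
        for t, img in enumerate(frames, start=1)
    ]
```

The tagged frames can then be handed to the face target detection module in list order, which is already chronological.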
S2: target face detection and pose-coefficient calculation
S2.1 Extract face locations and corresponding facial-feature positions in the video image
For each frame image I_t, use Haar features to find the faces present in the frame and the coordinates of the corresponding facial features (the two eyes, the nose, and the two corners of the mouth), denoted F1(x1, y1), F2(x2, y2), F3(x3, y3), F4(x4, y4), F5(x5, y5);
S2.2 Compute the distances between facial features of the face in the video image and those of the standard-pose face
Let the coordinates of the facial features (the two eyes, the nose, and the two corners of the mouth) in the standard pose image I' be F1'(x'1, y'1), F2'(x'2, y'2), F3'(x'3, y'3), F4'(x'4, y'4), F5'(x'5, y'5). Use formulas (1) and (2) to compute the pairwise distances between facial features in the video image I_t and in the standard pose image I',
where (xi, yi), (xj, yj) are the coordinates of different facial features of the face to be found, (x'i, y'i), (x'j, y'j) are the coordinates of different facial features in the standard pose image, d_ij is the pairwise distance between facial features of the face to be identified, and d'_ij is the pairwise distance between facial features in the standard pose image.
S2.3 Compute the pose coefficient and group images with similar poses
Define the pose coefficient p of a face and compute it with formula (3). It measures the difference between the pose of the face to be identified and the standard pose: the smaller p is, the closer the pose is to the standard pose and the more accurate the image is as an input to face recognition; the larger p is, the further the pose is from the standard pose, i.e., the worse the image quality and the lower the recognition accuracy:
where λ is a zoom factor used to avoid errors caused by the scale of the face image to be identified differing from that of the standard pose image; the value of λ can be computed by formula (4), i.e., λ takes the value that minimizes the pose coefficient.
Compute the pose coefficients of all detected face images; faces in adjacent frames that are close in position and have the smallest pose-coefficient difference are regarded as belonging to the same face family. Define a threshold φ and, for each face family, choose m faces with p < φ; if the number of face images with p < φ in the family is m_{p<φ} < m, duplicate the face image with the smallest pose coefficient p in the family (m − m_{p<φ}) times so that, together with the other images, m images are formed and input to S3.
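Formulas (1)–(4) are not legible in this text, so the sketch below assumes one plausible reading: d_ij are pairwise Euclidean distances between the five feature points (formulas (1)–(2)), the pose coefficient is p = Σ_{i<j} (d_ij − λ·d'_ij)² (assumed formula (3)), and λ is the least-squares value minimizing p (assumed formula (4)). These forms are assumptions, not the patent's exact formulas:

```python
from itertools import combinations
import math

def pairwise_distances(points):
    # d_ij: Euclidean distances between all pairs of the five feature points
    # (assumed reading of formulas (1)-(2))
    return [math.dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)]

def pose_coefficient(face_pts, std_pts):
    """Assumed form of formula (3): p = sum_{i<j} (d_ij - lam*d'_ij)^2,
    with lam the least-squares minimizer (assumed reading of formula (4))."""
    d = pairwise_distances(face_pts)        # face to be identified
    d_std = pairwise_distances(std_pts)     # standard pose image I'
    # closed-form minimizer: lam = sum(d * d') / sum(d'^2)
    lam = sum(a * b for a, b in zip(d, d_std)) / sum(b * b for b in d_std)
    return sum((a - lam * b) ** 2 for a, b in zip(d, d_std))
```

Under this reading, a face with the standard pose at any scale gets p = 0, which matches the stated role of the zoom factor λ.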
S3: face pose correction
Perform pose adjustment on the m faces obtained in S2, as follows:
S3.1 Compute the face rotation vector
In surveillance video, people generally appear far from the camera, i.e., the distances of the facial feature points from the camera are much larger than the distances between the points themselves. Then, from the coordinates of the five feature points in the known standard face model and in the video, the POSIT algorithm can be used to obtain the pose information of the face in the image, i.e., the rotation vector R of the face.
S3.2 Compute the mapping between the corrected image and the original image
From the rotation vector of the face, the mapping from each pixel in the corrected face image to a pixel in the original face image can be obtained. In the corrected image, take the central axis of the face as the y-axis and the line connecting the two eyes as the x-axis to construct a coordinate system. Let (x, y) = f(x', y') denote the mapping from a point (x', y') in the corrected image to a point (x, y) in the original image, as follows:
S3.3 Pose correction
Let rgb'(x, y) be the rgb value at (x, y) in the corrected image and rgb(x, y) the rgb value at (x, y) in the original face image; then the rgb value at a point (x, y) in the corrected face image is obtained by formula (7), where
G is a Gaussian probability matrix. In practice, because the actual three-dimensional model of a given face differs somewhat from the standard three-dimensional model, the mapping between a point in the corrected image and the corresponding point in the original image carries some error; therefore the rgb value of a point in the corrected image is obtained jointly from the rgb values of the 9 points near the corresponding position in the original image, i.e., the expected rgb value at the point is computed via the Gaussian probability matrix G and used as the point's rgb value.
Here, the corrected image exploits the symmetry of the face: the corrected face image jointly considers the information from both sides of the face, extracting as much information as possible and avoiding the information loss that occurs after pose correction when the face rotation angle is too large. In formula (7), k is a preset ratio value between 0 and 0.5: the smaller k is, the more the image information of a single side is considered; the larger k is, the more the image information of both sides is considered jointly.
After pose correction is performed on every face in the same face family, m face images of size w × h × 3 are obtained, i.e., color images of w × h pixels with 3 RGB channels.
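Formulas (5)–(7) are likewise illegible here, so the sketch below only illustrates the blending idea the text describes, under assumed forms: the corrected pixel takes a Gaussian-weighted expectation over the 3×3 neighbourhood of the mapped source point, and mixes in the point mirrored across the face axis with weight k. The mapping f, the mirror rule, and the weights are placeholders, not the patent's formulas:

```python
import numpy as np

def corrected_rgb(orig, f, x, y, k=0.25, width=None):
    """Assumed reading of S3.3: blend Gaussian-averaged source and mirror pixels."""
    # 3x3 Gaussian probability matrix G covering the "9 points near the position"
    G = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
    h, w = orig.shape[:2]
    width = w if width is None else width

    def expected(px, py):
        # expected rgb over the 9 neighbours of (px, py), weighted by G
        acc = np.zeros(3)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                qx = min(max(px + dx, 0), w - 1)
                qy = min(max(py + dy, 0), h - 1)
                acc += G[dy + 1, dx + 1] * orig[qy, qx]
        return acc

    sx, sy = f(x, y)                  # mapped source point in the original image
    mx, my = f(width - 1 - x, y)      # mirror across the face axis (assumed rule)
    return (1 - k) * expected(sx, sy) + k * expected(mx, my)
```

With k = 0 only the visible side contributes; as k grows toward 0.5 both sides are weighted equally, matching the stated role of k.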
S4: face feature extraction based on a deep neural network
S4.1 Training of the face-feature extraction network
Before extracting face features from video images, a feature model is trained in advance on a face database.
The training proceeds as follows:
From each person's images in the face database at different angles and under different illumination, randomly select m images, perform pose correction on them, and combine them into a w' × h' × 3m face image, where w' is the width of the training picture, h' is its height, and 3m is the 3 RGB channels multiplied by the number of images m. Perform the above operation for every person in the face database, assign labels, and input the results into the neural network for training.
A neural network that takes different images fused as different channels as its input is what we call a multichannel network.
The present invention trains the neural network with a gradient descent algorithm, with batch size 256: after every 256 input pictures, the loss is computed and the weights of the neural network are updated.
The 512-dimensional vector output by fully connected layer 3 of the above network represents the probability of which person the input face is; softmax regression is applied to it to obtain the corresponding loss function, as shown in formula (9), where k is the class to which the input picture belongs and z_k is the k-th value of the 512-dimensional vector output by fully connected layer 3:
Loss = Σ -log f(z_k)   (9)
After the loss is computed, forward computation and backward gradient computation are performed to obtain the update for each layer of the neural network, and the weights of each layer are updated so as to reduce the loss and optimize the network.
In practical applications, considering that the number of faces stored in the database may be very large, cluster analysis can first be performed in advance on the face feature set in the database to build a spatial index, as follows:
S4.1.1 Apply a clustering algorithm such as k-means to the features in the face feature library to group the face features into several classes.
S4.1.2 For each class, compute the mean of the feature vectors of all faces in the class and record it as the central feature of the class.
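Steps S4.1.1–S4.1.2 can be sketched with a minimal k-means loop. The patent names k-means only as an example clustering algorithm; the number of classes, iteration count, and initialization below are illustrative assumptions:

```python
import numpy as np

def build_spatial_index(features, n_classes=2, n_iter=20, seed=0):
    """S4.1.1-S4.1.2 sketch: k-means over the face feature library, then
    the per-class mean feature vectors ("central features" used in S5.1)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(features, dtype=float)
    centres = X[rng.choice(len(X), n_classes, replace=False)].copy()
    for _ in range(n_iter):
        # assign each feature to its nearest centre
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # S4.1.2: central feature = mean of the feature vectors in the class
        for c in range(n_classes):
            if np.any(labels == c):
                centres[c] = X[labels == c].mean(axis=0)
    return labels, centres
```

The returned centres serve as the spatial index: preliminary screening only compares the query against n_classes centres instead of every stored face.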
S4.2 Video face feature extraction
From S3, m corrected w × h × 3 color face images are obtained, each with 3 channels. Fuse the different images as different channels, i.e., merge them into one w × h × 3m face image with 3 × m channels.
Input this w × h × 3m face image into the feature-extraction network trained in S4.1, and finally obtain the feature vector representing the face.
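The channel fusion of S4.2 is a simple stack along the channel axis. A minimal NumPy sketch (the H × W × C array layout is an assumption; the patent does not fix a layout):

```python
import numpy as np

def fuse_multichannel(face_images):
    """S4.2 sketch: fuse the m corrected w x h x 3 images of one face family
    into a single w x h x 3m input, each image contributing 3 channels."""
    first = face_images[0]
    for img in face_images:
        # all images of the family must share shape and have 3 RGB channels
        assert img.shape == first.shape and img.shape[2] == 3
    return np.concatenate(face_images, axis=2)
```

The fused array is then a single network input, which is what lets the convolutional layers learn joint features across frames instead of per-frame features.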
S5 Face feature comparison
For an input face, obtain its feature vector via step S4, then use cosine distance to measure how well the input face's feature vector matches the vectors in the feature database, as follows:
S5.1 Preliminary screening
Compute the cosine distance between the feature of the face to be identified and the central feature of each class, as shown in formula (10), where the norm operation denotes the two-norm (i.e., the length) of a vector, and cos θ is the cosine distance between the two vectors:
Classes whose cosine distance to the face to be identified exceeds the set threshold are added to the candidate set. If the cosine distances between the feature of the face to be identified and the central features of all classes are all below the threshold, the database is deemed not to contain that person's information and identification ends.
S5.2 Fine screening
For each face in each candidate class, compute the cosine distance between its feature vector and the feature vector of the face to be identified, choose the faces whose cosine distance exceeds the set threshold ρ as the recognition result, and output the video images in which the results appear; if the cosine distance between every face in every candidate class and the face to be identified is below ρ, the database is deemed not to contain that person's information and identification ends.
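The two-stage comparison of S5 can be sketched as follows. Note that the patent measures similarity with cos θ (larger means a better match) against thresholds whose symbols are illegible in this text; the threshold values below are placeholders, not the patent's settings:

```python
import numpy as np

def cosine(u, v):
    # formula (10), assumed standard form: cos(theta) = <u,v> / (||u|| * ||v||)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def identify(query, class_centres, class_members, coarse_th=0.5, fine_th=0.8):
    """S5.1: keep classes whose central feature is similar enough to the query;
    S5.2: within those classes, return the (class, member) pairs above the
    fine threshold. Returns None when the database holds no record of the person."""
    candidates = [c for c, centre in enumerate(class_centres)
                  if cosine(query, centre) > coarse_th]
    if not candidates:
        return None          # no class passes preliminary screening
    hits = [(c, i) for c in candidates
            for i, feat in enumerate(class_members[c])
            if cosine(query, feat) > fine_th]
    return hits or None      # no member passes fine screening
```

The coarse pass touches only one centre per class, so the per-query cost scales with the number of classes plus the members of the few surviving candidate classes.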
At this point, video face identification is complete.
Claims (6)
1. A video face detection and recognition method based on a multichannel network, characterized in that the method comprises the following steps:
S1: video preprocessing
Receive the video data collected by the monitoring device, decompose it into individual images, and attach temporal information to each frame image;
S2: target face detection and posture coefficient calculation, the process being as follows:
Extract the face locations and the corresponding facial-feature positions in the video images; calculate the distances between the facial features of each face in the video image and those of the standard-posture face; compute the posture coefficient and group images with similar postures, treating faces whose positions in adjacent frames are close and whose posture coefficients differ least as faces of the same face family; define a threshold φ and, for each face family, choose m faces with p < φ; if the number of face images with p < φ in the face family is m_{p<φ} < m, duplicate the face image with the smallest posture coefficient in that family m − m_{p<φ} times, so that m images are formed together with the other images, and input them into S3;
S3: human face posture correction: perform pose adjustment on the m faces obtained in S2;
S4: face feature extraction based on a deep neural network, the process being as follows:
S4.1: face feature extraction network training
Before face features are extracted from video images, a feature model is trained in advance with a face database. For each person in the face database there are M images under different angles and different illumination; randomly select m of them and, after performing posture correction on these m images, combine them into a w' × h' × 3m face image, where w' is the width of a training picture, h' is the height of a training picture, and 3m is the 3 RGB channels multiplied by the number of images m. Perform the above operations for every person in the face database, attach numbered labels, and input the results into the neural network for training;
S4.2: video face feature extraction
From S3, m corrected w × h × 3 face color images have been obtained, each with 3 channels. Fuse the different images together as different channels, i.e. merge them into one face image of size w × h × 3m with 3 × m channels;
Input this w × h × 3m face image into the feature extraction network trained in S4.1 to obtain the face feature, finally yielding a feature vector representing the face;
S5: face feature comparison
For the input face, after its feature vector is obtained in step S4, the cosine distance is used to measure the matching degree between the feature vector of the input face and the feature vectors in the feature database. The calculation process is as follows:
S5.1: preliminary screening
Calculate the cosine distance between the feature of the face to be identified and the central feature of each class, as shown in formula (10), where ||a|| denotes the two-norm of a vector a, i.e. its length, and cos θ is the cosine distance between the vectors a and b:

cos θ = (a · b) / (||a|| ||b||)    (10)

Each class whose central feature has a cosine distance to the face to be identified greater than a preset threshold is added to the set of candidate classes; if the cosine distances between the feature of the face to be identified and the central features of all classes are all below that threshold, the person's information is considered not to be stored in the database, and recognition terminates.
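The channel fusion of step S4.2 in claim 1 can be sketched as follows: m pose-corrected RGB images are merged into one multichannel image by stacking them along the channel axis. The function name `fuse_channels` is illustrative; the claim does not name the operation.

```python
import numpy as np

def fuse_channels(faces):
    """Fuse m pose-corrected w x h x 3 face images into one
    w x h x 3m image, i.e. one image with 3*m channels (claim 1, S4.2)."""
    assert all(f.shape == faces[0].shape and f.shape[2] == 3 for f in faces)
    return np.concatenate(faces, axis=2)
```

The fused w × h × 3m array is what the trained multichannel network of S4.1 receives as input.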
2. The video face detection and recognition method based on a multichannel network according to claim 1, characterized in that the step S5 further comprises the following step:
S5.2: exact screening
For each face in each candidate class, calculate the cosine distance between its feature vector and the feature vector of the face to be identified; choose every face whose cosine distance exceeds a given threshold ρ as a recognition result, and output the video image containing it; if the cosine distances of all faces in all candidate classes to the face to be identified are below ρ, the person's information is considered not to be stored in the database.
3. The video face detection and recognition method based on a multichannel network according to claim 1 or 2, characterized in that: in the step S1, the first frame image in the received video is image 1, and the t-th frame image of the video, taken in chronological order, is image t, denoted I_t; the set of frame images of the same video is denoted I. After the preprocessing of the video is completed, the decomposed images are passed to the face target detection module in chronological order.
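The frame numbering of claim 3 can be sketched as below. This is an assumption-laden sketch: the claim only says each frame is numbered 1..t in chronological order and carries temporal information, so the timestamp formula `(t - 1) / fps` and the dictionary layout are illustrative.

```python
def decompose(frames, fps):
    """Number the frames of one video 1..t in chronological order and
    attach temporal information (claim 3 / step S1)."""
    return [{"t": t, "time": (t - 1) / fps, "image": img}
            for t, img in enumerate(frames, start=1)]
```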
4. The video face detection and recognition method based on a multichannel network according to claim 1 or 2, characterized in that: in the step S2, the process of target face detection and posture coefficient calculation is as follows:
S2.1: extract the face locations and the corresponding facial-feature positions in the video images
For each frame image I_t, find the coordinates of the faces present in the frame and of their corresponding facial features using Haar features, denoted F_1(x_1, y_1), F_2(x_2, y_2), F_3(x_3, y_3), F_4(x_4, y_4), F_5(x_5, y_5) respectively;
S2.2: calculate the distances between the facial features of a face in the video image and those of the standard-posture face
Let the coordinates of the facial features in the standard pose image I' be F'_1(x'_1, y'_1), F'_2(x'_2, y'_2), F'_3(x'_3, y'_3), F'_4(x'_4, y'_4), F'_5(x'_5, y'_5), and calculate with formula (1) and formula (2) the mutual distances between the facial features in the video image I_t and in the standard pose image I':
wherein (x_i, y_i), (x_j, y_j) denote the coordinates of different facial features of the face to be found, (x'_i, y'_i), (x'_j, y'_j) denote the coordinates of different facial features in the standard pose image, d_ij denotes the mutual distance between facial features of the face to be identified, and d'_ij denotes the mutual distance between facial features in the standard pose image;
S2.3: calculate the posture coefficient and group images with similar postures
Define the posture coefficient p of a face and calculate it with formula (3):
wherein λ is a zoom factor used to avoid errors caused by an inconsistent scale between the face image to be identified and the standard pose image; the value of λ can be calculated with formula (4), i.e. λ takes the value that minimizes the posture coefficient;
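Formulas (1)–(4) are not reproduced in this text, so the sketch below fills them with common-sense assumptions: the mutual distances d_ij are taken as Euclidean distances between the five feature points, the posture coefficient p as the sum of squared differences between the face's distances and the λ-scaled standard distances, and λ as the closed-form least-squares minimizer. All of that is hypothetical, not the patent's exact formulas.

```python
import math
from itertools import combinations

def pairwise_distances(pts):
    # mutual Euclidean distances d_ij between the five facial-feature points
    return [math.dist(p, q) for p, q in combinations(pts, 2)]

def posture_coefficient(face_pts, std_pts):
    """Posture coefficient p of claim 4 (step S2.3), under the assumptions
    stated above: p = sum((d_ij - lam * d'_ij)^2), with the zoom factor
    lam chosen to minimize p (least-squares solution, standing in for
    formula (4))."""
    d = pairwise_distances(face_pts)
    d_std = pairwise_distances(std_pts)
    lam = sum(a * b for a, b in zip(d, d_std)) / sum(b * b for b in d_std)
    return sum((a - lam * b) ** 2 for a, b in zip(d, d_std))
```

A face that is an exact scaled copy of the standard pose gets p ≈ 0, so the threshold φ of claim 1 selects near-frontal faces.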
5. The video face detection and recognition method based on a multichannel network according to claim 4, characterized in that: in the step S3, the steps of pose adjustment are as follows:
S3.1: calculate the face rotation vector
From the coordinates of the five feature points in the known standard face model and in the video, obtain the posture information of the face in the image, i.e. the rotation vector R of the face, using the POSIT algorithm;
S3.2: calculate the mapping relation between the corrected image and the original image
From the rotation vector of the face, obtain the mapping relation from each pixel in the corrected face image to a pixel in the original face image. In the corrected image, construct a coordinate system with the facial axis as the y-axis and the line through the two eyes as the x-axis, and let (x, y) = f(x', y') be the mapping between a point (x, y) on the corrected image and a point (x', y') on the original image, specifically as follows:
S3.3: posture correction
Let rgb'(x, y) be the rgb value at (x, y) on the corrected image and rgb(x, y) the rgb value at (x, y) in the original face image; then the rgb value at a point (x, y) in the corrected face image is obtained with formula (7):
wherein G(i, j) is a Gaussian probability matrix. In practice, because the actual three-dimensional model of a given face differs somewhat from the standard three-dimensional model, the mapping between a point on the corrected image and the corresponding point on the original image carries a certain error; therefore the rgb value of a point on the corrected image is obtained jointly from the rgb values of the 9 points near the corresponding position on the original image, i.e. the expected value of the rgb value at the point is computed through the Gaussian probability matrix G and taken as the rgb value of the point; kl in formula (7) is a preset ratio value;
After performing posture correction on every face in the same face family, m face images of size w × h × 3 are obtained, i.e. color images of w × h pixels with 3 RGB channels.
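The Gaussian-weighted resampling of step S3.3 can be sketched as below. This is a minimal sketch, not the patent's formula (7): the mapping `mapping(x, y) -> (x', y')` stands in for the rotation-vector-derived map f of S3.2, the 3×3 kernel weights are a common choice rather than the patent's G, and the preset ratio kl is omitted.

```python
import numpy as np

def gaussian_kernel3():
    # 3x3 Gaussian probability matrix G, normalised so the weights sum to 1
    g = np.array([[1.0, 2.0, 1.0], [2.0, 4.0, 2.0], [1.0, 2.0, 1.0]])
    return g / g.sum()

def correct_pose(src, mapping):
    """Pose correction (claim 5, step S3.3): the rgb value at each corrected
    pixel is the G-weighted expectation of the 9 source pixels around the
    mapped position, absorbing small errors in the map f."""
    h, w = src.shape[:2]
    out = np.zeros_like(src)
    G = gaussian_kernel3()
    for y in range(h):
        for x in range(w):
            xs, ys = mapping(x, y)
            acc = np.zeros(src.shape[2], dtype=float)
            for j in range(-1, 2):
                for i in range(-1, 2):
                    yy = min(max(ys + j, 0), h - 1)  # clamp at image borders
                    xx = min(max(xs + i, 0), w - 1)
                    acc += G[j + 1, i + 1] * src[yy, xx]
            out[y, x] = acc
    return out
```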
6. The video face detection and recognition method based on a multichannel network according to claim 5, characterized in that: in the step S4.1, the neural network is trained with the gradient descent algorithm; after each input batch of pictures, the loss is calculated and the weights of the neural network are updated. The 512-dimensional vector output by fully connected layer 3 of the network indicates the probability that the input face is each person; softmax regression is applied to it to obtain the corresponding loss function, as shown in formula (9), wherein z_k denotes the k-th value in the 512-dimensional vector output by fully connected layer 3:

Loss = Σ −log f(z_k)    (9)

After the loss function is calculated, the forward pass and the backward gradient computation yield the update value for each layer in the neural network, and the weights of each layer are updated;
Cluster analysis is performed in advance on the face feature set in the database to establish a spatial index; the operating steps are as follows:
S4.1.1: apply a clustering algorithm to the features in the face feature library to group the face features into several classes;
S4.1.2: for each class, calculate the mean of the feature vectors of all faces in the class and record it as the central feature of that class.
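Steps S4.1.1–S4.1.2 can be sketched as below. The claim does not fix the clustering algorithm, so plain k-means is assumed here; the function name, the parameters `n_classes`, `iters`, and `seed` are all illustrative.

```python
import numpy as np

def build_class_centers(features, n_classes=3, iters=20, seed=0):
    """Spatial index for the face feature library (S4.1.1-S4.1.2):
    cluster the features with k-means and record each class's mean
    feature vector as its central feature."""
    rng = np.random.default_rng(seed)
    X = np.asarray(features, dtype=float)
    centers = X[rng.choice(len(X), n_classes, replace=False)]
    for _ in range(iters):
        # S4.1.1: assign every feature to its nearest class center
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for k in range(n_classes):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)  # S4.1.2: class mean
    return centers, labels
```

The resulting central features are what the preliminary screening of step S5.1 compares against, so only the candidate classes need an exhaustive member-by-member comparison.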
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611214990.4A CN106845357B (en) | 2016-12-26 | 2016-12-26 | A kind of video human face detection and recognition methods based on multichannel network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106845357A CN106845357A (en) | 2017-06-13 |
CN106845357B true CN106845357B (en) | 2019-11-05 |
Family
ID=59137016
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611214990.4A Active CN106845357B (en) | 2016-12-26 | 2016-12-26 | A kind of video human face detection and recognition methods based on multichannel network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106845357B (en) |
Families Citing this family (40)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107392129A (en) * | 2017-07-13 | 2017-11-24 | 浙江捷尚视觉科技股份有限公司 | Face retrieval method and system based on Softmax |
CN107481318A (en) * | 2017-08-09 | 2017-12-15 | 广东欧珀移动通信有限公司 | Replacement method, device and the terminal device of user's head portrait |
CN107798308B (en) * | 2017-11-09 | 2020-09-22 | 一石数字技术成都有限公司 | Face recognition method based on short video training method |
CN107818316A (en) * | 2017-11-24 | 2018-03-20 | 合肥博焱智能科技有限公司 | A kind of batch face identification system |
CN107944416A (en) * | 2017-12-06 | 2018-04-20 | 成都睿码科技有限责任公司 | A kind of method that true man's verification is carried out by video |
CN108108676A (en) * | 2017-12-12 | 2018-06-01 | 北京小米移动软件有限公司 | Face identification method, convolutional neural networks generation method and device |
CN108197574B (en) * | 2018-01-04 | 2020-09-08 | 张永刚 | Character style recognition method, terminal and computer readable storage medium |
CN108171191B (en) * | 2018-01-05 | 2019-06-28 | 百度在线网络技术(北京)有限公司 | Method and apparatus for detecting face |
CN108256459B (en) * | 2018-01-10 | 2021-08-24 | 北京博睿视科技有限责任公司 | Security check door face recognition and face automatic library building algorithm based on multi-camera fusion |
CN108460329B (en) * | 2018-01-15 | 2022-02-11 | 任俊芬 | Face gesture cooperation verification method based on deep learning detection |
CN108427871A (en) * | 2018-01-30 | 2018-08-21 | 深圳奥比中光科技有限公司 | 3D faces rapid identity authentication method and device |
CN110555338A (en) * | 2018-05-30 | 2019-12-10 | 北京三星通信技术研究有限公司 | object identification method and device and neural network generation method and device |
CN110555342B (en) * | 2018-05-31 | 2022-08-26 | 杭州海康威视数字技术股份有限公司 | Image identification method and device and image equipment |
CN108875602A (en) * | 2018-05-31 | 2018-11-23 | 珠海亿智电子科技有限公司 | Monitor the face identification method based on deep learning under environment |
CN109145940B (en) * | 2018-07-02 | 2021-11-30 | 北京陌上花科技有限公司 | Image recognition method and device |
CN110751163B (en) * | 2018-07-24 | 2023-05-26 | 杭州海康威视数字技术股份有限公司 | Target positioning method and device, computer readable storage medium and electronic equipment |
CN110874602A (en) * | 2018-08-30 | 2020-03-10 | 北京嘀嘀无限科技发展有限公司 | Image identification method and device |
CN109190561B (en) * | 2018-09-04 | 2022-03-22 | 四川长虹电器股份有限公司 | Face recognition method and system in video playing |
CN109389588A (en) * | 2018-09-28 | 2019-02-26 | 大连民族大学 | The method for measuring difference between video successive frame and its convolution characteristic pattern |
CN109344764A (en) * | 2018-09-28 | 2019-02-15 | 大连民族大学 | Measure the system and device of difference between video successive frame and its convolution characteristic pattern |
CN109684913A (en) * | 2018-11-09 | 2019-04-26 | 长沙小钴科技有限公司 | A kind of video human face mask method and system based on community discovery cluster |
CN109800685A (en) * | 2018-12-29 | 2019-05-24 | 上海依图网络科技有限公司 | The determination method and device of object in a kind of video |
CN109949276B (en) * | 2019-02-28 | 2021-06-11 | 华中科技大学 | Lymph node detection method for improving SegNet segmentation network |
CN110276320A (en) * | 2019-06-26 | 2019-09-24 | 杭州创匠信息科技有限公司 | Guard method, device, equipment and storage medium based on recognition of face |
CN110738103A (en) * | 2019-09-04 | 2020-01-31 | 北京奇艺世纪科技有限公司 | Living body detection method, living body detection device, computer equipment and storage medium |
CN110648324A (en) * | 2019-09-29 | 2020-01-03 | 百度在线网络技术(北京)有限公司 | Vehicle tire burst early warning method and device and computer equipment |
CN112668362B (en) * | 2019-10-15 | 2023-06-16 | 浙江中正智能科技有限公司 | Human evidence comparison model training method for dynamic optimization class proxy |
CN111178129B (en) * | 2019-11-25 | 2023-07-14 | 浙江工商大学 | Multi-mode personnel identification method based on human face and gesture |
CN111325712B (en) * | 2020-01-20 | 2024-01-23 | 北京百度网讯科技有限公司 | Method and device for detecting image validity |
CN111339973A (en) * | 2020-03-03 | 2020-06-26 | 北京华捷艾米科技有限公司 | Object identification method, device, equipment and storage medium |
CN111444916A (en) * | 2020-03-26 | 2020-07-24 | 中科海微(北京)科技有限公司 | License plate positioning and identifying method and system under unconstrained condition |
CN111737525B (en) * | 2020-06-03 | 2022-10-25 | 西安交通大学 | Multi-video program matching method |
CN111681271B (en) * | 2020-08-11 | 2020-10-30 | 湖南大学 | Multichannel multispectral camera registration method, system and medium |
CN111814760B (en) * | 2020-08-24 | 2021-06-01 | 湖南视觉伟业智能科技有限公司 | Face recognition method and system |
CN112418322B (en) * | 2020-11-24 | 2024-08-06 | 苏州爱医斯坦智能科技有限公司 | Image data processing method and device, electronic equipment and storage medium |
CN112525352A (en) * | 2020-11-24 | 2021-03-19 | 深圳市高巨创新科技开发有限公司 | Infrared temperature measurement compensation method based on face recognition and terminal |
CN113269022B (en) * | 2021-03-24 | 2022-02-08 | 七台河市公安局 | System and method for predicting recessive specific personnel |
CN113240092B (en) * | 2021-05-31 | 2024-09-17 | 深圳市商汤科技有限公司 | Neural network training and face recognition method, device, equipment and storage medium |
CN113837040A (en) * | 2021-09-14 | 2021-12-24 | 天津市国瑞数码安全系统股份有限公司 | Video face detection method and system based on deep neural network |
CN116311464B (en) * | 2023-03-24 | 2023-12-12 | 北京的卢铭视科技有限公司 | Model training method, face recognition method, electronic device and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102004899A (en) * | 2010-11-03 | 2011-04-06 | 无锡中星微电子有限公司 | Human face identifying system and method |
CN103049459A (en) * | 2011-10-17 | 2013-04-17 | 天津市亚安科技股份有限公司 | Feature recognition based quick video retrieval method |
CN105654055A (en) * | 2015-12-29 | 2016-06-08 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Method for performing face recognition training by using video data |
CN106127170A (en) * | 2016-07-01 | 2016-11-16 | 重庆中科云丛科技有限公司 | A kind of merge the training method of key feature points, recognition methods and system |
2016-12-26 CN CN201611214990.4A patent/CN106845357B/en active Active
Non-Patent Citations (1)
Title |
---|
Fusion of color, local spatial and global frequency information; Zhiming Liu et al.; Pattern Recognition; 2010-03-16; full text *
Also Published As
Publication number | Publication date |
---|---|
CN106845357A (en) | 2017-06-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106845357B (en) | A kind of video human face detection and recognition methods based on multichannel network | |
CN109919977B (en) | Video motion person tracking and identity recognition method based on time characteristics | |
CN107832672B (en) | Pedestrian re-identification method for designing multi-loss function by utilizing attitude information | |
CN108052896B (en) | Human body behavior identification method based on convolutional neural network and support vector machine | |
CN108520226B (en) | Pedestrian re-identification method based on body decomposition and significance detection | |
CN104517102B (en) | Student classroom notice detection method and system | |
CN108256439A (en) | A kind of pedestrian image generation method and system based on cycle production confrontation network | |
US20200234036A1 (en) | Multi-Camera Multi-Face Video Splicing Acquisition Device and Method Thereof | |
CN104881637B (en) | Multimodal information system and its fusion method based on heat transfer agent and target tracking | |
CN110135249B (en) | Human behavior identification method based on time attention mechanism and LSTM (least Square TM) | |
CN103177269B (en) | For estimating the apparatus and method of object gesture | |
CN107369183A (en) | Towards the MAR Tracing Registration method and system based on figure optimization SLAM | |
CN110008913A (en) | Pedestrian re-identification method based on fusion of attitude estimation and viewpoint mechanism | |
CN103996046B (en) | The personal identification method merged based on many visual signatures | |
CN106778604A (en) | Pedestrian's recognition methods again based on matching convolutional neural networks | |
CN109800624A (en) | A kind of multi-object tracking method identified again based on pedestrian | |
CN109190561B (en) | Face recognition method and system in video playing | |
CN112966736B (en) | Vehicle re-identification method based on multi-view matching and local feature fusion | |
CN102043953A (en) | Real-time-robust pedestrian detection method aiming at specific scene | |
EP4174716A1 (en) | Pedestrian tracking method and device, and computer readable storage medium | |
CN107766791A (en) | A kind of pedestrian based on global characteristics and coarseness local feature recognition methods and device again | |
CN109341703A (en) | A kind of complete period uses the vision SLAM algorithm of CNNs feature detection | |
CN109886356A (en) | A kind of target tracking method based on three branch's neural networks | |
CN103729620B (en) | A kind of multi-view pedestrian detection method based on multi-view Bayesian network | |
CN113221625A (en) | Method for re-identifying pedestrians by utilizing local features of deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | ||
Address after: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province
Patentee after: Yinjiang Technology Co.,Ltd.
Address before: 310012 1st floor, building 1, 223 Yile Road, Hangzhou City, Zhejiang Province
Patentee before: ENJOYOR Co.,Ltd.