CN108564053A - Multi-cam dynamic human face recognition system based on FaceNet and method - Google Patents

Multi-cam dynamic human face recognition system based on FaceNet and method

Info

Publication number
CN108564053A
CN108564053A
Authority
CN
China
Prior art keywords
face
facenet
frame
video
frames
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810370308.3A
Other languages
Chinese (zh)
Inventor
桂冠
江斌
任强
戴菲
熊健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN201810370308.3A priority Critical patent/CN108564053A/en
Publication of CN108564053A publication Critical patent/CN108564053A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • G06V40/166Detection; Localisation; Normalisation using acquisition arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-camera dynamic face recognition system and method based on FaceNet. The applicable scene comprises multiple camera monitoring systems mounted in a managed region; the cameras of the monitoring systems capture video of pedestrians in real time from different angles, and all monitoring systems are connected over a network to the same server and share its intranet. The method comprises: step 1, capturing video of pedestrians from different angles and cropping face frames from the video stream, frame by frame, using a HAAR model; step 2, extracting facial features from all face frames using the FaceNet framework; and step 3, classifying the extracted features to achieve face recognition. By combining a multi-camera and server hardware platform, the invention applies deep learning and feature recognition to monitoring systems, which is of great significance for improving security management.

Description

Multi-cam dynamic human face recognition system based on FaceNet and method
Technical field
The present invention relates to the technical field of face recognition, and in particular to a FaceNet-based multi-camera dynamic face recognition system; it further relates to a recognition method for this system.
Background technology
With the rapid development of artificial intelligence technology and the increasing popularity of video monitoring equipment, intelligent monitoring has attracted broad attention from all sectors of society for its accuracy, timeliness and rich functionality. At present, many domestic venues are equipped with monitoring; video surveillance has become another major video application after digital television and video conferencing, and has grown into the largest video application system by volume. Security management is an important application in the field of video surveillance. However, video monitoring still has many shortcomings: its functions are limited, its recordings are voluminous, and the feature learning rate of intelligent monitoring for faces under different angles and illumination conditions is not high. How to improve the feature extraction rate of intelligent video monitoring, and how to address the underfitting of models trained in complex environments, remain significant challenges. With the continuous improvement of the cost-effectiveness of security systems and the development of technologies such as digital high definition and intelligence, the market application space will keep growing.
Currently, the key processing algorithms for video monitoring include automatic exposure, automatic white balance, automatic focusing and wide dynamic range. Excellent processing algorithms can achieve better color reproduction, make the acquired image more lifelike, and give the video of the monitored scene better performance under low illumination and large changes in lighting. Hard-disk storage, however, often suffers from fragmentation caused by repeated recording, data damage caused by sudden power loss, and the heat and vibration produced when multiple disks operate together; because of the particularity of the security industry, video and audio must be stored with uninterrupted, stable operation 24 hours a day, and the application scenarios are complex. Intelligent video analysis belongs to the third development stage of monitoring technology, the "machine eye + machine brain" stage. The "machine brain" uses the machine to make the judgments a human brain would make when watching the monitored pictures: it analyzes data, refines features into algorithms implanted in the machine, automatically detects and analyzes the video pictures, and raises alarms or takes other actions. Through the powerful data-processing capability of the computer, it filters out useless pictures or interference and automatically analyzes and extracts the key useful information from the video source, so that the camera becomes not only the eyes of a person but the computer also becomes the brain.
Feature extraction in existing intelligent video monitoring still cannot meet genuinely demanding requirements, for example:
1) The feature extraction rate is not high: most existing schemes can only learn a limited number of major facial organ features, and the extraction of finer-detail feature points falls far short of the requirements of real face recognition accuracy.
2) Underfitting of models trained in complex environments (the function fitted to the picture matrix has a large error with respect to the training set): in existing monitoring schemes, insufficient feature-point extraction causes the trained model to underfit under complex environmental conditions.
3) Multi-camera model sharing for dynamic monitoring: in current real-time intelligent multi-camera monitoring, identification features cannot be shared.
Invention content
The object of the invention is to overcome the deficiencies of the prior art by providing a FaceNet-based multi-camera dynamic face recognition system and method that combine a HAAR model and the FaceNet framework to perform modeling and matching on multi-dimensional facial features, which can improve the recognition rate and accelerate the running speed.
To solve the above technical problems, the present invention provides a FaceNet-based multi-camera dynamic face recognition system. The applicable scene comprises multiple camera monitoring systems mounted in a managed region, characterized in that the cameras of the monitoring systems capture video of pedestrians in real time from different angles, and all monitoring systems are connected over a network to the same server and share its intranet. The face recognition system comprises a video acquisition module, a feature extraction module and a feature classification module,
wherein in the video acquisition module, all monitoring systems capture pedestrian video streams in real time from different angles with their respective cameras; face frames are cropped from the video stream, frame by frame, using a HAAR model, and all face frames are sent to the server;
the feature extraction module, on the server, extracts facial features from all face frames using the FaceNet framework;
the feature classification module classifies the extracted facial features to achieve face recognition.
Further, the trained FaceNet model on the server is shared with all monitoring systems connected to the server.
Correspondingly, the present invention also provides a FaceNet-based multi-camera dynamic face recognition method, comprising the following steps:
Step S1: capture video of pedestrians from different angles, and crop face frames from the video stream, frame by frame, using a HAAR model;
Step S2: extract facial features from all face frames using the FaceNet framework;
Step S3: classify the extracted features to achieve face recognition.
Further, each face frame is represented by a 128-dimensional multi-dimensional matrix.
Further, in step S2, during triplet training, the argmax and argmin are computed and filtered on a subset every n (n ∈ N*) steps, i.e. the iteration step length is enlarged, reducing the number of iterations for the model to converge.
Further, in step S3, classification is performed using the KNN method.
Compared with the prior art, the advantageous effect achieved by the invention is: the system combines a HAAR model and the FaceNet framework to perform modeling and matching on multi-dimensional facial features, which can improve the recognition rate and accelerate the running speed.
Description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a schematic diagram of face detection on consecutive captured frames in an embodiment;
Fig. 3 is a schematic diagram of face detection with the HAAR model;
Fig. 4 is a schematic diagram of the FaceNet framework.
Specific implementation mode
The invention is further described below with reference to the accompanying drawings. The following embodiments are only used to clearly illustrate the technical solution of the present invention and are not intended to limit its protection scope.
The present invention is a FaceNet-based multi-camera dynamic face recognition system. The applicable scene comprises multiple camera monitoring systems mounted in a managed region; the cameras of the monitoring systems capture video of pedestrians in real time from different angles, and all monitoring systems are connected over a network to the same server and share its intranet. The face recognition system comprises a video acquisition module, a feature extraction module and a feature classification module,
wherein in the video acquisition module, all monitoring systems capture pedestrian video streams in real time from different angles with their respective cameras; face frames are cropped from the video stream, frame by frame, using a HAAR model, and all face frames are sent to the server;
the feature extraction module, on the server, extracts facial features from all face frames using the FaceNet framework;
the feature classification module classifies the extracted facial features to achieve face recognition.
In terms of feature recognition, the invention combines a HAAR model and the FaceNet framework to perform modeling and matching on the features of the multi-dimensional matrix representing the face picture, which can improve the recognition rate and accelerate the running speed. The invention can be applied to many video monitoring settings, including banks, shopping malls and enterprises.
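The three-module pipeline described above (acquisition, FaceNet embedding on the server, classification against a shared gallery) can be sketched as follows. This is a minimal illustration only: the detector and embedder are injected as functions, standing in for the HAAR detector and the FaceNet network detailed below, and all names are illustrative rather than taken from the patent.

```python
import numpy as np

class FaceRecognitionPipeline:
    """Sketch of the three-module pipeline: detect faces in a frame,
    embed each face, and classify the embedding by majority vote over
    its K nearest neighbors in a shared server-side gallery."""

    def __init__(self, detect_faces, embed, gallery_embeddings, gallery_labels, k=3):
        self.detect_faces = detect_faces              # frame -> list of face crops
        self.embed = embed                            # face crop -> feature vector
        self.gallery = np.asarray(gallery_embeddings) # known identities on the server
        self.labels = list(gallery_labels)
        self.k = k

    def identify(self, frame):
        """Detect every face in one video frame and return a label per face."""
        results = []
        for face in self.detect_faces(frame):
            emb = self.embed(face)
            # KNN over Euclidean distance to the shared gallery (step S3)
            d = np.linalg.norm(self.gallery - emb, axis=1)
            nearest = np.argsort(d)[: self.k]
            votes = [self.labels[i] for i in nearest]
            results.append(max(set(votes), key=votes.count))
        return results
```

Because the gallery and model live on the server, every connected camera system can call the same `identify` logic, which is the model-sharing point the patent emphasizes.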
Correspondingly, a FaceNet-based multi-camera dynamic face recognition method of the present invention, as shown in Fig. 1, comprises the following steps:
Step S1: capture video of pedestrians from different angles, and crop face frames from the video stream, frame by frame, using a HAAR model.
All camera monitoring systems are started, and video of pedestrians is captured in real time from different angles by the cameras; opening the cameras and recording may be realized here using prior-art OpenCV techniques.
In the embodiment of the present invention, face frames are extracted from the consecutive image frames of the video stream, and faces are detected using HAAR features; a schematic diagram of extracting faces from consecutive image frames is shown in Fig. 2. HAAR features reflect the gray-level changes of an image: pixels are grouped into blocks and the differences between blocks are computed. They fall into four classes: edge features, linear features, center features and diagonal features. Black and white rectangular boxes are combined into feature templates, and a template may contain several combinations. For example, as shown in Fig. 3, one black and one white rectangular box combine into a two-rectangle feature, three white-black-white boxes combine into a three-rectangle feature, and four black and white boxes combine into a four-rectangle feature template. The feature value of a template is the sum of the pixels in its black rectangles minus the sum of the pixels in its white rectangles.
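As a concrete illustration of the "black minus white" template value, a two-rectangle (edge) HAAR feature can be evaluated with an integral image, which reduces each rectangle sum to four table lookups. This is a minimal sketch of the feature computation only, not the patent's detector; in practice a trained prior-art cascade (e.g. OpenCV's `CascadeClassifier`) would be used, and the left-black/right-white orientation here is one illustrative choice.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended,
    so ii[i, j] equals the sum of img[0:i, 0:j]."""
    return np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

def rect_sum(ii, y, x, h, w):
    """Sum of pixels in the h-by-w rectangle with top-left corner (y, x)."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_edge_feature(img, y, x, h, w):
    """Two-rectangle edge feature at (y, x): the black (left) h-by-w
    rectangle sum minus the white (right) h-by-w rectangle sum."""
    ii = integral_image(np.asarray(img, dtype=np.int64))
    black = rect_sum(ii, y, x, h, w)
    white = rect_sum(ii, y, x + w, h, w)
    return black - white
```

A uniform image yields a feature value of zero, while a vertical brightness edge between the two halves yields a large value, which is exactly the gray-level change the text describes.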
The number N of HAAR features contained in a detection window is calculated as:
N = X · Y · (W + 1 − w · (X + 1)/2) · (H + 1 − h · (Y + 1)/2)
where W×H is the picture size, w×h is the rectangle-feature size, and X = ⌊W/w⌋ and Y = ⌊H/h⌋ are the maximum scale factors by which the rectangle feature can be enlarged in the horizontal and vertical directions, respectively.
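The count above can be evaluated directly. The sketch below assumes the standard Viola-Jones counting formula with X and Y as the maximal horizontal and vertical scale factors; the 24×24 window with a 2×1 template is the classic setting from that literature, used here only as a check.

```python
def haar_feature_count(W, H, w, h):
    """Number of positions and scales of a w x h rectangle template
    inside a W x H window:
        N = X * Y * (W + 1 - w*(X+1)/2) * (H + 1 - h*(Y+1)/2),
    where X = floor(W/w) and Y = floor(H/h) are the maximum horizontal
    and vertical scale factors."""
    X, Y = W // w, H // h
    return int(X * Y * (W + 1 - w * (X + 1) / 2) * (H + 1 - h * (Y + 1) / 2))
```

For instance, a 1×1 template in a 2×2 window gives 9 features (four at scale 1×1, two each at 2×1 and 1×2, one at 2×2), which the formula reproduces.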
Step S2: extract facial features from all face frames using the FaceNet framework.
After face pictures are obtained by detection with HAAR features, each detected face picture must be transformed, frame by frame, into a multi-dimensional matrix. A CNN maps the multi-dimensional matrix to a feature vector in Euclidean space, and the distances between the facial features of different pictures are computed. FaceNet is trained on the prior knowledge that the distance between faces of the same individual is always smaller than the distance between faces of different individuals. At test time, only the facial features need to be computed; the distance is then calculated, and a threshold determines whether two face photos belong to the same individual.
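The threshold-based verification just described can be sketched as follows. This is an illustrative sketch, not the patent's implementation: FaceNet embeddings are L2-normalized, and the threshold value 1.1 on the squared distance is a typical choice from common practice, not a value given by the patent.

```python
import numpy as np

def l2_normalize(v):
    """Project an embedding onto the unit hypersphere, as FaceNet does."""
    return v / np.linalg.norm(v)

def same_person(emb_a, emb_b, threshold=1.1):
    """Verification as described above: two face photos are declared the
    same individual iff the squared Euclidean distance between their
    (normalized) embeddings falls below a threshold."""
    d = np.sum((l2_normalize(emb_a) - l2_normalize(emb_b)) ** 2)
    return bool(d < threshold)
```

The training objective below guarantees exactly the property this test relies on: same-identity pairs end up closer than different-identity pairs.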
Unlike other applications of deep learning to faces, FaceNet does not perform classification learning with a traditional softmax and then extract some intermediate layer as the feature; instead, it directly learns, end to end, an encoding from the image into Euclidean space, and then performs face recognition, face verification, face clustering and so on based on this encoding.
The FaceNet algorithm removes the final softmax and instead trains the model by computing distances over tuples. The image representation learned in this way is very compact, using only 128 dimensions. Obtaining the multi-dimensional matrix of the face picture yields more detailed facial feature points, so as to meet the requirements of real face recognition accuracy.
The model framework of FaceNet is shown in Fig. 4: the Deep Architecture is a convolutional neural network with the softmax removed; after L2 normalization the feature representation is obtained, and the triplet loss is computed on this representation. The FaceNet model itself is prior art, so its framework is not repeated here; see the prior art.
A so-called triplet (composed of an Anchor, a Positive and a Negative: any picture can serve as a base point A, a picture of the same person is its P, and a picture of a different person is its N) is simply three samples, e.g. (anchor, pos, neg) (see Fig. 4). The learning process is then to find a representation such that, for as many triplets as possible, the distance between anchor and pos is smaller than the distance between anchor and neg, because only then can it be guaranteed that, as described above, the distance between faces of the same individual is always smaller than the distance between faces of different individuals. That is:
‖f(x_a) − f(x_p)‖² + α < ‖f(x_a) − f(x_n)‖²
Transforming this, the objective function is obtained:
L = Σ_i max(0, ‖f(x_i^a) − f(x_i^p)‖² − ‖f(x_i^a) − f(x_i^n)‖² + α)
where the norm on the left denotes the intra-class distance, the norm on the right denotes the inter-class distance, and α is a constant margin. The meaning of the objective function is to optimize only the triplets that do not satisfy the condition, i.e. to make the distance between anchor and pos smaller, as far as possible, than the distance between anchor and neg; triplets that already satisfy the condition are disregarded for the time being.
The selection of triplets is extremely important for the convergence of the model. For a given anchor x_a, one needs to select the picture of the same person that attains argmax_p ‖f(x_a) − f(x_p)‖² (the hard positive), and likewise the picture of a different person that attains argmin_n ‖f(x_a) − f(x_n)‖² (the hard negative). In actual training, computing this argmax and argmin over all training samples is impractical, and since the accuracy of the image labels is subject to error, convergence of training would be very difficult.
Therefore, in the present invention, the argmax and argmin are computed and filtered on a subset every n (n ∈ N*) steps, i.e. the iteration step length is enlarged, reducing the number of iterations for the model to converge while suppressing, as far as possible, the factors that hinder convergence.
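The objective and the subset-mining rule above can be sketched in NumPy as follows: `triplet_loss` is the hinge form of the objective, and `mine_on_subset` performs the argmax/argmin selection on a candidate subset, as would be done every n steps. Function names are illustrative, not from the patent.

```python
import numpy as np

def triplet_loss(anchor, pos, neg, alpha=0.2):
    """Hinge form of the objective above:
    max(0, ||f(a)-f(p)||^2 - ||f(a)-f(n)||^2 + alpha)."""
    d_ap = np.sum((anchor - pos) ** 2, axis=-1)
    d_an = np.sum((anchor - neg) ** 2, axis=-1)
    return np.maximum(0.0, d_ap - d_an + alpha)

def mine_on_subset(anchor, positives, negatives):
    """Offline mining on a subset: pick the hard positive
    argmax_p ||f(a)-f(p)||^2 and the hard negative
    argmin_n ||f(a)-f(n)||^2 for the given anchor embedding."""
    d_pos = np.sum((positives - anchor) ** 2, axis=1)
    d_neg = np.sum((negatives - anchor) ** 2, axis=1)
    return positives[np.argmax(d_pos)], negatives[np.argmin(d_neg)]
```

A triplet that already satisfies the margin contributes zero loss, so gradient effort is spent only on the violating triplets, which is the filtering behavior the text describes.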
Step S3: classify the extracted features to achieve face recognition and identity matching.
In the embodiment of the present invention, the features extracted in the previous step are classified using the prior-art KNN algorithm. KNN classifies by measuring the distances between different feature values. Its idea is: if most of the K most similar samples of a given sample (i.e. its nearest neighbors in feature space) belong to some category, then the sample also belongs to that category. K is typically an integer no greater than 20. In the KNN algorithm, the selected neighbors are objects that have already been classified correctly. In making its class decision, the method determines the category of the sample to be classified only according to the categories of the one or several nearest samples.
In KNN, the distance between objects is computed and used as a dissimilarity index between them, which avoids matching problems between objects. The distance used here is generally the Euclidean distance or the Manhattan distance:
Euclidean distance: d(x, y) = √( Σ_i (x_i − y_i)² )
where x and y are coordinates in Euclidean space.
Manhattan distance: d(x, y) = Σ_i |x_i − y_i|
where x and y are coordinates in Manhattan (taxicab) space.
Meanwhile KNN is by according to the classification progress decision being dominant in K object, rather than single object type decision. This 2 points be exactly KNN algorithms advantage.The detailed process that KNN algorithms are classified is:Known to training intensive data and label In the case of, the feature of test data feature corresponding with training set is compared to each other, finds instruction by input test data Practice and concentrate the most similar preceding K data therewith, then the corresponding classification of the test data be exactly in K data occurrence number it is most That classification, algorithm is described as:
1) compute the distance between the test data and each training datum;
2) sort by increasing distance;
3) select the K points with the smallest distances;
4) determine the frequency of occurrence of the categories of the first K points;
5) return the most frequent category among the first K points as the predicted classification of the test data.
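The five steps above can be sketched directly as a minimal majority-vote KNN over NumPy feature vectors. This is an illustrative sketch (in practice a library implementation such as scikit-learn's `KNeighborsClassifier` could serve the same role):

```python
import numpy as np
from collections import Counter

def knn_classify(test_vec, train_data, train_labels, k=5):
    """Majority-vote KNN, following the five steps above."""
    d = np.linalg.norm(np.asarray(train_data) - test_vec, axis=1)  # step 1: distances
    nearest = np.argsort(d)[:k]                                    # steps 2-3: sort, take K
    votes = Counter(train_labels[i] for i in nearest)              # step 4: class frequencies
    return votes.most_common(1)[0][0]                              # step 5: majority class
```

Applied to FaceNet embeddings, `train_data` would be the gallery of known 128-dimensional face vectors on the server and `train_labels` the corresponding identities.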
The previously trained model is shared with all monitoring systems connected to the server, i.e. all monitoring systems can directly invoke the same trained model through the server platform; once an identified feature is similar to a facial feature in the database, identity matching is carried out inside the server.
The FaceNet-based multi-camera dynamic face recognition system and method proposed by the invention combine a HAAR model and the FaceNet framework to perform modeling and matching on facial features, which can effectively improve the ability of a security monitoring system to recognize faces under different angles and illumination conditions. The model is shared with all monitoring devices connected to the server, i.e. all monitoring devices can directly invoke the same trained model through the server platform to perform face identity matching and identify target identities in time. Compared with traditional monitoring schemes, this greatly improves the intelligence and safety of the monitoring system.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the technical principles of the invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (6)

1. A FaceNet-based multi-camera dynamic face recognition system, the applicable scene comprising multiple camera monitoring systems mounted in a managed region, characterized in that the cameras of the monitoring systems capture video of pedestrians in real time from different angles, and all monitoring systems are connected over a network to the same server and share its intranet; the face recognition system comprises a video acquisition module, a feature extraction module and a feature classification module,
wherein in the video acquisition module, all monitoring systems capture pedestrian video streams in real time from different angles with their respective cameras; face frames are cropped from the video stream, frame by frame, using a HAAR model, and all face frames are sent to the server;
the feature extraction module, on the server, extracts facial features from all face frames using the FaceNet framework;
the feature classification module classifies the extracted facial features to achieve face recognition.
2. The FaceNet-based multi-camera dynamic face recognition system according to claim 1, characterized in that the trained FaceNet model on the server is shared with all monitoring systems connected to the server.
3. A FaceNet-based multi-camera dynamic face recognition method, characterized by comprising the following steps:
Step S1: capture video of pedestrians from different angles, and crop face frames from the video stream, frame by frame, using a HAAR model;
Step S2: extract facial features from all face frames using the FaceNet framework;
Step S3: classify the extracted features to achieve face recognition.
4. The FaceNet-based multi-camera dynamic face recognition method according to claim 3, characterized in that each face frame is represented by a 128-dimensional multi-dimensional matrix.
5. The FaceNet-based multi-camera dynamic face recognition method according to claim 3, characterized in that in step S2, during triplet training, the argmax and argmin are computed and filtered on a subset every n (n ∈ N*) steps, i.e. the iteration step length is enlarged, reducing the number of iterations for the model to converge.
6. The FaceNet-based multi-camera dynamic face recognition method according to claim 3, characterized in that in step S3, classification is performed using the KNN method.
CN201810370308.3A 2018-04-24 2018-04-24 Multi-cam dynamic human face recognition system based on FaceNet and method Pending CN108564053A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810370308.3A CN108564053A (en) 2018-04-24 2018-04-24 Multi-cam dynamic human face recognition system based on FaceNet and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810370308.3A CN108564053A (en) 2018-04-24 2018-04-24 Multi-cam dynamic human face recognition system based on FaceNet and method

Publications (1)

Publication Number Publication Date
CN108564053A true CN108564053A (en) 2018-09-21

Family

ID=63536231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810370308.3A Pending CN108564053A (en) 2018-04-24 2018-04-24 Multi-cam dynamic human face recognition system based on FaceNet and method

Country Status (1)

Country Link
CN (1) CN108564053A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427843A (en) * 2019-07-18 2019-11-08 广州利科科技有限公司 A kind of face intelligent identification Method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101373514A (en) * 2007-08-24 2009-02-25 李树德 Method and system for recognizing human face
CN107403173A (en) * 2017-08-21 2017-11-28 合肥麟图信息科技有限公司 A kind of face identification system and method
CN107480658A (en) * 2017-09-19 2017-12-15 苏州大学 Face identification device and method based on multi-angle video

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101373514A (en) * 2007-08-24 2009-02-25 李树德 Method and system for recognizing human face
CN107403173A (en) * 2017-08-21 2017-11-28 合肥麟图信息科技有限公司 A kind of face identification system and method
CN107480658A (en) * 2017-09-19 2017-12-15 苏州大学 Face identification device and method based on multi-angle video

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
FLORIAN SCHROFF et al.: "FaceNet: A Unified Embedding for Face Recognition and Clustering", 2015 CVPR *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427843A (en) * 2019-07-18 2019-11-08 广州利科科技有限公司 A kind of face intelligent identification Method
CN110427843B (en) * 2019-07-18 2021-07-13 广州利科科技有限公司 Intelligent face recognition method

Similar Documents

Publication Publication Date Title
CN108564052A (en) Multi-cam dynamic human face recognition system based on MTCNN and method
CN109819208B (en) Intensive population security monitoring management method based on artificial intelligence dynamic monitoring
CN110210276A (en) A kind of motion track acquisition methods and its equipment, storage medium, terminal
US6807286B1 (en) Object recognition using binary image quantization and hough kernels
Sun et al. Photo assessment based on computational visual attention model
Nguyen et al. Anomaly detection in traffic surveillance videos with gan-based future frame prediction
CN110543811B (en) Deep learning-based non-cooperative examination personnel management method and system
CN112183468A (en) Pedestrian re-identification method based on multi-attention combined multi-level features
CN111325051A (en) Face recognition method and device based on face image ROI selection
CN111353399A (en) Tamper video detection method
CN109902681B (en) User group relation determining method, device, equipment and storage medium
CN108363771B (en) Image retrieval method for public security investigation application
Heng et al. How to assess the quality of compressed surveillance videos using face recognition
CN110827432A (en) Class attendance checking method and system based on face recognition
CN112766119A (en) Method for accurately identifying strangers and constructing community security based on multi-dimensional face analysis
KR20210040604A (en) Action recognition method and device
KR20190071452A (en) Apparatus and method for object detection with shadow removed
CN116824641B (en) Gesture classification method, device, equipment and computer storage medium
CN112488072A (en) Method, system and equipment for acquiring face sample set
CN108564053A (en) Multi-cam dynamic human face recognition system based on FaceNet and method
CN111800428A (en) Real-time statistical method and system for digital conference participation
CN115471901A (en) Multi-pose face frontization method and system based on generation of confrontation network
CN115346169A (en) Method and system for detecting sleep post behaviors
CN116546304A (en) Parameter configuration method, device, equipment, storage medium and product
Charran et al. Real-Time Identity Censorship of Videos to Enable Live Telecast Using NVIDIA Jetson Nano

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180921
