CN101419670A - Video monitoring method and system based on advanced audio/video encoding standard - Google Patents

Video monitoring method and system based on advanced audio/video encoding standard

Info

Publication number
CN101419670A
CN101419670A (application number CN200810203202)
Authority
CN
China
Prior art keywords
face
background
video
people
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA2008102032020A
Other languages
Chinese (zh)
Other versions
CN101419670B (en)
Inventor
王新
路红
宋元征
陈桂财
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University filed Critical Fudan University
Priority to CN2008102032020A priority Critical patent/CN101419670B/en
Publication of CN101419670A publication Critical patent/CN101419670A/en
Application granted granted Critical
Publication of CN101419670B publication Critical patent/CN101419670B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention belongs to the technical field of video monitoring and in particular relates to a video monitoring method based on AVS (the Advanced Audio/Video Coding Standard) and a system implementing it. Following the trend in video monitoring, the invention introduces automatic processing and the AVS standard into video monitoring, combining background/non-background classification with face detection and recognition. Surveillance video is pre-processed automatically by the computer system, so that, while the returned content remains valid, the amount of information fed back to operators is far smaller than in a traditional monitoring system. Human resources are thus greatly saved, and the reliability of the video monitoring system is improved at the same time. The invention is the first to exploit the advantages of AVS in video monitoring technology and in a patent application. Given the strong support of national and local governments for the adoption of AVS, the AVS-based video monitoring method and system have application value in digital surveillance, access control, personal identification, and similar fields.

Description

Video monitoring method and system based on the Advanced Audio/Video Coding Standard
Technical field
The invention belongs to the technical field of video monitoring, and is specifically a video monitoring method based on AVS (the Advanced Audio/Video Coding Standard) and a system implementing it.
Background technology
Safety and security have attracted wide concern, and ever more video monitoring systems have appeared, such as access-control systems, attendance systems, and identification systems. A video monitoring system lets managerial personnel observe, from a control room, the activity of everyone in the protected front-end area and keep records, providing the security system with real-time image and sound information. However, a traditional video monitoring system requires a large expenditure of human resources: detection, recognition, and understanding of the monitored video content rely entirely on manual work, which lowers the efficiency of the system, while security and accuracy also lack assurance. Moreover, there is at present no video compression standard dedicated to digital video surveillance to serve as the core technology of such systems, which causes considerable problems in network transmission and system interoperability.
Summary of the invention
The object of the invention is to propose an efficient video monitoring method and system with good security.
Following the trend in video monitoring, the invention introduces automatic processing and the AVS standard into video monitoring, combining background/non-background classification with face detection and recognition. Surveillance video is pre-processed automatically by the computer system, so that, while the returned content remains valid, the amount of information fed back to operators is far smaller than in a traditional monitoring system; this greatly saves human resources and at the same time improves system reliability. The invention is the first to exploit the advantages of AVS in video monitoring technology and patents; given the strong support of national and local governments for the adoption of AVS, the invention has application value in digital surveillance, access control, identification, and similar fields.
The invention first acquires an AVS bitstream from an AVS network camera and uses compressed-domain information produced while decoding the bitstream to classify each frame as background or non-background. When the classification result shows that the current frame is not background, face detection is carried out. When a face is detected, face recognition follows: the face data are transformed and compared with the training data. Before the recognition result is fed back to the user, a confidence t is computed, indicating how trustworthy the current result is. When t is below a threshold t_min (obtained from empirical statistics: the higher t_min, the higher the precision; the lower t_min, the higher the recall; a suitable t_min is set by balancing the two against the actual conditions of the system), the face is considered not to belong to the current gallery and is regarded as a stranger; this result is fed back to the user, and after the user confirms it the new face is added to the gallery. When t is greater than or equal to t_min, the recognition result is considered highly reliable; it is recorded and the video is annotated accordingly. Fig. 1 is the flow chart of the video monitoring system, embodying the two characteristics of the invention: the use of AVS and automated processing.
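The control flow just described — skip background frames, detect faces in the rest, recognize, and gate the result on the t_min confidence threshold — can be sketched as follows. All function names and the toy stand-in logic are illustrative assumptions, not the patent's implementation; only the control flow and the 0.85 threshold suggested in the embodiment come from the text.

```python
T_MIN = 0.85  # confidence threshold; the embodiment suggests 0.85

def is_background(frame):
    # Stand-in for the compressed-domain MV/MS test described later.
    return frame.get("motion", 0.0) < 1.0

def detect_faces(frame):
    # Stand-in for the AdaBoost local face detector.
    return frame.get("faces", [])

def recognize(face, gallery):
    # Stand-in: nearest gallery entry by absolute difference,
    # with a toy confidence in [0, 1].
    best = min(gallery, key=lambda g: abs(g["feat"] - face))
    t = 1.0 / (1.0 + abs(best["feat"] - face))
    return best["name"], t

def process_frame(frame, gallery):
    """Process one frame of the stream per the patent's control flow."""
    if is_background(frame):            # background: skip all further work
        return None
    results = []
    for face in detect_faces(frame):
        name, t = recognize(face, gallery)
        # Below t_min: treat as a stranger and feed back to the operator;
        # otherwise log the identity and annotate the video.
        results.append(("stranger" if t < T_MIN else name, face))
    return results
```

The point of the sketch is the early exit on background frames: the expensive face pipeline runs only on the minority of frames that show motion.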
The implemented system consists mainly of three parts: a training module, an annotation module, and a retrieval module.
The training module comprises a training module for the monitoring-environment background and a training module for the face database. It trains on the environment background and on faces, taking the face sample library and the background sample library as input and outputting the face features and background features.
The annotation module comprises a background detection module, a face detection module, a face recognition module, and an index-construction part, and annotates the input surveillance video automatically. Its inputs are the background features and face features obtained by the training module together with the surveillance video to be annotated; its output is a search index over that video.
The retrieval module retrieves from a specified surveillance video and supports picture, text, and video queries. Its inputs are the index of the specified video and the picture, text, or video clip submitted by the user; it returns the image content in the surveillance video corresponding to the submitted query. Fig. 2 shows the main modules of the system, its workflow, and the logical relations between the modules. As shown in the figure, the initial inputs of the system are the face database and background samples; training yields the background model, the face-feature transformation matrix, and the face-feature library. The surveillance video is then annotated: annotation starts with background detection; images that are not background undergo face detection; the faces that appear are feature-transformed and indexed in the index structure. Finally, the user submits text, a picture, or a video through the user interface; the system handles each kind of query accordingly, and what is fed back to the user is the position at which the relevant information appears in the monitoring data.
The main modules of the system are designed as follows:
1) Background training module: computes over the input background video samples to obtain the background model. The algorithm works in the HSV color space and computes, for each pixel, the span of values that belong to the background.
Input: background video samples.
Output: background model, used for background comparison.
2) Face training module: processes the faces in the face database. The algorithm used is Fisher-face.
Input: face database.
Output: a transformation matrix computed from the face data in the database; the matrix transforms an input face into a one-dimensional vector for recognition. When the transformation matrix is obtained, the center of each person's face class is also output, for use in recognition.
3) Background detection module: compares the input frame with the background model to determine whether the frame is background and, if not, which regions belong to the foreground.
Input: background model, frame image.
Output: whether the input frame is background and, if not, which regions belong to the foreground.
4) Face detection module: for a non-background frame, detects the faces in it.
Input: frame image.
Output: detected face images.
5) Face recognition module: for a detected face image, applies the transformation matrix obtained in training to get a vector, then computes the similarity to each class center using Euclidean distance to perform recognition.
Input: face image, transformation matrix.
Output: recognition result.
6) Index construction module: annotates the input video, derives a video index from the face recognition results, and builds the index structure.
Input: surveillance video.
Output: video index.
7) Retrieval module: the user enters a query through the user interface; the retrieval module retrieves according to the format of the submitted content and feeds the information back through the user interface.
Input: the query submitted by the user.
Output: information such as video clips fed back to the user.
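The idea behind module 1 — recording, for each pixel, the span of HSV values observed across the background training frames, and later treating pixels that fall outside their span as foreground — can be sketched as below. The data representation and the tolerance parameter are hypothetical; the patent does not specify them.

```python
import colorsys

def train_background(frames):
    """frames: list of same-sized images, each a 2D list of (r, g, b)
    values in [0, 1].  Returns, per pixel, the (min_hsv, max_hsv) span
    observed over all training frames."""
    h, w = len(frames[0]), len(frames[0][0])
    model = [[None] * w for _ in range(h)]
    for frame in frames:
        for i in range(h):
            for j in range(w):
                hsv = colorsys.rgb_to_hsv(*frame[i][j])
                if model[i][j] is None:
                    model[i][j] = (hsv, hsv)
                else:
                    lo, hi = model[i][j]
                    # Widen the per-pixel span component-wise.
                    model[i][j] = (tuple(map(min, lo, hsv)),
                                   tuple(map(max, hi, hsv)))
    return model

def is_background_pixel(model, i, j, rgb, tol=0.05):
    """A pixel is background if its HSV value lies within the trained
    span, up to an illustrative tolerance tol."""
    hsv = colorsys.rgb_to_hsv(*rgb)
    lo, hi = model[i][j]
    return all(a - tol <= v <= b + tol for v, a, b in zip(hsv, lo, hi))
```

Pixels flagged as non-background would then form the foreground regions that module 3 reports.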
The invention applies dedicated preprocessing to the AVS video stream. Whether for real-time access-control monitoring or for offline processing of stored video, the AVS bitstream is not fully decoded; instead, compressed-domain information from AVS is used for background/non-background classification, judging whether the current image is background. If it is background, no further work is done, which improves the processing efficiency of the system. In real-time applications, hardware acceleration can also be added to speed up this step.
In the AVS compressed domain, the motion vectors of macroblocks reflect the motion of objects in the video. In a background segment the image is relatively static, while the appearance of a person introduces more motion information into the video. Document [1] proposes using H.264 motion-estimation information for background/non-background classification; the invention applies a similar algorithm to the AVS bitstream. Let m⃗_i, 0 ≤ i ≤ N−1, be the motion vector of the i-th macroblock in the current image, where N is the total number of macroblocks in the current image. The motion intensity of the current image is computed with the following formula:
MV = Σ_{i=0}^{N−1} |m⃗_i| · size_i    Formula (1)
where size_i is the area of the i-th macroblock.
The motion intensity alone cannot fully characterize the motion state of objects in the current image, so another parameter, MS, is introduced to represent the extent of motion in the image:
MS = Σ_{i=0}^{N−1} b_s_i,  where b_s_i = size_i if m⃗_i ≠ 0, and 0 otherwise    Formula (2)
In a background image sequence there is no violent motion in the images, and both the motion intensity and the extent of motion remain small. Let the threshold on MV be mv_min and the threshold on MS be ms_min; both are obtained from empirical statistics. The smaller mv_min and ms_min, the higher the precision of background discrimination; the larger they are, the higher the recall. Suitable values are set by balancing the two against the actual conditions of the system. The current image is judged to belong to the background when the following condition holds:
MV < mv_min and MS < ms_min.
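Assuming the per-macroblock motion vectors and areas have already been parsed from the AVS bitstream, the background test of formulas (1) and (2) can be sketched as follows. The threshold defaults are purely illustrative, since the patent leaves mv_min and ms_min to empirical tuning, and the area-weighted form of formula (1) is a reconstruction.

```python
import math

def classify_background(mvs, sizes, mv_min=0.5, ms_min=256):
    """mvs: list of (dx, dy) macroblock motion vectors;
    sizes: corresponding macroblock areas in pixels.
    Returns True if the frame is judged to be background."""
    total = sum(sizes)
    # Formula (1): area-weighted motion intensity (normalized here so
    # mv_min can be stated per pixel).
    mv = sum(math.hypot(dx, dy) * s
             for (dx, dy), s in zip(mvs, sizes)) / total
    # Formula (2): total area of macroblocks with nonzero motion.
    ms = sum(s for (dx, dy), s in zip(mvs, sizes) if (dx, dy) != (0, 0))
    # Background iff both intensity and moving extent stay small.
    return mv < mv_min and ms < ms_min
```

Tightening mv_min and ms_min raises precision of the background label at the cost of recall, exactly the trade-off the text describes.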
The significance of background/non-background classification lies not only in improving the efficiency of the system; it also collects statistics for each monitored site, from which environmental information about the site can be inferred. For example, from the distribution of non-background frames within a surveillance sequence one can learn during which periods the site is crowded, and deploy accordingly: raise the recording frame rate in busy periods, lower it in periods when few people pass, and so on.
After background detection, face detection is carried out on the images judged not to be background. Face detection uses the AdaBoost algorithm [2]; however, to improve the processing efficiency of the system, detection is performed locally rather than globally.
After face detection, each detected face image is scaled to a uniform size and scanned left to right, top to bottom, into a sample vector, which is then reduced in dimension. The classical Fisher-face algorithm, combining PCA (Principal Component Analysis) with LDA (Linear Discriminant Analysis), is used to extract the face projection features [3]: LDA is applied in the space obtained after PCA dimensionality reduction, yielding the feature vector of the detected face. After feature extraction, a minimum-distance classifier compares the face with those in the gallery for identification.
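A compact numpy sketch of the PCA-then-LDA (Fisher-face) projection on toy data; the function names and the pseudo-inverse-based LDA solver are illustrative choices under stated assumptions, not the patent's implementation.

```python
import numpy as np

def fit_fisherface(X, y, n_pca):
    """X: (n_samples, n_pixels) row vectors; y: class labels.
    Returns the combined projection W, the data mean, and the
    per-class centers in the projected space."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA by SVD: keep the first n_pca principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_pca = Vt[:n_pca].T                       # (n_pixels, n_pca)
    P = Xc @ W_pca                             # samples in PCA space
    classes = np.unique(y)
    mu = P.mean(axis=0)
    Sw = np.zeros((n_pca, n_pca))
    Sb = np.zeros((n_pca, n_pca))
    for c in classes:
        Pc = P[y == c]
        mc = Pc.mean(axis=0)
        Sw += (Pc - mc).T @ (Pc - mc)          # within-class scatter
        d = (mc - mu)[:, None]
        Sb += len(Pc) * (d @ d.T)              # between-class scatter
    # LDA directions: eigenvectors of pinv(Sw) @ Sb,
    # at most n_classes - 1 of them.
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-vals.real)[: len(classes) - 1]
    W = W_pca @ vecs[:, order].real            # combined projection
    centers = {c: ((X[y == c] - mean) @ W).mean(axis=0) for c in classes}
    return W, mean, centers

def project(x, W, mean):
    """Map one sample vector into the Fisher-face feature space."""
    return (x - mean) @ W
```

A new face would be projected with `project` and assigned to the nearest class center, as the minimum-distance classifier above prescribes.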
Let f′ = (u₀, u₁, …, u_k) be the sample vector of face f after Fisher-face feature extraction. Its distance to each training sample is computed as:
d(f′, f_i′) = Σ_{j=0}^{k} (u_j − v_j)²    Formula (3)
where f_i′ = (v₀, v₁, …, v_k) denotes the i-th training sample in the gallery and k is the sample dimension; d(f′, f_i′) is the distance between the current sample to be identified and the i-th training sample in the gallery.
After d(f′, f_i′) has been computed for all samples in the gallery, the 5 samples with the smallest distances, f_{i1}′, f_{i2}′, …, f_{i5}′, are selected. The class c to which the majority of them belong is taken, where a class refers to the samples belonging to the same individual and c is the class with the largest count. If the 5 samples each belong to a different class, the class of the nearest sample f_{i1}′ is taken as c. The confidence t of the identification is computed as:
t = Σ d(f′, f_{ij}′ | f_{ij}′ ∈ c) / Σ_{j=1}^{5} d(f′, f_{ij}′)    Formula (4)
When the confidence t is below the threshold t_min, the face is deemed a stranger: the result f is fed back to the user and, after the user confirms it, the new face is added to the gallery; otherwise the recognition result is considered reliable and is recorded. t_min is obtained from empirical statistics: the higher t_min, the higher the precision; the lower t_min, the higher the recall; a suitable t_min is set by balancing the two against the actual conditions of the system.
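The minimum-distance classification with its top-5 majority vote and the confidence t of formula (4) can be sketched as below. The function name and the tie-breaking details are assumptions where the patent is silent; the distance, the vote, and the t ratio follow formulas (3) and (4).

```python
from collections import Counter

def recognize_face(f, gallery, t_min=0.85):
    """f: feature vector; gallery: list of (label, vector) pairs.
    Returns (label, t) or ("stranger", t) when t falls below t_min."""
    def d(a, b):
        # Formula (3): squared Euclidean distance.
        return sum((u - v) ** 2 for u, v in zip(a, b))

    near = sorted(gallery, key=lambda s: d(f, s[1]))[:5]
    counts = Counter(label for label, _ in near)
    if max(counts.values()) == 1:
        c = near[0][0]                 # all distinct: take nearest sample's class
    else:
        c = counts.most_common(1)[0][0]  # majority class among the 5 nearest
    # Formula (4): share of the 5-neighbor distance mass belonging to c.
    total = sum(d(f, v) for _, v in near)
    in_c = sum(d(f, v) for label, v in near if label == c)
    t = in_c / total if total else 1.0
    return (c, t) if t >= t_min else ("stranger", t)
```

When all five neighbors belong to one class, t = 1 and the match is accepted; scattered neighbors drive t down toward the stranger branch, matching the precision/recall trade-off described for t_min.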
In summary, the steps of the AVS-based video monitoring system and its implementation proposed by the invention are: 1. acquire the AVS bitstream with an AVS camera; 2. perform background classification, face detection, background training, and face training on the bitstream; 3. compare and identify faces; 4. obtain the query result.
Description of drawings
Fig. 1 is the core flow chart of the video monitoring system.
Fig. 2 shows the main modules and workflow of the system.
Reference numbers in the figures: 1 training module; 2 annotation module; 3 retrieval module; 4 face database; 5 background sample library; 6 background training module; 7 face training module; 8 background model; 9 face-feature transformation matrix; 10 background detection module; 11 face detection module; 12 face recognition module; 13 index construction module; 14 surveillance video; 15 search index; 16 retrieval module.
Embodiment
As an example, in an application of the invention to an access-control system, the system can be divided into five parts: front-end camera, AVS video database, video processing and comparison/identification, face database, and entry-information query. In an access-control system the camera position is relatively fixed, the shooting angle and background scene are fixed, and lighting changes in an indoor office-building environment are not violent. Because the driver supplied with the camera does not support segmentation and remote storage, a driver is written on top of the supplied one according to the application requirements: during shooting, the video is segmented automatically and the resulting AVS video segments are stored in the designated database. At the same time, the segmented AVS bitstream is processed in real time and in order. Background classification is performed first; if a segment is background scenery, it is not processed further. After background detection, face detection is carried out on the images judged not to be background; to improve the processing efficiency of the system, detection is local rather than global, as elaborated above and not repeated here. When face detection yields a confidence t (computed as described above) below the threshold t_min (described above; t_min can be set to 0.85 in an actual implementation), information such as "this face is not in the gallery; it is a stranger" is fed back to alert the user; after the user confirms, the new face can be added to the gallery and the result stored in the face database. If t is greater than t_min, the recognition result is reliable and the person exists in the original face database; the person's name is looked up and reported automatically, and the entry time is recorded. This is one practical application of the invention.
References:
[1] Hui H., Liu H., Wu Y., Liang Y. An intelligent video surveillance technique based on the H.264 video coding standard [J]. Computer Applications, 2005, 25(11): 131-133.
[2] Freund Y., Schapire R. E. A decision-theoretic generalization of on-line learning and an application to boosting [J]. Journal of Computer and System Sciences, 1997, 55(1): 119-139.
[3] Belhumeur P., Hespanha J., Kriegman D. Eigenfaces vs. Fisherfaces: recognition using class specific linear projection [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997, 19(7): 711-720.

Claims (4)

1. A video monitoring method based on AVS, characterized by the following concrete steps: first, an AVS bitstream is acquired by an AVS network camera, and compressed-domain information from the AVS bitstream decoding process is used to classify background and non-background. When the classification result shows that the current frame is not background, face detection is carried out. When a face is detected, face recognition is carried out: the face data are transformed and compared with the training data. Before the recognition result is fed back to the user, a confidence t is computed, indicating how trustworthy the current result is. When t is below a threshold t_min, the face is considered not to belong to the current gallery and is regarded as a stranger; this result is fed back to the user, and after the user confirms it the new face is added to the gallery. When t is greater than or equal to t_min, the recognition result is considered highly reliable; it is recorded and the video is annotated; here AVS refers to the Advanced Audio/Video Coding Standard.
2. The method according to claim 1, characterized in that the background classification method is: let m⃗_i, 0 ≤ i ≤ N−1, be the motion vector of a macroblock in the current image, where N is the total number of macroblocks in the current image; the motion intensity of the current image is computed with the following formula:
MV = Σ_{i=0}^{N−1} |m⃗_i| · size_i    Formula (1)
where size_i is the area of the i-th macroblock.
The parameter MS represents the extent of motion in the image:
MS = Σ_{i=0}^{N−1} b_s_i,  where b_s_i = size_i if m⃗_i ≠ 0, and 0 otherwise    Formula (2)
The current image is judged to belong to the background when the following condition is satisfied:
MV<mv_min and MS<ms_min.
3. The method according to claim 1, characterized in that the face recognition method is as follows: after face detection, each detected face image is scaled to a uniform size and scanned left to right, top to bottom, into a sample vector, which is then reduced in dimension; the classical Fisher-face algorithm, combining PCA with LDA, is used to extract the face projection features;
Let f′ = (u₀, u₁, …, u_k) be the sample vector of face f after Fisher-face feature extraction; its distance to each training sample is computed as:
d(f′, f_i′) = Σ_{j=0}^{k} (u_j − v_j)²    Formula (3)
where f_i′ = (v₀, v₁, …, v_k) denotes the i-th training sample in the gallery and k is the sample dimension; d(f′, f_i′) is the distance between the current sample to be identified and the i-th training sample in the gallery;
After d(f′, f_i′) has been computed for all samples in the gallery, the 5 samples with the smallest distances, f_{i1}′, f_{i2}′, …, f_{i5}′, are selected; the class c to which the majority of them belong is taken, where a class refers to the samples belonging to the same individual and c is the class with the largest count; if the 5 samples each belong to a different class, the class of the nearest sample f_{i1}′ is taken as c; the confidence t of the identification is computed as:
t = Σ d(f′, f_{ij}′ | f_{ij}′ ∈ c) / Σ_{j=1}^{5} d(f′, f_{ij}′)    Formula (4)
When the confidence t is below the threshold t_min, the face is deemed a stranger: the result f is fed back to the user and, after the user confirms it, the new face is added to the gallery; otherwise the recognition result is considered reliable and is recorded.
4. A video monitoring system based on AVS, characterized in that the system consists mainly of a training module, an annotation module, and a retrieval module:
The training module comprises a training module for the monitoring-environment background and a training module for the face database; it trains on the environment background and on faces, taking the face sample library and the background sample library as input and outputting the face features and background features;
The annotation module comprises a background detection module, a face detection module, a face recognition module, and an index-construction part, and annotates the input surveillance video automatically; its inputs are the background features and face features obtained by the training module together with the surveillance video to be annotated, and its output is a search index over that video;
The retrieval module retrieves from a specified surveillance video and supports picture, text, and video queries; its inputs are the index of the specified video and the picture, text, or video clip submitted by the user, and it returns the image content in the surveillance video corresponding to the submitted query;
The main modules of the system are designed as follows:
1) Background training module: computes over the input background video samples to obtain the background model; the algorithm works in the HSV color space and computes, for each pixel, the span of values that belong to the background;
Input: background video samples;
Output: background model, used for background comparison;
2) Face training module: processes the faces in the face database; the algorithm used is Fisher-face;
Input: face database;
Output: a transformation matrix computed from the face data in the database, which transforms an input face into a one-dimensional vector for recognition; when the transformation matrix is obtained, the center of each person's face class is also output, for use in recognition;
3) Background detection module: compares the input frame with the background model to determine whether the frame is background and, if not, which regions belong to the foreground;
Input: background model, frame image;
Output: whether the input frame is background and, if not, which regions belong to the foreground;
4) Face detection module: for a non-background frame, detects the faces in it;
Input: frame image;
Output: detected face images;
5) Face recognition module: for a detected face image, applies the transformation matrix obtained in training to get a vector, then computes the similarity to each class center using Euclidean distance to perform recognition;
Input: face image, transformation matrix;
Output: recognition result;
6) Index construction module: annotates the input video, derives a video index from the face recognition results, and builds the index structure;
Input: surveillance video;
Output: video index;
7) Retrieval module: the user enters a query through the user interface; the retrieval module retrieves according to the format of the submitted content and feeds the information back through the user interface;
Input: the query submitted by the user;
Output: video clip information fed back to the user.
CN2008102032020A 2008-11-21 2008-11-21 Video monitoring method and system based on advanced audio/video encoding standard Expired - Fee Related CN101419670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008102032020A CN101419670B (en) 2008-11-21 2008-11-21 Video monitoring method and system based on advanced audio/video encoding standard


Publications (2)

Publication Number Publication Date
CN101419670A true CN101419670A (en) 2009-04-29
CN101419670B CN101419670B (en) 2011-11-02

Family

ID=40630456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008102032020A Expired - Fee Related CN101419670B (en) 2008-11-21 2008-11-21 Video monitoring method and system based on advanced audio/video encoding standard

Country Status (1)

Country Link
CN (1) CN101419670B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101860731A (en) * 2010-05-20 2010-10-13 杭州普维光电技术有限公司 Video information processing method, system and server
CN102223520A (en) * 2011-04-15 2011-10-19 北京易子微科技有限公司 Intelligent face recognition video monitoring system and implementation method thereof
CN102932625A (en) * 2011-08-10 2013-02-13 上海康纬斯电子技术有限公司 Portable digital audio/video acquisition device
CN103475882A (en) * 2013-09-13 2013-12-25 北京大学 Surveillance video encoding and recognizing method and surveillance video encoding and recognizing system
CN104392439A (en) * 2014-11-13 2015-03-04 北京智谷睿拓技术服务有限公司 Image similarity confirmation method and device
CN104463117A (en) * 2014-12-02 2015-03-25 苏州科达科技股份有限公司 Sample collection method and system used for face recognition and based on video
CN105654055A (en) * 2015-12-29 2016-06-08 广东顺德中山大学卡内基梅隆大学国际联合研究院 Method for performing face recognition training by using video data
CN106407281A (en) * 2016-08-26 2017-02-15 北京奇艺世纪科技有限公司 Image retrieval method and device
CN109446967A (en) * 2018-10-22 2019-03-08 深圳市梦网百科信息技术有限公司 A kind of method for detecting human face and system based on compression information
CN112085858A (en) * 2020-06-19 2020-12-15 北京筑梦园科技有限公司 Parking charging method, server and parking charging processing system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1224900C (en) * 2001-12-29 2005-10-26 上海银晨智能识别科技有限公司 Embedded human face automatic detection equipment based on DSP and its method
CN100468467C (en) * 2006-12-01 2009-03-11 浙江工业大学 Access control device and check on work attendance tool based on human face identification technique
CN100568262C (en) * 2007-12-29 2009-12-09 浙江工业大学 Human face recognition detection device based on the multi-video camera information fusion

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101860731B (en) * 2010-05-20 2012-05-30 杭州普维光电技术有限公司 Video information processing method, system and server
CN101860731A (en) * 2010-05-20 2010-10-13 杭州普维光电技术有限公司 Video information processing method, system and server
CN102223520A (en) * 2011-04-15 2011-10-19 北京易子微科技有限公司 Intelligent face recognition video monitoring system and implementation method thereof
CN102932625A (en) * 2011-08-10 2013-02-13 上海康纬斯电子技术有限公司 Portable digital audio/video acquisition device
CN103475882A (en) * 2013-09-13 2013-12-25 北京大学 Surveillance video encoding and recognizing method and surveillance video encoding and recognizing system
CN103475882B (en) * 2013-09-13 2017-02-15 北京大学 Surveillance video encoding and recognizing method and surveillance video encoding and recognizing system
CN104392439B (en) * 2014-11-13 2019-01-11 北京智谷睿拓技术服务有限公司 Method and apparatus for determining image similarity
CN104392439A (en) * 2014-11-13 2015-03-04 北京智谷睿拓技术服务有限公司 Image similarity confirmation method and device
CN104463117A (en) * 2014-12-02 2015-03-25 苏州科达科技股份有限公司 Sample collection method and system used for face recognition and based on video
CN104463117B (en) * 2014-12-02 2018-07-03 苏州科达科技股份有限公司 Video-based face recognition sample collection method and system
CN105654055A (en) * 2015-12-29 2016-06-08 广东顺德中山大学卡内基梅隆大学国际联合研究院 Method for performing face recognition training by using video data
CN106407281A (en) * 2016-08-26 2017-02-15 北京奇艺世纪科技有限公司 Image retrieval method and device
CN106407281B (en) * 2016-08-26 2019-12-24 北京奇艺世纪科技有限公司 Image retrieval method and device
CN109446967A (en) * 2018-10-22 2019-03-08 深圳市梦网百科信息技术有限公司 Face detection method and system based on compressed information
CN109446967B (en) * 2018-10-22 2022-01-04 深圳市梦网视讯有限公司 Face detection method and system based on compressed information
CN112085858A (en) * 2020-06-19 2020-12-15 北京筑梦园科技有限公司 Parking charging method, server and parking charging processing system

Also Published As

Publication number Publication date
CN101419670B (en) 2011-11-02

Similar Documents

Publication Publication Date Title
CN101419670B (en) Video monitoring method and system based on advanced audio/video encoding standard
US20220270369A1 (en) Intelligent cataloging method for all-media news based on multi-modal information fusion understanding
Chung et al. Seeing voices and hearing voices: learning discriminative embeddings using cross-modal self-supervision
Kieran et al. A framework for an event driven video surveillance system
Liu et al. LSTM-based multi-label video event detection
CN112468888B (en) Video abstract generation method and system based on GRU network
Duong et al. Shrinkteanet: Million-scale lightweight face recognition via shrinking teacher-student networks
CN112183468A (en) Pedestrian re-identification method based on multi-attention combined multi-level features
CN103237201A (en) Case video studying and judging method based on social annotation
CN113627266A (en) Video pedestrian re-identification method based on Transformer space-time modeling
Mahmoodi et al. Violence detection in videos using interest frame extraction and 3D convolutional neural network
Zhou et al. Recognizing pair-activities by causality analysis
Tariq et al. Real time vehicle detection and colour recognition using tuned features of Faster-RCNN
Pouthier et al. Active speaker detection as a multi-objective optimization with uncertainty-based multimodal fusion
Liu et al. Lip event detection using oriented histograms of regional optical flow and low rank affinity pursuit
Ma et al. Real-time multi-view face detection and pose estimation based on cost-sensitive adaboost
CN108491751B (en) Complex action identification method for exploring privilege information based on simple action
Frisch et al. Detection of a transient signal of unknown scaling and arrival time using the discrete wavelet transform
Zhou et al. Preserve pre-trained knowledge: Transfer learning with self-distillation for action recognition
Yin et al. Chinese sign language recognition based on two-stream CNN and LSTM network
Lee et al. Video summarization based on face recognition and speaker verification
Shi et al. Kernel null-space-based abnormal event detection using hybrid motion information
Shen et al. Fast gender recognition by using a shared-integral-image approach
Kumar A comparative study on machine learning algorithms using HOG features for vehicle tracking and detection
Chen et al. End-To-End Part-Level Action Parsing With Transformer

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
EXPY Termination of patent right or utility model

Granted publication date: 20111102
Termination date: 20141121