CN111540103A - Embedded voice video talkback face recognition access control system based on timestamp synchronization method - Google Patents


Publication number
CN111540103A
Authority
CN
China
Prior art keywords
video
access control
face recognition
control system
synchronization method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910863305.8A
Other languages
Chinese (zh)
Inventor
汤茂俊
谢巍
吴伟林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangzhou Shengshiyun Information Technology Co ltd
Original Assignee
Yangzhou Shengshiyun Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-09-12
Filing date
2019-09-12
Publication date
2020-08-14
Application filed by Yangzhou Shengshiyun Information Technology Co ltd
Priority to CN201910863305.8A
Publication of CN111540103A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G10L21/055 Time compression or expansion for synchronising with other signals, e.g. video signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/183 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a single remote source
    • H04N7/186 Video door telephones

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Interconnected Communication Systems, Intercoms, And Interphones (AREA)

Abstract

The invention discloses an embedded voice and video intercom face recognition access control system based on a timestamp synchronization method. The system uses an embedded ARM system as its hardware base. An audio and video signal acquisition module collects voice and video information; the audio signal is filtered to improve the signal-to-noise ratio, and the audio and video signals are transmitted together, providing real-time synchronized video intercom with a client. The system also forms a transmission network with a background server, which performs face recognition on the collected video and thereby controls the access control system. The main structure comprises a CMOS camera, a microphone, an embedded ARM development board, a client and a background server.

Description

Embedded voice video talkback face recognition access control system based on timestamp synchronization method
Technical Field
The invention relates to the technical fields of audio and video image processing, feature matching, machine vision, human-computer interaction and the like, and in particular to an embedded voice and video intercom face recognition access control system based on a timestamp synchronization method.
Background
Video intercom systems mainly transmit image and audio signals, which are digital, so digital signal transmission systems are generally used. Although digital transmission simplifies the system and improves transmission efficiency, it has drawbacks: excessive noise in the audio signal, delay of the video and audio signals, and loss of synchronization between the video and audio signals.
These drawbacks degrade the user experience and may also impair face recognition on the background server.
Summary of the invention
To overcome the shortcomings of existing embedded video and voice intercom access control systems, the invention provides an embedded voice and video intercom face recognition access control system and method based on a timestamp synchronization method.
The invention adopts the following technical scheme:
An embedded voice and video intercom face recognition access control system based on a timestamp synchronization method consists of a CMOS camera, a microphone, an embedded ARM development board, a client and a background server.
The embedded ARM development board filters the audio signal to suppress noise and improve the signal-to-noise ratio, and transmits the collected and synchronized video and audio signals to the client to realize real-time interaction with the user.
The audio signal is filtered with an FIR filter to suppress noise; the FIR method is easy to implement, has good digital characteristics, and its linear phase response does not distort the phase of the signal.
An FIR (finite impulse response) filter has no feedback, is widely used and easy to implement, and can be designed with linear phase (constant group delay), so it introduces no phase distortion.
The impulse transfer function of the FIR (finite impulse response) filter system is:
$$H(z) = \frac{Y(z)}{X(z)} = b_0 + b_1 z^{-1} + b_2 z^{-2} + \dots + b_N z^{-N} = \sum_{r=0}^{N} b_r z^{-r} \tag{1-1}$$
Then
$$Y(z) = \left[ b_0 + b_1 z^{-1} + b_2 z^{-2} + \dots + b_N z^{-N} \right] X(z) \tag{1-2}$$
Taking the inverse Z-transform gives the difference equation
$$y(n) = b_0 x(n) + b_1 x(n-1) + \dots + b_N x(n-N) \tag{1-3}$$
That is,
$$y(n) = \sum_{r=0}^{N} b_r x(n-r) \tag{1-4}$$
Equation (1-4) shows that the output y(n) at the current time is determined by the current and past input values x(n-r), each multiplied by its corresponding coefficient b_r.
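The FIR difference equation can be illustrated with a short Python sketch. This is only a minimal example under stated assumptions, not the implementation used in the invention: the coefficient list `b`, the helper names `lowpass_taps` and `fir_filter`, the Hamming window, and the tap count and cutoff are all choices made here for illustration, since the source does not specify the filter design.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Windowed-sinc low-pass coefficients (Hamming window).

    cutoff is the normalised cutoff in cycles per sample (0 < cutoff < 0.5).
    Both the design method and its parameters are assumptions of this sketch.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for r in range(num_taps):
        k = r - mid
        if k == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * r / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise to unity gain at DC


def fir_filter(b, x):
    """Direct form of y(n) = sum over r of b[r] * x(n - r), with x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for r, coeff in enumerate(b):
            if n - r >= 0:
                acc += coeff * x[n - r]
        y.append(acc)
    return y
```

Because the coefficients produced here are symmetric, the filter has linear phase: every frequency component is delayed by the same (num_taps - 1) / 2 samples, which is the property the text relies on when it says the phase is not distorted.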
The background server processes the video signal, performs face recognition, and ultimately controls the access control door lock.
The client handles the acquisition and synchronized transmission of the video and audio signals and provides real-time interaction with the user.
The video and audio signals are collected and transmitted synchronously using a timestamp synchronization method: the system time serves as the reference timeline, timestamps are created separately for audio frames and video frames according to their characteristics, and the audio and video signals are then synchronized according to these timestamps.
The background server processes the video signal and performs face recognition. The face detection and recognition module is built with OpenCV: a pre-trained fast Haar feature classifier shipped with the library is loaded through a cascade classifier, and the classifier's detection function is called to detect the face of a person facing the camera, thereby realizing the face recognition function.
Because audio and video carry different amounts of information, their acquisition rates naturally differ. In addition, video is encoded in hardware, which is fast, whereas audio is encoded in software, which is relatively slow, so the two streams must be synchronized according to the actual situation. The synchronization method adopted is timestamp synchronization: the system time serves as the reference timeline, timestamps are created separately for audio frames and video frames according to their characteristics, and the audio and video signals are then synchronized according to these timestamps. Audio and video frames are timestamped against the system time, and each timestamp is determined mainly by the characteristics of the data, such as the audio frame rate and the video frame rate.
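The timestamp scheme described above can be sketched in Python as follows. This is an illustrative sketch only: the `Frame` class, the `stamp_frames` and `interleave` helpers, and the choice to derive each timestamp from the frame index and frame duration on a shared origin are assumptions made here, because the source does not give the exact stamping rule.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Frame:
    timestamp_ms: float                  # position on the shared reference timeline
    kind: str = field(compare=False)     # "audio" or "video"
    index: int = field(compare=False)    # frame number within its own stream


def stamp_frames(kind, count, frame_ms, start_ms=0.0):
    """Stamp `count` frames of one stream from its frame duration.

    start_ms is the shared origin; in a real system it would be read once
    from the system clock and used for both streams.
    """
    return [Frame(start_ms + i * frame_ms, kind, i) for i in range(count)]


def interleave(audio, video):
    """Merge two timestamp-ordered streams into one timestamp-ordered sequence."""
    return list(heapq.merge(audio, video))
```

A receiver that consumes the merged sequence in order presents audio and video on the same timeline even though the two streams were produced at different rates.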
Advantageous effects
With the embedded ARM system as the core component, the audio signal is filtered to improve the signal-to-noise ratio, and the video and audio signals are collected and transmitted synchronously to the client, realizing real-time interaction with the user. Meanwhile, the background server processes the video signal to perform face recognition and finally controls the access control door lock, so that the video and voice intercom access control system achieves a good interaction effect.
Drawings
Fig. 1 is a schematic structural diagram of the invention.
Detailed Description
The invention is described in further detail below with reference to examples and drawings, but the invention is not limited thereto.
Referring to Fig. 1, the embedded video and voice intercom face recognition access control system based on the timestamp synchronization method consists of a CMOS camera, a microphone, an embedded ARM development board, a client and a background server. The core processor of the embedded ARM development board is a Samsung Exynos 4412 with a CPU clock frequency of up to 1.5 GHz. The Exynos 4412 integrates a Mali-400 MP high-performance graphics engine that supports smooth 3D graphics, which makes the board well suited to a voice and video intercom system. The microphone is a Mic-in level microphone, and the CMOS camera is an orni USB camera.
The audio and video signal acquisition module collects the voice and video information, and an FIR filter is applied to the audio signal to suppress noise and improve the signal-to-noise ratio.
The synchronization method adopted is timestamp synchronization: the system time serves as the reference timeline, timestamps are created separately for audio frames and video frames according to their characteristics, and the audio and video signals are then synchronized according to these timestamps. The audio and video signals are then transmitted, providing real-time synchronized video intercom with the client and a good user experience.
Each video frame of the system lasts 40 ms, i.e. the frame rate is 25 fps (25 frames are transmitted per second), and the duration of each audio frame is controlled to be 23.32 ms. Synchronizing the audio and video data in practice means controlling the audio and video timestamps so that the difference between them is kept to a minimum.
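To make the frame arithmetic concrete, the following Python sketch pairs each video timestamp with the closest audio timestamp and reports the remaining offset. The frame durations come from the text; the nearest-timestamp pairing rule and the helper names `timestamps`, `nearest_audio` and `pair_streams` are assumptions made for illustration, since the source only states that the timestamp difference should be minimal.

```python
import bisect

VIDEO_FRAME_MS = 40.0    # 25 fps, as stated in the text
AUDIO_FRAME_MS = 23.32   # audio frame duration, as stated in the text


def timestamps(frame_ms, count):
    """Timestamps of `count` consecutive frames starting at 0 ms."""
    return [i * frame_ms for i in range(count)]


def nearest_audio(video_ts, audio_ts):
    """Index of the audio timestamp closest to video_ts.

    audio_ts must be non-empty and sorted in ascending order.
    """
    pos = bisect.bisect_left(audio_ts, video_ts)
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(audio_ts)]
    return min(candidates, key=lambda i: abs(audio_ts[i] - video_ts))


def pair_streams(video_ts, audio_ts):
    """Pair every video timestamp with its closest audio timestamp."""
    return [(v, audio_ts[nearest_audio(v, audio_ts)]) for v in video_ts]
```

With this pairing rule, and as long as the audio stream covers the whole video interval, the offset between a video frame and its matched audio frame never exceeds half an audio frame, that is 11.66 ms for the durations above.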
The background server processes the video signal and performs face recognition. The face detection and recognition module is built with OpenCV: a pre-trained fast Haar feature classifier shipped with the library is loaded through a cascade classifier; the classifier's detection function is called to detect the face of a person facing the camera; face recognition is performed; and the opening and closing of the door is controlled accordingly.
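The detection step can be sketched with OpenCV's Python bindings as below. Several points are assumptions rather than details from the source: the cascade file `haarcascade_frontalface_default.xml` (the text does not name which pre-trained classifier is used), the `cv2.data.haarcascades` path provided by the opencv-python package, the `detectMultiScale` parameter values, and the hypothetical helper `largest_face`. A Haar cascade only locates faces; matching a face to a known identity needs a further recognition step that the source does not detail, so it is not shown.

```python
try:
    import cv2
except ImportError:  # OpenCV is only needed for the detection functions
    cv2 = None


def load_face_detector():
    """Load a pre-trained frontal-face Haar cascade bundled with opencv-python."""
    path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    detector = cv2.CascadeClassifier(path)
    if detector.empty():
        raise RuntimeError("could not load cascade file: " + path)
    return detector


def detect_faces(detector, frame_bgr):
    """Return (x, y, w, h) boxes for faces found in a BGR frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)  # reduce sensitivity to lighting
    return detector.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=5, minSize=(60, 60)
    )


def largest_face(boxes):
    """Pick the box with the largest area, or None when no face was found."""
    if len(boxes) == 0:
        return None
    x, y, w, h = max(boxes, key=lambda b: b[2] * b[3])
    return (int(x), int(y), int(w), int(h))
```

Selecting the largest box is one simple way to focus on the person standing closest to the door camera before passing the cropped face on to recognition.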
The above-mentioned embodiments are intended to be illustrative of the present invention, but the present invention is not limited to the above-mentioned embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (7)

1. An embedded voice and video intercom face recognition access control system based on a timestamp synchronization method, characterized in that the access control system consists of a CMOS camera, a microphone, an embedded ARM development board, a client and a background server.
2. The embedded voice and video intercom face recognition access control system based on the timestamp synchronization method according to claim 1, characterized in that the embedded ARM development board filters the audio signal to suppress noise and improve the signal-to-noise ratio, and transmits the collected and synchronized video and audio signals to the client to realize real-time interaction with the user.
3. The embedded voice and video intercom face recognition access control system based on the timestamp synchronization method according to claim 2, characterized in that the audio signal is filtered with an FIR filter to suppress noise, the FIR method being easy to implement, having good digital characteristics, and having a linear phase response that does not distort the phase.
4. The embedded voice and video intercom face recognition access control system based on the timestamp synchronization method according to claim 1, characterized in that the background server processes the video signal, performs face recognition, and ultimately controls the access control door lock.
5. The embedded voice and video intercom face recognition access control system based on the timestamp synchronization method according to claim 1, characterized in that the client handles the acquisition and synchronized transmission of the video and audio signals and provides real-time interaction with the user.
6. The embedded voice and video intercom face recognition access control system based on the timestamp synchronization method according to claim 5, characterized in that the video and audio signals are collected and transmitted synchronously using the timestamp synchronization method, namely, the system time serves as the reference timeline, timestamps are created separately for audio frames and video frames according to their characteristics, and the audio and video signals are then synchronized according to the timestamps.
7. The embedded voice and video intercom face recognition access control system based on the timestamp synchronization method according to claim 1, characterized in that the background server processes the video signal to perform face recognition; the face detection and recognition module is built with OpenCV, a pre-trained fast Haar feature classifier shipped with the library is loaded through a cascade classifier, and the classifier's detection function is called to detect the face of a person facing the camera, thereby realizing the face recognition function.
CN201910863305.8A 2019-09-12 2019-09-12 Embedded voice video talkback face recognition access control system based on timestamp synchronization method Pending CN111540103A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910863305.8A CN111540103A (en) 2019-09-12 2019-09-12 Embedded voice video talkback face recognition access control system based on timestamp synchronization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910863305.8A CN111540103A (en) 2019-09-12 2019-09-12 Embedded voice video talkback face recognition access control system based on timestamp synchronization method

Publications (1)

Publication Number Publication Date
CN111540103A true CN111540103A (en) 2020-08-14

Family

ID=71976634

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910863305.8A Pending CN111540103A (en) 2019-09-12 2019-09-12 Embedded voice video talkback face recognition access control system based on timestamp synchronization method

Country Status (1)

Country Link
CN (1) CN111540103A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112423075A (en) * 2020-11-11 2021-02-26 广州华多网络科技有限公司 Audio and video timestamp processing method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101159865A (en) * 2007-08-20 2008-04-09 西安联合信息技术股份有限公司 WINCE platform based on audio-video collection and wireless transmission system
CN104717469A (en) * 2015-03-12 2015-06-17 严爱民 Wechat building video intercom door machine
CN105913037A (en) * 2016-04-26 2016-08-31 广东技术师范学院 Face identification and radio frequency identification based monitoring and tracking system
CN208737561U (en) * 2018-09-25 2019-04-12 深圳神目信息技术有限公司 A kind of face welcome hybrid system
CN109740577A (en) * 2019-02-28 2019-05-10 南京信息工程大学 A kind of real-time face based on raspberry pie identifies camera system and its adjustment method again

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination