CN101501564B - video surveillance system and method with combined video and audio recognition - Google Patents



Publication number
CN101501564B
Authority
CN
China
Prior art keywords
video
audio
signal
particular event
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2006800555140A
Other languages
Chinese (zh)
Other versions
CN101501564A (en
Inventor
M·G·基恩兹勒
V·舍伊宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00 Burglar, theft or intruder alarms
    • G08B13/16 Actuation by interference with mechanical vibrations in air or other fluid
    • G08B13/1654 Actuation by interference with mechanical vibrations in air or other fluid using passive vibration detection systems
    • G08B13/1672 Actuation by interference with mechanical vibrations in air or other fluid using passive vibration detection systems using sonic detecting means, e.g. a microphone operating in the audio frequency range
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B13/00 Burglar, theft or intruder alarms
    • G08B13/18 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength
    • G08B13/189 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems
    • G08B13/194 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems
    • G08B13/196 Actuation by interference with heat, light, or radiation of shorter wavelength; Actuation by intruding sources of heat, light, or radiation of shorter wavelength using passive radiation detection systems using image scanning and comparing systems using television cameras
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B19/00 Alarms responsive to two or more different undesired or abnormal conditions, e.g. burglary and fire, abnormal temperature and abnormal rate of flow
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B31/00 Predictive alarm systems characterised by extrapolation or other computation using updated historic data

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Emergency Management (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Burglar Alarm Systems (AREA)
  • Alarm Systems (AREA)

Abstract

A novel video surveillance system comprises a video and audio compression engine, a storage device, and a video and audio recognition engine. The video recognition engine performs tasks such as face recognition and motion detection, whereas the audio recognition engine detects voice and other sound signatures indicating a potential alarm situation, e.g., panic voices such as screaming and yelling, or sounds such as gunshots and explosions. Combined recognition of audio and video signals raises the rate of true alarms and lowers the rate of false alarms of the surveillance system. Additionally, the audio recognition engine provides information for directing video cameras toward the direction of interest, allowing better capture of the scene of interest.

Description

Video surveillance system and method with combined video and audio recognition
Technical field
The invention relates generally to surveillance systems and methods for providing security and, more particularly, to a novel online (real-time) video and audio recognition system and process for surveillance systems.
Background technology
Conventional video surveillance systems generally do not include any function or means for monitoring audio; that is, the surveillance system has no audio input. At most, a typical video surveillance system records both visual and audio information, as do the video surveillance systems described in U.S. Patent Nos. 6,724,421 and 6,175,382. In the two video surveillance systems described in these references, the video data is analyzed by an intelligent surveillance engine and compressed for digital storage. These engines implement various recognition algorithms, such as face recognition, motion detection, panic detection, stabbing motion detection, and so on. For example, when monitoring the entrance of a high-rise building, one alarm condition involves a sudden rapid movement of one person toward another, suggesting a possible robbery, assault, or similar act. In this case, the intelligent surveillance engine will recognize the sudden rapid movement (with a success rate below 100%) and raise an alarm at the monitoring station. As a result of the alarm, police may be dispatched to the monitored location. Clearly, a sudden rapid movement may also be produced by a child running to a parent or friend, in which case the alarm is a false alarm and the police dispatch is wasted. Another consequence of misdetection by the intelligent surveillance engine is that no alarm is raised in a real emergency. This may happen, for example, when more than one person is present at the scene. Failing to dispatch police when a real emergency occurs is another shortcoming of present surveillance systems.
A prior-art video-only surveillance system is shown in Fig. 1. A camera array 10 feeds video information to a video compression engine 12 over a video link 11. The video information is compressed and sent over a link 16 to a storage device 14 for long-term storage. The video information is also fed to a video recognition engine 13 over the same video link 11. The video recognition engine 13 performs video recognition tasks, such as face recognition, motion detection, and so on, and generates events and alarms that are sent over a link 17 to an event database 15 and a monitoring station 18. The monitoring station 18 may be a manned monitoring station, where an operator visually monitors a certain number of cameras in real time. When the operator believes an emergency is occurring, it is the operator's decision whether to dispatch police or another emergency response team to the monitored area. From the above description it is clear that no audio information is used, even though such audio information is usually available in the monitored area.
An existing video surveillance system with audio recording is shown in Fig. 2. A camera array 20 feeds video information to a video and audio compression engine 22 over a video link 21. At the same time, audio information is fed from a microphone array 29 to the video and audio compression engine 22 over an audio link 30. The video and audio information is compressed and sent over a link 26 to a storage device 24 for long-term storage. Likewise, the video information is fed to a video recognition engine 23 over the same video link 21. The video recognition engine 23 performs video recognition tasks, such as face recognition, motion detection, and so on, and generates events and alarms that are sent over a link 27 to a database 25 and a monitoring station 28. The monitoring station 28 is a manned monitoring station, where an operator visually monitors a certain number of cameras. When the operator believes an emergency is occurring, it is the operator's decision whether to dispatch police or another emergency response team to the monitored area. From the above description it is clear that no useful information is extracted from the audio input, even though such information is usually available in the audio signal obtained from the monitored area.
As noted above, the second type of surveillance system records video and audio information simultaneously and implements an intelligent surveillance engine for various video recognition tasks. At present, the audio information in these systems is compressed and recorded but not analyzed.
When analyzing the video input, present surveillance systems do not use the quite valuable audio information. Clearly, this audio information is useful and could be widely used in many surveillance scenarios.
It would therefore be highly desirable to introduce the use of audio information into video surveillance systems. The use of audio information is expected to reduce the number of false alarms the surveillance system generates and to raise the percentage of true alarms detected, while giving the person evaluating an alarm more information. Moreover, some events that cannot be discovered from video information alone can be discovered from combined audio and video information.
Summary of the invention
It is therefore an object of the present invention to provide a video surveillance system and method that use video information in combination with audio information obtained from the monitored area.
The surveillance system of the present invention includes both a video signal input and an audio signal input. The video signal originates from digital or analog video cameras, and the audio input is received from microphones installed in the monitored area. The video and audio information is compressed and sent to a digital storage device. Compression is preferred in order to reduce the amount of digital storage required for all the cameras and microphones deployed. Concurrently with recording, the video and audio inputs are fed to a smart recognition engine, which performs video recognition and audio recognition and instantaneously correlates the results of the video and audio recognition in order to detect or recognize a set of particular events indicating a panic situation, such as high-pitched screams, explosions, gunshots, and the like. Alarms generated by the smart recognition engine can be sent to a monitoring station, where an operator decides whether to dispatch police or emergency personnel to the monitored area.
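The correlation of video and audio recognition results described above can be pictured as pairing events whose timestamps fall close together. The following is a minimal sketch under stated assumptions: the `RecognitionEvent` type, the event labels, and the two-second window are all hypothetical and are not taken from the patent, which does not specify how the correlation is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionEvent:
    source: str       # "video" or "audio"
    label: str        # illustrative labels, e.g. "rapid_motion", "scream"
    timestamp: float  # seconds since the start of monitoring


def correlate_events(video_events, audio_events, window_s=2.0):
    """Pair every video event with each audio event at most window_s seconds away."""
    pairs = []
    for v in video_events:
        for a in audio_events:
            if abs(v.timestamp - a.timestamp) <= window_s:
                pairs.append((v, a))
    return pairs


# Example: only the motion at t=10.0 s has an audio event nearby.
video = [RecognitionEvent("video", "rapid_motion", 10.0),
         RecognitionEvent("video", "rapid_motion", 50.0)]
audio = [RecognitionEvent("audio", "scream", 10.8),
         RecognitionEvent("audio", "gunshot", 30.0)]
matched = correlate_events(video, audio)
```

A rapid motion with no accompanying sound stays unpaired, which is the kind of case (a child running to a parent) that the background section describes as a source of false alarms.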
According to one aspect of the invention, the smart recognition engine executes available video recognition algorithms, such as face recognition, motion detection, and so on, and audio/speech recognition algorithms for recognizing a specific vocabulary ("help", "robbery", etc.). The audio recognition engine can be trained to recognize particular audio signals, such as gunshots and explosions, as well as high pitch and other voice signatures that indicate an alarm or emergency situation.
By using a microphone array arranged along particular orientations, the direction of a sound can be determined. The directional audio information can then be passed to a camera control unit so that one or more cameras are directed toward the direction of interest. Further video/audio recognition can then be carried out more effectively. For example, using the microphone array in the monitored area, the audio recognition engine may detect the sound of an explosion. As a result, cameras are directed toward the explosion, and subsequent actions are carried out in the video engine, such as scene recognition/understanding and alerting the monitoring station. Immediately using the results of video and audio recognition to guide further evaluation of the recorded audio and video, and to guide improved recording of new video and audio input, advantageously improves detection accuracy, shortens the time needed to determine the nature of an alarm, and gives more information to the operator evaluating the situation.
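The patent does not say how the direction of a sound is derived from the microphone array. One common approach, shown here purely as an assumption, is to use the time difference of arrival (TDOA) between two microphones: for a distant source, the bearing relative to the array's broadside satisfies sin(theta) = c * delay / spacing, where c is the speed of sound. The function name and the 343 m/s constant are illustrative.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value in air at about 20 degrees C


def bearing_from_tdoa(delay_s, mic_spacing_m, speed=SPEED_OF_SOUND_M_S):
    """Estimate a far-field source bearing in degrees from broadside.

    delay_s is the arrival-time difference between two microphones that are
    mic_spacing_m apart. This is a two-microphone sketch, not the patent's method.
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    ratio = speed * delay_s / mic_spacing_m
    # Measurement noise can push the ratio slightly outside [-1, 1]; clamp it.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


# A sound arriving at both microphones at the same time lies on the broadside axis.
straight_ahead = bearing_from_tdoa(0.0, 0.5)
```

A real array would combine several microphone pairs to resolve the front/back ambiguity that a single pair leaves open.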
The outputs of the video recognition engine and the audio recognition engine are analyzed by a mutual recognition engine, and as a result a final alarm is generated and transmitted to the monitoring station.
To achieve these and other objects, according to a preferred aspect of the invention, there are provided a surveillance system and method, and a computer program, wherein the system comprises:
means for generating a real-time video signal, the real-time video signal including video information obtained in a monitored area;
means for obtaining a real-time audio signal, the real-time audio signal including audio information from the monitored area;
means for receiving the video signal and the audio signal simultaneously, determining relevant video and audio recognition information from them, and correlating the real-time audio and video information with each other to determine the likelihood that a particular event is occurring; and
means for generating an alarm condition according to the occurrence of the particular event.
Description of drawings
Further features, aspects, and advantages of the structure and method of the present invention will be better understood from the following description, the appended claims, and the accompanying drawings, in which:
Fig. 1 illustrates a video-only surveillance system according to the prior art;
Fig. 2 illustrates a video surveillance system with audio recording capability according to the prior art;
Fig. 3 illustrates a video surveillance system with video and audio recognition according to the present invention; and
Fig. 4 illustrates details of the smart recognition engine according to the present invention.
Detailed description
Fig. 3 illustrates a video surveillance system with video and audio recognition according to the present invention. As shown in Fig. 3, a camera array 40, comprising one or more color or monochrome still or video electronic cameras, for example CCD or CMOS cameras, or an equivalent combination of components for capturing the monitored area, feeds video signals to a digital video and audio compression engine 42 over a video communication link 41. The movement and operation of each camera system of the camera array 40 can be controlled by received control signals, for example under computer and/or software control. In addition, the operating parameters of each camera in the camera array 40, including pan/tilt mirrors, lens system, focus motor, pan motor, and tilt motor controls, are controlled by received control signals, as described in more detail below. Before the digital video signal is output, a number of signal processing techniques may be applied, for example to reduce noise or to provide filtering/image enhancement.
At the same time, a microphone array 49, comprising microphone sensor devices (omnidirectional and/or highly directional microphones) that convert sound pressure into electrical signals, is provided to feed audio information to the digital video and audio compression engine 42 over an audio communication link 50. As those skilled in the art know, the degree of directivity of a microphone array varies with sound frequency, so the number of microphones and the spacing between them can be chosen, taking the required frequency range into account, to provide any given degree of directivity. For example, the microphones in the array can be controlled under software control to achieve these goals, and they include transducers configured with pickup patterns clearly biased toward reception at frequencies in the ranges of, for example, human speech, explosions, and gunshots. This ensures that the microphone array is sensitive to, and responds with higher accuracy to, the sound field of an audio event. Other audio signal conditioning techniques may be used, for example digitizing the acquired analog audio signal with an A/D converter and providing gain control and noise reduction/filtering. The digitized video and audio information is digitally compressed and sent over a link 46 to a storage device 44 for long-term storage, for example a database, hard disk drive, or magnetic or optical media, including (but not limited to) CD-ROM, DVD, tape, disk, disk array, and so on. The output of each camera of the camera array 40 is stored on the storage medium in a compressed format, such as MPEG1, MPEG2, and so on. In addition, the output of each camera of the camera array 40 can be stored at a particular location on the storage medium associated with that camera, or stored together with an indication of which camera each stored output corresponds to.
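On the relation between microphone spacing and frequency range mentioned above: a widely used rule of thumb for a uniformly spaced linear array, which is general array-processing practice and not something this patent states, is to keep adjacent microphones no more than half the shortest wavelength of interest apart, so that direction estimates are not corrupted by spatial aliasing. A minimal sketch, with a hypothetical function name:

```python
def max_mic_spacing_m(max_freq_hz, speed_of_sound_m_s=343.0):
    """Largest spacing (metres) that avoids spatial aliasing up to max_freq_hz.

    Implements the half-wavelength rule d <= c / (2 * f_max) for a uniform
    linear array. The default speed of sound is an approximate value for air.
    """
    if max_freq_hz <= 0:
        raise ValueError("max_freq_hz must be positive")
    return speed_of_sound_m_s / (2.0 * max_freq_hz)


# Covering content up to 3430 Hz calls for microphones about 5 cm apart.
spacing = max_mic_spacing_m(3430.0)
```

Wider spacing sharpens directivity at low frequencies but lowers the highest frequency the array can localize unambiguously, which is one reason the frequency range has to be fixed first.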
As further shown in Fig. 3, the same video and audio information is also fed simultaneously to a smart recognition engine 43 over the corresponding video link 41 and audio link 50. It should be understood that the communication links 41 and 50 between the camera array and the microphone array, respectively, and the video and audio compression engine 42 and the smart recognition engine 43 may be hard-wired or may be wireless links. It is also within the scope of the invention for these communication links to take the form of cable, satellite, RF and microwave transmission, optical fiber, and so on.
As described in more detail below and as shown in Fig. 4, the smart recognition engine 43 comprises a video recognition engine 62, an audio recognition engine 63, and a mutual recognition engine and alarm generation module 64. The smart recognition engine 43 implements software that controls a computing device to carry out methods and processes for executing video recognition algorithms and face recognition algorithms. These algorithms can be executed together with motion detection algorithms (for example, tracking individual points), with known correlation or tracking algorithms that estimate the motion of features (patches) in an image stream, and the like. In addition, the smart recognition engine 43 implements software that controls the computing device to carry out methods and processes for executing audio recognition and speech recognition algorithms. Speech recognition algorithms, embodied as computer-readable instructions, data structures, program modules, and so on, can be used to recognize particular spoken words ("help", "robbery", etc.) that potentially indicate an emergency or a condition warranting an alarm.
The audio recognition engine 63, comprising computer-readable instructions, data structures, program modules, or other data, can be trained to recognize particular audio signals, such as gunshots and explosions, as well as high-pitched sounds, for example screams and cries of fear, and other sound and voice characteristics associated with events known to potentially warrant an alarm. It should be understood, however, that various recognition algorithms that do not require prior training may also be employed according to the invention.
The computing device used includes a general-purpose computing device such as a PC, laptop, or mobile device, having components including (but not limited to) a processing unit, a system memory, and a system bus that couples the various system components, including the system memory, to the processing unit. The computing device uses these components to execute the smart recognition engine and audio recognition engine stored on known computer-readable media, which include any available media that can be accessed by the computing device, including removable media, non-removable media, volatile media, and non-volatile media. The computer-readable recording medium may, for example, be concentrated at one location or distributed over computer systems connected through a network, and the computer-readable recognition algorithms may be stored on the computer-readable recording medium and executed in a distributed manner.
Returning to Fig. 3, by using a microphone array 49 with a particular orientation, the direction of a sound can be determined. Directional information about a sensed audio event is passed to a camera/microphone control unit 52 over a wired or wireless communication link 53. The camera/microphone control unit 52 includes all the software needed to aim one or more cameras of the array 40 by means of control signals 54 and to control the motors that position the microphone array 49. For example, control signals can be input to the camera array 40 to adjust or control the camera pan/tilt mirrors, lens system, focus motor, pan motor, and tilt motor assemblies and subsystems. These control signals are also used to automatically aim the camera's field of view in order to obtain more information about an actual alarm or alert event, such as a better-centered, more magnified, better-focused, or sharper image. In one non-limiting example, in response to the smart recognition engine's audio recognition of a gunshot audio signal, control signals can be generated that direct one or more cameras of the camera array at the scene to "look" in the direction of the gunshot. If the camera array is aimed at the crime scene on the basis of the audio recognition of the gunshot, recognition of the "crime scenario" can be improved, since more information about the gunshot can be obtained. Alternatively or additionally, such control signals can be generated to automatically adjust the orientation of the microphones and the spacing between them so as to better receive the accompanying audio information. Moreover, the microphone orientation can be adjusted in view of detecting audio signals in a required frequency range or providing any given degree of directivity. Thus, for example, in response to a video recognition event, one or more microphones can be redirected to "listen" in a particular direction.
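As an illustration of how directional information might be turned into a camera control signal, the sketch below computes the shortest pan rotation that points a camera at a reported sound bearing. The `PanCommand` type, the function name, and the convention that bearings are compass-style degrees in [0, 360) are assumptions made for this example; the patent does not define the format of control signals 54.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PanCommand:
    camera_id: int
    pan_delta_deg: float  # positive = rotate clockwise, negative = counter-clockwise


def pan_command(camera_id, current_pan_deg, target_bearing_deg):
    """Return the smallest signed rotation that aims the camera at the target bearing."""
    # Wrap the difference into [-180, 180) so the camera never turns the long way round.
    delta = (target_bearing_deg - current_pan_deg + 180.0) % 360.0 - 180.0
    return PanCommand(camera_id, delta)


# A camera facing 350 degrees needs only a 20 degree clockwise turn to face 10 degrees.
cmd = pan_command(1, 350.0, 10.0)
```

A fuller controller would also clamp the command to the mechanical limits of the pan motor and issue tilt and zoom adjustments in the same way.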
More specifically, as shown in Fig. 4, the outputs of the video recognition engine 62 and the audio recognition engine 63 are analyzed by the mutual recognition engine 64, which processes the video and audio recognition information received simultaneously and finally determines whether an alarm condition exists. In this way, an alarm can be generated and transmitted to a manned monitoring station 48 over a communication link 47. The recognition process used in the mutual recognition engine 64, embodied as computer-readable instructions, data structures, program modules, and so on, is generally based on pattern matching and/or hypothesis evaluation. In the evaluation stage, estimates of the probabilities of various events are determined. This can be done by determining, from the real-time video recognition information and the audio signal, what degree of correlation exists between each recognized video scene and the accompanying recognized speech or audio features. In one example recognition event, to recognize a stabbing, the video information is used to try to estimate the probability of each video scene. If such a scene is known to be accompanied by high-pitched sounds (screams, etc.), then detecting high pitch in the audio input can increase the probability that it results from the stabbing captured in the video signal. The operator visually monitors the particular areas covered by the camera array 40, and when the alarm generation unit provides an alarm indication, whether to dispatch police or emergency personnel to the monitored area is at the operator's discretion. From the above description it is clear that useful information is extracted from the audio input, and combining it with video recognition events improves the overall operation of the surveillance system.
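The stabbing example, where a detected scream raises the probability assigned to a video scene, can be illustrated with a Bayesian odds update. This is only one possible reading of "hypothesis evaluation": the patent does not name a formula, and the function name, the prior of 0.4, and the likelihood ratio of 6 below are invented for the illustration.

```python
def fuse_probability(p_video, audio_likelihood_ratio):
    """Update a video-only event probability with an audio cue.

    audio_likelihood_ratio is P(audio cue | event) / P(audio cue | no event);
    values above 1 raise the probability and values below 1 lower it.
    """
    if not 0.0 < p_video < 1.0:
        raise ValueError("p_video must be strictly between 0 and 1")
    if audio_likelihood_ratio <= 0.0:
        raise ValueError("audio_likelihood_ratio must be positive")
    prior_odds = p_video / (1.0 - p_video)
    posterior_odds = prior_odds * audio_likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Video alone rates the scene at 0.4; a scream six times more likely during a
# real attack than otherwise lifts the combined estimate.
p_combined = fuse_probability(0.4, 6.0)
```

An alarm threshold applied to the combined estimate would then fire for a rapid motion accompanied by a scream while staying silent for the same motion heard alongside ordinary speech.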
As also shown in Fig. 4, the communication link 60 between the video recognition engine 62 and the mutual recognition engine 64 is bidirectional, as is the communication link 61 between the audio recognition engine 63 and the mutual recognition engine 64. The bidirectionality of links 60 and 61 allows the video and audio recognition algorithms to influence each other in the manner described above, so that video and audio are recognized better and particular events that could not be detected until now may become detectable.
Although the invention has been shown and described in detail with respect to illustrative and preferred embodiments, those skilled in the art will understand that the foregoing and other changes in form and detail may be made without departing from the spirit and scope of the invention, which should be limited only by the scope of the appended claims.

Claims (15)

1. A surveillance system using video and audio recognition, comprising:
means for generating a real-time video signal, the real-time video signal including video information obtained in a monitored area, wherein the means for generating the real-time video signal comprises one or more camera systems;
means for obtaining a real-time audio signal, the real-time audio signal including audio information from the monitored area;
means for receiving the video signal and the audio signal simultaneously, determining relevant video and audio recognition information from them, and correlating the audio and video recognition information with each other to determine the likelihood that a particular event is occurring; and
means for generating an alarm condition according to the occurrence of the particular event;
wherein the means for correlating the audio and video recognition information with each other to determine the likelihood of a particular event is further configured, in response to recognizing the occurrence of the particular event from the audio recognition of the particular event, to generate a control signal directing one or more of the camera systems to capture the video signal in the direction of the particular event.
2. The system according to claim 1, wherein the means for receiving the video signal and the audio signal simultaneously, determining relevant video and audio recognition information from them, and correlating the audio and video recognition information with each other to determine the likelihood of a particular event comprises a first recognition engine for processing the video signal to determine the video recognition information.
3. The system according to claim 2, wherein the means for receiving the video signal and the audio signal simultaneously, determining relevant video and audio recognition information from them, and correlating the audio and video recognition information with each other to determine the likelihood of a particular event comprises a second recognition engine for processing the audio signal to determine the audio recognition information.
4. The system according to claim 1, wherein each camera system comprises one or more of a pan/tilt mirror, a lens system, a focus motor, a pan motor, and a tilt motor assembly that, in response to the control signal, adjust one or more of the pan, tilt, zoom, rotation, shift, and translation control parameters of the camera system.
5. The system according to claim 1, wherein the means for obtaining the real-time audio signal comprises one or more microphone devices, and the means for correlating the audio and video recognition information with each other to determine the likelihood of a particular event further comprises means for generating, in response to recognizing the occurrence of a possible event from the video recognition of the particular event, a control signal directing the one or more microphone devices so that the audio signal can be captured in the direction of the particular event.
6. The system according to claim 5, wherein each microphone device, in response to the control signal, automatically adjusts its orientation in view of detecting audio signals in a required frequency range.
7. The system according to claim 5, wherein each microphone device, in response to the control signal, automatically adjusts its orientation in view of receiving audio signals with any given degree of directivity.
8. The system according to claim 1, further comprising means for storing the audio and video signals.
9. The system according to claim 8, further comprising means for compressing the audio and video signals before they are stored in the means for storing the audio and video signals.
10. A surveillance method using video and audio recognition, comprising the steps of:
receiving, at a processing device, a real-time video signal and a real-time audio signal simultaneously, the real-time video signal including video information obtained in a monitored area and the real-time audio signal including audio information from the monitored area, wherein the real-time video signal is provided by one or more camera systems;
determining relevant video recognition and audio recognition information from the received video and audio signals;
correlating the real-time audio and video recognition information with each other to determine the likelihood that a particular event is occurring;
in response to recognizing the occurrence of the particular event from the audio recognition of the particular event, generating a control signal directing one or more of the camera systems to capture the video signal in the direction of the particular event; and
generating an alarm condition according to the occurrence of the particular event.
11. The surveillance method according to claim 10, wherein said processing device comprises a first recognition engine that performs the step of determining said video recognition information from said video signal.
12. The surveillance method according to claim 11, wherein said processing device comprises a second recognition engine that performs the step of determining said audio recognition information from said audio signal.
13. The surveillance method according to claim 10, wherein each of said one or more camera systems comprises one or more of one or more pan/tilt mirrors, a lens system, a focus motor, a pan motor, and a tilt motor assembly, responsive to said control signal, for adjusting the pan, tilt, zoom, rotation, dolly, and translation control parameters of the camera system.
14. The surveillance method according to claim 10, further comprising the step of storing said audio and video signals in a data storage device.
15. The surveillance method according to claim 14, further comprising the step of compressing said audio and video signals before storing them in said data storage device.
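The method steps of claims 10, 14 and 15 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claims do not prescribe a particular correlation rule, so the noisy-OR fusion, the thresholds, the `Recognition` structure, and the function names `correlate`, `camera_control`, and `monitor_step` are all hypothetical choices made for illustration. The recognition engines themselves are assumed to have already produced an event label, a confidence, and a bearing.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recognition:
    """Result from one recognition engine (hypothetical structure)."""
    event: str          # label of the candidate event, e.g. "glass_break"
    confidence: float   # engine's confidence in the range 0.0 to 1.0
    bearing_deg: float  # estimated direction of the event, in degrees


def correlate(video: Recognition, audio: Recognition) -> float:
    """Fuse both confidences into one likelihood for the event.

    Assumption: noisy-OR when both engines report the same event, else 0.0.
    """
    if video.event != audio.event:
        return 0.0
    return 1.0 - (1.0 - video.confidence) * (1.0 - audio.confidence)


def camera_control(audio: Recognition, threshold: float = 0.6) -> Optional[dict]:
    """Build a pan command toward the sound when audio recognition flags the event."""
    if audio.confidence < threshold:
        return None
    return {"command": "pan", "bearing_deg": audio.bearing_deg % 360.0}


def monitor_step(video: Recognition, audio: Recognition,
                 video_bytes: bytes, audio_bytes: bytes,
                 alarm_threshold: float = 0.8) -> dict:
    """Run one pass of the claimed steps on already-recognized inputs."""
    likelihood = correlate(video, audio)
    return {
        "likelihood": likelihood,
        "control": camera_control(audio),
        "alarm": likelihood >= alarm_threshold,
        # Claims 14 and 15: compress the signals before storing them.
        "stored_video": zlib.compress(video_bytes),
        "stored_audio": zlib.compress(audio_bytes),
    }


result = monitor_step(
    Recognition("glass_break", 0.5, 85.0),
    Recognition("glass_break", 0.8, 450.0),
    video_bytes=b"\x00" * 1024,
    audio_bytes=b"\x01" * 1024,
)
print(result["likelihood"], result["control"], result["alarm"])
```

In this sketch the camera command depends only on the audio result, mirroring the claim language that the control signal is generated in response to the audio recognition, while the alarm depends on the fused likelihood. A real system would likely replace the fixed thresholds with values tuned to the deployment.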
CN2006800555140A 2006-08-03 2006-08-03 video surveillance system and method with combined video and audio recognition Active CN101501564B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2006/030560 WO2008016360A1 (en) 2006-08-03 2006-08-03 Video surveillance system and method with combined video and audio recognition

Publications (2)

Publication Number Publication Date
CN101501564A CN101501564A (en) 2009-08-05
CN101501564B true CN101501564B (en) 2012-02-08

Family

ID=38997456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2006800555140A Active CN101501564B (en) 2006-08-03 2006-08-03 video surveillance system and method with combined video and audio recognition

Country Status (6)

Country Link
JP (1) JP5043940B2 (en)
CN (1) CN101501564B (en)
BR (1) BRPI0621897B1 (en)
CA (1) CA2656268A1 (en)
MX (1) MX2009001254A (en)
WO (1) WO2008016360A1 (en)

Families Citing this family (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2466242B (en) * 2008-12-15 2013-01-02 Audio Analytic Ltd Sound identification systems
US9286911B2 (en) 2008-12-15 2016-03-15 Audio Analytic Ltd Sound identification systems
CN102082948B (en) * 2009-11-30 2012-07-25 中国移动通信集团北京有限公司 System, method and equipment for acquiring video information
CN103067655A (en) * 2011-10-24 2013-04-24 鸿富锦精密工业(深圳)有限公司 System and method of controlling video camera device
CN103136899B (en) * 2013-01-23 2016-01-20 宁凯 Based on the intelligent alarm method for supervising of Kinect somatosensory device
JP5958833B2 (en) * 2013-06-24 2016-08-02 パナソニックIpマネジメント株式会社 Directional control system
CN103747217A (en) * 2014-01-26 2014-04-23 国家电网公司 Video monitoring method and device
EP2927885A1 (en) * 2014-03-31 2015-10-07 Panasonic Corporation Sound processing apparatus, sound processing system and sound processing method
US10182280B2 (en) 2014-04-23 2019-01-15 Panasonic Intellectual Property Management Co., Ltd. Sound processing apparatus, sound processing system and sound processing method
EP2938097B1 (en) * 2014-04-24 2017-12-27 Panasonic Corporation Sound processing apparatus, sound processing system and sound processing method
CN105338294A (en) * 2014-08-07 2016-02-17 富士通株式会社 Monitoring device and method
CN104269016A (en) * 2014-09-22 2015-01-07 北京奇艺世纪科技有限公司 Alarm method and device
CN104333686B (en) * 2014-11-27 2018-03-27 天地伟业技术有限公司 Intelligent monitoring camera and its control method based on face and Application on Voiceprint Recognition
US9813484B2 (en) 2014-12-31 2017-11-07 Motorola Solutions, Inc. Method and apparatus analysis of event-related media
US20160241818A1 (en) * 2015-02-18 2016-08-18 Honeywell International Inc. Automatic alerts for video surveillance systems
JP6682222B2 (en) 2015-09-24 2020-04-15 キヤノン株式会社 Detecting device, control method thereof, and computer program
US9598076B1 (en) * 2015-10-22 2017-03-21 Ford Global Technologies, Llc Detection of lane-splitting motorcycles
CN105491336B (en) * 2015-12-08 2018-07-06 成都芯软科技发展有限公司 A kind of low power image identification module
CN106028217B (en) * 2016-06-20 2020-01-21 咻羞科技(深圳)有限公司 Intelligent equipment interaction system and method based on audio recognition technology
CN106023515A (en) * 2016-07-06 2016-10-12 中警科技(江苏)开发有限公司 Remote automatic alarm police kiosk
WO2018075068A1 (en) 2016-10-21 2018-04-26 Empire Technology Development Llc Selecting media from mass social monitoring devices
CN106600876A (en) * 2017-01-24 2017-04-26 赵亮 Automatic machine room duty alarming system and alarming method
US10810854B1 (en) 2017-12-13 2020-10-20 Alarm.Com Incorporated Enhanced audiovisual analytics
CN109033997A (en) * 2018-07-02 2018-12-18 厦门快商通信息技术有限公司 A kind of lumbering event detecting method and system
CN112425157A (en) * 2018-07-24 2021-02-26 索尼公司 Information processing apparatus and method, and program
CN109089087B (en) * 2018-10-18 2020-09-29 广州市盛光微电子有限公司 Multi-channel audio-video linkage device
CN109543538A (en) * 2018-10-23 2019-03-29 深圳壹账通智能科技有限公司 Obtain method, apparatus, computer equipment and the storage medium of the track of alert object
TWI687753B (en) * 2018-12-06 2020-03-11 宏碁股份有限公司 Panoramic camera and panoramic photography system
CN110336976A (en) * 2019-06-13 2019-10-15 长江大学 A kind of intelligent monitoring probe and system
CN111091073A (en) * 2019-11-29 2020-05-01 清华大学 Abnormal event monitoring equipment and method combining video and audio
EP3839909A1 (en) * 2019-12-18 2021-06-23 Koninklijke Philips N.V. Detecting the presence of an object in a monitored environment
CN111460907B (en) * 2020-03-05 2023-06-20 浙江大华技术股份有限公司 Malicious behavior identification method, system and storage medium
DE102020209025A1 (en) * 2020-07-20 2022-01-20 Robert Bosch Gesellschaft mit beschränkter Haftung Method for determining a conspicuous partial sequence of a surveillance image sequence
CN111818237A (en) * 2020-07-21 2020-10-23 南京智金科技创新服务中心 Video monitoring analysis system and method
CN112396801A (en) * 2020-11-16 2021-02-23 苏州思必驰信息科技有限公司 Monitoring alarm method, monitoring alarm device and storage medium
GB202019713D0 (en) * 2020-12-14 2021-01-27 Vaion Ltd Security system
CN112929372A (en) * 2021-02-06 2021-06-08 北京第七九七音响股份有限公司 Network intelligent audio terminal, monitoring method and monitoring system
CN113920660B (en) * 2021-09-30 2023-04-18 中国工商银行股份有限公司 Safety monitoring method and system suitable for safety storage equipment
GB2620594A (en) * 2022-07-12 2024-01-17 Ava Video Security Ltd Computer-implemented method, security system, video-surveillance camera, and server

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6175382B1 (en) * 1997-11-24 2001-01-16 Shell Oil Company Unmanned fueling facility
CN1449186A (en) * 2003-04-03 2003-10-15 上海交通大学 Abnormal object automatic finding and tracking video camera system
CN1527992A (en) * 2001-03-15 2004-09-08 皇家菲利浦电子有限公司 Automatic system for monitoring independent person requiring occasional assistance
CN1716329A (en) * 2004-06-29 2006-01-04 乐金电子(沈阳)有限公司 Baby monitoring system and its method using baby's crying frequency
CN200966113Y (en) * 2006-11-08 2007-10-24 天津三星电子有限公司 A monitor with the audio locking functions

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3381343B2 (en) * 1993-12-03 2003-02-24 株式会社日立製作所 Monitoring device
JPH0983856A (en) * 1995-09-07 1997-03-28 Nippon Telegr & Teleph Corp <Ntt> Intelligent camera equipment
JP4175180B2 (en) * 2003-05-29 2008-11-05 松下電工株式会社 Monitoring and reporting system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6175382B1 (en) * 1997-11-24 2001-01-16 Shell Oil Company Unmanned fueling facility
CN1527992A (en) * 2001-03-15 2004-09-08 皇家菲利浦电子有限公司 Automatic system for monitoring independent person requiring occasional assistance
CN1449186A (en) * 2003-04-03 2003-10-15 上海交通大学 Abnormal object automatic finding and tracking video camera system
CN1716329A (en) * 2004-06-29 2006-01-04 乐金电子(沈阳)有限公司 Baby monitoring system and its method using baby's crying frequency
CN200966113Y (en) * 2006-11-08 2007-10-24 天津三星电子有限公司 A monitor with the audio locking functions

Also Published As

Publication number Publication date
BRPI0621897A2 (en) 2011-03-29
WO2008016360A1 (en) 2008-02-07
MX2009001254A (en) 2009-02-11
CA2656268A1 (en) 2008-02-07
JP2009545911A (en) 2009-12-24
JP5043940B2 (en) 2012-10-10
CN101501564A (en) 2009-08-05
BRPI0621897B1 (en) 2018-03-20

Similar Documents

Publication Publication Date Title
CN101501564B (en) video surveillance system and method with combined video and audio recognition
US20060227237A1 (en) Video surveillance system and method with combined video and audio recognition
CN109300471B (en) Intelligent video monitoring method, device and system for field area integrating sound collection and identification
KR101445367B1 (en) Intelligent cctv system to recognize emergency using unusual sound source detection and emergency recognition method
CN102737480B (en) Abnormal voice monitoring system and method based on intelligent video
CN101119482B (en) Overall view monitoring method and apparatus
CN111601074A (en) Security monitoring method and device, robot and storage medium
CN101123722B (en) Panorama video intelligent monitoring method and system
WO1997008896A1 (en) Open area security system
KR101864388B1 (en) Peculiar sound detection system and method to be cancelled out the noise by using array microphone in CCTV camera system
CN102176746A (en) Intelligent monitoring system used for safe access of local cell region and realization method thereof
KR101899436B1 (en) Safety Sensor Based on Scream Detection
CN109551500A (en) Supervisory control of robot alarm system
JP2012048689A (en) Abnormality detection apparatus
KR101444843B1 (en) System for monitoring image and thereof method
CN116129490A (en) Monitoring device and monitoring method for complex environment behavior recognition
Park et al. Sound learning–based event detection for acoustic surveillance sensors
CN105474665A (en) Sound processing apparatus, sound processing system, and sound processing method
KR102034176B1 (en) Emergency Situation Perception Method by Voice Recognition, and Managing Server Used Therein
CN113781702B (en) Cash box management method and system based on internet of things
KR100902275B1 (en) Cctv system for intelligent security and method thereof
CN114286059A (en) Wireless video monitoring system for kindergarten
KR102319687B1 (en) Surveillance system adopting wireless acoustic sensors
CN111627178A (en) Sound identification positioning warning system and method thereof
JP2004357014A (en) Monitor report system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant