CN115334289A - Audio video processing system, method, device, equipment and storage medium

Publication number
CN115334289A
Authority
CN
China
Prior art keywords
target
abnormal event
signal
video
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210964134.XA
Other languages
Chinese (zh)
Inventor
曾亮
涂贤玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing SoundAI Technology Co Ltd
Original Assignee
Beijing SoundAI Technology Co Ltd
Application filed by Beijing SoundAI Technology Co Ltd
Priority to CN202210964134.XA
Publication of CN115334289A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast, for receiving images from a plurality of remote sources
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/14: Picture signal circuitry for video frequency region

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Burglar Alarm Systems (AREA)

Abstract

The application provides an audio and video processing system, method, apparatus, device, and storage medium, and belongs to the technical field of monitoring. The system comprises an audio capture device, a video capture device, and a control device. The audio capture device is configured to capture an audio signal of a target area and send the audio signal to the control device. The video capture device is configured to perform video monitoring on the target area so as to capture a video signal of the target area, and to send the video signal to the control device. The control device is configured to receive the audio signal and the video signal, perform abnormal event detection based on the received signals, acquire event information of a target abnormal event if the target abnormal event is detected in the target area based on either type of signal, and monitor the target abnormal event based on that event information. The system improves the accuracy of detecting abnormal events.

Description

Audio video processing system, method, device, equipment and storage medium
Technical Field
The present application relates to the field of monitoring technologies, and in particular, to an audio/video processing system, method, apparatus, device, and storage medium.
Background
Video monitoring is a primary monitoring means: it monitors a designated area and allows abnormal events (such as car accidents) to be detected from the monitored data. However, video capture devices used for video monitoring have a narrow viewing angle and are easily occluded, so the monitoring effect is poor and abnormal events are detected with low accuracy.
Disclosure of Invention
The embodiment of the application provides an audio and video processing system, method, device, equipment and storage medium, which can improve the accuracy of detecting abnormal events. The technical scheme is as follows:
In one aspect, an audio and video processing system is provided, the system comprising an audio capture device, a video capture device, and a control device;
the audio capture device is configured to capture an audio signal of a target area and send the audio signal to the control device;
the video capture device is configured to perform video monitoring on the target area so as to capture a video signal of the target area, and to send the video signal to the control device;
the control device is configured to receive the audio signal and the video signal, perform abnormal event detection based on the received signals, acquire event information of a target abnormal event if the target abnormal event is detected in the target area based on either type of signal, and monitor the target abnormal event based on the event information of the target abnormal event.
In some embodiments, when performing abnormal event detection based on the received signals, the control device is configured to: perform abnormal event detection based on the audio signal and the video signal in a case where it is determined that the audio signal does not match a reference audio signal and the video signal does not match a reference video signal;
perform abnormal event detection based on the video signal in a case where it is determined that the audio signal matches the reference audio signal and the video signal does not match the reference video signal; and
perform abnormal event detection based on the audio signal in a case where it is determined that the audio signal does not match the reference audio signal and the video signal matches the reference video signal; wherein the reference audio signal is an audio signal of the target area captured when the target abnormal event does not occur in the target area, and the reference video signal is a video signal of the target area captured when the target abnormal event does not occur in the target area.
In some embodiments, when performing abnormal event detection based on the audio signal in a case where it is determined that the audio signal does not match the reference audio signal and the video signal matches the reference video signal, the control device is configured to: determine an occurrence position of the target abnormal event based on the audio signal;
acquire a re-captured video signal based on the occurrence position of the target abnormal event; and
perform abnormal event detection based on the re-captured video signal.
In some embodiments, the system further comprises a plurality of standby video capture devices configured to perform video monitoring on different positions of the target area;
when acquiring the re-captured video signal based on the occurrence position of the target abnormal event and performing abnormal event detection based on the re-captured video signal, the control device is configured to acquire a video signal captured by a target standby video capture device; and
perform abnormal event detection based on the video signal captured by the target standby video capture device, wherein the position monitored by the target standby video capture device is the same as the occurrence position of the target abnormal event.
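As an illustration only, selecting the target standby video capture device could be sketched as follows. The text requires the monitored position to be the same as the occurrence position; the sketch relaxes this to "nearest camera within a tolerance", which is an assumption, as are the 2-D coordinates, the `max_distance` parameter, and every name used.

```python
import math
from typing import Dict, Optional, Tuple

Position = Tuple[float, float]  # hypothetical 2-D coordinates within the target area


def pick_standby_camera(
    event_position: Position,
    cameras: Dict[str, Position],
    max_distance: float = 5.0,  # assumed tolerance; not specified in the text
) -> Optional[str]:
    """Return the id of the standby camera whose monitored position is closest
    to the event position, or None if no camera is within max_distance."""
    nearest = min(
        cameras,
        key=lambda cam_id: math.dist(cameras[cam_id], event_position),
        default=None,
    )
    if nearest is None:
        return None
    if math.dist(cameras[nearest], event_position) > max_distance:
        return None
    return nearest
```

The control device would then run abnormal event detection on the video signal of the returned camera; a `None` result means no standby device covers the occurrence position.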
In some embodiments, the control device, when acquiring a reacquired video signal based on the occurrence position of the target abnormal event and performing abnormal event detection based on the reacquired video signal, is configured to control the position monitored by the video capture device to be adjusted to the occurrence position of the target abnormal event, so that the video capture device captures the video signal of the occurrence position of the target abnormal event;
and detecting abnormal events based on the video signals re-acquired by the video acquisition equipment.
In some embodiments, the target abnormal event comprises a plurality of stages of development, and the event information comprises the occurrence position of the target abnormal event at a current stage of the plurality of stages of development;
the control device is used for predicting the occurrence position of the target abnormal event at the next stage based on the occurrence position of the target abnormal event at the current stage when monitoring the target abnormal event based on the event information of the target abnormal event;
and monitoring the target abnormal event based on the occurrence position of the target abnormal event in the next stage.
In some embodiments, the control device, when predicting the occurrence position of the target abnormal event at the next stage based on the occurrence position of the target abnormal event at the current stage, is configured to acquire event information of a historical abnormal event, the event information of the historical abnormal event including the occurrence positions of the historical abnormal event at a plurality of stages of development;
in response to the matching of the occurrence position of the historical abnormal event in a first historical stage and the occurrence position of the target abnormal event in the current stage, taking the occurrence position of the historical abnormal event in a second historical stage as the occurrence position of the target abnormal event in the next stage, wherein the first historical stage is any one stage except the last stage in a plurality of development stages of the historical abnormal event, and the second historical stage is the next stage of the first historical stage.
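The history-based prediction described above can be sketched as follows. This is a minimal sketch under stated assumptions: positions are 2-D coordinates, two positions "match" when they differ by at most a tolerance on each axis, and the first matching historical event wins. None of these details is specified in the text, and all names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Position = Tuple[float, float]  # hypothetical 2-D coordinates


def positions_match(a: Position, b: Position, tolerance: float = 1.0) -> bool:
    """Assumed matching rule: each coordinate differs by at most `tolerance`."""
    return abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance


def predict_next_position(
    current: Position,
    history: Sequence[Sequence[Position]],
    tolerance: float = 1.0,
) -> Optional[Position]:
    """Each entry of `history` lists one historical event's positions, stage by stage.

    If a historical stage other than the last one matches the current position,
    the position of the following historical stage is returned as the prediction.
    """
    for stages in history:
        # The "first historical stage" may be any stage except the last one.
        for i in range(len(stages) - 1):
            if positions_match(stages[i], current, tolerance):
                return stages[i + 1]  # the "second historical stage"
    return None
```

Returning `None` when nothing matches leaves the fallback behaviour to the caller, since the text does not say what happens in that case.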
In some embodiments, when performing abnormal event detection based on the received signals, the control device is configured to determine that a target abnormal event is detected if either type of signal is determined to match a target signal in an abnormal event library, the abnormal event library being configured to store audio signals and video signals of a plurality of abnormal events, the target signal including an audio signal and/or a video signal of the target abnormal event; the target abnormal event comprises any one of the plurality of abnormal events.
In some embodiments, the control device, when determining that any type of signal matches a target signal in the anomaly event library, is configured to determine similarities between the any type of signal and signals in the anomaly event library, respectively, and in a case that the similarities between the any type of signal and the target signal meet a similarity condition, determine that the any type of signal matches the target signal.
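A minimal sketch of the similarity-based library lookup follows. The text does not name a similarity measure or a similarity condition; cosine similarity over feature vectors and a fixed threshold of 0.9 are assumptions chosen for illustration, and the function and parameter names are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_event(
    features: Sequence[float],
    library: Dict[str, Sequence[float]],
    threshold: float = 0.9,  # assumed similarity condition
) -> Optional[str]:
    """Compare the signal's features with every library entry and return the
    best-matching event whose similarity meets the threshold, else None."""
    best_event: Optional[str] = None
    best_score = threshold
    for event, reference in library.items():
        score = cosine_similarity(features, reference)
        if score >= best_score:
            best_event, best_score = event, score
    return best_event
```

For example, with a library of `{"fire": [1, 0, 0], "crash": [0, 1, 0]}`, the vector `[0.9, 0.1, 0]` is matched to `"fire"`, while `[1, 1, 0]` is equally distant from both entries and falls below the threshold.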
In some embodiments, the target signal corresponds to a plurality of abnormal events;
when determining that the target abnormal event is detected, the control device is configured to determine that the target abnormal event is detected in a case where it is determined, based on the target signal and a Bayesian probability, that the signal of either type is a signal of the target abnormal event among the plurality of abnormal events.
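The text does not say how the Bayesian probability is computed. One plausible reading, sketched below purely as an assumption, is to apply Bayes' rule over the candidate events that share the target signal and pick the event with the highest posterior. The prior and likelihood values would have to come from historical data, and all names are hypothetical.

```python
from typing import Dict


def posterior(priors: Dict[str, float], likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(event | signal) = P(signal | event) * P(event) / P(signal)."""
    joint = {event: priors[event] * likelihoods[event] for event in priors}
    evidence = sum(joint.values())  # P(signal), summed over the candidate events
    if evidence == 0.0:
        raise ValueError("the signal has zero probability under every candidate event")
    return {event: value / evidence for event, value in joint.items()}


def most_probable_event(priors: Dict[str, float], likelihoods: Dict[str, float]) -> str:
    """Return the candidate event with the highest posterior probability."""
    post = posterior(priors, likelihoods)
    return max(post, key=post.get)
```

With priors `{"fire": 0.2, "crash": 0.8}` and likelihoods `{"fire": 0.9, "crash": 0.1}`, the joint probabilities are 0.18 and 0.08, so the posterior for "fire" is 0.18 / 0.26 and "fire" is selected even though it is the rarer event.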
In some embodiments, the target abnormal event occurs multiple times;
after detecting, based on either type of signal, that the target abnormal event occurs in the target area, the control device is configured to perform event statistics based on the event information of the multiple occurrences of the target abnormal event, and to generate an analysis report of the target abnormal event based on the statistical results, wherein the analysis report comprises at least one of an occurrence time chart, an occurrence position chart, and an occurrence frequency chart of the target abnormal event.
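The statistics behind such a report could be gathered as in the sketch below. The record fields, the choice of hourly and daily buckets, and the names `EventRecord` and `summarize_events` are assumptions; the text only states which charts the report may contain.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical event information for one occurrence of the target abnormal event."""
    occurred_at: datetime
    location: str


def summarize_events(records: Iterable[EventRecord]) -> Dict[str, dict]:
    """Count occurrences by hour of day, by location, and by calendar day.

    The three tables correspond to the data behind an occurrence time chart,
    an occurrence position chart, and an occurrence frequency chart.
    """
    records = list(records)
    by_hour = Counter(r.occurred_at.hour for r in records)
    by_location = Counter(r.location for r in records)
    by_day = Counter(r.occurred_at.date().isoformat() for r in records)
    return {
        "by_hour": dict(by_hour),
        "by_location": dict(by_location),
        "by_day": dict(by_day),
    }
```

Rendering the tables as charts is left out; any plotting tool could consume the returned dictionaries.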
In another aspect, an audio video processing method is provided, the method including:
receiving an audio signal sent by audio acquisition equipment and a video signal sent by video acquisition equipment, wherein the audio signal and the video signal are both signals of an acquired target area;
performing abnormal event detection based on the audio signal and the video signal;
if a target abnormal event is detected to occur in the target area based on any type of signals, acquiring event information of the target abnormal event;
and monitoring the target abnormal event based on the event information of the target abnormal event.
In another aspect, an audio-video processing apparatus is provided, the apparatus including:
the signal receiving module is used for receiving an audio signal sent by audio acquisition equipment and a video signal sent by video acquisition equipment, wherein the audio signal and the video signal are both signals of an acquired target area;
an event detection module for performing abnormal event detection based on the audio signal and the video signal;
the information acquisition module is used for acquiring event information of a target abnormal event if the target abnormal event in the target area is detected based on any type of signals;
and the event monitoring module is used for monitoring the target abnormal event based on the event information of the target abnormal event.
In another aspect, a control device is provided, which includes one or more processors and one or more memories, wherein at least one program code is stored in the one or more memories, and the at least one program code is loaded and executed by the one or more processors to implement the audio and video processing method according to any one of the above-mentioned implementation manners.
In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, and the at least one program code is loaded and executed by a processor to implement the audio and video processing method according to any of the above-mentioned implementation manners.
In another aspect, a computer program product is provided, the computer program product comprising computer program code, the computer program code being stored in a computer readable storage medium, from which a processor of a control device reads the computer program code, the processor executing the computer program code to cause the control device to perform the audio video processing method according to any of the implementations described above.
The embodiment of the application provides an audio and video processing system that captures an audio signal and a video signal of a target area through an audio capture device and a video capture device, respectively, performs abnormal event detection based on both signals, and determines that a target abnormal event is detected when the event is detected from either signal. When an abnormal event is not detected from one type of signal, it can still be detected from the other type, which raises the probability of detecting the abnormal event; the detected target abnormal event is then monitored, which further improves the accuracy of abnormal event detection.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an audio and video processing system provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of another audio and video processing system provided by an embodiment of the present application;
FIG. 3 is a flowchart of an audio and video processing method provided by an embodiment of the present application;
FIG. 4 is a block diagram of an audio and video processing apparatus provided by an embodiment of the present application;
FIG. 5 is a block diagram of a control device provided by an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may alternatively include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data for analysis, stored data, displayed data, etc.) and signals referred to in this application are authorized by the user or fully authorized by various parties, and the collection, use and processing of the relevant data are subject to relevant laws and regulations and standards in relevant countries and regions. For example, audio signals and video signals referred to in this application are acquired with sufficient authorization.
An embodiment of the present application provides an audio and video processing system. Referring to FIG. 1, the system includes an audio capture device 10, a video capture device 20, and a control device 30.
The audio capture device 10 is configured to capture an audio signal of a target area and transmit the audio signal to the control device 30.
In the embodiment of the present application, the target area is any area in which abnormal event detection is required, and the target area may be in an elevator, in a corridor, on a street, or the like. Audio capture device 10 may be a microphone array and video capture device 20 may be a pan-tilt camera. The audio capture device 10 and the video capture device 20 may be connected to the control device 30 via a wired or wireless connection, respectively.
The video capture device 20 is configured to perform video monitoring on the target area to capture a video signal of the target area, and send the video signal to the control device 30.
In this embodiment, the video capture device 20 may perform video monitoring on a fixed position in the target area, or may monitor all positions of the target area in turn. For example, the video capture device 20 may rotate by a preset angle once every preset period to change the monitored position, so that all positions are monitored in turn.
The control device 30 is configured to receive an audio signal and a video signal, perform abnormal event detection based on the received signals, acquire event information of a target abnormal event if it is detected that the target abnormal event occurs in a target area based on any one of the signals, and monitor the target abnormal event based on the event information of the target abnormal event.
In the embodiment of the present application, the control device 30 may detect a plurality of abnormal events, and the target abnormal event may be any one of the plurality of abnormal events detected, including a fire event, a car accident event, an escape event, and the like.
The embodiment of the application provides an audio and video processing system that captures an audio signal and a video signal of a target area through the audio capture device 10 and the video capture device 20, respectively, performs abnormal event detection based on both signals, and determines that a target abnormal event is detected when the event is detected from either signal. When an abnormal event is not detected from one type of signal, it can still be detected from the other type, which raises the probability of detecting the abnormal event; the detected target abnormal event is then monitored, which further improves the accuracy of abnormal event detection.
In the embodiment of the application, since the audio signal and the video signal are respectively acquired based on different devices, at least one of synchronous processing and preprocessing can be performed on the two types of signals in order to facilitate abnormal event detection based on the audio signal and the video signal. Accordingly, in some embodiments, the control device 30 is configured to perform synchronous processing and/or preprocessing on the audio signal and the video signal to obtain a processed signal when performing the abnormal event detection based on the received signal, the synchronous processing includes synchronizing the acquisition time of the audio signal and the video signal, and the preprocessing includes at least one of compression processing, enhancement processing, filtering processing, and denoising processing.
In the embodiment of the application, synchronizing the audio signal and the video signal ensures that abnormal event detection is based on two signals from the same time period, which further improves the accuracy of abnormal event detection. Preprocessing the audio signal and the video signal filters out noise and interference, that is, it transforms the signals into a form that is easy to process, transmit, analyze, and identify, so that abnormal events can subsequently be detected based on the signals.
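Synchronizing acquisition times can be illustrated by pairing each video frame with the audio chunk captured closest to it in time. This is only a sketch: timestamps in seconds, a sorted audio timeline, and the `tolerance` parameter are assumptions, and the text does not prescribe any particular alignment method.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


def pair_by_time(
    audio_ts: Sequence[float],
    video_ts: Sequence[float],
    tolerance: float,
) -> List[Tuple[int, int]]:
    """Pair each video timestamp with the nearest audio timestamp.

    `audio_ts` must be sorted in ascending order. Returns (video_index,
    audio_index) pairs; video frames with no audio chunk within `tolerance`
    seconds are left unpaired.
    """
    pairs: List[Tuple[int, int]] = []
    for video_index, t in enumerate(video_ts):
        pos = bisect_left(audio_ts, t)
        # The nearest audio chunk is either just before or just after `t`.
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(audio_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda i: abs(audio_ts[i] - t))
        if abs(audio_ts[best] - t) <= tolerance:
            pairs.append((video_index, best))
    return pairs
```

Detection would then run only on paired chunks, so that both signals describe the same time period.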
In the embodiment of the present application, before performing synchronization processing and/or preprocessing on the audio signal and the video signal, the audio capture device 10 and the video capture device 20 are further configured to condition the audio signal and the video signal, respectively, where the conditioning of the signal refers to converting the captured signal into a standard signal, and the conditioning includes at least one of jitter elimination processing, filtering processing, protection processing, level conversion processing, isolation processing, and the like.
In order to improve the efficiency of performing the abnormal event detection, in the embodiment of the present application, in the case of acquiring an invalid video signal or audio signal, the abnormal event detection may be performed based on only one type of signal. Accordingly, in some embodiments, the control device 30, in performing abnormal event detection based on the received signal, is configured to perform abnormal event detection based on the audio signal and the video signal in a case where it is determined that the audio signal does not match the reference audio signal and the video signal does not match the reference video signal; the control device 30 is configured to perform abnormal event detection based on the video signal in the case where it is determined that the audio signal matches the reference audio signal and the video signal does not match the reference video signal. The control device 30 is configured to perform abnormal event detection based on the audio signal in a case where it is determined that the audio signal does not match the reference audio signal and the video signal matches the reference video signal. The reference audio signal is an audio signal of the target area acquired when the target abnormal event does not occur in the target area, and the reference video signal is a video signal of the target area acquired when the target abnormal event does not occur in the target area.
In the embodiment of the application, if the audio signal is matched with the reference audio signal, it is indicated that the audio signal of the target abnormal event is not acquired, that is, the audio signal is an invalid signal, so that the abnormal event detection is performed only based on the video signal, the interference of the invalid audio signal is avoided, and the efficiency of performing the abnormal event detection is improved. Similarly, if the video signal is matched with the reference video signal, it is indicated that the video signal of the target abnormal event is not acquired, that is, the video signal is an invalid signal, so that the abnormal event detection is performed only based on the audio signal, the interference of the invalid video signal is avoided, and the efficiency of performing the abnormal event detection is improved.
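The selection logic above reduces to a small decision table, sketched here. The `Modality` enum and function name are hypothetical, and the case where both signals match their references is not covered by the text; it is mapped to `NONE` here as an assumption.

```python
from enum import Enum


class Modality(Enum):
    NONE = "none"    # assumption: both signals look like the no-event baseline
    AUDIO = "audio"
    VIDEO = "video"
    BOTH = "both"


def select_modalities(audio_matches_reference: bool, video_matches_reference: bool) -> Modality:
    """Choose which signals to use for abnormal event detection.

    A signal that matches its reference (captured when no target abnormal event
    occurs) is treated as invalid and excluded from detection.
    """
    audio_valid = not audio_matches_reference
    video_valid = not video_matches_reference
    if audio_valid and video_valid:
        return Modality.BOTH
    if video_valid:
        return Modality.VIDEO
    if audio_valid:
        return Modality.AUDIO
    return Modality.NONE
```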
In the embodiments of the present application, signal matching refers to signal feature matching, that is, matching of an audio signal with a reference audio signal refers to matching of a signal feature of the audio signal with a signal feature of the reference audio signal, and matching of a video signal with a reference video signal refers to matching of a signal feature of the video signal with a signal feature of the reference video signal.
In the embodiment of the present application, the signal features include at least one of basic features, transform domain features, statistical features, motion features, and derived features of these four kinds in dimensions such as time, space, spectrum, and phase. The basic features include at least one of image features (such as gray values) of the video signal or the pitch, timbre, loudness, and the like of the audio signal. The transform domain features include signal features of an audio signal or a video signal after Fourier transform. The statistical features include the duration or number of occurrences of a salient feature in the audio signal or video signal, where the salient feature may be the maximum pitch, the maximum loudness, or the like. The motion features include features obtained by MPEG (Moving Picture Experts Group) encoding of a video signal. The derived features in the dimensions of time, space, spectrum, and phase include the mean, variance, cepstrum, and envelope spectrum of each of the above features, and are not specifically limited herein.
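To make a few of these features concrete, the sketch below computes the mean and variance of a sampled signal together with one transform domain feature, the dominant frequency bin of its discrete Fourier transform. The choice of these three features and all names are illustrative assumptions; a direct O(n²) DFT is used only to keep the example dependency-free.

```python
import cmath
from statistics import fmean, pvariance
from typing import Dict, List, Sequence


def dft_magnitudes(samples: Sequence[float]) -> List[float]:
    """Magnitudes of DFT bins 0..n//2 of a real signal, computed directly."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * cmath.pi * k * t / n) for t, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def extract_features(samples: Sequence[float]) -> Dict[str, float]:
    """Return mean, population variance, and the dominant non-DC frequency bin."""
    magnitudes = dft_magnitudes(samples)
    if len(magnitudes) > 1:
        dominant = max(range(1, len(magnitudes)), key=magnitudes.__getitem__)
    else:
        dominant = 0  # too short to have a non-DC bin
    return {
        "mean": fmean(samples),
        "variance": pvariance(samples),
        "dominant_bin": dominant,
    }
```

Feature dictionaries of this kind could then be compared between a captured signal and a reference signal to decide whether the two match.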
In the embodiment of the present application, if a failure occurs in the audio capture device 10 or the video capture device 20, which results in a failure in capturing the audio signal or the video signal, the abnormal event detection is performed based on the video signal in the case of the failure in capturing the audio signal, and the abnormal event detection is performed based on the audio signal in the case of the failure in capturing the video signal.
In the embodiment of the present application, in order to discover a faulty device in time, the control device 30 may also raise an alarm when the audio signal or the video signal is abnormal. Accordingly, after receiving the audio signal and the video signal, the control device 30 is configured to: raise an alarm for the audio signal if the audio signal matches any audio signal in an audio library, where the audio library stores audio signals captured while the audio capture device 10 was faulty; raise an alarm for the video signal if the video signal matches any video signal in a video library, where the video library stores video signals captured while the video capture device 20 was faulty; and raise an alarm for the video signal if the video signal matches a target video signal, where the target video signal is a video signal captured while the video capture device 20 was occluded.
In the embodiments of the present application, signal matching refers to signal feature matching. Because an alarm is raised when the audio capture device 10 fails, when the video capture device 20 fails, and when the video capture device 20 is occluded, management personnel can deal with the device in time so that accurate audio signals and video signals continue to be captured.
In the embodiment of the present application, in the case of acquiring an invalid video signal, a video signal acquired again may be acquired, so as to perform abnormal event detection based on the video signal acquired again. Accordingly, the control device 30 is configured to determine the occurrence position of the target abnormal event based on the audio signal when performing abnormal event detection based on the audio signal in a case where it is determined that the audio signal does not match the reference audio signal and the video signal matches the reference video signal; acquiring a video signal acquired again based on the occurrence position of the target abnormal event; and detecting abnormal events based on the recaptured video signals.
In the embodiment of the application, if the video signal is matched with the reference video signal, it is indicated that the video signal of the target abnormal event is not acquired, that is, the video signal is an invalid signal, so that when the video signal is an invalid signal, abnormal event detection is performed based on the video signal acquired again, the abnormal event can be detected based on two types of signals, namely the audio signal and the video signal, the comprehensiveness of the detection is ensured, and the accuracy of the detection is further ensured.
In the embodiment of the present application, when acquiring the recaptured video signal and performing abnormal event detection based on the recaptured video signal, the control device 30 includes at least one of the following implementation manners:
in one implementation, the system further includes a plurality of standby video capture devices, the plurality of standby video capture devices being configured to perform video surveillance on different locations of the target area; correspondingly, the control device 30 is configured to obtain a video signal acquired by the target standby video acquisition device, and perform abnormal event detection based on the video signal acquired by the target standby video acquisition device, where a position monitored by the target standby video acquisition device is the same as an occurrence position of the target abnormal event. In this implementation, if the video signal matches the reference video signal, it indicates that the video signal of the abnormal event is not acquired, that is, the video signal is an invalid signal, so that, when the video signal is an invalid signal, the abnormal event is detected based on the video signal acquired by the standby video acquisition device, and the abnormal event can be detected based on the two types of signals, that is, the audio signal and the video signal, so that the comprehensiveness of the detection is ensured, and the accuracy of the detection is further ensured.
In another implementation, the control device 30 is configured to control the position monitored by the video capture device 20 to be adjusted to the occurrence position of the target abnormal event, so that the video capture device 20 captures a video signal of the occurrence position of the target abnormal event, and perform abnormal event detection based on the video signal re-captured by the video capture device 20.
In some cases, after the position monitored by the video capture device 20 is adjusted, the adjusted position may still not match the occurrence position of the target abnormal event because of an adjustment error or a change in the occurrence position, so the monitoring position of the video capture device 20 needs to be adjusted multiple times. The number of adjustments may be set and changed as needed. To improve abnormal event detection efficiency and to ensure the convergence and operation speed of the abnormal event detection algorithm, the number of adjustments needs to be limited; for example, if the number of adjustments is set to 3, no further adjustment is performed after the third adjustment.
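The bounded re-adjustment described above can be sketched as follows. This is a minimal illustration only: the callables `locate_event` and `aim_camera`, the Euclidean alignment test, and the `tolerance` parameter are hypothetical and are not specified by the application; only the limit of 3 adjustments comes from the text.

```python
from typing import Callable, Tuple

Position = Tuple[float, float]


def adjust_until_aligned(
    locate_event: Callable[[], Position],
    aim_camera: Callable[[Position], Position],
    tolerance: float = 1.0,
    max_adjustments: int = 3,
) -> Tuple[bool, int]:
    """Re-aim the camera up to max_adjustments times.

    Returns (aligned, adjustments_used).
    """
    for used in range(1, max_adjustments + 1):
        target = locate_event()         # the event may have moved since the last try
        monitored = aim_camera(target)  # position actually monitored after the move
        error = ((monitored[0] - target[0]) ** 2
                 + (monitored[1] - target[1]) ** 2) ** 0.5
        if error <= tolerance:
            return True, used
    # Assumed behaviour: give up once the adjustment budget is spent.
    return False, max_adjustments
```

Capping the loop keeps the worst-case latency of one detection cycle fixed, which is the convergence concern the paragraph raises.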
It should be noted that, while the adjusted video capture device 20 captures the video signal again, the audio capture device 10 also captures the audio signal again, so that abnormal event detection is performed based on the newly captured video signal and audio signal. Adjusting the position monitored by the video capture device 20 to the occurrence position of the target abnormal event makes it easier to capture the video signal of the target abnormal event, so that the abnormal event can be detected based on both the audio signal and the video signal, which ensures the comprehensiveness of the detection and in turn its accuracy.
In this embodiment, the control device 30 is further configured to adjust at least one of a focal length or a fill light intensity of the video capture device 20 based on the quality of the video signal, so as to capture a high-quality video signal, thereby ensuring the accuracy of detecting the abnormal event based on the video signal. The quality of the video signal includes at least one of the picture quality, sharpness, and the like of an image generated based on the video signal.
In the embodiment of the present application, the control device 30 is configured to, when performing abnormal event detection based on the received signal, determine that the target abnormal event is detected if it is determined that any type of signal matches a target signal in an abnormal event library, where the abnormal event library is configured to store audio signals and video signals of a plurality of types of abnormal events, the target signal includes audio signals and/or video signals of the target abnormal event, and the target abnormal event includes any one of the plurality of types of abnormal events.
In the embodiment of the application, the abnormal event library comprises a first sub-library and a second sub-library, wherein the first sub-library is used for storing audio signals, and the second sub-library is used for storing video signals; accordingly, the control device 30 is configured to compare the audio signal with the signals in the first sub-bank and compare the video signal with the signals in the second sub-bank to detect the abnormal event. In the case where the detection of the abnormal event is performed based on only the audio signal, the detection of the abnormal event only needs to be performed by the first sub-bank. In the case where the detection of the abnormal event is performed based on only the video signal, the detection of the abnormal event only needs to be performed by the second sub bank.
In the embodiment of the application, the abnormal event library comprises abnormal event libraries for a plurality of different environment backgrounds, and the signals in each abnormal event library are signals fused with the corresponding environment background. Accordingly, before determining that the target abnormal event is detected in the case where any type of signal matches a target signal in the abnormal event library, the control device 30 is configured to acquire, from the plurality of abnormal event libraries, a target abnormal event library matching the environment background of the target area, and to perform abnormal event detection based on the target abnormal event library.
In the embodiments of the present application, the plurality of environmental contexts include environmental contexts under a plurality of meteorological conditions including, but not limited to, wind, rain, snow, fog, etc., and environmental contexts under a plurality of typical scenarios including, but not limited to, distress calls, whistles, collisions, explosions, low-altitude flights, crowd gathering, etc.
It should be noted that the environmental background of the target area may be different at different stages, and in this embodiment, the environmental background of the target area refers to the real-time environmental background of the target area at the current stage. In the embodiment of the application, because the signals in the abnormal event library are signals of the fusion environment background, the signals in the abnormal event library can represent the signals in a real scene, and the accuracy and the effectiveness of signal comparison are further improved.
In the embodiment of the present application, the signal fused with the environment background may be obtained as follows: the control device 30 is configured to obtain background noise in the environment background, perform modeling based on the background noise to obtain a simulated noise signal, and fuse the simulated noise signal with the signal to obtain a signal fused with the environment background. In another implementation, the signals not fused with the environment background and the simulated noise signals of the environment backgrounds are stored in a signal library and a background noise library, respectively, and the signals fused with the environment background in the abnormal event library are obtained by fusing the environment background on the fly. Accordingly, the control device 30 is configured to obtain, based on the environment background of the target area, the simulated noise signal corresponding to that environment background from the background noise library, and to fuse the simulated noise signal with the signals in the signal library to obtain signals fused with the environment background.
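As a rough illustration of fusing a stored signal with a simulated noise signal, the sketch below uses additive mixing. The application does not state how fusion is performed, so the additive model, the `noise_gain` parameter, and the looping of a short noise excerpt are all assumptions made for this example.

```python
from itertools import cycle, islice
from typing import List, Sequence


def fuse_with_background(
    signal: Sequence[float],
    noise: Sequence[float],
    noise_gain: float = 1.0,
) -> List[float]:
    """Additively mix a clean reference signal with simulated background noise.

    The noise excerpt is repeated so that it covers the whole signal.
    """
    if not noise:
        return list(signal)  # nothing to fuse
    looped_noise = islice(cycle(noise), len(signal))
    return [s + noise_gain * n for s, n in zip(signal, looped_noise)]


# Hypothetical samples: a clean event signal and a short noise model.
fused = fuse_with_background([1.0, 2.0, 3.0], [0.5, -0.5])
```

Storing clean signals and noise models separately, as in the second implementation, means one clean library can be fused with whichever noise model matches the current environment background.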
In some embodiments, the audio/video processing system in the embodiment of the present application is an embedded operating system with limited computational resources, so the control device 30 may not be able to read all the signals in the signal library in a single pass for signal fusion, or read all the signals in the abnormal event library in a single pass for comparison. In that case, the control device 30 is configured to read the signals in the signal library in multiple passes for signal fusion, and to read the signals in the abnormal event library in multiple passes for comparison, so as to obtain a more accurate comparison result.
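Reading a library in several passes can be sketched as a batched scan that keeps only a running best match in memory. The batch size, the `(label, reference)` record layout, and the caller-supplied `similarity` function are assumptions for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive batches so that only batch_size items are held at once."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def best_match_batched(
    query: T,
    library: Iterable[Tuple[str, T]],
    similarity: Callable[[T, T], float],
    batch_size: int = 2,
) -> Tuple[Optional[str], float]:
    """Scan the library batch by batch, tracking the most similar entry."""
    best_label: Optional[str] = None
    best_score = float("-inf")
    for batch in in_batches(library, batch_size):
        for label, reference in batch:
            score = similarity(query, reference)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score
```

Because `library` can be any iterable, including a generator that reads from storage, memory use is bounded by the batch size rather than by the library size.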
In the embodiment of the present application, whether any type of signal matches the target signal is determined based on similarity. Accordingly, when determining that any type of signal matches the target signal in the abnormal event library, the control device 30 is configured to determine the similarity between the any type of signal and each signal in the abnormal event library, and to determine that the any type of signal matches the target signal when the similarity between them meets the similarity condition. Determining whether two signals match based on the similarity between them ensures the reasonableness and accuracy of the matching result.
In the embodiment of the application, determining the similarity between any type of signal and each signal in the abnormal event library yields a plurality of similarities, each corresponding to one signal in the abnormal event library. Accordingly, the similarity condition includes the following cases. In one case, the similarity condition means that the similarity between the any type of signal and the target signal occupies a preset position in a similarity sequence, where the similarity sequence is obtained by sorting the plurality of similarities in descending order. The preset position can be set as needed; for example, it can be the first position, in which case the similarity between the any type of signal and the target signal is the maximum of the plurality of similarities. Because two matching signals have a high similarity, determining the matched signal from the similarity ranking ensures the reasonableness and accuracy of the matching result. In another case, the similarity condition means that the similarity between the any type of signal and the target signal is greater than or equal to a preset similarity threshold.
It should be noted that, in the latter case, the similarities between the any type of signal and several signals in the abnormal event library may all be greater than or equal to the preset similarity threshold; the any type of signal is then determined to match the target signal when the target signal is the one with the greatest similarity among those signals. Because two matching signals have a high similarity, a match with the target signal is determined only when the similarity is greater than or equal to the preset similarity threshold, which improves the accuracy of the matching result.
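The two similarity conditions can be sketched together: rank the library entries by similarity, take the top-ranked one, and optionally require it to reach a threshold. Cosine similarity over feature vectors is an assumption here, since the application does not name a similarity measure, and the labels are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_target(
    query: Sequence[float],
    library: Dict[str, Sequence[float]],
    threshold: Optional[float] = None,
) -> Optional[str]:
    """Return the label of the most similar library entry.

    With threshold=None this is the ranking condition (first position).
    With a threshold, the top-ranked entry must also reach it.
    """
    if not library:
        return None
    scores = {label: cosine_similarity(query, ref) for label, ref in library.items()}
    best = max(scores, key=scores.get)
    if threshold is not None and scores[best] < threshold:
        return None
    return best
```

Taking the maximum before applying the threshold also covers the case where several entries exceed the threshold: only the most similar one is reported.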
In some embodiments, the target signal corresponds to a plurality of abnormal events. Accordingly, when determining that the target abnormal event is detected, the control device 30 is configured to determine that the target abnormal event is detected in a case where it is determined, based on the target signal and the Bayesian probability, that the any type of signal is a signal of the target abnormal event among the plurality of abnormal events, thereby ensuring the reasonableness and accuracy of detecting the target abnormal event.
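One way to read the Bayesian step is as a posterior over the candidate events that share the matched target signal. The sketch below applies Bayes' rule with per-event priors and likelihoods. How these probabilities are obtained is not described in the application, so the numbers and event names are purely hypothetical.

```python
from typing import Dict


def posterior(priors: Dict[str, float], likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: P(event | signal) is proportional to P(signal | event) * P(event)."""
    joint = {event: priors[event] * likelihoods.get(event, 0.0) for event in priors}
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("signal has zero probability under every candidate event")
    return {event: value / evidence for event, value in joint.items()}


def most_probable_event(priors: Dict[str, float], likelihoods: Dict[str, float]) -> str:
    """Pick the candidate event with the highest posterior probability."""
    post = posterior(priors, likelihoods)
    return max(post, key=post.get)


# Hypothetical values: explosions are more common a priori, but the matched
# signal is far more typical of a collision.
priors = {"collision": 0.2, "explosion": 0.8}
likelihoods = {"collision": 0.9, "explosion": 0.1}
event = most_probable_event(priors, likelihoods)
```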
In the embodiment of the present application, signal matching refers to signal feature matching. In other embodiments, the abnormal event library stores the signal features of the audio signals and the signal features of the video signals of multiple types of abnormal events. The control device 30 is configured to compare the signal features of the audio signal and of the video signal with the signal features in the abnormal event library, respectively, and to determine that a target abnormal event is detected when the signal feature of any type of signal and a target signal feature in the abnormal event library satisfy the similarity condition, where the target signal feature includes at least one of the signal feature of the audio signal and the signal feature of the video signal of the target abnormal event. Correspondingly, the first sub-library and the second sub-library of the abnormal event library are used for storing the signal features of the audio signals and of the video signals, respectively; when comparing the signal features of the audio signal and of the video signal with the signal features in the abnormal event library, the control device 30 is configured to compare the signal features of the audio signal with the signal features in the first sub-library and to compare the signal features of the video signal with the signal features in the second sub-library to detect an abnormal event.
For example, if the image features of the video signals of abnormal events are stored in the second sub-library, the image features of the video signal captured in real time are compared with the image features in the second sub-library. After a target image feature is matched, scene detection is performed on the target abnormal event based on an Artificial Intelligence (AI) technology, so that scene analysis of the target abnormal event is realized, the target abnormal event is located based on the analysis result, and the target abnormal event is tracked and monitored. Correspondingly, the signal features in the abnormal event libraries for the different environment backgrounds are signal features fused with the environment background; before comparing the signal features of the audio signal and of the video signal with the signal features in the abnormal event library, the control device 30 is configured to acquire, from the plurality of abnormal event libraries, a target abnormal event library matching the environment background of the target area, and to perform abnormal event detection based on the target abnormal event library. When obtaining a signal feature fused with the environment background, the control device 30 is configured to fuse the simulated noise signal with the signal feature of any type of signal of the abnormal event. In the embodiment of the application, the signal features not fused with the environment background and the simulated noise signals of the environment backgrounds are stored in a signal feature library and a background noise library, respectively, and the signal features fused with the environment background in the abnormal event library are obtained by fusing the environment background on the fly.
In that case, when obtaining a signal feature fused with the environment background, the control device 30 is configured to obtain the simulated noise signal corresponding to the environment background from the background noise library, and to fuse the simulated noise signal with the signal features in the signal feature library.
In some embodiments, after the target abnormal event is detected, the control device 30 is further configured to determine, based on the target signal feature and the simulated noise signal, the vibration energy, phase, and Doppler effect of the moving sound source (that is, the target abnormal event) under two environmental conditions, an open space and a closed space, by using a beamforming method and a direction-of-arrival estimation method in combination with the propagation rules of sound signals. Based on the influence of the vibration energy, phase, and Doppler effect on algorithms such as distributed target detection, positioning, and tracking, the control device 30 locates the target abnormal event and obtains its geographic coordinates, which in turn facilitates tracking of the target abnormal event.
Optionally, the system in this embodiment of the application is an embedded operating system with limited computational resources, so the control device 30 may not be able to read all the signal features in the signal feature library in a single pass for signal fusion, or read all the signal features in the abnormal event library in a single pass for comparison. In that case, the control device 30 is configured to read the signal features in the signal feature library in multiple passes for signal fusion and to read the signal features in the abnormal event library in multiple passes for comparison, so as to obtain a more accurate comparison result.
In the embodiment of the application, the target abnormal event comprises a plurality of development stages, and the event information comprises the occurrence position of the target abnormal event at the current stage in the plurality of development stages; accordingly, the control device 30, when monitoring the target abnormal event based on the event information of the target abnormal event, is configured to predict the occurrence position of the target abnormal event in the next stage based on the occurrence position of the target abnormal event in the current stage, and monitor the target abnormal event based on the occurrence position of the target abnormal event in the next stage.
In the embodiment of the present application, the control device 30 is configured to, when predicting the occurrence position of the target abnormal event in the next stage based on the occurrence position of the target abnormal event in the current stage, acquire the moving direction and the moving speed of the target abnormal event, and determine the occurrence position of the target abnormal event in the next stage based on the occurrence position, the moving direction and the moving speed of the target abnormal event in the current stage. The control device 30 determines a moving distance based on the moving speed and the time interval between the two phases, and moves the occurrence position of the target abnormal event in the current phase by the moving distance in the moving direction, that is, obtains the occurrence position of the target abnormal event in the next phase.
In the embodiment of the present application, the control device 30 may obtain the moving speed and the moving direction of the target abnormal event based on the two occurrence positions determined by the two frames of video signals; or based on two occurrence positions located by two frames of audio signals, obtaining the moving speed and the moving direction of the target abnormal event. The control device 30 is configured to use the quotient of the distance between two occurrence positions and the time interval of two frames of signals as the moving speed, wherein the two occurrence positions can be any two different positions in the plurality of occurrence positions of the target abnormal event, that is, the two frames of video signals or the two frames of audio signals can be any two different frames of signals.
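For uniform motion, the calculation described above reduces to two steps: estimate speed and direction from two located positions, then advance the current position along that direction. The sketch below assumes planar coordinates and timestamps in seconds; the function names are illustrative.

```python
import math
from typing import Tuple

Position = Tuple[float, float]


def estimate_velocity(
    p0: Position, t0: float, p1: Position, t1: float
) -> Tuple[float, Tuple[float, float]]:
    """Speed = distance between two located positions / time between the two frames.

    Returns (speed, unit direction vector).
    """
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("the second frame must be later than the first")
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    distance = math.hypot(dx, dy)
    direction = (dx / distance, dy / distance) if distance > 0 else (0.0, 0.0)
    return distance / dt, direction


def predict_next_position(
    current: Position,
    speed: float,
    direction: Tuple[float, float],
    stage_interval: float,
) -> Position:
    """Move the current position by speed * stage_interval along the direction."""
    moved = speed * stage_interval
    return (current[0] + direction[0] * moved, current[1] + direction[1] * moved)
```

For instance, positions (0, 0) and (3, 4) located one second apart give a speed of 5 along (0.6, 0.8); with a two-second stage interval, the predicted next position from (3, 4) is approximately (9, 12).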
In this embodiment, because the target abnormal event may not be moving at a constant speed, the control device 30 may further determine a plurality of moving speeds based on the plurality of occurrence positions determined from multiple frames of video signals, and then determine, based on the plurality of moving speeds, whether the target abnormal event is moving at a constant speed, accelerating, or decelerating. The control device 30 is configured to determine one moving speed from the two occurrence positions determined from every two frames of the video signal, so that a plurality of moving speeds can be obtained. Similarly, the control device 30 may also derive a plurality of moving speeds based on a plurality of occurrence positions located from multiple frames of audio signals. Accordingly, if the target abnormal event is accelerating or decelerating, the control device 30 may determine an acceleration based on the plurality of moving speeds, determine the moving distance of the target abnormal event between two stages based on the acceleration, and then determine the occurrence position of the target abnormal event at the next stage based on the moving distance and the moving direction.
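For non-uniform motion, a minimal sketch under stated assumptions: speed estimates are evenly spaced in time, acceleration is taken as the average change in speed, and the distance covered in the next stage follows the constant-acceleration relation d = v·t + a·t²/2. None of these modelling choices is fixed by the application.

```python
from typing import Sequence


def classify_motion(speeds: Sequence[float], tolerance: float = 1e-6) -> str:
    """Label the motion from the change between the first and last speed estimates."""
    change = speeds[-1] - speeds[0]
    if abs(change) <= tolerance:
        return "uniform"
    return "accelerating" if change > 0 else "decelerating"


def mean_acceleration(speeds: Sequence[float], frame_interval: float) -> float:
    """Average acceleration over evenly spaced speed estimates."""
    if len(speeds) < 2:
        raise ValueError("at least two speed estimates are required")
    return (speeds[-1] - speeds[0]) / (frame_interval * (len(speeds) - 1))


def travel_distance(current_speed: float, acceleration: float, stage_interval: float) -> float:
    """Distance covered during the next stage under constant acceleration."""
    return current_speed * stage_interval + 0.5 * acceleration * stage_interval ** 2
```

With speeds of 10, 12, and 14 measured one second apart, the acceleration is 2, and over a two-second stage starting at speed 14 the moving distance is 14·2 + 0.5·2·2² = 32.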
For example, if the target abnormal event is a vehicle overspeed event, the control device 30 may obtain the moving speed and moving direction of the vehicle based on the two positions of the vehicle determined from two frames of video signals, then determine the moving distance of the vehicle based on the time interval between the two stages, and then obtain the position of the vehicle in the next stage of the vehicle overspeed event based on the position of the vehicle in the current stage, the moving direction, and the moving distance. A plurality of moving speeds can also be determined based on a plurality of positions of the vehicle determined from multiple frames of video signals; when the vehicle is accelerating or decelerating, the moving distance of the vehicle between the two stages is then determined based on the acceleration, so that the position of the vehicle in the next stage of the vehicle overspeed event can be obtained.
In the embodiment of the present application, by predicting the occurrence position of the target abnormal event at the next stage, the position monitored by the video capture device 20 can be adjusted to the predicted occurrence position of the target abnormal event at the next stage, so as to timely obtain the video signal of the target abnormal event at the next stage, and ensure the timeliness of monitoring the target abnormal event.
In some embodiments, the occurrence position of the target abnormal event at the next stage may also be predicted based on historical abnormal events. Accordingly, when predicting the occurrence position of the target abnormal event in the next stage based on its occurrence position in the current stage, the control device 30 is configured to acquire event information of a historical abnormal event, the event information including the occurrence positions of the historical abnormal event in its plurality of development stages. In response to the occurrence position of the historical abnormal event in a first historical stage matching the occurrence position of the target abnormal event in the current stage, the control device 30 takes the occurrence position of the historical abnormal event in a second historical stage as the occurrence position of the target abnormal event in the next stage, where the first historical stage is any one of the development stages of the historical abnormal event except the last stage, and the second historical stage is the stage following the first historical stage.
In the embodiment of the present application, when acquiring the historical abnormal event, the control device 30 is configured to acquire the historical abnormal event matching with the current background noise based on the current background noise corresponding to the environmental background of the target area. Alternatively, the historical abnormal events corresponding to different background noises are respectively stored in different historical event libraries, and then the control device 30 determines a target background noise with the highest matching degree with the current background noise in the plurality of background noises, and further obtains the historical abnormal event from the historical event library corresponding to the target background noise. If the historical event library includes a plurality of historical abnormal events, optionally, the control device 30 is configured to use, as the historical abnormal event corresponding to the target abnormal event, the historical abnormal event with the largest similarity between the signal characteristic of the audio signal and the signal characteristic of the audio signal of the target abnormal event, among the plurality of historical abnormal events.
For example, suppose the target area is an elevator area and the target abnormal event is crowd gathering at an elevator entrance. If, in a historical elevator-entrance crowd gathering event, the crowd gathered at the elevator entrance in the first historical stage and then dispersed to the stair entrance in the second historical stage, the occurrence position of the target abnormal event in the next stage can be predicted, based on that historical event, to be the stair entrance.
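The history-based prediction can be sketched as a lookup over the stage positions of a historical event: find a stage (other than the last) whose position matches the current one and return the position of the following stage. The coordinate representation and the distance tolerance used as the matching criterion are assumptions.

```python
import math
from typing import List, Optional, Tuple

Position = Tuple[float, float]


def predict_from_history(
    current_position: Position,
    historical_stages: List[Position],
    tolerance: float = 1.0,
) -> Optional[Position]:
    """Return the position of the stage following the first matching historical stage.

    The last stage is excluded from matching because it has no successor.
    """
    for index in range(len(historical_stages) - 1):
        if math.dist(historical_stages[index], current_position) <= tolerance:
            return historical_stages[index + 1]
    return None


# Hypothetical coordinates: elevator entrance at (0, 0), stair entrance at (10, 0).
history = [(0.0, 0.0), (10.0, 0.0)]
predicted = predict_from_history((0.3, 0.2), history)
```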
In the embodiment of the application, the historical abnormal event corresponds to the same environmental background as the event information of the currently-occurring target abnormal event, so that the occurrence position of the target abnormal event at the next stage is predicted based on the historical abnormal event, the prediction efficiency is improved, and the prediction accuracy is ensured.
In the embodiment of the present application, after predicting the occurrence position of the target abnormal event in the next stage, the control device 30 monitors the target abnormal event based on the occurrence position of the target abnormal event in the next stage; accordingly, the control device 30 may further determine the occurrence time of the target abnormal event in the next stage to control the video capture device 20 to rotate at the occurrence time of the target abnormal event in the next stage, so that the video capture device 20 monitors the occurrence position of the target abnormal event in the next stage at the occurrence time of the target abnormal event in the next stage. Alternatively, the control device 30 determines the occurrence time of the target abnormal event at the next stage based on the time interval of the historical abnormal event at the two historical stages and the occurrence time of the target abnormal event at the current stage.
In this embodiment, by predicting the occurrence time of the target abnormal event at the next stage, the video capture device 20 can monitor the occurrence position of the target abnormal event at the next stage at that occurrence time, so as to obtain the video signal of the target abnormal event at the next stage in a timely and accurate manner, thereby ensuring the timeliness and accuracy of detecting the target abnormal event.
In some embodiments, when a target abnormal event occurs, a companion event may also occur, and thus the target abnormal event may also be monitored based on the companion event. Accordingly, when monitoring the target abnormal event based on the event information of the target abnormal event, the control device 30 is configured to acquire event information of a companion event of the historical abnormal event and to monitor the target abnormal event based on the event information of the companion event.
The companion event refers to an event occurring along with the occurrence of the target abnormal event, and includes an event occurring before, after or simultaneously with the occurrence of the target abnormal event, and the companion event is an event occurring within a preset range of an occurrence location of the target abnormal event. Therefore, if a companion event of the historical abnormal events is detected, a target location where the target abnormal event is likely to occur in the next stage may be determined based on an occurrence location of the companion event in the current stage and a target time where the target abnormal event is likely to occur in the next stage may be determined based on an occurrence time of the companion event in the current stage, and the target abnormal event may be monitored based on at least one of the target time and the target location. The determination method of the target time and the target position may be: determining a target time based on a time interval between the historical abnormal event and the companion event thereof and the occurrence time of the companion event in the current stage, and determining a target position based on the distance between the historical abnormal event and the companion event thereof and the occurrence position of the companion event in the current stage.
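The determination of the target time and target position can be sketched as applying the historical time gap and spatial offset to the companion event observed in the current stage. The text speaks of a distance between the historical abnormal event and its companion event; representing it as a two-dimensional offset vector, as done here, is an assumption, as are the field names.

```python
from typing import Tuple

Position = Tuple[float, float]


def predict_from_companion(
    companion_time: float,
    companion_position: Position,
    historical_time_gap: float,
    historical_offset: Position,
) -> Tuple[float, Position]:
    """Estimate when and where the target abnormal event may occur.

    historical_time_gap: event time minus companion time in the historical record
                         (negative if the companion event came after the event).
    historical_offset:   event position minus companion position in the record.
    """
    target_time = companion_time + historical_time_gap
    target_position = (
        companion_position[0] + historical_offset[0],
        companion_position[1] + historical_offset[1],
    )
    return target_time, target_position
```

The monitoring step can then use either value alone or both, matching the statement that monitoring is based on at least one of the target time and the target position.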
For example, the target abnormal event may be a crash event with a fire event as its companion event, or the target abnormal event may be a vehicle overspeed event with a crash event as its companion event. Taking as an example the case where the target abnormal event is a vehicle overspeed event, its companion event is a crash event, and the event information of the companion event includes the occurrence position of the crash event, monitoring the target abnormal event based on the event information of the companion event may be implemented as follows: the target time and the target position at which the vehicle overspeed event may occur in the next stage are determined based on the occurrence position and the occurrence time of the companion event in the current stage, so that the target abnormal event is monitored at the target time and the target position.
In some embodiments, the target abnormal event occurs multiple times, and statistics and analysis may also be performed on the target abnormal event. Accordingly, when detecting, based on any type of signal, that the target abnormal event occurs in the target area, the control device 30 is configured to perform event statistics on the target abnormal event based on the event information of its multiple occurrences, and to generate an analysis report of the target abnormal event based on the statistical result, the analysis report including at least one of an occurrence time chart, an occurrence location chart, and an occurrence number chart of the target abnormal event.
In the embodiment of the present application, the analysis report may be in the form of a table, a bar chart, a pie chart, or a line chart, and is not limited herein. In the embodiment of the present application, the target abnormal event may be crowd gathering, abnormal personnel, abnormal parking-space monitoring, abnormal duty status, abnormal temperature, and the like. For example, if the target abnormal event is a temperature abnormality, that is, the temperature in the target area exceeds a preset temperature, each temperature reading that exceeds the preset temperature may be counted to generate an analysis report, where the analysis report may be in the form of a thermal imaging graph or a line graph.
In the embodiment of the present application, the occurrence time chart may reflect the number of times that the target abnormal event occurs at the same time, and the occurrence location chart may reflect the number of times that the target abnormal event occurs at the same position.
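The counting behind such a report can be sketched with simple tallies. The record layout, an `(hour_of_day, location)` pair per occurrence, and the report keys are assumptions for illustration; rendering the tallies as a table or chart is left out.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def build_report(occurrences: Iterable[Tuple[int, str]]) -> Dict[str, object]:
    """Tally occurrences of one abnormal event type by hour and by location."""
    records = list(occurrences)
    return {
        "occurrence_count": len(records),
        "by_time": Counter(hour for hour, _ in records),
        "by_location": Counter(location for _, location in records),
    }


# Hypothetical crowd-gathering records: (hour of day, location).
report = build_report([(9, "elevator"), (9, "lobby"), (18, "elevator")])
```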
In this embodiment, the control device 30 is further configured to generate an analysis conclusion based on the analysis report, so as to implement intelligent analysis of the system. For example, if the target abnormal event is crowd accumulation, the number of people in each crowd accumulation may be counted to generate an analysis report, which may be in the form of a table, a bar graph or a line graph, and an analysis conclusion may be generated based on the analysis report, for example, the analysis conclusion may be that the number of people in each crowd accumulation is more and more or the frequency of crowd accumulation is higher and higher as time goes on.
Referring to fig. 2, in some embodiments, the audio video processing system further includes a cloud server 40, the control device 30 is connected to the cloud server 40 through a wired or wireless network, and the control device 30 is configured to perform event statistics through the cloud server 40 to generate an analysis report and an analysis conclusion.
In some embodiments, after the target abnormal event is detected, an alarm needs to be raised for the target abnormal event so that the manager can handle it in time. Accordingly, after acquiring the event information of the target abnormal event, the control device 30 is configured to determine alarm information corresponding to the event information, and to raise an alarm for the target abnormal event based on the alarm information.
In the embodiment of the application, the event information comprises keywords, occurrence time and occurrence position of the target abnormal event; correspondingly, the process of determining the alarm information corresponding to the event information includes: the control device 30 is configured to determine an event category of the target abnormal event based on the keyword, the occurrence location, and the occurrence time of the target abnormal event, determine an alarm level of the target abnormal event based on the importance corresponding to the event category, and determine alarm information of the target abnormal event based on the alarm level of the target abnormal event.
In the embodiment of the present application, the keywords may be obtained by performing speech recognition on the audio signal or by performing image recognition on the video signal, for example, by extracting motion features of people in the image.
In the embodiment of the present application, the control device 30 is configured to determine the event category of the target abnormal event from an event category library based on the keyword, the occurrence position, and the occurrence time of the target abnormal event. Optionally, the event category library includes a plurality of first categories respectively corresponding to a plurality of keywords, each first category includes a plurality of second categories respectively corresponding to a plurality of occurrence positions, and each second category includes a plurality of third categories respectively corresponding to a plurality of occurrence times. In this case, the control device 30 is configured to determine, from the plurality of first categories, a target first category corresponding to the keyword of the target abnormal event; determine, from the plurality of second categories of the target first category, a target second category corresponding to the occurrence position of the target abnormal event; determine, from the plurality of third categories of the target second category, a target third category corresponding to the occurrence time of the target abnormal event; and use the target third category as the event category of the target abnormal event.
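The three-level lookup described above can be sketched as a nested mapping. This is only a minimal illustration: the library contents, the coarse time periods, and the name `lookup_event_category` are hypothetical and are not taken from the embodiment.

```python
from typing import Dict, Optional

# Hypothetical event category library:
# keyword (first category) -> occurrence position (second) -> occurrence time (third).
EVENT_CATEGORY_LIBRARY: Dict[str, Dict[str, Dict[str, str]]] = {
    "help": {
        "parking lot": {
            "night": "distress call, parking lot, night",
            "day": "distress call, parking lot, day",
        },
        "lobby": {
            "night": "distress call, lobby, night",
            "day": "distress call, lobby, day",
        },
    },
}


def lookup_event_category(keyword: str, position: str, period: str) -> Optional[str]:
    """Walk the library level by level; return None if any level has no entry."""
    second_categories = EVENT_CATEGORY_LIBRARY.get(keyword)
    if second_categories is None:
        return None
    third_categories = second_categories.get(position)
    if third_categories is None:
        return None
    return third_categories.get(period)
```

Resolving one level at a time mirrors the order given in the text (keyword, then position, then time) and makes it easy to see at which level a lookup fails.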
In the embodiment of the present application, when determining the target first category corresponding to the keyword of the target abnormal event from the plurality of first categories, the control device 30 is configured to compare an effective value of the keyword of the target abnormal event with the keyword threshold corresponding to a first category and, if the effective value is greater than or equal to the keyword threshold, take that first category as the target first category corresponding to the keyword, so as to ensure the accuracy of the determined event category. The effective value of the keyword is determined based on at least one of the pitch of the keyword, the sound intensity of the keyword, or an image feature. For example, if the keyword is "help" extracted from the audio signal and it is spoken at a high pitch, the effective value is large; if the keyword is "run away" extracted from the video signal and the position features of the person in the image change at a high frequency, the effective value is large.
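The threshold comparison can be sketched as follows. The threshold values and the name `select_first_category` are hypothetical, and the sketch assumes the effective value has already been computed and normalized to the range 0 to 1, which the embodiment does not specify.

```python
from typing import Dict, Optional

# Hypothetical keyword thresholds, one per first category (illustrative values only).
KEYWORD_THRESHOLDS: Dict[str, float] = {"help": 0.6, "run away": 0.7}


def select_first_category(keyword: str, effective_value: float) -> Optional[str]:
    """Accept the first category only if the effective value reaches its threshold."""
    threshold = KEYWORD_THRESHOLDS.get(keyword)
    if threshold is None:
        return None  # No first category exists for this keyword.
    return keyword if effective_value >= threshold else None
```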
In the embodiment of the application, each event category corresponds to a preset characteristic value, and each preset characteristic value corresponds to an importance degree, so the importance degree of the target abnormal event can be determined based on the preset characteristic value. Each importance degree corresponds to one alarm level, so the alarm level of the target abnormal event can in turn be determined based on the importance degree; the importance degree of the target abnormal event is positively correlated with its alarm level. For example, if the importance degree is high, the alarm level is high; if the importance degree is medium, the alarm level is medium; and if the importance degree is low, the alarm level is low. The alarm information includes at least one of a ring alarm, a light alarm, or a voice alarm, and each alarm level corresponds to one kind of alarm information: if the alarm level is low, the alarm information is a ring alarm or a light alarm; if the alarm level is medium, the alarm information is a combined sound-and-light alarm; and if the alarm level is high, the alarm information is a voice alarm.
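A minimal sketch of this chain of mappings is given below, assuming a one-to-one, monotonically increasing correspondence between importance degree and alarm level. The numeric characteristic values, the choice of a ring alarm for the low level, and all names are hypothetical.

```python
from typing import Dict, List

# Hypothetical preset characteristic values and their importance degrees.
IMPORTANCE_BY_CHARACTERISTIC_VALUE: Dict[int, str] = {1: "low", 2: "medium", 3: "high"}

# Importance degree and alarm level are positively correlated (assumed one-to-one).
ALARM_LEVEL_BY_IMPORTANCE: Dict[str, str] = {"low": "low", "medium": "medium", "high": "high"}

# Low: ring alarm (a light alarm would serve equally); medium: sound and light; high: voice.
ALARM_INFO_BY_LEVEL: Dict[str, List[str]] = {
    "low": ["ring"],
    "medium": ["ring", "light"],
    "high": ["voice"],
}


def alarm_info_for(characteristic_value: int) -> List[str]:
    """Resolve characteristic value -> importance degree -> alarm level -> alarm information."""
    importance = IMPORTANCE_BY_CHARACTERISTIC_VALUE[characteristic_value]
    level = ALARM_LEVEL_BY_IMPORTANCE[importance]
    return ALARM_INFO_BY_LEVEL[level]
```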
Optionally, the control device 30 is configured to determine the alarm information through the cloud server 40; that is, the control device 30 sends the event information to the cloud server 40, and the alarm information is determined through the cloud server 40.
With continued reference to fig. 2, in some embodiments, the audio video processing system further includes a target terminal 50, the target terminal 50 is in bidirectional communication with the cloud server 40, and the target terminal 50 is connected to the cloud server 40 through a wired or wireless network. In the embodiment of the present application, the control device 30 is configured to transmit the alarm information to the target terminal 50 after determining the alarm information of the target abnormal event, and the target terminal 50 is configured to alarm the target abnormal event based on the alarm information of the target abnormal event. In this embodiment, the target terminal 50 may be a mobile terminal or a fixed terminal used by a manager, so that the manager can timely process the target abnormal event that is alarmed by the target terminal 50.
In the embodiment of the present application, in order to facilitate subsequent query of the target abnormal event, the related information of the target abnormal event may be stored. Accordingly, the control device 30 is configured to extract key segments from the audio signal and the video signal after acquiring the event information of the target abnormal event, and to store the key segments in correspondence with the event information. In the embodiment of the present application, the key segments include segments from multiple development stages, and the event information may also include event information from multiple development stages, so the control device 30 is further configured to splice and edit the key segments and the event information of the multiple development stages, respectively, to obtain video information and semantic information reflecting the target abnormal event.
In the embodiment of the application, the key segments and the event information of the target abnormal event are stored so that the target abnormal event can be conveniently queried later. In one implementation, when storing the key segments and the event information correspondingly, the control device 30 is configured to store them through the cloud server 40, so as to reduce resource occupation of the control device 30. Optionally, the control device 30 is configured to compress and encode the video information and the semantic information, and transmit the compressed and encoded video information and semantic information to the cloud server 40 through a network.
In the embodiment of the present application, the control device 30 performs operations such as alarm management, security management, video management, log query, algorithm configuration, and disk management through the cloud server 40, in addition to performing event statistics and event alarm through the cloud server 40.
In the embodiment of the present application, alarm management refers to setting a mobile alert area within the target area and setting the alarm mode of the mobile alert area. For example, when an abnormal event is detected in the mobile alert area, a mobile alarm prompt is displayed on the video preview interface of the camera, and a corresponding alarm linkage can be set, where the alarm linkage may be linkage output, linkage snapshot, linkage video recording, or the like. A monitoring arming time can also be set, so that abnormal events are detected only within the armed dates or time periods. The events to be alarmed can also be set, such as a device occlusion alarm, an audio abnormality alarm, a human body temperature measurement alarm, a face target recognition alarm, an ambient temperature and humidity abnormality alarm, a voltage abnormality alarm, and a device failure alarm. Alarms caused by device abnormalities, such as the device occlusion alarm, audio abnormality alarm, voltage abnormality alarm, and device failure alarm, can be given a dedicated sound so that they are distinguished from alarms for abnormal events.
The security management refers to managing the certificates of the server and the client so as to ensure the security of communication among the devices in the system; the certificates include an SSL (Secure Sockets Layer) certificate, the server includes the cloud server 40, and the client includes the target terminal 50 used by a manager. The video recording management includes managing the arming time, retention time, pre-recording time, delay time, recording type, and the like of video recordings. The log query includes querying the login and operation attributes of users on the audio video processing system, various system operating parameters, the system memory, the system disk, and the like.
The algorithm configuration refers to configuring different event detection algorithms for the system according to the abnormal events to be recognized in the target area, where the algorithms include at least one of a behavior analysis algorithm, a people counting algorithm, a license plate recognition algorithm, a crowd gathering detection algorithm, a duty detection algorithm, a safety helmet detection algorithm, a face detection algorithm, a witness protection algorithm, a video diagnosis algorithm, an audio diagnosis algorithm, and the like. The witness protection algorithm is used for applying a mosaic to the image, the video diagnosis algorithm is used for detecting abnormal video signals, and the audio diagnosis algorithm is used for detecting abnormal audio signals. The disk management means that, when the disk space of the audio acquisition device 10, the video acquisition device 20, or the control device 30 is insufficient, the disk of each device can be managed in at least one of the following ways: cyclically deleting all stored content, cyclically deleting stored content other than that of abnormal events, or stopping signal acquisition.
With continued reference to fig. 2, in some embodiments, the system further includes a control center 60, the control center 60 performs bidirectional communication with the cloud server 40, and the control center 60 is configured to perform parameter configuration on the system, including at least one of audio and video basic parameter configuration, alert management parameter configuration, human body temperature measurement parameter configuration, event alarm parameter configuration, face recognition parameter configuration, road monitoring parameter configuration, and intelligent monitoring parameter configuration.
The audio and video basic parameter configuration comprises basic parameters for configuring an audio signal and basic parameters for configuring a video signal. The basic parameters of the audio signal include at least one of an encoding mode, a sampling rate, a noise reduction parameter, an echo suppression parameter, an audio output type, and the like of the audio signal. The basic parameters of the video signal include at least one of a resolution, a coding mode, a compression mode, a frame rate, a code rate, and the like of the video signal.
The alert management parameter configuration includes configuring perimeter alerts and tripwire alerts to obtain an alert area in the target area, and may also include configuring detection targets, so that an alarm is raised for detection targets that intrude into the alert area, leave the alert area, or cross a boundary. The perimeter alert may use a default alert area, and the detection targets default to persons and vehicles.
The human body temperature measurement parameter configuration refers to setting basic human body temperature measurement parameters according to the actual or historical conditions of the target area, and configuring the alarm information used when a temperature is abnormal. The system in the embodiment of the application supports two temperature scales, Celsius and Fahrenheit, and the default temperature scale is Celsius. Optionally, the control device 30 is configured to detect the temperature by means of an infrared thermometer.
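Supporting both temperature scales with a single alarm threshold can be sketched by normalizing every reading to Celsius before comparison. The threshold of 37.3 degrees Celsius and the function names are hypothetical; the embodiment leaves the actual parameters to configuration.

```python
def to_celsius(value: float, scale: str = "C") -> float:
    """Convert a reading to Celsius; the default scale is Celsius, as in the text."""
    if scale == "C":
        return value
    if scale == "F":
        return (value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unsupported temperature scale: {scale}")


def is_temperature_abnormal(value: float, scale: str = "C", threshold_c: float = 37.3) -> bool:
    """Compare the reading with a hypothetical alarm threshold expressed in Celsius."""
    return to_celsius(value, scale) > threshold_c
```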
The event alarm parameter configuration refers to configuring alarm parameters for abnormal events according to their types, for example, configuring alarm thresholds, alarm modes, and the like. The alarm threshold for a given scene can also be configured intelligently through the event alarm parameter configuration; for example, the alarm threshold for vehicle overspeed can be configured according to whether the road is a highway or a downtown road. Optionally, the control device 30 is configured to detect the speed by means of a speed measuring device.
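A scene-dependent threshold of this kind can be sketched as a lookup keyed by road type. The road type labels, the limits in km/h, and the name `is_overspeed` are hypothetical values chosen only to make the example concrete.

```python
from typing import Dict

# Hypothetical overspeed alarm thresholds in km/h, keyed by road type.
OVERSPEED_THRESHOLD_KMH: Dict[str, float] = {"highway": 120.0, "downtown": 60.0}


def is_overspeed(speed_kmh: float, road_type: str) -> bool:
    """Raise an overspeed alarm when the measured speed exceeds the scene's threshold."""
    return speed_kmh > OVERSPEED_THRESHOLD_KMH[road_type]
```

The same measured speed can therefore trigger an alarm in one scene and not in another, which is the behavior the text describes.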
The face recognition parameter configuration refers to configuring the target to be detected, the snapshot mode, the image push mode, and the like used in the process of capturing images. The target to be detected includes at least one of a motor vehicle, a driver's face, a non-motor vehicle, a pedestrian, and the like. The snapshot mode includes an automatic trigger mode, an external trigger mode, and a mixed trigger mode: in the automatic trigger mode, the system is triggered to capture an image automatically after a target abnormal event is detected; in the external trigger mode, an administrator triggers the system to capture an image; and in the mixed trigger mode, the automatic trigger mode and the external trigger mode are combined to capture images. The image push mode includes a full-capture mode, a high-quality mode, a user-defined mode, and the like. In the full-capture mode, all captured images are sent to an image pushing module, which is configured to push the images to the cloud server 40, so the miss rate of image capture in the full-capture mode is low. In the high-quality mode, the captured images are filtered based on the default boundary of the system, and only the images within the boundary are pushed to the image pushing module, so the image pushing effect is good and the false capture rate is low. In the user-defined mode, the captured images are filtered based on a boundary set by the manager, and the filtered images are pushed to the image pushing module.
The road monitoring parameter configuration refers to configuring relevant parameters of a monitored road, wherein the relevant parameters include a snapshot mode after a running track of a vehicle deviates from a preset track, license plate detection parameters, parameters for performing real-time statistical analysis on traffic flow, a target to be detected on the road, the resolution of a snapshot image and the like, for example, the license plate detection parameters can be used for performing license plate detection on 9 vehicles simultaneously, and the resolution can be 1440P, 1080P or 960P and the like.
The intelligent monitoring parameter configuration refers to configuring at least one of a behavior analysis algorithm, a people counting algorithm, a license plate recognition algorithm, a crowd gathering detection algorithm, a duty detection algorithm, a safety helmet detection algorithm, a face detection algorithm, a witness protection algorithm, a video diagnosis algorithm, an audio diagnosis algorithm and the like for the system, and the algorithm can be adjusted under the condition that part of the algorithm conflicts with other algorithms through the intelligent monitoring parameter configuration.
In the embodiment of the present application, the target terminal 50 is further configured to remotely view and execute various events of the system, including at least one of event execution, alarm response, event parameter management, user management, statistical form, peripheral management, and the like.
The event execution means that the target terminal 50 instructs, through the cloud server 40, the control device 30 to raise an alarm, where the alarm includes at least one of a light alarm, a ring alarm, a voice alarm, and the like, and the voice alarm may be a real-time voice call-out. The alarm response means that, when a target abnormal event is detected, the control device 30 communicates with the target terminal 50 through the cloud server 40 and sends alarm information to the target terminal 50, so as to trigger the target terminal 50 to raise an alarm for the target abnormal event, where the alarm mode includes at least one of a light alarm, a ring alarm, a voice alarm, a vibration alarm, and the like.
The event parameter management refers to configuring different alarm parameters for abnormal events according to their types, for example, configuring an alarm threshold, an alarm mode, and the like. The alarm threshold of an intelligent analysis event in a given scene can also be configured through the event parameter management; for example, the alarm threshold for vehicle overspeed can be set dynamically according to whether the road is a highway or an urban road.
User management means that the target terminal 50 can be used to manage different users' rights to use the system, their rights to access data in the cloud server 40, and the like, for example by adding or deleting users. The statistical form means that the target terminal 50 can obtain the statistical results of abnormal events from the cloud server 40, including the analysis report and the analysis conclusion. Peripheral management means that the target terminal 50 can be used to configure relevant parameters of the auxiliary devices of the system, where the auxiliary devices include a flash lamp, an infrared fill light, a microphone, and the like, and the relevant parameters include the power of the flash lamp, the power of the infrared fill light, the sensitivity of the microphone, and the like. The serial numbers of the video acquisition device 20, the audio acquisition device 10, the control device 30, and the like can also be configured through peripheral management, and the serial numbers are used for connecting these devices with the auxiliary devices.
The embodiment of the application introduces audio signals into the field of video monitoring. Because audio signal processing algorithms have low complexity and good real-time performance, the performance of algorithms that detect and track abnormal events based on video monitoring can be improved. In addition, by combining the composite features of two heterogeneous signals, the audio signal and the video signal, the embodiment overcomes shortcomings of conventional video monitoring such as a narrow viewing angle and susceptibility to occlusion, gives the system all-weather detection, positioning, and tracking capabilities that are free of occlusion and blind spots, and can thereby improve the response speed of the system. Furthermore, the embodiment automatically analyzes and semantically understands events through the audio signal and the video signal, captures the event information and key segments of abnormal events to form video information and semantic information, and transmits the compressed and encoded video information and semantic information to the cloud server 40, so that abnormal events can be conveniently stored, analyzed, and counted through the cloud server 40. Moreover, functions such as event alarms are issued according to instructions from the control center 60 or the target terminal 50, so that multiple devices cooperate to monitor and analyze abnormal events in real time. The embodiment also integrates the acquisition, analysis, computation, and communication functions for multi-channel audio signals and multi-channel video signals into a single system, which avoids having to install a bulky microphone array inside a camera.
In addition, the system of the embodiment of the application supports wireless transmission and a Programmable Logic Controller (PLC) function, and the problems of high cost and the like caused by more connecting cables are solved.
The embodiment of the application provides an audio video processing system. The system collects an audio signal and a video signal of a target area through the audio acquisition device 10 and the video acquisition device 20, respectively, and then performs abnormal event detection based on the two types of signals; it can be determined that a target abnormal event is detected as long as the abnormal event is detected based on any one type of signal.
Referring to fig. 3, fig. 3 is a flowchart of an audio video processing method provided in an embodiment of the present application, where an execution subject of the method is a control device, and the method includes:
301. Receive an audio signal sent by the audio acquisition device and a video signal sent by the video acquisition device, where the audio signal and the video signal are both signals acquired from a target area.
302. Perform abnormal event detection based on the audio signal and the video signal.
303. If it is detected, based on any type of signal, that a target abnormal event occurs in the target area, acquire event information of the target abnormal event.
304. Monitor the target abnormal event based on the event information of the target abnormal event.
In some embodiments, performing abnormal event detection based on the received signal comprises:
under the condition that the audio signal is determined not to be matched with the reference audio signal and the video signal is determined not to be matched with the reference video signal, abnormal event detection is carried out on the basis of the audio signal and the video signal;
under the condition that the audio signal is determined to be matched with the reference audio signal and the video signal is determined not to be matched with the reference video signal, abnormal event detection is carried out based on the video signal;
performing abnormal event detection based on the audio signal in the case where it is determined that the audio signal does not match the reference audio signal and the video signal matches the reference video signal; the reference audio signal is an audio signal of the target area acquired when the target abnormal event does not occur in the target area, and the reference video signal is a video signal of the target area acquired when the target abnormal event does not occur in the target area.
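The three cases above amount to a small decision table, sketched below. The function name and the string labels are hypothetical. The case where both signals match their references is not described in the text; returning an empty list for it, meaning that no detection is run, is an assumption of this sketch.

```python
from typing import List


def select_detection_signals(audio_matches_reference: bool,
                             video_matches_reference: bool) -> List[str]:
    """Choose which signals to run abnormal event detection on."""
    if not audio_matches_reference and not video_matches_reference:
        return ["audio", "video"]  # Both deviate from their references.
    if audio_matches_reference and not video_matches_reference:
        return ["video"]  # Only the video deviates.
    if not audio_matches_reference and video_matches_reference:
        return ["audio"]  # Only the audio deviates.
    return []  # Assumption: both match their references, so nothing is analyzed.
```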
In some embodiments, in the event that it is determined that the audio signal does not match the reference audio signal and the video signal matches the reference video signal, performing an abnormal event detection based on the audio signal comprises:
determining the occurrence position of the target abnormal event based on the audio signal; acquiring a video signal which is acquired again based on the occurrence position of the target abnormal event; and detecting abnormal events based on the recaptured video signals.
In some embodiments, acquiring the recaptured video signal, and performing the abnormal event detection based on the recaptured video signal, comprises:
acquiring a video signal acquired by target standby video acquisition equipment;
and detecting the abnormal event based on the video signal acquired by the target standby video acquisition equipment, wherein the monitoring position of the target standby video acquisition equipment is the same as the occurrence position of the target abnormal event.
In some embodiments, acquiring the recaptured video signal, and performing the abnormal event detection based on the recaptured video signal, comprises:
controlling the position monitored by the video acquisition equipment to be adjusted to the occurrence position so as to enable the video acquisition equipment to acquire the video signal of the occurrence position;
and detecting abnormal events based on the video signals re-acquired by the video acquisition equipment.
In some embodiments, the target abnormal event includes a plurality of development stages, and the event information includes the occurrence position of the target abnormal event at the current stage of the plurality of development stages; monitoring the target abnormal event based on the event information of the target abnormal event includes:
predicting the occurrence position of the target abnormal event in the next stage based on the occurrence position of the target abnormal event in the current stage;
and monitoring the target abnormal event based on the occurrence position of the target abnormal event in the next stage.
In some embodiments, predicting the occurrence position of the target abnormal event at the next stage based on the occurrence position of the target abnormal event at the current stage comprises:
acquiring event information of historical abnormal events, wherein the event information of the historical abnormal events comprises the occurrence positions of the historical abnormal events in a plurality of development stages;
and in response to the matching of the occurrence position of the historical abnormal event in the first historical stage and the occurrence position of the target abnormal event in the current stage, taking the occurrence position of the historical abnormal event in the second historical stage as the occurrence position of the target abnormal event in the next stage, wherein the first historical stage is any one stage except the last stage in the multiple development stages of the historical abnormal event, and the second historical stage is the next stage of the first historical stage.
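The history-based prediction can be sketched as follows, assuming that occurrence positions are represented as zone labels and that two positions match when their labels are equal. Both the representation and the name `predict_next_position` are assumptions made for illustration; the embodiment does not define the matching rule.

```python
from typing import List, Optional


def predict_next_position(current_position: str,
                          historical_events: List[List[str]]) -> Optional[str]:
    """Return the position that followed a matching stage in a historical event.

    Each historical event is the ordered list of its occurrence positions,
    one entry per development stage.
    """
    for stages in historical_events:
        # Skip the last stage: it has no next stage to borrow a position from.
        for index in range(len(stages) - 1):
            if stages[index] == current_position:
                return stages[index + 1]
    return None
```

The sketch returns the first match it finds; a fuller implementation might weigh several matching historical events against each other.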
In some embodiments, performing abnormal event detection based on the received signal comprises:
under the condition that any type of signal matches a target signal in an abnormal event library, determining that the target abnormal event is detected, wherein the abnormal event library is used for storing audio signals and video signals of a plurality of abnormal events, and the target signal includes the audio signal and/or the video signal of the target abnormal event; the target abnormal event includes any one of the plurality of abnormal events.
In some embodiments, determining that any type of signal matches a target signal in the abnormal event library comprises: determining the similarity between the signal of any type and each signal in the abnormal event library, and determining that the signal matches the target signal in the case where the similarity between the signal and the target signal satisfies a similarity condition.
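One way to picture the similarity condition is a comparison of feature vectors against a threshold. The sketch below assumes that each signal has already been reduced to a numeric feature vector, uses cosine similarity as the measure, and takes 0.9 as the threshold; all three choices, as well as the names, are assumptions, since the text does not fix a particular measure.

```python
import math
from typing import Dict, List, Optional


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_signal(features: List[float],
                 library: Dict[str, List[float]],
                 threshold: float = 0.9) -> Optional[str]:
    """Return the most similar library event whose similarity reaches the threshold."""
    best_event, best_similarity = None, threshold
    for event_name, reference in library.items():
        similarity = cosine_similarity(features, reference)
        if similarity >= best_similarity:
            best_event, best_similarity = event_name, similarity
    return best_event
```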
In some embodiments, the target signal corresponds to a plurality of abnormal events; determining that a target abnormal event is detected comprises:
determining that the target abnormal event is detected in the case where it is determined, based on the target signal and Bayesian probability, that the signal of any type is a signal of the target abnormal event among the plurality of abnormal events.
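When one target signal is shared by several abnormal events, Bayes' rule gives a posterior probability for each candidate event, proportional to its prior probability multiplied by the likelihood of the observed signal under that event. The sketch below picks the event with the highest posterior; the priors, the likelihoods, and the name `most_probable_event` are hypothetical inputs, as the text does not say how these probabilities are obtained.

```python
from typing import Dict, Optional, Tuple


def most_probable_event(priors: Dict[str, float],
                        likelihoods: Dict[str, float]
                        ) -> Tuple[Optional[str], Dict[str, float]]:
    """Apply Bayes' rule: P(event | signal) is proportional to P(signal | event) * P(event)."""
    joint = {event: priors[event] * likelihoods[event] for event in priors}
    evidence = sum(joint.values())
    if evidence == 0.0:
        return None, {}  # The signal is impossible under every candidate event.
    posterior = {event: value / evidence for event, value in joint.items()}
    best_event = max(posterior, key=posterior.get)
    return best_event, posterior
```

For example, a loud shout might match both a fight and a celebration; with a low prior for fights but a much higher likelihood of that sound during a fight, the posterior can still favor the fight.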
In some embodiments, the target abnormal event occurs multiple times; after it is detected, based on any type of signal, that a target abnormal event occurs in the target area, the method further comprises: performing event statistics based on the event information of the multiple occurrences of the target abnormal event, and generating an analysis report of the target abnormal event based on the statistical result, wherein the analysis report comprises at least one of an occurrence time chart, an occurrence position chart, and an occurrence frequency chart of the target abnormal event.
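The statistics behind such a report can be sketched with simple counters. The event record layout (an `hour` and a `position` field) and the name `build_analysis_report` are assumptions made for illustration; the resulting counts are the kind of data from which the charts could be drawn.

```python
from collections import Counter
from typing import Dict, List


def build_analysis_report(events: List[Dict[str, object]]) -> Dict[str, object]:
    """Count occurrences of the target abnormal event by hour and by position."""
    by_hour = Counter(event["hour"] for event in events)
    by_position = Counter(event["position"] for event in events)
    return {
        "occurrence_time": dict(by_hour),
        "occurrence_position": dict(by_position),
        "occurrence_count": len(events),
    }
```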
The embodiment of the application provides an audio video processing method. In the method, an audio signal and a video signal of a target area are collected through an audio acquisition device and a video acquisition device, respectively, and abnormal event detection is then performed based on the two types of signals; it can be determined that a target abnormal event is detected as long as the abnormal event is detected based on any one type of signal.
It should be noted that the audio video processing method provided in the embodiment of the present application and the embodiment of the audio video processing system belong to the same concept; for the specific process, refer to the embodiment of the audio video processing system, which is not described herein again.
Referring to fig. 4, fig. 4 is a block diagram of an audio and video processing apparatus provided in an embodiment of the present application, where the apparatus includes:
the signal receiving module 401 is configured to receive an audio signal sent by an audio acquisition device and a video signal sent by a video acquisition device, where the audio signal and the video signal are both signals of an acquired target area;
an event detection module 402 for performing abnormal event detection based on the audio signal and the video signal;
an information obtaining module 403, configured to obtain event information of a target abnormal event if it is detected that the target abnormal event occurs in the target area based on any type of signal;
and the event monitoring module 404 is configured to monitor the target abnormal event based on the event information of the target abnormal event.
In some embodiments, the event detection module 402 is configured to:
under the condition that the audio signal is determined not to be matched with the reference audio signal and the video signal is determined not to be matched with the reference video signal, abnormal event detection is carried out on the basis of the audio signal and the video signal;
under the condition that the audio signal is matched with the reference audio signal and the video signal is not matched with the reference video signal, abnormal event detection is carried out on the basis of the video signal;
performing abnormal event detection based on the audio signal in the case where it is determined that the audio signal does not match the reference audio signal and the video signal matches the reference video signal; the reference audio signal is an audio signal of the target area acquired when the target abnormal event does not occur in the target area, and the reference video signal is a video signal of the target area acquired when the target abnormal event does not occur in the target area.
In some embodiments, the event detection module 402 is configured to: determining the occurrence position of the target abnormal event based on the audio signal; acquiring a video signal acquired again based on the occurrence position of the target abnormal event; and detecting abnormal events based on the recaptured video signals.
In some embodiments, the event detection module 402 is configured to: acquiring a video signal acquired by target standby video acquisition equipment; and detecting the abnormal event based on the video signal acquired by the target standby video acquisition equipment, wherein the monitoring position of the target standby video acquisition equipment is the same as the occurrence position of the target abnormal event.
In some embodiments, the event detection module 402 is configured to: controlling the position monitored by the video acquisition equipment to be adjusted to the occurrence position so as to enable the video acquisition equipment to acquire the video signal of the occurrence position; and detecting abnormal events based on the video signals re-acquired by the video acquisition equipment.
In some embodiments, the target abnormal event includes a plurality of development stages, and the event information includes the occurrence position of the target abnormal event at the current stage of the plurality of development stages. The event monitoring module 404 is configured to: predict the occurrence position of the target abnormal event at the next stage based on its occurrence position at the current stage; and monitor the target abnormal event based on its occurrence position at the next stage.
In some embodiments, the event monitoring module 404 is configured to: acquire event information of a historical abnormal event, the event information including the occurrence positions of the historical abnormal event at a plurality of development stages; and, in response to the occurrence position of the historical abnormal event at a first historical stage matching the occurrence position of the target abnormal event at the current stage, take the occurrence position of the historical abnormal event at a second historical stage as the occurrence position of the target abnormal event at the next stage. The first historical stage is any stage other than the last of the development stages of the historical abnormal event, and the second historical stage is the stage immediately following the first historical stage.
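The history-based prediction can be illustrated with a short Python sketch. It assumes that each historical abnormal event is stored as an ordered list of occurrence positions, one per development stage, and that "matching" means equality of position labels; both are simplifications chosen for illustration, and `predict_next_position` is a hypothetical name.

```python
from typing import List, Optional, Sequence


def predict_next_position(current_position: str,
                          historical_events: Sequence[List[str]]) -> Optional[str]:
    """Predict where the target abnormal event will occur at its next stage.

    Each entry of historical_events lists the occurrence positions of one
    historical abnormal event, ordered by development stage.
    """
    for stages in historical_events:
        # Any stage except the last may serve as the "first historical stage".
        for index in range(len(stages) - 1):
            if stages[index] == current_position:
                # The following stage is the "second historical stage".
                return stages[index + 1]
    return None  # no historical stage matches the current position


if __name__ == "__main__":
    history = [
        ["lobby", "stairwell", "roof"],
        ["gate", "lobby", "parking"],
    ]
    print(predict_next_position("gate", history))
```

This sketch returns the first match it finds. When several historical events match the current position, a real system would need a rule for choosing among them, which the text does not specify.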
In some embodiments, the event detection module 402 is configured to: determine that the target abnormal event is detected when either type of signal is determined to match a target signal in an abnormal event library. The abnormal event library stores audio signals and video signals of multiple kinds of abnormal events, the target signal includes the audio signal and/or the video signal of the target abnormal event, and the target abnormal event is any one of the multiple kinds of abnormal events.
In some embodiments, the event detection module 402 is configured to: determine the similarity between either type of signal and each signal in the abnormal event library, and determine that the signal matches the target signal when the similarity between the signal and the target signal satisfies a similarity condition.
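As an illustration of similarity-based matching against the abnormal event library, the sketch below compares feature vectors with cosine similarity and a fixed threshold. The text does not name a similarity measure or a similarity condition, so cosine similarity, the 0.9 threshold, the feature-vector representation, and the names `cosine_similarity` and `match_event` are all assumptions.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_event(signal_features: Sequence[float],
                event_library: Dict[str, Sequence[float]],
                threshold: float = 0.9) -> Optional[str]:
    """Return the best-matching library event whose similarity meets the threshold."""
    best_event: Optional[str] = None
    best_score = threshold
    for event_name, reference_features in event_library.items():
        score = cosine_similarity(signal_features, reference_features)
        if score >= best_score:
            best_event, best_score = event_name, score
    return best_event


if __name__ == "__main__":
    # Hypothetical two-dimensional features for two kinds of abnormal events.
    library = {"glass_break": [1.0, 0.0], "alarm": [0.0, 1.0]}
    print(match_event([0.99, 0.05], library))
    print(match_event([1.0, 1.0], library))
```

A signal that is not close enough to any library entry yields `None`, which corresponds to no target abnormal event being detected from that signal.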
In some embodiments, the target signal corresponds to multiple kinds of abnormal events. The event detection module 402 is configured to: determine that the target abnormal event is detected when, based on the target signal and a Bayesian probability, either type of signal is determined to be a signal produced under the target abnormal event among the multiple kinds of abnormal events.
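One way to read the Bayesian step is as a posterior computation over the candidate abnormal events that share the same target signal: the posterior of each event is proportional to its prior probability multiplied by the likelihood of observing the signal under that event. The sketch below follows that reading. The prior and likelihood values, the event names, and the rule of picking the event with the highest posterior are assumptions for illustration only.

```python
from typing import Dict


def posterior(priors: Dict[str, float],
              likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Posterior probability of each candidate event given the observed signal.

    priors[e]      : prior probability of abnormal event e
    likelihoods[e] : probability of observing the signal if e is occurring
    """
    joint = {event: priors[event] * likelihoods[event] for event in priors}
    evidence = sum(joint.values())
    if evidence == 0.0:
        raise ValueError("signal has zero probability under every candidate event")
    return {event: value / evidence for event, value in joint.items()}


def most_probable_event(priors: Dict[str, float],
                        likelihoods: Dict[str, float]) -> str:
    """Pick the candidate event with the highest posterior probability."""
    probabilities = posterior(priors, likelihoods)
    return max(probabilities, key=probabilities.get)


if __name__ == "__main__":
    # Hypothetical numbers: a loud shout may indicate a fight or a celebration.
    priors = {"fight": 0.2, "celebration": 0.8}
    likelihoods = {"fight": 0.9, "celebration": 0.1}
    print(most_probable_event(priors, likelihoods))
```

In this reading, the target abnormal event is treated as detected when it is the event selected by `most_probable_event`; a posterior threshold could be used instead.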
In some embodiments, the target abnormal event occurs multiple times. The apparatus further comprises an event statistics module configured to, after the target abnormal event is detected in the target area based on either type of signal, compile event statistics from the event information of the multiple occurrences of the target abnormal event and generate an analysis report of the target abnormal event based on the statistical result, wherein the analysis report includes at least one of an occurrence time chart, an occurrence position chart and an occurrence frequency chart of the target abnormal event.
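The statistics behind such an analysis report can be sketched with simple counters. The record layout (a `time` and a `position` per occurrence), the choice of hour-of-day for the time chart and per-day counts for the frequency chart, and the name `summarize_events` are assumptions; the text only states which charts the report may contain.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, List


def summarize_events(events: List[dict]) -> Dict[str, dict]:
    """Aggregate repeated occurrences of a target abnormal event.

    Each event is a dict with a 'time' (datetime) and a 'position' (str).
    The returned counts are the raw data for the three charts of the report.
    """
    by_hour = Counter(event["time"].hour for event in events)
    by_position = Counter(event["position"] for event in events)
    by_day = Counter(event["time"].date().isoformat() for event in events)
    return {
        "occurrence_time": dict(by_hour),          # hour of day -> count
        "occurrence_position": dict(by_position),  # position -> count
        "occurrence_frequency": dict(by_day),      # calendar day -> count
    }


if __name__ == "__main__":
    # Hypothetical occurrences of the same target abnormal event.
    occurrences = [
        {"time": datetime(2022, 8, 11, 9, 30), "position": "gate"},
        {"time": datetime(2022, 8, 11, 9, 45), "position": "gate"},
        {"time": datetime(2022, 8, 12, 22, 5), "position": "lobby"},
    ]
    print(summarize_events(occurrences))
```

Rendering the counts as charts is left to whichever plotting tool the deployment uses.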
An embodiment of the present application provides an audio and video processing apparatus. The apparatus collects an audio signal and a video signal of a target area through an audio acquisition device and a video acquisition device, respectively, and then performs abnormal event detection based on both types of signal. A target abnormal event is determined to be detected when the abnormal event is detected based on either type of signal, so an abnormal event that is missed by one type of signal can still be detected based on the other, which increases the probability of detecting the abnormal event. The detected target abnormal event is then monitored, which further improves the accuracy of abnormal event detection.
Fig. 5 shows a block diagram of a control device 500 according to an exemplary embodiment of the present application. In general, the control device 500 includes: a processor 501 and a memory 502.
The processor 501 may include one or more processing cores, for example a 4-core or an 8-core processor. The processor 501 may be implemented in at least one of the following hardware forms: a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 501 may also include a main processor and a coprocessor. The main processor, also called a CPU (Central Processing Unit), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 501 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 501 may further include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
Memory 502 may include one or more computer-readable storage media, which may be non-transitory. Memory 502 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 502 is used to store at least one program code for execution by the processor 501 to implement the audio video processing method provided by the method embodiments herein.
In some embodiments, the control device 500 may further optionally include: a peripheral interface 503 and at least one peripheral. The processor 501, memory 502 and peripheral interface 503 may be connected by a bus or signal lines. Each peripheral may be connected to the peripheral interface 503 by a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 504, display screen 505, camera assembly 506, audio circuitry 507, and power supply 508.
The peripheral interface 503 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 501 and the memory 502. In some embodiments, the processor 501, memory 502, and peripheral interface 503 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 501, the memory 502, and the peripheral interface 503 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 504 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 504 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 504 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 504 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 504 may communicate with other control devices via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the world wide web, metropolitan area networks, intranets, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 504 may further include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 505 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 505 is a touch display screen, it can also capture touch signals on or above its surface. The touch signal may be input to the processor 501 as a control signal for processing. In this case, the display screen 505 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 505, disposed on the front panel of the control device 500; in other embodiments, there may be at least two display screens 505, respectively disposed on different surfaces of the control device 500 or in a folded design; in still other embodiments, the display screen 505 may be a flexible display disposed on a curved surface or a folding surface of the control device 500. The display screen 505 may even be given a non-rectangular, irregular shape, that is, an irregularly shaped screen. The display screen 505 may be an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode) display, or a display of another type.
The camera assembly 506 is used to capture images or video. Optionally, the camera assembly 506 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the control device, and the rear camera is disposed on the rear surface of the control device. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera can be combined to realize a background blurring function, and the main camera and the wide-angle camera can be combined to realize panoramic shooting, VR (Virtual Reality) shooting, or other combined shooting functions. In some embodiments, the camera assembly 506 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.
The audio circuit 507 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment, converts them into electrical signals, and inputs the electrical signals to the processor 501 for processing or to the radio frequency circuit 504 to realize voice communication. For stereo capture or noise reduction, there may be multiple microphones located at different positions on the control device 500. The microphone may also be an array microphone or an omnidirectional microphone. The speaker converts electrical signals from the processor 501 or the radio frequency circuit 504 into sound waves. The speaker may be a conventional diaphragm speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert an electrical signal not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as distance measurement. In some embodiments, the audio circuit 507 may also include a headphone jack.
The power supply 508 is used to supply power to the various components in the control device 500. The power supply 508 may be an alternating current supply, a direct current supply, a disposable battery or a rechargeable battery. When the power supply 508 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast charging technology.
In some embodiments, the control device 500 also includes one or more sensors 509. The one or more sensors 509 include, but are not limited to: acceleration sensor 510, gyro sensor 511, pressure sensor 512, optical sensor 513, and proximity sensor 514.
The acceleration sensor 510 may detect the magnitude of acceleration along the three coordinate axes of a coordinate system established with respect to the control device 500. For example, the acceleration sensor 510 may be used to detect the components of gravitational acceleration along the three coordinate axes. The processor 501 may control the display screen 505 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 510. The acceleration sensor 510 may also be used to collect game or user motion data.
The gyro sensor 511 may detect a body direction and a rotation angle of the control device 500, and the gyro sensor 511 may cooperate with the acceleration sensor 510 to collect a 3D motion of the user on the control device 500. The processor 501 may implement the following functions according to the data collected by the gyro sensor 511: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization while shooting, game control, and inertial navigation.
The pressure sensor 512 may be disposed on the side frame of the control device 500 and/or beneath the display screen 505. When the pressure sensor 512 is disposed on the side frame of the control device 500, it can detect the user's grip signal on the control device 500, and the processor 501 performs left-hand/right-hand recognition or shortcut operations according to the grip signal collected by the pressure sensor 512. When the pressure sensor 512 is disposed beneath the display screen 505, the processor 501 controls an operable control on the UI according to the user's pressure operation on the display screen 505. The operable control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The optical sensor 513 is used to collect the ambient light intensity. In one embodiment, the processor 501 may control the display brightness of the display screen 505 based on the ambient light intensity collected by the optical sensor 513. Specifically, when the ambient light intensity is high, the display brightness of the display screen 505 is increased; when the ambient light intensity is low, the display brightness of the display screen 505 is reduced. In another embodiment, the processor 501 may also dynamically adjust the shooting parameters of the camera assembly 506 based on the ambient light intensity collected by the optical sensor 513.
The proximity sensor 514, also called a distance sensor, is typically provided on the front panel of the control device 500. The proximity sensor 514 is used to measure the distance between the user and the front of the control device 500. In one embodiment, when the proximity sensor 514 detects that the distance between the user and the front of the control device 500 is gradually decreasing, the processor 501 controls the display screen 505 to switch from the bright screen state to the dark screen state; when the proximity sensor 514 detects that the distance between the user and the front of the control device 500 is gradually increasing, the processor 501 controls the display screen 505 to switch from the dark screen state to the bright screen state.
Those skilled in the art will appreciate that the configuration shown in Fig. 5 does not limit the control device 500, which may include more or fewer components than shown, combine some components, or use a different arrangement of components.
An embodiment of the present application further provides a computer-readable storage medium, where at least one program code is stored in the computer-readable storage medium, and the at least one program code is loaded and executed by a processor, so as to implement the audio and video processing method according to any implementation manner.
Embodiments of the present application further provide a computer program product, where the computer program product includes computer program codes, the computer program codes are stored in a computer-readable storage medium, and a processor of the computer device reads the computer program codes from the computer-readable storage medium, and executes the computer program codes, so that the computer device executes the audio and video processing method of any of the above-mentioned implementations.
In some embodiments, the computer program product according to the embodiments of the present application may be deployed to be executed on one computer device, on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network. Multiple computer devices distributed across multiple sites and interconnected by a communication network may constitute a blockchain system.
The present application is intended to cover various modifications, alternatives, and equivalents, which may be included within the spirit and scope of the present application.

Claims (15)

1. An audio video processing system, the system comprising: an audio acquisition device, a video acquisition device and a control device;
the audio acquisition equipment is used for acquiring an audio signal of a target area and sending the audio signal to the control equipment;
the video acquisition equipment is used for carrying out video monitoring on the target area so as to acquire a video signal of the target area and send the video signal to the control equipment;
the control device is used for receiving the audio signal and the video signal, detecting an abnormal event based on the received signals, acquiring event information of a target abnormal event if the target abnormal event is detected to occur in the target area based on any type of signals, and monitoring the target abnormal event based on the event information of the target abnormal event.
2. The system according to claim 1, wherein the control device, upon performing abnormal event detection based on the received signal, is configured to perform abnormal event detection based on the audio signal and the video signal in a case where it is determined that the audio signal does not match a reference audio signal and the video signal does not match a reference video signal;
performing abnormal event detection based on the video signal if it is determined that the audio signal matches the reference audio signal and the video signal does not match the reference video signal;
in the case that the audio signal is determined not to match the reference audio signal and the video signal is determined to match the reference video signal, performing abnormal event detection based on the audio signal; the reference audio signal is an audio signal of a target area acquired when the target abnormal event does not occur in the target area, and the reference video signal is a video signal of the target area acquired when the target abnormal event does not occur in the target area.
3. The system according to claim 2, wherein the control device, upon determining that the audio signal does not match the reference audio signal and that the video signal matches the reference video signal, is configured to determine an occurrence position of the target abnormal event based on the audio signal upon performing abnormal event detection based on the audio signal;
acquiring a video signal which is acquired again based on the occurrence position of the target abnormal event;
performing an abnormal event detection based on the re-acquired video signal.
4. The system of claim 3, further comprising a plurality of backup video capture devices for video surveillance of different locations of the target area;
the control equipment is used for acquiring a video signal acquired by target standby video acquisition equipment when acquiring a video signal acquired again based on the occurrence position of the target abnormal event and detecting the abnormal event based on the video signal acquired again;
and detecting an abnormal event based on the video signal acquired by the target standby video acquisition equipment, wherein the position monitored by the target standby video acquisition equipment is the same as the occurrence position of the target abnormal event.
5. The system according to claim 3, wherein the control device, when acquiring a recaptured video signal based on the occurrence position of the target abnormal event and performing abnormal event detection based on the recaptured video signal, is configured to control the position monitored by the video capture device to be adjusted to the occurrence position of the target abnormal event, so that the video capture device captures the video signal of the occurrence position of the target abnormal event;
and detecting abnormal events based on the video signals re-acquired by the video acquisition equipment.
6. The system of claim 1, wherein the target abnormal event comprises a plurality of stages of development, and the event information comprises a location of occurrence of the target abnormal event at a current stage of the plurality of stages of development;
the control device is used for predicting the occurrence position of the target abnormal event at the next stage based on the occurrence position of the target abnormal event at the current stage when monitoring the target abnormal event based on the event information of the target abnormal event;
and monitoring the target abnormal event based on the occurrence position of the target abnormal event in the next stage.
7. The system according to claim 6, wherein the control device, when predicting the occurrence position of the target abnormal event at the next stage based on the occurrence position of the target abnormal event at the current stage, is configured to acquire event information of a historical abnormal event including the occurrence positions of the historical abnormal event at a plurality of stages of development;
in response to the matching of the occurrence position of the historical abnormal event in a first historical stage and the occurrence position of the target abnormal event in the current stage, taking the occurrence position of the historical abnormal event in a second historical stage as the occurrence position of the target abnormal event in the next stage, wherein the first historical stage is any one stage except the last stage in a plurality of development stages of the historical abnormal event, and the second historical stage is the next stage of the first historical stage.
8. The system according to claim 1, wherein the control device, upon performing abnormal event detection based on the received signal, is configured to determine that a target abnormal event is detected if it is determined that any type of signal matches a target signal in an abnormal event library, the abnormal event library being configured to store audio signals and video signals of a plurality of types of abnormal events, the target signal including audio signals and/or video signals of the target abnormal event; the target abnormal event comprises any one of the plurality of types of abnormal events.
9. The system according to claim 8, wherein the control device, when determining that any kind of signal matches with a target signal in the abnormal event library, is configured to determine similarity between the any kind of signal and a signal in the abnormal event library, and in a case that the similarity between the any kind of signal and the target signal meets a similarity condition, determine that the any kind of signal matches with the target signal.
10. The system of claim 8, wherein the target signal corresponds to a plurality of abnormal events;
the control device is used for determining that the target abnormal event is detected under the condition that any type of signal is determined to be the signal under the target abnormal event in the multiple abnormal events based on the target signal and the Bayesian probability when the target abnormal event is determined to be detected.
11. The system of claim 1, wherein the target abnormal event occurs a plurality of times;
the control device is configured to, after detecting that the target abnormal event occurs in the target area based on any type of signal, perform event statistics based on event information of the plurality of occurrences of the target abnormal event, and generate an analysis report of the target abnormal event based on a statistical result, wherein the analysis report comprises at least one of an occurrence time chart, an occurrence position chart and an occurrence frequency chart of the target abnormal event.
12. An audio-video processing method, characterized in that the method comprises:
receiving an audio signal sent by audio acquisition equipment and a video signal sent by video acquisition equipment, wherein the audio signal and the video signal are both signals of an acquired target area;
performing abnormal event detection based on the audio signal and the video signal;
if a target abnormal event is detected to occur in the target area based on any type of signals, acquiring event information of the target abnormal event;
and monitoring the target abnormal event based on the event information of the target abnormal event.
13. An audio-video processing apparatus, characterized in that the apparatus comprises:
the signal receiving module is used for receiving an audio signal sent by audio acquisition equipment and a video signal sent by video acquisition equipment, wherein the audio signal and the video signal are both signals of an acquired target area;
an event detection module for performing abnormal event detection based on the audio signal and the video signal;
the information acquisition module is used for acquiring event information of a target abnormal event if the target abnormal event is detected to occur in the target area based on any type of signals;
and the event monitoring module is used for monitoring the target abnormal event based on the event information of the target abnormal event.
14. A control device, characterized in that the control device comprises one or more processors and one or more memories, in which at least one program code is stored, which is loaded and executed by the one or more processors to implement the audio video processing method according to claim 12.
15. A computer-readable storage medium having stored therein at least one program code, which is loaded and executed by a processor, to implement the audio video processing method of claim 12.
CN202210964134.XA 2022-08-11 2022-08-11 Audio video processing system, method, device, equipment and storage medium Pending CN115334289A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210964134.XA CN115334289A (en) 2022-08-11 2022-08-11 Audio video processing system, method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210964134.XA CN115334289A (en) 2022-08-11 2022-08-11 Audio video processing system, method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115334289A true CN115334289A (en) 2022-11-11

Family

ID=83923414

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210964134.XA Pending CN115334289A (en) 2022-08-11 2022-08-11 Audio video processing system, method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115334289A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114998839A (en) * 2022-07-06 2022-09-02 北京原流科技有限公司 Data management method and system based on hierarchical distribution
CN117541957A (en) * 2023-11-08 2024-02-09 继善(广东)科技有限公司 Method, system and medium for generating event solving strategy based on artificial intelligence
CN117541957B (en) * 2023-11-08 2024-05-24 继善(广东)科技有限公司 Method, system and medium for generating event solving strategy based on artificial intelligence

Similar Documents

Publication Publication Date Title
US11972036B2 (en) Scene-based sensor networks
US10123051B2 (en) Video analytics with pre-processing at the source end
CN104519318B (en) Frequency image monitoring system and surveillance camera
CN110895861B (en) Abnormal behavior early warning method and device, monitoring equipment and storage medium
JP6134825B2 (en) How to automatically determine the probability of image capture by the terminal using context data
AU2009243916B2 (en) A system and method for electronic surveillance
US20060170772A1 (en) Surveillance system and method
US10922547B1 (en) Leveraging audio/video recording and communication devices during an emergency situation
CN111818050B (en) Target access behavior detection method, system, device, equipment and storage medium
KR20040105612A (en) Change detecting method and apparatus and monitoring system using the method or apparatus
CN111614634A (en) Flow detection method, device, equipment and storage medium
CN112540739A (en) Screen projection method and system
CN115334289A (en) Audio video processing system, method, device, equipment and storage medium
US10867495B1 (en) Device and method for adjusting an amount of video analytics data reported by video capturing devices deployed in a given location
CN111416996A (en) Multimedia file detection method, multimedia file playing device, multimedia file equipment and storage medium
CN112714294B (en) Alarm preview method, device and computer readable storage medium
KR101964230B1 (en) System for processing data
CN113706807B (en) Method, device, equipment and storage medium for sending alarm information
CN115836516B (en) Monitoring system
KR20150114589A (en) Apparatus and method for subject reconstruction
CN113844976B (en) Alarm data processing method, device, computer equipment and storage medium
CN114420163B (en) Voice recognition method, voice recognition device, storage medium, electronic device, and vehicle
JP2000059759A (en) Monitoring camera system
KR20230089558A (en) Apparatus and method for transmitting images and apparatus and method for receiving images
CN114090281A (en) Method, device and equipment for acquiring abnormal behaviors and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination