CN110830771A - Intelligent monitoring method, device, equipment and computer readable storage medium - Google Patents

Info

Publication number
CN110830771A
Authority
CN
China
Prior art keywords
sound source
monitoring
information
voice information
position point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911098789.8A
Other languages
Chinese (zh)
Inventor
陈昊亮
许敏强
杨世清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou National Acoustic Intelligent Technology Co Ltd
Original Assignee
Guangzhou National Acoustic Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Guangzhou National Acoustic Intelligent Technology Co Ltd
Priority to CN201911098789.8A
Publication of CN110830771A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast for receiving images from a plurality of remote sources
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B21/00 Alarms responsive to a single specified undesired or abnormal condition and not otherwise provided for
    • G08B21/02 Alarms for ensuring the safety of persons
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G10L21/055 Time compression or expansion for synchronising with other signals, e.g. video signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Emergency Management (AREA)
  • Quality & Reliability (AREA)
  • Emergency Alarm Devices (AREA)
  • Alarm Systems (AREA)

Abstract

The invention discloses an intelligent monitoring method, an intelligent monitoring device, intelligent monitoring equipment and a computer readable storage medium. The method comprises the following steps: acquiring sound source voice information collected by each microphone at the same moment in a monitoring scene, and detecting whether the sound source voice information is sensitive information; if so, extracting the sound source voiceprint characteristics corresponding to the sound source voice information; and when the sound source voiceprint characteristics match the preset voiceprint characteristics, determining the sound source direction corresponding to the sound source voice information, and monitoring and collecting the monitoring picture in the sound source direction. This ensures that the target monitoring object appears in the collected monitoring picture, allows the monitoring angle to vary, and improves the user experience of the monitoring equipment.

Description

Intelligent monitoring method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of monitoring technologies, and in particular, to an intelligent monitoring method, apparatus, device, and computer-readable storage medium.
Background
With the accelerating pace of urbanization, most parents cannot be with their children at all times because of work, and therefore worry about them while away from home. The related art therefore proposes installing a monitor in the home so that a parent can check on a child's status in real time. However, once the monitor is installed in the home, a common problem arises: when the monitor uploads the captured picture to the parent's mobile phone in real time, the child may not appear in the monitoring picture, so the monitoring result falls short of what the parent expects. When the child is in an abnormal situation, the monitoring device may fail to capture the child in time to let the parent know that the child is in danger.
Disclosure of Invention
The main purpose of the invention is to provide an intelligent monitoring method, an intelligent monitoring device, intelligent monitoring equipment and a computer readable storage medium, so as to solve the problem in the prior art that the user experience is reduced because the monitoring view angle cannot follow the movement of the monitored object.
In order to achieve the above object, the present invention provides an intelligent monitoring method, which comprises:
acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene, and detecting whether the sound source voice information is sensitive information or not;
if so, extracting sound source voiceprint characteristics corresponding to the sound source voice information;
and when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics, determining the sound source direction corresponding to the sound source voice information, and monitoring the monitoring picture acquired in the sound source direction.
Further, the step of detecting whether the sound source voice information is sensitive information comprises:
converting the sound source voice information into text information according to a voice recognition technology;
inquiring the sensitivity index corresponding to the text information from a text information base;
and if the sensitivity index is greater than or equal to a preset value, determining the sound source voice information as sensitive information.
Further, when the voiceprint feature of the sound source matches a preset voiceprint feature, the step of determining the sound source direction corresponding to the sound source voice information includes:
calculating the voice power value of each preset position point in a monitoring scene;
and identifying the position point with the maximum voice power value, and determining the direction of the position point as the sound source direction.
Further, the step of calculating the voice power value of each preset position point in the monitoring scene includes:
acquiring coordinate information of each microphone and coordinate information of each preset position point;
and calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.
Further, the step of calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone, and the coordinate information of each preset position point includes:
carrying out Fourier transform on the sound source voice information, and calculating the time delay difference from each preset position point to each two adjacent microphones;
calculating generalized cross-correlation from each preset position point to each two adjacent microphones according to the Fourier transform and the time delay difference from each preset position point to each two adjacent microphones;
and calculating the voice power value corresponding to each preset position point according to the generalized cross correlation.
Further, the step of monitoring the monitoring picture collected in the sound source direction includes:
generating a camera driving instruction according to the sound source direction;
and adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.
Further, after the step of monitoring and collecting the monitoring picture in the sound source direction, the method further comprises the following steps:
sending the monitoring picture to a terminal, and detecting whether an alarm instruction exists;
and if the alarm instruction is detected, starting an alarm mode.
In addition, to achieve the above object, the present invention also provides an intelligent monitoring apparatus, including:
the acquisition module is used for acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene;
the detection module is used for detecting whether the sound source voice information is sensitive information;
the extraction module is used for extracting the sound source voiceprint characteristics corresponding to the sound source voice information if the sound source voice information is sensitive information;
the determining module is used for determining the sound source direction corresponding to the sound source voice information when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics;
and the monitoring acquisition module is used for monitoring and acquiring the monitoring picture in the sound source direction.
In addition, in order to achieve the above object, the present invention further provides an intelligent monitoring device, which includes a memory, a processor, and an intelligent monitoring program stored in the memory and capable of running on the processor, wherein the intelligent monitoring program, when executed by the processor, implements the steps of the intelligent monitoring method as described above.
In addition, to achieve the above object, the present invention further provides a computer readable storage medium, wherein the intelligent monitoring program is stored on the computer readable storage medium, and when being executed by a processor, the intelligent monitoring program implements the steps of the intelligent monitoring method as described above.
According to the invention, sound source voice information collected by each microphone at the same moment in a monitoring scene is acquired, and whether the sound source voice information is sensitive information is detected; if so, the sound source voiceprint characteristics corresponding to the sound source voice information are extracted; and when the sound source voiceprint characteristics match the preset voiceprint characteristics, the sound source direction corresponding to the sound source voice information is determined, and the monitoring picture in the sound source direction is monitored and collected. In this way, the monitoring view angle can be changed in time when an abnormality of the monitored object is detected, so that the monitored object is captured, which further improves the user experience of the monitoring equipment.
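The gating order of the method summarized above (sensitivity check, then voiceprint match, then direction finding) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `monitor_step` and the four callables passed into it are hypothetical placeholders for the detection, extraction, matching and localization steps.

```python
from typing import Callable, Optional, Sequence


def monitor_step(
    frames: Sequence[bytes],
    is_sensitive: Callable[[Sequence[bytes]], bool],
    extract_voiceprint: Callable[[Sequence[bytes]], object],
    matches_preset: Callable[[object], bool],
    locate_source: Callable[[Sequence[bytes]], float],
) -> Optional[float]:
    """Return the sound source direction to monitor, or None to keep the current view.

    `frames` holds the audio collected by every microphone at the same moment.
    All four callables are hypothetical stand-ins for the steps in the text.
    """
    # Gate 1: only sensitive voice information triggers further processing.
    if not is_sensitive(frames):
        return None
    # Gate 2: the speaker must match a preset voiceprint.
    voiceprint = extract_voiceprint(frames)
    if not matches_preset(voiceprint):
        return None
    # Only now is the (more expensive) direction finding performed.
    return locate_source(frames)
```

Returning `None` for either failed gate mirrors the behaviour described in the text, where the monitoring view angle is left unchanged unless both conditions hold.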
Drawings
FIG. 1 is a diagram illustrating a hardware configuration of an apparatus for implementing various embodiments of the invention;
FIG. 2 is a schematic flow chart of a first embodiment of the intelligent monitoring method of the present invention;
FIG. 3 is a diagram of an application scenario of the present invention.
The implementation, functional features and advantages of the present invention will be described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides an intelligent monitoring device. Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the hardware operating environment of the intelligent monitoring device according to an embodiment of the present invention.
The intelligent monitoring device of the embodiment of the invention may be a PC, a portable computer, a server, or similar equipment.
As shown in FIG. 1, the intelligent monitoring device may include: a processor 1001 (such as a CPU), a memory 1005, a user interface 1003, a network interface 1004, and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.
Optionally, the smart monitoring device may further include RF (Radio Frequency) circuits, sensors, WiFi modules, and the like.
Those skilled in the art will appreciate that the configuration of the intelligent monitoring device shown in FIG. 1 does not constitute a limitation of the intelligent monitoring device, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
As shown in FIG. 1, the memory 1005, which is a kind of computer readable storage medium, may include an operating system, a network communication module, a user interface module, and an intelligent monitoring program. The operating system is a program that manages and controls the hardware and software resources of the intelligent monitoring device and supports the operation of the intelligent monitoring program and other software or programs.
The intelligent monitoring device shown in FIG. 1 may be used to implement intelligent monitoring. The user interface 1003 is mainly used to detect or output various information, such as detecting sound source voice information and outputting an alarm signal; the network interface 1004 is mainly used for interacting and communicating with a background server; the processor 1001 may be configured to invoke the intelligent monitoring program stored in the memory 1005 and perform the following operations:
acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene, and detecting whether the sound source voice information is sensitive information or not;
if so, extracting sound source voiceprint characteristics corresponding to the sound source voice information;
and when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics, determining the sound source direction corresponding to the sound source voice information, and monitoring the monitoring picture acquired in the sound source direction.
Further, the step of detecting whether the sound source voice information is sensitive information comprises:
converting the sound source voice information into text information according to a voice recognition technology;
inquiring the sensitivity index corresponding to the text information from a text information base;
and if the sensitivity index is greater than or equal to a preset value, determining the sound source voice information as sensitive information.
Further, when the voiceprint feature of the sound source matches a preset voiceprint feature, the step of determining the sound source direction corresponding to the sound source voice information includes:
calculating the voice power value of each preset position point in a monitoring scene;
and identifying the position point with the maximum voice power value, and determining the direction of the position point as the sound source direction.
Further, the step of calculating the voice power value of each preset position point in the monitoring scene includes:
acquiring coordinate information of each microphone and coordinate information of each preset position point;
and calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.
Further, the step of calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone, and the coordinate information of each preset position point includes:
carrying out Fourier transform on the sound source voice information, and calculating the time delay difference from each preset position point to each two adjacent microphones;
calculating generalized cross-correlation from each preset position point to each two adjacent microphones according to the Fourier transform and the time delay difference from each preset position point to each two adjacent microphones;
and calculating the voice power value corresponding to each preset position point according to the generalized cross correlation.
Further, the step of monitoring the monitoring picture collected in the sound source direction includes:
generating a camera driving instruction according to the sound source direction;
and adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.
Further, after the step of monitoring the monitoring picture collected in the sound source direction, the processor 1001 is further configured to call the intelligent monitoring program stored in the memory 1005, and perform the following operations:
sending the monitoring picture to a terminal, and detecting whether an alarm instruction exists;
and if the alarm instruction is detected, starting an alarm mode.
According to the invention, sound source voice information collected by each microphone at the same moment in a monitoring scene is acquired, and whether the sound source voice information is sensitive information is detected; if so, the sound source voiceprint characteristics corresponding to the sound source voice information are extracted; and when the sound source voiceprint characteristics match the preset voiceprint characteristics, the sound source direction corresponding to the sound source voice information is determined, and the monitoring picture in the sound source direction is monitored and collected. In this way, the monitoring view angle can be changed in time when an abnormality of the monitored object is detected, so that the monitored object is captured, which further improves the user experience of the monitoring equipment.
The specific implementation of the intelligent monitoring device of the present invention is substantially the same as the following embodiments of the intelligent monitoring method, and will not be described herein again.
Based on the structure, the invention provides various embodiments of the intelligent monitoring method.
The invention provides an intelligent monitoring method.
Referring to FIG. 2, FIG. 2 is a schematic flow chart of a first embodiment of the intelligent monitoring method of the present invention.
In the present embodiment, an embodiment of an intelligent monitoring method is provided, and it should be noted that although a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in an order different from that shown or described herein.
In this embodiment, the intelligent monitoring method includes:
step S10, acquiring sound source voice information collected by each microphone at the same time in a monitoring scene, and detecting whether the sound source voice information is sensitive information.
It should be noted that microphones are arranged at multiple places in the monitoring scene, and when the monitored object makes a sound, sound source voice information can be acquired from the arranged microphones. The same sound source voice information may be received by multiple microphones.
After the sound source voice information is acquired, whether it is sensitive information is detected, in order to judge whether the monitored object is in a dangerous state and, further, whether to start the alarm mode.
Further, step S10 includes:
step a, converting the sound source voice information into text information according to a voice recognition technology.
The voice recognition technology has various uses, and in this embodiment, the voice recognition technology is used to extract semantic information in the voice information of the sound source, convert the semantic information into text information, and obtain a sensitivity index corresponding to the text information.
And b, inquiring the sensitivity index corresponding to the text information from a text information base.
The text information base stores a plurality of pieces of text information, which record common everyday expressions, emotion words, interjections, and the like. A sensitivity index is assigned to each piece of text information, and the text information and the sensitivity indexes are in one-to-one correspondence. For example, the sensitivity index can be expressed in levels, namely a first level, a second level, a third level, and so on, with the level indicating how sensitive the text information is.
And acquiring text information in the sound source voice information, and searching the sensitivity index of the text information from a text information base according to the one-to-one correspondence relationship between the text information and the sensitivity index.
And c, if the sensitivity index is greater than or equal to a preset value, determining the sound source voice information as sensitive information.
After the sensitivity index of the text information in the sound source voice information is obtained, it is judged whether the sensitivity index is greater than or equal to a preset value. If so, the sound source voice information corresponding to the text information is determined to be sensitive information. If the sensitivity index is smaller than the preset value, the sound source voice information corresponding to the text information is not sensitive information; the device remains in its current monitoring state and the monitoring view angle is not changed. If the sound source voice information is sensitive information, the monitored object is in an abnormal condition, the monitoring view angle needs to be changed, and tracking monitoring is carried out on the monitored object.
The preset value is set by a user of the monitoring equipment according to the usual expression habit of the monitored object.
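Steps b and c can be illustrated with a small lookup-and-threshold sketch. It assumes the speech has already been converted to text (step a), that the sensitivity index is a number where larger means more sensitive, and that a phrase matches when it appears anywhere in the recognized text. The phrases, index values and the default preset value below are hypothetical examples, not values from the patent.

```python
from typing import Mapping

# Hypothetical text information base: phrase -> sensitivity index
# (larger = more sensitive; the entries are illustrative only).
TEXT_INFO_BASE: Mapping[str, int] = {
    "help": 3,
    "it hurts": 3,
    "ouch": 2,
    "hello": 0,
}


def sensitivity_index(text: str, base: Mapping[str, int] = TEXT_INFO_BASE) -> int:
    """Step b: look up the highest sensitivity index of any known phrase in the text."""
    lowered = text.lower()
    return max((idx for phrase, idx in base.items() if phrase in lowered), default=0)


def is_sensitive(text: str, preset_value: int = 2) -> bool:
    """Step c: sensitive when the index is greater than or equal to the preset value."""
    return sensitivity_index(text) >= preset_value
```

With these example entries, `is_sensitive("Help me!")` is true and `is_sensitive("hello there")` is false. Raising `preset_value` makes the device less eager to change its view, which is how a user could tune it to a child's usual way of speaking.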
And step S20, if yes, extracting sound source voiceprint characteristics corresponding to the sound source voice information.
Every piece of voice information has voiceprint features, and the voiceprint features differ when the voice information differs. Voiceprint features have a variety of uses; in this embodiment they are used to identify whether the object that produced the sound is the monitored object specified on the monitoring device.
If the sensitivity index of the sound source voice information is greater than or equal to the preset value, an abnormal condition has occurred in the monitoring scene, and the monitored object needs to be tracked to determine whether the abnormal condition comes from it. Therefore, when the sensitivity index of the sound source voice information is greater than or equal to the preset value, the sound source voiceprint feature corresponding to the sound source voice information is extracted to determine whether the object that produced the sensitive voice information is the monitored object specified by the user of the monitoring device.
And step S30, when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics, determining the sound source direction corresponding to the sound source voice information, and monitoring the monitoring picture collected in the sound source direction.
The preset voiceprint feature is set by a user of the monitoring device and is the voiceprint feature of the monitored object specified by the user; it can be understood that voiceprint features of a plurality of monitored objects can be set. When the sound source voiceprint feature corresponding to the obtained sound source voice information matches the preset voiceprint feature, the direction of the monitored object that produced the sound source voice information, namely the sound source direction, is analyzed; the monitoring direction is set to the sound source direction, the monitoring picture in the sound source direction is monitored and collected, and the monitoring picture is sent to the terminal of the user.
If the voiceprint feature of the sound source voice information is not the preset voiceprint feature, the sound source direction of the sound source voice information does not need to be analyzed, and the monitoring visual angle does not need to be changed.
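The text does not say how a sound source voiceprint feature is compared with the preset ones. One common choice, used here purely as an assumption, is to represent each voiceprint as a numeric feature vector and accept a match when the cosine similarity reaches a threshold. The vectors, the 0.8 threshold and the function names are hypothetical.

```python
import math
from typing import Iterable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def matches_preset(
    source_voiceprint: Sequence[float],
    preset_voiceprints: Iterable[Sequence[float]],
    threshold: float = 0.8,  # assumed value; not specified in the text
) -> bool:
    """True when the source voiceprint is close enough to any preset voiceprint."""
    return any(
        cosine_similarity(source_voiceprint, preset) >= threshold
        for preset in preset_voiceprints
    )
```

Passing several preset vectors corresponds to registering more than one monitored object; when none of them matches, the monitoring view angle is left as it is.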
Further, step S30 includes:
and d, generating a camera driving instruction according to the sound source direction.
Once the sound source direction has been determined, a camera driving instruction is generated. The instruction is used to adjust and control the monitoring direction of the camera, so that the scene in the sound source direction can be monitored. The camera on the monitoring device can be driven and repositioned by a motor, and the motor can be driven to adjust the camera's orientation based on the camera driving instruction.
And e, adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.
The direction of the monitoring camera is controlled and adjusted based on the camera driving instruction, so that the camera can shoot a monitoring picture in the sound source direction, collect the monitoring picture, and transmit the monitoring picture to a mobile phone of a user in real time, so that the user can timely know the abnormal condition of a monitored object and judge whether to start an alarm mode.
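As an illustration of steps d and e, the sketch below turns a sound source direction into a rotation command for a motor-driven camera. The text does not define the instruction format, so the `CameraDriveInstruction` type, the use of pan angles in degrees, and the sign convention are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraDriveInstruction:
    """Hypothetical instruction: signed pan rotation in degrees.

    Positive values are taken to mean counter-clockwise rotation (an assumption).
    """
    pan_delta_deg: float


def make_drive_instruction(
    current_pan_deg: float, source_direction_deg: float
) -> CameraDriveInstruction:
    """Step d: compute the shortest rotation from the current pan angle to the source."""
    # Wrap the difference into [-180, 180) so the motor takes the shorter way round.
    delta = (source_direction_deg - current_pan_deg + 180.0) % 360.0 - 180.0
    return CameraDriveInstruction(pan_delta_deg=delta)


def apply_instruction(current_pan_deg: float, instruction: CameraDriveInstruction) -> float:
    """Step e: the pan angle after the motor has executed the instruction."""
    return (current_pan_deg + instruction.pan_delta_deg) % 360.0
```

For a camera facing 350 degrees and a sound source at 10 degrees, the instruction is a 20 degree rotation rather than a 340 degree one, which keeps the time to bring the monitored object into view short.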
In this embodiment, the sound source voice information collected at the same time by microphones in different directions is acquired, and whether the sound source voice information is sensitive information is judged. The sound source voiceprint feature is extracted only when the sound source voice information is sensitive information; otherwise no voiceprint information needs to be extracted. Setting this condition for extracting the voiceprint information avoids disordered device functions and unnecessary waste of resources.
In this embodiment, when the sound source voiceprint feature matches the preset voiceprint feature, the sound source direction corresponding to the sound source voice information is determined, the monitoring direction of the camera is adjusted accordingly, and the monitoring picture in the sound source direction is monitored and collected. This enables tracking monitoring of the target and a timely grasp of the target's abnormal conditions. Furthermore, compared with a monitoring mode whose monitoring range never changes, or one that changes the monitoring range according to a preset pattern, this intelligent monitoring mode greatly improves the user experience.
Further, a second embodiment of the intelligent monitoring method of the present invention is presented. The difference between the second embodiment of the intelligent monitoring method and the first embodiment of the intelligent monitoring method is that when the voiceprint feature of the sound source matches with the preset voiceprint feature, the step of determining the sound source direction corresponding to the sound source voice information includes:
and f, calculating the voice power value of each preset position point in the monitoring scene.
The voice power value of each preset position point represents how loud the sound is at that point, and the sound source direction is then judged from the voice power values of the preset position points.
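A minimal sketch of how the sound source direction could be read off once a voice power value is available for each preset position point: pick the point with the largest value and take its bearing. Measuring the bearing from a camera position with `atan2`, and the example coordinates, are assumptions for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def sound_source_direction(
    points: Sequence[Point],
    powers: Sequence[float],
    camera_xy: Point = (0.0, 0.0),  # assumed reference position for the bearing
) -> Tuple[int, float]:
    """Return (index of the loudest preset point, its bearing in degrees in [0, 360))."""
    if len(points) != len(powers) or not points:
        raise ValueError("points and powers must be non-empty and of equal length")
    # The position point with the maximum voice power value is taken as the source.
    best = max(range(len(points)), key=lambda i: powers[i])
    x, y = points[best]
    bearing = math.degrees(math.atan2(y - camera_xy[1], x - camera_xy[0])) % 360.0
    return best, bearing
```

For example, with points `[(1, 0), (0, 1), (-1, 0)]` and powers `[0.2, 0.9, 0.4]`, the second point is selected and its bearing from the origin is 90 degrees.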
Further, step f comprises:
and f1, acquiring the coordinate information of each microphone and the coordinate information of each preset position point.
The final purpose of acquiring the coordinate information of each microphone and the coordinate information of each preset position point is to calculate the voice power value of the voice information of the sound source.
And f2, calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.
The steps of specifically calculating the voice power value are as follows:
and f11, performing Fourier transform on the sound source voice information, and calculating the time delay difference from each preset position point to each two adjacent microphones.
And f12, calculating the generalized cross correlation from each preset position point to each two adjacent microphones according to the Fourier transform and the time delay difference from each preset position point to each two adjacent microphones.
And f13, calculating the voice power value corresponding to each preset position point according to the generalized cross correlation.
The Fourier transform is applied to the sound source voice information collected by each microphone in order to decompose the audio signal into its frequency components, which makes the signal easier to process. Specifically, the Fourier transform of each sound source voice information may be computed by any existing method.
For any preset position point, the time delay difference from that point to each two adjacent microphones can be calculated from the coordinate information of the preset position point and the coordinate information of each microphone. Specifically, the time delay difference $\tau_{mkl}$ from any preset position point m to any two adjacent microphones k and l can be calculated according to the following formula:
$$\tau_{mkl} = \frac{D_{mk} - D_{ml}}{c}$$
where $D_{mk}$ is the distance from the preset position point m to microphone k, $D_{ml}$ is the distance from the preset position point m to microphone l, and c is the sound velocity, c = 340 m/s.
When the coordinate information of the preset position point m is $(X_m, Y_m)$, the coordinate information of microphone k is $(X_k, Y_k)$ and the coordinate information of microphone l is $(X_l, Y_l)$, $D_{mk}$ and $D_{ml}$ are respectively:
$$D_{mk} = \sqrt{(X_m - X_k)^2 + (Y_m - Y_k)^2}$$
$$D_{ml} = \sqrt{(X_m - X_l)^2 + (Y_m - Y_l)^2}$$
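The delay calculation can be illustrated with a short Python sketch. It assumes that the time delay difference is the difference of the two Euclidean distances divided by the sound velocity (the sign convention is an assumption), and the function names and the example coordinates are hypothetical, chosen only for illustration.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, the value of c given in the text


def distance(point, mic):
    """Euclidean distance between a preset position point and a microphone."""
    return math.hypot(point[0] - mic[0], point[1] - mic[1])


def delay_difference(point, mic_k, mic_l, c=SPEED_OF_SOUND):
    """Time delay difference (D_mk - D_ml) / c in seconds (assumed sign)."""
    return (distance(point, mic_k) - distance(point, mic_l)) / c


# Hypothetical layout in metres: point m at (3, 4), microphone k at the
# origin, microphone l at (3, 0), so D_mk = 5 and D_ml = 4.
tau = delay_difference((3.0, 4.0), (0.0, 0.0), (3.0, 0.0))
```

With this layout the two distances differ by one metre, so the delay difference is 1/340 s.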
Next, the generalized cross-correlation from the preset position point to each two adjacent microphones is calculated from the Fourier transform of each sound source voice information and the time delay difference from the preset position point to each two adjacent microphones. Specifically, the generalized cross-correlation from the preset position point m to the adjacent microphones k and l is calculated as follows:
Figure BDA0002268324790000104
where $M_k(w)$ is the Fourier transform of the speech signal received by microphone k; $M_l^{*}(w)$ is the conjugate of the Fourier transform of the speech signal received by microphone l; w is the speech signal frequency; and $\phi_{kl}(w)$ is determined by the following formula:
Figure BDA0002268324790000105
Finally, the voice power value corresponding to the preset position point is calculated from the generalized cross-correlations from that point to each two adjacent microphones. Specifically, the voice power value P(m) corresponding to the preset position point m is calculated as:
Figure BDA0002268324790000106
wherein M is the total number of microphones.
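The formula images for the generalized cross-correlation, the weighting function and the voice power value are not reproduced in this text. One commonly used form that is consistent with the symbol definitions above is the steered response power with phase transform weighting. The phase transform weighting, the sign of the exponent and the summation over all microphone pairs (rather than only adjacent pairs) are assumptions here and are not taken from the original figures:

```latex
\begin{aligned}
R_{kl}(\tau_{mkl}) &= \int \phi_{kl}(w)\, M_k(w)\, M_l^{*}(w)\, e^{\,j w \tau_{mkl}}\, dw \\
\phi_{kl}(w) &= \frac{1}{\left| M_k(w)\, M_l^{*}(w) \right|} \\
P(m) &= \sum_{k=1}^{M-1} \sum_{l=k+1}^{M} R_{kl}(\tau_{mkl})
\end{aligned}
```

Under these assumptions the weighting keeps only the phase of each cross-spectrum term, so P(m) is largest at the position point whose time delay differences best align the microphone signals.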
Step g, identifying the position point with the maximum voice power value, and determining the direction of the position point as the sound source direction.
The position point with the largest voice power value, that is, the point where the sound is loudest, is the sound source position point. This is where the speaker is located: the sound there should be louder than at any other position, so the corresponding voice power value should also be the largest. The direction of the sound source position point is the sound source direction.
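Step g can be sketched in Python as follows. The sketch assumes that the voice power values are already available as a mapping from point coordinates to values, and that the direction is expressed as a bearing in degrees, counter-clockwise from the +X axis, as seen from a reference position such as the camera. The function name, the reference position and the sample values are hypothetical.

```python
import math


def locate_sound_source(power_by_point, reference=(0.0, 0.0)):
    """Pick the position point with the largest voice power value and return
    it with its bearing in degrees as seen from `reference`."""
    best_point = max(power_by_point, key=power_by_point.get)
    dx = best_point[0] - reference[0]
    dy = best_point[1] - reference[1]
    return best_point, math.degrees(math.atan2(dy, dx))


# Hypothetical voice power values for three preset position points.
powers = {(1.0, 0.0): 0.8, (0.0, 2.0): 3.5, (-1.0, -1.0): 1.2}
point, bearing = locate_sound_source(powers)
```

In this example the point (0, 2) has the largest value, so the bearing from the origin is 90 degrees.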
This embodiment provides a sound source positioning method: the sound source voice information collected by each microphone is acquired, together with the coordinate information of each microphone and of each preset position point; the voice power value corresponding to each preset position point is calculated from these; finally, the position point with the largest voice power value is identified and determined to be the sound source position. In this embodiment of the invention, the microphones are arranged outside the video acquisition equipment, and the voice power value is calculated from the coordinate information of the preset position points, the microphone position information and the sound source voice information. The sound source position, and hence its direction, is thereby determined, which improves the accuracy of sound source positioning and keeps the monitored object within the monitoring range.
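The positioning method summarized above can be sketched end to end in plain Python. This is a minimal illustration, not the patented implementation: it assumes phase transform weighting, a sum over all microphone pairs, a naive discrete Fourier transform, and a synthetic test signal produced by circularly shifting one random frame. The microphone layout, the sample rate and all function names are hypothetical.

```python
import cmath
import math
import random

SPEED_OF_SOUND = 340.0  # m/s


def dft(samples):
    """Naive discrete Fourier transform (O(n^2), fine for a short frame)."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * f * t / n)
                for t, x in enumerate(samples)) for f in range(n)]


def gcc_phat(spec_k, spec_l, tau, sample_rate):
    """Phase-weighted cross-correlation of two spectra at delay tau (seconds)."""
    n = len(spec_k)
    total = 0.0
    for f in range(n):
        cross = spec_k[f] * spec_l[f].conjugate()
        magnitude = abs(cross)
        if magnitude < 1e-12:
            continue  # skip empty bins to avoid dividing by zero
        signed_bin = f if f <= n // 2 else f - n
        omega = 2.0 * math.pi * signed_bin * sample_rate / n
        total += (cross / magnitude * cmath.exp(1j * omega * tau)).real
    return total


def steered_power(point, mics, spectra, sample_rate, c=SPEED_OF_SOUND):
    """Sum the cross-correlations of all microphone pairs for one point."""
    dists = [math.hypot(point[0] - mx, point[1] - my) for mx, my in mics]
    power = 0.0
    for k in range(len(mics)):
        for l in range(k + 1, len(mics)):
            tau = (dists[k] - dists[l]) / c
            power += gcc_phat(spectra[k], spectra[l], tau, sample_rate)
    return power


# Synthetic scene: three microphones on a line, source 0.34 m to their left.
sample_rate = 1000.0
mics = [(0.0, 0.0), (0.68, 0.0), (1.36, 0.0)]
true_source = (-0.34, 0.0)
candidates = [true_source, (0.68, 1.0), (2.04, 0.0)]

rng = random.Random(0)
frame = [rng.uniform(-1.0, 1.0) for _ in range(64)]
signals = []
for mx, my in mics:
    dist = math.hypot(true_source[0] - mx, true_source[1] - my)
    delay = round(dist / SPEED_OF_SOUND * sample_rate)  # delay in samples
    signals.append([frame[(t - delay) % 64] for t in range(64)])
spectra = [dft(s) for s in signals]

powers = {p: steered_power(p, mics, spectra, sample_rate) for p in candidates}
estimated = max(powers, key=powers.get)
```

The geometry is chosen so that the propagation delays are whole numbers of samples, which lets the circular shift model them exactly. Real recordings would need windowing, fractional delays and a much denser grid of preset position points.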
Further, a third embodiment of the intelligent monitoring method of the present invention is proposed, which differs from the first or second embodiment of the intelligent monitoring method in that, after the step of monitoring and acquiring the monitoring picture in the sound source direction, the method further includes:
Step h, sending the monitoring picture to a terminal, and detecting whether an alarm indication exists.
When the voiceprint feature of the sound source voice information is successfully matched with a preset voiceprint feature, the monitoring angle is immediately adjusted to monitor the monitored object, and the acquired monitoring picture containing the monitored object is sent to the mobile phone terminal of the user of the monitoring equipment, so that the user learns of the abnormal condition of the monitored object in time and decides, at the user's own discretion, whether an alarm is needed. If an alarm is needed, the user inputs an alarm indication, which is sent to the monitoring equipment; if no alarm is needed, no action is required. The user can also send a call connection indication to establish a call with the monitored object. Whether to send the alarm indication is decided by the user according to how harmful the abnormal condition is to the monitored object. The alarm indication can be generated by the user's voice input or by pressing keys on the mobile phone terminal.
Step i, if the alarm indication is detected, starting an alarm mode.
If the alarm indication is detected, an alarm mode is started. The alarm mode can be an acousto-optic (sound-and-light) alarm or a voice alarm. The acousto-optic alarm stops illegal actions by deterring the offender. For the voice alarm, recorded information is sent to the monitoring equipment and played there, so as to deter an offender or to guide a monitored object who is in a dangerous scene.
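The handling of an alarm indication on the monitoring equipment could be sketched as below. The enum values, the action strings and the function name are hypothetical placeholders for whatever the equipment actually drives; the sketch only shows the branching between no indication, the acousto-optic alarm and the voice alarm.

```python
from enum import Enum


class AlarmMode(Enum):
    SOUND_LIGHT = "sound_light"  # acousto-optic alarm
    VOICE = "voice"              # play a recording sent by the user


def handle_alarm_indication(indication, recording=None):
    """Return the actions the monitoring equipment would perform.

    `indication` is None when the terminal sent no alarm indication.
    """
    if indication is None:
        return []  # no alarm needed, nothing to do
    if indication is AlarmMode.SOUND_LIGHT:
        return ["activate_siren", "activate_strobe"]
    if indication is AlarmMode.VOICE:
        if recording is None:
            raise ValueError("a voice alarm needs a recording to play")
        return ["play_recording:" + recording]
    raise ValueError("unknown alarm indication: %r" % (indication,))


actions = handle_alarm_indication(AlarmMode.VOICE, recording="leave_now.wav")
```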
This embodiment provides an alarm mode: the monitoring picture is sent to the terminal, and the alarm mode is started according to the alarm indication from the terminal. This improves the timeliness of the alarm, so that a monitored object in an abnormal condition can be rescued in time and the safety of the monitored object is ensured.
In addition, an embodiment of the present invention further provides an intelligent monitoring device, where the intelligent monitoring device includes:
the acquisition module is used for acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene;
the detection module is used for detecting whether the sound source voice information is sensitive information;
the extraction module is used for extracting the sound source voiceprint characteristics corresponding to the sound source voice information if the sound source voice information is sensitive information;
the determining module is used for determining the sound source direction corresponding to the sound source voice information when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics;
and the monitoring acquisition module is used for monitoring and acquiring the monitoring picture in the sound source direction.
Further, the intelligent monitoring device further comprises:
the conversion module is used for converting the sound source voice information into text information according to a voice recognition technology;
the query module is used for querying the sensitivity index corresponding to the text information from a text information base;
and the determining module is used for determining the sound source voice information as the sensitive information if the sensitivity index is greater than or equal to a preset value.
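The query and threshold comparison performed by these modules can be sketched in Python. The sketch assumes the speech has already been converted to text, that the text information base maps individual words to a sensitivity index, and that the index of a sentence is the highest index among its words. The word list, the index values and the preset value are all hypothetical.

```python
# Hypothetical text information base: word -> sensitivity index.
SENSITIVITY_BASE = {"help": 0.9, "fire": 0.8, "thief": 0.7, "hello": 0.0}
PRESET_VALUE = 0.6  # hypothetical threshold


def sensitivity_index(text, base=SENSITIVITY_BASE):
    """Highest sensitivity index of any word in the text (0.0 if none)."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return max((base.get(w, 0.0) for w in words), default=0.0)


def is_sensitive(text, preset=PRESET_VALUE):
    """True if the sensitivity index is greater than or equal to the preset value."""
    return sensitivity_index(text) >= preset


flagged = is_sensitive("Help! There is a fire")
```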
Further, the determining module further comprises:
the calculating unit is used for calculating the voice power value of each preset position point in the monitoring scene;
the recognition unit is used for recognizing the position point with the maximum voice power value;
and the determining unit is used for determining the direction of the position point as the sound source direction.
Further, the calculation unit includes:
the acquisition subunit is used for acquiring the coordinate information of each microphone and the coordinate information of each preset position point;
and the calculating subunit is used for calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.
Furthermore, the calculating unit is further configured to perform Fourier transform on the sound source voice information, and calculate the time delay difference from each preset position point to each two adjacent microphones; calculate the generalized cross-correlation from each preset position point to each two adjacent microphones according to the Fourier transform and the time delay difference from each preset position point to each two adjacent microphones; and calculate the voice power value corresponding to each preset position point according to the generalized cross-correlation.
Further, the monitoring acquisition module comprises:
the generating unit is used for generating a camera driving instruction according to the sound source direction;
and the adjusting unit is used for adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.
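A camera driving instruction could, for example, carry the signed pan rotation that turns the camera from its current heading to the sound source direction by the shortest path. The sketch below assumes headings in degrees and a simple dictionary as the instruction format; both, along with the function name, are hypothetical and not taken from the original text.

```python
def camera_drive_instruction(current_pan_deg, source_bearing_deg):
    """Build a pan instruction with the shortest signed rotation in (-180, 180]."""
    delta = (source_bearing_deg - current_pan_deg + 180.0) % 360.0 - 180.0
    if delta == -180.0:
        delta = 180.0  # prefer +180 so the range is half-open on one side
    return {"command": "pan", "degrees": delta}


# Camera currently at 350 degrees, sound source at 10 degrees.
instruction = camera_drive_instruction(350.0, 10.0)
```

In this example the camera turns 20 degrees forward through 0 rather than 340 degrees backward.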
Further, the intelligent monitoring device further comprises:
the sending unit is used for sending the monitoring picture to a terminal;
the detection unit is used for detecting whether an alarm instruction exists or not;
and the starting unit is used for starting an alarm mode if the alarm indication is detected.
The implementation of the intelligent monitoring device of the present invention is basically the same as that of the above-mentioned intelligent monitoring method, and is not described herein again.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where an intelligent monitoring program is stored on the computer-readable storage medium, and the intelligent monitoring program, when executed by a processor, implements the steps of the intelligent monitoring method described above.
It should be noted that the computer readable storage medium may be provided in the intelligent monitoring apparatus.
The specific implementation of the computer-readable storage medium of the present invention is substantially the same as that of the above-mentioned embodiments of the intelligent monitoring method, and is not described herein again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An intelligent monitoring method, characterized in that the intelligent monitoring method comprises the following steps:
acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene, and detecting whether the sound source voice information is sensitive information or not;
if so, extracting sound source voiceprint characteristics corresponding to the sound source voice information;
and when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics, determining the sound source direction corresponding to the sound source voice information, and monitoring and acquiring the monitoring picture in the sound source direction.
2. The intelligent monitoring method according to claim 1, wherein the step of detecting whether the sound source voice information is sensitive information comprises:
converting the sound source voice information into text information according to a voice recognition technology;
inquiring the sensitivity index corresponding to the text information from a text information base;
and if the sensitivity index is greater than or equal to a preset value, determining the sound source voice information as sensitive information.
3. The intelligent monitoring method according to claim 1, wherein the step of determining the sound source direction corresponding to the sound source voice information comprises:
calculating the voice power value of each preset position point in a monitoring scene;
and identifying the position point with the maximum voice power value, and determining the direction of the position point as the sound source direction.
4. The intelligent monitoring method according to claim 3, wherein the step of calculating the voice power value of each preset position point in the monitoring scene comprises:
acquiring coordinate information of each microphone and coordinate information of each preset position point;
and calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.
5. The intelligent monitoring method according to claim 4, wherein the step of calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone, and the coordinate information of each preset position point comprises:
carrying out Fourier transform on the sound source voice information, and calculating the time delay difference from each preset position point to each two adjacent microphones;
calculating generalized cross-correlation from each preset position point to each two adjacent microphones according to the Fourier transform and the time delay difference from each preset position point to each two adjacent microphones;
and calculating the voice power value corresponding to each preset position point according to the generalized cross correlation.
6. The intelligent monitoring method according to claim 1, wherein the step of monitoring and acquiring the monitoring picture in the sound source direction comprises:
generating a camera driving instruction according to the sound source direction;
and adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.
7. The intelligent monitoring method according to any one of claims 1-6, wherein after the step of monitoring and acquiring the monitoring picture in the sound source direction, the method further comprises:
sending the monitoring picture to a terminal, and detecting whether an alarm indication exists;
and if the alarm indication is detected, starting an alarm mode.
8. An intelligent monitoring device, comprising:
the acquisition module is used for acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene;
the detection module is used for detecting whether the sound source voice information is sensitive information;
the extraction module is used for extracting the sound source voiceprint characteristics corresponding to the sound source voice information if the sound source voice information is sensitive information;
the determining module is used for determining the sound source direction corresponding to the sound source voice information when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics;
and the monitoring acquisition module is used for monitoring and acquiring the monitoring picture in the sound source direction.
9. An intelligent monitoring device, comprising a memory, a processor and an intelligent monitoring program stored on the memory and operable on the processor, the intelligent monitoring program when executed by the processor implementing the steps of the intelligent monitoring method according to any one of claims 1 to 7.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon an intelligent monitoring program, which when executed by a processor implements the steps of the intelligent monitoring method according to any one of claims 1 to 7.
CN201911098789.8A 2019-11-11 2019-11-11 Intelligent monitoring method, device, equipment and computer readable storage medium Pending CN110830771A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911098789.8A CN110830771A (en) 2019-11-11 2019-11-11 Intelligent monitoring method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911098789.8A CN110830771A (en) 2019-11-11 2019-11-11 Intelligent monitoring method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN110830771A true CN110830771A (en) 2020-02-21

Family

ID=69554176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911098789.8A Pending CN110830771A (en) 2019-11-11 2019-11-11 Intelligent monitoring method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110830771A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111486537A (en) * 2020-06-15 2020-08-04 江苏新科电器有限公司 Air conditioner with security and monitoring functions
CN111768789A (en) * 2020-08-03 2020-10-13 上海依图信息技术有限公司 Electronic equipment and method, device and medium for determining identity of voice sender thereof
CN111784947A (en) * 2020-07-10 2020-10-16 上海茂声智能科技有限公司 Active early warning method, system and equipment based on image and voiceprint
CN112261365A (en) * 2020-10-19 2021-01-22 西北工业大学 Self-contained underwater acousto-optic monitoring and recording device and recording method
CN112533070A (en) * 2020-11-18 2021-03-19 深圳Tcl新技术有限公司 Video sound and picture adjusting method, terminal and computer readable storage medium
CN113114986A (en) * 2021-03-30 2021-07-13 深圳市冠标科技发展有限公司 Early warning method based on picture and sound synchronization and related equipment
CN114964650A (en) * 2022-08-01 2022-08-30 杭州兆华电子股份有限公司 Gas leakage alarm method and device based on acoustic imaging

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108089153A (en) * 2016-11-23 2018-05-29 杭州海康威视数字技术股份有限公司 A kind of sound localization method, apparatus and system
CN108766439A (en) * 2018-04-27 2018-11-06 广州国音科技有限公司 A kind of monitoring method and device based on Application on Voiceprint Recognition
CN109616125A (en) * 2018-12-13 2019-04-12 苏州思必驰信息科技有限公司 Monitoring method and system based on Application on Voiceprint Recognition
US20190130720A1 (en) * 2017-10-27 2019-05-02 Benjamin Lui Systems and methods for a machine learning baby monitor

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111486537A (en) * 2020-06-15 2020-08-04 江苏新科电器有限公司 Air conditioner with security and monitoring functions
CN111486537B (en) * 2020-06-15 2020-10-02 江苏新科电器有限公司 Air conditioner with security and monitoring functions
CN111784947A (en) * 2020-07-10 2020-10-16 上海茂声智能科技有限公司 Active early warning method, system and equipment based on image and voiceprint
CN111768789A (en) * 2020-08-03 2020-10-13 上海依图信息技术有限公司 Electronic equipment and method, device and medium for determining identity of voice sender thereof
CN111768789B (en) * 2020-08-03 2024-02-23 上海依图信息技术有限公司 Electronic equipment, and method, device and medium for determining identity of voice generator of electronic equipment
CN112261365A (en) * 2020-10-19 2021-01-22 西北工业大学 Self-contained underwater acousto-optic monitoring and recording device and recording method
CN112533070A (en) * 2020-11-18 2021-03-19 深圳Tcl新技术有限公司 Video sound and picture adjusting method, terminal and computer readable storage medium
CN112533070B (en) * 2020-11-18 2024-02-06 深圳Tcl新技术有限公司 Video sound and picture adjusting method, terminal and computer readable storage medium
CN113114986A (en) * 2021-03-30 2021-07-13 深圳市冠标科技发展有限公司 Early warning method based on picture and sound synchronization and related equipment
CN113114986B (en) * 2021-03-30 2023-04-28 深圳市冠标科技发展有限公司 Early warning method based on picture and sound synchronization and related equipment
CN114964650A (en) * 2022-08-01 2022-08-30 杭州兆华电子股份有限公司 Gas leakage alarm method and device based on acoustic imaging
CN114964650B (en) * 2022-08-01 2022-11-18 杭州兆华电子股份有限公司 Gas leakage alarm method and device based on acoustic imaging

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200221