CN110830771A

CN110830771A - Intelligent monitoring method, device, equipment and computer readable storage medium

Info

Publication number: CN110830771A
Application number: CN201911098789.8A
Authority: CN
Inventors: 陈昊亮; 许敏强; 杨世清
Original assignee: Guangzhou National Acoustic Intelligent Technology Co Ltd
Current assignee: Guangzhou National Acoustic Intelligent Technology Co Ltd
Priority date: 2019-11-11
Filing date: 2019-11-11
Publication date: 2020-02-21

Abstract

The invention discloses an intelligent monitoring method, apparatus, device and computer-readable storage medium. The method comprises: acquiring the sound source voice information collected by all microphones in a monitoring scene at the same moment, and detecting whether the sound source voice information is sensitive information; if so, extracting the sound source voiceprint features corresponding to the sound source voice information; and, when the sound source voiceprint features match preset voiceprint features, determining the sound source direction corresponding to the sound source voice information, and monitoring and collecting the monitoring picture in the sound source direction. This ensures that the target monitored object appears in the collected monitoring picture, allows the monitoring angle to vary, and improves the user experience of the monitoring device.

Description

Intelligent monitoring method, device, equipment and computer readable storage medium

Technical Field

The present invention relates to the field of monitoring technologies, and in particular, to an intelligent monitoring method, apparatus, device, and computer-readable storage medium.

Background

As the pace of urbanization quickens, most parents cannot stay with their children at all times because of work, and so worry about them while away from home. The related art therefore proposes installing a monitor in the home so that a parent can check on a child in real time. A common problem after installation, however, is that when the monitor uploads the captured picture to the parent's mobile phone in real time, the child may not appear in the monitoring picture, so the monitoring result falls short of what the parent expects. When the child is in an abnormal situation, the monitoring device may fail to capture the child in time to let the parent know that the child is in danger.

Disclosure of Invention

The invention mainly aims to provide an intelligent monitoring method, an intelligent monitoring apparatus, an intelligent monitoring device and a computer-readable storage medium, and aims to solve the problem that, in the prior art, the user experience is reduced because the monitoring viewing angle cannot change to follow the movement of the monitored object.

In order to achieve the above object, the present invention provides an intelligent monitoring method, which comprises:

acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene, and detecting whether the sound source voice information is sensitive information or not;

if so, extracting sound source voiceprint characteristics corresponding to the sound source voice information;

and when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics, determining the sound source direction corresponding to the sound source voice information, and monitoring and acquiring the monitoring picture in the sound source direction.

Further, the step of detecting whether the sound source voice information is sensitive information comprises:

converting the sound source voice information into text information according to a voice recognition technology;

inquiring the sensitivity index corresponding to the text information from a text information base;

and if the sensitivity index is greater than or equal to a preset value, determining the sound source voice information as sensitive information.

Further, when the voiceprint feature of the sound source matches a preset voiceprint feature, the step of determining the sound source direction corresponding to the sound source voice information includes:

calculating the voice power value of each preset position point in a monitoring scene;

and identifying the position point with the maximum voice power value, and determining the direction of the position point as the sound source direction.

Further, the step of calculating the voice power value of each preset position point in the monitoring scene includes:

acquiring coordinate information of each microphone and coordinate information of each preset position point;

and calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.

Further, the step of calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone, and the coordinate information of each preset position point includes:

carrying out Fourier transform on the sound source voice information, and calculating the time delay difference from each preset position point to each two adjacent microphones;

calculating generalized cross-correlation from each preset position point to each two adjacent microphones according to the Fourier transform and the time delay difference from each preset position point to each two adjacent microphones;

and calculating the voice power value corresponding to each preset position point according to the generalized cross correlation.

Further, the step of monitoring the monitoring picture collected in the sound source direction includes:

generating a camera driving instruction according to the sound source direction;

and adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.

Further, after the step of monitoring and collecting the monitoring picture in the sound source direction, the method further comprises the following steps:

sending the monitoring picture to a terminal, and detecting whether an alarm indication exists;

and if the alarm indication is detected, starting an alarm mode.

In addition, to achieve the above object, the present invention also provides an intelligent monitoring apparatus, including:

the acquisition module is used for acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene;

the detection module is used for detecting whether the sound source voice information is sensitive information;

the extraction module is used for extracting the sound source voiceprint characteristics corresponding to the sound source voice information if the sound source voice information is sensitive information;

the determining module is used for determining the sound source direction corresponding to the sound source voice information when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics;

and the monitoring acquisition module is used for monitoring and acquiring the monitoring picture in the sound source direction.

In addition, in order to achieve the above object, the present invention further provides an intelligent monitoring device, which includes a memory, a processor, and an intelligent monitoring program stored in the memory and capable of running on the processor, wherein the intelligent monitoring program, when executed by the processor, implements the steps of the intelligent monitoring method as described above.

In addition, to achieve the above object, the present invention further provides a computer readable storage medium, wherein the intelligent monitoring program is stored on the computer readable storage medium, and when being executed by a processor, the intelligent monitoring program implements the steps of the intelligent monitoring method as described above.

The method comprises the steps of acquiring sound source voice information acquired by all microphones at the same moment in a monitoring scene, and detecting whether the sound source voice information is sensitive information; if so, extracting sound source voiceprint characteristics corresponding to the sound source voice information; when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics, the sound source direction corresponding to the sound source voice information is determined, and the monitoring picture collected in the sound source direction is monitored, so that the monitoring visual angle can be changed in time when the monitoring object is detected to be abnormal, the monitoring object is shot, and the user experience of the monitoring equipment is further improved.

Drawings

FIG. 1 is a diagram illustrating a hardware configuration of an apparatus for implementing various embodiments of the invention;

FIG. 2 is a schematic flow chart of a first embodiment of the intelligent monitoring method of the present invention;

FIG. 3 is a diagram of an application scenario of the present invention.

The implementation, functional features and advantages of the present invention will be described with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

The invention provides an intelligent monitoring device. Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the hardware operating environment of the intelligent monitoring device according to an embodiment of the present invention.

The intelligent monitoring device of the embodiment of the invention may be a device such as a PC, a portable computer or a server.

As shown in FIG. 1, the intelligent monitoring device may include: a processor 1001 (such as a CPU), a memory 1005, a user interface 1003, a network interface 1004 and a communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001.

Optionally, the smart monitoring device may further include RF (Radio Frequency) circuits, sensors, WiFi modules, and the like.

Those skilled in the art will appreciate that the configuration of the intelligent monitoring device shown in FIG. 1 does not constitute a limitation of the intelligent monitoring device, and may include more or fewer components than shown, or some components in combination, or a different arrangement of components.

As shown in FIG. 1, the memory 1005, which is a kind of computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and an intelligent monitoring program. The operating system is a program that manages and controls the hardware and software resources of the intelligent monitoring device and supports the operation of the intelligent monitoring program and other software or programs.

The intelligent monitoring device shown in FIG. 1 may be used to implement intelligent monitoring. The user interface 1003 is mainly used to detect or output various information, such as detecting sound source voice information and outputting an alarm signal; the network interface 1004 is mainly used for communicating with a background server; and the processor 1001 may be configured to invoke the intelligent monitoring program stored in the memory 1005 and perform the following operations:

Further, the step of calculating the voice power value of each preset position point in the monitoring scene includes:

Further, after the step of monitoring the monitoring picture collected in the sound source direction, the processor 1001 is further configured to call the intelligent monitoring program stored in the memory 1005, and perform the following operations:

and if the alarm indication is detected, starting an alarm mode.

The specific implementation of the intelligent monitoring device of the present invention is substantially the same as the following embodiments of the intelligent monitoring method, and will not be described herein again.

Based on the structure, the invention provides various embodiments of the intelligent monitoring method.

The invention provides an intelligent monitoring method.

Referring to FIG. 2, FIG. 2 is a schematic flow chart of a first embodiment of the intelligent monitoring method of the present invention.

In the present embodiment, an embodiment of an intelligent monitoring method is provided, and it should be noted that although a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in an order different from that shown or described herein.

In this embodiment, the intelligent monitoring method includes:

step S10, acquiring sound source voice information collected by each microphone at the same time in a monitoring scene, and detecting whether the sound source voice information is sensitive information.

It should be noted that microphones are arranged at multiple places in the monitoring scene, so that when the monitored object makes a sound, sound source voice information can be acquired from the arranged microphones. Multiple microphones may receive the same sound source voice information.

After the sound source voice information is acquired, it is detected whether the sound source voice information is sensitive information, so as to judge whether the monitored object is in a dangerous state and, further, whether an alarm mode should be started.

Further, step S10 includes:

step a, converting the sound source voice information into text information according to a voice recognition technology.

The voice recognition technology has various uses, and in this embodiment, the voice recognition technology is used to extract semantic information in the voice information of the sound source, convert the semantic information into text information, and obtain a sensitivity index corresponding to the text information.

And b, inquiring the sensitivity index corresponding to the text information from a text information base.

The text information base stores a plurality of pieces of text information, covering common everyday expressions, emotion words, interjections and the like. A sensitivity index is allocated to each piece of text information, so that text information and sensitivity indexes are in one-to-one correspondence. For example, the sensitivity index may be expressed in levels (a first level, a second level, a third level and so on), with first-level text information having the strongest sensitivity index, followed by the second level.

And acquiring text information in the sound source voice information, and searching the sensitivity index of the text information from a text information base according to the one-to-one correspondence relationship between the text information and the sensitivity index.

And c, if the sensitivity index is greater than or equal to a preset value, determining the sound source voice information as sensitive information.

After the sensitivity index of the text information in the sound source voice information is obtained, it is judged whether the sensitivity index is greater than or equal to a preset value. If so, the sound source voice information corresponding to the text information is determined to be sensitive information, which indicates that the monitored object is in an abnormal condition; the monitoring viewing angle then needs to be changed so that the monitored object is tracked and monitored. If the sensitivity index is smaller than the preset value, the sound source voice information corresponding to the text information is not sensitive information; the device remains in its current monitoring state and the monitoring viewing angle is not changed.

The preset value is set by a user of the monitoring equipment according to the usual expression habit of the monitored object.
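Steps a to c can be sketched as a lookup followed by a threshold comparison. This is only an illustrative sketch: the table contents, the function names and the substring matching are hypothetical, the speech recognition step is assumed to have already produced the text, and a larger number is assumed to mean a more sensitive phrase so that the "greater than or equal to a preset value" test applies directly.

```python
# Hypothetical text information base: phrase -> sensitivity index.
# Assumption: a larger index means a more sensitive phrase.
SENSITIVITY_INDEX = {
    "help": 3,
    "it hurts": 3,
    "go away": 2,
    "hello": 0,
}


def sensitivity_of(text, index=SENSITIVITY_INDEX):
    """Step b: look up the highest sensitivity index of any known phrase in the text."""
    lowered = text.lower()
    return max(
        (level for phrase, level in index.items() if phrase in lowered),
        default=0,
    )


def is_sensitive(text, preset_value=2):
    """Step c: the text is sensitive when its index reaches the preset value."""
    return sensitivity_of(text) >= preset_value
```

With this table, `is_sensitive("Help, it hurts!")` is true while `is_sensitive("hello there")` is false, so only the first utterance would trigger voiceprint extraction.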

And step S20, if yes, extracting sound source voiceprint characteristics corresponding to the sound source voice information.

A voiceprint feature is an attribute of every piece of voice information, and voiceprint features differ between the voice information of different speakers. Voiceprint features have a variety of uses; in this embodiment, the voiceprint feature is used to identify whether the object that produced the sound is a monitored object specified by the user of the monitoring device.

If the sensitivity index of the sound source voice information is greater than or equal to the preset value, an abnormal condition has occurred in the monitoring scene, and it must be determined whether the abnormal condition comes from the monitored object. Therefore, in this case the sound source voiceprint feature corresponding to the sound source voice information is extracted to determine whether the object whose speech reached the preset value is a monitored object specified by the user of the monitoring device.

And step S30, when the sound source voiceprint characteristics are matched with the preset voiceprint characteristics, determining the sound source direction corresponding to the sound source voice information, and monitoring the monitoring picture collected in the sound source direction.

The preset voiceprint feature is set by a user of the monitoring device, and is the voiceprint feature of the monitored object specified by the user, and it can be understood that the voiceprint features of a plurality of monitored objects to be monitored can be set. When the sound source voiceprint feature corresponding to the obtained sound source voice information is the preset voiceprint feature, analyzing the direction of a monitored object sending the sound source voice information, namely the sound source direction, setting the monitoring direction as the sound source direction, monitoring and collecting a monitoring image in the sound source direction, and sending the monitoring image to a terminal of a user.

If the voiceprint feature of the sound source voice information is not the preset voiceprint feature, the sound source direction of the sound source voice information does not need to be analyzed, and the monitoring visual angle does not need to be changed.
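The text does not specify how a sound source voiceprint feature is compared with the preset voiceprint features. One common approach, shown here purely as an assumption, is to represent each voiceprint as a numeric feature vector and accept the best cosine similarity above a threshold. The function names, the vector representation and the 0.8 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_voiceprint(feature, presets, threshold=0.8):
    """Return the name of the best-matching preset voiceprint, or None.

    presets maps a monitored object's name to its enrolled feature vector;
    several monitored objects may be enrolled at once.
    """
    best_name, best_score = None, threshold
    for name, reference in presets.items():
        score = cosine_similarity(feature, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

A result of `None` corresponds to the case in which the sound source direction is not analyzed and the monitoring viewing angle stays unchanged.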

Further, step S30 includes:

and d, generating a camera driving instruction according to the sound source direction.

Once the sound source direction has been determined, a camera driving instruction is generated. The instruction is used to adjust and control the monitoring direction of the camera, so that the scene in any sound source direction can be monitored. The camera on the monitoring device can be driven and repositioned by a motor, and the motor is driven to adjust the camera orientation based on the camera driving instruction.

And e, adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.

The direction of the monitoring camera is controlled and adjusted based on the camera driving instruction, so that the camera can shoot a monitoring picture in the sound source direction, collect the monitoring picture, and transmit the monitoring picture to a mobile phone of a user in real time, so that the user can timely know the abnormal condition of a monitored object and judge whether to start an alarm mode.
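Steps d and e can be illustrated with a small sketch that turns a sound source direction into a pan instruction for the camera motor. The instruction format, the field names and the choice of the shortest rotation are assumptions made for illustration; the text does not define what a camera driving instruction contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CameraDriveInstruction:
    """Hypothetical instruction for the pan motor of the monitoring camera."""
    pan_delta_deg: float  # signed rotation in [-180, 180); positive = counter-clockwise
    target_deg: float     # absolute direction the camera should face, in [0, 360)


def make_drive_instruction(current_deg, sound_source_deg):
    """Step d: build the instruction that turns the camera by the shortest rotation."""
    delta = (sound_source_deg - current_deg + 180.0) % 360.0 - 180.0
    return CameraDriveInstruction(
        pan_delta_deg=delta,
        target_deg=sound_source_deg % 360.0,
    )
```

For example, a camera facing 350 degrees that must face 10 degrees is told to rotate by +20 degrees rather than by -340 degrees.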

In this embodiment, the sound source voice information collected by microphones in different directions at the same moment is acquired, and it is judged whether the sound source voice information is sensitive information. The sound source voiceprint feature is extracted only when the sound source voice information is sensitive information; otherwise no voiceprint information needs to be extracted. Setting this condition for voiceprint extraction reduces disorder in the functions of the device and avoids unnecessary waste of resources.

In this embodiment, when the sound source voiceprint feature matches the preset voiceprint feature, the sound source direction corresponding to the sound source voice information is determined, the monitoring direction of the camera is adjusted accordingly, and the monitoring picture in the sound source direction is monitored and collected. This enables tracking monitoring of the target, so that abnormal conditions of the target are grasped in time. Furthermore, compared with a monitoring mode whose monitoring range never changes, or one that changes the monitoring range according to a preset pattern, this intelligent monitoring mode greatly improves the user experience.
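The gating order of steps S10, S20 and S30 can be summarized in a short control-flow sketch. The function name, the return strings and the injected callables are hypothetical stand-ins for the components described above; the point is only that voiceprint extraction and sound source localization run solely when the preceding check passes.

```python
def handle_voice_event(audio, *, is_sensitive, extract_voiceprint,
                       matches_preset, locate, aim_camera):
    """Run S10 -> S20 -> S30, stopping at the first check that fails."""
    # S10: voice information that is not sensitive leaves the viewing angle unchanged.
    if not is_sensitive(audio):
        return "unchanged: not sensitive"
    # S20: the voiceprint is extracted only for sensitive voice information.
    voiceprint = extract_voiceprint(audio)
    # S30: the sound source is localized only when the voiceprint matches a preset.
    if not matches_preset(voiceprint):
        return "unchanged: unknown speaker"
    aim_camera(locate(audio))
    return "tracking"
```

Passing the components in as callables keeps the sketch independent of any particular speech recognition, voiceprint or motor control library.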

Further, a second embodiment of the intelligent monitoring method of the present invention is presented. The difference between the second embodiment of the intelligent monitoring method and the first embodiment of the intelligent monitoring method is that when the voiceprint feature of the sound source matches with the preset voiceprint feature, the step of determining the sound source direction corresponding to the sound source voice information includes:

and f, calculating the voice power value of each preset position point in the monitoring scene.

The voice power value of each preset position point represents the loudness of the sound at that preset position point, and the sound source direction is then judged according to the voice power values of the preset position points.

Further, step f comprises:

and f1, acquiring the coordinate information of each microphone and the coordinate information of each preset position point.

The final purpose of acquiring the coordinate information of each microphone and the coordinate information of each preset position point is to calculate the voice power value of the voice information of the sound source.

And f2, calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.

The steps of specifically calculating the voice power value are as follows:

and f11, performing Fourier transform on the sound source voice information, and calculating the time delay difference from each preset position point to each two adjacent microphones.

And f12, calculating the generalized cross correlation from each preset position point to each two adjacent microphones according to the Fourier transform and the time delay difference from each preset position point to each two adjacent microphones.

And f13, calculating the voice power value corresponding to each preset position point according to the generalized cross correlation.

The purpose of performing fourier transform on the sound source voice information according to the sound source voice information collected by each microphone is to decompose an audio signal in the sound source voice information so as to make the audio signal easier to process. Specifically, fourier transform may be performed on each sound source voice information according to an existing method.

For any preset position point, the time delay difference from the preset position point to each two adjacent microphones can be calculated according to the coordinate information of the preset position point and the coordinate information of each microphone. Specifically, the time delay difference $\tau_{mkl}$ from any preset position point m to any two adjacent microphones k and l can be calculated according to the following formula:

$$\tau_{mkl} = \frac{D_{mk} - D_{ml}}{c}$$

where $D_{mk}$ is the distance from the preset position point m to the microphone k, $D_{ml}$ is the distance from the preset position point m to the microphone l, and c is the sound velocity, c = 340 m/s.

When the coordinate information of the preset position point m is $(X_m, Y_m)$, the coordinate information of the microphone k is $(X_k, Y_k)$ and the coordinate information of the microphone l is $(X_l, Y_l)$, $D_{mk}$ and $D_{ml}$ are respectively:

$$D_{mk} = \sqrt{(X_m - X_k)^2 + (Y_m - Y_k)^2}, \qquad D_{ml} = \sqrt{(X_m - X_l)^2 + (Y_m - Y_l)^2}$$

Then, the generalized cross-correlation from the preset position point to each two adjacent microphones is calculated according to the Fourier transform of each sound source voice information and the time delay difference from the preset position point to each two adjacent microphones. Specifically, the formula for calculating the generalized cross-correlation $R_{kl}(\tau_{mkl})$ from the preset position point m to the adjacent microphones k and l is as follows:

$$R_{kl}(\tau_{mkl}) = \int_{-\infty}^{+\infty} \phi_{kl}(w)\, M_k(w)\, M_l^{*}(w)\, e^{jw\tau_{mkl}}\, dw$$

where $M_k(w)$ is the Fourier transform of the speech signal received by microphone k; $M_l^{*}(w)$ is the conjugate of the Fourier transform of the speech signal received by microphone l; w is the speech signal frequency; and $\phi_{kl}(w)$ is the weighting function, determined by the following formula:

$$\phi_{kl}(w) = \frac{1}{\left| M_k(w)\, M_l^{*}(w) \right|}$$

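Finally, the voice power value corresponding to the position point is calculated according to the generalized cross-correlation from the preset position point to each two adjacent microphones. Specifically, the voice power value P(m) corresponding to the preset position point m is calculated as:

$$P(m) = \sum_{k=1}^{M} \sum_{l=k+1}^{M} R_{kl}(\tau_{mkl})$$

where M is the total number of microphones and $R_{kl}(\tau_{mkl})$ is the generalized cross-correlation from the preset position point m to the microphones k and l.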
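The computation in steps f11 to f13 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes 2-D coordinates, a phase-transform (PHAT) weighting of the cross-spectrum, and a sum over all microphone pairs rather than only adjacent ones. The function names and the `sample_rate` parameter are hypothetical.

```python
import itertools

import numpy as np

SPEED_OF_SOUND = 340.0  # m/s, the value stated in the text


def angular_frequencies(n_samples, sample_rate):
    """Angular frequency of each bin of a real FFT of length n_samples."""
    return 2.0 * np.pi * np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)


def delay_difference(point, mic_k, mic_l, c=SPEED_OF_SOUND):
    """Time delay difference (D_mk - D_ml) / c for 2-D coordinates."""
    d_mk = np.hypot(point[0] - mic_k[0], point[1] - mic_k[1])
    d_ml = np.hypot(point[0] - mic_l[0], point[1] - mic_l[1])
    return (d_mk - d_ml) / c


def voice_power(point, mics, signals, sample_rate):
    """Voice power at a preset point: summed weighted cross-correlations."""
    signals = np.asarray(signals, dtype=float)
    spectra = np.fft.rfft(signals, axis=1)  # Fourier transform per microphone
    omega = angular_frequencies(signals.shape[1], sample_rate)
    power = 0.0
    for k, l in itertools.combinations(range(len(mics)), 2):
        cross = spectra[k] * np.conj(spectra[l])
        # Assumed PHAT weighting; the floor avoids division by zero.
        weight = 1.0 / np.maximum(np.abs(cross), 1e-12)
        tau = delay_difference(point, mics[k], mics[l])
        power += float(np.real(np.sum(weight * cross * np.exp(1j * omega * tau))))
    return power
```

Evaluating `voice_power` at every preset position point yields the values that are compared in step g. For clarity the sketch recomputes the Fourier transforms on each call; a practical version would compute them once per frame.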

And g, identifying the position point with the maximum voice power value, and determining the direction of the position point as the sound source direction.

The position point with the largest voice power value, namely the position point with the largest sound, is the sound source position point. The sound source location point is the location of the person speaking, and the sound at this location should be the largest compared with other locations, and therefore the speech power value corresponding to this location should also be the largest. The direction of the sound source position is the sound source direction.
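Step g amounts to an arg-max over the preset position points followed by a conversion from position to direction. The sketch below assumes 2-D coordinates and measures the direction as a bearing from a hypothetical camera position, counter-clockwise from the positive x axis; neither convention is stated in the text.

```python
import math


def sound_source_direction(powers, camera_xy=(0.0, 0.0)):
    """Pick the preset point with the largest voice power and return its bearing.

    powers maps each preset position point (x, y) to its voice power value.
    The bearing is in degrees in [0, 360), measured from the camera position.
    """
    best_point = max(powers, key=powers.get)
    dx = best_point[0] - camera_xy[0]
    dy = best_point[1] - camera_xy[1]
    bearing = math.degrees(math.atan2(dy, dx)) % 360.0
    return best_point, bearing
```

The returned bearing is the kind of value from which a camera driving instruction could be generated.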

The embodiment provides a sound source positioning method, which includes acquiring sound source voice information acquired by each microphone, acquiring coordinate information of each microphone and coordinate information of each preset position point, calculating a voice power value corresponding to each position point according to the sound source voice information acquired by each microphone, the coordinate information of each microphone and the coordinate information of each preset position point, finally identifying a position point with the largest voice power value, and determining the position point as a sound source position. In the embodiment of the invention, the microphone is arranged outside the video acquisition equipment, and the voice power value is calculated according to the coordinate information of the preset position, the microphone position information and the voice information of the sound source so as to determine the position of the sound source, further determine the direction of the position of the sound source, improve the accuracy of sound source positioning and ensure that the monitored object is in the monitoring range.

Further, a third embodiment of the intelligent monitoring method of the present invention is proposed, which is different from the first or second embodiment of the intelligent monitoring method in that after the step of monitoring the monitoring picture acquired in the sound source direction, the method further includes:

and h, sending the monitoring picture to a terminal, and detecting whether an alarm instruction exists or not.

When the voiceprint features of the sound source voice information are successfully matched with the preset voiceprint features, the monitoring angle is immediately adjusted to monitor the monitored object, and the collected monitoring picture containing the monitored object is sent to the mobile phone terminal of the user of the monitoring device, so that the user learns of the abnormal condition of the monitored object in time and decides, at the user's own discretion, whether an alarm is needed. If an alarm is needed, the user inputs an alarm indication, which is sent to the monitoring device; if no alarm is needed, no action need be taken. The user can also send a call connection indication to set up a call with the monitored object. Whether to send the alarm indication is judged by the user according to how harmful the abnormal condition is to the monitored object. The alarm indication can be generated by the user's voice input or by pressing keys on the mobile phone terminal.

And i, if the alarm indication is detected, starting an alarm mode.

If the alarm indication is detected, an alarm mode is started. The alarm mode can be an acousto-optic (sound-and-light) alarm or a voice alarm. The acousto-optic alarm stops illegal actions by frightening the offender. For the voice alarm, recorded information is sent to the monitoring device and played back on it, so as to serve as a deterrent or to guide the monitored object in a dangerous scene.

This embodiment provides an alarm mode: the monitoring picture is sent to the terminal, and the alarm mode is started according to the alarm indication from the terminal. This improves the timeliness of the alarm, so that a monitored object in an abnormal condition can be rescued in time, thereby ensuring the safety of the monitored object.
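Steps h and i can be sketched as a simple message exchange between the monitoring device and the terminal. The message dictionaries, the `type` and `mode` keys and the default of an acousto-optic alarm are hypothetical; the text only states that a picture is sent, that an alarm indication may come back, and that the alarm mode is either acousto-optic or voice.

```python
from enum import Enum


class AlarmMode(Enum):
    SOUND_LIGHT = "sound_light"  # acousto-optic alarm
    VOICE = "voice"              # plays a recording sent by the user


def send_picture(outbox, picture):
    """Step h (first half): queue the monitoring picture for the terminal."""
    outbox.append({"type": "picture", "data": picture})


def detect_alarm_indication(inbox):
    """Steps h and i: return the alarm mode requested by the terminal, or None."""
    for message in inbox:
        if message.get("type") == "alarm":
            # Assumption: an alarm indication without a mode means acousto-optic.
            return AlarmMode(message.get("mode", AlarmMode.SOUND_LIGHT.value))
    return None
```

Other terminal messages, such as a call connection indication, are ignored by `detect_alarm_indication` and would be handled separately.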

In addition, an embodiment of the present invention further provides an intelligent monitoring device, where the intelligent monitoring device includes:

and the monitoring acquisition module is used for monitoring and acquiring the monitoring picture in the sound source direction.

Further, the intelligent monitoring device further comprises:

the conversion module is used for converting the sound source voice information into text information according to a voice recognition technology;

the query module is used for querying the sensitivity index corresponding to the text information from a text information base;

and the determining module is used for determining the sound source voice information as the sensitive information if the sensitivity index is greater than or equal to a preset value.

Further, the determining module further comprises:

the calculating unit is used for calculating the voice power value of each preset position point in the monitoring scene;

the recognition unit is used for recognizing the position point with the maximum voice power value;

and the determining unit is used for determining the direction of the position point as the sound source direction.

Further, the calculation unit includes:

the acquisition subunit is used for acquiring the coordinate information of each microphone and the coordinate information of each preset position point;

and the calculating subunit is used for calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone and the coordinate information of each preset position point.

Furthermore, the calculating unit is further configured to: perform a Fourier transform on the sound source voice information, and calculate, using the speed of sound, the time delay difference from each preset position point to each pair of adjacent microphones; calculate the generalized cross-correlation for each preset position point and each pair of adjacent microphones according to the Fourier transform and the time delay difference; and calculate the voice power value corresponding to each preset position point according to the generalized cross-correlation.
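The calculation chain above (Fourier transform, time delay difference, generalized cross-correlation, voice power value, maximum position point) can be illustrated with the sketch below. Several choices are assumptions made only for the example: the PHAT weighting (the specification only says generalized cross-correlation), a speed of sound of 343 m/s, two-dimensional coordinates, a naive DFT used to avoid external libraries, and all function names.

```python
import cmath
import math
from typing import List, Sequence, Tuple

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
Point = Tuple[float, float]


def dft(x: Sequence[float]) -> List[complex]:
    """Naive O(N^2) discrete Fourier transform, kept simple for self-containment."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def gcc_phat(xa: Sequence[complex], xb: Sequence[complex],
             tau: float, fs: float) -> float:
    """PHAT-weighted generalized cross-correlation evaluated at one delay tau (s)."""
    n = len(xa)
    total = 0.0
    for k in range(1, n // 2):            # positive-frequency bins only
        cross = xa[k] * xb[k].conjugate()
        mag = abs(cross)
        if mag < 1e-9:                    # assumed floor: skip empty bins
            continue
        freq = k * fs / n
        total += (cross / mag * cmath.exp(2j * math.pi * freq * tau)).real
    return total


def voice_power(point: Point, mics: Sequence[Point],
                spectra: Sequence[Sequence[complex]], fs: float) -> float:
    """Sum the cross-correlation over each pair of adjacent microphones."""
    dists = [math.dist(point, m) for m in mics]
    power = 0.0
    for i in range(len(mics) - 1):
        tau = (dists[i] - dists[i + 1]) / SPEED_OF_SOUND  # time delay difference
        power += gcc_phat(spectra[i], spectra[i + 1], tau, fs)
    return power


def locate(points: Sequence[Point], mics: Sequence[Point],
           signals: Sequence[Sequence[float]], fs: float) -> Point:
    """Return the preset position point with the maximum voice power value."""
    spectra = [dft(s) for s in signals]
    return max(points, key=lambda p: voice_power(p, mics, spectra, fs))
```

The direction from the microphone array to the point returned by `locate` would then be taken as the sound source direction. A practical implementation would likely replace the naive DFT with an FFT and choose the magnitude floor relative to the signal level.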

Further, the monitoring acquisition module comprises:

the generating unit is used for generating a camera driving instruction according to the sound source direction;

and the adjusting unit is used for adjusting the shooting direction of the monitoring camera to the sound source direction based on the camera driving instruction so as to collect the monitoring picture in the sound source direction.
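One possible form of the camera driving instruction is a pair of pan and tilt angles pointing from the camera toward the position point identified as the sound source. The sketch below assumes a three-dimensional coordinate frame shared by the camera and the position points, with the z axis pointing up. The `CameraDriveInstruction` type and the function name are hypothetical, and the specification does not state how the instruction is encoded.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class CameraDriveInstruction:
    pan_deg: float   # rotation about the vertical axis, measured from the x axis
    tilt_deg: float  # elevation above the horizontal plane


def make_drive_instruction(camera: Vec3, target: Vec3) -> CameraDriveInstruction:
    """Compute the pan and tilt angles that aim the camera at the target point."""
    dx, dy, dz = (t - c for t, c in zip(target, camera))
    pan = math.degrees(math.atan2(dy, dx))
    tilt = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
    return CameraDriveInstruction(pan, tilt)
```

The adjusting unit would then drive the camera motors until the shooting direction matches these angles.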

Further, the intelligent monitoring device further comprises:

the sending unit is used for sending the monitoring picture to a terminal;

the detection unit is used for detecting whether an alarm instruction exists or not;

and the starting unit is used for starting an alarm mode if the alarm instruction is detected.

The implementation of the intelligent monitoring device of the present invention is basically the same as that of the above-mentioned intelligent monitoring method, and is not described herein again.

In addition, an embodiment of the present invention further provides a computer-readable storage medium, where an intelligent monitoring program is stored on the computer-readable storage medium, and the intelligent monitoring program, when executed by a processor, implements the steps of the intelligent monitoring method described above.

It should be noted that the computer readable storage medium may be provided in the intelligent monitoring apparatus.

The specific implementation of the computer-readable storage medium of the present invention is substantially the same as that of the above-mentioned embodiments of the intelligent monitoring method, and is not described herein again.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention. All modifications of equivalent structures or equivalent processes made by using the contents of the present specification and the accompanying drawings, whether applied directly or indirectly to other related technical fields, are included in the scope of the present invention.

Claims

1. An intelligent monitoring method, characterized in that the intelligent monitoring method comprises the following steps:

2. The intelligent monitoring method according to claim 1, wherein the step of detecting whether the sound source voice information is sensitive information comprises:

3. The intelligent monitoring method according to claim 1, wherein the step of determining the sound source direction corresponding to the sound source voice information comprises:

4. The intelligent monitoring method according to claim 3, wherein the step of calculating the voice power value of each preset position point in the monitoring scene comprises:

5. The intelligent monitoring method according to claim 4, wherein the step of calculating the voice power value corresponding to each preset position point according to the voice information of the sound source, the coordinate information of each microphone, and the coordinate information of each preset position point comprises:

6. The intelligent monitoring method according to claim 1, wherein the step of monitoring and acquiring the monitoring picture in the sound source direction comprises:

7. The intelligent monitoring method according to any one of claims 1-6, wherein, after the step of monitoring and acquiring the monitoring picture in the sound source direction, the method further comprises:

and if the alarm instruction is detected, starting an alarm mode.

8. An intelligent monitoring device, comprising:

9. An intelligent monitoring device, comprising a memory, a processor and an intelligent monitoring program stored on the memory and operable on the processor, the intelligent monitoring program when executed by the processor implementing the steps of the intelligent monitoring method according to any one of claims 1 to 7.

10. A readable storage medium, characterized in that the readable storage medium has stored thereon an intelligent monitoring program, which when executed by a processor implements the steps of the intelligent monitoring method according to any one of claims 1 to 7.