WO2019041314A1

WO2019041314A1 - Security protection method and apparatus, and smart speaker

Info

Publication number: WO2019041314A1
Application number: PCT/CN2017/100232
Authority: WO
Inventors: 蒋壮; 张国滔; 郑勇; 张立新; 金志军; 向勇阳; 卫特超
Original assignee: 深圳市沃特沃德股份有限公司
Priority date: 2017-09-01
Filing date: 2017-09-01
Publication date: 2019-03-07

Abstract

The present invention discloses a security protection method and apparatus, and a smart speaker. The method comprises the following steps: acquiring an ambient sound; determining whether the ambient sound is abnormal; and transmitting alarm information externally if the ambient sound is abnormal. Thereby, a target site such as a home or an office can be remotely monitored, realizing security protection of the target site, and raising the security level of the target site. Compared with video surveillance protection schemes, the remote monitoring protection scheme of embodiments of the present invention does not reveal images of the target site, thereby better ensuring user privacy. Further, remote monitoring can be realized by using existing smart home devices such as smart speakers to reduce costs, while concealing such devices renders them difficult to be found and damaged by intruders, and prevents them from suffering lens occlusion, illumination, and other issues, thus greatly improving stability and effectiveness of security protection.

Description

Security protection method and apparatus, and smart speaker

[0001] The present invention relates to the field of security technologies, and in particular, to a security protection method, device, and smart speaker.

[0002] In order to prevent outsiders from secretly entering an important target location, it is necessary to provide security protection to the target site.

Current security protection schemes mainly install a camera at the target site and perform video surveillance on it to achieve security protection.

Technical problem

[0003] However, if the images transmitted by the camera are obtained by a third party, the user's privacy is exposed, so the camera is itself a security risk. At the same time, a camera is generally conspicuous and easily found, so it is easily destroyed by an intruder. Problems such as lens occlusion and illumination also degrade the monitoring effect. All of these factors reduce the stability and effectiveness of security protection.

Problem solution

Technical solution

[0004] A primary object of the present invention is to provide a security protection method, apparatus and smart speaker for improving the stability and effectiveness of security protection.

[0005] In order to achieve the above objective, an embodiment of the present invention provides a security protection method, where the method includes the following steps.

[0006] collecting ambient sounds;

[0007] determining whether the ambient sound is abnormal;

[0008] when the ambient sound is abnormal, sending alarm information outward.

[0009] Optionally, the determining whether the ambient sound is abnormal includes:

[0010] determining whether the volume of the ambient sound is greater than or equal to a threshold;

[0011] when the volume of the ambient sound is greater than or equal to the threshold, determining that the ambient sound is abnormal.

[0012] Optionally, the determining whether the ambient sound is abnormal includes:

[0013] determining whether the volume of the ambient sound is greater than or equal to a threshold;

[0014] when the volume of the ambient sound is greater than or equal to the threshold, determining whether the volumes of N consecutive sampling points of the ambient sound are all greater than or equal to a preset value, wherein N is greater than or equal to 2;

[0015] If yes, it is determined that the ambient sound is abnormal.

[0016] Optionally, after the step of determining whether the ambient sound is abnormal, the method further includes:

[0017] when the ambient sound is abnormal, detecting whether voice information is included in the ambient sound;

[0018] when the ambient sound includes voice information, sending the voice information outward.

[0019] Optionally, the detecting whether the voice information is included in the ambient sound comprises:

[0020] performing time-domain and frequency-domain feature analysis on the ambient sound by using a voice activity detection algorithm to determine whether voice information is included in the ambient sound.

[0021] Optionally, the sending the voice information outward comprises: sending the voice information to a user terminal by using an audio-video peer-to-peer network transmission technology.

[0022] Optionally, there are at least two thresholds, and different time periods correspond to different thresholds.

[0023] Optionally, there are at least two preset values, and different time periods correspond to different preset values.

[0024] Optionally, the determining whether the ambient sound is abnormal includes:

[0025] determining whether voice information is included in the ambient sound;

[0026] when the ambient sound includes voice information, determining that the ambient sound is abnormal.

[0027] Optionally, after the step of determining whether the ambient sound is abnormal, the method further includes:

[0028] When the ambient sound is abnormal, the voice information is sent out.

[0029] Optionally, before the step of collecting the ambient sound, the method further includes:

[0030] when a monitoring instruction is received, starting the listening mode and proceeding to the next step of collecting the ambient sound.

[0031] Embodiments of the present invention also provide a security protection device, and the device includes:

[0032] a sound collection module, configured to collect an ambient sound;

[0033] the abnormality determining module is configured to determine whether the ambient sound is abnormal;

[0034] The abnormality alarm module is configured to send an alarm message outward when the ambient sound is abnormal.

[0035] Optionally, the abnormality determining module includes:

[0036] a first judging unit, configured to determine whether the volume of the ambient sound is greater than or equal to a threshold;

[0037] a first determining unit, configured to determine that the ambient sound is abnormal when the volume of the ambient sound is greater than or equal to the threshold.

[0038] Optionally, the abnormality determining module includes:

[0039] a first judging unit, configured to determine whether the volume of the ambient sound is greater than or equal to a threshold;

[0040] a second judging unit, configured to: when the volume of the ambient sound is greater than or equal to the threshold, determine whether the volumes of N consecutive sampling points of the ambient sound are all greater than or equal to a preset value, where N is greater than or equal to 2;

[0041] a second determining unit, configured to determine that the ambient sound is abnormal when the volumes of N consecutive sampling points of the ambient sound are all greater than or equal to the preset value.

[0042] Optionally, the device further includes:

[0043] a voice detection module, configured to detect whether the ambient sound includes voice information when the ambient sound is abnormal;

[0044] a voice sending module, configured to send the voice information outward when the ambient sound includes voice information.

[0045] Optionally, the voice detection module is configured to: perform time-domain and frequency-domain feature analysis on the ambient sound by using a voice activity detection algorithm, and determine whether voice information is included in the ambient sound.

[0046] Optionally, the voice sending module is configured to: send the voice information to the user terminal by using an audio-video peer-to-peer network transmission technology.

[0047] Optionally, the abnormality determining module includes:

[0048] a third judging unit, configured to determine whether voice information is included in the ambient sound;

[0049] a third determining unit, configured to determine that the ambient sound is abnormal when the ambient sound includes voice information.

[0050] Optionally, the device further includes a voice sending module, configured to: when the ambient sound is abnormal, send the voice information outward.

[0051] Optionally, the device further includes a monitoring startup module, configured to: when a monitoring instruction is received, start the listening mode to trigger the sound collection module to collect the ambient sound.

[0052] Embodiments of the present invention also provide a smart speaker that includes a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, the application being configured to implement the aforementioned security protection method.

Advantageous effects of the invention

Beneficial effect

[0053] The security protection method provided by an embodiment of the present invention monitors the ambient sound and sends alarm information when the ambient sound is abnormal, thereby realizing remote monitoring of a target site such as a home or an office, realizing security protection of the target site, and raising its security level. Compared with video surveillance protection schemes, the remote monitoring protection scheme of the embodiment of the present invention does not expose images of the target site, so user privacy is better protected. At the same time, remote monitoring can be realized with existing smart home devices such as smart speakers, which is low in cost and well concealed: the device is not easily discovered and destroyed by an intruder, and the monitoring effect is not affected by problems such as lens occlusion and illumination, thereby greatly improving the stability and effectiveness of security protection.

Brief description of the drawing

DRAWINGS

[0054] FIG. 1 is a flow chart of a first embodiment of the security protection method of the present invention;

[0055] FIG. 2 is a flow chart of a second embodiment of the security protection method of the present invention;

[0056] FIG. 3 is a schematic block diagram of a first embodiment of the security protection device of the present invention;

[0057] FIG. 4 is a block diagram of the abnormality determining module of FIG. 3;

[0058] FIG. 5 is another block diagram of the abnormality determining module of FIG. 3;

[0059] FIG. 6 is a schematic block diagram of a second embodiment of the security protection device of the present invention;

[0060] FIG. 7 is a schematic block diagram of a third embodiment of the security protection device of the present invention;

[0061] FIG. 8 is a block diagram of the abnormality determining module of FIG. 7.

[0062] The implementation, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings.

BEST MODE FOR CARRYING OUT THE INVENTION

[0063] The specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

[0064] The embodiments of the present invention are described in detail below, and examples of the embodiments are illustrated in the drawings, wherein the same or similar reference numerals refer to the same or similar elements or to elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are not to be construed as limiting the invention.

[0065] The singular forms "a", "an", and "the" used herein may also include the plural forms. It will be further understood that the word "comprising", as used in the specification of the present invention, refers to the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or an intermediate element can be present. Further, "connected" or "coupled" as used herein may include either a wireless connection or a wireless coupling. The phrase "and/or" used herein includes any and all combinations of one or more of the associated listed items.

[0066] Those skilled in the art will appreciate that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs, unless otherwise defined. It should also be understood that terms such as those defined in a general dictionary should be understood to have a meaning consistent with their meaning in the context of the prior art, and will not be interpreted in an idealized or overly formal sense unless specifically defined as such herein.

[0067] Those skilled in the art can understand that the "terminal" and "terminal device" used herein include both devices having only a wireless signal receiver without transmitting capability, and devices having receiving and transmitting hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device with a single-line display or a multi-line display, or a cellular or other communication device without a multi-line display; a PCS (Personal Communications Service), which may combine voice, data processing, fax and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, Internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; and a conventional laptop and/or palmtop computer or other device that has and/or includes a radio frequency receiver. As used herein, a "terminal" or "terminal device" may be portable, transportable, installed in a vehicle (aviation, sea and/or land), or adapted and/or configured to operate locally and/or to run in a distributed fashion at any other location on Earth and/or in space. The "terminal" and "terminal device" used herein may also be a communication terminal, an Internet terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with a music/video playback function, and may also be a device such as a smart TV or a set-top box.

[0068] Those skilled in the art can understand that the server used herein includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers. Here, the cloud consists of a large number of computers or network servers based on cloud computing, which is a kind of distributed computing: a super virtual computer composed of a group of loosely coupled computers. In the embodiment of the present invention, communication between the server, the terminal device, and the WNS server may be implemented by any communication means, including but not limited to mobile communication based on 3GPP, LTE, or WiMAX, computer network communication based on the TCP/IP and UDP protocols, and short-range wireless transmission based on Bluetooth and infrared transmission standards.

[0069] The security protection method and the security protection device of the embodiments of the present invention are mainly applied to smart home devices such as smart speakers and smart televisions, and can also be applied to other terminal devices, which is not limited by the present invention. The following is a detailed description of the application to the smart speaker.

[0070] Referring to FIG. 1, a first embodiment of a security protection method according to the present invention is provided. The method includes the following steps:

[0071] S11. Collect ambient sound.

[0072] In the embodiment of the invention, after the smart speaker starts the listening mode, it collects the ambient sound of the target site through the microphone at a certain sampling frequency. The sampling frequency can be set according to actual needs, for example 16 kHz or higher. The smart speaker preferably includes a plurality of microphones that constitute a microphone array. For example, the smart speaker collects sound signals from the environment through the pickups of the microphone array and transmits them through the audio interface to the digital signal processor (DSP), which samples and quantizes the audio signals.

[0073] Optionally, the user can remotely control the smart speaker to enable the monitoring function. For example, before step S11, the user sends a monitoring instruction to the smart speaker through a user terminal such as a mobile phone, a tablet, or a personal computer. After receiving the monitoring instruction, the smart speaker starts the listening mode and proceeds to step S11 to start collecting ambient sound.

[0074] Optionally, the user can manually activate the monitoring function of the smart speaker when leaving the home, and when the monitoring function is turned on, the smart speaker starts the listening mode.

[0075] Optionally, the user can set a monitoring time period. When the monitoring time period begins, the smart speaker automatically starts the listening mode; when the monitoring time period ends, the smart speaker automatically turns off the listening mode.
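As an illustration of the monitoring time period in [0075], the following minimal sketch checks whether the current time falls inside a user-configured period. The function name `in_monitoring_period` and the handling of periods that cross midnight are assumptions made for this example; the specification does not define them.

```python
from datetime import time


def in_monitoring_period(now: time, start: time, end: time) -> bool:
    """Return True if `now` lies in the half-open period [start, end).

    Hypothetical helper: a period whose start is later than its end is
    treated as wrapping past midnight (for example 22:00 to 06:00).
    """
    if start <= end:
        return start <= now < end
    # Period wraps past midnight.
    return now >= start or now < end


# The listening mode would be on while this returns True and off otherwise.
listening = in_monitoring_period(time(23, 0), time(22, 0), time(6, 0))
```

A device could evaluate this check periodically and switch the listening mode on or off when the result changes.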

[0076] S12. Determine whether the ambient sound is abnormal. When the ambient sound is abnormal, proceed to step S13, otherwise continue to monitor the ambient sound.

[0077] In step S12, the smart speaker analyzes and processes the sampled ambient sound to determine whether the ambient sound is abnormal.

[0078] In some embodiments, the smart speaker determines whether the volume of the ambient sound is greater than or equal to a threshold, and determines that the ambient sound is abnormal when the volume of the ambient sound is greater than or equal to the threshold.

[0079] Specifically, the smart speaker can average the volumes of the sampling points of the ambient sound collected by the microphone array (for example, as an arithmetic mean) and then compare the average value with a preset threshold. When the average value is greater than or equal to the threshold, the volume of the ambient sound is greater than or equal to the threshold, and the ambient sound is determined to be abnormal.
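The averaging check in [0079] can be sketched as follows. This is only an illustration: the function name `is_abnormal_by_volume`, the use of decibel values as the volume unit, and the treatment of an empty sample list as normal are assumptions not stated in the specification.

```python
from statistics import fmean


def is_abnormal_by_volume(sample_volumes, threshold):
    """Compare the arithmetic mean of the sampled volumes with a threshold.

    `sample_volumes` is assumed to hold one volume value per sampling
    point (for example in dB). An empty list is treated as normal.
    """
    if not sample_volumes:
        return False
    return fmean(sample_volumes) >= threshold
```

For instance, samples of 50, 60, and 70 against a threshold of 60 have a mean of exactly 60, so the sound would be judged abnormal because the comparison is "greater than or equal to".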

[0080] In other embodiments, the smart speaker first determines whether the volume of the ambient sound is greater than or equal to a threshold. When it is, the smart speaker determines whether the volumes of N (N ≥ 2) consecutive sampling points of the ambient sound are all greater than or equal to a preset value. If so, the ambient sound is determined to be abnormal; otherwise, the ambient sound is determined to be normal. This makes the judgment more accurate and prevents misjudgment.

[0081] Specifically, the smart speaker may first average the volumes of the sampling points of the ambient sound collected by the microphone array (for example, as an arithmetic mean) and then compare the average value with a preset threshold. If the average value is greater than or equal to the threshold, the volume of the ambient sound is greater than or equal to the threshold, and the smart speaker then determines whether the volumes of N consecutive sampling points of the ambient sound are all greater than or equal to the preset value. If so, the ambient sound is determined to be abnormal; otherwise, it is determined to be normal. N can be set according to actual needs, for example within the range of 5 to 10.
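The two-stage judgment in [0080] and [0081] can be sketched as below. The function names and the default of N = 5 are illustrative assumptions (the text only suggests a range of 5 to 10); the logic follows the description: an average-volume check first, then a search for N consecutive sampling points at or above the preset value.

```python
def has_consecutive_loud_samples(volumes, preset_value, n):
    """Return True if at least n consecutive volumes are >= preset_value."""
    if n < 2:
        raise ValueError("N must be greater than or equal to 2")
    run = 0  # length of the current run of loud sampling points
    for volume in volumes:
        run = run + 1 if volume >= preset_value else 0
        if run >= n:
            return True
    return False


def is_abnormal(volumes, threshold, preset_value, n=5):
    """Stage 1: mean volume >= threshold. Stage 2: N consecutive loud points."""
    if not volumes:
        return False
    if sum(volumes) / len(volumes) < threshold:
        return False
    return has_consecutive_loud_samples(volumes, preset_value, n)
```

A single loud click raises the mean but rarely produces a long run of loud sampling points, which is one way to read the claim that the second stage prevents misjudgment.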

[0082] There may be one threshold, or at least two, with each time period corresponding to one threshold. Similarly, there may be one preset value, or at least two, with each time period corresponding to one preset value. The threshold and the preset value may be set according to volume statistics of the ambient sound in daily life: the normal volume value of the ambient sound in daily life is measured, and then either the normal volume value itself or the normal volume value plus a buffer value is used as the threshold or the preset value. The threshold and the preset value may or may not be equal.

[0083] For example, the smart speaker takes the arithmetic mean of the volume of the ambient sound collected by the microphone array during the daytime period (e.g., 10:00-23:59) as the normal volume value of the daytime period, and the arithmetic mean of the volume of the ambient sound collected during the nighttime period (e.g., 0:00-9:59) as the normal volume value of the nighttime period. Then, the same or different buffer values are added to the normal volume value of the daytime period to obtain the threshold and the preset value of the daytime period, and the same or different buffer values are added to the normal volume value of the nighttime period to obtain the threshold and the preset value of the nighttime period. The normal volume value of the daytime period is generally in the range of 40 dB to 50 dB, and that of the nighttime period is generally in the range of 30 dB to 40 dB. Further, the aforementioned normal volume values may be periodically updated to adapt to the current indoor noise environment.
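The per-period calibration in [0082] and [0083] might look like the sketch below. The period boundary (10:00) comes from the example in the text; the function names, the data layout of `history`, and the use of one pair of buffer values for both periods are simplifying assumptions, since the text allows the buffers to differ.

```python
from datetime import time
from statistics import fmean

DAY_START = time(10, 0)  # example boundary from the text: daytime is 10:00-23:59


def period_of(t: time) -> str:
    """Map a time of day to the 'day' or 'night' period of the example."""
    return "day" if t >= DAY_START else "night"


def build_thresholds(history, threshold_buffer, preset_buffer):
    """Derive (threshold, preset_value) per period from past measurements.

    `history` is a list of (time_of_day, volume) pairs. The normal volume
    of a period is the arithmetic mean of its volumes; a buffer value is
    added to obtain the threshold and the preset value.
    """
    by_period = {"day": [], "night": []}
    for t, volume in history:
        by_period[period_of(t)].append(volume)
    result = {}
    for period, volumes in by_period.items():
        if volumes:  # skip periods with no measurements yet
            normal = fmean(volumes)
            result[period] = (normal + threshold_buffer, normal + preset_buffer)
    return result
```

Re-running `build_thresholds` on recent measurements would be one way to realize the periodic update of the normal volume values mentioned in [0083].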

[0084] S13. Send an alarm message to the outside.

[0085] In the embodiment of the present invention, when the ambient sound is abnormal, the smart speaker immediately sends alarm information to the server, and after receiving the alarm information, the server immediately pushes it to the designated user terminal. Preferably, the smart speaker can also send the alarm information directly to the user terminal. After receiving the alarm information, the user terminal may play a voice prompt and/or display graphic information to alert the user. The user terminal may be a mobile terminal such as a mobile phone or a tablet, or a computer terminal such as a personal computer or a notebook computer. The user can thus take security measures to avoid or reduce losses.

[0086] Further, the smart speaker can also sound an alarm to deter the intruder or attract the attention of nearby residents, so that appropriate safety measures can be taken.

[0087] Further, as shown in FIG. 2, in the second embodiment of the security protection method of the present invention, when it is determined in step S12 that the ambient sound is abnormal, the following steps are further included:

[0088] S14. Detect whether voice information is included in the ambient sound. When voice information is included in the ambient sound, the process proceeds to step S15; when voice information is not included in the ambient sound, the process ends.

[0089] S15. Send voice information outward.
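The control flow of steps S12 to S15 can be summarized in a short sketch. All names here are hypothetical, and the detection and sending operations are passed in as callables because the specification leaves their implementation open; the order of sending the alarm before checking for voice is also an assumption.

```python
def handle_ambient_sound(sound, is_abnormal, contains_voice, send_alarm, send_voice):
    """Run one pass of steps S12 to S15 and return the actions taken."""
    actions = []
    if not is_abnormal(sound):   # S12: normal sound, keep monitoring
        return actions
    send_alarm()                 # S13: send alarm information outward
    actions.append("alarm")
    if contains_voice(sound):    # S14: check for voice information
        send_voice(sound)        # S15: send the voice information outward
        actions.append("voice")
    return actions
```

In a device, the first callable could be a volume-based judgment and the second a voice activity detector, while the two senders would wrap the server push or peer-to-peer transmission.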

[0090] In step S14, the smart speaker preferably utilizes the linear superposition principle of voice signals and non-speech audio signals, and uses a voice activity detection (VAD) algorithm to perform time-domain and frequency-domain feature analysis on the ambient sound to determine whether the ambient sound contains voice information.

[0091] Specifically, the smart speaker processes the collected ambient sound frame by frame, with the length of each frame set according to the characteristics of the collected sound signal. The voice activity detection algorithm extracts the parameter feature value of each frame of the sound signal and compares it with a threshold. When the parameter feature value is greater than or equal to the threshold, the frame is determined to be a speech frame; when the parameter feature value is less than the threshold, the frame is determined to be noise rather than a speech frame. The main time-domain and frequency-domain features of speech are analyzed as follows:

[0092] (1) Short-time energy feature analysis. A speech frame has relatively high short-time energy, and the short-time energy feature is easy to extract and gives clear discrimination in low-noise environments. Since the background sound is additive noise relative to the speech, the short-time energy differs between when a person is speaking and when the person is not. A preprocessing step is required before extracting the short-time energy feature. The energy characteristics of a speech signal also include the fact that the energy of speech is distributed over all frequency bands and is more prominent around the pitch frequency and the first two formant frequencies. At the same time, different types of noise affect the frequency bands differently, and some have concentrated energy distributions; for example, car noise is mainly distributed in the low frequency band. In the bands where such noise is concentrated, the noise dominates and the speech signal is completely masked, while in the other frequency bands the speech signal retains its intrinsic characteristics. Dividing the full frequency band into multiple sub-bands and discriminating speech by sub-band energy features is therefore a feature with a certain degree of noise resistance. For sub-band energy extraction, the full frequency band can be divided from low frequency to high frequency into four sub-bands: 250 Hz to 2000 Hz, 2000 Hz to 4000 Hz, 4000 Hz to 6000 Hz, and 6000 Hz to 8000 Hz.

[0093] (2) Short-time zero-crossing rate feature analysis. The short-time zero-crossing rate reflects the number of times the waveform of the voice signal within one frame crosses the horizontal axis representing the zero level. For a sampled signal, a zero crossing occurs when the amplitudes of adjacent sample points change sign, and the number of sign changes within one frame is called the zero-crossing rate of the frame. In terms of speech characteristics, voiced sounds have lower frequencies and thus a lower average zero-crossing rate, about 14 per 10 ms; unvoiced sounds have higher frequencies and thus a higher average zero-crossing rate, about 47 per 10 ms. The zero-crossing rate of noise or silence lies between those of unvoiced and voiced sounds, or below that of voiced sounds. The short-time zero-crossing rate is therefore a significant feature for detecting voiced frames and is not affected by power and amplitude. Consequently, combining the short-time zero-crossing rate with the short-time energy yields a relatively effective detection algorithm.
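The two time-domain features described in (1) and (2) can be computed per frame as in the sketch below. It is a simplified illustration rather than a complete voice activity detector: the helper names are hypothetical, the frames are non-overlapping, no preprocessing is applied, and the speech decision uses only the energy threshold from [0091], because the text does not specify how the zero-crossing rate is combined with it.

```python
def split_frames(samples, frame_len):
    """Split samples into consecutive non-overlapping frames of frame_len."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def short_time_energy(frame):
    """Sum of squared amplitudes within one frame."""
    return sum(x * x for x in frame)


def zero_crossings(frame):
    """Number of sign changes between adjacent sample points in one frame.

    A sample equal to zero is treated as non-negative.
    """
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def is_speech_frame(frame, energy_threshold):
    """Energy-only decision: feature value >= threshold means a speech frame."""
    return short_time_energy(frame) >= energy_threshold
```

At a 16 kHz sampling frequency a 10 ms frame holds 160 samples, so the zero-crossing counts quoted in the text (about 14 and 47 per 10 ms) would correspond to `zero_crossings` on frames of that length.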

[0094] In step S15, when it is detected that the ambient sound includes voice information, the smart speaker extracts the voice information from the ambient sound and immediately sends it to the user terminal or the server; after receiving the voice information, the server immediately pushes it to the designated user terminal. Preferably, the smart speaker can use the audio-video peer-to-peer network transmission technology (audio and video P2P transmission technology) to send the voice information directly to the user terminal, so that the voice information is not obtained by a third party such as a server but only by the user terminal, which improves the security of the voice information. After receiving the voice information, the user terminal may prompt the user to play the voice information or play it directly. The user terminal may be a mobile terminal such as a mobile phone or a tablet, or a computer terminal such as a personal computer or a laptop. The user can thus learn more about the specific situation on site from the voice information and determine whether an intrusion has actually occurred in the home, which improves the accuracy of the judgment and prevents misjudgment.

[0095] Further, when the user determines from the voice information sent by the smart speaker that an outsider has entered the home, the user can send an alarm signal to the smart speaker through the user terminal. After receiving the alarm signal, the smart speaker sounds an alarm to deter the intruder or attract the attention of nearby residents, so that appropriate security measures can be taken.

[0096] In the foregoing embodiment, a specific application (APP) may be installed on the user terminal, and the user terminal communicates with the smart speaker through the specific application. For example, through the specific application the user terminal sends a monitoring instruction to the smart speaker, receives the alarm information and voice information sent by the smart speaker, and sends an alarm signal to the smart speaker.

[0097] In an optional embodiment, the smart speaker can also determine whether the ambient sound is abnormal by determining whether the ambient sound contains voice information and, when the ambient sound includes voice information, determining that the ambient sound is abnormal. For the method of analyzing whether voice information is included in the ambient sound, refer to the foregoing embodiment; details are not described herein again.

[0098] Further, when it is determined that the ambient sound is abnormal, the smart speaker further extracts the voice information from the ambient sound and transmits the voice information outward. For the specific sending manner, refer to the foregoing embodiment, and details are not described herein again.

[0099] In some embodiments, the foregoing manners of determining whether the ambient sound is abnormal may also be combined. For example, first determine whether the volume of the ambient sound is greater than or equal to the threshold; if so, determine whether the ambient sound contains voice information, and when the ambient sound includes voice information, determine that the ambient sound is abnormal. As another example, first determine whether the volume of the ambient sound is greater than or equal to the threshold; if so, determine whether the volumes of N consecutive sampling points of the ambient sound are all greater than or equal to the preset value; if so, determine whether the ambient sound contains voice information, and when voice information is included in the ambient sound, determine that the ambient sound is abnormal.

[0100] It can be understood by those skilled in the art that, besides the manners listed in this embodiment, other methods in the prior art may also be used to determine whether the ambient sound is abnormal; they are not described here.
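The combinations in [0099] all follow the same pattern: a sequence of checks in which each later check runs only if every earlier one passed. A minimal sketch of that pattern, with the hypothetical name `is_abnormal_cascade`, is:

```python
def is_abnormal_cascade(sound, checks):
    """Apply the checks in order; the sound is abnormal only if all pass.

    `checks` is a list of callables taking the sound and returning a bool.
    Evaluation stops at the first check that fails, so later (typically
    costlier) checks such as voice detection are skipped for quiet sound.
    """
    if not checks:
        raise ValueError("at least one check is required")
    return all(check(sound) for check in checks)
```

With this shape, the first example in [0099] would pass a volume check followed by a voice check, and the second would insert a consecutive-sampling-point check between them.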

[0101] The security protection method of the embodiment of the present invention, by monitoring the ambient sound, sends an alarm message when the ambient sound is abnormal, thereby realizing remote monitoring of a target location such as a home or an office, thereby realizing security protection to the target location. Improve the security level of the target location. Compared with the video surveillance protection scheme, the remote monitoring protection scheme of the embodiment of the present invention does not expose the image of the target location, and thus can be better protected. User privacy, peers can use the smart home devices such as existing smart speakers to achieve remote monitoring, low cost, good concealment, not easy to be destroyed by intruders, and avoid image blocking, lighting and other issues affecting the monitoring effect, so Greatly improve the stability and effectiveness of security protection.

[0102] Referring to FIG. 3, a first embodiment of the security protection device of the present invention is provided. The device includes a sound collection module 10, an abnormality determination module 20, and an abnormality alarm module 30, wherein: the sound collection module 10 is configured to collect ambient sound; the abnormality determination module 20 is configured to determine whether the ambient sound is abnormal; and the abnormality alarm module 30 is configured to send alarm information outward when the ambient sound is abnormal.
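As a rough illustration of how the three modules fit together, the sketch below wires them up as plain callables. The class name `SecurityGuard`, the `run_once` method, and the alarm text are hypothetical and chosen only for this example; the specification does not prescribe any particular software structure.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SecurityGuard:
    # Stand-in for the sound collection module 10.
    collect: Callable[[], Sequence[float]]
    # Stand-in for the abnormality determination module 20.
    is_abnormal: Callable[[Sequence[float]], bool]
    # Stand-in for the abnormality alarm module 30.
    send_alarm: Callable[[str], None]

    def run_once(self) -> bool:
        """Collect one batch of ambient sound and raise an alarm if it is abnormal."""
        sound = self.collect()
        if self.is_abnormal(sound):
            self.send_alarm("abnormal ambient sound detected")
            return True
        return False
```

For example, `SecurityGuard(lambda: [0.9, 0.8], lambda s: max(s) > 0.5, print).run_once()` would print the alarm text once and return `True`.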

[0103] In the embodiment of the present invention, after the smart speaker starts the listening mode, the sound collection module 10 collects the ambient sound of the target location through the microphone at a certain sampling frequency. The sampling frequency can be set according to actual needs, for example 16 kHz or higher. The smart speaker preferably has a plurality of microphones that form a microphone array.

[0104] Optionally, the user can remotely control the smart speaker to enable the monitoring function. For example, the security protection device further includes a monitoring startup module. The user sends a monitoring instruction to the smart speaker through a user terminal such as a mobile phone, a tablet, or a personal computer; after the smart speaker receives the monitoring instruction, the monitoring startup module starts the listening mode, which triggers the sound collection module 10 to collect the ambient sound.

[0105] Optionally, the user can manually activate the monitoring function of the smart speaker when leaving home. When the monitoring function is turned on, the smart speaker starts the listening mode via the monitoring startup module.

[0106] Optionally, the user can set a monitoring time period. When the monitoring time period begins, the smart speaker automatically starts the listening mode via the monitoring startup module, and when the monitoring time period ends, the smart speaker automatically turns off the listening mode via a monitoring shutdown module.
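A minimal sketch of the scheduled start and stop described above is given below. It assumes the user-configured monitoring window is stored as a start time and an end time within a day; the function name and the handling of windows that cross midnight are illustrative assumptions.

```python
from datetime import time


def in_monitoring_period(now: time, start: time, end: time) -> bool:
    """Return True when `now` lies inside the window [start, end)."""
    if start <= end:
        # Window within a single day, e.g. 09:00 to 18:00.
        return start <= now < end
    # Window that crosses midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end
```

A device loop could call this periodically and start the listening mode when the result changes to `True`, and turn it off when the result changes to `False`.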

[0107] The abnormality determination module 20 analyzes and processes the ambient sound sampled by the sound collection module 10 to determine whether the ambient sound is abnormal.

[0108] In some embodiments, the abnormality determination module 20 includes a first judging unit 21 and a first decision unit 22, as shown in FIG. 4, wherein: the first judging unit 21 is configured to determine whether the volume of the ambient sound is greater than or equal to the threshold; and the first decision unit 22 is configured to determine that the ambient sound is abnormal when the volume of the ambient sound is greater than or equal to the threshold.

[0109] Specifically, the first judging unit 21 obtains an average value (such as an arithmetic mean) of the sampling points of the plurality of ambient sounds collected by the microphone array, and then compares the average value with a preset threshold. When the average value is greater than or equal to the threshold, the volume of the ambient sound is considered to be greater than or equal to the threshold. When the volume of the ambient sound is greater than or equal to the threshold, the first decision unit 22 determines that the ambient sound is abnormal; otherwise it determines that the ambient sound is normal.
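The averaging and comparison step above might look like the following sketch. Taking the absolute amplitude of each sampling point as its volume is an assumption made for illustration; the specification only says that an average (such as the arithmetic mean) of the sampling points is compared with the threshold.

```python
from statistics import fmean
from typing import Sequence


def volume_exceeds_threshold(samples: Sequence[float], threshold: float) -> bool:
    """Return True when the mean absolute amplitude of the samples is >= threshold."""
    if not samples:
        return False
    # Arithmetic mean of the per-sample volume (assumed here to be |amplitude|).
    average = fmean(abs(s) for s in samples)
    return average >= threshold
```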

[0110] In other embodiments, the abnormality determination module 20 includes a first judging unit 21, a second judging unit 23, and a second decision unit 24, as shown in FIG. 5, wherein: the first judging unit 21 is configured to determine whether the volume of the ambient sound is greater than or equal to the threshold; the second judging unit 23 is configured to determine, when the volume of the ambient sound is greater than or equal to the threshold, whether the volumes of N consecutive sampling points of the ambient sound (N ≥ 2) are greater than or equal to the preset value; and the second decision unit 24 is configured to determine that the ambient sound is abnormal when the volumes of N consecutive sampling points of the ambient sound are greater than or equal to the preset value.

[0111] Specifically, the first judging unit 21 obtains an average value (such as an arithmetic mean) of the sampling points of the plurality of ambient sounds collected by the microphone array, and then compares the average value with a preset threshold. When the average value is greater than or equal to the threshold, the volume of the ambient sound is considered to be greater than or equal to the threshold. When the volume of the ambient sound is greater than or equal to the threshold, the second judging unit 23 determines whether the volumes of N consecutive sampling points of the ambient sound are greater than or equal to the preset value. If they are, the second decision unit 24 determines that the ambient sound is abnormal; otherwise it determines that the ambient sound is normal. N can be set according to actual needs, for example within the range of 5 to 10.
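The N-consecutive-points check can be sketched as a single pass over the per-point volumes. The function name and the choice to reject N below 2 (the lower bound stated in the claims) are illustrative assumptions.

```python
from typing import Iterable


def has_consecutive_loud_points(volumes: Iterable[float], preset_value: float, n: int) -> bool:
    """Return True if at least n consecutive volumes are >= preset_value."""
    if n < 2:
        raise ValueError("n must be at least 2")
    run = 0
    for v in volumes:
        # Extend the current run on a loud point, reset it on a quiet one.
        run = run + 1 if v >= preset_value else 0
        if run >= n:
            return True
    return False
```

Requiring several consecutive loud points, rather than a single one, keeps a brief spike such as a single click from being treated as abnormal.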

[0112] There may be one threshold or at least two thresholds; in the latter case, each time period corresponds to its own threshold. Similarly, there may be one preset value or at least two preset values, with each time period corresponding to its own preset value. The threshold and the preset value may be set according to volume statistics of the ambient sound in daily life: the normal volume of the ambient sound in daily life is measured, and then either the normal volume itself is used as the threshold or preset value, or a buffer value is added to the normal volume to obtain the threshold or preset value. The threshold and the preset value may or may not be equal.
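One way to picture the calibration and the per-period lookup is sketched below. The use of the mean as the "normal volume", the `(start, end, threshold)` tuple layout, and the fallback `default` value are assumptions for illustration; periods crossing midnight are not handled in this sketch.

```python
from datetime import time
from statistics import fmean
from typing import Sequence, Tuple


def calibrate_threshold(normal_volumes: Sequence[float], buffer: float = 0.0) -> float:
    """Normal volume (assumed here to be the mean of observations) plus an optional buffer."""
    return fmean(normal_volumes) + buffer


def threshold_for(now: time, periods: Sequence[Tuple[time, time, float]], default: float) -> float:
    """Pick the threshold of the first period [start, end) containing `now`."""
    for start, end, threshold in periods:
        if start <= now < end:
            return threshold
    return default
```

For instance, a daytime period could use a higher threshold than the night, since ordinary daytime background noise is usually louder.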

[0113] When the ambient sound is abnormal, the abnormality alarm module 30 immediately sends alarm information to the server, and after receiving the alarm information, the server immediately pushes it to the designated user terminal. Preferably, the abnormality alarm module 30 can also send the alarm information directly to the user terminal. After receiving the alarm information, the user terminal may emit a voice prompt and/or display graphic information to alert the user. The user terminal may be a mobile terminal such as a mobile phone or a tablet, or a computer terminal such as a personal computer or a notebook computer. The user can then take appropriate security measures to avoid or reduce losses.

[0114] Further, the abnormality alarm module 30 can also sound an alarm to deter the intruder or attract the attention of nearby residents.

[0115] Further, as shown in FIG. 6, in the second embodiment of the security protection device of the present invention, the device further includes a voice detection module 40 and a voice sending module 50, wherein: the voice detection module 40 is configured to detect, when the ambient sound is abnormal, whether the ambient sound contains voice information; and the voice sending module 50 is configured to send the voice information outward when the ambient sound contains voice information.

[0116] The smart speaker preferably exploits the linear superposition of speech signals and non-speech audio signals, and uses a voice activity detection (VAD) algorithm to analyze time-domain and frequency-domain features of the ambient sound to determine whether the ambient sound contains voice information.

[0117] Specifically, the voice detection module 40 processes the collected ambient sound frame by frame, with the length of each frame set according to the characteristics of the collected sound signal. The voice activity detection algorithm extracts a parameter feature value from each frame of the sound signal and compares it with a threshold. When the parameter feature value is greater than or equal to the threshold, the frame is determined to be a speech frame; when the parameter feature value is less than the threshold, the frame is determined to be noise rather than a speech frame.
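The frame-by-frame comparison can be sketched as follows. The specification does not say which parameter feature is extracted, so short-time energy (a time-domain feature) is used here purely as an assumed stand-in; a practical detector would typically combine it with frequency-domain features. All function names are hypothetical.

```python
from typing import List, Sequence


def split_frames(samples: Sequence[float], frame_len: int) -> List[Sequence[float]]:
    """Cut the signal into non-overlapping frames, dropping any incomplete tail."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def frame_energy(frame: Sequence[float]) -> float:
    """Short-time energy: mean of the squared sample values in the frame."""
    return sum(s * s for s in frame) / len(frame)


def speech_flags(samples: Sequence[float], frame_len: int, threshold: float) -> List[bool]:
    """True for each frame whose feature value is >= threshold (treated as speech)."""
    return [frame_energy(f) >= threshold for f in split_frames(samples, frame_len)]
```

The ambient sound could then be treated as containing voice information when, for example, at least one frame (or some minimum number of frames) is flagged as speech; that rule is likewise an assumption.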

[0118] When it is detected that the ambient sound contains voice information, the voice sending module 50 extracts the voice information from the ambient sound and immediately sends it to the user terminal or the server; in the latter case, the server immediately pushes the voice information to the designated user terminal after receiving it. Preferably, the voice sending module 50 uses audio-video peer-to-peer (P2P) network transmission technology to send the voice information directly to the user terminal, so that the voice information is obtained only by the user terminal and not by a third party such as a server, thereby improving the security of the information. After receiving the voice information, the user terminal may prompt the user to play it or may play it directly. The user terminal may be a mobile terminal such as a mobile phone or a tablet, or a computer terminal such as a personal computer or a notebook computer. The user can thus learn more about the actual situation on site from the voice information and judge whether the home has really been intruded, which improves the accuracy of the judgment and prevents misjudgment.

[0119] Further, when the user determines from the voice information sent by the smart speaker that an outsider has broken into the home, the user terminal may send an alarm signal to the smart speaker. When the abnormality alarm module 30 receives the alarm signal, it sounds an alarm to deter the intruder or attract the attention of nearby residents.

[0120] Referring to FIG. 7, a third embodiment of the security protection device of the present invention is provided. As shown in FIG. 8, the abnormality determination module 20 of this embodiment includes a third judging unit 25 and a third decision unit 26, wherein: the third judging unit 25 is configured to determine whether the ambient sound contains voice information; and the third decision unit 26 is configured to determine that the ambient sound is abnormal when the ambient sound contains voice information. The third judging unit 25 analyzes whether the ambient sound contains voice information in the same manner as the voice detection module 40 in the foregoing embodiment, and details are not described herein again.

Further, the device also includes a voice sending module 50 configured to send the voice information outward when the ambient sound is abnormal. The user can thus learn more about the actual situation on site from the voice information and judge whether the home has really been intruded, which improves the accuracy of the judgment and prevents misjudgment.

[0122] In some embodiments, the abnormality determination module 20 may also combine the manners of determining whether the ambient sound is abnormal described in the foregoing embodiments. For example, the abnormality determination module 20 first determines whether the volume of the ambient sound is greater than or equal to the threshold; if so, it determines whether the ambient sound contains voice information, and when it does, determines that the ambient sound is abnormal. As another example, the abnormality determination module 20 first determines whether the volume of the ambient sound is greater than or equal to the threshold; if so, it determines whether the volumes of N consecutive sampling points of the ambient sound are greater than or equal to the preset value; if so, it determines whether the ambient sound contains voice information, and when it does, determines that the ambient sound is abnormal.

It can be understood by those skilled in the art that, besides the manners listed in this embodiment, other manners in the prior art can also be used to determine whether the ambient sound is abnormal; these are not described again here.

[0124] The security protection device of the embodiment of the present invention monitors the ambient sound and sends alarm information when the ambient sound is abnormal, thereby realizing remote monitoring of a target location such as a home or an office, realizing security protection of the target location, and raising its security level. Compared with a video surveillance protection scheme, the remote monitoring protection scheme of the embodiment of the present invention does not expose images of the target location, and thus better protects user privacy. At the same time, remote monitoring can be realized with existing smart home devices such as smart speakers, which is low in cost, well concealed, and not easily destroyed by intruders, and which avoids problems such as lens occlusion and lighting that affect the monitoring effect, thereby greatly improving the stability and effectiveness of security protection.

[0125] The present invention also provides a smart speaker that includes a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, the application being configured to perform a security protection method. The security protection method includes the following steps: collecting ambient sound; determining whether the ambient sound is abnormal; and sending alarm information outward when the ambient sound is abnormal. The security protection method described in this embodiment is the security protection method of the foregoing embodiments of the present invention, and details are not described herein again.

Those skilled in the art will appreciate that the present invention includes apparatus for performing one or more of the operations described herein. These devices may be specially designed and manufactured for the required purposes, or may include known devices in a general-purpose computer. These devices have computer programs stored therein that are selectively activated or reconfigured. Such computer programs may be stored in a device-readable (e.g., computer-readable) medium or in any type of medium suitable for storing electronic instructions and coupled to a bus, including but not limited to any type of disk (including floppy disks, hard disks, optical discs, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, a readable medium includes any medium in which information is stored or transmitted in a form readable by a device (e.g., a computer).

[0127] Those skilled in the art will appreciate that each block of the structural diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks in these diagrams, can be implemented by computer program instructions. Those skilled in the art will also appreciate that these computer program instructions can be supplied to a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, so that the schemes specified in the block or blocks of the structural diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention are carried out by the processor of the computer or other programmable data processing apparatus.

[0128] Those skilled in the art can understand that the various operations, methods, and steps, measures, and solutions in the present invention may be alternated, changed, combined, or deleted. Further, various operations, methods, and other steps, measures, and arrangements in the process of the present invention may be alternated, changed, rearranged, decomposed, combined, or deleted. Further, the steps, measures, and solutions in the various operations, methods, and processes disclosed in the prior art may be alternated, changed, rearranged, decomposed, combined, or deleted.

The preferred embodiments of the present invention have been described above with reference to the drawings, and are not intended to limit the scope of the invention.

Those skilled in the art can implement the invention without departing from the scope and spirit of the invention. For example, features that are one embodiment may be used in another embodiment to yield yet another embodiment. Any modifications, equivalent substitutions and improvements made within the technical concept of the invention are intended to be included within the scope of the invention.

Claims

Claim

A security protection method, comprising the following steps:

collecting ambient sound;

determining whether the ambient sound is abnormal; and

sending alarm information outward when the ambient sound is abnormal.

The security protection method according to claim 1, wherein the determining whether the ambient sound is abnormal comprises:

Determining whether the volume of the ambient sound is greater than or equal to a threshold;

When the volume of the ambient sound is greater than or equal to the threshold, it is determined that the ambient sound is abnormal.

When the volume of the ambient sound is greater than or equal to the threshold, determining whether the volumes of N consecutive sampling points of the ambient sound are greater than or equal to a preset value, where N is greater than or equal to 2;

If so, it is determined that the ambient sound is abnormal.

The security protection method according to claim 3, wherein after the step of determining whether the ambient sound is abnormal, the method further comprises: detecting whether the ambient sound contains voice information when the ambient sound is abnormal; and sending the voice information outward when the ambient sound contains voice information.

The security protection method according to claim 4, wherein the detecting whether the voice information is included in the ambient sound comprises:

A voice activity detection algorithm is used to analyze time-domain and frequency-domain features of the ambient sound and determine whether the ambient sound contains voice information.

The security protection method according to claim 4, wherein the sending the voice information outward comprises: transmitting the voice information to a user terminal by using an audio-video peer-to-peer network transmission technology.

[Claim 7] The security protection method according to claim 3, wherein there are at least two thresholds, and different time periods correspond to different thresholds.

[Claim 8] The security protection method according to claim 3, wherein there are at least two preset values, and different time periods correspond to different preset values.

[Claim 9] The security protection method according to claim 1, wherein the determining whether the ambient sound is abnormal includes:

Determining whether the ambient sound contains voice information;

When the ambient sound contains voice information, it is determined that the ambient sound is abnormal.

[Claim 10] The security protection method according to claim 9, wherein after the step of determining whether the ambient sound is abnormal, the method further includes:

sending the voice information outward when the ambient sound is abnormal.

11. A security protection device, comprising:

a sound collection module, configured to collect ambient sound;

an abnormality determination module, configured to determine whether the ambient sound is abnormal; and

an abnormality alarm module, configured to send alarm information outward when the ambient sound is abnormal.

The security protection device according to claim 11, wherein the abnormality determination module comprises:

a first judging unit, configured to determine whether the volume of the ambient sound is greater than or equal to a threshold; and a first decision unit, configured to determine that the ambient sound is abnormal when the volume of the ambient sound is greater than or equal to the threshold.

The security protection device according to claim 11, wherein the abnormality determination module comprises:

a first judging unit, configured to determine whether the volume of the ambient sound is greater than or equal to a threshold; a second judging unit, configured to determine, when the volume of the ambient sound is greater than or equal to the threshold, whether the volumes of N consecutive sampling points of the ambient sound are greater than or equal to a preset value, where N is greater than or equal to 2; and a second decision unit, configured to determine that the ambient sound is abnormal when the volumes of N consecutive sampling points of the ambient sound are greater than or equal to the preset value.

The security protection device according to claim 13, wherein the device further comprises: a voice detection module, configured to detect whether the ambient sound contains voice information when the ambient sound is abnormal; and

a voice sending module, configured to send the voice information outward when the ambient sound contains voice information.

The security protection device according to claim 14, wherein the voice detection module is configured to: analyze time-domain and frequency-domain features of the ambient sound by using a voice activity detection algorithm, and determine whether the ambient sound contains voice information.

The security protection device according to claim 14, wherein the voice sending module is configured to: send the voice information to a user terminal by using an audio-video peer-to-peer network transmission technology.

The security protection device according to claim 13, wherein there are at least two thresholds, and different time periods correspond to different thresholds.

The security protection device according to claim 17, wherein there are at least two preset values, and different time periods correspond to different preset values.

The security protection device according to claim 11, wherein the abnormality determination module comprises:

a third judging unit, configured to determine whether the ambient sound contains voice information; and a third decision unit, configured to determine that the ambient sound is abnormal when the ambient sound contains voice information.

20. A smart speaker comprising a memory, a processor, and at least one application stored in the memory and configured to be executed by the processor, wherein the application is configured to execute the security protection method according to claim 1.