CN112397055A

CN112397055A - Abnormal sound detection method and device and electronic equipment

Info

Publication number: CN112397055A
Application number: CN202110068478.8A
Authority: CN
Inventors: 张量; 许振斌
Original assignee: Beijing Family Intelligent Technology Co Ltd
Current assignee: Beijing Family Intelligent Technology Co Ltd
Priority date: 2021-01-19
Filing date: 2021-01-19
Publication date: 2021-02-23
Anticipated expiration: 2041-01-19
Also published as: CN112397055B

Abstract

The invention provides an abnormal sound detection method, an abnormal sound detection device and electronic equipment.

Description

Abnormal sound detection method and device and electronic equipment

Technical Field

The invention relates to the technical field of computers, in particular to an abnormal sound detection method and device and electronic equipment.

Background

Sound is an important source of information, and many scenarios in daily production and life require collecting sound information and using it for detection and alarms. For example, in agricultural breeding, the health of poultry is monitored through their calls; in the security field, sound serves as an important means of covering the blind spots of video monitoring. A method for detecting abnormal sounds in such scenarios is therefore needed.

Disclosure of Invention

In order to solve the above problem, an object of the embodiments of the present invention is to provide an abnormal sound detection method, an abnormal sound detection apparatus, and an electronic device.

In a first aspect, an embodiment of the present invention provides an abnormal sound detection method, including:

when sound signals of the surrounding environment are collected, the abnormal sound detection equipment acquires surrounding environment information;

when it is determined, according to the acquired ambient environment information, that the abnormal sound detection equipment is in a severe environment, performing Fourier transform on the sound signal to obtain a Fourier-transformed sound signal, wherein the Fourier-transformed sound signal comprises frequency components of the sound signal; wherein the frequency components of the sound signal comprise: the frequency range of the sound signal, the amplitude of each frequency point in the frequency range, and the initial phase of each frequency point;

respectively calculating the difference value of the amplitude value between each frequency point and the adjacent frequency point of each frequency point;

when the difference values of the amplitudes between the frequency points and the adjacent frequency points of the frequency points are larger than or equal to the amplitude difference value threshold, determining the sound signals corresponding to the frequency points as narrow-band signals, and skipping to a step of converting all frequencies in the frequency range of the sound signals into Mel frequencies of the sound signals when the sound signals are determined as the narrow-band signals;

when the difference between the amplitude of a frequency point and the amplitude of the adjacent frequency point on one side of the frequency point is greater than or equal to the amplitude difference threshold, and the difference between the amplitude of the frequency point and the amplitude of the adjacent frequency point on the other side of the frequency point is less than the amplitude difference threshold, taking the adjacent frequency point on the other side of the frequency point as the frequency point to be detected;

taking the direction of the frequency point reaching the adjacent frequency point on the other side of the frequency point as a detection direction;

calculating the difference value between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected;

judging whether the difference value between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected is smaller than an amplitude difference value threshold value, if so, taking the frequency point adjacent to the frequency point to be detected in the detection direction as the frequency point to be detected, and returning to the step of calculating the difference value between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected; if not, counting the number of the frequency points to be detected;

when the number of the frequency points to be detected is less than or equal to a number threshold, determining the sound signals between the frequency points to be detected and the frequency points to be detected indicated by the number of the frequency points to be detected in the detection direction as narrow-band signals;

converting each frequency within the frequency range of the sound signal into a Mel frequency of the sound signal when the sound signal is determined to be a narrow-band signal;

inputting the Mel frequency of the sound signal into a Mel filter bank to obtain a filtering result of the sound signal in the Mel domain;

carrying out logarithmic calculation and discrete cosine transform on the filtering result of the obtained sound signal in the Mel domain to obtain the sound characteristic of the sound signal;

inputting the obtained sound characteristics of the sound signal into a trained convolutional neural network model, and detecting whether the sound signal is abnormal sound; wherein the convolutional neural network model includes: N convolutional layers, each followed by an activation function and a pooling layer, the result of convolving, activating and pooling the sound characteristic of the sound signal being called a feature map and denoted by F; and M fully-connected layers; the convolution kernel size of each convolutional layer is $K_n$, the stride is $S_n$, and the number of convolution kernels is $2n$, where $1 \le n \le N$; the number of neurons of the last fully-connected layer is 2, and the number of neurons of the other fully-connected layers is $m$, where $1 \le m \le M$; the activation function after each convolutional layer is Max-Feature-Map, expressed as:

$$a^{k}_{ij} = \max\left(F^{k}_{ij},\; F^{k+n}_{ij}\right)$$

wherein $1 \le k \le n$;

the number of feature maps F is $2n$;

the derivative of a is then:

$$\frac{\partial a^{k}_{ij}}{\partial F^{k}_{ij}} = \begin{cases} 1, & F^{k}_{ij} \ge F^{k+n}_{ij} \\ 0, & \text{otherwise} \end{cases}$$

wherein $w$ is the width of the feature map; $h$ is the height of the feature map; $i$ is the $i$-th column of the feature map, $0 \le i < w$; $j$ is the $j$-th row of the feature map, $0 \le j < h$; $1 \le k \le n$, and $\mathbb{R}$ is the real number space;

$F^{k} \in \mathbb{R}^{h \times w}$ is the $k$-th feature map of the $2n$ feature maps generated by the $n$-th convolutional layer;

$F^{k}_{ij}$ represents the feature value of the $i$-th column and the $j$-th row in the $k$-th feature map;

and $F^{k+n}_{ij}$ represents the feature value of the $i$-th column and the $j$-th row in the $(k+n)$-th feature map of the $2n$ feature maps generated by the $n$-th convolutional layer.

In a second aspect, an embodiment of the present invention further provides an abnormal sound detection apparatus, including:

the acquisition module is used for acquiring surrounding environment information when sound signals of the surrounding environment are collected;

the first processing module is used for carrying out Fourier transform on the sound signal to obtain a Fourier-transformed sound signal when the abnormal sound detection device is determined to be in a severe environment according to the obtained surrounding environment information, wherein the Fourier-transformed sound signal comprises frequency components of the sound signal; wherein the frequency components of the sound signal comprise: the frequency range of the sound signal, the amplitude of each frequency point in the frequency range, and the initial phase of each frequency point;

the calculation module is used for calculating the difference value of the amplitude between each frequency point and the adjacent frequency point of each frequency point;

the second processing module is used for determining the sound signals corresponding to the frequency points as narrow-band signals when the difference values of the amplitudes between the frequency points and the adjacent frequency points of the frequency points are greater than or equal to the amplitude difference value threshold value, and skipping to execute the function of the conversion module;

the third processing module is used for taking the adjacent frequency point on the other side of the frequency point as the frequency point to be detected when the difference value of the amplitude between the frequency point and the adjacent frequency point on one side of the frequency point is more than or equal to the amplitude difference threshold value and the difference value of the amplitude between the frequency point and the adjacent frequency point on the other side of the frequency point is less than the amplitude difference threshold value;

the fourth processing module is used for taking the direction from the frequency point to the adjacent frequency point on the other side of the frequency point as the detection direction;

the difference value calculation module is used for calculating the difference value between the amplitude value of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude value of the frequency point to be detected;

the judging module is used for judging whether the difference value between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected is smaller than an amplitude difference value threshold value or not, if so, the frequency point adjacent to the frequency point to be detected in the detection direction is used as the frequency point to be detected, and the function of the difference value calculating module is returned to be executed; if not, counting the number of the frequency points to be detected;

the determining module is used for determining the sound signals between the frequency points to be detected and the frequency points to be detected indicated by the number of the frequency points to be detected in the detection direction as narrow-band signals when the number of the frequency points to be detected is less than or equal to a number threshold;

the conversion module is used for converting each frequency within the frequency range of the sound signal into a Mel frequency of the sound signal when the sound signal is determined to be a narrow-band signal;

the filtering module is used for inputting the Mel frequency of the sound signal into a Mel filter bank to obtain a filtering result of the sound signal in the Mel domain;

the fifth processing module is used for carrying out logarithmic calculation and discrete cosine transform on the filtering result of the obtained sound signal in the Mel domain to obtain the sound characteristic of the sound signal;

the detection module is used for inputting the obtained sound characteristics of the sound signal into a trained convolutional neural network model and detecting whether the sound signal is abnormal sound; wherein the convolutional neural network model includes: N convolutional layers, each followed by an activation function and a pooling layer, the result of convolving, activating and pooling the sound characteristic of the sound signal being called a feature map and denoted by F; and M fully-connected layers; the convolution kernel size of each convolutional layer is $K_n$, the stride is $S_n$, and the number of convolution kernels is $2n$, where $1 \le n \le N$; the number of neurons of the last fully-connected layer is 2, and the number of neurons of the other fully-connected layers is $m$, where $1 \le m \le M$; the activation function after each convolutional layer is Max-Feature-Map, expressed as:

$$a^{k}_{ij} = \max\left(F^{k}_{ij},\; F^{k+n}_{ij}\right)$$

wherein $1 \le k \le n$, and $F^{k}_{ij}$ is the feature value of the $i$-th column and the $j$-th row in the $k$-th feature map;

the number of feature maps F is $2n$;

the derivative of a is then:

$$\frac{\partial a^{k}_{ij}}{\partial F^{k}_{ij}} = \begin{cases} 1, & F^{k}_{ij} \ge F^{k+n}_{ij} \\ 0, & \text{otherwise} \end{cases}$$

In a third aspect, the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the computer program performs the steps of the method in the first aspect.

In a fourth aspect, embodiments of the present invention also provide an electronic device, which includes a memory, a processor, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor to perform the steps of the method according to the first aspect.

In the solutions provided in the first to fourth aspects of the embodiments of the present invention, when the abnormal sound detection device is determined to be in a severe environment according to the acquired ambient environment information, it is first determined whether the acquired sound signal is a wideband signal or a narrow-band signal. If the sound signal is a narrow-band signal, the sound features of the sound signal are extracted and input into a trained convolutional neural network model to detect whether the sound signal is an abnormal sound. Compared with abnormal sound detection in the related art, the detection mode can be adjusted according to the ambient environment information, which improves the efficiency of abnormal sound detection under different environmental conditions. Determining whether the acquired sound signal is wideband or narrow-band before detection distinguishes environmental sound from sound that is emitted by an object entering the environment and needs to be detected. The abnormal sound detection stage is entered only when the sound signal is determined not to be environmental sound, which improves detection efficiency and helps prevent false detections.

In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a flowchart illustrating an abnormal sound detection method according to embodiment 1 of the present invention;

Fig. 2a is a first schematic diagram showing the correspondence between frequency points and amplitudes of a sound signal in the abnormal sound detection method provided in embodiment 1 of the present invention;

Fig. 2b is a second schematic diagram showing the correspondence between frequency points and amplitudes of a sound signal in the abnormal sound detection method provided in embodiment 1 of the present invention;

Fig. 2c is a third schematic diagram showing the correspondence between frequency points and amplitudes of a sound signal in the abnormal sound detection method provided in embodiment 1 of the present invention;

Fig. 3 is a schematic structural diagram illustrating an abnormal sound detection apparatus according to embodiment 2 of the present invention;

Fig. 4 is a schematic structural diagram of an electronic device provided in embodiment 3 of the present invention.

Detailed Description

In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.

In the present invention, unless otherwise expressly specified or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.

Based on this, embodiments of the present application provide an abnormal sound detection method, an abnormal sound detection device, and an electronic device, when determining that the abnormal sound detection device is in a harsh environment according to the obtained ambient environment information, first determine whether the obtained sound signal is a wideband signal or a narrowband signal, if it is determined that the sound signal is a narrowband signal, continue to extract sound features of the sound signal, then input the extracted sound features into a trained convolutional neural network model, detect whether the sound signal is an abnormal sound, and adjust a mode of detecting the abnormal sound in combination with the ambient environment information, thereby greatly improving efficiency of abnormal sound detection under different environmental conditions.

In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.

Example 1

Referring to a flowchart of an abnormal sound detection method shown in fig. 1, the present embodiment provides an abnormal sound detection method, including the following specific steps:

step 100, when sound signals of the surrounding environment are collected, the abnormal sound detection device acquires surrounding environment information.

In step 100, the abnormal sound detection device is a device having a processor and is configured to detect abnormal sounds, such as cries for help and cursing, occurring around the device.

The abnormal sound detection device is networked with other abnormal sound detection devices to form a blockchain system for storing the determined abnormal sounds. The abnormal sound detection device is registered on the blockchain system and becomes an ordinary node in the blockchain system.

The ambient environment information includes, but is not limited to: weather information, location information of the abnormal sound detection device, time information, and geographical location information of the abnormal sound detection device.

The weather information is acquired from a weather information website by the abnormal sound detection equipment according to the geographical position information and the time information of the abnormal sound detection equipment.

The location information of the abnormal sound detection device itself may be a surrounding environment image collected by an image collection device connected to and disposed together with the abnormal sound detection device, that is, the surrounding environment image collected by the image collection device connected to and disposed together with the abnormal sound detection device is used as the location information of the abnormal sound detection device itself.

The time information is the current system time indicated by the system clock of the abnormal sound detection device itself.

The geographical position information of the abnormal sound detection device is preset in the abnormal sound detection device and is used for indicating the position and the area of the abnormal sound detection device.

After the ambient environment information is acquired, whether the abnormal sound detection device is in a severe environment or not can be determined according to the ambient environment information.

The severe environment may be: an environment under weather conditions such as a sandstorm or rainstorm, as indicated by the weather information; a late-night environment in the period from 10 p.m. to 5 a.m., as indicated by the time information; or a densely crowded, noisy area, as indicated by the information on the place where the abnormal sound detection device is located and its geographical position information.
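The three example conditions above can be combined into a single check. The following is a minimal sketch only: the `Surroundings` record, the weather labels, the `crowded_area` flag and the reading of the night period as 22:00 to 05:00 are all assumptions made for illustration, and the source does not prescribe how the determination is implemented.

```python
from dataclasses import dataclass
from datetime import time

# Hypothetical labels; the source only names sandstorm and rainstorm as examples.
SEVERE_WEATHER = {"sandstorm", "rainstorm"}


@dataclass
class Surroundings:
    weather: str         # e.g. looked up using the device's position and time
    local_time: time     # current system time of the device
    crowded_area: bool   # assumed to be derived from place and position information


def is_severe_environment(s: Surroundings) -> bool:
    """Return True if any one of the three example conditions holds."""
    # Assumed night window: 22:00 up to, but not including, 05:00.
    late_night = s.local_time >= time(22, 0) or s.local_time < time(5, 0)
    return s.weather in SEVERE_WEATHER or late_night or s.crowded_area
```

Under these assumptions, a device reporting clear weather at noon in a quiet area would fall through to whatever detection algorithm is used in a non-severe environment.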

In this embodiment, the abnormal sound detection algorithm of the abnormal sound detection apparatus in the non-severe environment is not discussed; in the non-harsh environment, the abnormal sound detection device may use any sound detection algorithm in the prior art to detect abnormal sounds in the surrounding environment, and the specific process is not described herein again.

Before step 102 is executed, it must be determined from the ambient environment information whether the abnormal sound detection device is in a severe environment. Given the description of the severe environment above, the specific determination process is prior art and is not described in detail here.

When it is determined that the abnormal sound detection apparatus is in a severe environment based on the surrounding environment information, the following step 102 is continuously performed.

Step 102, when it is determined according to the acquired ambient environment information that the abnormal sound detection device is in a severe environment, performing Fourier transform on the sound signal to obtain a Fourier-transformed sound signal, wherein the Fourier-transformed sound signal comprises frequency components of the sound signal.

Wherein the frequency components of the sound signal comprise: the frequency range of the sound signal, the amplitude of the frequency points in the frequency range and the initial phase of each frequency point.

The specific process of performing fourier transform on the sound signal to obtain the frequency component of the sound signal is the prior art, and is not described herein again.
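As an illustration of the frequency components named in step 102, the following sketch evaluates a discrete Fourier transform directly and returns the frequency, amplitude and initial phase of each frequency point. It is a deliberately simple, unoptimized example; the function name `dft_components`, the amplitude scaling and the use of a direct sum instead of a fast transform are assumptions, not details taken from the source.

```python
import cmath
import math


def dft_components(samples, sample_rate):
    """Return (frequency_hz, amplitude, initial_phase) for each non-negative frequency bin."""
    n = len(samples)
    components = []
    for k in range(n // 2 + 1):
        # Direct evaluation of the discrete Fourier transform at bin k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        # Scale so a unit-amplitude sinusoid gives amplitude 1;
        # the DC bin and (for even n) the Nyquist bin are not doubled.
        is_edge_bin = k == 0 or (n % 2 == 0 and k == n // 2)
        scale = 1.0 / n if is_edge_bin else 2.0 / n
        components.append((k * sample_rate / n, abs(coeff) * scale, cmath.phase(coeff)))
    return components
```

The frequency range of the signal is then the span from the first to the last returned frequency, and the amplitude sequence is what the later steps compare point by point.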

Step 104, calculating, for each frequency point, the difference in amplitude between the frequency point and each of its adjacent frequency points.

Step 106, when the differences in amplitude between a frequency point and its adjacent frequency points are all greater than or equal to the amplitude difference threshold, determining the sound signal corresponding to the frequency point to be a narrow-band signal, and jumping to step 122.

In step 106, the amplitude difference threshold is preset in the abnormal sound detection device.

In one embodiment, referring to the schematic diagrams of the correspondence between frequency points and amplitudes of the sound signal shown in Figs. 2a to 2c, the frequency points are examined in order from left to right to determine whether each belongs to a narrow-band signal. Step 106 above corresponds to the case shown in Fig. 2a, in which the sound signal corresponding to a single frequency point is determined to be a narrow-band signal.

Step 108, when the difference in amplitude between a frequency point and the adjacent frequency point on one side of the frequency point is greater than or equal to the amplitude difference threshold, and the difference in amplitude between the frequency point and the adjacent frequency point on the other side of the frequency point is less than the amplitude difference threshold, taking the adjacent frequency point on the other side of the frequency point as the frequency point to be detected.

Step 110, taking the direction from the frequency point to the adjacent frequency point on the other side of the frequency point as the detection direction.

Step 112, calculating the difference between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected.

Step 114, judging whether the difference between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected is less than the amplitude difference threshold; if so, executing step 116; if not, executing step 118.

Step 116, taking the frequency point adjacent to the frequency point to be detected in the detection direction as the new frequency point to be detected, and returning to step 112.

Step 118, counting the number of frequency points to be detected.

Step 120, when the number of frequency points to be detected is less than or equal to a number threshold, determining, as a narrow-band signal, the sound signal spanning the counted frequency points to be detected in the detection direction.

In the above step 120, the number threshold is set in advance in the abnormal sound detecting apparatus.

Steps 108 to 120 above correspond to the case shown in Fig. 2b, in which the sound signal spanning the counted frequency points to be detected in the detection direction is determined to be a narrow-band signal.
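The scan described in steps 104 to 120 can be sketched as follows. This is one possible reading rather than the patented procedure itself: it scans left to right only, treats the edge test as a signed rise (the point minus its left neighbour), treats plateau continuity as an absolute difference, and counts every frequency point that becomes "to be detected" during the walk. The function name and the returned index pairs are hypothetical.

```python
def find_narrow_bands(amplitudes, diff_threshold, count_threshold):
    """Return inclusive (start, end) index pairs of runs judged to be narrow-band."""
    bands = []
    last = len(amplitudes) - 1
    i = 1
    while i < last:
        rise = amplitudes[i] - amplitudes[i - 1]
        fall = amplitudes[i] - amplitudes[i + 1]
        if rise >= diff_threshold and fall >= diff_threshold:
            # Step 106: the point stands out from both neighbours (Fig. 2a case).
            bands.append((i, i))
        elif rise >= diff_threshold and abs(fall) < diff_threshold:
            # Steps 108-110: sharp edge on the left, so the right neighbour
            # becomes the point to be detected and the walk goes rightwards.
            current, visited = i + 1, 1
            # Steps 112-116: advance while the next point stays within the threshold.
            while (current < last and
                   abs(amplitudes[current + 1] - amplitudes[current]) < diff_threshold):
                current += 1
                visited += 1
            # Steps 118-120: only a short run counts as narrow-band (Fig. 2b case).
            if visited <= count_threshold:
                bands.append((i, current))
            i = current  # skip the points already walked
        i += 1
    return bands
```

With this reading, an empty result means no frequency point differs sharply from its neighbours or the only plateaus found are too wide, which matches the case in which the signal is treated as wideband environmental sound and no further detection is run.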

Step 122, when the sound signal is determined to be a narrow-band signal, converting each frequency in the frequency range of the sound signal into a Mel frequency of the sound signal.

Step 124, inputting the Mel frequency of the sound signal into a Mel filter bank to obtain a filtering result of the sound signal in the Mel domain.

Step 126, performing logarithm calculation and discrete cosine transform on the obtained filtering result of the sound signal in the Mel domain to obtain the sound characteristic of the sound signal.

The process of obtaining the sound characteristics of the sound signal described in the above steps 122 to 126 is similar to the process of extracting Mel-Frequency Cepstral Coefficients (MFCCs) of the sound signal in the prior art, and is not repeated here.
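For readers unfamiliar with MFCC-style extraction, steps 122 to 126 can be sketched in a few lines. The Hz-to-Mel formula used here (2595·log10(1 + f/700)) is one common convention and is an assumption, as are the triangular filter shape, the natural logarithm, the DCT-II form, the default parameter values, the function names and the requirement that `freqs` start at 0 Hz; the source only states that the process is similar to prior-art MFCC extraction.

```python
import math


def hz_to_mel(f):
    # Step 122: a commonly used Hz-to-Mel mapping (assumed convention).
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filter_energies(freqs, power, num_filters):
    """Step 124: apply triangular filters spaced evenly on the Mel scale."""
    top = hz_to_mel(freqs[-1])
    # num_filters + 2 edges in Hz; each filter spans three consecutive edges.
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    energies = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        total = 0.0
        for f, p in zip(freqs, power):
            if lo < f <= mid:
                total += p * (f - lo) / (mid - lo)   # rising slope
            elif mid < f < hi:
                total += p * (hi - f) / (hi - mid)   # falling slope
        energies.append(total)
    return energies


def dct2(values):
    """Unnormalized DCT-II."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(n)]


def sound_features(freqs, power, num_filters=8, num_coeffs=4, floor=1e-10):
    """Step 126: logarithm, then DCT, keeping the first few coefficients."""
    energies = mel_filter_energies(freqs, power, num_filters)
    log_energies = [math.log(max(e, floor)) for e in energies]
    return dct2(log_energies)[:num_coeffs]
```

Here `power` is assumed to be the squared amplitude at each frequency point, and the small `floor` only guards against taking the logarithm of zero when a filter captures no energy.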

Step 128, inputting the obtained sound characteristics of the sound signal into a trained convolutional neural network model, and detecting whether the sound signal is abnormal sound.

The trained convolutional neural network model is obtained by training with normal sounds (conversation sounds, singing sounds and the like) and abnormal sounds in advance. The specific training process is prior art and will not be described herein.

The convolutional neural network model includes: N convolutional layers, each followed by an activation function and a pooling layer, the result of convolving, activating and pooling the sound characteristic of the sound signal being called a feature map and denoted by F; and M fully-connected layers. The convolution kernel size of each convolutional layer is $K_n$, the stride is $S_n$, and the number of convolution kernels is $2n$, where $1 \le n \le N$. The number of neurons of the last fully-connected layer is 2, and the number of neurons of the other fully-connected layers is $m$, where $1 \le m \le M$. The activation function after each convolutional layer is Max-Feature-Map, expressed as:

$$a^{k}_{ij} = \max\left(F^{k}_{ij},\; F^{k+n}_{ij}\right)$$

wherein $1 \le k \le n$, and $F^{k}_{ij}$ is the feature value of the $i$-th column and the $j$-th row in the $k$-th feature map;

since the number of feature maps generated by a convolutional layer is the same as the number of convolution kernels of that layer, the number of generated feature maps F is $2n$;

the derivative of a is then:

$$\frac{\partial a^{k}_{ij}}{\partial F^{k}_{ij}} = \begin{cases} 1, & F^{k}_{ij} \ge F^{k+n}_{ij} \\ 0, & \text{otherwise} \end{cases}$$

The abnormal sound detection method further includes: and when the difference values of the amplitudes between the frequency points and the adjacent frequency points of the frequency points are smaller than the amplitude difference value threshold, determining the sound signals corresponding to the frequency points as broadband signals, thereby determining that the collected sound signals belong to environmental sounds without abnormal sound detection.

Fig. 2c shows the correspondence between frequency points and amplitudes of the sound signal in the case of a wideband signal.

The process of detecting whether the sound signal is an abnormal sound by the trained convolutional neural network model is the prior art, and is not described herein again.
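The Max-Feature-Map activation named in the model description above pairs the k-th feature map with the (k+n)-th and keeps the larger value at each position, halving 2n feature maps to n. The sketch below illustrates that operation and its derivative on plain nested lists; the function names are hypothetical, and sending ties to the first map of each pair is an assumed convention.

```python
def max_feature_map(feature_maps):
    """Reduce 2n feature maps to n by element-wise maximum of map k and map k + n."""
    if len(feature_maps) % 2 != 0:
        raise ValueError("expected an even number of feature maps")
    n = len(feature_maps) // 2
    return [
        [[max(a, b) for a, b in zip(row_k, row_kn)]
         for row_k, row_kn in zip(feature_maps[k], feature_maps[k + n])]
        for k in range(n)
    ]


def max_feature_map_grad(feature_maps):
    """For each of the 2n inputs, return 1 where it was selected by the maximum, else 0."""
    n = len(feature_maps) // 2
    first_half = [
        [[1 if a >= b else 0 for a, b in zip(row_k, row_kn)]
         for row_k, row_kn in zip(feature_maps[k], feature_maps[k + n])]
        for k in range(n)
    ]
    # The paired map k + n receives the gradient exactly where map k does not.
    second_half = [[[1 - g for g in row] for row in fmap] for fmap in first_half]
    return first_half + second_half
```

Because only one map of each pair passes a gradient at any position, the activation acts as a feature selector between the two maps rather than a threshold on a single map.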

Because abnormal sounds may serve as important legal evidence, tampering with or altering them should be prevented as far as possible. Therefore, when the sound signal is determined to be an abnormal sound, the abnormal sound detection method of this embodiment may further include the following steps (1) to (3):

(1) assigning an abnormal sound identifier to the sound signal when it is determined that the sound signal is an abnormal sound;

(2) storing the sound signal in the blockchain system in which the abnormal sound detection device is located, and obtaining the block address, fed back by the blockchain system, at which the sound signal is stored;

(3) performing a hash calculation on the block address at which the sound signal is stored to obtain a block address hash value; generating a correspondence between the abnormal sound identifier and the block address hash value; and sending the block address hash value to the blockchain system, so that the blockchain system, on receiving the block address hash value, generates and stores a correspondence between the block address at which the sound signal is stored and the received block address hash value.

In the above step (1), the abnormal sound identifier assigned to the sound signal is generated by the abnormal sound detection device and is set in the sound signal.

In the step (2), after storing the sound signal, the block chain system may acquire a block address where the sound signal is stored, and then send the acquired block address to the abnormal sound detection apparatus.

As can be seen from steps (1) to (3) above, a sound signal determined to be an abnormal sound is stored in the block chain system, the block address at which the abnormal sound is stored is hashed to obtain a block address hash value, and the calculated block address hash value is also stored in the block chain system. The block chain system thus stores the abnormal sound and manages both the abnormal sound and the block address at which it is stored, and the traceable and immutable nature of the block chain system is used to ensure that the abnormal sound is not tampered with, thereby ensuring the accuracy of the abnormal sound.
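
Steps (1) to (3) can be illustrated with the following Python sketch. It is not the embodiment's implementation: `MockBlockchain` is a hypothetical in-memory stand-in for the block chain system, SHA-256 is an assumed choice of hash function, and the way the mock chain matches a received hash to a block address (by re-hashing its own addresses) is likewise an assumption, since the embodiment does not specify it. Setting the identifier inside the sound signal itself is not modeled.

```python
import hashlib
import uuid


def address_hash(address: str) -> str:
    """Hash a block address (assumption: SHA-256 over the UTF-8 address)."""
    return hashlib.sha256(address.encode("utf-8")).hexdigest()


class MockBlockchain:
    """Hypothetical in-memory stand-in for the block chain system."""

    def __init__(self) -> None:
        self.blocks = {}           # block address -> stored sound signal
        self.hash_to_address = {}  # block address hash -> block address

    def store(self, sound: bytes) -> str:
        """Store the sound signal and feed back its block address."""
        address = f"block-{len(self.blocks)}"
        self.blocks[address] = sound
        return address

    def register_hash(self, received_hash: str) -> bool:
        """Record the correspondence between a block address and its hash."""
        for address in self.blocks:
            if address_hash(address) == received_hash:
                self.hash_to_address[received_hash] = address
                return True
        return False


class Detector:
    """Hypothetical detection-device side of steps (1) to (3)."""

    def __init__(self, chain: MockBlockchain) -> None:
        self.chain = chain
        self.id_to_hash = {}  # abnormal sound identifier -> block address hash

    def record_abnormal_sound(self, sound: bytes) -> str:
        sound_id = uuid.uuid4().hex             # step (1): assign identifier
        address = self.chain.store(sound)       # step (2): store, get address
        hashed = address_hash(address)          # step (3): hash the address
        self.id_to_hash[sound_id] = hashed      # identifier <-> hash mapping
        self.chain.register_hash(hashed)        # chain keeps address <-> hash
        return sound_id
```

In this sketch the detection device keeps only the identifier-to-hash mapping, while the chain keeps the hash-to-address mapping, mirroring the division of responsibilities described in step (3).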

When an abnormal sound needs to be queried, the abnormal sound detection method provided by this embodiment further includes the following steps (11) to (15):

(11) acquiring abnormal sound query information, wherein the abnormal sound query information comprises: the user identification of the user sending the abnormal sound query information, the query time and the abnormal sound identification of the abnormal sound needing to be queried;

(12) when the block address hash value corresponding to the abnormal sound identifier can be inquired by using the abnormal sound identifier in the abnormal sound inquiry information, generating a sound signal inquiry instruction by using the inquired block address hash value corresponding to the abnormal sound identifier;

(13) sending the generated sound signal query instruction to a block chain system, so that the block chain system queries a block address corresponding to the block address hash value carried in the received sound signal query instruction according to the block address hash value carried in the received sound signal query instruction, reads a sound signal from the queried block address, and feeds the read sound signal back to the abnormal sound detection device;

(14) receiving a sound signal fed back by the block chain system, and feeding back the received sound signal to a user sending the abnormal sound query information;

(15) sending the abnormal sound query information to the blockchain system, so that the blockchain system stores the abnormal sound query information into a query log, wherein the query log is arranged in the blockchain system.

As can be seen from steps (11) to (15) above, after the query is completed, the abnormal sound detection device sends the abnormal sound query information to the blockchain system, so that the blockchain system stores it in the query log. The abnormal sound query information is thereby recorded, which facilitates permission and security management of the data; if an abnormal sound is leaked or lost, an investigation can be carried out according to the recorded abnormal sound query information.
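
The query flow of steps (11) to (15) can be sketched as follows. The sketch is illustrative only: `MockChain`, `query_abnormal_sound`, the dictionary keys of the query information, and the choice to return `None` without logging when the identifier is unknown are all assumptions. Because a Python `return` ends the function, the log entry of step (15) is written just before the sound signal is handed back in step (14).

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def sha256_hex(text: str) -> str:
    """Assumed hash function for block addresses."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class MockChain:
    """Hypothetical in-memory stand-in for the blockchain system."""
    blocks: Dict[str, bytes] = field(default_factory=dict)
    hash_to_address: Dict[str, str] = field(default_factory=dict)
    query_log: List[dict] = field(default_factory=list)

    def read_by_hash(self, block_address_hash: str) -> bytes:
        # Step (13): resolve the hash to a block address, then read the block.
        address = self.hash_to_address[block_address_hash]
        return self.blocks[address]

    def append_query_log(self, query_info: dict) -> None:
        # Step (15): keep a copy of the query information in the query log.
        self.query_log.append(dict(query_info))


def query_abnormal_sound(
    query_info: dict, id_to_hash: Dict[str, str], chain: MockChain
) -> Optional[bytes]:
    # Step (11): query_info carries user_id, query_time and sound_id.
    block_address_hash = id_to_hash.get(query_info["sound_id"])  # step (12)
    if block_address_hash is None:
        return None  # assumption: unknown identifiers are simply rejected
    instruction = {"block_address_hash": block_address_hash}
    sound = chain.read_by_hash(instruction["block_address_hash"])  # step (13)
    chain.append_query_log(query_info)                             # step (15)
    return sound                                                   # step (14)
```

Keeping the query log on the chain side, as the embodiment describes, means that each successful query leaves a record of who asked for which abnormal sound and when.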

To sum up, this embodiment provides an abnormal sound detection method. When the abnormal sound detection device is determined to be in a severe environment according to the acquired ambient environment information, the method first determines whether the acquired sound signal is a broadband signal or a narrow-band signal; if the sound signal is determined to be a narrow-band signal, the method proceeds to extract the sound features of the sound signal and inputs the extracted sound features into a trained convolutional neural network model to detect whether the sound signal is an abnormal sound. Compared with abnormal sound detection in the related art, the detection mode can be adjusted according to the ambient environment information, which improves the efficiency of abnormal sound detection under different environmental conditions. Determining whether the acquired sound signal is a broadband signal or a narrow-band signal before abnormal sound detection distinguishes environmental sound from sound that is emitted by an object entering the environment and needs to be detected; the abnormal sound detection stage is entered only when the sound signal is determined not to be environmental sound, which improves detection efficiency and helps prevent false detection.
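
The feature extraction stage summarized above (conversion to Mel frequency, Mel filter bank, logarithm, discrete cosine transform) can be sketched in plain Python as follows. This is a minimal sketch rather than the embodiment's implementation: the Hz-to-Mel formula 2595·log10(1 + f/700), the triangular filter shape, the use of squared amplitudes as filter input, and all function names and default parameter values are assumptions.

```python
import math
from typing import List, Sequence


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to the Mel scale (assumed common formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_filterbank_energies(
    freqs_hz: Sequence[float], power: Sequence[float], n_filters: int
) -> List[float]:
    """Apply triangular filters spaced evenly on the Mel axis."""
    mels = [hz_to_mel(f) for f in freqs_hz]
    lo, hi = min(mels), max(mels)
    if hi <= lo:
        raise ValueError("frequency range must span more than one value")
    step = (hi - lo) / (n_filters + 1)
    energies = []
    for m in range(1, n_filters + 1):
        left, center, right = lo + (m - 1) * step, lo + m * step, lo + (m + 1) * step
        total = 0.0
        for mel, p in zip(mels, power):
            if left < mel <= center:        # rising edge of the triangle
                total += p * (mel - left) / (center - left)
            elif center < mel < right:      # falling edge of the triangle
                total += p * (right - mel) / (right - center)
        energies.append(total)
    return energies


def dct_ii(values: Sequence[float]) -> List[float]:
    """Unnormalized type-II discrete cosine transform."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(n)
    ]


def sound_features(
    freqs_hz: Sequence[float],
    amplitudes: Sequence[float],
    n_filters: int = 8,
    n_coeffs: int = 5,
    eps: float = 1e-10,
) -> List[float]:
    """Mel filtering, logarithm, then DCT; keep the first n_coeffs values."""
    power = [a * a for a in amplitudes]
    energies = mel_filterbank_energies(freqs_hz, power, n_filters)
    log_energies = [math.log(e + eps) for e in energies]  # eps avoids log(0)
    return dct_ii(log_energies)[:n_coeffs]
```

The resulting coefficient vector plays the role of the sound features that are fed to the convolutional neural network model; in practice a signal processing library would normally replace the hand-written filter bank and transform.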

Example 2

The present embodiment proposes an abnormal sound detection apparatus for performing the above abnormal sound detection method.

Referring to a schematic structural diagram of an abnormal sound detection apparatus shown in fig. 3, the present embodiment provides an abnormal sound detection apparatus, including:

a first obtaining module 200, configured to obtain ambient environment information when a sound signal of an ambient environment is collected;

the first processing module 202 is configured to, when it is determined that the abnormal sound detection apparatus is in a severe environment according to the obtained ambient environment information, perform Fourier transform on the sound signal to obtain a Fourier-transformed sound signal, where the Fourier-transformed sound signal contains frequency components of the sound signal; wherein the frequency components of the sound signal comprise: the frequency range of the sound signal, the amplitude of the frequency points in the frequency range and the initial phase of each frequency point;

a calculating module 204, configured to calculate a difference between amplitudes of each frequency point and an adjacent frequency point of each frequency point;

the second processing module 206 is configured to, when the differences between the amplitude of the frequency point and the amplitudes of its adjacent frequency points are greater than or equal to the amplitude difference threshold, determine the sound signal corresponding to the frequency point as a narrow-band signal and jump to the function of the conversion module;

a third processing module 208, configured to, when a difference between the amplitudes of the frequency point and an adjacent frequency point on one side of the frequency point is greater than or equal to an amplitude difference threshold and a difference between the amplitude of the frequency point and an adjacent frequency point on the other side of the frequency point is smaller than the amplitude difference threshold, take the adjacent frequency point on the other side of the frequency point as a frequency point to be detected;

a fourth processing module 210, configured to use a direction in which the frequency point reaches an adjacent frequency point on the other side of the frequency point as a detection direction;

a difference value calculating module 212, configured to calculate a difference value between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected;

a judging module 214, configured to judge whether the difference between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected is smaller than the amplitude difference threshold; if so, take the frequency point adjacent to the frequency point to be detected in the detection direction as the frequency point to be detected, and return to execute the function of the difference calculating module; if not, count the number of the frequency points to be detected;

a determining module 216, configured to, when the number of the frequency points to be detected is less than or equal to a number threshold, determine the sound signal between the frequency point and the frequency point to be detected that is indicated by that number in the detection direction as a narrow-band signal;

a conversion module 218, configured to convert each frequency in the frequency range of the sound signal into a Mel frequency of the sound signal when the sound signal is determined to be a narrow-band signal;

the filtering module 220 is configured to input the Mel frequency of the sound signal into a Mel filter bank to obtain a filtering result of the sound signal in the Mel domain;

a fifth processing module 222, configured to perform logarithm calculation and discrete cosine transform on the filtering result of the obtained sound signal in the Mel domain to obtain the sound characteristic of the sound signal;

a detection module 224, configured to input the obtained sound features of the sound signal into a trained convolutional neural network model and detect whether the sound signal is an abnormal sound; wherein the convolutional neural network model includes: N convolutional layers, with an activation function and a pooling layer arranged after the N convolutional layers, where the result of convolution, activation and pooling of the sound features of the sound signal is called a feature map and is denoted by F; and M fully-connected layers; the convolution kernel size of each convolutional layer is $K_n$, the stride is $S_n$, and the number of convolution kernels is $2n$, where $1 \le n \le N$; the number of neurons of the last fully-connected layer is 2, and the number of neurons of the other fully-connected layers is m, where $1 \le m \le M$; the activation function after each convolutional layer is Max-Feature-Map, expressed as:

$$a^{k}_{ij} = \max\left(F^{k}_{ij},\ F^{k+n}_{ij}\right)$$

wherein $1 \le k \le n$;

the number of F is $2n$;

the derivative of a is then:

$$\frac{\partial a^{k}_{ij}}{\partial F^{k}_{ij}} = \begin{cases} 1, & F^{k}_{ij} \ge F^{k+n}_{ij} \\ 0, & \text{otherwise} \end{cases}$$

$F^{k+n}_{ij}$ represents the value in the i-th column and j-th row of the (k+n)-th feature map among the 2n feature maps generated by the n-th convolutional layer.
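
The Max-Feature-Map activation named above can be illustrated with a short Python sketch. It assumes the usual definition of Max-Feature-Map, in which the 2n feature maps are split into two halves and the element-wise maximum of the k-th and (k+n)-th maps is kept; the function names, the nested-list representation and the tie-breaking rule in the gradient mask are assumptions made for illustration.

```python
from typing import List

FeatureMap = List[List[float]]  # rows of values for one feature map


def max_feature_map(feature_maps: List[FeatureMap]) -> List[FeatureMap]:
    """Reduce 2n feature maps to n by taking max(F[k], F[k+n]) element-wise."""
    if len(feature_maps) % 2 != 0:
        raise ValueError("Max-Feature-Map needs an even number of feature maps")
    n = len(feature_maps) // 2
    return [
        [
            [max(x, y) for x, y in zip(row_k, row_kn)]
            for row_k, row_kn in zip(feature_maps[k], feature_maps[k + n])
        ]
        for k in range(n)
    ]


def mfm_gradient_mask(feature_maps: List[FeatureMap]) -> List[FeatureMap]:
    """1.0 where the k-th map supplies the maximum (ties go to the k-th map),
    0.0 where the (k+n)-th map does."""
    n = len(feature_maps) // 2
    return [
        [
            [1.0 if x >= y else 0.0 for x, y in zip(row_k, row_kn)]
            for row_k, row_kn in zip(feature_maps[k], feature_maps[k + n])
        ]
        for k in range(n)
    ]


# Four hypothetical 1x2 feature maps (n = 2).
maps = [[[1.0, 5.0]], [[2.0, 2.0]], [[3.0, 4.0]], [[0.0, 9.0]]]
print(max_feature_map(maps))
print(mfm_gradient_mask(maps))
```

Because only the larger of each pair of values is passed on, the activation halves the number of feature maps and routes the gradient to whichever map supplied the maximum.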

Further, the abnormal sound detection apparatus according to the present embodiment further includes:

a sixth processing module, configured to determine the sound signal corresponding to the frequency points as a broadband signal when the amplitude differences between each frequency point and its adjacent frequency points are all smaller than the amplitude difference threshold, thereby determining that the collected sound signal is an environmental sound for which abnormal sound detection is not required;

an assigning module for assigning an abnormal sound identifier to the sound signal when it is determined that the sound signal is an abnormal sound;

the storage module is used for storing the sound signal into a block chain system where the abnormal sound detection equipment is located, and obtaining a block address which is fed back by the block chain system and used for storing the sound signal;

and the hash calculation module is used for carrying out hash calculation on the block address for storing the sound signal to obtain a block address hash value, generating a corresponding relation between an abnormal sound identifier and the block address hash value, and sending the block address hash value to a block chain system, so that the block chain system generates a corresponding relation between the block address for storing the sound signal and the received block address hash value when receiving the block address hash value, and stores the generated corresponding relation between the block address for storing the sound signal and the received block address hash value.

a second obtaining module, configured to obtain abnormal sound query information, where the abnormal sound query information includes: the user identification of the user sending the abnormal sound query information, the query time and the abnormal sound identification of the abnormal sound needing to be queried;

a third obtaining module, configured to, when a block address hash value corresponding to the abnormal sound identifier can be queried by using the abnormal sound identifier in the abnormal sound query information, generate a sound signal query instruction by using the queried block address hash value corresponding to the abnormal sound identifier;

a fourth obtaining module, configured to send the generated sound signal query instruction to a block chain system, so that the block chain system queries, according to a block address hash value carried in the received sound signal query instruction, a block address corresponding to the block address hash value carried in the received sound signal query instruction, reads a sound signal from the queried block address, and feeds the read sound signal back to the abnormal sound detection device;

the feedback module is used for receiving the sound signal fed back by the block chain system and feeding back the received sound signal to the user sending the abnormal sound query information;

a sending module, configured to send the abnormal sound query information to the blockchain system, so that the blockchain system stores the abnormal sound query information in a query log, where the query log is set in the blockchain system.

In summary, the present embodiment provides an abnormal sound detection apparatus. When the abnormal sound detection device is determined to be in a severe environment according to the acquired ambient environment information, the apparatus first determines whether the acquired sound signal is a broadband signal or a narrow-band signal; if the sound signal is determined to be a narrow-band signal, the apparatus proceeds to extract the sound features of the sound signal and inputs the extracted sound features into a trained convolutional neural network model to detect whether the sound signal is an abnormal sound. Compared with abnormal sound detection in the related art, the detection mode can be adjusted according to the ambient environment information, which improves the efficiency of abnormal sound detection under different environmental conditions. Determining whether the acquired sound signal is a broadband signal or a narrow-band signal before abnormal sound detection distinguishes environmental sound from sound that is emitted by an object entering the environment and needs to be detected; the abnormal sound detection stage is entered only when the sound signal is determined not to be environmental sound, which improves detection efficiency and helps prevent false detection.
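
The narrow-band determination carried out by the second processing module 206 through the determining module 216 can be sketched as a scan over the amplitude spectrum. The sketch below is one possible reading of that procedure, not a definitive implementation: the function name `find_narrowband_spans`, the use of absolute amplitude differences, the treatment of the first and last frequency points (which are skipped because they have only one neighbor), and the sample data are all assumptions.

```python
from typing import List, Sequence, Tuple


def find_narrowband_spans(
    amplitudes: Sequence[float], diff_threshold: float, count_threshold: int
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of spans judged to be narrow-band."""
    spans: List[Tuple[int, int]] = []
    n = len(amplitudes)
    for i in range(1, n - 1):
        left_big = abs(amplitudes[i] - amplitudes[i - 1]) >= diff_threshold
        right_big = abs(amplitudes[i] - amplitudes[i + 1]) >= diff_threshold
        if left_big and right_big:
            span = (i, i)  # isolated peak: both neighbors differ strongly
        elif left_big != right_big:
            # Detection direction points toward the side with the small difference.
            direction = 1 if left_big else -1
            current = i + direction  # first frequency point to be detected
            count = 1
            while 0 <= current + direction < n:
                nxt = current + direction
                if abs(amplitudes[nxt] - amplitudes[current]) >= diff_threshold:
                    break  # the flat run ends here
                current = nxt
                count += 1
            if count > count_threshold:
                continue  # run is too wide to be a narrow-band signal
            span = (min(i, current), max(i, current))
        else:
            continue  # both differences are small: nothing to decide here
        if span not in spans:
            spans.append(span)
    return spans


# Hypothetical spectrum: long flat baseline with a three-point plateau.
spectrum = [1.0] * 6 + [5.0, 5.1, 5.0] + [1.0] * 6
print(find_narrowband_spans(spectrum, diff_threshold=2.0, count_threshold=3))
```

With these assumed values the three-point plateau is reported as a narrow-band span, while the long flat baseline on either side exceeds the number threshold and is ignored.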

Example 3

The present embodiment proposes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the abnormal sound detection method described in embodiment 1 above. For specific implementation, refer to method embodiment 1, which is not described herein again.

In addition, referring to the schematic structural diagram of an electronic device shown in fig. 4, the present embodiment also provides an electronic device, which includes a bus 51, a processor 52, a transceiver 53, a bus interface 54, a memory 55, and a user interface 56.

In this embodiment, the electronic device further includes: one or more programs stored on the memory 55 and executable on the processor 52, configured for execution by the processor to perform the steps of:

when the abnormal sound detection equipment is determined to be in a severe environment according to the acquired ambient environment information, performing Fourier transform on the sound signal to obtain a Fourier-transformed sound signal, wherein the Fourier-transformed sound signal contains frequency components of the sound signal; wherein the frequency components of the sound signal comprise: the frequency range of the sound signal, the amplitude of the frequency points in the frequency range and the initial phase of each frequency point;

when the difference values of the amplitudes between the frequency points and the adjacent frequency points of the frequency points are larger than or equal to the amplitude difference value threshold, determining the sound signals corresponding to the frequency points as narrow-band signals, and skipping to the step of converting all frequencies in the frequency range of the sound signals into Mel frequencies of the sound signals when the sound signals are determined as the narrow-band signals;

when the difference value of the amplitude between the frequency point and the adjacent frequency point on one side of the frequency point is more than or equal to an amplitude difference threshold value and the difference value of the amplitude between the frequency point and the adjacent frequency point on the other side of the frequency point is less than the amplitude difference threshold value, taking the adjacent frequency point on the other side of the frequency point as a frequency point to be detected;

judging whether the difference between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected is smaller than the amplitude difference threshold; if so, taking the frequency point adjacent to the frequency point to be detected in the detection direction as the frequency point to be detected, and returning to the step of calculating the difference between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected; if not, counting the number of the frequency points to be detected;

wherein k is more than or equal to 1 and less than or equal to n;

n is a preset numerical value;

the derivative of a is then:

$$\frac{\partial a^{k}_{ij}}{\partial F^{k}_{ij}} = \begin{cases} 1, & F^{k}_{ij} \ge F^{k+n}_{ij} \\ 0, & \text{otherwise} \end{cases}$$

where $F^{k+n}_{ij}$ represents the characteristic value in the i-th column and j-th row of the (k+n)-th feature map among the 2n feature maps generated by the n-th convolutional layer.

A transceiver 53 for receiving and transmitting data under the control of the processor 52.

Where a bus architecture (represented by bus 51) is used, bus 51 may include any number of interconnected buses and bridges, with bus 51 linking together various circuits including one or more processors, represented by processor 52, and memory, represented by memory 55. The bus 51 may also link various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further in this embodiment. A bus interface 54 provides an interface between the bus 51 and the transceiver 53. The transceiver 53 may be one element or may be multiple elements, such as multiple receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. For example: the transceiver 53 receives external data from other devices. The transceiver 53 is used for transmitting data processed by the processor 52 to other devices. Depending on the nature of the computing system, a user interface 56, such as a keypad, display, speaker, microphone, joystick, may also be provided.

The processor 52 is responsible for managing the bus 51 and the usual processing, running a general-purpose operating system as described above. And memory 55 may be used to store data used by processor 52 in performing operations.

Optionally, the processor 52 may be, but is not limited to: a central processing unit, a single-chip microcomputer, a microprocessor or a programmable logic device.

It will be appreciated that the memory 55 in embodiments of the invention may be either volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The memory 55 of the systems and methods described in this embodiment is intended to comprise, without being limited to, these and any other suitable types of memory.

In some embodiments, the memory 55 stores the following elements, executable modules or data structures, or a subset or an extended set thereof: an operating system 551 and application programs 552.

The operating system 551 includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, for implementing various basic services and processing hardware-based tasks. The application 552 includes various applications, such as a Media Player (Media Player), a Browser (Browser), and the like, for implementing various application services. A program implementing the method of an embodiment of the present invention may be included in the application 552.

In summary, the present embodiment provides a computer-readable storage medium and an electronic device. When the abnormal sound detection device is determined to be in a severe environment according to the acquired ambient environment information, it is first determined whether the acquired sound signal is a broadband signal or a narrow-band signal; if the sound signal is determined to be a narrow-band signal, the sound features of the sound signal are then extracted and input into a trained convolutional neural network model to detect whether the sound signal is an abnormal sound. Compared with abnormal sound detection in the related art, the detection mode can be adjusted according to the ambient environment information, which improves the efficiency of abnormal sound detection under different environmental conditions. Determining whether the acquired sound signal is a broadband signal or a narrow-band signal before abnormal sound detection distinguishes environmental sound from sound that is emitted by an object entering the environment and needs to be detected; the abnormal sound detection stage is entered only when the sound signal is determined not to be environmental sound, which improves detection efficiency and helps prevent false detection.

The above description covers only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto; any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims

1. An abnormal sound detection method, comprising:

when the abnormal sound detection equipment is determined to be in a severe environment according to the acquired ambient environment information, performing Fourier transform on the sound signal to obtain a Fourier-transformed sound signal, wherein the Fourier-transformed sound signal contains frequency components of the sound signal; wherein the frequency components of the sound signal comprise: the frequency range of the sound signal, the amplitude of the frequency points in the frequency range and the initial phase of each frequency point;

respectively calculating the difference value of the amplitude value between each frequency point and the adjacent frequency point of the frequency points;

wherein k is more than or equal to 1 and less than or equal to n;

the number of F is 2n;

the derivative of a is then:

2. The method of claim 1, further comprising: when the differences between the amplitude of each frequency point and the amplitudes of its adjacent frequency points are all smaller than the amplitude difference threshold, determining the sound signal corresponding to the frequency points as a broadband signal, thereby determining that the collected sound signal is an environmental sound for which abnormal sound detection is not required.

3. The method of claim 1, further comprising:

assigning an abnormal sound identifier to the sound signal when it is determined that the sound signal is an abnormal sound;

storing the sound signal into a block chain system where the abnormal sound detection equipment is located, and obtaining a block address fed back by the block chain system and used for storing the sound signal;

carrying out Hash calculation on a block address for storing the sound signal to obtain a block address Hash value, generating a corresponding relation between an abnormal sound identifier and the block address Hash value, sending the block address Hash value to a block chain system, enabling the block chain system to generate a corresponding relation between the block address for storing the sound signal and the received block address Hash value when receiving the block address Hash value, and storing the generated corresponding relation between the block address for storing the sound signal and the received block address Hash value.

4. The method of claim 3, further comprising:

acquiring abnormal sound query information, wherein the abnormal sound query information comprises: the user identification of the user sending the abnormal sound query information, the query time and the abnormal sound identification of the abnormal sound needing to be queried;

when the block address hash value corresponding to the abnormal sound identifier can be inquired by using the abnormal sound identifier in the abnormal sound inquiry information, generating a sound signal inquiry instruction by using the inquired block address hash value corresponding to the abnormal sound identifier;

sending the generated sound signal query instruction to a block chain system, so that the block chain system queries a block address corresponding to the block address hash value carried in the received sound signal query instruction according to the block address hash value carried in the received sound signal query instruction, reads a sound signal from the queried block address, and feeds the read sound signal back to the abnormal sound detection device;

receiving a sound signal fed back by the block chain system, and feeding back the received sound signal to a user sending the abnormal sound query information;

sending the abnormal sound query information to the blockchain system, so that the blockchain system stores the abnormal sound query information into a query log, wherein the query log is arranged in the blockchain system.

5. An abnormal sound detection apparatus, comprising:

the first processing module is used for performing Fourier transform on the sound signal to obtain a Fourier-transformed sound signal when the abnormal sound detection device is determined to be in a severe environment according to the obtained surrounding environment information, wherein the Fourier-transformed sound signal comprises frequency components of the sound signal; wherein the frequency components of the sound signal comprise: the frequency range of the sound signal, the amplitude of the frequency points in the frequency range and the initial phase of each frequency point;

the second processing module is used for determining the sound signal corresponding to the frequency point as a narrow-band signal and jumping to the function of the conversion module when the differences between the amplitude of the frequency point and the amplitudes of its adjacent frequency points are greater than or equal to the amplitude difference threshold;

the third processing module is used for taking the adjacent frequency point on the other side of the frequency point as the frequency point to be detected when the difference value between the amplitude of the frequency point and the amplitude of the adjacent frequency point on one side of the frequency point is more than or equal to the amplitude difference threshold value and the difference value between the amplitude of the frequency point and the amplitude of the adjacent frequency point on the other side of the frequency point is less than the amplitude difference threshold value;

the judging module is used for judging whether the difference between the amplitude of the frequency point adjacent to the frequency point to be detected in the detection direction and the amplitude of the frequency point to be detected is smaller than the amplitude difference threshold; if so, the frequency point adjacent to the frequency point to be detected in the detection direction is taken as the frequency point to be detected, and the function of the difference calculating module is executed again; if not, the number of the frequency points to be detected is counted;

wherein k is more than or equal to 1 and less than or equal to n;

the number of F is 2n;

the derivative of a is then:

6. The apparatus of claim 5, further comprising:

a sixth processing module, used for determining the sound signal corresponding to the frequency points as a broadband signal when the differences between the amplitude of each frequency point and the amplitudes of its adjacent frequency points are all smaller than the amplitude difference threshold, thereby determining that the collected sound signal is an environmental sound for which abnormal sound detection is not required.

7. The apparatus of claim 5, further comprising:

8. The apparatus of claim 7, further comprising:

9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of the claims 1 to 4.

10. An electronic device comprising a memory, a processor, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor to perform the steps of the method of any of claims 1-4.