Disclosure of Invention
The purpose of this application is to calculate the coordinates of an abnormal sound source from the propagation delay between the microphones and the abnormal sound source, in combination with three-channel abnormal power characteristic values, thereby locating abnormal sound sources during the operation of electrical equipment.
The technical solution of the first aspect of the application is a method for positioning an abnormal sound source of power transmission and transformation equipment based on an improved EfficientNet, comprising the following steps: step 1, using a sound acquisition unit to acquire sample power spectrograms during the operation of electrical equipment and labeling the abnormal power values in the abnormal power spectrograms among them, wherein the sample power spectrograms comprise normal power spectrograms and abnormal power spectrograms; step 2, introducing a composite coefficient on the basis of a convolutional neural network and an EfficientNet network to construct an initial classification model, training the initial classification model with the sample power spectrograms, and recording the trained initial classification model as the power spectrogram classification model; and step 3, acquiring a power spectrogram to be detected from the equipment to be detected, performing abnormal power identification on it with the power spectrogram classification model, and, when abnormal power is identified, calculating the coordinates of the abnormal sound source of the equipment to be detected using a fusion algorithm.
In any one of the above technical solutions, further, the initial classification model includes an initial convolutional layer configured to perform convolution and standardization on the input data, and in step 2, training the initial classification model with the sample power spectrograms specifically includes:
step 21, randomly dividing the sample power spectrograms into training data and test data;
step 22, inputting the training data into the initial convolutional layer to obtain a standardized input image;
step 23, performing multilayer convolution module calculation on the standardized input image, and performing data enhancement on the intermediate values calculated by each layer of the network according to a composite coefficient, wherein the calculation formula of the composite coefficient is as follows:
wherein w is the width of the current network, d is the depth of the current network, (h, p) is the resolution of the standardized input image, r is the resolution magnification, and α is the learning rate of the current network;
step 24, performing a first revision of the parameters in the initial classification model according to the result of the multilayer convolution module calculation, wherein the parameters comprise the learning rate α;
and step 25, performing a second revision of the parameters in the first-revised initial classification model according to the test data, and recording the second-revised initial classification model as the power spectrogram classification model.
In any of the above technical solutions, further, the revision formula of the parameters is:
$$\mathrm{Memory}(N) \le \mathrm{target\_memory}$$
$$\mathrm{FLOPS}(N) \le \mathrm{target\_flops}$$
In the formula, $F_i$ is the i-th convolution module in the initial classification model; $L_i$ is the number of times the i-th convolution module block is repeated, and a derivation calculation is performed on $L_i$; $F_i^{L_i}$ denotes the i-th convolution block, formed by repeating the convolution layer $F_i$ $L_i$ times, and a derivation calculation is likewise performed on this result; $i = 1, 2, \dots, m$; $\odot$ is the XNOR operation; target_memory is the resource consumption target value; target_flops is the computation consumption target value; Memory(·) is the resource consumption function; FLOPS(·) is the computation consumption function; N(·) is the result calculated by one layer of the convolution module; and Accuracy(·) is the function that makes the accuracy of the network the highest within the constraint conditions.
In any one of the above technical solutions, further, step 3 specifically includes: step 31, acquiring a power spectrogram to be detected from the equipment to be detected using the sound acquisition unit, and performing abnormal power identification on it with the power spectrogram classification model to generate an abnormal power characteristic value of the power spectrogram to be detected, wherein the sound acquisition unit comprises at least 3 sound sensors; step 32, calculating the distance between the abnormal sound source and each sound sensor from the abnormal power characteristic value and the time between two adjacent maximum power values in the power spectrograms to be detected acquired by any two sound sensors; and step 33, calculating the coordinates of the abnormal sound source from the distances and the spatial coordinates of the sound sensors.
In any of the above technical solutions, further, the calculation formula of the distance is:
In the formula, $W_u$ is the abnormal power characteristic value of the power spectrogram to be detected acquired by the u-th sound sensor, $u = 1, \dots, h, \dots, j$, where h and j are the indices of the sound sensors; a is a preset precision coefficient; $r_u$ is the distance between the abnormal sound source and the u-th sound sensor; $V_s$ is the propagation speed of sound in air; and $T_h$ is the time difference between two adjacent maximum power values corresponding to the h-th sound sensor.
The technical solution of the second aspect of the application is a device for positioning an abnormal sound source of power transmission and transformation equipment, comprising a sound collection unit, a processor and a memory, wherein the memory stores a computer program, and when the processor executes the computer program stored in the memory, the method for positioning the abnormal sound source of power transmission and transformation equipment based on the improved EfficientNet is implemented so as to locate the abnormal sound source of the equipment to be detected.
The beneficial effects of this application are as follows:
According to the technical solution of this application, a ternary microphone array is deployed as the sound collection unit, and a trained power spectrogram classification model judges in real time whether the sound power values in the power spectrogram to be detected belong to normal or abnormal sound. By combining the three-channel abnormal power characteristic values, the coordinates of the abnormal sound source are calculated from the propagation delay between the microphones and the abnormal sound source, which completes abnormal sound source positioning during the operation of electrical equipment, improves the accuracy of the coordinate calculation, shortens the positioning time, and improves positioning efficiency.
The method can accurately classify the sound emitted by power transmission and transformation equipment: sound stream segments with a fixed number of frames are classified as normal or abnormal sound, abnormal segments are captured, and the abnormal power characteristic value corresponding to each abnormal frame is predicted.
Instead of the traditional time delay estimation algorithm, the method locates the sound source by fusing abnormal sound source power values with time delay data. Several pieces of azimuth information are obtained through a set of formulas and geometric calculations, and the resulting equations are solved for the target sound source coordinates, so that the three-dimensional coordinate vector of the target abnormal sound source is obtained directly and with high accuracy.
The method makes full use of the modeling and fitting capability of the neural network: it classifies the input sound stream as normal or abnormal, predicts and captures the abnormal power characteristic value of the abnormal sound source, and then fuses the time delay information to calculate the azimuth coordinates, thereby achieving real-time detection and positioning of abnormal sound sources of power transmission and transformation equipment.
Detailed Description
In order that the above objects, features and advantages of the present application can be more clearly understood, the present application will be described in further detail with reference to the accompanying drawings and detailed description. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the present application may be practiced in ways other than those described herein, and therefore the scope of the present application is not limited by the specific embodiments disclosed below.
As shown in FIG. 1, the present embodiment provides a method for positioning an abnormal sound source of power transmission and transformation equipment based on an improved EfficientNet, where the method includes:
Step 1, using a sound acquisition unit to acquire sample power spectrograms during the operation of electrical equipment and labeling the abnormal power values in the abnormal power spectrograms among them, wherein the sample power spectrograms comprise normal power spectrograms and abnormal power spectrograms.
In this embodiment, a conventional ternary microphone array is used as the sound collection unit. The array is deployed around the power equipment to collect sound data during its operation and to obtain the sample power spectrograms, which serve as the database of the power spectrogram classification model.
During data collection, the ternary microphone array can be arranged in a planar or three-dimensional topology, and the microphones are connected to a PC through three leads. Audio processing software such as Audacity is installed on the PC and used to analyze the sound collected on the multiple channels. Normal and abnormal working sounds of the power transmission and transformation equipment are recorded on site, and the sample power spectrograms output by Audacity are captured in real time. With 100 frames as the total width of each spectrogram, a total of 10000 normal and abnormal power spectrograms of the power equipment are collected, and the abnormal power values are labeled in the abnormal power spectrograms, as shown in FIG. 2.
In this embodiment, the sound category of a normal power spectrogram is labeled "normal", the sound category of an abnormal power spectrogram is labeled "abnormal", and the power value of each abnormal power spectrogram is also labeled.
As shown in FIG. 3, in this embodiment a conventional convolutional neural network model may be used as the initial model, with 80% of the collected sample power spectrograms used as training data and the remaining 20% as test data. The initial model is trained and tested to obtain the final power spectrogram classification model, which performs abnormal power identification on the equipment to be detected.
In this embodiment, in order to improve both the accuracy of abnormal power identification and the utilization of data resources, the conventional convolutional neural network model is improved to form a lightweight deep learning model for image classification.
Step 2, introducing a composite coefficient on the basis of a convolutional neural network and an EfficientNet network to construct an initial classification model, training the initial classification model with the sample power spectrograms, and recording the trained initial classification model as the power spectrogram classification model, wherein the initial classification model comprises an initial convolutional layer used to perform convolution and standardization on the input data.
In this embodiment, a convolutional neural network is combined with the EfficientNet network model and a composite coefficient is introduced, so that all dimensions, such as depth, width and resolution, are scaled uniformly to adjust and extend the network depth, which yields better performance without increasing the computation time.
As shown in FIG. 4, in the present embodiment the stem structure is used as the initial convolutional layer, which performs convolution and standardization on the input data.
In this embodiment, the power spectrogram classification model with the composite coefficient introduced can be regarded as a network formed by combining several convolution modules. Each time a convolution module block outputs a result, the dimensions of the output are scaled by the adaptive composite coefficient, which reduces the search space.
Specifically, in step 2, training the initial classification model with the sample power spectrograms includes:
Step 21, randomly dividing the sample power spectrograms into training data and test data, wherein 80% of the sample power spectrograms are used as training data and the remaining 20% as test data.
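The random 80/20 division in step 21 can be sketched as follows. This is a minimal illustration only: the function name `split_samples`, the fixed seed, and the file names and labels in the sample list are hypothetical and are not part of the described method.

```python
import random

def split_samples(samples, train_fraction=0.8, seed=0):
    """Randomly split labeled spectrogram samples into training and test sets."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(samples)                   # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)      # seeded for a reproducible split
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical sample records: (file name, sound category label)
samples = [(f"spec_{k:05d}.png", "abnormal" if k % 2 else "normal")
           for k in range(10000)]
train_data, test_data = split_samples(samples)
```

With 10000 collected spectrograms this yields 8000 training samples and 2000 test samples, and no sample appears in both sets.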
Step 22, inputting the training data into the initial convolutional layer to obtain a standardized input image, which then passes through the corresponding activation function and convolution module blocks.
In this embodiment, the convolution module block structure is divided into a backbone network and a residual edge structure.
Specifically, after being processed by the stem structure, the training data first enters the block backbone network, where a 1 × 1 convolution raises the dimension of the feature layer of the training data (input image), followed by a standardization operation and an activation function, which yields the dimension-raised image feature layer.
Next, a depthwise separable convolution is performed, for which the convolution kernel size can be chosen as 3 × 3 or 5 × 5, followed by a standardization operation and an activation function, and then an attention mechanism is applied.
The attention mechanism in this embodiment is divided into two parts. The first part computes the attention weights: the input image feature layer is globally pooled so that only the channel dimension remains; a reshape operation then adds the height and width dimensions back to the pooled result; a 1 × 1 convolution compresses the number of channels and a second 1 × 1 convolution expands it again, so that after the two 1 × 1 convolutions the resulting feature layer has the same number of channels as the input feature layer; finally, a sigmoid operation fixes the values between 0 and 1. The second part multiplies this output by the image feature layer that requires attention, so that the attention mechanism is applied to every such feature layer.
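The channel attention described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the embodiment's implementation: the feature layer is represented as a nested list indexed `[channel][row][column]`, the weight and bias arguments are hypothetical placeholders for trained parameters, and the reshape step is implicit because a 1 × 1 convolution applied to a 1 × 1 spatial map reduces to a matrix-vector product.

```python
import math

def global_average_pool(feature):
    """Reduce a [C][H][W] feature layer to one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]

def conv1x1(vector, weights, bias):
    """1 x 1 convolution on a 1 x 1 x C tensor, i.e. a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def sigmoid(x):
    """Numerically stable logistic function with values in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def channel_attention(feature, w_reduce, b_reduce, w_expand, b_expand):
    pooled = global_average_pool(feature)               # only the channel dimension remains
    squeezed = conv1x1(pooled, w_reduce, b_reduce)      # compress the number of channels
    expanded = conv1x1(squeezed, w_expand, b_expand)    # expand back to C channels
    gates = [sigmoid(v) for v in expanded]              # one weight per channel
    # Multiply every value of each channel by that channel's attention weight.
    return [[[g * px for px in row] for row in ch] for ch, g in zip(feature, gates)]

# Hypothetical 2-channel, 2 x 2 feature layer with all-zero weights and biases.
feature = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
attended = channel_attention(feature, [[0.0, 0.0]], [0.0], [[0.0], [0.0]], [0.0, 0.0])
```

With all-zero weights every gate equals sigmoid(0) = 0.5, so the example simply halves the input, which makes the data flow easy to check by hand.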
After the attention mechanism, a 1 × 1 convolution reduces the dimension, followed by a standardization operation and an activation function, and the result of the backbone network is output. This output is added to the output of the residual edge structure, which completes the calculation of one convolution module block.
Step 23, performing multilayer convolution module calculation on the standardized input image, and performing data enhancement on the intermediate values calculated by each layer of the network according to a composite coefficient, wherein the calculation formula of the composite coefficient is as follows:
wherein w is the width of the current network; d is the depth of the current network; (h, p) is the resolution of the standardized input image; r is the resolution magnification, whose initial value is 1 and which is continuously updated and optimized in the network under the constraints, the resolution being changed by multiplying (h, p) by r; and α is the learning rate of the current network.
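The role of the resolution magnification r can be illustrated with a short sketch. It covers only the statement that the resolution (h, p) is multiplied by r, starting from r = 1; it does not reproduce the composite coefficient formula itself. The function name, the rounding to whole pixels, and the example sizes are assumptions made for illustration.

```python
def scale_resolution(h, p, r=1.0):
    """Multiply the input resolution (h, p) by the resolution magnification r."""
    if r <= 0:
        raise ValueError("r must be positive")
    # Round to whole pixels and keep at least one pixel per dimension (assumption).
    return max(1, round(h * r)), max(1, round(p * r))

unchanged = scale_resolution(100, 200)        # r starts at 1, so nothing changes
enlarged = scale_resolution(100, 200, 1.5)    # hypothetical updated magnification
```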
Step 24, performing a first revision of the parameters in the initial classification model according to the result of the multilayer convolution module calculation, wherein the parameters include at least the learning rate α, the width w of the current network, the depth d of the current network, the resolution (h, p) and the resolution magnification r.
In the process of parameter revision, the revision formula of the parameters is as follows:
$$\mathrm{Memory}(N) \le \mathrm{target\_memory}$$
$$\mathrm{FLOPS}(N) \le \mathrm{target\_flops}$$
That is, the meanings of the above constraints are:
Memory(N) ≤ target_memory: the consumption of memory resources does not exceed the target value;
FLOPS(N) ≤ target_flops: the consumption of computation does not exceed the target value.
In the formula, Accuracy(·) is the function that makes the accuracy of the network the highest within the constraints; $F_i$ is the i-th convolution module in the initial classification model; $F_i^{L_i}$ denotes the i-th convolution block, formed by repeating the convolution layer $F_i$ $L_i$ times, and a derivation calculation is then performed on this result; $L_i$ is the number of times the i-th convolution module block is repeated, and a derivation calculation is likewise performed on $L_i$; $i = 1, 2, \dots, m$; target_memory is the resource consumption target value; target_flops is the computation consumption target value; Memory(·) is the resource consumption function; FLOPS(·) is the computation consumption function; N(·) is the result calculated by one layer of the convolution module; and $\odot$ is the XNOR operation, that is, the results calculated by the several convolution modules N are combined as a whole by the XNOR operation.
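The effect of the two constraints can be sketched as a selection over candidate network configurations: keep only the candidates whose memory and computation consumption stay within the targets, then take the most accurate one. This is a simplified illustration, not the revision procedure of the embodiment. The dictionary keys, the function name, and every number below are hypothetical placeholders, not measured results.

```python
def select_configuration(candidates, target_memory, target_flops):
    """Return the most accurate candidate that satisfies both resource limits."""
    feasible = [c for c in candidates
                if c["memory"] <= target_memory and c["flops"] <= target_flops]
    if not feasible:
        return None   # no candidate meets Memory(N) and FLOPS(N) targets
    return max(feasible, key=lambda c: c["accuracy"])

# Placeholder candidates (invented values, for illustration only).
candidates = [
    {"name": "A", "accuracy": 0.91, "memory": 40, "flops": 0.4},
    {"name": "B", "accuracy": 0.95, "memory": 80, "flops": 1.2},
    {"name": "C", "accuracy": 0.93, "memory": 55, "flops": 0.7},
]
best = select_configuration(candidates, target_memory=60, target_flops=1.0)
```

Candidate B has the highest accuracy but violates both limits, so the sketch returns candidate C.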
Step 25, performing a second revision of the parameters in the first-revised initial classification model according to the test data, and recording the second-revised initial classification model as the power spectrogram classification model.
It should be noted that the construction process and the testing process of the initial classification model are similar to the training process, and are not described herein again.
Step 3, acquiring a power spectrogram to be detected from the equipment to be detected, performing abnormal power identification on it with the power spectrogram classification model, and, when abnormal power is identified, calculating the coordinates of the abnormal sound source of the equipment to be detected using a fusion algorithm.
In this embodiment, the ternary microphone array is used as the sound collection unit, which enables three-dimensional coordinate positioning of the abnormal sound source.
The power spectrogram to be detected is input into the previously trained model (the power spectrogram classification model), which outputs in real time the sound type (normal or abnormal) and the predicted power value for the sound received by each of the three microphones (sound sensors).
The sound type is judged first: if the classification result of any of the three sound sensor channels is normal, the sound is judged to be in a normal state and no processing is performed; otherwise, the three-channel power values are taken as the abnormal power characteristic values for subsequent processing. Once the abnormal power characteristic values are obtained, they are fused with the time delay data of the sound sensors to calculate the coordinates of the abnormal sound source, as shown in FIG. 5.
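The three-channel decision rule above can be sketched as follows. The representation of each channel output as a `(label, predicted_power)` pair and the function name are assumptions made for illustration; the power values in the example are placeholders.

```python
def fuse_channel_outputs(outputs):
    """Apply the three-channel rule to (label, predicted_power) pairs.

    Returns None when any channel is classified as normal (no processing),
    otherwise the three power values as abnormal power characteristic values.
    """
    if len(outputs) != 3:
        raise ValueError("expected the outputs of three sound sensor channels")
    if any(label == "normal" for label, _ in outputs):
        return None
    return [power for _, power in outputs]

# Placeholder model outputs for the three microphones.
mixed = fuse_channel_outputs([("abnormal", 1.0), ("normal", 2.0), ("abnormal", 3.0)])
all_abnormal = fuse_channel_outputs([("abnormal", 1.0), ("abnormal", 2.0), ("abnormal", 3.0)])
```

Only the second call returns power values; the first returns `None` because one channel reports a normal sound.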
Step 3 specifically includes:
Step 31, acquiring a power spectrogram to be detected from the equipment to be detected using the sound acquisition unit, performing abnormal power identification on it with the power spectrogram classification model, and, when the power values obtained from the power spectrograms to be detected of the three sound sensor channels are judged to exceed a preset safety threshold, taking the obtained power values as the abnormal power characteristic values of the power spectrogram to be detected.
The predicted power value in this embodiment is obtained by the power spectrogram classification model through image recognition of the power spectrogram to be detected; the prediction process is not limited in this embodiment.
Step 32, calculating the distance between the abnormal sound source and the sound sensor according to the time between two adjacent maximum power values in the power spectrogram to be detected acquired by any two sound sensors and the abnormal power characteristic value, wherein the calculation formula of the distance is as follows:
In the formula, $W_u$ is the abnormal power characteristic value of the power spectrogram to be detected acquired by the u-th sound sensor, $u = 1, \dots, h, \dots, j$, where h and j are the indices of the sound sensors; a is a preset precision coefficient; $r_u$ is the distance between the abnormal sound source and the u-th sound sensor; $V_s$ is the propagation speed of sound in air; and $T_h$ is the time difference between two adjacent maximum power values corresponding to the h-th sound sensor.
Specifically, the formula for the distance from the abnormal sound source to any microphone (sound sensor) in the ternary microphone array can be obtained from the correspondence between the distance to the abnormal sound source and the power, as follows:
where b is a proportionality coefficient, i.e. the distance $r_u$ is proportional to the power term. Therefore, for a ternary microphone array, the ratio of the distances from the abnormal sound source to the three microphones is:
Further, since each microphone (sound sensor) in the ternary microphone array is located at a different position on the electrical equipment, the time the abnormal sound takes to reach each microphone also differs.
For microphones h and j, the difference between the distances from the abnormal sound source to the two microphones is determined by the propagation delay of the abnormal sound, i.e. $r_h - r_j = V_s \cdot |T_h - T_j|$.
To calculate the propagation delay, the time at which the abnormal sound reaches each sound sensor must be determined. For each sound sensor, at the same frequency, the time difference between two adjacent maximum power values is taken as the time for the abnormal sound to reach that sensor, so the propagation delay can be determined from the power spectrogram of the corresponding sound sensor.
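The relation $r_h - r_j = V_s \cdot |T_h - T_j|$ can be checked with a few lines of code. The value 343 m/s for $V_s$ is an assumption (the approximate speed of sound in air at about 20 °C), and the arrival times in the example are placeholders.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed value of V_s for air at about 20 degrees C

def distance_difference(t_h, t_j, v_s=SPEED_OF_SOUND):
    """Distance difference between sensors h and j from their times T_h and T_j (seconds)."""
    return v_s * abs(t_h - t_j)

# Placeholder times: 10 ms at sensor h and 8 ms at sensor j.
delta_r = distance_difference(0.010, 0.008)
```

A 2 ms difference corresponds to a path difference of roughly 0.69 m under this assumed speed of sound.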
Step 33, calculating the coordinates of the abnormal sound source from the distances and the spatial coordinates of the sound sensors.
Let the spatial coordinates of the three microphones in the ternary microphone array be, in order, $(0, -y_1, 0)$, $(0, y_1, 0)$ and $(x_1, 0, 0)$, and let the coordinates of the abnormal sound source be $(x, y, z)$. From the positional relationship between the three-dimensional vectors, the following vector relationship is obtained:
Therefore, the abnormal sound source coordinates $(x, y, z)$ can be calculated.
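One way to carry out this last step is the standard intersection of three spheres centred on the microphones, sketched below. This is an assumption about how the equations may be solved, not necessarily the vector relationship used in the embodiment. Because the three microphones lie in the plane z = 0, the sign of z cannot be determined from distances alone; the sketch assumes the source lies on the side z ≥ 0. The function name and the example geometry are hypothetical.

```python
import math

def locate_source(r1, r2, r3, x1, y1):
    """Solve for (x, y, z) from distances to sensors at
    (0, -y1, 0), (0, y1, 0) and (x1, 0, 0), assuming z >= 0."""
    if x1 == 0 or y1 == 0:
        raise ValueError("x1 and y1 must be non-zero")
    # Subtracting the first two sphere equations gives 4*y*y1 = r1^2 - r2^2.
    y = (r1 ** 2 - r2 ** 2) / (4.0 * y1)
    # Adding them gives s = x^2 + y^2 + z^2.
    s = (r1 ** 2 + r2 ** 2) / 2.0 - y1 ** 2
    # The third sphere equation then gives x.
    x = (s + x1 ** 2 - r3 ** 2) / (2.0 * x1)
    z_sq = s - x ** 2 - y ** 2
    if z_sq < -1e-9:
        raise ValueError("distances are inconsistent with the sensor layout")
    return x, y, math.sqrt(max(z_sq, 0.0))

# Hypothetical check: sensors with x1 = y1 = 1 and a source at (1.0, 0.5, 2.0).
source = (1.0, 0.5, 2.0)
sensors = [(0.0, -1.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
r1, r2, r3 = (math.dist(source, s) for s in sensors)
estimate = locate_source(r1, r2, r3, x1=1.0, y1=1.0)
```

In this noise-free example the estimate reproduces the assumed source position up to floating-point rounding; with measured distances the result inherits their errors.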
On the basis of the foregoing embodiments, this embodiment further provides a device for positioning an abnormal sound source of power transmission and transformation equipment. The device includes a sound collection unit, a processor, and a memory, where the memory stores a computer program, and when the processor executes the computer program stored in the memory, the method for positioning an abnormal sound source of power transmission and transformation equipment based on the improved EfficientNet in the foregoing embodiments is implemented to locate the abnormal sound source of the equipment to be detected.
Testing of the abnormal sound source positioning method and device shows that they make full use of the modeling and fitting capability of the neural network: the input sound stream is classified as normal or abnormal, the abnormal power characteristic value of the abnormal sound source is predicted and captured, and the time delay information is fused to calculate the azimuth coordinates, thereby achieving real-time detection and positioning of abnormal sound sources of power transmission and transformation equipment.
The experimental results show that when the test model and the sound processing software are deployed on front-end embedded equipment, abnormal sounds emitted by sound sources such as transformers can be captured quickly and accurately and then tracked and located. This achieves intelligent real-time detection and positioning of abnormal sound sources of power transmission and transformation equipment, with good performance and good prospects for application in locating abnormal sound sources in both indoor and outdoor environments.
The technical solution of the application has been described in detail above with reference to the accompanying drawings. The application provides a method for positioning an abnormal sound source of power transmission and transformation equipment based on an improved EfficientNet, which comprises the following steps: acquiring sample power spectrograms during the operation of electrical equipment using a sound acquisition unit, and labeling the abnormal power values in the abnormal power spectrograms among them; constructing an initial classification model on the basis of a convolutional neural network, training it with the sample power spectrograms, and recording the trained initial classification model as the power spectrogram classification model; and acquiring a power spectrogram to be detected from the equipment to be detected, performing abnormal power identification on it with the power spectrogram classification model, and, when abnormal power is identified, calculating the coordinates of the abnormal sound source of the equipment to be detected using a fusion algorithm. With this technical solution, the coordinates of the abnormal sound source are calculated from the propagation delay between the microphones and the abnormal sound source in combination with the three-channel abnormal power characteristic values, which completes abnormal sound source positioning during the operation of electrical equipment.
The steps in the present application may be reordered, combined, and omitted according to actual requirements.
The units in the device can be merged, divided and deleted according to actual requirements.
Although the present application has been disclosed in detail with reference to the accompanying drawings, it is to be understood that such description is merely illustrative and not restrictive of the application of the present application. The scope of the present application is defined by the appended claims and may include various modifications, adaptations, and equivalents of the invention without departing from the scope and spirit of the application.