CN112668584A - Intelligent detection method for portrait of air conditioner external unit based on visual attention and multi-scale convolutional neural network

Publication number: CN112668584A
Authority: CN (China)
Prior art keywords: branch, feature, attention, size, network
Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion)
Application number: CN202011545170.XA
Other languages: Chinese (zh)
Inventors: 袁东风, 狄子钧, 梁聪
Current assignee: Shandong University (the listed assignees may be inaccurate)
Original assignee: Shandong University
Application filed by Shandong University
Priority to CN202011545170.XA
Publication of CN112668584A

Abstract

The invention relates to an intelligent detection method for portraits of air conditioner outdoor units based on visual attention and a multi-scale convolutional neural network, comprising the following steps: (1) data preprocessing: manually classifying the portrait samples of the air conditioner outdoor unit to generate correct and wrong labels; (2) reading the preprocessed sample images, inputting them into a visual attention network, and generating attention distribution maps; (3) inputting the attention distribution maps into a multi-scale network for training to obtain a deep fusion feature vector; (4) training with the deep fusion feature vector as the input of a softmax classifier model; (5) inputting the verification sample set into the softmax classifier model to verify classification accuracy, obtaining a trained softmax classifier model; (6) inputting the test sample set into the trained softmax classifier model to obtain a correct or wrong classification result for the test sample set. Residual connections in the multi-scale network conduct the gradient in the backward pass, so that a deeper model can be trained successfully and the performance of the network is improved.

Description

Intelligent detection method for portrait of air conditioner external unit based on visual attention and multi-scale convolutional neural network
Technical Field
The invention relates to the technical field of intelligent portrait detection of an air conditioner outdoor unit, in particular to an intelligent portrait detection method of the air conditioner outdoor unit based on visual attention and a multi-scale network.
Background
Air conditioner outdoor units of different models carry different attached icons and use different connecting pipes. Manual inspection is time-consuming and labor-intensive. In the context of the industrial internet, a neural network is expected to replace manual inspection in portrait detection: it can judge quickly and in real time whether the connecting pipes and the icon of a product match, and feed the results back to the factory in real time. Outdoor units can then be inspected efficiently and at low cost, so that the production line is managed more effectively, flexibility is enhanced, production cost is reduced, and enterprise benefits are improved. No intelligent detection technology for images of air conditioner outdoor units is currently available, and this research is expected to be applied to that scenario.
When processing an image, a neural network treats all features of the image equally. By selectively assigning attention to different parts of the input, in the manner of human vision, regions of interest can be selectively extracted from a picture or video. The extracted regions are processed and the information is progressively combined to build a dynamic internal representation of the scene or environment. A visual attention model can be used to extract features of a target region of interest in an image, and this concept has been applied to visual recognition and classification. In the intelligent portrait detection of air conditioner outdoor units, attention falls on the pipe orifice colors and the icons, and the attention regions of different air conditioner models differ in pixel values.
Convolutional neural networks are widely used in image recognition. They extract multi-scale information from images through convolution operations, and deeper architectures can extract more subtle features. In large networks, convolutional layers also have sparse connections, which helps avoid overfitting. To extract multi-resolution features of an image, Hu proposed an improved multi-scale convolutional neural network. The network has a three-branch structure whose branches contain different numbers of convolutional layers, with residual connections between convolutional layers of the same size. The model can effectively extract relevant features at different levels of abstraction, and the residual connections markedly improve network optimization. Although the multi-scale network has advantages in feature extraction, it has limitations in portrait detection: when it extracts features, irrelevant information and information of interest are trained together with equal weights, which increases the difficulty of computation and analysis and lowers information-processing efficiency.
Disclosure of Invention
To address the problem of intelligent detection of air conditioner outdoor unit images, the invention provides an intelligent detection method for portraits of air conditioner outdoor units based on visual attention and a multi-scale convolutional neural network.
In the invention, the differences in pixel values between attention regions in a picture are first used as labels, and an attention mechanism is introduced to process the visual information, learning the attention regions and their surrounding structures and generating an attention distribution map. Second, the generated attention distribution map is input into a multi-scale network for training. The multi-scale network has a three-branch structure whose branches contain different numbers of convolutional layers and can extract features at different resolutions; the features are finally combined through full connection to achieve feature fusion. In the three branches, residual connections link different layers whose feature maps have the same scale, so that features in the network can undergo identity mapping in the forward pass and gradients can be conducted in the backward pass, which improves the performance of the network.
Interpretation of terms:
The softmax classifier model is a common linear classifier that generalizes logistic regression to multi-class classification. It models the class label with a multinomial distribution and can therefore assign a sample to one of several mutually exclusive categories.
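As an illustration of the term, the following minimal Python sketch computes softmax probabilities for two mutually exclusive classes. The label names and the example scores are assumptions chosen for illustration; the patent does not specify the classifier's implementation.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtract the maximum score first for numerical stability.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Assumed label order: index 0 = "correct", index 1 = "wrong".
LABELS = ["correct", "wrong"]

def classify(logits):
    """Return the label with the highest softmax probability, plus all probabilities."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs

# Hypothetical scores for one sample.
label, probs = classify([2.0, 0.5])
```

Because the probabilities sum to 1, a higher probability for one class necessarily lowers the other, which is what makes the categories mutually exclusive.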
The technical scheme of the invention is as follows:
an intelligent detection method for an outdoor unit portrait of an air conditioner based on visual attention and a multi-scale convolutional neural network comprises the following steps:
(1) data preprocessing: manually classifying the portrait samples of the air conditioner outdoor unit, wherein the regions of interest in a portrait sample are the icon and the pipe orifice colors of the connecting pipes; generating correct and wrong labels according to whether an icon is attached, whether the icon matches the outdoor unit model, whether the connecting pipes are present, and whether the pipe orifice colors of the connecting pipes match the outdoor unit model, wherein a sample with the correct label is an image of an air conditioner outdoor unit that includes the icon and the orifices of the two outdoor unit connecting pipes, with the icon and the orifice colors matching the model of the air conditioner outdoor unit; otherwise the sample receives the wrong label;
(2) reading the sample images preprocessed in step (1), inputting them into a visual attention network, learning the regions that need attention and their surrounding structures, namely the icon and the orifices of the two outdoor unit connecting pipes, and generating attention distribution maps; dividing the generated attention distribution maps into a training sample set, a verification sample set and a test sample set;
(3) inputting the attention distribution maps into a multi-scale network for training, and fusing the features of the three convolutional branches through full connection to obtain a deep fusion feature vector;
(4) taking the deep fusion feature vectors of the training sample set as the input of a softmax classifier model and the correct and wrong labels as its output, and training the model formed by the multi-scale network and the softmax classifier model;
(5) inputting the verification sample set into the softmax classifier model to verify classification accuracy, and updating the model parameters of the softmax classifier to obtain a trained softmax classifier model;
(6) inputting the test sample set into the trained softmax classifier model to obtain a correct or wrong classification result for the test sample set.
Preferably, according to the present invention, the visual attention network is formed by stacking a plurality of residual attention modules, each of which includes two branches: a trunk branch and a mask branch;
the trunk branch is a basic residual network structure that extracts features from the image and generates a feature map of the same size as the original image;
the mask branch combines bottom-up and top-down structures: residual modules and down-sampling layers progressively extract high-level features and enlarge the receptive field, with down-sampling done by pooling; the same number of up-sampling layers then enlarge the feature map back to the size of the original image, with up-sampling done by bilinear interpolation; an attention mask is finally generated, so that the mask branch acts as a feature selector;
the feature map output by the trunk branch is multiplied pixel by pixel with the attention mask output by the mask branch, which distributes the attention mask weights over the trunk feature map. The activation function of the mask branch is a sigmoid, so the mask values lie in (0,1), and multiplying by the mask weakens the output response of the feature map; after several residual attention modules are stacked, the values of the final attention distribution map become smaller and smaller, which makes training difficult. Therefore, following the residual network structure, the product is added pixel by pixel to the feature map output by the trunk branch, and the attention distribution map is finally output.
Further preferably, in the step (2), the sample image preprocessed in the step (1) is input into a visual attention network to generate an attention distribution map, specifically:
inputting the sample image x preprocessed in step (1) into the visual attention network; the trunk branch extracts features and outputs a feature map T(x), and the mask branch outputs an attention mask M(x); T(x) learns attention over its features through the corresponding M(x), which acts as a soft weight on T(x); with identity mapping added to the residual attention module, the attention distribution map H(x) output by the visual attention network is given by formula (I):
H(x) = (1 + M(x)) * T(x)    (I)
In formula (I), M(x) takes values in [0,1]; when M(x) is close to 0, H(x) is approximately the feature map T(x). M(x) enhances good features and suppresses noise from the trunk branch.
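Formula (I) can be illustrated with a small element-wise computation. This is a sketch in plain Python under assumed inputs: the trunk values and the pre-sigmoid mask scores below are hypothetical, and the function name `residual_attention` is not from the patent.

```python
import math

def sigmoid(v):
    """Squash a raw score into the (0, 1) range used by the attention mask."""
    return 1.0 / (1.0 + math.exp(-v))

def residual_attention(trunk, mask_scores):
    """Apply H = (1 + M) * T element-wise, with M = sigmoid(mask_scores)."""
    out = []
    for t_row, m_row in zip(trunk, mask_scores):
        out.append([(1.0 + sigmoid(m)) * t for t, m in zip(t_row, m_row)])
    return out

# Hypothetical 2x2 trunk feature map T(x) and raw mask scores.
trunk = [[1.0, 2.0], [3.0, 4.0]]
mask_scores = [[-20.0, 0.0], [0.0, 20.0]]
attended = residual_attention(trunk, mask_scores)
```

Where the mask is near 0 (score -20) the output stays close to the trunk value, and where the mask is near 1 (score 20) the trunk value is roughly doubled, matching the behaviour described for M(x).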
Preferably, according to the invention, the multi-scale network consists of three branch models with different numbers of convolutional layers, which can effectively extract feature abstractions at different levels. The scale is adjusted through the number of convolutional layers: smaller resolutions reveal more local features and higher resolutions reveal more global features, and combining the two effectively improves network performance. In the three branch models, residual connections link different layers whose feature maps have the same scale. In the forward pass, these connections let features in the multi-scale network undergo identity mapping, so that when the output of a shallow part of the network is already optimal, the later layers of the deep network can act as an identity mapping; in the backward pass, they help conduct gradients, so that a deeper model can be trained successfully and network performance improves. The features are then combined through full connection, fusing feature maps of different resolutions in parallel into a one-dimensional vector to achieve feature fusion across levels, and the output is finally obtained through the softmax classifier model.
Further preferably, the three branch models include a first branch model, a second branch model and a third branch model;
after the attention distribution map is input into the multi-scale network, it passes through the 5 convolutional layers of the first branch model, wherein after the 1st convolutional layer the feature map size is reduced to 1/4 of the original image size and the number of feature maps is increased to 4 times that of the original image; after the 2nd convolutional layer the size is reduced to 1/16 and the number is increased to 16 times; after the 3rd convolutional layer the size is reduced to 1/64 and the number is increased to 64 times; after the 4th convolutional layer the size is reduced to 1/256 and the number is increased to 256 times; after the 5th convolutional layer the size is reduced to 1/1024 and the number is increased to 1024 times;
it passes through the 2 convolutional layers of the second branch model, wherein after the 1st convolutional layer the feature map size is reduced to 1/16 of the original image size and the number of feature maps is increased to 16 times that of the original image; after the 2nd convolutional layer the size is reduced to 1/256 and the number is increased to 256 times;
it passes through the 1 convolutional layer of the third branch model, after which the feature map size is reduced to 1/16 of the original image size and the number of feature maps is increased to 16 times that of the original image.
Further preferably, in the multi-scale network, identity mappings are introduced between layers having the same size and the same number of feature maps: residual connections respectively link the 1st convolutional layer of the first branch model with the 1st convolutional layer of the second branch model, the 4th convolutional layer of the first branch model with the 2nd convolutional layer of the second branch model, and the 1st convolutional layer of the second branch model with the convolutional layer of the third branch model.
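The layer sizes stated for the three branches can be tabulated to see which layers share both a scale and a feature-map count, which is the stated precondition for an identity mapping. The sketch below is only bookkeeping over those stated numbers; the per-layer reduction factors are read from the text, while the names `BRANCH_FACTORS`, `layer_shapes` and `matching_layers` are illustrative assumptions.

```python
from fractions import Fraction

# Per-layer reduction factor for each branch, read from the stated sizes:
# branch 1 goes 1/4, 1/16, 1/64, 1/256, 1/1024; branch 2 goes 1/16, 1/256;
# branch 3 goes 1/16.
BRANCH_FACTORS = {
    "branch1": [4, 4, 4, 4, 4],
    "branch2": [16, 16],
    "branch3": [16],
}

def layer_shapes(factors):
    """Return (relative size, feature-map multiplier) after each layer."""
    shapes, scale = [], 1
    for factor in factors:
        scale *= factor
        shapes.append((Fraction(1, scale), scale))
    return shapes

SHAPES = {name: layer_shapes(f) for name, f in BRANCH_FACTORS.items()}

def matching_layers(a, b):
    """List 1-based layer index pairs of branches a and b with identical shapes."""
    return [(i + 1, j + 1)
            for i, shape_a in enumerate(SHAPES[a])
            for j, shape_b in enumerate(SHAPES[b])
            if shape_a == shape_b]
```

Under the sizes as stated, `matching_layers("branch1", "branch2")` returns `[(2, 1), (4, 2)]` and `matching_layers("branch2", "branch3")` returns `[(1, 1)]`.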
Further preferably, the features output by the three branch models are combined through full connection, fusing feature maps of different resolutions in parallel into a one-dimensional vector and achieving feature fusion across levels.
The invention has the beneficial effects that:
1. The invention uses the visual attention model to generate the attention distribution map and can extract the features of the attention regions and their surrounding structures in the image.
2. The invention uses the multi-scale convolutional neural network to fully mine the features of the attention distribution map at multiple resolutions; features are combined through full connection, and feature maps of different resolutions are fused in parallel into a one-dimensional vector, achieving feature fusion across levels.
3. In the three branches of the multi-scale convolutional neural network, residual connections link different layers whose feature maps have the same scale. In the forward pass, features in the network undergo identity mapping, so that when the output of a shallow part is already optimal, the later layers of the deep network can act as an identity mapping. In the backward pass, the connections conduct the gradient, so that a deeper model can be trained successfully and the performance of the network is improved.
Drawings
FIG. 1 is a flow chart diagram of an intelligent detection method for an outdoor unit portrait of an air conditioner based on visual attention and a multi-scale convolutional neural network.
FIG. 2 is a schematic diagram of the structure of the visual attention model of the present invention.
Fig. 3 is a schematic structural diagram of the multi-scale network of the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples in the description, but is not limited to them.
Example 1
An intelligent detection method for an outdoor unit portrait of an air conditioner based on visual attention and a multi-scale convolutional neural network comprises the following steps:
(1) data preprocessing: manually classifying the portrait samples of the air conditioner outdoor unit, wherein the regions of interest in a portrait sample are the icon and the pipe orifice colors of the connecting pipes; generating correct and wrong labels according to whether an icon is attached, whether the icon matches the outdoor unit model, whether the connecting pipes are present, and whether the pipe orifice colors of the connecting pipes match the outdoor unit model, wherein a sample with the correct label is an image of an air conditioner outdoor unit that includes the icon and the orifices of the two outdoor unit connecting pipes, with the icon and the orifice colors matching the model of the air conditioner outdoor unit; otherwise the sample receives the wrong label;
(2) reading the sample images preprocessed in step (1), inputting them into a visual attention network, learning the regions that need attention and their surrounding structures, namely the icon and the orifices of the two outdoor unit connecting pipes, and generating attention distribution maps; dividing the generated attention distribution maps into a training sample set, a verification sample set and a test sample set;
(3) inputting the attention distribution maps into a multi-scale network for training, and fusing the features of the three convolutional branches through full connection to obtain a deep fusion feature vector;
(4) taking the deep fusion feature vectors of the training sample set as the input of a softmax classifier model and the correct and wrong labels as its output, and training the model formed by the multi-scale network and the softmax classifier model;
(5) inputting the verification sample set into the softmax classifier model to verify classification accuracy, and updating the model parameters of the softmax classifier to obtain a trained softmax classifier model;
(6) inputting the test sample set into the trained softmax classifier model to obtain the accuracy of the correct and wrong classifications on the test sample set.
Example 2
The intelligent detection method for the portrait of the outdoor unit of the air conditioner based on the visual attention and the multi-scale convolutional neural network is characterized in that:
As shown in FIG. 2, the visual attention network is formed by stacking a plurality of residual attention modules, each of which includes two branches: a trunk branch and a mask branch. The trunk branch is a basic residual network structure that extracts features from the image and generates a feature map of the same size as the original image. The mask branch combines bottom-up and top-down structures: residual modules and down-sampling layers progressively extract high-level features and enlarge the receptive field, with down-sampling done by pooling; the same number of up-sampling layers then enlarge the feature map back to the size of the original image, with up-sampling done by bilinear interpolation; an attention mask is finally generated, so that the mask branch acts as a feature selector. The feature map output by the trunk branch is multiplied pixel by pixel with the attention mask output by the mask branch, which distributes the attention mask weights over the trunk feature map. The activation function of the mask branch is a sigmoid, so the mask values lie in (0,1), and multiplying by the mask weakens the output response of the feature map; after several residual attention modules are stacked, the values of the final attention distribution map become smaller and smaller, which makes training difficult. Therefore, following the residual network structure, the product is added pixel by pixel to the feature map output by the trunk branch, and the attention distribution map is finally output.
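The shrinking response described in this paragraph can be seen with a one-value toy calculation. The sketch below is a deliberate simplification: it assumes each stacked module leaves the trunk value unchanged and applies one constant mask value, which is not how a trained network behaves; the function names are hypothetical.

```python
def stack_plain(value, mask, depth):
    """Naive stacking: each module outputs M * T."""
    for _ in range(depth):
        value = mask * value
    return value

def stack_residual(value, mask, depth):
    """Residual stacking: each module outputs (1 + M) * T."""
    for _ in range(depth):
        value = (1.0 + mask) * value
    return value

# Assumed constant mask value 0.5 and 10 stacked modules.
plain = stack_plain(1.0, 0.5, 10)        # equals 0.5 ** 10
residual = stack_residual(1.0, 0.5, 10)  # equals 1.5 ** 10
```

With a mask value of 0.5, ten plain multiplications reduce a unit response to below 0.001, whereas the residual form never scales the response below its trunk value because the factor (1 + M) is at least 1.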
The sample image preprocessed in step (1) is input into the visual attention network to generate the attention distribution map, specifically:
inputting the sample image x preprocessed in step (1) into the visual attention network; the trunk branch extracts features and outputs a feature map T(x), and the mask branch outputs an attention mask M(x); T(x) learns attention over its features through the corresponding M(x), which acts as a soft weight on T(x); with identity mapping added to the residual attention module, the attention distribution map H(x) output by the visual attention network is given by formula (I):
H(x) = (1 + M(x)) * T(x)    (I)
In formula (I), M(x) takes values in [0,1]; when M(x) is close to 0, H(x) is approximately the feature map T(x). M(x) enhances good features and suppresses noise from the trunk branch.
The generated attention distribution maps H(x) are divided into a training set S1, a verification set S2 and a test set S3.
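The division into S1, S2 and S3 could be done as in the following sketch. The patent does not state the split proportions or how samples are shuffled, so the 70/15/15 percentages, the fixed seed and the function name `split_samples` are all assumptions made for illustration.

```python
import random

def split_samples(samples, percents=(70, 15, 15), seed=0):
    """Shuffle samples and split them into S1 (train), S2 (verification), S3 (test)."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * percents[0] // 100
    n_val = len(shuffled) * percents[1] // 100
    s1 = shuffled[:n_train]
    s2 = shuffled[n_train:n_train + n_val]
    s3 = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return s1, s2, s3

# Hypothetical usage with 100 placeholder sample identifiers.
s1, s2, s3 = split_samples(range(100))
```

Every sample lands in exactly one of the three sets, so the verification and test sets stay disjoint from the training data.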
Example 3
The intelligent detection method for the portrait of the outdoor unit of the air conditioner based on the visual attention and the multi-scale convolutional neural network is characterized in that:
The multi-scale network consists of three branch models with different numbers of convolutional layers, which can effectively extract feature abstractions at different levels. The scale is adjusted through the number of convolutional layers: smaller resolutions reveal more local features and higher resolutions reveal more global features, and combining the two effectively improves network performance. In the three branch models, residual connections link different layers whose feature maps have the same scale. In the forward pass, these connections let features in the multi-scale network undergo identity mapping, so that when the output of a shallow part of the network is already optimal, the later layers of the deep network can act as an identity mapping; in the backward pass, they help conduct gradients, so that a deeper model can be trained successfully and network performance improves. The features are then combined through full connection, fusing feature maps of different resolutions in parallel into a one-dimensional vector to achieve feature fusion across levels, and the output is finally obtained through the softmax classifier model.
As shown in FIG. 3, the three branch models include a first branch model, a second branch model and a third branch model. After the attention distribution map is input into the multi-scale network, it passes through the 5 convolutional layers of the first branch model, wherein after the 1st convolutional layer the feature map size is reduced to 1/4 of the original image size and the number of feature maps is increased to 4 times that of the original image; after the 2nd convolutional layer the size is reduced to 1/16 and the number is increased to 16 times; after the 3rd convolutional layer the size is reduced to 1/64 and the number is increased to 64 times; after the 4th convolutional layer the size is reduced to 1/256 and the number is increased to 256 times; after the 5th convolutional layer the size is reduced to 1/1024 and the number is increased to 1024 times;
it passes through the 2 convolutional layers of the second branch model, wherein after the 1st convolutional layer the feature map size is reduced to 1/16 of the original image size and the number of feature maps is increased to 16 times that of the original image; after the 2nd convolutional layer the size is reduced to 1/256 and the number is increased to 256 times;
it passes through the 1 convolutional layer of the third branch model, after which the feature map size is reduced to 1/16 of the original image size and the number of feature maps is increased to 16 times that of the original image.
In the multi-scale network, identity mappings are introduced between layers having the same size and the same number of feature maps: residual connections respectively link the 1st convolutional layer of the first branch model with the 1st convolutional layer of the second branch model, the 4th convolutional layer of the first branch model with the 2nd convolutional layer of the second branch model, and the 1st convolutional layer of the second branch model with the convolutional layer of the third branch model.
The features output by the three branch models are combined through full connection, fusing feature maps of different resolutions in parallel into a one-dimensional vector and achieving feature fusion across levels.
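The fusion step can be pictured as flattening each branch's feature maps, concatenating them into one vector, and feeding that vector to a fully connected layer. The sketch below uses tiny hypothetical feature maps and hand-picked weights; the helper names `flatten`, `fuse` and `dense` are assumptions, and a real network would learn the weights.

```python
def flatten(feature_maps):
    """Flatten a list of 2-D feature maps into one flat list of values."""
    return [v for fmap in feature_maps for row in fmap for v in row]

def fuse(branch_outputs):
    """Concatenate the flattened outputs of all branches into one vector."""
    fused = []
    for maps in branch_outputs:
        fused.extend(flatten(maps))
    return fused

def dense(vector, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

# Hypothetical branch outputs with different resolutions.
b1 = [[[1.0]]]                    # one 1x1 map
b2 = [[[2.0, 3.0], [4.0, 5.0]]]   # one 2x2 map
b3 = [[[6.0, 7.0], [8.0, 9.0]]]   # one 2x2 map
fused = fuse([b1, b2, b3])

# Two output scores, one per class; the weights are illustrative only.
logits = dense(fused, [[1.0] * 9, [0.0] * 9], [0.0, 1.0])
```

Flattening discards the spatial layout, which is what allows maps of different resolutions to sit side by side in a single one-dimensional vector.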
The parameter settings for the multi-scale convolutional neural network are shown in table 1.
TABLE 1
Parameter        Parameter value
Epoch            10
Batch size       64
Learning rate    0.0003
Optimizer        Adam
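The settings in Table 1 could be captured in a small configuration object, sketched below. Only the four values come from the table; the class name `TrainConfig`, the helper `steps_per_epoch` and the sample count of 1000 are hypothetical, since the patent does not report the dataset size.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainConfig:
    """Training hyperparameters taken from Table 1."""
    epochs: int = 10
    batch_size: int = 64
    learning_rate: float = 0.0003
    optimizer: str = "Adam"

def steps_per_epoch(num_samples, config):
    """Number of mini-batches per epoch, counting a final partial batch."""
    return -(-num_samples // config.batch_size)  # ceiling division

config = TrainConfig()
# Assumed dataset size, for illustration only.
total_steps = config.epochs * steps_per_epoch(1000, config)
```

Freezing the dataclass prevents the hyperparameters from being changed by accident in the middle of a run.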
Table 2 shows comparative data of the training results and the classification results of the test set for the multi-scale network with and without the visual attention network.
TABLE 2
Multi-scale convolutional neural network    Verification set accuracy (%)    Test set accuracy (%)
With visual attention                       87.00                            53.00
Without visual attention                    15.95                            18.75

Claims (7)

1. An intelligent detection method for an outdoor unit portrait of an air conditioner based on visual attention and a multi-scale convolutional neural network is characterized by comprising the following steps:
(1) data preprocessing: manually classifying the portrait samples of the air conditioner outdoor unit to generate correct and wrong labels, wherein a sample with the correct label is an image of an air conditioner outdoor unit that includes the icon and the orifices of the two outdoor unit connecting pipes, with the icon and the orifice colors of the outdoor unit connecting pipes matching the model of the air conditioner outdoor unit; otherwise the sample receives the wrong label;
(2) reading the sample images preprocessed in step (1), inputting them into a visual attention network, learning the regions that need attention and their surrounding structures, namely the icon and the orifices of the two outdoor unit connecting pipes, and generating attention distribution maps; dividing the generated attention distribution maps into a training sample set, a verification sample set and a test sample set;
(3) inputting the attention distribution maps into a multi-scale network for training, and fusing the features of the three convolutional branches through full connection to obtain a deep fusion feature vector;
(4) taking the deep fusion feature vectors of the training sample set as the input of a softmax classifier model and the correct and wrong labels as its output, and training the model formed by the multi-scale network and the softmax classifier model;
(5) inputting the verification sample set into the softmax classifier model to verify classification accuracy, and updating the model parameters of the softmax classifier to obtain a trained softmax classifier model;
(6) inputting the test sample set into the trained softmax classifier model to obtain a correct or wrong classification result for the test sample set.
2. The method as claimed in claim 1, wherein the visual attention network is formed by stacking a plurality of residual attention modules, each residual attention module comprising two branches: a trunk branch and a mask branch;
the trunk branch is a basic residual network structure that performs feature extraction on the image and generates a feature map with the same size as the original image;
the mask branch combines a bottom-up structure with a top-down structure: high-level features are gradually extracted and the receptive field is enlarged through residual modules and down-sampling layers, the down-sampling being performed by pooling; the feature map is then enlarged back to the size of the original image through the same number of up-sampling layers as down-sampling layers, the up-sampling being performed by bilinear interpolation; an attention mask is finally generated, so that the mask branch acts as a feature selector;
the feature map output by the trunk branch is multiplied pixel by pixel with the attention mask output by the mask branch, which assigns the weights of the attention mask to the feature map of the trunk branch; the product is then added pixel by pixel to the feature map output by the trunk branch, and an attention distribution map is finally output.
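The data flow of one residual attention module can be sketched as follows. This is a simplified illustration under several assumptions: a single-channel map is used, the residual modules inside the mask branch are omitted, the sigmoid that squashes the mask into (0, 1) is an assumption borrowed from common practice, and the input itself stands in for the trunk output. All function names are hypothetical.

```python
import numpy as np

def max_pool2(x):
    """2x2 max pooling on an (H, W) map; H and W must be even."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def bilinear_resize(x, out_h, out_w):
    """Bilinear interpolation of an (H, W) map, with corner pixels aligned."""
    in_h, in_w = x.shape
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = x[y0][:, x0] * (1 - wx) + x[y0][:, x1] * wx
    bottom = x[y1][:, x0] * (1 - wx) + x[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def mask_branch(x, n_down=2):
    """Pool n_down times, upsample n_down times back to the input size, squash to (0, 1)."""
    y = x
    for _ in range(n_down):
        y = max_pool2(y)  # residual modules between poolings are omitted here
    for _ in range(n_down):
        y = bilinear_resize(y, y.shape[0] * 2, y.shape[1] * 2)
    return 1.0 / (1.0 + np.exp(-y))  # sigmoid: an assumption of this sketch

def residual_attention(trunk, mask):
    """Pixel-wise product of trunk and mask, added back to the trunk output."""
    return trunk * mask + trunk

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 8))
trunk = feature_map               # stand-in for the trunk branch output
mask = mask_branch(feature_map)   # same size as the input
attention_map = residual_attention(trunk, mask)
```

Using as many up-sampling steps as pooling steps is what restores the mask to the trunk's size, so the pixel-wise product is well defined.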
3. The intelligent detection method for the portrait of the outdoor unit of the air conditioner based on the visual attention and the multi-scale convolutional neural network as claimed in claim 1, wherein in the step (2), the sample image preprocessed in the step (1) is input into the visual attention network to generate an attention distribution map, specifically:
the sample image x preprocessed in the step (1) is input into the visual attention network; the trunk branch extracts features and outputs a feature map T(x), and the mask branch outputs an attention mask M(x); T(x) learns attention for its features through the corresponding M(x), which is equivalent to a soft weight on T(x); with identity mapping added to the residual attention module, the attention distribution map H(x) output by the visual attention network is given by formula (I):
H(x) = (1 + M(x)) * T(x) (I)
in formula (I), M(x) takes values in the range [0, 1], and when M(x) approaches 0, H(x) approaches the feature map T(x).
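As a brief worked reading of formula (I), assuming the product is taken element by element as described in claim 2 (the symbol \odot is introduced here only to make that explicit), the formula expands into the feature map plus a mask-weighted copy of itself, and the behavior at both ends of the range of M(x) follows directly:

```latex
\begin{aligned}
H(x) &= (1 + M(x)) \odot T(x) = T(x) + M(x) \odot T(x), \qquad M(x) \in [0, 1], \\
M(x) \to 0 &\;\Rightarrow\; H(x) \to T(x), \\
M(x) \to 1 &\;\Rightarrow\; H(x) \to 2\,T(x), \\
|T(x)| &\le |H(x)| \le 2\,|T(x)| \quad \text{(element-wise)}.
\end{aligned}
```

The mask can therefore amplify a feature by at most a factor of two and can never suppress it below the trunk output, which is the effect of the added identity mapping.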
4. The intelligent detection method for the portrait of the outdoor unit of the air conditioner based on the visual attention and the multi-scale convolutional neural network as claimed in any one of claims 1 to 3, wherein the multi-scale network comprises three branch models with different numbers of convolutional layers, which extract feature abstractions at different levels, the scale being adjusted through the number of convolutional layers; in the three branch models, different layers whose feature maps have the same scale are connected through residual connections; the residual connections between different layers help the features in the multi-scale network to undergo identity mapping in the forward pass, so that when the output of a shallow network is already optimal, the subsequent layers of a deeper network can act as an identity mapping, and they help to propagate gradients in the backward pass, so that a deeper model can be trained successfully and the performance of the network is improved; the features are then combined through full connection, feature maps with different resolutions are fused in parallel into a one-dimensional vector to realize feature fusion of different levels, and the output is finally obtained through a softmax classifier model.
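The identity-mapping argument can be checked on a toy residual layer. The sketch below is only an illustration: the layer form y = x + relu(Wx) and all names are assumptions, not the layers of the application. With the transform's weights at zero, the layer returns its input unchanged and its Jacobian is the identity matrix, so a gradient passes through the shortcut without attenuation.

```python
import numpy as np

def residual_layer(x, w):
    """y = x + relu(w @ x): the shortcut adds the input to the layer's transform."""
    return x + np.maximum(w @ x, 0.0)

def numeric_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x (rows: outputs, columns: inputs)."""
    cols = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        cols.append((f(x + step) - f(x - step)) / (2 * eps))
    return np.stack(cols, axis=1)

x = np.array([0.5, -1.0, 2.0])
w_zero = np.zeros((3, 3))  # a transform that contributes nothing

y = residual_layer(x, w_zero)                                    # forward pass: y equals x
jac = numeric_jacobian(lambda v: residual_layer(v, w_zero), x)   # backward pass: identity
```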
5. The intelligent detection method for the portrait of the outdoor unit of the air conditioner based on the visual attention and the multi-scale convolutional neural network as claimed in claim 4, wherein the three branch models comprise a first branch model, a second branch model and a third branch model;
after the attention distribution map is input into the multi-scale network, it passes through the 5 convolutional layers of the first branch model, wherein after the 1st convolutional layer the size of the feature map is reduced to 1/4 of the size of the original image and the number of feature maps is increased to 4 times that of the original image; after the 2nd convolutional layer the size of the feature map is reduced to 1/16 of the size of the original image and the number of feature maps is increased to 16 times that of the original image; after the 3rd convolutional layer the size of the feature map is reduced to 1/64 of the size of the original image and the number of feature maps is increased to 64 times that of the original image; after the 4th convolutional layer the size of the feature map is reduced to 1/256 of the size of the original image and the number of feature maps is increased to 256 times that of the original image; and after the 5th convolutional layer the size of the feature map is reduced to 1/1024 of the size of the original image and the number of feature maps is increased to 1024 times that of the original image;
it passes through the 2 convolutional layers of the second branch model, wherein after the 1st convolutional layer the size of the feature map is reduced to 1/16 of the size of the original image and the number of feature maps is increased to 16 times that of the original image, and after the 2nd convolutional layer the size of the feature map is reduced to 1/256 of the size of the original image and the number of feature maps is increased to 256 times that of the original image;
it passes through the 1 convolutional layer of the third branch model, after which the size of the feature map is reduced to 1/16 of the size of the original image and the number of feature maps is increased to 16 times that of the original image.
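The size and count figures above follow one pattern per branch: every layer of the first branch applies a factor of 4, and every layer of the second and third branches applies a factor of 16. The short sketch below tabulates that arithmetic; the `BRANCHES` dictionary and the `schedule` helper are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction

# (number of layers, per-layer factor) for each branch, read off the claim:
# each layer divides the feature-map size by `factor` and multiplies the count by `factor`.
BRANCHES = {"first": (5, 4), "second": (2, 16), "third": (1, 16)}

def schedule(n_layers, factor):
    """Return [(relative_size, relative_count), ...], one entry per convolutional layer."""
    return [(Fraction(1, factor ** k), factor ** k) for k in range(1, n_layers + 1)]

tables = {name: schedule(*spec) for name, spec in BRANCHES.items()}

for name, rows in tables.items():
    for layer, (size, count) in enumerate(rows, start=1):
        print(f"{name} branch, layer {layer}: size x{size}, feature maps x{count}")
```

In every layer the product of relative size and relative count equals 1, so under this reading the total number of feature values stays constant while resolution is traded for channels.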
6. The method as claimed in claim 5, wherein identity mapping is introduced between layers of the multi-scale network that have the same feature map size and the same number of feature maps, and the 1st convolutional layer of the first branch model and the 1st convolutional layer of the second branch model, the 4th convolutional layer of the first branch model and the 2nd convolutional layer of the second branch model, and the 1st convolutional layer of the second branch model and the convolutional layer of the third branch model are respectively connected by residual connections.
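Which layers can be joined by a plain identity shortcut depends on their shapes agreeing. The sketch below, which takes the sizes and counts of claim 5 at face value and uses hypothetical helper names, lists every cross-branch pair of layers with identical relative size and feature-map count.

```python
from fractions import Fraction
from itertools import combinations

def schedule(n_layers, factor):
    """[(relative_size, relative_count), ...] for a branch with a fixed per-layer factor."""
    return [(Fraction(1, factor ** k), factor ** k) for k in range(1, n_layers + 1)]

# Shape of every layer, keyed by (branch name, 1-based layer index).
layers = {}
for branch, (n_layers, factor) in {"first": (5, 4), "second": (2, 16), "third": (1, 16)}.items():
    for idx, shape in enumerate(schedule(n_layers, factor), start=1):
        layers[(branch, idx)] = shape

# Cross-branch pairs whose feature maps have the same size and the same count.
matches = [(a, b) for a, b in combinations(layers, 2)
           if a[0] != b[0] and layers[a] == layers[b]]

for a, b in matches:
    print(a, "<->", b, "shape", layers[a])
```

Under this reading, the pairs (first, 4) with (second, 2) and (second, 1) with (third, 1) match as the claim states. The layer of the first branch whose shape equals that of the 1st layer of the second branch comes out as its 2nd layer (1/16, 16 times), so a shortcut from the 1st layer of the first branch would need some form of shape adaptation that the claim does not describe.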
7. The intelligent detection method for the portrait of the outdoor unit of the air conditioner based on the visual attention and the multi-scale convolutional neural network as claimed in claim 6, wherein the features output by the three branch models are combined through full connection, so that feature maps with different resolutions are fused in parallel into a one-dimensional vector, realizing feature fusion of different levels.
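A minimal sketch of this fusion step follows. It assumes a hypothetical 32 x 32 single-channel input, so that the branch outputs have the shapes implied by claim 5 (1024 maps of 1 x 1, 256 maps of 2 x 2, 16 maps of 8 x 8); the random weights and the two-class output layer are placeholders rather than trained parameters.

```python
import numpy as np

def fuse(feature_maps):
    """Flatten each branch output and concatenate everything into one 1-D vector."""
    return np.concatenate([f.ravel() for f in feature_maps])

def softmax(z):
    e = np.exp(z - z.max())  # subtract the maximum for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
# Hypothetical branch outputs as (channels, height, width) for a 32 x 32 input.
branch_outputs = [
    rng.normal(size=(1024, 1, 1)),  # first branch, after 5 layers
    rng.normal(size=(256, 2, 2)),   # second branch, after 2 layers
    rng.normal(size=(16, 8, 8)),    # third branch, after 1 layer
]

fused = fuse(branch_outputs)                           # one-dimensional fused feature vector
w = rng.normal(scale=0.01, size=(fused.size, 2))       # placeholder fully connected weights
probs = softmax(fused @ w)                             # scores for "correct" / "wrong"
```

Flattening before concatenation is what lets maps of different resolutions sit side by side in a single vector.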
CN202011545170.XA 2020-12-24 2020-12-24 Intelligent detection method for portrait of air conditioner external unit based on visual attention and multi-scale convolutional neural network Withdrawn CN112668584A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011545170.XA CN112668584A (en) 2020-12-24 2020-12-24 Intelligent detection method for portrait of air conditioner external unit based on visual attention and multi-scale convolutional neural network

Publications (1)

Publication Number Publication Date
CN112668584A (en) 2021-04-16

Family

ID=75409569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011545170.XA Withdrawn CN112668584A (en) 2020-12-24 2020-12-24 Intelligent detection method for portrait of air conditioner external unit based on visual attention and multi-scale convolutional neural network

Country Status (1)

Country Link
CN (1) CN112668584A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109190623A (en) * 2018-09-15 2019-01-11 闽江学院 A method of identification projector brand and model
CN109800698A (en) * 2019-01-11 2019-05-24 北京邮电大学 Icon detection method based on depth network
CN110414377A (en) * 2019-07-09 2019-11-05 武汉科技大学 A kind of remote sensing images scene classification method based on scale attention network
CN110532859A (en) * 2019-07-18 2019-12-03 西安电子科技大学 Remote Sensing Target detection method based on depth evolution beta pruning convolution net
CN111539469A (en) * 2020-04-20 2020-08-14 东南大学 Weak supervision fine-grained image identification method based on vision self-attention mechanism
CN111666943A (en) * 2020-04-20 2020-09-15 国网浙江省电力有限公司 Matching detection method and system for bolt and nut of power transmission line based on image recognition

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FEI WANG et al.: "Residual Attention Network for Image Classification", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》, 9 November 2017 (2017-11-09), pages 6450 - 6458 *
ZHONG-XU HU et al.: "Data-Driven Fault Diagnosis Method Based on Compressed Sensing and Improved Multiscale Network", 《IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS》, 30 April 2020 (2020-04-30), pages 3216 - 3225, XP011760313, DOI: 10.1109/TIE.2019.2912763 *
边小勇 (Bian Xiaoyong) et al.: "Remote sensing image scene classification based on scale attention network", 《Journal of Computer Applications》, 10 March 2020 (2020-03-10), pages 872 - 877 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113269077A (en) * 2021-05-19 2021-08-17 青岛科技大学 Underwater acoustic communication signal modulation mode identification method based on improved gating network and residual error network
CN113222044A (en) * 2021-05-25 2021-08-06 合肥工业大学 Cervical fluid-based cell classification method based on ternary attention and scale correlation fusion
CN113222044B (en) * 2021-05-25 2022-03-08 合肥工业大学 Cervical fluid-based cell classification method based on ternary attention and scale correlation fusion
CN113281029A (en) * 2021-06-09 2021-08-20 重庆大学 Rotating machinery fault diagnosis method and system based on multi-scale network structure
CN113281029B (en) * 2021-06-09 2022-03-15 重庆大学 Rotating machinery fault diagnosis method and system based on multi-scale network structure
WO2023274191A1 (en) * 2021-06-30 2023-01-05 华为技术有限公司 Feature map processing method and related device
CN113792757A (en) * 2021-08-18 2021-12-14 吉林大学 Oscillogram classification method based on multi-scale attention residual error network
CN113792757B (en) * 2021-08-18 2023-12-08 吉林大学 Waveform diagram classification method based on multi-scale attention residual error network
CN114724021A (en) * 2022-05-25 2022-07-08 北京闪马智建科技有限公司 Data identification method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210416
