CN116363538B

CN116363538B - Bridge detection method and system based on unmanned aerial vehicle

Info

Publication number: CN116363538B
Application number: CN202310642118.3A
Authority: CN
Inventors: 肖佑铭; 徐锐; 许维平; 喻琳; 张南; 张君翼
Original assignee: Guizhou Jiaotou High Tech Co ltd
Current assignee: Guizhou Jiaotou High Tech Co ltd
Priority date: 2023-06-01
Filing date: 2023-06-01
Publication date: 2023-08-01
Anticipated expiration: 2043-06-01
Also published as: CN116363538A

Abstract

The application provides a bridge detection method and system based on an unmanned aerial vehicle. A bridge aerial image set to be identified is constructed, the set comprising a plurality of first image blocks. For each first image block, a first description knowledge set corresponding to the first image block is determined according to the first image block and a preset first image data set, the first description knowledge set comprising first texture description knowledge, first color description knowledge and first image block description knowledge. Knowledge interaction is performed on the plurality of first description knowledge to determine first interaction description knowledge corresponding to the first image block. First anomaly analysis description knowledge corresponding to the first image block is determined according to the first interaction description knowledge, yielding a plurality of first anomaly analysis description knowledge corresponding to the plurality of first image blocks. Anomaly recognition is then performed through a target bridge anomaly recognition network to determine an anomaly recognition result. Because anomaly detection draws on the color, texture and image block aspects of the bridge aerial image, the detection accuracy is high.

Description

Bridge detection method and system based on unmanned aerial vehicle

Technical Field

The application relates to the technical fields of data processing and machine learning, and in particular relates to a bridge detection method and system based on an unmanned aerial vehicle.

Background

By carrying different equipment, a civil unmanned aerial vehicle can perform tasks in many fields that are difficult for people to perform, such as power inspection, agricultural insurance, environmental protection, film and television shooting, express delivery and post-disaster rescue. In the field of bridge detection, an unmanned aerial vehicle equipped with a high-definition digital video camera, a camera and a GPS positioning system positions itself and cruises autonomously along a bridge and transmits aerial bridge images in real time; computer equipment in communication connection with the unmanned aerial vehicle then analyzes and recognizes the images, completing the anomaly monitoring of the bridge. An unmanned aerial vehicle can photograph structural blind spots that are difficult for people to reach, and it inspects far faster than manual work. Detection efficiency and accuracy, however, also depend on the subsequent image recognition. In the prior art, aerial bridge images are recognized in a single way, and both precision and efficiency leave considerable room for improvement.

Disclosure of Invention

In view of this, embodiments of the application provide at least a bridge detection method based on an unmanned aerial vehicle.

The technical scheme of the embodiment of the application is realized as follows:

In one aspect, an embodiment of the present application provides a bridge detection method based on an unmanned aerial vehicle, which is applied to a computer device in communication connection with the unmanned aerial vehicle, and the method includes:

acquiring a plurality of bridge aerial images sent by at least one unmanned aerial vehicle, and constructing a bridge aerial image set to be identified, wherein the bridge aerial image set to be identified comprises a plurality of first image blocks;

for each first image block in the plurality of first image blocks, determining a first description knowledge set corresponding to the first image block according to the first image block and a preset first image data set, wherein the first description knowledge set comprises a plurality of first description knowledge, namely first texture description knowledge, first color description knowledge and first image block description knowledge;

fusing the first texture description knowledge, the first color description knowledge and the first image block description knowledge corresponding to the first image block on an intermediate layer dimension to obtain first fused knowledge; acquiring a target dimension corresponding to the target bridge anomaly identification network;

carrying out description knowledge dimension optimization on the first fusion knowledge through the target dimension to obtain first interaction description knowledge corresponding to a first image block, wherein the first intermediate layer dimension corresponding to the first interaction description knowledge is consistent with the second intermediate layer dimension corresponding to the target dimension;

determining first anomaly analysis description knowledge corresponding to the first image block according to the first interaction description knowledge, to obtain a plurality of first anomaly analysis description knowledge corresponding to the plurality of first image blocks; wherein determining the first anomaly analysis description knowledge corresponding to the first image block according to the first interaction description knowledge comprises: determining first image distribution description knowledge and first region segmentation description knowledge corresponding to each first image block according to the bridge aerial image set to be identified, and integrating the first interaction description knowledge, the first image distribution description knowledge corresponding to the first image block and the first region segmentation description knowledge corresponding to the first image block to obtain the first anomaly analysis description knowledge corresponding to the first image block; the first image distribution description knowledge is a feature vector representing the distribution position of the first image block in the bridge aerial image set to be identified, and the first region segmentation description knowledge is a feature vector representing the image region segmentation of the bridge aerial image set to be identified;

and carrying out anomaly recognition on the plurality of first anomaly analysis description knowledge through a target bridge anomaly recognition network so as to determine an anomaly recognition result corresponding to the bridge aerial image set to be recognized.

In some embodiments, the determining, according to the first image block and a preset first image data set, a first description knowledge set corresponding to the first image block includes:

determining first image feature data corresponding to the first image block according to the first image block and a preset first image data set, wherein the first image feature data comprises a plurality of first image feature data:

first texture image feature data, first color image feature data, and first image block image feature data;

and determining a first description knowledge set corresponding to the first image block through the first image characteristic data corresponding to the first image block.

In some embodiments, the determining, by the first image feature data corresponding to the first image block, the first descriptive knowledge set corresponding to the first image block includes:

and determining corresponding first description knowledge for each piece of first image feature data in the first image feature data corresponding to the first image block according to the first image feature data, the target feature coding network and the target first classification mapping network to obtain a first description knowledge set corresponding to the first image block.

In some embodiments, the determining the corresponding first descriptive knowledge from the first image feature data, the target feature encoding network, and the target first classification mapping network includes:

for the first image feature data, determining whether the feature capacity of the first image feature data meets a preset feature capacity;

if the feature capacity of the first image feature data meets the preset feature capacity, determining first description knowledge corresponding to the first image feature data according to the first image feature data, a target feature coding network and a target first classification mapping network;

and if the feature capacity of the first image feature data does not meet the preset feature capacity, supplementing the first image feature data based on preset supplemental image features until its feature capacity meets the preset feature capacity, and then determining first description knowledge corresponding to the first image feature data according to the supplemented first image feature data, the target feature coding network and the target first classification mapping network.

In some embodiments, the method further comprises:

acquiring a bridge template image set, wherein the bridge template image set comprises a plurality of bridge template image blocks;

determining a plurality of second image feature data corresponding to the bridge template image set, wherein the second image feature data comprises texture image feature data, color image feature data and image block image feature data;

wherein the texture image feature data comprises second texture image feature data for each of the bridge template image blocks, the color image feature data comprises second color image feature data for each of the bridge template image blocks and the image block image feature data comprises second image block image feature data for each of the bridge template image blocks;

performing a shielding operation on each image feature data in the plurality of second image feature data to obtain a plurality of shielding sets, wherein each image feature data corresponds to at least one shielding set, and each shielding set comprises a plurality of temporary image feature data; the shielding operation masks part of the image feature data, the resulting shielding set is the feature sequence obtained after masking, and the image feature data in the shielding set are the temporary image feature data of that shielding set;

for each shielding set in the plurality of shielding sets, inputting the shielding set into an original neural network to obtain inferred image feature data corresponding to shielding distribution information in the shielding set, wherein the original neural network comprises an original description knowledge extraction network, an original knowledge interaction network, an original anomaly carrier construction network and an original bridge anomaly recognition network; the original description knowledge extraction network is used for determining a second description knowledge set corresponding to each temporary image feature data in the shielding set, the second description knowledge set comprising a plurality of second description knowledge, namely second texture description knowledge, second color description knowledge and second image block description knowledge; the original knowledge interaction network is used for determining second interaction description knowledge corresponding to each temporary image feature data in the shielding set according to the second description knowledge set; the original anomaly carrier construction network is used for determining input data of the original bridge anomaly recognition network according to the second interaction description knowledge; and the original bridge anomaly recognition network is used for determining the inferred image feature data corresponding to the shielding distribution information in the shielding set according to the input data;

and debugging the original neural network according to the inferred image feature data corresponding to the shielding distribution information in each shielding set of the plurality of shielding sets and the second image feature data corresponding to that shielding distribution information, to obtain a target neural network, wherein the target neural network comprises a target description knowledge extraction network, a target knowledge interaction network, a target anomaly carrier construction network and a target bridge anomaly recognition network.

In some embodiments, the performing a masking operation on each piece of image feature data to obtain a masking set corresponding to the piece of image feature data includes:

acquiring a preset shielding duty ratio;

acquiring the number of a plurality of second image feature data in the image feature data;

determining the target number of second image characteristic data to be shielded in the image characteristic data according to the preset shielding duty ratio and the number of the plurality of second image characteristic data;

performing shielding operation on the second image characteristic data of the target number in the image characteristic data, wherein the rest second image characteristic data except the second image characteristic data of the target number are unchanged, so as to obtain a shielding set corresponding to the image characteristic data;

The debugging of the original neural network according to the inferred image feature data corresponding to the shielding distribution information in each shielding set in the plurality of shielding sets and the second image feature data corresponding to the shielding distribution information in each shielding set in the plurality of shielding sets includes:

determining a texture inference error value according to inferred texture image feature data corresponding to shielding distribution information in the texture image feature data and second texture image feature data corresponding to the shielding distribution information;

determining a color inference error value according to inferred color image feature data corresponding to shielding distribution information in the color image feature data and second color image feature data corresponding to the shielding distribution information;

determining an image block inference error value according to inferred image block image feature data corresponding to shielding distribution information in the image block image feature data and second image block image feature data corresponding to the shielding distribution information;

and debugging the original neural network according to the texture inference error value, the color inference error value and the image block inference error value to obtain a target neural network.
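The shielding and error steps above can be illustrated with a minimal Python sketch. It rests on several assumptions that the text does not state: masked items are marked with `None`, the target number is obtained by rounding the product of the shielding ratio and the item count, masked positions are chosen at random, and the error is a mean squared error over masked positions only. The names `build_shielding_set` and `inference_error` are hypothetical.

```python
import random
from typing import List, Optional, Sequence, Tuple

Feature = List[float]


def build_shielding_set(
    features: Sequence[Feature], mask_ratio: float, rng: random.Random
) -> Tuple[List[Optional[Feature]], List[int]]:
    """Mask a target number of items; the remaining items stay unchanged."""
    # Assumption: target number = round(ratio * count); the text gives no formula.
    target_count = round(mask_ratio * len(features))
    masked_positions = sorted(rng.sample(range(len(features)), target_count))
    masked = set(masked_positions)
    shielding_set = [
        None if index in masked else list(item)
        for index, item in enumerate(features)
    ]
    return shielding_set, masked_positions


def inference_error(
    inferred: Sequence[Feature],
    original: Sequence[Feature],
    positions: Sequence[int],
) -> float:
    """Mean squared error over masked positions (an assumed error measure)."""
    squared = [
        (a - b) ** 2
        for position in positions
        for a, b in zip(inferred[position], original[position])
    ]
    return sum(squared) / len(squared) if squared else 0.0


rng = random.Random(0)
# Hypothetical second texture image feature data for five template image blocks.
texture_features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
shielding_set, positions = build_shielding_set(texture_features, 0.4, rng)

# Stand-in for the network output: every item is "inferred" as zeros.
inferred = [[0.0, 0.0] for _ in texture_features]
texture_error = inference_error(inferred, texture_features, positions)
print(len(positions), texture_error)
```

The color and image block error values would be computed the same way on their own feature sequences.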

In some embodiments, the debugging the original neural network according to the texture inference error value, the color inference error value and the image block inference error value to obtain a target neural network includes:

integrating the texture inference error value, the color inference error value and the image block inference error value to obtain an integrated error value;

if the integrated error value is smaller than the error critical value, the original neural network is used as a target neural network;

and if the integrated error value is not smaller than the error critical value, adjusting network configuration variables of the original neural network through the integrated error value to obtain an original neural network with adjusted network configuration variables, and then returning to the step of, for each shielding set in the plurality of shielding sets, inputting the shielding set into the original neural network to obtain inferred image feature data corresponding to the shielding distribution information in the shielding set, until the integrated error value is smaller than the error critical value, at which point debugging stops; wherein the network configuration variables comprise network configuration variables of one or more of the original description knowledge extraction network, the original knowledge interaction network, the original anomaly carrier construction network and the original bridge anomaly recognition network.
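The stopping rule above can be sketched as a simple control loop. This is only an illustration under stated assumptions: the three error values are integrated by a weighted sum (the text does not fix the integration rule), `evaluate` and `adjust` are hypothetical callables standing in for a forward pass over the shielding sets and for a parameter update, and `max_rounds` is an added safeguard that the text does not mention.

```python
from typing import Callable, Tuple

Errors = Tuple[float, float, float]


def integrate_errors(errors: Errors, weights: Errors = (1.0, 1.0, 1.0)) -> float:
    """Assumed integration rule: weighted sum of the three inference errors."""
    return sum(w * e for w, e in zip(weights, errors))


def debug_network(
    evaluate: Callable[[], Errors],
    adjust: Callable[[float], None],
    critical_value: float,
    max_rounds: int = 1000,
) -> Tuple[int, float]:
    """Repeat evaluate/adjust until the integrated error is below the critical value."""
    for round_index in range(max_rounds):
        integrated = integrate_errors(evaluate())
        if integrated < critical_value:
            # The current network is taken as the target neural network.
            return round_index, integrated
        adjust(integrated)
    raise RuntimeError("integrated error never fell below the critical value")


# Toy stand-in for a network: one scalar controls all three error values.
state = {"scale": 1.0}


def toy_evaluate() -> Errors:
    return (state["scale"], 0.5 * state["scale"], 0.25 * state["scale"])


def toy_adjust(integrated_error: float) -> None:
    # Stand-in for adjusting network configuration variables.
    state["scale"] *= 0.5


rounds, final_error = debug_network(toy_evaluate, toy_adjust, critical_value=0.1)
print(rounds, final_error)
```

In a real system `adjust` would typically be a gradient-based update of the parameters of one or more of the four sub-networks.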

In another aspect, embodiments of the present application provide a bridge inspection system comprising a computer device and at least one drone communicatively connected to the computer device, the computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, the processor implementing the steps in the method described above when the program is executed.

The embodiment of the application provides at least the following beneficial effects. In the bridge detection method and system based on the unmanned aerial vehicle, a bridge aerial image set to be identified is constructed, the set comprising a plurality of first image blocks; for each first image block in the plurality of first image blocks, a first description knowledge set corresponding to the first image block is determined according to the first image block and a preset first image data set, the first description knowledge set comprising a plurality of first description knowledge, namely first texture description knowledge, first color description knowledge and first image block description knowledge; a knowledge interaction operation is performed on the plurality of first description knowledge corresponding to the first image block to determine first interaction description knowledge corresponding to the first image block; and first anomaly analysis description knowledge corresponding to the first image block is determined according to the first interaction description knowledge, yielding a plurality of first anomaly analysis description knowledge corresponding to the plurality of first image blocks. Because anomaly detection and recognition draw on the color, texture and image block aspects of the bridge aerial image, the detection accuracy is high.

In addition, with the neural network debugging method provided by the embodiment of the application, debugging the neural network does not rely on training with a large number of images in anomalous states. This reduces the high labor cost of collecting and labeling anomalous images, so the neural network is debugged at low cost and with high efficiency.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the aspects of the present application.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and, together with the description, serve to explain the technical aspects of the application.

Fig. 1 is a schematic diagram of a bridge inspection system according to an embodiment of the present application.

Fig. 2 is a schematic implementation flow chart of a bridge detection method based on an unmanned aerial vehicle according to an embodiment of the present application.

Fig. 3 is a schematic diagram of a composition structure of a bridge detection device according to an embodiment of the present application.

Fig. 4 is a schematic hardware entity diagram of a computer device according to an embodiment of the present application.

Description of the embodiments

For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application are further elaborated below in conjunction with the accompanying drawings and examples, which should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without making inventive efforts are within the scope of protection of the present application.

In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict. The term "first/second/third" is merely to distinguish similar objects and does not represent a specific ordering of objects, it being understood that the "first/second/third" may be interchanged with a specific order or sequence, as permitted, to enable embodiments of the present application described herein to be practiced otherwise than as illustrated or described herein.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing the present application only and is not intended to be limiting of the present application.

Fig. 1 is a schematic diagram of a bridge inspection system 400 according to some embodiments of the present application, the bridge inspection system 400 including a computing device 100 and a drone 300 communicatively coupled to each other via a network 200.

The embodiment of the application provides a bridge detection method based on an unmanned aerial vehicle, which can be executed by a processor of computer equipment. The computer device may refer to a device with data processing capability such as a server, a notebook computer, a tablet computer, a desktop computer, a smart television, a mobile device (e.g., a mobile phone, a portable video player, a personal digital assistant, a dedicated messaging device, a portable game device), etc.

In conventional bridge anomaly detection, attention usually focuses on how the shape in a local bridge image differs from the normal shape; for example, when a crack forms, the image shape at the crack differs markedly from the region in its normal state, and an anomaly is identified on that basis. However, because the bridge is exposed to the environment, a local bridge image can be highly complex: non-anomalous conditions such as vegetation cover or foreign-matter contamination may be judged anomalous during image recognition, producing invalid predictions. The scheme of the embodiment of the application therefore predicts from a multi-dimensional semantic analysis of the bridge image, covering texture, color and the image block as a whole, which supports semantic analysis in different scenes and ensures the accuracy of bridge anomaly detection. In addition, in machine-learning-based image recognition and detection, debugging a neural network conventionally relies on training with a large number of images in anomalous states, and those images must be labeled; collecting and labeling them incurs high labor cost, so debugging is expensive and inefficient. The neural network debugging method provided later in the embodiment of the application makes this process more efficient.

Fig. 2 is a schematic implementation flow chart of a bridge detection method based on an unmanned aerial vehicle according to an embodiment of the present application. As shown in Fig. 2, the method includes the following steps 110 to 150.

Step 110, acquiring a plurality of bridge aerial images sent by at least one unmanned aerial vehicle, and constructing to obtain a bridge aerial image set to be identified, wherein the bridge aerial image set to be identified comprises a plurality of first image blocks.

Step 120, for each first image block in the plurality of first image blocks, determining a first description knowledge set corresponding to the first image block according to the first image block and a preset first image data set, where the first description knowledge set includes a plurality of first description knowledge, namely first texture description knowledge, first color description knowledge and first image block description knowledge.

Step 130, performing a knowledge interaction operation on the plurality of first description knowledge corresponding to the first image block to determine first interaction description knowledge corresponding to the first image block.

Step 140, determining first anomaly analysis description knowledge corresponding to the first image block according to the first interaction description knowledge, to obtain a plurality of first anomaly analysis description knowledge corresponding to the plurality of first image blocks.

Step 150, performing anomaly recognition on the plurality of first anomaly analysis description knowledge through a target bridge anomaly recognition network to determine an anomaly recognition result corresponding to the bridge aerial image set to be identified.

In one embodiment, the preset first image data set includes a plurality of image blocks together with the texture information, color information and image block information corresponding to each image block. The first image block is one of a plurality of image blocks; "first" only distinguishes it in the description and carries no other meaning. An image block may be an original bridge aerial image or an image obtained by preprocessing the original bridge aerial image, for example by denoising, enhancement or cropping. In one embodiment, the first image data set comprises a preset texture data set, color data set and image block data set.

In one embodiment, the image block data set may be a set of image blocks formed from common local bridge images. The color data set may be color information obtained by extracting and refining the colors of the image blocks in the image block data set, and may be represented as feature vectors. The texture data set may be texture information obtained by extracting and refining the textures of the image blocks in the image block data set, and may likewise be represented as feature vectors.
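The text does not say how color information is extracted or refined. As one possible illustration only, the sketch below turns an image block into a color feature vector using a normalized per-channel histogram. The histogram technique, the function name `color_histogram`, the bin count and the pixel layout (rows of 8-bit RGB tuples) are all assumptions rather than part of the described method.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumed 8-bit (R, G, B)


def color_histogram(block: Sequence[Sequence[Pixel]], bins: int = 4) -> List[float]:
    """Return a normalized per-channel histogram as a flat feature vector.

    The vector layout is [R bins..., G bins..., B bins...].
    """
    counts = [0] * (3 * bins)
    pixel_count = 0
    for row in block:
        for pixel in row:
            for channel, value in enumerate(pixel):
                # Map 0..255 onto 0..bins-1.
                bin_index = min(value * bins // 256, bins - 1)
                counts[channel * bins + bin_index] += 1
            pixel_count += 1
    if pixel_count == 0:
        return [0.0] * len(counts)
    return [count / pixel_count for count in counts]


# A tiny 2x2 hypothetical image block: two red pixels and two blue pixels.
block = [
    [(255, 0, 0), (255, 0, 0)],
    [(0, 0, 255), (0, 0, 255)],
]
color_vector = color_histogram(block, bins=2)
print(color_vector)
```

A texture feature vector could be built along similar lines from local intensity differences, again as an assumption rather than the method the text prescribes.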

In step 120, a first description knowledge set corresponding to the first image block is determined according to the first image block and a preset first image data set, which specifically includes the following steps 121 to 122:

Step 121, determining first image feature data corresponding to the first image block according to the first image block and a preset first image data set, where the first image feature data includes multiple first image feature data, specifically, first texture image feature data, first color image feature data and first image block image feature data.

Step 122, determining a first description knowledge set corresponding to the first image block according to the first image feature data corresponding to the first image block.

For example, if the bridge aerial image set to be identified is "A, B, C, D, E", it includes five first image blocks A, B, C, D and E. For first image block "A", the first image feature data determined according to the first image block and the preset first image data set are first texture image feature data "V1", first color image feature data "V2" and first image block image feature data "V3".

In one embodiment, in step 122, determining the first description knowledge set corresponding to the first image block through the first image feature data corresponding to the first image block may specifically include: for each piece of first image feature data corresponding to the first image block, determining the corresponding first description knowledge according to the first image feature data, the target feature coding network and the target first classification mapping network, to obtain the first description knowledge set corresponding to the first image block. The first description knowledge is a knowledge vector describing the corresponding information feature, that is, a vector representation of the image block, which may be a feature vector or a feature matrix. Because the colors and textures of different first image blocks carry different amounts of information, their vectors have different numbers of elements. The first image feature data is therefore first processed into feature data of the preset feature capacity (that is, a specific vector length), and the first description knowledge is determined from the processed feature data.

Determining the corresponding first description knowledge according to the first image feature data, the target feature coding network and the target first classification mapping network comprises: for the first image feature data, determining whether the feature capacity of the first image feature data meets a preset feature capacity; if the feature capacity of the first image feature data meets the preset feature capacity, determining first description knowledge corresponding to the first image feature data according to the first image feature data, the target feature coding network and the target first classification mapping network; if the feature capacity of the first image feature data does not meet the preset feature capacity, supplementing the first image feature data based on preset supplementary image features (which may be a vector) until its capacity meets the preset feature capacity, and then determining first description knowledge corresponding to the first image feature data according to the supplemented first image feature data, the target feature coding network and the target first classification mapping network.

For example, if the preset feature capacity is 6 and the feature capacity of the first texture image feature data V1 is 5, V1 does not meet the preset feature capacity and is supplemented with the preset supplementary image feature until its capacity reaches 6. The preset supplementary image feature may be 0, in which case the supplemented first texture image feature data is (12, 34, 23, 21, 45, 0). The first color image feature data is supplemented in the same way.
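The supplementing step in this example amounts to right-padding a vector to a fixed length. A minimal Python sketch follows; the function name `pad_to_capacity` is hypothetical, and raising an error for over-long vectors is an assumption, since the text does not describe that case.

```python
from typing import List, Sequence


def pad_to_capacity(
    features: Sequence[float], capacity: int, fill: float = 0
) -> List[float]:
    """Right-pad a feature vector with `fill` until it holds `capacity` elements."""
    if len(features) > capacity:
        # Assumption: the text does not say how over-long vectors are handled.
        raise ValueError("feature vector exceeds the preset feature capacity")
    return list(features) + [fill] * (capacity - len(features))


# Values taken from the example: V1 has 5 elements, the preset capacity is 6.
v1 = [12, 34, 23, 21, 45]
padded_v1 = pad_to_capacity(v1, capacity=6, fill=0)
print(padded_v1)  # [12, 34, 23, 21, 45, 0]
```

A vector that already meets the preset capacity passes through unchanged.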

In one embodiment, after the first image feature data is supplemented to a capacity meeting the preset feature capacity, determining the first description knowledge corresponding to the first image feature data according to the supplemented first image feature data, the target feature coding network and the target first classification mapping network specifically includes: projecting the first image feature data to a vector of a first target dimension through the target feature coding network (which performs embedding, that is, feature dimension reduction), and then making the projected vector consistent with a preset middle layer parameter through the target first classification mapping network (which performs classification and unification, for example a fully connected network).
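The two-stage mapping described above can be pictured as two linear projections applied in sequence. The sketch below is an assumption-laden illustration: both networks are reduced to single bias-free linear layers with random, untrained weights, and the dimensions (capacity 6, first target dimension 4, middle layer dimension 8) are invented for the example.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Random weights of shape (rows, cols); a trained network would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(x: Vector, weights: Matrix) -> Vector:
    """Apply a bias-free linear layer; `weights` has shape (out_dim, in_dim)."""
    return [sum(w * value for w, value in zip(row, x)) for row in weights]


# All three dimensions are assumed for illustration.
FEATURE_CAPACITY = 6
FIRST_TARGET_DIM = 4
MIDDLE_LAYER_DIM = 8

rng = random.Random(0)
encoder_weights = random_matrix(FIRST_TARGET_DIM, FEATURE_CAPACITY, rng)
mapping_weights = random_matrix(MIDDLE_LAYER_DIM, FIRST_TARGET_DIM, rng)


def describe(features: Vector) -> Vector:
    """Embed the features, then map them to the middle layer dimension."""
    embedded = linear(features, encoder_weights)  # stand-in for the feature coding network
    return linear(embedded, mapping_weights)  # stand-in for the first classification mapping network


texture_knowledge = describe([12, 34, 23, 21, 45, 0])
print(len(texture_knowledge))
```

Because every modality ends at the same middle layer dimension, the texture, color and image block description knowledge can later be fused without further reshaping.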

As an implementation manner, in step 130, performing knowledge interaction operation on the plurality of first description knowledge corresponding to the first image block to determine the first interaction description knowledge corresponding to the first image block may specifically include the following steps 131 to 133:

Step 131, fusing (e.g., concatenating) the first texture description knowledge, the first color description knowledge and the first image block description knowledge corresponding to the first image block on the middle layer dimension to obtain first fused knowledge.

Step 132, obtaining a target dimension corresponding to the target bridge anomaly identification network.

Step 133, performing description knowledge dimension optimization on the first fused knowledge according to the target dimension to obtain the first interaction description knowledge corresponding to the first image block. The first middle layer dimension corresponding to the first interaction description knowledge is consistent with the second middle layer dimension corresponding to the target dimension, and the target dimension is consistent with the middle layer parameter corresponding to the target bridge anomaly identification network.

As an embodiment, the steps 131 to 133 are performed through a preset target knowledge interaction network, where the target knowledge interaction network includes a target second classification mapping network.
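Steps 131 to 133 can be sketched as a concatenation followed by a fully connected projection. This is an illustrative sketch under assumptions: the dimensions `HIDDEN_DIM` and `TARGET_DIM` are arbitrary, the helper names are hypothetical, and the random weights stand in for the parameters of the target second classification mapping network.

```python
import random


def concatenate(*vectors):
    """Step 131: fuse description knowledge vectors by splicing them end to end."""
    fused = []
    for vector in vectors:
        fused.extend(vector)
    return fused


def project(vector, weights, bias):
    """Step 133: fully connected mapping, one weight row per output dimension."""
    return [sum(x * w for x, w in zip(vector, row)) + b
            for row, b in zip(weights, bias)]


HIDDEN_DIM = 4  # assumed middle layer dimension of each description knowledge
TARGET_DIM = 4  # step 132: assumed target dimension of the identification network

rng = random.Random(0)
texture_knowledge = [rng.random() for _ in range(HIDDEN_DIM)]
color_knowledge = [rng.random() for _ in range(HIDDEN_DIM)]
block_knowledge = [rng.random() for _ in range(HIDDEN_DIM)]

fused = concatenate(texture_knowledge, color_knowledge, block_knowledge)

# Stand-in parameters; in practice these would be learned.
weights = [[rng.uniform(-0.1, 0.1) for _ in range(len(fused))]
           for _ in range(TARGET_DIM)]
bias = [0.0] * TARGET_DIM

interaction_knowledge = project(fused, weights, bias)
print(len(fused), len(interaction_knowledge))
```

Concatenation triples the vector length, so the projection is what brings the fused knowledge back to the dimension the anomaly identification network expects.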

As an embodiment, the method further comprises: determining first image distribution description knowledge and first region segmentation description knowledge corresponding to each first image block according to the bridge aerial image set to be identified. In this case, determining the first anomaly analysis description knowledge corresponding to the first image block according to the first interaction description knowledge may include: determining the first anomaly analysis description knowledge corresponding to the first image block according to the first interaction description knowledge, the first image distribution description knowledge corresponding to the first image block and the first region segmentation description knowledge corresponding to the first image block. The first image distribution description knowledge is a feature vector representing the distribution position of the first image block in the bridge aerial image set to be identified, and the first region segmentation description knowledge is a feature vector representing the image region segmentation of the bridge aerial image set to be identified. In one embodiment, the first interaction description knowledge, the first image distribution description knowledge corresponding to the first image block and the first region segmentation description knowledge corresponding to the first image block are integrated to obtain the first anomaly analysis description knowledge corresponding to the first image block, where the integration mode may be one of adding, splicing or connecting.

In one implementation manner, the target bridge anomaly identification network provided in the embodiments of the present application may be a deep neural network, such as a DNN, an RNN or a Transformer.
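The integration of the first interaction description knowledge with the image distribution and region segmentation description knowledge, described above as adding or splicing, can be illustrated with the sketch below. The function name `integrate_knowledge` and the `mode` values are hypothetical, and requiring equal lengths for element-wise addition is an assumption.

```python
def integrate_knowledge(interaction, distribution, segmentation, mode="add"):
    """Integrate three description knowledge vectors by adding or splicing."""
    if mode == "add":
        # Assumption: element-wise addition requires equal-length vectors.
        if not (len(interaction) == len(distribution) == len(segmentation)):
            raise ValueError("adding requires vectors of equal length")
        return [a + b + c
                for a, b, c in zip(interaction, distribution, segmentation)]
    if mode == "concat":
        return list(interaction) + list(distribution) + list(segmentation)
    raise ValueError(f"unknown integration mode: {mode}")


interaction = [1.0, 2.0]
distribution = [0.5, 0.5]   # position of the image block in the set
segmentation = [0.0, 1.0]   # region the image block belongs to

added = integrate_knowledge(interaction, distribution, segmentation, mode="add")
spliced = integrate_knowledge(interaction, distribution, segmentation, mode="concat")
print(added, spliced)
```

Adding keeps the vector length unchanged, while splicing triples it, so the choice affects the input dimension the anomaly identification network must accept.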

According to the method and system provided in the embodiments of the present application, a bridge aerial image set to be identified is obtained, where the bridge aerial image set to be identified comprises a plurality of first image blocks; for each first image block in the plurality of first image blocks, a first description knowledge set corresponding to the first image block is determined according to the first image block and a preset first image data set, where the first description knowledge set comprises a plurality of first description knowledge, specifically first texture description knowledge, first color description knowledge and first image block description knowledge; a knowledge interaction operation is performed on the plurality of first description knowledge corresponding to the first image block to determine the first interaction description knowledge corresponding to the first image block; a first anomaly analysis description knowledge corresponding to the first image block is determined according to the first interaction description knowledge, so as to obtain a plurality of first anomaly analysis description knowledge corresponding to the plurality of first image blocks; and anomaly recognition is performed on the plurality of first anomaly analysis description knowledge through the target bridge anomaly identification network to determine the anomaly recognition result corresponding to the bridge aerial image set to be identified.

As an implementation manner, the bridge detection method based on the unmanned aerial vehicle may also be executed through a preset target neural network. For the target neural network, the input data may be the bridge aerial image set to be identified, and the output data is the anomaly recognition result corresponding to the bridge aerial image set to be identified.

In one embodiment, the target neural network comprises a target description knowledge extraction network, a target knowledge interaction network, a target anomaly carrier construction network and a target bridge anomaly identification network. The bridge detection method based on the unmanned aerial vehicle then comprises the following steps:

Step 100, after obtaining a bridge aerial image set to be identified that includes a plurality of first image blocks, determining, through the target description knowledge extraction network, for each first image block in the plurality of first image blocks, a first description knowledge set corresponding to the first image block according to the first image block and a preset first image data set, where the first description knowledge set includes a plurality of first description knowledge, namely first texture description knowledge, first color description knowledge and first image block description knowledge.

Step 200, after the first description knowledge set corresponding to the first image block is determined through the target description knowledge extraction network, performing a knowledge interaction operation on the plurality of first description knowledge corresponding to the first image block through the target knowledge interaction network to determine the first interaction description knowledge corresponding to the first image block.

Step 300, determining, through the target anomaly carrier construction network, a first anomaly analysis description knowledge corresponding to the first image block according to the first interaction description knowledge, so as to obtain a plurality of first anomaly analysis description knowledge corresponding to the plurality of first image blocks.

Step 400, performing anomaly recognition on the plurality of first anomaly analysis description knowledge through the target bridge anomaly identification network to determine the anomaly recognition result corresponding to the bridge aerial image set to be identified.
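The data flow through the four sub-networks in steps 100 to 400 can be sketched as a simple composition. The function `detect_bridge_anomalies` and the four callables passed to it are hypothetical stand-ins, not the networks themselves; the toy lambdas only make the flow executable.

```python
def detect_bridge_anomalies(image_blocks, extract, interact, build_carrier, identify):
    """Chain the four stages; each argument stands in for one sub-network."""
    analysis_knowledge = []
    for block in image_blocks:
        knowledge_set = extract(block)                          # step 100
        interaction = interact(knowledge_set)                   # step 200
        analysis_knowledge.append(build_carrier(interaction))   # step 300
    return identify(analysis_knowledge)                         # step 400


# Toy stand-ins (assumptions for illustration only).
result = detect_bridge_anomalies(
    image_blocks=[[1, 2], [3, 4]],
    extract=lambda block: {"texture": block, "color": block, "block": block},
    interact=lambda ks: ks["texture"] + ks["color"] + ks["block"],
    build_carrier=lambda interaction: sum(interaction),
    identify=lambda analysis: [value > 12 for value in analysis],
)
print(result)
```

Steps 100 to 300 run once per image block, while step 400 receives the analysis knowledge of all image blocks together.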

In one embodiment, the target feature encoding network and the target first classification mapping network are both networks in the target neural network, and the middle layer dimension is the middle layer dimension corresponding to the target neural network.

The embodiments of the present application further provide a method for debugging the neural network, which includes the following steps:

Step 101, obtaining a bridge template image set, wherein the bridge template image set comprises a plurality of bridge template image blocks.

The bridge template image set is a data set used for debugging the network.

Step 102, determining a plurality of second image feature data corresponding to the bridge template image set, including texture image feature data, color image feature data and image block image feature data.

The texture image feature data comprises second texture image feature data of each bridge template image block, the color image feature data comprises second color image feature data of each bridge template image block, and the image block image feature data comprises second image block image feature data of each bridge template image block.

Step 103, performing a masking operation on each image feature data in the plurality of image feature data to obtain a plurality of mask sets.

The masking operation masks the image feature data, and the resulting mask set is the feature sequence obtained after the feature data is masked. Each image feature data corresponds to at least one mask set, and each mask set comprises a plurality of temporary image feature data; that is, the image feature data contained in a mask set are the temporary image feature data of that mask set.

Step 104, for each mask set in the plurality of mask sets, inputting the mask set into the original neural network to obtain inferred image feature data corresponding to the masking distribution information in the mask set (i.e., the result the network predicts for the feature sequence corresponding to the mask set). The original neural network comprises an original description knowledge extraction network, an original knowledge interaction network, an original anomaly carrier construction network and an original bridge anomaly identification network. The original description knowledge extraction network is used for determining a second description knowledge set corresponding to each temporary image feature data in the mask set, where the second description knowledge set comprises a plurality of second description knowledge, namely second texture description knowledge, second color description knowledge and second image block description knowledge. The original knowledge interaction network is used for determining second interaction description knowledge corresponding to each temporary image feature data in the mask set according to the second description knowledge set. The original anomaly carrier construction network is used for determining the input data of the original bridge anomaly identification network according to the second interaction description knowledge. The original bridge anomaly identification network is used for determining the inferred image feature data corresponding to the masking distribution information in the mask set according to the input data.

Step 105, debugging the original neural network according to the inferred image feature data corresponding to the masking distribution information in each mask set of the plurality of mask sets and the second image feature data corresponding to the masking distribution information in each mask set of the plurality of mask sets, to obtain the target neural network.

The target neural network comprises a target description knowledge extraction network, a target knowledge interaction network, a target anomaly carrier construction network and a target bridge anomaly identification network. The debugging of the target neural network relies on a plurality of bridge template image sets, and the bridge template image set in step 101 is one bridge template image set in the plurality of bridge template image sets.

In one embodiment, step 102, determining a plurality of second image feature data corresponding to the bridge template image set may include: acquiring a plurality of feature classifications, namely textures, colors and image blocks; and determining a plurality of second image feature data corresponding to the bridge template image set according to the plurality of classifications, wherein one classification corresponds to one image feature data, and the second image feature data is actual image feature data corresponding to the bridge template image block, for example, the second image feature data comprises actual image block image feature data, actual color image feature data and actual texture feature data corresponding to the bridge template image block.

In one embodiment, in step 103, performing a masking operation on each image feature data in the plurality of image feature data to obtain a plurality of mask sets includes performing a masking operation on each image feature data to obtain the mask set corresponding to that image feature data, which includes: acquiring a preset masking ratio (that is, the percentage to be masked); acquiring the number of second image feature data contained in the image feature data; determining the target number of second image feature data to be masked in the image feature data according to the preset masking ratio and the number of second image feature data; and performing the masking operation on the target number of second image feature data in the image feature data while leaving the remaining second image feature data unchanged, to obtain the mask set corresponding to the image feature data. The value of the preset masking ratio is set according to actual needs. In this way the network learns from the masked image feature data without relying on a large number of additionally labeled training samples, so that the neural network can learn abnormal bridge image information and the debugging efficiency of the neural network is improved. Setting a masking ratio, acquiring the number of feature elements contained in the image feature data and then masking in proportion is simple to operate and efficient.

In one embodiment, determining the target number of second image feature data to be masked in the image feature data according to the preset masking ratio and the number of second image feature data includes: obtaining the product of the preset masking ratio and the number of second image feature data, and determining the target number of second image feature data to be masked in the image feature data according to the product.

For example, determining the target number of second image feature data to be masked according to the product includes: converting the product to an integer (e.g., rounding it to the nearest integer), and taking the resulting integer as the target number of second image feature data to be masked in the image feature data.
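The calculation of the target number can be written out as a short sketch. The function name `target_mask_count` is hypothetical, and rounding halves upward is an assumption, since the text only says the product is rounded to the nearest integer.

```python
import math


def target_mask_count(mask_ratio, num_features):
    """Round mask_ratio * num_features to the nearest integer (halves round up)."""
    if not 0.0 <= mask_ratio <= 1.0:
        raise ValueError("mask_ratio must be between 0 and 1")
    return math.floor(mask_ratio * num_features + 0.5)


# With an assumed masking ratio of 0.15, a sequence of 20 features has
# 0.15 * 20 = 3 features masked, and a sequence of 6 features has
# round(0.9) = 1 feature masked.
print(target_mask_count(0.15, 20), target_mask_count(0.15, 6))
```

`math.floor(x + 0.5)` is used instead of the built-in `round`, because `round` in Python rounds exact halves to the nearest even integer.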

In one embodiment, performing the masking operation on the target number of second image feature data while leaving the remaining second image feature data unchanged, to obtain the mask set corresponding to the image feature data, includes: arbitrarily selecting the target number of second image feature data from the second image feature data contained in the image feature data and performing the masking operation on them, while the remaining second image feature data are unchanged, to obtain the mask set corresponding to the image feature data.

As one embodiment, arbitrarily selecting the target number of second image feature data and performing the masking operation on them includes: transforming each of the arbitrarily selected target number of second image feature data into a target vector, while the remaining second image feature data are unchanged, to obtain the mask set corresponding to the image feature data.

The target vector is a preset vector with a first trusted coefficient, arbitrary information with a second trusted coefficient, and the original second image feature data with a third trusted coefficient. The trusted coefficient may be a probability, a confidence or a support.

In one embodiment, the preset vector is 0, and the arbitrary information is associated with the feature classification corresponding to the image feature data. For example, if the classification corresponding to the image feature data is the texture classification, the arbitrary information is arbitrary texture image feature data; if it is the color classification, the arbitrary information is arbitrary color image feature data; and if it is the image block classification, the arbitrary information is arbitrary image block image feature data.
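The masking operation with three trusted coefficients can be sketched as below, reading the coefficients as probabilities. The values 0.8, 0.1 and 0.1 are assumptions chosen for illustration (the text gives no values), as are the function name `mask_features` and the use of the sequence itself as the pool of arbitrary information.

```python
import math
import random


def mask_features(features, mask_ratio, rng, preset=0,
                  p_preset=0.8, p_random=0.1, candidates=None):
    """Return (mask set, masked positions) for one image feature sequence.

    Each selected element becomes the preset vector with probability p_preset,
    an arbitrary value of the same classification with probability p_random,
    and keeps its original value otherwise (assumed probabilities).
    """
    if p_preset + p_random > 1.0:
        raise ValueError("p_preset + p_random must not exceed 1")
    count = math.floor(mask_ratio * len(features) + 0.5)
    positions = sorted(rng.sample(range(len(features)), count))
    pool = list(candidates) if candidates is not None else list(features)
    masked = list(features)
    for position in positions:
        draw = rng.random()
        if draw < p_preset:
            masked[position] = preset
        elif draw < p_preset + p_random:
            masked[position] = rng.choice(pool)
        # Otherwise the original value is kept but is still a prediction target.
    return masked, positions


texture_features = [12, 34, 23, 21, 45, 0]
mask_set, masked_positions = mask_features(texture_features, 0.5, random.Random(0))
print(mask_set, masked_positions)
```

The returned positions play the role of the masking distribution information: they identify which elements the network must infer during debugging.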

In one embodiment, debugging the original neural network according to the inferred image feature data corresponding to the masking distribution information in each of the plurality of mask sets and the second image feature data corresponding to the masking distribution information in each of the plurality of mask sets includes: determining a texture inference error value according to the inferred texture image feature data corresponding to the masking distribution information in the texture image feature data and the second texture image feature data corresponding to the masking distribution information; determining a color inference error value according to the inferred color image feature data corresponding to the masking distribution information in the color image feature data and the second color image feature data corresponding to the masking distribution information; determining an image block inference error value according to the inferred image block image feature data corresponding to the masking distribution information in the image block image feature data and the second image block image feature data corresponding to the masking distribution information; and debugging the original neural network according to the texture inference error value, the color inference error value and the image block inference error value to obtain the target neural network.

In one embodiment, determining the texture inference error value according to the inferred texture image feature data corresponding to the masking distribution information in the texture image feature data and the second texture image feature data corresponding to the masking distribution information includes: determining the cross entropy error value between the inferred texture image feature data corresponding to the masking distribution information and the second texture image feature data corresponding to the masking distribution information.

In one embodiment, determining the color inference error value according to the inferred color image feature data corresponding to the masking distribution information in the color image feature data and the second color image feature data corresponding to the masking distribution information includes: determining the cross entropy error value between the inferred color image feature data corresponding to the masking distribution information and the second color image feature data corresponding to the masking distribution information.

In one embodiment, determining the image block inference error value according to the inferred image block image feature data corresponding to the masking distribution information in the image block image feature data and the second image block image feature data corresponding to the masking distribution information includes: determining the cross entropy error value between the inferred image block image feature data corresponding to the masking distribution information and the second image block image feature data corresponding to the masking distribution information.
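A cross entropy error value of this kind could be computed as in the sketch below. It assumes, for illustration only, that the network outputs a probability distribution over possible feature values at each masked position and that the actual second image feature data is given as the index of the true value; the function name `cross_entropy_error` is hypothetical.

```python
import math


def cross_entropy_error(inferred_probs, actual_indices, epsilon=1e-12):
    """Mean cross entropy over the masked positions.

    inferred_probs: one probability distribution per masked position.
    actual_indices: index of the true feature value at each masked position.
    """
    if len(inferred_probs) != len(actual_indices) or not actual_indices:
        raise ValueError("need one distribution per masked position")
    total = 0.0
    for probs, index in zip(inferred_probs, actual_indices):
        # epsilon guards against log(0) when the true value gets zero probability.
        total += -math.log(max(probs[index], epsilon))
    return total / len(actual_indices)


# Two masked texture positions with three candidate values each.
inferred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
actual = [0, 1]
texture_error = cross_entropy_error(inferred, actual)
print(texture_error)
```

The same function would be applied separately to the texture, color and image block sequences to obtain the three inference error values.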

Specifically, debugging the original neural network according to the texture inference error value, the color inference error value and the image block inference error value to obtain the target neural network includes: integrating the three error values, for example by adding them directly, by computing a weighted sum with different weights, or by taking their average, to obtain an integrated error value; if the integrated error value is smaller than an error threshold, taking the original neural network as the target neural network, where the error threshold may be adapted to the error integration mode used and is not limited herein; if the integrated error value is not smaller than the error threshold, adjusting the network configuration variables of the original neural network according to the integrated error value to obtain an original neural network with adjusted network configuration variables, and then returning to the step of inputting, for each mask set in the plurality of mask sets, the mask set into the original neural network to obtain the inferred image feature data corresponding to the masking distribution information in the mask set, until the integrated error value is smaller than the error threshold, at which point debugging stops. The network configuration variables include the network configuration variables corresponding to one or more of the original description knowledge extraction network, the original knowledge interaction network, the original anomaly carrier construction network and the original bridge anomaly identification network.
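The error integration and the threshold-controlled debugging loop can be sketched as follows. The names `integrate_errors` and `debug_until_converged`, the callables `compute_errors` and `adjust`, and the `max_rounds` safety cap are all assumptions introduced for illustration; the toy network below simply halves its errors on every adjustment.

```python
def integrate_errors(texture, color, block, mode="sum", weights=(1.0, 1.0, 1.0)):
    """Integrate the three inference error values by sum, weighted sum or mean."""
    errors = (texture, color, block)
    if mode == "sum":
        return sum(errors)
    if mode == "weighted":
        return sum(w * e for w, e in zip(weights, errors))
    if mode == "mean":
        return sum(errors) / len(errors)
    raise ValueError(f"unknown integration mode: {mode}")


def debug_until_converged(compute_errors, adjust, threshold, max_rounds=1000):
    """Repeat inference and adjustment until the integrated error is below threshold.

    compute_errors() returns (texture, color, block) error values for the
    current network; adjust(error) updates the network configuration variables.
    max_rounds is an assumed safety cap that the text does not mention.
    """
    for round_index in range(max_rounds):
        error = integrate_errors(*compute_errors())
        if error < threshold:
            return round_index, error
        adjust(error)
    raise RuntimeError("integrated error did not fall below the threshold")


# Toy stand-in for a network whose errors halve after every adjustment.
state = {"scale": 1.0}
rounds, final_error = debug_until_converged(
    compute_errors=lambda: (0.4 * state["scale"], 0.3 * state["scale"], 0.3 * state["scale"]),
    adjust=lambda error: state.update(scale=state["scale"] / 2),
    threshold=0.1,
)
print(rounds, final_error)
```

Because the three integration modes produce values on different scales, the threshold passed to the loop would need to be chosen to match the mode in use.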

The debugging process enables the network to learn the fusion correspondence and the semantic information of the image blocks, colors and textures of bridge images. For example, when inferring an image block, the network infers it from the integrated codes of the adjacent image blocks and attends more to the semantic information of the image blocks embedded in the integrated codes; when inferring a color, it attends more to the semantic information of the colors embedded in the integrated codes. This can enhance the accuracy of bridge image detection.

For each temporary image feature data, the original knowledge interaction network is configured to determine, according to the second description knowledge set, the second interaction description knowledge corresponding to each temporary image feature data in the mask set. For example, the original knowledge interaction network may fuse (e.g., concatenate) the second texture description knowledge, the second color description knowledge and the second image block description knowledge corresponding to the temporary image feature data on the middle layer dimension to obtain second fused knowledge; acquire a target dimension corresponding to the original bridge anomaly identification network; and perform description knowledge dimension optimization on the second fused knowledge according to the target dimension to obtain the second interaction description knowledge corresponding to the temporary image feature data, where the middle layer dimension corresponding to the second interaction description knowledge is consistent with the middle layer dimension corresponding to the target dimension.

For example, the original description knowledge extraction network may determine each second description knowledge corresponding to the temporary image feature data according to the temporary image feature data, the original feature encoding network and the original first classification mapping network. When doing so, the original description knowledge extraction network is configured to: for the temporary image feature data, determine whether the feature capacity of the temporary image feature data meets a preset feature capacity; if it does, determine the second description knowledge corresponding to the temporary image feature data according to the temporary image feature data, the original feature encoding network and the original first classification mapping network; and if it does not, supplement the temporary image feature data with the preset supplementary image feature until its capacity meets the preset feature capacity, and then determine the second description knowledge corresponding to the temporary image feature data according to the supplemented temporary image feature data, the original feature encoding network and the original first classification mapping network.

When determining the input data of the original bridge anomaly identification network according to the second interaction description knowledge, the original anomaly carrier construction network is used for: for each temporary image feature data in the mask set, determining a second anomaly analysis description knowledge corresponding to the temporary image feature data according to the second interaction description knowledge, so as to obtain a plurality of second anomaly analysis description knowledge corresponding to the plurality of temporary image feature data; and taking the plurality of second anomaly analysis description knowledge as the input data of the original bridge anomaly identification network.

The original anomaly carrier construction network is also used for determining second image distribution description knowledge and second region segmentation description knowledge corresponding to the temporary image feature data according to the mask set. In this case, for each temporary image feature data, determining the second anomaly analysis description knowledge corresponding to the temporary image feature data according to the second interaction description knowledge includes: determining the second anomaly analysis description knowledge corresponding to the temporary image feature data according to the second interaction description knowledge, the second image distribution description knowledge corresponding to the temporary image feature data and the second region segmentation description knowledge corresponding to the temporary image feature data.

In one embodiment, when determining the second anomaly analysis description knowledge corresponding to the temporary image feature data according to the second interaction description knowledge, the second image distribution description knowledge corresponding to the temporary image feature data and the second region segmentation description knowledge corresponding to the temporary image feature data, the original anomaly carrier construction network is configured to: integrate the second interaction description knowledge, the second image distribution description knowledge corresponding to the temporary image feature data and the second region segmentation description knowledge corresponding to the temporary image feature data to obtain the second anomaly analysis description knowledge corresponding to the temporary image feature data.

The original bridge anomaly identification network is used for determining the inferred image feature data corresponding to the masking distribution information in the mask set according to the input data, and the inferred image feature data matches the classification corresponding to the mask set.

In one embodiment, for each temporary image feature data, the original description knowledge extraction network may further determine each second description knowledge corresponding to the temporary image feature data according to the temporary image feature data, the original feature encoding network and the original first classification mapping network, to obtain a second description knowledge set corresponding to the temporary image feature data.

Based on the foregoing embodiments, the embodiments of the present application provide a bridge detection device, where each unit included in the bridge detection device and each module included in each unit may be implemented by a processor in a computer device or, of course, by a specific logic circuit. In practice, the processor may be a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field programmable gate array (FPGA), or the like.

Fig. 3 is a schematic structural diagram of a bridge detection device according to an embodiment of the present application, and as shown in fig. 3, a bridge detection device 300 includes:

the image acquisition module 310 is configured to acquire a plurality of bridge aerial images sent by at least one unmanned aerial vehicle, and construct a bridge aerial image set to be identified, where the bridge aerial image set to be identified includes a plurality of first image blocks;

the knowledge extraction module 320 is configured to determine, for each first image block of the plurality of first image blocks, a first description knowledge set corresponding to the first image block according to the first image block and a preset first image data set, where the first description knowledge set includes a plurality of first description knowledge, namely first texture description knowledge, first color description knowledge and first image block description knowledge;

the knowledge interaction module 330 is configured to perform a knowledge interaction operation on the plurality of first description knowledge corresponding to the first image block, so as to determine the first interaction description knowledge corresponding to the first image block;

the anomaly knowledge determination module 340 is configured to determine a first anomaly analysis description knowledge corresponding to the first image block according to the first interaction description knowledge, so as to obtain a plurality of first anomaly analysis description knowledge corresponding to the plurality of first image blocks;

the anomaly identification module 350 is configured to perform anomaly identification on the plurality of first anomaly analysis description knowledge through a target bridge anomaly identification network, so as to determine an anomaly identification result corresponding to the bridge aerial image set to be identified.

The description of the apparatus embodiments above is similar to that of the method embodiments above, with similar advantageous effects as the method embodiments. In some embodiments, functions or modules included in the apparatus provided in the embodiments of the present application may be used to perform the methods described in the embodiments of the methods, and for technical details that are not disclosed in the embodiments of the apparatus of the present application, please refer to the description of the embodiments of the methods of the present application for understanding.

It should be noted that, in the embodiments of the present application, if the bridge detection method based on the unmanned aerial vehicle is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present application, in essence or in the part contributing to the related art, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, an optical disk, or other various media capable of storing program codes. Thus, embodiments of the present application are not limited to any specific hardware, software, or firmware, or to any combination of hardware, software, and firmware.

The embodiment of the application provides a computer device, which comprises a memory and a processor, wherein the memory stores a computer program capable of running on the processor, and the processor executes the program to realize part or all of the steps of the method.

Embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs some or all of the steps of the above-described method. The computer readable storage medium may be transitory or non-transitory.

Embodiments of the present application provide a computer program comprising computer readable code which, when run in a computer device, performs some or all of the steps for implementing the above method.

Embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program which, when read and executed by a computer, performs some or all of the steps of the above-described method. The computer program product may be realized in particular by means of hardware, software or a combination thereof. In some embodiments, the computer program product is embodied as a computer storage medium, in other embodiments the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.

It should be noted here that: the above description of various embodiments is intended to emphasize the differences between the various embodiments, the same or similar features being referred to each other. The above description of apparatus, storage medium, computer program and computer program product embodiments is similar to that of method embodiments described above, with similar advantageous effects as the method embodiments. For technical details not disclosed in the embodiments of the apparatus, storage medium, computer program and computer program product of the present application, please refer to the description of the method embodiments of the present application.

Fig. 4 is a schematic diagram of a hardware entity of a computer device according to an embodiment of the present application. As shown in Fig. 4, the hardware entity of the computer device 1000 includes a processor 1001 and a memory 1002, wherein the memory 1002 stores a computer program executable on the processor 1001, and the processor 1001 implements the steps of the method of any of the embodiments described above when executing the program.

The memory 1002 is configured to store instructions and applications executable by the processor 1001, and may also cache data (e.g., image data, audio data, voice communication data, and video communication data) to be processed or already processed by the processor 1001 and each module in the computer device 1000. The memory 1002 may be implemented by a flash memory (FLASH) or a random access memory (Random Access Memory, RAM).

When the processor 1001 executes the program, the steps of the unmanned aerial vehicle-based bridge detection method of any of the above embodiments are implemented. The processor 1001 generally controls the overall operation of the computer device 1000.

Embodiments of the present application provide a computer storage medium storing one or more programs executable by one or more processors to implement the steps of the unmanned aerial vehicle-based bridge detection method of any of the embodiments above.

It should be noted here that the description of the storage medium and apparatus embodiments above is similar to that of the method embodiments described above, with similar benefits as the method embodiments. For technical details not disclosed in the embodiments of the storage medium and the apparatus of the present application, please refer to the description of the method embodiments of the present application. The processor may be at least one of an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a digital signal processor (Digital Signal Processor, DSP), a digital signal processing device (Digital Signal Processing Device, DSPD), a programmable logic device (Programmable Logic Device, PLD), a field programmable gate array (Field Programmable Gate Array, FPGA), a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, and a microprocessor. It will be appreciated that other electronic devices may also implement the above-mentioned processor function, and the embodiments of the present application are not specifically limited in this respect.

The computer storage medium/memory may be a Read Only Memory (ROM), a programmable read-only memory (Programmable Read-Only Memory, PROM), an erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), an electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), a ferromagnetic random access memory (Ferromagnetic Random Access Memory, FRAM), a flash memory (Flash Memory), a magnetic surface memory, an optical disk, or a compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM); it may also be any of various terminals, such as mobile phones, computers, tablet devices, and personal digital assistants, that include one or any combination of the above-mentioned memories.

It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present application. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in various embodiments of the present application, the sequence numbers of the steps/processes described above do not imply an order of execution; the execution order of each step/process should be determined by its function and internal logic, and the sequence numbers should not constitute any limitation on the implementation process of the embodiments of the present application. The foregoing embodiment numbers of the present application are merely for description and do not represent advantages or disadvantages of the embodiments. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division of the units is only a division by logical function, and other divisions are possible in practice: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communicative connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communicative connection between devices or units may be electrical, mechanical, or in other forms.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.

Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by program instructions directing related hardware. The foregoing program may be stored in a computer readable storage medium, and when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes: a mobile storage device, a Read Only Memory (ROM), a magnetic disk or an optical disk, or other media that can store program codes.

Alternatively, the integrated units described above may be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the related art, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program codes, such as a removable storage device, a ROM, a magnetic disk, or an optical disk.

The foregoing is merely an embodiment of the present application, but the protection scope of the present application is not limited thereto. Any changes or substitutions that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application are intended to be covered by the protection scope of the present application.

Claims

1. A bridge inspection method based on an unmanned aerial vehicle, characterized in that it is applied to a computer device communicatively connected to the unmanned aerial vehicle, the method comprising:

fusing the first texture description knowledge, the first color description knowledge and the first image block description knowledge corresponding to the first image block on an intermediate layer dimension to obtain first fused knowledge; acquiring a target dimension corresponding to a target bridge anomaly identification network;

performing anomaly recognition on the plurality of first anomaly analysis description knowledge through a target bridge anomaly recognition network to determine an anomaly recognition result corresponding to the bridge aerial image set to be recognized;

the determining, according to the first image block and a preset first image data set, a first description knowledge set corresponding to the first image block includes:

determining a first description knowledge set corresponding to the first image block through first image feature data corresponding to the first image block;

the determining, by the first image feature data corresponding to the first image block, a first description knowledge set corresponding to the first image block includes:

for each piece of first image feature data in the first image feature data corresponding to the first image block, determining corresponding first description knowledge according to the first image feature data, a target feature coding network and a target first classification mapping network to obtain a first description knowledge set corresponding to the first image block;

the determining the corresponding first description knowledge according to the first image feature data, the target feature encoding network and the target first classification mapping network includes:

2. The method according to claim 1, wherein the method further comprises:

3. The method according to claim 2, wherein the performing a masking operation on each image feature data to obtain a masking set corresponding to the image feature data includes:

acquiring a preset shielding duty ratio;

4. The method of claim 3, wherein the debugging the original neural network to obtain a target neural network according to the texture inference error value, the color inference error value, and the image block inference error value comprises:

5. A bridge inspection system comprising a computer device and at least one unmanned aerial vehicle communicatively connected to the computer device, the computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, the processor executing the program to perform the steps of the method of any one of claims 1 to 4.