WO2022047736A1 - A damage detection method based on a convolutional neural network - Google Patents

A damage detection method based on a convolutional neural network Download PDF

Info

Publication number
WO2022047736A1
WO2022047736A1 (PCT/CN2020/113533)
Authority
WO
WIPO (PCT)
Prior art keywords
damage
image
dual
cnn
detection method
Prior art date
Application number
PCT/CN2020/113533
Other languages
English (en)
French (fr)
Inventor
瓦尔·阿波得莫姆·阿波得莫姆 阿塔贝
默罕默德 努里
洪卫星
Original Assignee
江苏前沿交通研究院有限公司
南京智行信息科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 江苏前沿交通研究院有限公司, 南京智行信息科技有限公司 filed Critical 江苏前沿交通研究院有限公司
Priority to PCT/CN2020/113533 priority Critical patent/WO2022047736A1/zh
Publication of WO2022047736A1 publication Critical patent/WO2022047736A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition

Definitions

  • the invention relates to the field of neural networks, in particular to a damage detection method based on a convolutional neural network.
  • R-CNN: Region-based convolutional neural network
  • Fast R-CNN: Fast Region-based convolutional neural network; Faster R-CNN: Faster Region-based convolutional neural network
  • R-CNN uses selective search to extract a set of regions from a given image, and then checks whether any box contains an object. These regions are first extracted, and for each region, a CNN is used to extract specific features. Finally, these features are used to detect objects. Unfortunately, because of the multiple steps involved in this process, R-CNN becomes quite slow.
  • Fast R-CNN passes the entire image to a convolutional network, which generates ROIs (Regions of interest) instead of passing extracted regions from the image. Also, instead of using three different models (as we saw in R-CNN), it uses a single model that extracts features from regions, classifies them into different types, and returns bounding boxes. All these steps are done simultaneously, so it performs faster compared to R-CNN. However, because Fast R-CNN also uses selective search to extract regions, it is not fast enough when applied to large datasets.
  • RPN: Region Proposal Network
  • the present invention proposes a new method for damage detection based on the R-CNN family structure.
  • The algorithm based on the dual/multi-region faster convolutional neural network is named the Faster Dual/Multi Region-based Convolutional Neural Network (Faster D/M-R-CNN) algorithm.
  • the present invention provides a damage detection method based on a convolutional neural network, the method comprising:
  • Step 1-1 Input the image to be inspected into the faster D/M-R-CNN model
  • In step 1-2, the faster D/M-R-CNN model processes the image to be inspected and outputs a final damage image and a confidence score, where the confidence score refers to the likelihood that the damage in the final damage image is the expected damage;
  • the faster D/M-R-CNN model includes:
  • a deep CNN for generating a feature map of the image to be inspected
  • a Dual/Multi Region Proposal Network (D/M-RPN) model, including two or more region proposal network (RPN) models, for generating two or more damage proposals for each candidate damage in the image to be inspected, comparing the two or more damage proposals to obtain a confidence level, classifying and regressing all the obtained damage proposals, and outputting the final damage image and the confidence score; wherein the confidence level refers to the likelihood that the two or more damage proposals are the expected damage.
  • D/M-RPN: Dual/Multi Region Proposal Network
  • the deep CNN generates the feature maps of different scales.
  • each of the two or more damage proposals includes a bounding box (bbox) for representing the damage detected by the D/M-RPN model.
  • bbox: bounding box
  • The D/M-RPN model includes a dual/multi regions of interest (D/M-ROI) pooling layer, and the D/M-ROI pooling layer includes two or more region-of-interest (ROI) pooling layers, used to generate the two or more damage proposals for each candidate damage in the image to be inspected and to compare the two or more damage proposals to obtain the confidence level.
  • D/M-ROI: dual/multi regions of interest
  • the D/M-ROI pooling layer is one of a max pooling layer and an average pooling layer.
  • the D/M-RPN model further includes a fully connected (Fully connected, FC) layer, and the FC layer is used to classify and regress the bbox.
  • FC: Fully connected
  • the present invention also provides a damage detection method based on a convolutional neural network, the method comprising:
  • Step 2-1 Input the image to be inspected into the deep CNN to obtain the feature map of the image to be inspected;
  • Step 2-2: input the feature maps into the D/M-ROI pooling layer in the D/M-RPN model, wherein the D/M-RPN model includes two or more RPN models, and the D/M-ROI pooling layer includes two or more region-of-interest (ROI) pooling layers;
  • Step 2-3: the D/M-ROI pooling layer generates two or more damage proposals for each candidate damage in the image to be inspected and then compares the two or more damage proposals to create a confidence level that the damage detected in the bounding box (bbox) is the expected damage, wherein the confidence level refers to the likelihood that the two or more damage proposals are the expected damage;
  • Step 2-4: input the damage proposals into the fully connected (FC) layer of the D/M-RPN model to classify and regress the bbox;
  • Step 2-5: obtain the final damage image according to the results of the classification and regression, calculate a confidence score, and output a result including the classification result and the confidence score, wherein the confidence score refers to the likelihood that the damage in the final damage image is the expected damage.
  • The deep CNN and the D/M-RPN model constitute the faster D/M-R-CNN model, and the method further includes the step of training the faster D/M-R-CNN model, wherein the training step includes:
  • Step 3-1: obtain the source images for training, where the source images are an image sequence of a single object;
  • Step 3-2: enhance and label the source images;
  • Step 3-3: select the weights;
  • Step 3-4: design and train the faster D/M-R-CNN model.
  • Step 4-1 initialize the faster D/M-R-CNN model
  • Step 4-2 training the deep CNN and the D/M-RPN model, after the training is completed, the two form the first model
  • Step 4-3 using the first model obtained in step 4-2 to generate a damage suggestion
  • Step 4-4 using the damage suggestion obtained in the step 4-3 to train a classifier
  • Step 4-5 re-initialize the faster D/M-R-CNN model using the parameters of the first model obtained in step 4-2 to obtain a second model;
  • Step 4-6 using the weight of the second model to retrain the D/M-RPN model
  • Steps 4-7 using the second model to generate a damage suggestion
  • Step 4-8 train the classifier using the damage recommendations obtained in the step 4-7.
  • the deep CNN is trained separately first, and after the training is completed, the deep CNN is fixed and the D/M-RPN model is trained.
  • In step 4-6, when training the D/M-RPN model, the deep CNN is fixed.
  • In steps 4-4 and 4-8, a damage image sequence is extracted from the source images according to the damage proposals and used for training the classifier.
  • A Support Vector Machine (SVM) is attached behind each of the two or more CNNs; the SVM is used only during training and is removed after training is completed.
  • SVM: Support Vector Machine
  • The final prediction score calculation process when training the classifier includes:
  • Step 5-1 calculate the P tensor
  • Step 5-2 calculate the E tensor
  • Step 5-3 calculate the V tensor
  • Step 5-4 calculate the Φ vector
  • Step 5-5 calculate the prediction score S
  • the P tensor represents the damage features output by the SVM of each of the N CNNs;
  • the E tensor represents the size estimation tensor of the source image;
  • the V tensor represents the velocity tensor of the source image;
  • the Φ vector is the fused vector of all the P tensors.
  • the P tensor is represented as follows:
  • c_(i,j) is the probability of class (i,j)
  • nc is the number of classes
  • n is the number of source images used for training, so that each image in the image sequence has its own P tensor
  • the combined P tensor of the image sequence is:
  • The average of the size estimates of the image sequence is calculated, and all classes e containing the average size are checked against the size lookup table; the matching elements are set to 1 and the other elements are set to 0, thus obtaining the E tensor:
  • In step 5-3, all categories containing the provided velocity v are checked against the velocity lookup table; the matching elements are set to 1 and the others to 0, giving the V tensor:
  • the Φ vector is:
  • the predicted score S is:
  • m represents the average value of S_(i,j).
  • The algorithm is fast and highly accurate; it does not have to follow the traditional method of the other networks in the R-CNN family, namely reducing overfitting and improving detection accuracy by adding more images to the database.
  • Faster D/M-R-CNN has high precision and recall and extracts all target (damage) features from images in real time at high speed, which is very important for accurate damage detection from acquired images and improves the ability of previous damage detection systems to achieve real-time detection.
  • mAP: mean average precision
  • The invention will lay a foundation for applying a new generation of deep learning technology in structural damage detection systems and for remedying the defects of existing deep-learning-based structural damage detection systems.
  • Figure 1 is a schematic diagram describing the structure and function of Faster D/M-R-CNN.
  • Figure 2 is a comparison between the R-CNN family of algorithms and the faster D/M-R-CNN.
  • Figure 3 is the overall flow chart of faster D/M-R-CNN training and application.
  • Figure 4 is a flow chart of the faster D/M-R-CNN training process.
  • Figure 5 is an illustration of the training of the classifier in Faster D/M-R-CNN.
  • FIG. 6 is an explanatory diagram of the operation of D/M-CNN in an embodiment of the present application.
  • FIG. 7 is an explanatory diagram of the max pooling operation in an embodiment of the present application.
  • FIG. 8 is a connection between the D/M-CNN layer and the D/M-Sub-Sampling layer in an embodiment of the present application.
  • Figure 9 is a comparison diagram of this algorithm and the faster R-CNN algorithm.
  • Fig. 1 shows the flow chart of the damage detection method based on a convolutional neural network provided by the present invention, in which the image to be inspected 10 is input into the faster D/M-R-CNN model 20,
  • the faster D/M-R-CNN model 20 then processes the image to be inspected and outputs the final damage image 30;
  • a confidence score may also be output, and the confidence score refers to the likelihood that the damage in the final damage image 30 is the expected damage.
  • the faster D/M-R-CNN model 20 used is an algorithm proposed based on the R-CNN family structure.
  • The faster D/M-R-CNN model includes a deep CNN 21 and a Dual/Multi Region Proposal Network (D/M-RPN) model 22, where the deep CNN 21 is used to generate the feature maps of the image to be inspected; the D/M-RPN model 22 includes two or more region proposal network (RPN) models and is used to generate two or more damage proposals for each candidate damage in the image to be inspected 10, to compare the two or more damage proposals to obtain a confidence level (the confidence level refers to the likelihood that the two or more damage proposals are the expected damage), and to classify and regress all the obtained damage proposals, outputting the final damage image 30 together with the confidence score.
  • D/M-RPN: Dual/Multi Region Proposal Network
  • The D/M-RPN model takes the image to be examined 10 as input and outputs a set of object proposals, including the probability that each proposal is the target damage.
  • The D/M-RPN model uses a deep CNN (Deep-CNN) to extract features from the image (the last layer of the Deep-CNN serves as the output) and slides another convolutional layer over the image.
  • the convolutional layer is followed by a Rectified Linear Unit (RELU) activation function, which provides nonlinearity and improves convergence speed.
  • RELU: Rectified Linear Unit
  • The feature map, followed by RELU, maps the features of each window into a vector, which is fed to the regression and classification layers; these then predict the coordinates of multiple bounding boxes and the probability of an object in each box, respectively.
  • each corresponding feature map (Conv) is associated with nine rectangular boxes called anchors.
  • The feature map is followed by RELU and is fed to the FC layer.
  • Using the vector and initial weights, two outputs are computed for each generated box: the probabilities that it contains the object or is merely part of the background (no object).
  • The objectness probability computed for each bounding box is between 0 and 1 and is updated during training to minimize its difference from 1 or 0 for positive or negative anchors, respectively.
  • D/M-RPN is trained end-to-end for both classification and regression layers.
  • Anchor points are regions in the input image between target objects.
  • The faster D/M-R-CNN algorithm can be applied to provide fast and accurate damage detection and classification in various structural images in real time, and can be used in damage identification systems for various structures (such as bridges, high-rise buildings, dams, pipelines, storage tanks, etc.), traffic control systems, and transportation systems. It should be understood that the faster D/M-R-CNN algorithm can also be used for image analysis and processing in industries such as smart cities, traffic control, and transportation systems.
  • The faster D/M-R-CNN algorithm does not have to follow the traditional method of the other networks in the R-CNN family, namely adding more images to the database to reduce overfitting and improve detection accuracy, to achieve short runtime and high detection accuracy.
  • Figure 2 shows the comparison of the faster D/M-R-CNN algorithm of the present application with other algorithms of the R-CNN family.
  • The CNN algorithm divides the image into multiple regions and then classifies each region into different classes; however, the algorithm requires a large number of regions for accurate prediction, so the computation time is very long.
  • The R-CNN algorithm uses selective search to generate regions, extracting about 2000 regions from each image; however, since each region is passed to the CNN separately, the computation time is very long; in addition, the algorithm uses three different models to make predictions.
  • In the Fast R-CNN algorithm, each image is passed to the CNN only once and feature maps are extracted, and selective search is used on these maps to generate predictions.
  • The algorithm combines the three models used in R-CNN, but it is still based on selective search, which is slow; therefore, the computation time is still long.
  • The faster R-CNN algorithm uses a Region Proposal Network (RPN) instead of the selective search method, which improves the algorithm's speed; however, in this algorithm the object proposal takes time, and since different subsystems work in succession, the performance of the system depends on the performance of the preceding subsystem.
  • In the faster D/M-R-CNN algorithm of the present application, two or more region proposal networks (i.e., a dual/multi-region proposal network, D/M-RPN) are applied to make object (damage) proposals for each candidate target in the same image and compare these proposals to obtain the expected target, giving the algorithm higher accuracy and speed.
  • D/M-RPN: two or more region proposal networks
  • Figure 9 shows the difference between the faster D/M-R-CNN algorithm of the present application and the faster R-CNN algorithm and the improvement of the effect.
  • Faster R-CNN adopts a single RPN network,
  • whereas the faster D/M-R-CNN algorithm can adopt a dual region proposal network, i.e., two region proposal networks (D-RPN), to make dual target (damage) proposals for each candidate object in the same image and compare the two proposals to obtain the expected object.
  • D-RPN: dual region proposal network
  • In Fig. 9, the faster D/M-R-CNN algorithm uses two RPNs. It should be understood that in practical applications more RPNs can be used; to obtain better results, the number of RPNs needs to be optimized.
  • The faster D/M-R-CNN algorithm can receive the input image and generate convolutional multi-feature maps of different scales; the generated convolutional feature maps are processed by the dual/multi-region proposal network D/M-RPN, which generates two or more proposals (i.e., dual/multi proposals) for each candidate object (damage) in the image and creates two or more region proposal bounding boxes (dual/multi region proposal bounding boxes); the dual/multi bounding boxes are projected back onto the feature maps of the individual convolutional layers, resulting in a set of dual/multi regions of interest (D/M-ROIs); the output of this process is a dual/multi stack of proposals for different regions of the same input image, and by comparing them, a confidence level is created representing the likelihood of detecting the expected object (damage) within the bounding box, so that the expected object (damage) is detected in just one step.
  • D/M-ROIs: dual/multi regions of interest
  • The deep CNN 21 can receive the input image to be inspected and generate convolutional multi-feature maps of different scales, which can be done in the manner of the prior art.
  • The dual/multi-region proposal network D/M-RPN model 22 includes a dual/multi region-of-interest D/M-ROI pooling layer 23 and a fully connected FC layer 25.
  • The D/M-ROI pooling layer 23 includes two or more region-of-interest ROI pooling layers; as shown in FIG. 1, the number of pooling layers is A, where A is greater than or equal to 2, and within a pooling layer a fully connected FC layer can also be used.
  • The D/M-ROI pooling layer 23 is used to generate two or more damage proposals for each candidate damage in the image under inspection and to compare the two or more damage proposals to obtain the confidence level.
  • The D/M-ROI pooling layer 23 can be set as a max pooling layer or an average pooling layer.
  • Each of the damage proposals includes a bounding box bbox 24 representing the detected damage.
  • A fully connected FC layer is used for the classification and regression of the bounding box bbox 24.
  • the method for damage detection using the faster D/M-R-CNN algorithm includes the following steps:
  • Step 1: input the image to be inspected 10 into the deep CNN 21 to obtain the feature maps of the image to be inspected 10;
  • Step 2: input the obtained feature maps into the dual/multi region-of-interest D/M-ROI pooling layer 23 in the dual/multi-region proposal network D/M-RPN model 22, where the D/M-RPN model 22 includes two or more region proposal RPN network models and the D/M-ROI pooling layer 23 includes two or more region-of-interest ROI pooling layers;
  • Step 3: the D/M-ROI pooling layer 23 generates two or more damage proposals for each candidate damage in the image to be inspected 10 and then compares these two or more damage proposals to create a confidence level that the damage detected in the bounding box bbox 24 is the expected damage, where the confidence level is the likelihood that the two or more damage proposals are the expected damage;
  • Step 4: input the damage proposals into the fully connected FC layer 25 of the dual/multi-region proposal network D/M-RPN model 22 to classify and regress the bounding box bbox 24;
  • Step 5: according to the results of classification and regression, obtain the final damage image 30, calculate the confidence score, and output the result including the classification result and the confidence score, where the confidence score refers to the likelihood that the damage in the final damage image is the expected damage.
  • Step 3-1: obtain the source images for training;
  • Step 3-2: enhance and label the images;
  • Step 3-3: select the weights;
  • Step 3-4: design and train the faster D/M-R-CNN model;
  • Step 3-5: compare the error between the output of the algorithm and the target and judge whether the error is within the acceptable range; if so, continue to the next step; if not, return to step 3-3;
  • Step 3-6: use the deep CNN to generate convolutional feature maps;
  • Step 3-7: generate proposals based on the convolutional feature maps;
  • Step 3-8: classify and score the proposed objects (damage);
  • Step 3-9: output images with classifications and/or scores.
  • Step 4-1 Initialize the faster D/M-R-CNN model
  • Step 4-2: train the deep CNN and the dual/multi-region proposal network D/M-RPN model; after the training is completed, the two form the first model, which includes the deep CNN and the dual/multi-region proposal network D/M-RPN model; the combination of the two is called D/M-CRPN(1);
  • Step 4-3 using the first model D/M-CRPN(1) obtained in step 4-2 to generate a damage suggestion
  • Step 4-4 use the damage suggestion obtained in the step 4-3 to train the classifier (FC25 in Figure 1);
  • Step 4-5: re-initialize the faster D/M-R-CNN model using the first model D/M-CRPN(1) and retrain the faster D/M-R-CNN model with the damage proposals obtained in step 4-3 to obtain the second model D/M-CRPN(2);
  • Steps 4-6 using the weights of the second model D/M-CRPN(2) to retrain the dual/multi-region proposal network D/M-RPN model;
  • Steps 4-7 use the second model D/M-CRPN(2) to generate a new damage proposal
  • Step 4-8 train the classifier using the damage recommendations obtained in the step 4-7.
  • In step 4-2, the training of the deep CNN and the D/M-RPN can be carried out separately: the deep CNN is trained on its own first; after its training is completed, the deep CNN is fixed and the D/M-RPN is trained.
  • An image sequence is extracted from an image of a single object (i.e., a sequence of temporally consecutive frames of damage type), which is fed to a D/M-CNN to extract image features.
  • D/M-SVM: dual/multi support vector machine
  • The SVM outputs of the CNNs are compared with one another to collect all damage features in the image with high accuracy, represented as P tensors as follows:
  • c_(i,j) is the probability of class (i,j)
  • nc is the number of classes
  • n is the number of training example images, so each image in any given image sequence has a P tensor.
  • The P tensor represents the result of the SVM and consists of sets of vectors representing attribution probabilities.
  • The E tensor is composed as follows: compute the average of the image sequence's size estimates and check against the size lookup table all classes e containing the average size; the matching elements are set to 1 and the others to 0, resulting in the E tensor.
  • The E tensor represents the size estimate.
  • When the target moves, its velocity is encoded in a similar way as the V tensor. The velocities for object damage types are constructed similarly to the E tensor in the size estimation, i.e., all categories containing the provided velocity v are checked against the velocity lookup table; those elements are set to 1 and the others to 0.
  • the final classification is achieved by a fusion between the provided parameters and the predicted values of the image classifier.
  • the combined P tensor for a sequence of images is:
  • n is the number of images in each sequence
  • and the fused vector Φ is:
  • FIG. 6 illustrates the dual/multi convolution and pooling processing. In the dual/multi convolution operation, the input data consist of a 7×7×3 dataset, where 7×7 represents the width and height in pixels and 3 represents the R, G, B color channels.
  • The stride is 2, which means the window extracts 3×3 local data and moves two steps at a time.
  • Zero padding = 1.
  • the filter bank is convolved with different local data covered by the window.
  • The dual/multi convolution operations are computed with the two filter banks respectively, giving the two sets of results of the dual convolution operation and the multi convolution operation.
  • D/M filter: a set of neurons with fixed weights
  • stride: the step with which the window covers the data
  • zero padding: appending a few zeros so that the window can travel from its initial position to the end of the dataset
  • One embodiment, shown in Figure 7, is a max pooling operation, which means getting the maximum value of a particular data window region.
  • The other pooling method in the faster D/M-R-CNN algorithm is average pooling, which takes the average of a specific data window region.
  • CNN generally consists of alternating convolution operations and subsampling operations, and the last layer is represented as a general multi-layer network. Setting up convolutional layers between subsampling layers improves computational efficiency and further improves structural invariance and spatial invariance.
  • C(1,j) is a D/M-CNN layer, and each CNN layer consists of six feature maps. Through the convolution operation, the characteristics of the original signal can be enhanced and the influence of noise can be reduced.
  • Each neuron of the feature map is connected to a 16 ⁇ 16 neighborhood of the input image.
  • the feature map size is 196 ⁇ 196.
  • S(2,j) is a D/M-sub-sampling layer. According to the local correlation principle of images, sub-sampling can be applied to the image, which reduces the amount of data to be processed while preserving useful information.
  • The 16 inputs of each unit of S(2,j) are summed and multiplied by a tuning parameter with a tuning bias. The result can be calculated using a sigmoid function. The tuning parameter and tuning bias control the nonlinearity of the sigmoid function; if these parameters are relatively small, the operation is similar to a linear operation.
  • each subsampling is equivalent to blurring the image.
  • each sub-sample can be viewed as a noisy "or” or "and” operation.
  • the 8 ⁇ 8 receptive fields of each unit do not overlap, so the size of each feature map in S(2,j) is 1/4 of C(1,j).

Abstract

A damage detection method based on a convolutional neural network. The method can receive an input image and generate convolutional multi-feature maps of different scales; the generated convolutional feature maps are processed by a dual/multi-region proposal network to generate dual/multi damage proposals for each candidate damage in the image and create dual/multi region proposal bounding boxes; the dual/multi bounding boxes are projected back onto the feature maps of the individual convolutional layers to obtain a set of dual/multi regions of interest; by comparing them, a confidence score is created that represents the likelihood that the expected damage has been detected within the bounding box, so that the expected damage can be detected in just one step. The beneficial effects of the present application are: short runtime and high precision and recall; increasing the dataset size and the number of convolutional layers can improve the speed and accuracy of the model, achieving a mean average precision (mAP) of up to about 98% to 99%.

Description

A damage detection method based on a convolutional neural network
Technical Field
The present invention relates to the field of neural networks, and in particular to a damage detection method based on a convolutional neural network.
Background Art
If damage occurs in a structure, there are several important signs that indicate structural degradation and may even herald the onset of serious failure. Attempts at image-based damage detection have been made in the research community as a possible replacement for manual inspection.
Among the many damage detection methods, deep-learning-based damage detection methods have been actively explored in recent years.
The following is a quick summary of the different algorithms in the Region-based Convolutional Neural Network (R-CNN) family: R-CNN, Fast R-CNN, and Faster R-CNN.
R-CNN uses selective search to extract a set of regions from a given image and then checks whether any box contains an object. These regions are first extracted, and for each region a CNN is used to extract specific features. Finally, these features are used to detect objects. Unfortunately, because multiple steps are involved in this process, R-CNN is quite slow.
Fast R-CNN passes the entire image to a convolutional network, which generates regions of interest (ROIs), instead of passing regions extracted from the image. In addition, instead of using three different models (as seen in R-CNN), it uses a single model that extracts features from the regions, classifies them into different types, and returns the bounding boxes. All these steps are performed simultaneously, so it runs faster than R-CNN. However, because Fast R-CNN still uses selective search to extract regions, it is not fast enough when applied to large datasets.
Faster R-CNN solves the problem of selective search by replacing it with a Region Proposal Network (RPN). A convolutional network is first used to extract feature maps from the input image; these feature maps are then passed through the RPN, which returns object proposals. Finally, the feature maps are classified and the bounding boxes are predicted.
However, how to extract damage automatically at the pixel level quickly and accurately, i.e., real-time damage description (including detection and segmentation), remains a challenging problem.
Summary of the Invention
To overcome the problems of the prior art, the present invention proposes a new method for damage detection based on the R-CNN family structure: an algorithm based on a dual/multi-region faster convolutional neural network for high-precision, real-time target (damage) detection and classification. In the present invention, the algorithm based on the dual/multi-region faster convolutional neural network is named the Faster Dual/Multi Region-based Convolutional Neural Network (Faster D/M-R-CNN) algorithm.
To achieve the above object, the present invention provides a damage detection method based on a convolutional neural network, the method comprising:
Step 1-1: inputting the image to be inspected into the faster D/M-R-CNN model;
Step 1-2: the faster D/M-R-CNN model processes the image to be inspected and outputs a final damage image and a confidence score, where the confidence score refers to the likelihood that the damage in the final damage image is the expected damage;
wherein the faster D/M-R-CNN model includes:
a deep CNN for generating feature maps of the image to be inspected;
a Dual/Multi Region Proposal Network (D/M-RPN) model, including two or more region proposal network (RPN) models, for generating two or more damage proposals for each candidate damage in the image to be inspected, comparing the two or more damage proposals to obtain a confidence level, classifying and regressing all the obtained damage proposals, and outputting the final damage image and the confidence score; where the confidence level refers to the likelihood that the two or more damage proposals are the expected damage.
Further, the deep CNN generates the feature maps at different scales.
Further, each of the two or more damage proposals includes a bounding box (bbox) for representing the damage detected by the D/M-RPN model.
Further, the D/M-RPN model includes a dual/multi regions of interest (D/M-ROI) pooling layer, and the D/M-ROI pooling layer includes two or more region-of-interest (ROI) pooling layers for generating the two or more damage proposals for each candidate damage in the image to be inspected and comparing the two or more damage proposals to obtain the confidence level.
Further, the D/M-ROI pooling layer is one of a max pooling layer and an average pooling layer.
Further, the D/M-RPN model also includes a fully connected (FC) layer, and the FC layer is used to classify and regress the bbox.
The present invention also provides a damage detection method based on a convolutional neural network, the method comprising:
Step 2-1: inputting the image to be inspected into a deep CNN to obtain feature maps of the image to be inspected;
Step 2-2: inputting the feature maps into the D/M-ROI pooling layer in a D/M-RPN model, where the D/M-RPN model includes two or more RPN models and the D/M-ROI pooling layer includes two or more region-of-interest (ROI) pooling layers;
Step 2-3: the D/M-ROI pooling layer generates two or more damage proposals for each candidate damage in the image to be inspected and then compares the two or more damage proposals to create a confidence level that the damage detected in the bounding box (bbox) is the expected damage, where the confidence level refers to the likelihood that the two or more damage proposals are the expected damage;
Step 2-4: inputting the damage proposals into the fully connected (FC) layer of the D/M-RPN model to classify and regress the bbox;
Step 2-5: obtaining the final damage image according to the results of the classification and regression, calculating a confidence score, and outputting a result including the classification result and the confidence score, where the confidence score refers to the likelihood that the damage in the final damage image is the expected damage.
Further, the deep CNN and the D/M-RPN model constitute the faster D/M-R-CNN model, and the method further includes the step of training the faster D/M-R-CNN model, where the training step includes:
Step 3-1: obtaining the source images for training, the source images being an image sequence of a single object;
Step 3-2: enhancing and labeling the source images;
Step 3-3: selecting the weights;
Step 3-4: designing and training the faster D/M-R-CNN model.
Further, steps 3-3 and 3-4 further include the following steps:
Step 4-1: initializing the faster D/M-R-CNN model;
Step 4-2: training the deep CNN and the D/M-RPN model; after the training is completed, the two form the first model;
Step 4-3: using the first model obtained in step 4-2 to generate damage proposals;
Step 4-4: using the damage proposals obtained in step 4-3 to train the classifier;
Step 4-5: re-initializing the faster D/M-R-CNN model using the parameters of the first model obtained in step 4-2 to obtain a second model;
Step 4-6: using the weights of the second model to retrain the D/M-RPN model;
Step 4-7: using the second model to generate damage proposals;
Step 4-8: using the damage proposals obtained in step 4-7 to train the classifier.
Further, in step 4-2, the deep CNN is first trained on its own; after its training is completed, the deep CNN is fixed and the D/M-RPN model is trained.
Further, in step 4-6, the deep CNN is fixed when training the D/M-RPN model.
Further, in steps 4-4 and 4-8, a damage image sequence is extracted from the source images according to the damage proposals and used for training the classifier.
Further, in steps 4-4 and 4-8, when training the classifier, a Support Vector Machine (SVM) is attached behind each of the two or more CNNs; the SVM is used only during training and is removed after training is completed.
Further, in steps 4-4 and 4-8, when training the classifier, the final prediction score calculation process includes:
Step 5-1: calculating the P tensor;
Step 5-2: calculating the E tensor;
Step 5-3: calculating the V tensor;
Step 5-4: calculating the Φ vector;
Step 5-5: calculating the prediction score S;
where the P tensor represents the damage features output by the SVM of each of the N CNNs; the E tensor represents the size estimation tensor of the source images; the V tensor represents the velocity tensor of the source images; and the Φ vector is the fused vector of all the P tensors.
Further, in step 5-1, the P tensor is expressed as follows:
[Equation (1), defining the P tensor, is rendered as an image in the original publication]
where c_(i,j) is the probability of class (i,j), nc is the number of classes, and n is the number of source images used for training, so that each image in the image sequence has its own P tensor;
the combined P tensor of the image sequence is:
[Equation (2), the combined P tensor, is rendered as an image in the original publication]
Further, in step 5-2, the average of the size estimates of the image sequence is calculated, and all classes e containing the average size are checked against the size lookup table; the matching elements are set to 1 and the other elements are set to 0, giving the E tensor:
[Equation (3), the E tensor, is rendered as an image in the original publication]
where:
[Equation (4) is rendered as an image in the original publication]
Further, in step 5-3, all categories containing the provided velocity v are checked against the velocity lookup table; the matching elements are set to 1 and the others to 0, giving the V tensor:
[Equation (5), the V tensor, is rendered as an image in the original publication]
where:
[Equation (6) is rendered as an image in the original publication]
Further, in step 5-4, the Φ vector is:
Φ_(i,j) = P_(i,j) .* V_(i,j) .* E_(i,j)  (7)
where (.*) denotes element-wise multiplication.
Further, in step 5-5, the prediction score S is:
S_(i,j) = max_m Φ_(i,j)  (8)
m = arg max_m Φ_(i,j)  (9)
where m represents the average value of S_(i,j).
The present invention has the following technical effects:
The algorithm is fast and highly accurate; it does not have to follow the traditional method of the other networks in the R-CNN family, namely reducing overfitting and improving detection accuracy by adding more images to the database.
Faster D/M-R-CNN has high precision and recall and extracts all target (damage) features from images in real time at high speed, which is very important for accurate damage detection from the acquired images and improves the ability of previous damage detection systems to achieve real-time detection.
Specifically, increasing the dataset size and the number of convolutional layers can improve the speed and accuracy of the model and will achieve a mean average precision (mAP) of up to about 98% to 99%.
The present invention will lay a foundation for applying a new generation of deep learning technology in structural damage detection systems and for remedying the defects of existing deep-learning-based structural damage detection systems.
Brief Description of the Drawings
Figure 1 is a schematic diagram describing the structure and function of the faster D/M-R-CNN.
Figure 2 is a comparison between the R-CNN family of algorithms and the faster D/M-R-CNN.
Figure 3 is the overall flow chart of faster D/M-R-CNN training and application.
Figure 4 is a flow chart of the faster D/M-R-CNN training process.
Figure 5 is an illustration of the training of the classifier in the faster D/M-R-CNN.
Figure 6 is an explanatory diagram of the D/M-CNN operation in an embodiment of the present application.
Figure 7 is an explanatory diagram of the max pooling operation in an embodiment of the present application.
Figure 8 shows the connection between the D/M-CNN layer and the D/M-Sub-Sampling layer in an embodiment of the present application.
Figure 9 is a comparison of this algorithm with the faster R-CNN algorithm.
Detailed Description of the Embodiments
Preferred embodiments of the present application are described below with reference to the accompanying drawings to make its technical content clearer and easier to understand. The present application can be embodied in many different forms, and its scope of protection is not limited to the embodiments mentioned herein.
The concept, specific structure, and technical effects of the present invention are further described below so that the objects, features, and effects of the present invention can be fully understood; however, the protection of the present invention is not limited thereto.
Figure 1 shows the flow chart of the damage detection method based on a convolutional neural network provided by the present invention, which includes:
The image to be inspected 10 is input into the faster dual/multi-region convolutional neural network (faster D/M-R-CNN) model 20; the faster D/M-R-CNN model 20 then processes the image to be inspected and outputs the final damage image 30. At the same time, a confidence score may also be output; the confidence score refers to the likelihood that the damage in the final damage image 30 is the expected damage. The faster D/M-R-CNN model 20 used here is an algorithm proposed on the basis of the R-CNN family structure.
Referring to Figure 1, the faster D/M-R-CNN model includes a deep CNN 21 and a Dual/Multi Region Proposal Network (D/M-RPN) model 22. The deep CNN 21 is used to generate the feature maps of the image to be inspected. The D/M-RPN model 22 includes two or more region proposal network (RPN) models and is used to generate two or more damage proposals for each candidate damage in the image to be inspected 10, to compare the two or more damage proposals to obtain a confidence level (the confidence level refers to the likelihood that the two or more damage proposals are the expected damage), and to classify and regress all the obtained damage proposals, outputting the final damage image 30 and the confidence score. The D/M-RPN model takes the image to be inspected 10 as input and outputs a set of object proposals, including the probability that each proposal is the target damage. The D/M-RPN model uses a deep CNN (Deep-CNN) to extract features from the image (the last layer of the Deep-CNN serves as the output) and slides another convolutional layer over the image. The convolutional layer is followed by a Rectified Linear Unit (RELU) activation function, which provides nonlinearity and improves convergence speed. The feature map, followed by RELU, maps the features of each window into a vector, which is fed to the regression and classification layers; these then predict the coordinates of multiple bounding boxes and the probability of an object in each box, respectively. To generate object proposals, each corresponding feature map (Conv) is associated with nine rectangular boxes called anchors. As shown in Figure 1, the feature map is followed by RELU and fed to the FC layer. Using the vector and initial weights, two outputs are computed for each generated box: the probabilities that it contains the object or is merely part of the background (no object). The objectness probability computed for each bounding box is between 0 and 1 and is updated during training to minimize its difference from 0 or 1 for positive or negative anchors, respectively. The D/M-RPN is trained end-to-end for both the classification and regression layers. Anchors are regions in the input image between the target objects. A minimal sketch of a single RPN branch is given below.
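The RPN mechanics described above map naturally onto a small convolutional head. The following PyTorch sketch is illustrative only: the patent does not specify layer sizes, so the 512 channels and 9 anchors are conventional Faster R-CNN values assumed here, and a dual arrangement simply runs two such heads on the same feature map.

```python
import torch
import torch.nn as nn

class RPNHead(nn.Module):
    """Minimal single-branch RPN head: a conv slid over the backbone feature
    map, followed by objectness and box-regression outputs for k anchors."""

    def __init__(self, in_channels=512, mid_channels=512, num_anchors=9):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, mid_channels, 3, padding=1)
        self.relu = nn.ReLU(inplace=True)                          # RELU nonlinearity
        self.objectness = nn.Conv2d(mid_channels, num_anchors, 1)  # object vs. background
        self.bbox_reg = nn.Conv2d(mid_channels, num_anchors * 4, 1)  # box coordinates

    def forward(self, feature_map):
        h = self.relu(self.conv(feature_map))
        scores = torch.sigmoid(self.objectness(h))  # objectness probability in [0, 1]
        deltas = self.bbox_reg(h)
        return scores, deltas

# A dual-RPN arrangement, as the patent describes, runs two such heads on
# the same backbone output and compares their proposals.
feat = torch.randn(1, 512, 32, 32)       # deep CNN (backbone) feature map
rpn_a, rpn_b = RPNHead(), RPNHead()
scores_a, deltas_a = rpn_a(feat)
scores_b, deltas_b = rpn_b(feat)
print(scores_a.shape, deltas_a.shape)    # (1, 9, 32, 32) (1, 36, 32, 32)
```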
The faster D/M-R-CNN algorithm can be applied to provide fast and accurate damage detection and classification in various structural images in real time and can be used in damage identification systems for various structures (such as bridges, high-rise buildings, dams, pipelines, storage tanks, etc.), traffic control systems, and transportation systems. It should be understood that the faster D/M-R-CNN algorithm can also be used for image analysis and processing in industries such as smart cities, traffic control, and transportation systems.
The faster D/M-R-CNN algorithm does not have to follow the traditional method of the other networks in the R-CNN family, namely adding more images to the database to reduce overfitting and improve detection accuracy, to achieve short runtime and high detection accuracy.
Figure 2 shows the comparison of the faster D/M-R-CNN algorithm of the present application with the other algorithms of the R-CNN family. The CNN algorithm divides the image into multiple regions and then classifies each region into different classes; however, it requires a large number of regions for accurate prediction, so the computation time is very long. The R-CNN algorithm uses selective search to generate regions, extracting about 2000 regions from each image; however, since each region is passed to the CNN separately, the computation time is very long, and in addition the algorithm uses three different models to make predictions. In the Fast R-CNN algorithm, each image is passed to the CNN only once and feature maps are extracted, with selective search used on these maps to generate predictions; this algorithm combines the three models used in R-CNN, but it is still based on selective search, which is slow, so the computation time is still long. The faster R-CNN algorithm uses a Region Proposal Network (RPN) instead of the selective search method, which improves the speed; however, in this algorithm the object proposal takes time, and since different subsystems work in succession, the performance of the system depends on the performance of the preceding subsystem. In the faster D/M-R-CNN algorithm of the present application, two or more region proposal networks (i.e., a dual/multi-region proposal network, D/M-RPN) are applied to make target (damage) proposals for each candidate target (damage) in the same image and compare these proposals to obtain the expected target, giving the algorithm higher accuracy and speed.
Figure 9 shows the difference between the faster D/M-R-CNN algorithm of the present application and the faster R-CNN algorithm, together with the resulting improvement. Faster R-CNN adopts a single RPN network, whereas the faster D/M-R-CNN algorithm can adopt a dual region proposal network, i.e., two region proposal networks (D-RPN), to make dual target (damage) proposals for each candidate object in the same image and compare the two proposals to obtain the expected object. In Figure 9, the faster D/M-R-CNN algorithm uses two RPNs; it should be understood that in practical applications more RPNs can be used, and to obtain better results the number of RPNs needs to be optimized.
The faster D/M-R-CNN algorithm can receive the input image and generate convolutional multi-feature maps of different scales. The generated convolutional feature maps are processed by the dual/multi-region proposal network D/M-RPN, which generates two or more proposals (i.e., dual/multi proposals) for each candidate object (damage) in the image and creates two or more region proposal bounding boxes (dual/multi region proposal bounding boxes). The dual/multi bounding boxes are projected back onto the feature maps of the individual convolutional layers to obtain a set of dual/multi regions of interest (D/M-ROIs). The output of this process is a dual/multi stack of proposals for different regions of the same input image; by comparing them, a confidence level is created that represents the likelihood of detecting the expected object (damage) within the bounding box, so that the expected object (damage) is detected in just one step. A minimal sketch of this comparison step is given below.
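The publication does not spell out how the dual/multi proposal stacks are compared. The following Python sketch shows one plausible reading, assumed purely for illustration: proposals from two RPN branches are matched by intersection-over-union (IoU), and the agreement of their objectness scores is taken as the confidence. The IoU threshold and the product-of-scores fusion are assumptions, not the patent's definitive formulation.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def compare_dual_proposals(props_1, props_2, iou_thresh=0.5):
    """Match damage proposals from two RPN branches and derive a confidence.

    props_1, props_2: lists of (box, score) pairs from RPN 1 and RPN 2.
    Returns (box, confidence) pairs for proposals the branches agree on;
    the confidence here is the product of the branch scores (an assumption).
    """
    agreed = []
    for box_1, score_1 in props_1:
        for box_2, score_2 in props_2:
            if iou(box_1, box_2) >= iou_thresh:
                # Average the matched boxes, fuse the two objectness scores.
                fused_box = [(a + b) / 2.0 for a, b in zip(box_1, box_2)]
                agreed.append((fused_box, score_1 * score_2))
    return agreed

# Toy usage: two branches proposing roughly the same crack region.
rpn1 = [([10, 10, 50, 40], 0.92)]
rpn2 = [([12, 11, 52, 42], 0.88), ([80, 80, 90, 90], 0.30)]
print(compare_dual_proposals(rpn1, rpn2))  # one agreed box, confidence ~0.81
```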
Returning to Figure 1, the deep CNN 21 can receive the input image to be inspected and generate convolutional multi-feature maps of different scales; this can be done in the manner of the prior art. The dual/multi-region proposal network D/M-RPN model 22 includes a dual/multi region-of-interest D/M-ROI pooling layer 23 and a fully connected FC layer 25. The D/M-ROI pooling layer 23 includes two or more region-of-interest ROI pooling layers; as shown in Figure 1, the number of pooling layers is A, where A is greater than or equal to 2, and within a pooling layer a fully connected FC layer can also be used. The D/M-ROI pooling layer 23 is used to generate two or more damage proposals for each candidate damage in the image to be inspected and to compare the two or more damage proposals to obtain the confidence level. The D/M-ROI pooling layer 23 can be set as a max pooling layer or an average pooling layer. Each of the damage proposals includes a bounding box bbox 24 representing the detected damage. The fully connected FC layer is used to classify and regress the bounding box bbox 24.
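As a toy illustration of what each ROI pooling layer inside the D/M-ROI pooling layer 23 does, the following numpy sketch reduces one region of interest to a fixed grid; the 2×2 output size and the toy feature map are arbitrary for the demo, and the average-pooling variant mentioned above is obtained by swapping max for mean.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=2):
    """Max-pool one region of interest on a 2-D feature map to a fixed
    out_size x out_size grid, in the spirit of an ROI pooling layer.
    (For the average-pooling variant, replace cell.max() with cell.mean().)"""
    x1, y1, x2, y2 = roi                       # region in feature-map coordinates
    region = feature_map[y1:y2, x1:x2]
    h_step = region.shape[0] / out_size
    w_step = region.shape[1] / out_size
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            cell = region[int(i * h_step):int((i + 1) * h_step),
                          int(j * w_step):int((j + 1) * w_step)]
            out[i, j] = cell.max()
    return out

fmap = np.arange(64, dtype=float).reshape(8, 8)   # toy feature map
print(roi_max_pool(fmap, (0, 0, 4, 4)))           # [[ 9. 11.] [25. 27.]]
```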
Specifically, the method for damage detection using the faster D/M-R-CNN algorithm includes the following steps:
Step 1: input the image to be inspected 10 into the deep CNN 21 to obtain the feature maps of the image to be inspected 10;
Step 2: input the obtained feature maps into the dual/multi region-of-interest D/M-ROI pooling layer 23 in the dual/multi-region proposal network D/M-RPN model 22, where the D/M-RPN model 22 includes two or more region proposal RPN network models and the D/M-ROI pooling layer 23 includes two or more region-of-interest ROI pooling layers;
Step 3: the D/M-ROI pooling layer 23 generates two or more damage proposals for each candidate damage in the image to be inspected 10 and then compares these two or more damage proposals to create a confidence level that the damage detected in the bounding box bbox 24 is the expected damage, where the confidence level refers to the likelihood that the two or more damage proposals are the expected damage;
Step 4: input the damage proposals into the fully connected FC layer 25 of the dual/multi-region proposal network D/M-RPN model 22 to classify and regress the bounding box bbox 24;
Step 5: according to the results of classification and regression, obtain the final damage image 30, calculate the confidence score, and output the result including the classification result and the confidence score, where the confidence score refers to the likelihood that the damage in the final damage image is the expected damage.
Before the algorithm of the present application can be applied, it must be trained. Figure 3 shows a flow chart containing the training and application steps, which include the following:
Step 3-1: obtain the source images for training;
Step 3-2: enhance and label the images;
Step 3-3: select the weights;
Step 3-4: design and train the faster D/M-R-CNN model;
Step 3-5: compare the error between the output of the algorithm and the target and judge whether the error is within the acceptable range; if so, continue to the next step; if not, return to step 3-3;
Step 3-6: use the deep CNN to generate convolutional feature maps;
Step 3-7: generate proposals based on the convolutional feature maps;
Step 3-8: classify and score the proposed objects (damage);
Step 3-9: output images with classifications and/or scores.
The training process of the faster D/M-R-CNN is shown in Figure 4 and includes the following steps:
Step 4-1: initialize the faster D/M-R-CNN model;
Step 4-2: train the deep CNN and the dual/multi-region proposal network D/M-RPN model; after the training is completed, the two form the first model, which includes the deep CNN and the dual/multi-region proposal network D/M-RPN model; the combination of the two is called D/M-CRPN(1);
Step 4-3: use the first model D/M-CRPN(1) obtained in step 4-2 to generate damage proposals;
Step 4-4: use the damage proposals obtained in step 4-3 to train the classifier (FC 25 in Figure 1);
Step 4-5: re-initialize the faster D/M-R-CNN model using the first model D/M-CRPN(1) and retrain the faster D/M-R-CNN model with the damage proposals obtained in step 4-3 to obtain the second model D/M-CRPN(2);
Step 4-6: use the weights of the second model D/M-CRPN(2) to retrain the dual/multi-region proposal network D/M-RPN model;
Step 4-7: use the second model D/M-CRPN(2) to generate new damage proposals;
Step 4-8: use the damage proposals obtained in step 4-7 to train the classifier.
In step 4-2, the training of the deep CNN and the D/M-RPN can be carried out separately: the deep CNN is trained on its own first; after its training is completed, the deep CNN is fixed and the D/M-RPN is trained. A schematic of this alternating schedule is sketched below.
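The following Python schematic walks through steps 4-1 to 4-8 in order. Every component and helper here is a hypothetical no-op stand-in: the patent names the stages, not their implementations.

```python
# Schematic of the alternating training schedule in steps 4-1 to 4-8.

class Component:
    """Placeholder for a trainable part (deep CNN, D/M-RPN, or FC classifier)."""
    def __init__(self, name):
        self.name, self.frozen = name, False
    def train_on(self, data):
        print(f"training {self.name} on {len(data)} samples")
    def freeze(self):
        self.frozen = True

def generate_proposals(model, images):
    # Stand-in for running the current model over the training images.
    return [f"proposal for image {i}" for i, _ in enumerate(images)]

images = ["img0", "img1", "img2"]

deep_cnn, dm_rpn, fc = (Component(n) for n in ("deep CNN", "D/M-RPN", "FC"))  # step 4-1
deep_cnn.train_on(images)           # step 4-2: train the deep CNN first ...
deep_cnn.freeze()                   # ... then fix it ...
dm_rpn.train_on(images)             # ... and train the D/M-RPN -> D/M-CRPN(1)
proposals = generate_proposals((deep_cnn, dm_rpn), images)   # step 4-3
fc.train_on(proposals)              # step 4-4: train the classifier

# Steps 4-5/4-6: re-initialize from D/M-CRPN(1) and retrain with its
# proposals to obtain D/M-CRPN(2); the deep CNN stays fixed for the RPN pass.
dm_rpn.train_on(proposals)
proposals = generate_proposals((deep_cnn, dm_rpn), images)   # step 4-7
fc.train_on(proposals)              # step 4-8: retrain the classifier
```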
The training process of the classifier in steps 4-4 and 4-8 is shown in Figure 5. An image sequence is extracted from the images of a single object (i.e., a sequence of temporally consecutive frames of a damage type) and fed to the D/M-CNN to extract image features. The first N-1 layers are treated as feature maps; the D/M-CNN (i.e., FC 25 in Figure 1; see Figure 6 for its process and Figure 8 for its connection to the D/M-Sub-Sampling layer) is trained, and these maps are used to train a dual/multi support vector machine (D/M-SVM) classifier (the D/M-SVM is used only during training and is removed after training is completed). The SVM outputs of the CNNs are compared with one another to collect all damage features in the image with high accuracy, represented as P tensors as follows:
[Equation (1), defining the P tensor, is rendered as an image in the original publication]
where c_(i,j) is the probability of class (i,j), nc is the number of classes, and n is the number of training example images, so each image in any given image sequence has a P tensor. The P tensor represents the result of the SVM and consists of sets of vectors representing attribution probabilities. A toy construction of such a P tensor is sketched below.
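Since equation (1) is reproduced only as an image, the exact layout of the P tensor is not visible here. The following numpy sketch assumes a per-branch (n images × nc classes) layout and a simple mean fusion of the N CNN branches, purely for illustration.

```python
import numpy as np

n, nc, N = 4, 3, 2           # images per sequence, classes, CNN branches

rng = np.random.default_rng(0)
svm_probs = rng.random((N, n, nc))                 # c_(i,j): per-branch class scores
svm_probs /= svm_probs.sum(axis=2, keepdims=True)  # normalize rows to probabilities

P_per_image = svm_probs.mean(axis=0)   # fuse the N branches for each image
P_combined = P_per_image.mean(axis=0)  # combined P tensor of the whole sequence
print(P_combined)                      # one fused probability per class
```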
The E tensor is composed as follows: compute the average of the image sequence's size estimates and check against the size lookup table all classes e containing the average size; the matching elements are set to 1 and the others to 0, giving
[Equation (3), the E tensor, is rendered as an image in the original publication]
where:
[Equation (4) is rendered as an image in the original publication]
The E tensor represents the size estimate.
When the target moves, its velocity is encoded in a similar way as the V tensor. The velocities of the object damage types are constructed similarly to the E tensor in the size estimation, i.e., all categories containing the provided velocity v are checked against the velocity lookup table; those elements are set to 1 and the others to 0.
[Equation (5), the V tensor, is rendered as an image in the original publication]
where:
[Equation (6) is rendered as an image in the original publication]
The final classification is achieved by a fusion between the provided parameters and the predicted values of the image classifier. The combined P tensor of a sequence of images is:
[Equation (2), the combined P tensor, is rendered as an image in the original publication]
where n is the number of images in each sequence, and the fused vector Φ is:
Φ_(i,j) = P_(i,j) .* V_(i,j) .* E_(i,j)  (7)
where (.*) denotes element-wise multiplication. The final prediction score S is:
S_(i,j) = max_m Φ_(i,j)  (8)
m = arg max_m Φ_(i,j)  (9)
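A worked numpy example of equations (7) to (9); the 3-class setup and the lookup-table masks are invented for the demo.

```python
import numpy as np

P = np.array([0.20, 0.70, 0.10])   # combined P tensor (class probabilities)
E = np.array([0.0, 1.0, 1.0])      # classes consistent with the average size
V = np.array([1.0, 1.0, 0.0])      # classes consistent with the provided velocity v

Phi = P * V * E                    # equation (7): element-wise product
S = Phi.max()                      # equation (8): prediction score
m = Phi.argmax()                   # equation (9): index of the winning class
print(Phi, S, m)                   # [0.  0.7 0. ] 0.7 1
```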
In one embodiment, shown in Figure 6, the dual/multi convolution and pooling processing is illustrated. In the dual/multi convolution operation, the input data consist of a 7×7×3 dataset, where 7×7 represents the width and height in pixels and 3 represents the R, G, B color channels.
There are two different filter banks, M/D-filter W0(i,j) and M/D-filter W1(i,j). The stride is 2, which means the window extracts 3×3 local data and moves two steps at a time. Zero padding = 1. As the left window slides smoothly, the filter banks are convolved with the different local data covered by the window. The dual/multi convolution operations are computed with the two filter banks respectively, giving the two sets of results of the dual convolution operation and the multi convolution operation.
In the dual/multi convolutional neural network (D/M-CNN), D/M filters (sets of neurons with fixed weights) are used to perform convolution operations on the local input data. After the data in each window are computed, the data window slides smoothly with a specific stride until all convolution operations are completed. Several parameters need to be worked out: (1) depth: the number of neurons (filters), which determines the depth; (2) stride: the step with which the window covers the data; (3) zero padding: appending a few zeros so that the window can travel from its initial position to the end of the dataset.
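For the Figure 6 numbers (7×7 input, 3×3 window, stride 2, zero padding 1), the standard output-size formula, which the text does not state explicitly, gives 4 window positions per axis. The following sketch checks this and runs one channel of the convolution with an all-ones filter as a stand-in for a real filter bank.

```python
import numpy as np

def conv_output_size(width, field, stride, padding):
    """Standard formula for the number of window positions per axis."""
    return (width - field + 2 * padding) // stride + 1

print(conv_output_size(7, 3, 2, 1))   # (7 - 3 + 2*1) // 2 + 1 = 4

# One channel of the convolution itself, with an all-ones 3x3 filter:
x = np.pad(np.ones((7, 7)), 1)                 # zero padding = 1 -> 9x9 input
w = np.ones((3, 3))                            # one filter of the filter bank
out = np.array([[np.sum(x[i*2:i*2+3, j*2:j*2+3] * w)   # stride 2 window
                 for j in range(4)] for i in range(4)])
print(out.shape)                               # (4, 4)
```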
One embodiment, shown in Figure 7, is the max pooling operation, which means taking the maximum value of a particular data window region. The other pooling method in the faster D/M-R-CNN algorithm is average pooling, which takes the average of a particular data window region.
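A compact numpy sketch of the two pooling choices; the 2×2 window and the toy input are arbitrary for the demo, while Figure 7 of the publication illustrates the max variant.

```python
import numpy as np

def pool(x, size=2, mode="max"):
    """Pool a 2-D array over non-overlapping size x size windows."""
    h, w = x.shape[0] // size, x.shape[1] // size
    blocks = x[:h * size, :w * size].reshape(h, size, w, size)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

x = np.array([[1., 2., 5., 6.],
              [3., 4., 7., 8.],
              [9., 10., 13., 14.],
              [11., 12., 15., 16.]])
print(pool(x, mode="max"))    # [[ 4.  8.] [12. 16.]]
print(pool(x, mode="avg"))    # [[ 2.5  6.5] [10.5 14.5]]
```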
One embodiment, shown in Figure 8, describes the basic architecture connecting the D/M-CNN layer and the D/M-sub-sampling layer. A CNN generally consists of alternating convolution operations and subsampling operations, with the last layer represented as a general multi-layer network. Placing convolutional layers between subsampling layers improves computational efficiency and further improves structural invariance and spatial invariance. C(1,j) is a D/M-CNN layer, and each CNN layer consists of six feature maps. Through the convolution operation, the features of the original signal can be enhanced and the influence of noise reduced. Each neuron of a feature map is connected to a 16×16 neighborhood of the input image. The feature map size is 196×196. C(1,j) has (16×16+1)×6 = 1542 tunable parameters (each filter has 16×16 unit parameters and one bias parameter, with 6 filters in total). One kernel is used between the input and C(1,j), so there are 1542×(196×196) = 59,237,472 connections in total.
S(2,j) is a D/M-sub-sampling layer. According to the local correlation principle of images, subsampling can be applied to the image, which reduces the amount of data to be processed while preserving useful information. There are two 98×98 feature maps. Each unit of a feature map is connected to an 8×8 neighborhood of C(1,j). The 16 inputs of each unit of S(2,j) are summed and multiplied by a tuning parameter with a tuning bias; the result can be calculated with a sigmoid function. The tuning parameter and tuning bias control the nonlinearity of the sigmoid function. If these parameters are relatively small, the operation is similar to a linear operation, and by reducing the image pixels each subsampling is equivalent to blurring the image; if the parameters are relatively large, each subsampling can be viewed as a noisy "or" or "and" operation. The 8×8 receptive fields of the units do not overlap, so the size of each feature map in S(2,j) is 1/4 of that of C(1,j). S(2,j) has (1+1)×2 = 4 tuning parameters and (8×8+1)×2×(98×98) = 1,248,520 connections.
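The parameter and connection counts quoted above can be verified with a few lines of Python; the layer sizes come directly from the text.

```python
# Check the counts for C(1,j) and S(2,j).
c1_params = (16 * 16 + 1) * 6             # 6 filters, 16x16 weights + 1 bias each
c1_connections = c1_params * 196 * 196    # one kernel over a 196x196 feature map
s2_params = (1 + 1) * 2                   # 2 maps, 1 tuning weight + 1 bias each
s2_connections = (8 * 8 + 1) * 2 * 98 * 98

print(c1_params)        # 1542
print(c1_connections)   # 59237472
print(s2_params)        # 4
print(s2_connections)   # 1248520
```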
The preferred specific embodiments of the present application have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present application without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation in accordance with the concept of the present application shall fall within the scope of protection determined by the claims.

Claims (20)

  1. A damage detection method based on a convolutional neural network, wherein the method comprises:
    Step 1-1: inputting an image to be inspected into a faster dual/multi-region convolutional neural network (D/M-R-CNN) model;
    Step 1-2: the faster D/M-R-CNN model processes the image to be inspected and outputs a final damage image and a confidence score, wherein the confidence score refers to the likelihood that the damage in the final damage image is the expected damage;
    wherein the faster D/M-R-CNN model comprises:
    a deep CNN for generating feature maps of the image to be inspected;
    a dual/multi-region proposal network model, comprising two or more region proposal network models, for generating two or more damage proposals for each candidate damage in the image to be inspected, comparing the two or more damage proposals to obtain a confidence level, classifying and regressing all the obtained damage proposals, and outputting the final damage image and the confidence score; wherein the confidence level refers to the likelihood that the two or more damage proposals are the expected damage.
  2. The damage detection method according to claim 1, wherein the deep CNN generates the feature maps at different scales.
  3. The damage detection method according to claim 2, wherein each of the two or more damage proposals comprises a bounding box for representing the damage detected by the dual/multi-region proposal network model.
  4. The damage detection method according to claim 3, wherein the dual/multi-region proposal network model comprises a dual/multi region-of-interest pooling layer, and the dual/multi region-of-interest pooling layer comprises two or more region-of-interest pooling layers for generating the two or more damage proposals for each candidate damage in the image to be inspected and comparing the two or more damage proposals to obtain the confidence level.
  5. The damage detection method according to claim 4, wherein the dual/multi region-of-interest pooling layer is one of a max pooling layer and an average pooling layer.
  6. The damage detection method according to claim 5, wherein the dual/multi-region proposal network model further comprises a fully connected layer for classifying and regressing the bounding box.
  7. The damage detection method according to claim 1, wherein the last layer of the deep CNN is output as the feature maps.
  8. A damage detection method based on a convolutional neural network, wherein the method comprises:
    Step 2-1: inputting an image to be inspected into a deep CNN to obtain feature maps of the image to be inspected;
    Step 2-2: inputting the feature maps into a dual/multi region-of-interest pooling layer in a dual/multi-region proposal network model, wherein the dual/multi-region proposal network model comprises two or more region proposal network models, and the dual/multi region-of-interest pooling layer comprises two or more region-of-interest pooling layers;
    Step 2-3: the dual/multi region-of-interest pooling layer generates two or more damage proposals for each candidate damage in the image to be inspected and then compares the two or more damage proposals to create a confidence level that the damage detected in a bounding box is the expected damage, wherein the confidence level refers to the likelihood that the two or more damage proposals are the expected damage;
    Step 2-4: inputting the damage proposals into a fully connected layer of the dual/multi-region proposal network model to classify and regress the bounding box;
    Step 2-5: obtaining a final damage image according to the results of the classification and regression, calculating a confidence score, and outputting a result including the classification result and the confidence score, wherein the confidence score refers to the likelihood that the damage in the final damage image is the expected damage.
  9. The damage detection method according to claim 8, wherein the deep CNN and the dual/multi-region proposal network model constitute a faster dual/multi-region convolutional neural network (D/M-R-CNN) model, and the method further comprises the step of training the faster D/M-R-CNN model, wherein the training step comprises:
    Step 3-1: obtaining source images for training, the source images being an image sequence of a single object;
    Step 3-2: enhancing and labeling the source images;
    Step 3-3: selecting the weights;
    Step 3-4: designing and training the faster D/M-R-CNN model.
  10. The damage detection method according to claim 9, wherein steps 3-3 and 3-4 further comprise the following steps:
    Step 4-1: initializing the faster D/M-R-CNN model;
    Step 4-2: training the deep CNN and the dual/multi-region proposal network model; after the training is completed, the two form a first model;
    Step 4-3: using the first model obtained in step 4-2 to generate damage proposals;
    Step 4-4: using the damage proposals obtained in step 4-3 to train a classifier;
    Step 4-5: re-initializing the faster D/M-R-CNN model using the parameters of the first model obtained in step 4-2 to obtain a second model;
    Step 4-6: using the weights of the second model to retrain the dual/multi-region proposal network model;
    Step 4-7: using the second model to generate damage proposals;
    Step 4-8: using the damage proposals obtained in step 4-7 to train the classifier.
  11. The damage detection method according to claim 10, wherein in step 4-2 the deep CNN is first trained on its own, and after its training is completed the deep CNN is fixed and the dual/multi-region proposal network model is trained.
  12. The damage detection method according to claim 10, wherein in step 4-6 the deep CNN is fixed when training the dual/multi-region proposal network model.
  13. The damage detection method according to claim 10, wherein in steps 4-4 and 4-8 a damage image sequence is extracted from the source images according to the damage proposals and used for training the classifier.
  14. The damage detection method according to claim 10, wherein in steps 4-4 and 4-8, when training the classifier, a support vector machine is attached behind each of two or more CNNs; the support vector machine is used only during training and is removed after training is completed.
  15. The damage detection method according to claim 14, wherein in steps 4-4 and 4-8, when training the classifier, the final prediction score calculation process comprises:
    Step 5-1: calculating the P tensor;
    Step 5-2: calculating the E tensor;
    Step 5-3: calculating the V tensor;
    Step 5-4: calculating the Φ vector;
    Step 5-5: calculating the prediction score S;
    wherein the P tensor represents the damage features output by the support vector machine of each of N CNNs; the E tensor represents a size estimation tensor of the source images; the V tensor represents a velocity tensor of the source images; and the Φ vector is the fused vector of all the P tensors.
  16. The damage detection method according to claim 15, wherein in step 5-1 the P tensor is expressed as follows:
    [Equation (1), the P tensor, is rendered as an image in the original publication]
    wherein c_(i,j) is the probability of class (i,j), nc is the number of classes, and n is the number of source images used for training, so that each image in the image sequence has its own P tensor;
    the combined P tensor of the image sequence is:
    [Equation (2), the combined P tensor, is rendered as an image in the original publication]
  17. The damage detection method according to claim 16, wherein in step 5-2 the average of the size estimates of the image sequence is calculated and all classes e containing the average size are checked against the size lookup table, wherein the matching elements are set to 1 and the other elements are set to 0, thereby obtaining the E tensor:
    [Equation (3), the E tensor, is rendered as an image in the original publication]
    wherein:
    [Equation (4) is rendered as an image in the original publication]
  18. The damage detection method according to claim 17, wherein in step 5-3 all classes containing the provided velocity v are checked against the velocity lookup table, the matching elements are set to 1 and the others to 0, giving the V tensor:
    [Equation (5), the V tensor, is rendered as an image in the original publication]
    wherein:
    [Equation (6) is rendered as an image in the original publication]
  19. The damage detection method according to claim 18, wherein in step 5-4 the Φ vector is:
    Φ_(i,j) = P_(i,j) .* V_(i,j) .* E_(i,j)  (7)
    wherein (.*) denotes element-wise multiplication.
  20. The damage detection method according to claim 19, wherein in step 5-5 the prediction score S is:
    S_(i,j) = max_m Φ_(i,j)   (8)
    m = arg max_m Φ_(i,j)  (9)
    wherein m represents the average value of S_(i,j).
PCT/CN2020/113533 2020-09-04 2020-09-04 A damage detection method based on a convolutional neural network WO2022047736A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/113533 WO2022047736A1 (zh) 2020-09-04 2020-09-04 A damage detection method based on a convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/113533 WO2022047736A1 (zh) 2020-09-04 2020-09-04 A damage detection method based on a convolutional neural network

Publications (1)

Publication Number Publication Date
WO2022047736A1 true WO2022047736A1 (zh) 2022-03-10

Family

ID=80492435

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/113533 WO2022047736A1 (zh) 2020-09-04 2020-09-04 A damage detection method based on a convolutional neural network

Country Status (1)

Country Link
WO (1) WO2022047736A1 (zh)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10373262B1 (en) * 2014-03-18 2019-08-06 Ccc Information Services Inc. Image processing system for vehicle damage
CN106599939A (zh) * 2016-12-30 2017-04-26 深圳市唯特视科技有限公司 A real-time object detection method based on region convolutional neural networks
CN107194323A (zh) * 2017-04-28 2017-09-22 阿里巴巴集团控股有限公司 Vehicle damage assessment image acquisition method, apparatus, server, and terminal device
CN110287768A (zh) * 2019-05-06 2019-09-27 浙江君嘉智享网络科技有限公司 Intelligent image recognition method for vehicle damage assessment

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116758579A (zh) * 2023-04-17 2023-09-15 三峡大学 A feature-enhancement-based multi-instance pedestrian detection method

Similar Documents

Publication Publication Date Title
WO2019144575A1 (zh) A fast pedestrian detection method and device
US20190057507A1 (en) System and method for semantic segmentation of images
CN109426805B (zh) 用于对象检测的方法、设备和计算机程序产品
CN112507777A (zh) A deep-learning-based ship detection and segmentation method for optical remote sensing images
Chandrakar et al. Enhanced the moving object detection and object tracking for traffic surveillance using RBF-FDLNN and CBF algorithm
Raghavan et al. Optimized building extraction from high-resolution satellite imagery using deep learning
Li et al. Implementation of deep-learning algorithm for obstacle detection and collision avoidance for robotic harvester
CN114972418A (zh) 基于核自适应滤波与yolox检测结合的机动多目标跟踪方法
Liu et al. SRAF-Net: A scene-relevant anchor-free object detection network in remote sensing images
CN113158862A (zh) A multi-task-based lightweight real-time face detection method
Chen et al. A multi-task framework for infrared small target detection and segmentation
CN112395951A (zh) A domain-adaptive traffic object detection and recognition method for complex scenes
Kim et al. Improved center and scale prediction-based pedestrian detection using convolutional block
Liu et al. Survey of road extraction methods in remote sensing images based on deep learning
Zuo et al. A remote sensing image semantic segmentation method by combining deformable convolution with conditional random fields
Hu et al. A video streaming vehicle detection algorithm based on YOLOv4
Muthalagu et al. Vehicle lane markings segmentation and keypoint determination using deep convolutional neural networks
Dong et al. Scale-recursive network with point supervision for crowd scene analysis
WO2022047736A1 (zh) A damage detection method based on a convolutional neural network
Patil et al. Road segmentation in high-resolution images using deep residual networks
CN117079095A (zh) Deep-learning-based high-altitude object-throwing detection method, system, medium, and device
CN116758340A (zh) Small-object detection method based on a super-resolution feature pyramid and an attention mechanism
Nag et al. ARCN: a real-time attention-based network for crowd counting from drone images
CN111639563B (zh) A multi-task-based online detection method for basketball video events and objects
Guo et al. ANMS: attention-based non-maximum suppression

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20951980

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20951980

Country of ref document: EP

Kind code of ref document: A1