CN113033720A - Vehicle bottom picture foreign matter identification method and device based on sliding window and storage medium
- Publication number: CN113033720A
- Application number: CN202110588934.1A
- Authority
- CN
- China
- Prior art keywords
- foreign matter
- network
- frame
- vehicle bottom
- window
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F18/241 (Physics; Computing; Pattern recognition; Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches)
- G06F18/253 (Physics; Computing; Pattern recognition; Fusion techniques of extracted features)
- G06V10/44 (Physics; Image or video recognition; Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components)
Abstract
The invention discloses a sliding-window-based vehicle bottom picture foreign matter identification method, device and storage medium, wherein the identification method comprises the following steps: step 1, training a foreign matter recognition network model M to obtain the parameters of the foreign matter recognition network; step 2, segmenting the vehicle bottom picture into a plurality of window images with a sliding window, loading the image data of each window in multiple processes, preprocessing the window images and feeding them into the foreign matter identification network M to obtain the foreign matter recognition result. By processing the sliding-window segments of the vehicle bottom image in multiple processes, and by combining an anchor-box-based method, which is more stable for medium-sized target recognition, with a feature-point-based method, which is more flexible for small-target recognition, the method effectively improves both the efficiency and the accuracy of foreign matter identification in high-resolution vehicle bottom pictures.
Description
Technical Field
The invention relates to an efficient sliding-window-based method for identifying small foreign matter targets in high-resolution vehicle bottom pictures, and belongs to the technical field of foreign matter identification.
Background
In order to prevent lawless persons from hiding dangerous articles such as firearms, explosives and drugs in a vehicle chassis, foreign matter identification of the vehicle chassis is an important part of safety inspection. The existing vehicle bottom foreign matter identification method is to shoot a high-definition vehicle bottom picture and check it manually, but this method suffers from low efficiency and from declining judgment accuracy after a security inspector has monitored pictures for a long time. A more efficient security-check mode is to analyze the vehicle bottom picture with deep-learning-based target-recognition techniques from computer vision, so as to automatically identify foreign matter present at the vehicle bottom.
The vehicle bottom pictures shot by the safety-inspection system have the following characteristics. First, the picture resolution is high: the computation cost of picture recognition is large and time-consuming, and directly feeding a high-resolution picture into a target-recognition network often takes tens of seconds to produce a result. Second, vehicle bottom foreign matter mostly consists of small targets, i.e. the foreign matter occupies a small proportion of the whole vehicle bottom picture, which makes the recognition accuracy of general target-recognition methods low.
Disclosure of Invention
The technical problem the invention aims to solve is to improve the efficiency and accuracy of vehicle bottom foreign matter identification when the resolution of the vehicle bottom picture is high and the foreign matter consists of small targets.
In order to solve the problems, the invention adopts the following technical scheme:
a vehicle bottom picture foreign matter identification method based on a sliding window is characterized by comprising the following steps
Step 1, training a foreign matter recognition network model M to obtain the parameters of the foreign matter recognition network;
Step 2, segmenting the vehicle bottom picture into a plurality of window images with a sliding window, loading the image data of each window in multiple processes, preprocessing the window images and feeding them into the foreign matter identification network M to obtain the foreign matter recognition result.
The step 1 comprises the following steps: the method comprises the steps of data set construction and division, vehicle bottom image preprocessing, foreign matter identification network forward propagation and foreign matter identification network parameter updating.
Step 1-1, collecting vehicle bottom picture data containing foreign matter, including the vehicle bottom images and the corresponding foreign matter bounding-box annotations and categories; dividing the data into a training set and a validation set, randomly shuffling the training data and dividing it into several small batches, and directly dividing the validation data into several small batches;
Step 1-2, preprocessing an input vehicle bottom image: randomly cropping a window of height h and width w from the input image to obtain a window image, horizontally flipping the window image at random, updating the corresponding foreign matter annotation boxes, normalizing the window image, converting it into a PyTorch tensor, and stacking the data within a small batch to obtain the input data of the foreign matter identification network M;
step 1-3, inputting the data obtained by preprocessing in the step 1-2 into a backbone network in a foreign matter identification network M to obtain a multi-scale characteristic diagram;
the backbone network can be ResNet50 backbone network to obtain C2、C3、C4、C5Four multi-scale feature maps.
Step 1-4, inputting the multi-scale feature map obtained in the step 1-3 into a feature pyramid network in a foreign matter identification network M to obtain a multi-scale feature map with a plurality of fused features;
each of the plurality of features is P2、P 3、P 4、P 5、P 6、P 7And six characteristics.
Step 1-5, assigning anchor boxes of multiple scales and aspect ratios to each feature map Pi output in step 1-4, and feeding the feature map into both the anchor-box-based prediction network and the feature-point-based prediction network in the foreign matter identification network M, obtaining a classification value and a box regression value for each anchor box and for each feature point;
Step 1-6, first computing a loss value from the classification and box regression values of the anchor boxes, the classification and box regression values of the feature points, and the annotation data of the vehicle bottom image from step 1-5; then computing the gradients of the network parameters and updating the parameter values;
Step 1-7, the training of one epoch is completed after steps 1-2 to 1-6 have been executed on all the small batches; steps 1-2 to 1-5 are then performed on each small batch of the validation set, followed by a post-processing operation consisting of box filtering and non-maximum suppression, to obtain the foreign matter recognition results on the validation data, and the all-class mean average precision is calculated from the recognition results and the validation labels;
Step 1-8, steps 1-1 to 1-7 are repeated 10 to 20 times, and the model parameters of the epoch with the highest all-class mean average precision are selected as the parameters of the foreign matter identification network M.
As a preferred technical solution of the invention, the anchor-box-based prediction network and the feature-point-based prediction network of step 1-5 each consist of a classification branch and a regression branch, each branch containing L convolution layers and a prediction layer. Each convolution layer adopts group normalization: the C channels are divided into G groups, and normalization and linear transformation are performed within each group for each vehicle bottom picture sample.
The step 2 comprises the following steps: the method comprises the steps of dividing a vehicle bottom image into a plurality of window images by adopting a sliding window, loading image data of each window in a multi-process mode, preprocessing the window images and inputting the window images into a foreign matter recognition network M to obtain a foreign matter recognition result. The method specifically comprises the following steps:
Step 2-1, using a sliding window of height h and width w with stride s, N = [(H - h)//s + 1] × [(W - w)//s + 1] windows are obtained on a vehicle bottom image of height H and width W, where "//" denotes floor division;
Step 2-2, the N window images are distributed to P processes: each of the first P - 1 processes handles N//P window images and the last process handles N - (P - 1)(N//P) images, where "//" denotes floor division. Each process loads a foreign matter identification network M comprising the backbone network, the feature pyramid, the anchor-box-based prediction network and the feature-point-based prediction network;
Step 2-3, the input window image is normalized with the ImageNet dataset statistics and converted into the PyTorch tensor data type;
Step 2-4, similar to steps 1-3 to 1-5, the preprocessed data is fed into the foreign matter recognition network M to obtain the predicted values. The adjusted anchor boxes are computed from the box regression values and the anchor positions, scales and aspect ratios; meanwhile, the boxes derived from the feature points are computed from the feature-point regression values and positions;
Step 2-5, among the predicted boxes output by each feature layer in step 2-4, those with confidence greater than t2 are kept, where t2 is a manually set confidence threshold, typically between 0.3 and 0.6. The predicted boxes of all feature layers of the window pictures handled by all processes are fused, and non-maximum suppression with IoU threshold t_iou is performed to obtain the final foreign matter recognition result; t_iou is a manually set IoU threshold, typically 0.5.
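The window arithmetic of step 2-1 can be sketched in a few lines. This is an illustrative sketch, not the patented implementation; the function names and the example dimensions in the comments are chosen here for clarity:

```python
def window_count(H, W, h, w, s):
    # N = [(H - h)//s + 1] * [(W - w)//s + 1]; "//" is floor division,
    # so partial windows at the right and bottom edges are dropped
    return ((H - h) // s + 1) * ((W - w) // s + 1)

def window_origins(H, W, h, w, s):
    # top-left (y, x) corner of every sliding window, row-major order
    return [(y, x)
            for y in range(0, H - h + 1, s)
            for x in range(0, W - w + 1, s)]
```

The list comprehension enumerates exactly the windows the formula counts, which makes it easy to check the two agree for any H, W, h, w, s.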
Compared with the prior art, the invention has the following technical effects:
1. By processing the sliding-window segments of the high-resolution vehicle bottom picture in multiple processes, the invention on one hand speeds up foreign matter identification through multi-processing, and on the other hand raises the proportion that a small target occupies in the image through sliding-window segmentation, improving small-target recognition precision.
2. In the training process, the anchor-box-based method, which is more stable for medium-scale target recognition, is combined with the feature-point-based method, which is more flexible for small-target recognition; this combination yields a better solution for the network parameters and effectively improves the efficiency and accuracy of foreign matter identification in high-resolution vehicle bottom pictures.
3. In the test process, combining the anchor-box-based method, which is more stable for medium-scale target recognition, with the feature-point-based method, which is more flexible for small-target recognition, benefits the recognition of foreign matter at various scales.
Drawings
FIG. 1 is a flow chart of the foreign matter identification method employed in the invention;
fig. 2 is a visualization of a vehicle bottom image foreign matter recognition result according to the invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings attached to the specification.
Example 1
As shown in Fig. 1, in the vehicle bottom image foreign matter identification method of the invention, the vehicle bottom includes but is not limited to wagon bottoms, train bottoms and other vehicle bottoms; the identification method comprises the following steps:
step 1, training a foreign body recognition network model M to obtain parameters of a foreign body recognition network.
Step 2, predicting on the vehicle bottom picture to obtain the vehicle bottom picture foreign matter recognition result.
The step 1 comprises the following steps: the method comprises the steps of data set construction and division, vehicle bottom image preprocessing, foreign matter identification network forward propagation and foreign matter identification network parameter updating.
Step 1-1, collecting vehicle bottom picture data containing foreign matter, including the vehicle bottom images and the corresponding foreign matter bounding-box annotations and categories; dividing the data into a training set and a validation set, randomly shuffling the training data and dividing it into several small batches, and directly dividing the validation data into several small batches;
the training set and validation set may be as follows 9: 1 or other ratios divide the data into training and validation sets, with 9: 1 is Tval= T//10 validation set samples and Ttrain= T-T//10 training set samples. Then randomly disorganizing and dividing training set data into Ttrain// bs Small batches containing bs samples, direct partitioning of the validation dataset into TvalV/bs mini-lot, where T is the total number of car bottom images, "/" denotes division by rounding down.
In one embodiment, 10000 vehicle bottom images are collected and annotated with foreign matter bounding boxes and categories, with 10 foreign matter categories. The data is divided 9:1 into a training set and a validation set, i.e. Tval = 1000 validation samples and Ttrain = 9000 training samples. The training data is then randomly shuffled and divided into 281 small batches of up to 32 images, and the validation data is directly divided into 32 small batches of up to 32 images.
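The split and batching arithmetic of the embodiment above can be sketched as follows. The remainder handling (floor for the training batches, ceiling for the validation batches) is inferred from the embodiment's numbers rather than stated explicitly in the text:

```python
def split_and_batch(T, bs):
    """Split T images 9:1 into train/validation and count mini-batches."""
    t_val = T // 10                 # T_val = T // 10
    t_train = T - t_val             # T_train = T - T // 10
    n_train = t_train // bs         # shuffled training mini-batches of bs samples
    n_val = -(-t_val // bs)         # ceiling: validation batches of up to bs samples
    return t_train, t_val, n_train, n_val
```

With T = 10000 and bs = 32 this reproduces the embodiment's 9000/1000 split, 281 training batches and 32 validation batches.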
Step 1-2, preprocessing an input vehicle bottom image: randomly cropping a window of height h and width w from the input image to obtain a window image, horizontally flipping the window image at random, updating the corresponding foreign matter annotation boxes, normalizing the window image, converting it into a PyTorch tensor, and stacking the data within a small batch to obtain the input data of the foreign matter identification network M.
In one embodiment, a small batch of input vehicle bottom images is preprocessed: a window of height 800 and width 1333 is randomly cropped from each input image of height 1024 and width 3750. That is, taking the top-left corner of the input image as the origin, the top boundary coordinate of the window image in the original image is selected with equal probability within [0, 1024 - 800], and the left boundary coordinate with equal probability within [0, 3750 - 1333]. The window image is then horizontally flipped with probability 0.5 and the corresponding foreign matter annotation boxes are updated; finally the window image is normalized with the ImageNet dataset statistics, converted into a PyTorch tensor, and the data of the small batch is stacked to obtain the input data of the foreign matter identification network M.
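The annotation-box bookkeeping for the crop and flip can be sketched as below. This is a minimal illustration under the usual (x1, y1, x2, y2) box convention, which the patent does not spell out; clipping and dropping boxes that leave the window are likewise assumptions:

```python
def shift_boxes(boxes, top, left, h, w):
    """Re-express (x1, y1, x2, y2) annotation boxes in the coordinates of an
    h x w window cropped at (top, left), dropping boxes that fall outside."""
    out = []
    for x1, y1, x2, y2 in boxes:
        nx1, ny1 = max(x1 - left, 0), max(y1 - top, 0)
        nx2, ny2 = min(x2 - left, w), min(y2 - top, h)
        if nx1 < nx2 and ny1 < ny2:      # keep only boxes with positive area
            out.append((nx1, ny1, nx2, ny2))
    return out

def flip_boxes(boxes, w):
    # horizontal flip: x-coordinates mirror about the window width
    return [(w - x2, y1, w - x1, y2) for x1, y1, x2, y2 in boxes]
```

In training these would be driven by the random top/left draw and a coin flip with probability 0.5; the deterministic parts above are what the annotation update actually has to do.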
And step 1-3, inputting the data obtained by preprocessing in the step 1-2 into a backbone network in the foreign matter identification network M to obtain a multi-scale characteristic diagram.
In one embodiment, the backbone network may be a ResNet50 backbone, yielding four multi-scale feature maps C2, C3, C4, C5 with strides of 8, 16, 32, 64 respectively relative to the input window image, where Ci denotes the topmost feature map in the backbone at the corresponding stride.
And 1-4, inputting the multi-scale feature map obtained in the step 1-3 into a feature pyramid network in the foreign matter identification network M to obtain a multi-scale feature map with a plurality of fused features.
In one embodiment, six fused multi-scale feature maps P2, P3, P4, P5, P6, P7 are obtained, with strides of 8, 16, 32, 64, 128, 256 respectively relative to the input window image, where Pi denotes the feature map at the corresponding stride in the feature pyramid network.
Step 1-5, anchor boxes of multiple scales and aspect ratios are assigned to each feature map Pi output in step 1-4, and the feature map is fed into both the anchor-box-based prediction network and the feature-point-based prediction network in the foreign matter identification network M, obtaining a classification value and a box regression value for each anchor box and for each feature point.
In one embodiment, let the feature map Pi output by step 1-4 have stride Si. At each feature point of Pi, 9 anchor boxes are assigned, combining 3 scales Si, 2Si, 4Si with 3 aspect ratios 0.5, 1.0, 2.0. The feature map is fed into the anchor-box-based prediction network, which predicts 10 classification values and 4 box regression values for each anchor box; it is also fed into the feature-point-based prediction network, which predicts 10 classification values and 4 box regression values for each feature point.
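The 9-anchor layout at one feature point (3 scales crossed with 3 aspect ratios) can be sketched as follows. Interpreting the ratio as height/width and preserving the anchor area across ratios is an assumption borrowed from common detector implementations, not stated in the text:

```python
import math

def anchors_at(cx, cy, stride):
    """9 anchor boxes (x1, y1, x2, y2) centred at one feature point:
    scales {S, 2S, 4S} crossed with aspect ratios {0.5, 1.0, 2.0},
    where S is the feature-map stride."""
    boxes = []
    for scale in (stride, 2 * stride, 4 * stride):
        for ratio in (0.5, 1.0, 2.0):
            w = scale / math.sqrt(ratio)   # area stays scale**2 for every ratio
            h = scale * math.sqrt(ratio)
            boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes
```

At stride 8 this yields square anchors of side 8, 16 and 32 plus their 2:1 and 1:2 variants, matching the 3 x 3 grid the embodiment describes.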
Step 1-6, first a loss value is computed from the classification and box regression values of the anchor boxes, the classification and box regression values of the feature points, and the annotation data of the vehicle bottom image from step 1-5; then the gradients of the network parameters are computed and the parameter values are updated;
the prediction classification vector containing K values for the anchor box is paThe corresponding labeled class value is yaThe 4 predicted regression values constitute a vector raThe corresponding 4 labeled regression values constitute the vector taIf the classification loss weight of the positive sample is 0.25, the corresponding loss value of the anchor box is:
wherein L isaAs loss value, i = yaWhen the temperature of the water is higher than the set temperature,otherwise。
Loss value L of feature pointpThe calculation method is the same.
The total loss value L is: l = La+Lp
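The per-anchor loss described above, a focal-style classification term with positive-sample weight 0.25 plus a box regression term, can be sketched as follows. The focusing parameter gamma = 2 and the smooth-L1 threshold beta are assumed standard values not given in the text:

```python
import math

def anchor_loss(p, y, r, t, alpha=0.25, gamma=2.0, beta=1.0 / 9.0):
    """Focal classification loss plus smooth-L1 box regression for one anchor.
    p: K class probabilities, y: labeled class index (0-based here),
    r / t: predicted and labeled 4-vector box regression values."""
    cls = 0.0
    for i, pi in enumerate(p):
        pt = pi if i == y else 1.0 - pi          # p_t per the piecewise rule
        a = alpha if i == y else 1.0 - alpha     # 0.25 weight on the positive class
        cls += -a * (1.0 - pt) ** gamma * math.log(pt)
    reg = 0.0
    for ri, ti in zip(r, t):
        d = abs(ri - ti)
        reg += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return cls + reg
```

The (1 - p_t)^gamma factor is what down-weights easy, well-classified anchors, so a confident correct prediction contributes far less loss than a confident wrong one.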
The gradient of each parameter is then calculated and each parameter value is updated; the learning rate for parameter updates is set to 0.1, and stochastic gradient descent (SGD) is adopted as the optimization algorithm;
Step 1-7, the training of one epoch is completed after steps 1-2 to 1-6 have been executed on all the small batches; steps 1-2 to 1-5 are then performed on each small batch of the validation set, followed by a post-processing operation consisting of box filtering and non-maximum suppression, to obtain the foreign matter recognition results on the validation data, and the all-class mean average precision is calculated from the recognition results and the validation labels;
performing the steps 1-2 to 1-5 on each small batch of the verification set, and calculating to obtain the frame after the anchor frame is adjusted according to the frame regression value, the position, the scale and the length-width ratio of the anchor frame; and meanwhile, calculating to obtain a frame obtained by the characteristic points according to the regression values of the characteristic points and the positions of the characteristic points. Then, in each feature layer prediction frame, top-k confidence coefficients are selected to be the highest, and the confidence coefficient is larger than t1And performing a cross-over ratio threshold of tiouThe non-maximum value suppression operation of (2) to obtain a foreign object identification result of the verification set data. And finally, calculating the average accuracy of the whole class according to the identification result and the label of the verification set.
In one embodiment, the 1000 boxes with the highest confidence, whose confidence also exceeds 0.05, are kept, and non-maximum suppression with an IoU threshold of 0.5 is performed to obtain the foreign matter recognition results on the validation data. Finally, the all-class mean average precision is calculated from the recognition results and the validation labels;
Step 1-8, steps 1-1 to 1-7 are repeated 10 to 20 times, and the model parameters of the epoch with the highest all-class mean average precision are selected as the parameters of the foreign matter identification network M.
In one embodiment, E is set to 12, i.e. 12 epochs. The learning rate is reduced to 0.1 times its previous value at the 8th and 11th epochs. Finally, the model parameters of the epoch with the highest all-class mean average precision are selected as the parameters of the foreign matter identification network M;
as a preferred technical solution of the present invention, the backbone network ResNet50 described in steps 1-3 includes layer2, layer3, layer4, and layer5, and is composed of 3, 4, 6, and 3 bolt sock modules, respectively, and the bolt sock of layer2 and layer3 is composed of 1 × 1 convolution, 3 × 3 convolution, 1 × 1 convolution, and hop connection. The bottle sock in layer4 and layer5 is composed of 1 × 1 convolution, deformable convolution, 1 × 1 convolution and jump connection;
in one embodiment, the anchor frame prediction network and the feature point-based prediction network of steps 1-5 are each comprised of classification branches and regression branches, each branch containing an L-layer convolution and a prediction layer. Each layer of convolution adopts group regularization, 256 channels are divided into 32 groups, and normalization and linear transformation are carried out in each group of data of each vehicle bottom picture sample.
The step 2 comprises the following steps: the method comprises the steps of dividing a vehicle bottom image into a plurality of window images by adopting a sliding window, loading image data of each window in a multi-process mode, preprocessing the window images and inputting the window images into a foreign matter recognition network M to obtain a foreign matter recognition result. The method specifically comprises the following steps:
Step 2-1, using a sliding window of height h and width w with stride s, N = [(H - h)//s + 1] × [(W - w)//s + 1] windows are obtained on a vehicle bottom image of height H and width W, where "//" denotes floor division;
in one example, using sliding windows with a length and width of 800, 1333, respectively, and a step size of 400, N =14 windows are obtained on a vehicle bottom image with a length and width of 1024, 3750, respectively.
Step 2-2, the N window images are distributed to P processes: each of the first P - 1 processes handles N//P window images and the last process handles N - (P - 1)(N//P) images, where "//" denotes floor division. Each process loads a foreign matter identification network M comprising the backbone network, the feature pyramid, the anchor-box-based prediction network and the feature-point-based prediction network;
in one embodiment, 14 window images are assigned to 4 processes, with the first 3 processes processing 3 window images and the last process processing 5 images. And loading a foreign object identification network M comprising a backbone network, a characteristic pyramid and an anchor frame prediction network in each process.
Step 2-3, the input window image is normalized with the ImageNet dataset statistics and converted into the PyTorch tensor data type;
Step 2-4, similar to steps 1-3 to 1-5, the preprocessed data is fed into the foreign matter recognition network M to obtain the predicted values. The adjusted anchor boxes are computed from the box regression values and the anchor positions, scales and aspect ratios; meanwhile, the boxes derived from the feature points are computed from the feature-point regression values and positions;
and 2-5, taking a frame with the confidence level larger than 0.5 from the predicted frames of each characteristic layer output in the step 2-4. And fusing the prediction frames of the characteristic layers of the window pictures processed by the processes, and executing non-maximum suppression operation with the intersection ratio threshold value of 0.5 to obtain the final foreign matter identification result. As shown in fig. 2, black boxes indicate the positions of the recognized alien materials, and characters on the boxes indicate the types of recognized alien materials, "gun", "knife", and "ax" indicate the gun, knife, and axe, respectively.
The speed and the all-class mean average precision of the model are evaluated on the 1000 validation samples from step 1 and compared with RetinaNet, a commonly used target-recognition method; the results are shown in Table 1:
TABLE 1
Method | Speed (frames/second) | All-class mean average precision (%)
RetinaNet | 0.5 | 85.3
The invention | 1.8 | 89.6
As can be seen from Table 1, the speed of the identification method of the invention is improved markedly, to more than 3 times that of the existing method, and on the premise of this large speed gain the all-class mean average precision is not reduced but improved.
Example 2
The invention also provides a sliding-window-based vehicle bottom picture foreign matter identification device, comprising a processor and a memory; the memory stores a program or instructions that, when loaded and executed by the processor, implement the vehicle bottom picture foreign matter identification method of embodiment 1.
Example 3
The invention also provides a computer-readable storage medium, which may be non-volatile or volatile, storing instructions that, when run on a computer, cause the computer to execute the vehicle bottom picture foreign matter identification method of embodiment 1.
It is clear to those skilled in the art that the technical solution of the invention, in essence or in the part contributing over the prior art, or in whole or in part, can be embodied as a software product stored on a storage medium, including instructions that cause a computer device (which may be a personal computer, a server or a network device) to execute all or part of the steps of the method of the embodiments of the invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB disk, a removable hard disk, a read-only memory (ROM), a random-access memory (RAM), a magnetic disk or an optical disk.
There are many ways to implement the technical solution of the sliding-window-based vehicle bottom picture foreign matter identification method, device and storage medium provided by the invention. The above is only a preferred embodiment of the invention; it should be noted that those skilled in the art can make several modifications and refinements without departing from the principle of the invention, and these should also be regarded as falling within the protection scope of the invention. All components not specified in this embodiment can be realized by the prior art.
Claims (7)
1. A vehicle bottom picture foreign matter identification method based on a sliding window is characterized by comprising the following steps:
step 1, training a foreign matter recognition network model M to obtain the parameters of the foreign matter recognition network;
step 2, segmenting the vehicle bottom picture into a plurality of window images with a sliding window, loading the image data of each window in multiple processes, preprocessing the window images and feeding them into the foreign matter identification network M to obtain the foreign matter recognition result; the method comprises the following steps:
step 2-1, segmenting a high-resolution vehicle bottom image by adopting sliding windows with the length and the width of h and w respectively to obtain a plurality of window images;
step 2-2, distributing the window image data evenly across a plurality of processes, and loading the foreign matter identification network M in each process, wherein the foreign matter identification network M comprises a backbone network, a feature pyramid, an anchor-frame-based prediction network, and a feature-point-based prediction network;
step 2-3, preprocessing the input window image by normalizing it and converting it into a PyTorch tensor data type;
step 2-4, inputting the preprocessed data into the foreign matter identification network M to obtain classification values and frame regression values for the anchor frames and for the feature points respectively;
step 2-5, obtaining predicted frames from the classification values and frame regression values output in step 2-4, and retaining the frames whose confidence exceeds t2, where t2 is a manually set confidence threshold; adding the anchor-frame-based and feature-point-based predicted frames from each feature layer of the window pictures processed by each process into a frame set, and performing a non-maximum suppression operation on the frame set to obtain the final foreign matter identification result.
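The inference pipeline of steps 2-1 to 2-5 above can be sketched in plain Python/NumPy; this is a minimal illustration, not the patented implementation — the window size, stride, IoU threshold, and the stub detections are assumed values, and the network itself is omitted:

```python
import numpy as np

def sliding_windows(img_h, img_w, h, w, stride_y, stride_x):
    """Return (y, x) top-left corners of h-by-w windows covering the image."""
    ys = list(range(0, max(img_h - h, 0) + 1, stride_y))
    xs = list(range(0, max(img_w - w, 0) + 1, stride_x))
    # make sure the bottom/right edge of the image is always covered
    if ys[-1] != img_h - h:
        ys.append(max(img_h - h, 0))
    if xs[-1] != img_w - w:
        xs.append(max(img_w - w, 0))
    return [(y, x) for y in ys for x in xs]

def nms(boxes, scores, iou_thr=0.5):
    """Greedy non-maximum suppression; boxes are (x1, y1, x2, y2)."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        rest = boxes[order[1:]]
        xx1 = np.maximum(boxes[i, 0], rest[:, 0])
        yy1 = np.maximum(boxes[i, 1], rest[:, 1])
        xx2 = np.minimum(boxes[i, 2], rest[:, 2])
        yy2 = np.minimum(boxes[i, 3], rest[:, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (rest[:, 2] - rest[:, 0]) * (rest[:, 3] - rest[:, 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= iou_thr]
    return keep

# filter by the confidence threshold t2, then suppress duplicate frames
t2 = 0.4  # assumed value inside the 0.3-0.6 range of claim 2
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], float)
scores = np.array([0.9, 0.8, 0.2])
keep_mask = scores > t2
kept = nms(boxes[keep_mask], scores[keep_mask])
```

Here the two overlapping high-confidence frames collapse to one after NMS, and the low-confidence frame never reaches the frame set at all.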
2. The method of claim 1, wherein the confidence threshold t2 is set between 0.3 and 0.6.
3. The method of claim 1, wherein step 1 comprises:
step 1-1, collecting vehicle bottom picture data containing foreign matter, comprising vehicle bottom images and the corresponding foreign matter bounding-box annotations and categories; dividing the vehicle bottom picture data into a training set and a validation set, randomly shuffling the training data and dividing it into a number of mini-batches, and directly dividing the validation set data into mini-batches;
step 1-2, preprocessing the input vehicle bottom image: randomly cropping a window of height h and width w from the input vehicle bottom image to obtain a window image, randomly flipping the window image horizontally and updating the corresponding foreign matter annotation frames, normalizing the window image and converting it into a PyTorch tensor, and concatenating the data within a mini-batch to obtain the input data of the foreign matter identification network M;
step 1-3, inputting the data obtained by preprocessing in the step 1-2 into a backbone network in a foreign matter identification network M to obtain a multi-scale characteristic diagram;
step 1-4, inputting the multi-scale feature map obtained in the step 1-3 into a feature pyramid network in a foreign matter identification network M to obtain a multi-scale feature map with fused features;
step 1-5, distributing anchor frames with various scales and length-width ratios on each feature map output in the step 1-4, and respectively inputting the feature maps into a prediction network based on the anchor frames and a prediction network based on feature points in the foreign matter identification network M to obtain a classification value and a frame regression value of each anchor frame and a classification value and a frame regression value of each feature point;
step 1-6, firstly, calculating a loss value according to the classification value and the frame regression value of the anchor frame, the classification value and the frame regression value of the feature point and the labeled data of the vehicle bottom image in the step 1-5; then calculating the gradient value of the network parameter and updating the network parameter value;
step 1-7, completing a round of training after steps 1-2 to 1-6 have been executed on all mini-batches; performing steps 1-2 to 1-5 on each mini-batch of the validation set, then performing a post-processing operation consisting of frame filtering and non-maximum suppression to obtain the foreign matter identification results on the validation set, and calculating the overall average accuracy from the identification results and the validation set labels;
step 1-8, repeating steps 1-1 to 1-7 E times, and selecting the model parameters of the round with the highest overall average accuracy as the parameters of the foreign matter identification network M, where E = 10 to 20.
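The augmentation part of step 1-2 — random window cropping with annotation-frame updates and optional horizontal flipping — can be sketched in plain Python. This is an illustrative sketch only; the clipping and box-dropping policy below is an assumed detail that the claim does not specify:

```python
import random

def crop_and_flip(img_w, img_h, boxes, h, w, flip, x0=None, y0=None):
    """Crop an h-by-w window from a (img_w x img_h) image and optionally
    flip it horizontally, updating (x1, y1, x2, y2) annotation frames.
    Coordinates are clipped to the window; frames that fall entirely
    outside the window are dropped (assumed policy)."""
    if x0 is None:
        x0 = random.randint(0, img_w - w)
    if y0 is None:
        y0 = random.randint(0, img_h - h)
    out = []
    for (x1, y1, x2, y2) in boxes:
        # shift into window coordinates and clip to the window extent
        nx1 = min(max(x1 - x0, 0), w)
        ny1 = min(max(y1 - y0, 0), h)
        nx2 = min(max(x2 - x0, 0), w)
        ny2 = min(max(y2 - y0, 0), h)
        if nx2 <= nx1 or ny2 <= ny1:
            continue  # frame lies outside the cropped window
        if flip:
            # mirror x-coordinates about the window width
            nx1, nx2 = w - nx2, w - nx1
        out.append((nx1, ny1, nx2, ny2))
    return (x0, y0), out
```

For example, a frame (10, 10, 20, 20) cropped at offset (5, 5) into a 30x30 window becomes (5, 5, 15, 15), and after a horizontal flip it becomes (15, 5, 25, 15).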
4. The method of claim 2, wherein the backbone network uses deformable convolution, and both the anchor-frame-based prediction network and the feature-point-based prediction network use group normalization.
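The group normalization used in both prediction heads of claim 4 can be illustrated with a minimal NumPy sketch; the tensor shape, group count, and epsilon below are assumed illustrative values, not taken from the patent:

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Group normalization over an (N, C, H, W) tensor: the C channels
    are split into num_groups groups, and each group is normalized to
    zero mean and unit variance independently for every sample."""
    n, c, h, w = x.shape
    g = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = g.mean(axis=(2, 3, 4), keepdims=True)
    var = g.var(axis=(2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, c, h, w)

x = np.random.randn(2, 8, 4, 4)  # assumed shape: 2 samples, 8 channels
y = group_norm(x, num_groups=2)  # each group of 4 channels normalized
```

Unlike batch normalization, the statistics here never mix samples, which is why group normalization behaves well at the small per-process batch sizes a sliding-window pipeline tends to use.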
5. The method of claim 2, wherein the frame regression values of the anchor-frame-based prediction network are the relative offsets of the frame centre-point coordinates and the adjustment values of the frame width and height, and the frame regression values of the feature-point-based prediction network are the distances from the feature point to the four boundaries of the frame.
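The two regression parameterizations contrasted in claim 5 can be sketched as encoding functions. The log-scale width/height adjustment in the anchor-based encoding is an assumption on my part (it is the common convention, e.g. in Faster R-CNN); the claim itself only says "adjustment value":

```python
import math

def encode_anchor(anchor, gt):
    """Anchor-based target: centre-point offsets relative to the anchor
    size, plus log-scale width/height adjustment factors (assumed form)."""
    ax, ay = (anchor[0] + anchor[2]) / 2, (anchor[1] + anchor[3]) / 2
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    gx, gy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    return ((gx - ax) / aw, (gy - ay) / ah,
            math.log(gw / aw), math.log(gh / ah))

def encode_point(px, py, gt):
    """Feature-point-based target: distances from the point (px, py) to
    the left, top, right, and bottom boundaries of the frame."""
    x1, y1, x2, y2 = gt
    return (px - x1, py - y1, x2 - px, y2 - py)
```

A feature point at the centre of a 10x10 frame thus regresses the four distances (5, 5, 5, 5), while an anchor that coincides exactly with its ground-truth frame regresses all-zero targets.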
6. A vehicle bottom picture foreign matter recognition device based on a sliding window is characterized by comprising a processor and a memory; the memory stores programs or instructions which are loaded and executed by the processor to realize the vehicle bottom picture foreign matter identification method as claimed in any one of claims 1 to 5.
7. A computer-readable storage medium on which a program or instructions are stored, the program or instructions, when executed by a processor, implementing the underbody picture foreign matter identification method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110588934.1A CN113033720B (en) | 2021-05-28 | 2021-05-28 | Vehicle bottom picture foreign matter identification method and device based on sliding window and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110588934.1A CN113033720B (en) | 2021-05-28 | 2021-05-28 | Vehicle bottom picture foreign matter identification method and device based on sliding window and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113033720A true CN113033720A (en) | 2021-06-25 |
CN113033720B CN113033720B (en) | 2021-08-13 |
Family ID: 76456161
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110588934.1A Active CN113033720B (en) | 2021-05-28 | 2021-05-28 | Vehicle bottom picture foreign matter identification method and device based on sliding window and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113033720B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109800716A (en) * | 2019-01-22 | 2019-05-24 | Huazhong University of Science and Technology | Feature-pyramid-based ship detection method for ocean remote sensing images |
CN110399816A (en) * | 2019-07-15 | 2019-11-01 | Guangxi University | High-speed train bottom foreign matter detection method based on Faster R-CNN |
CN111402211A (en) * | 2020-03-04 | 2020-07-10 | Guangxi University | High-speed train bottom foreign matter identification method based on deep learning |
CN111652228A (en) * | 2020-05-21 | 2020-09-11 | Harbin Kejia General Mechanical and Electrical Co., Ltd. | Railway wagon sleeper beam hole foreign matter detection method |
CN112164038A (en) * | 2020-09-16 | 2021-01-01 | Shanghai University of Electric Power | Photovoltaic hot spot detection method based on deep convolutional neural network |
CN112836713A (en) * | 2021-03-12 | 2021-05-25 | Nanjing University | Mesoscale convective system identification and tracking method based on anchor-free image detection |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114359948A (en) * | 2021-12-23 | 2022-04-15 | South China University of Technology | Power grid wiring diagram primitive identification method based on overlapping sliding window mechanism and YOLOv4 |
CN114359948B (en) * | 2021-12-23 | 2024-09-13 | South China University of Technology | Power grid wiring diagram primitive identification method based on overlapping sliding window mechanism and YOLOv4 |
Also Published As
Publication number | Publication date |
---|---|
CN113033720B (en) | 2021-08-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111738995B (en) | RGBD image-based target detection method and device and computer equipment | |
CN110032925B (en) | Gesture image segmentation and recognition method based on improved capsule network and algorithm | |
CN108805016B (en) | Head and shoulder area detection method and device | |
CN110956225A (en) | Contraband detection method and system, computing device and storage medium | |
CN110008853B (en) | Pedestrian detection network and model training method, detection method, medium and equipment | |
CN111738114B (en) | Vehicle target detection method based on anchor-free accurate sampling remote sensing image | |
CN112580662A (en) | Method and system for recognizing fish body direction based on image features | |
CN113221731B (en) | Multi-scale remote sensing image target detection method and system | |
CN110992314A (en) | Pavement defect detection method and device and storage medium | |
CN111695640A (en) | Foundation cloud picture recognition model training method and foundation cloud picture recognition method | |
CN111242066A (en) | Large-size image target detection method and device and computer readable storage medium | |
CN113033720B (en) | Vehicle bottom picture foreign matter identification method and device based on sliding window and storage medium | |
CN110334775B (en) | Unmanned aerial vehicle line fault identification method and device based on width learning | |
CN114399780B (en) | Form detection method, form detection model training method and device | |
CN114882423A (en) | Truck warehousing goods identification method based on improved Yolov5m model and Deepsort | |
CN114511731A (en) | Training method and device of target detector, storage medium and electronic equipment | |
CN111832641B (en) | Image identification method based on cascade downsampling convolution neural network | |
CN113657196A (en) | SAR image target detection method and device, electronic equipment and storage medium | |
CN113947723B (en) | High-resolution remote sensing scene target detection method based on size balance FCOS | |
CN112132207A (en) | Target detection neural network construction method based on multi-branch feature mapping | |
CN116612450A (en) | Point cloud scene-oriented differential knowledge distillation 3D target detection method | |
CN113936133B (en) | Self-adaptive data enhancement method for target detection | |
CN113963178A (en) | Method, device, equipment and medium for detecting infrared dim and small target under ground-air background | |
CN115205573A (en) | Image processing method, device and equipment | |
CN114255342A (en) | Improved YOLOv 4-based onshore typical target detection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |