CN110992361A - Engine fastener detection system and detection method based on cost balance - Google Patents


Publication number
CN110992361A
Authority
CN
China
Prior art keywords
fastener
classification
target
image
feature
Prior art date
Legal status
Pending
Application number
CN201911352939.3A
Other languages
Chinese (zh)
Inventor
黄家水
张发恩
管恒
唐永亮
Current Assignee
Innovation Qizhi (chengdu) Technology Co Ltd
Original Assignee
Innovation Qizhi (chengdu) Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Innovation Qizhi (chengdu) Technology Co Ltd filed Critical Innovation Qizhi (chengdu) Technology Co Ltd
Priority to CN201911352939.3A priority Critical patent/CN110992361A/en
Publication of CN110992361A publication Critical patent/CN110992361A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G06T7/001 Industrial image inspection using an image reference approach
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G06T7/0008 Industrial image inspection checking presence/absence

Abstract

The invention discloses a cost balance-based engine fastener detection system and detection method, wherein the system comprises: the image acquisition module is used for acquiring an image of the fastener to be detected; the classification detection model training module is used for training and forming a fastener classification detection model for performing classification detection on the fastener images by taking the fastener sample images as training samples; the fastener classification detection module is used for carrying out multi-scale image feature extraction on the fastener image based on the fastener classification detection model and identifying and outputting a fastener target identification classification result of the fastener image; the identification error calculation module is used for calculating the identification error of the fastener classification detection model; the classification detection model training module is also used for training and updating the fastener classification detection model by taking the fastener sample image and the identification error as the dual inputs of the convolutional neural network.

Description

Engine fastener detection system and detection method based on cost balance
Technical Field
The invention relates to the technical field of image recognition, in particular to a system and a method for detecting engine fasteners based on a cost balance convolutional neural network.
Background
Whether fasteners such as bolts, nuts, screws or screw holes mounted on the engine are installed in place, or whether any fastener is missing, affects the overall quality of the engine, so it is necessary to detect missing fasteners. The existing detection method relies mainly on visual inspection by production workers, and this traditional manual inspection is inefficient and prone to missed detections. To overcome this problem, some nut detection methods based on computer vision recognition technology have appeared in recent years, but the existing image detection methods mainly have the following two problems:
First, the existing image detection algorithms require manually extracted image features, including image pixel gradients and histograms, and local contrast normalization must be performed on the gradients to cope with changes in local illumination and in foreground and background contrast during image capture. The normalized features are then collected and input into a linear classifier for classification (distinguishing image foreground from background). However, in an engine bolt missing-detection scene, images acquired by an industrial camera often contain many noise sources, such as highlights and shadows produced by illumination and reflections from engine oil on the engine surface, all of which pose no small challenge to existing detection algorithms.
Second, complex high-dimensional abstract features cannot be extracted manually, which limits the robustness of such algorithms. Moreover, because engine models and structures differ, detecting engines of different models greatly increases the number of features and image templates that must be extracted manually, which in turn raises the hardware storage requirement.
Therefore, the existing algorithms for detecting engine bolt loss suffer from low robustness, poor adaptability and insufficient detection precision, and a further engine bolt loss detection algorithm needs to be developed to solve these technical problems.
Disclosure of Invention
The invention aims to provide a system and a method for detecting an engine fastener based on a cost balance convolutional neural network, so as to solve the technical problem.
In order to achieve the purpose, the invention adopts the following technical scheme:
the engine fastener detection system based on the cost balance convolutional neural network is used for carrying out classification, identification and detection on fasteners installed on an engine and comprises the following components:
the image acquisition module is used for acquiring an image of a fastener to be detected, which is arranged on the engine;
the classification detection model training module is used for training and forming a fastener classification detection model for performing classification detection on the fastener images by taking the fastener sample images as training samples and based on a convolution neural network;
the fastener classification detection module is respectively connected with the classification detection model training module and the image acquisition module and is used for carrying out multi-scale image feature extraction on the fastener image based on the fastener classification detection model and identifying and outputting a fastener target identification classification result of the fastener image based on the extracted multi-scale image feature related to the fastener image;
the identification error calculation module is connected with the fastener classification detection module and used for calculating a target position regression error and a target classification error of the fastener classification detection model and summing the target position regression error and the target classification error to obtain an identification error of the fastener classification detection model;
the classification detection model training module is also connected with the identification error calculation module and used for training and updating the fastener classification detection model by taking the fastener sample image and the identification error as the double inputs of the convolutional neural network.
As a preferred aspect of the present invention, the fastening member includes any one or more of a bolt, a nut, a screw, or a bolt hole installed on the engine.
As a preferred aspect of the present invention, a network structure of the convolutional neural network for training the fastener classification detection model includes a multi-scale feature extraction network and a multi-scale feature fusion network connected to the multi-scale feature extraction network, where the multi-scale feature extraction network includes:
a cascade feature extractor for extracting a plurality of feature maps with different scales corresponding to the fastener image in a cascade feature extraction manner, wherein the cascade feature extractor specifically comprises:
seven feature extractors which are sequentially cascaded, wherein the first feature extractor takes the fastener image as input and outputs the feature map corresponding to the fastener image; each subsequent feature extractor in the cascade takes the output of the preceding feature extractor as its input to perform multi-scale image feature extraction on the fastener image, so that seven feature maps of different scales related to the fastener image are finally output;
a pooling network connected to the cascaded feature extractor and configured to pool a part of the feature map output by the cascaded feature extractor, where the pooling network specifically includes:
each maximum pooling unit is correspondingly connected with one of the first to fourth feature extractors which are sequentially cascaded, and is used for performing maximum pooling operation on the feature graphs extracted by the connected feature extractors;
and the mean pooling unit is connected with a fifth feature extractor which is sequentially cascaded and is used for performing mean pooling operation on the feature map extracted by the fifth feature extractor.
As a preferred scheme of the present invention, the first feature extractor and the second feature extractor, which are sequentially cascaded, respectively include two convolutional layers sequentially arranged in front of and behind each other;
the third and fourth cascaded feature extractors respectively comprise three convolution layers which are sequentially arranged in the front and the back;
the fifth cascaded feature extractor comprises four convolution layers which are sequentially arranged in the front and the back;
the sixth and seventh cascaded feature extractors also respectively comprise two convolution layers which are sequentially arranged in the front and the back;
wherein each of the convolutional layers in the first feature extractor includes 64 filters therein;
a second of said feature extractors in cascade comprising 128 of said filters in each of said convolutional layers;
a third one of said feature extractors includes 256 filters in each of said convolutional layers;
512 filters are included in each of the convolutional layers in the fourth and fifth of the feature extractors.
As a preferable aspect of the present invention, each of the filters in each of the convolutional layers in the first to fifth feature extractors, which are cascade-connected in sequence, has a size of 3 × 3 and a convolution step size of 1;
the size of the filter in the former one of the convolutional layers in the cascade of the sixth and seventh feature extractors is 3 × 3 and the convolution step size is 1, and the size of the filter in the latter one of the convolutional layers is 1 × 1 and the convolution step size is 2.
As a preferable aspect of the present invention, the pooling core size of the maximum pooling unit is 2 × 2 and the step size is 2;
the pooling kernel size of the mean pooling unit is again 2 x 2 and the step size is again 2.
As a preferred scheme of the present invention, the multi-scale feature fusion network includes:
a target position regression module, configured to perform target position prediction on a target on each feature map output by the multi-scale feature extraction network and associated with the fastener image, and fuse target position prediction results made for each feature map to form a fastener target position prediction result associated with the fastener image, where the target position regression module specifically includes:
a target position regression object selection unit, configured to select the feature map serving as a target position regression object from the feature maps output by the multi-scale feature extraction network;
a target position prediction unit connected to the target position regression object selection unit, configured to predict a target position of the target on each feature map selected as a target position regression object, and output a target position prediction result for each feature map;
a target position prediction result fusion unit connected to the target position prediction unit and used for fusing each target position prediction result made for each feature map to form a fastener target position prediction result related to the fastener image;
a target classification module, connected to the target location regression module, configured to perform recognition classification on the predicted categories of the targets on the feature maps, and fuse target recognition classification results of the targets to form the fastener target recognition classification result associated with the fastener image, where the target classification module specifically includes:
the target classification unit is used for identifying and classifying the predicted categories of the targets on the characteristic graphs to obtain target identification and classification results corresponding to the characteristic graphs;
and the target identification and classification result fusion unit is connected with the target classification unit and is used for fusing each target identification and classification result to form the fastener target identification and classification result related to the fastener image.
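The fusion units above can be sketched as follows; since the patent does not specify the fusion operation, simple concatenation of the per-feature-map predictions followed by a confidence sort is assumed, and the (box, category, score) tuple layout is purely illustrative:

```python
def fuse_multiscale(predictions_per_map):
    """Fuse per-feature-map prediction lists into a single result list.

    predictions_per_map: one list of (box, category, score) tuples per
    selected feature map. Concatenation plus a best-first confidence
    sort is an assumption, not the patent's stated operation.
    """
    fused = []
    for preds in predictions_per_map:
        fused.extend(preds)
    return sorted(fused, key=lambda p: p[2], reverse=True)
```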
As a preferred aspect of the present invention, the identification error calculation module specifically includes:
a target position regression error calculation unit, configured to calculate, according to the fastener target position prediction result, the target position regression error when the fastener classification detection model regresses the target position corresponding to each of the targets on the fastener image;
the target classification error calculation unit is used for calculating the target classification error of the fastener classification detection model for performing target identification classification on each target on each fastener image according to the fastener target identification classification result;
and the identification error calculation unit is respectively connected with the target position regression error calculation unit and the target classification error calculation unit and is used for summing the target position regression error and the target classification error and calculating to obtain the identification error of the fastener classification detection model.
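A minimal sketch of this summation is given below; the smooth L1 form of the position regression error is an assumption (the patent does not name the regression loss), and the one-dimensional box coordinates are illustrative:

```python
def smooth_l1(pred, target):
    """One possible target position regression error term (smooth L1,
    assumed here; the patent does not name the regression loss)."""
    d = abs(pred - target)
    return 0.5 * d * d if d < 1.0 else d - 0.5

def identification_error(box_preds, box_targets, classification_error):
    """Identification error as claimed: the target position regression
    error summed with the target classification error."""
    regression_error = sum(smooth_l1(p, t)
                           for p, t in zip(box_preds, box_targets))
    return regression_error + classification_error
```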
As a preferred aspect of the present invention, the target classification error calculation unit specifically includes:
the positive sample cross entropy error calculation subunit is used for calculating a positive sample cross entropy error corresponding to the positive sample output by the fastener classification and identification model according to the fastener target identification and classification result;
the negative sample cross entropy error calculation subunit is used for calculating a negative sample cross entropy error corresponding to the negative sample output by the fastener classification and identification model according to the fastener target identification and classification result;
the classification error weighting subunit is respectively connected with the positive sample cross entropy error calculation subunit and the negative sample cross entropy error calculation subunit and is used for assigning classification error weights corresponding to the positive sample cross entropy error and the negative sample cross entropy error;
and the target classification error calculation subunit is connected with the classification error weighting subunit and is used for summing the weighted positive sample cross entropy error and the weighted negative sample cross entropy error to obtain the target classification error.
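The weighted summation in this claim can be sketched as follows; the weight values `w_pos` and `w_neg` are illustrative placeholders, since the patent only states that classification-error weights are assigned to the two cross entropies:

```python
import math

def cost_balanced_classification_error(pos_probs, neg_probs,
                                       w_pos=1.0, w_neg=0.25):
    """Target classification error as a weighted sum of the positive-
    and negative-sample cross entropies.

    pos_probs: probabilities the model assigns to ground-truth positives
    neg_probs: probabilities the model assigns to ground-truth negatives
    w_pos, w_neg: classification-error weights (illustrative values)
    """
    ce_pos = -sum(math.log(p) for p in pos_probs)  # positive-sample CE
    ce_neg = -sum(math.log(p) for p in neg_probs)  # negative-sample CE
    return w_pos * ce_pos + w_neg * ce_neg
```

With far more negative than positive samples, down-weighting the negative cross entropy (w_neg < w_pos) keeps the abundant negatives from dominating the training signal, which is the stated purpose of the cost balance.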
The invention also provides a method for detecting the engine fastener based on the cost balance convolutional neural network, which is realized by applying the system for detecting the engine fastener and specifically comprises the following steps:
step S1, the engine fastener detection system collects the fastener image to be detected which is installed on the engine;
step S2, the engine fastener detection system takes the fastener sample image as a training sample, and based on the convolutional neural network, trains and forms the fastener classification detection model for performing classification detection on the fastener image;
step S3, the engine fastener detection system extracts the multi-scale image features of the fastener image based on the fastener classification detection model, and identifies and outputs the fastener target identification classification result of the fastener image based on the extracted multi-scale image features related to the fastener image;
step S4, the engine fastener detection system calculates the target position regression error and the target classification error of the fastener classification detection model, and sums the target position regression error and the target classification error to obtain the identification error of the fastener classification detection model;
and step S5, the engine fastener detection system takes the fastener sample image and the identification error as the double input of the convolutional neural network, and trains and updates the fastener classification detection model.
The invention has the beneficial effects that:
1. the method adopts a deep convolutional neural network to automatically learn the features in the training samples, has higher efficiency and better robustness than traditional algorithms in which operators for extracting image features are designed manually, and also achieves a better detection effect on unclear image features such as highlights, shadows or oil stains on the engine surface caused by illumination;
2. by extracting the multi-scale features of the fastener image, the advantages of the image features at different scales can be fully exploited; for example, when a small target is detected the extracted local features are more salient, while when a large target is detected the global features are more complete. In the feature fusion stage, the fused features combine the advantages of feature maps at different scales, so that a better detection effect can finally be obtained;
3. the classification errors are balanced by giving weights corresponding to the cross entropy errors of the positive samples and the negative samples, and the problem that the identification precision of a fastener classification detection model is not high enough due to imbalance of the proportion of the positive samples and the negative samples is solved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic structural diagram of an engine fastener detection system based on a cost-balanced convolutional neural network according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of the convolutional neural network used to train the fastener classification detection model;
FIG. 3 is a schematic diagram of a network structure of the multi-scale feature extraction network in the convolutional neural network for training the fastener classification detection model;
FIG. 4 is a schematic diagram of the structure of the cascaded feature extractor in the multi-scale feature extraction network;
FIG. 5 is a schematic diagram of a network structure of the multi-scale feature fusion network in the convolutional neural network for training the fastener classification detection model;
FIG. 6 is a schematic structural diagram of the target location regression module in the multi-scale feature fusion network;
FIG. 7 is a schematic structural diagram of the object classification module in the multi-scale feature fusion network;
FIG. 8 is a schematic diagram of the identification error calculation module in the engine fastener detection system;
FIG. 9 is a schematic diagram of the structure of the target classification error calculation unit in the recognition error calculation module;
FIG. 10 is a diagram of method steps for a cost-balanced convolutional neural network-based engine fastener detection method, in accordance with an embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
The drawings are for illustrative purposes only and do not represent actual structures, and are not to be construed as limiting the present patent; to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product; it will be understood by those skilled in the art that certain well-known structures in the drawings and descriptions thereof may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components; in the description of the present invention, it should be understood that if the terms "upper", "lower", "left", "right", "inner", "outer", etc. are used for indicating the orientation or positional relationship based on the orientation or positional relationship shown in the drawings, it is only for convenience of description and simplification of description, but it is not indicated or implied that the referred device or element must have a specific orientation, be constructed in a specific orientation and be operated, and therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes and are not to be construed as limitations of the present patent, and the specific meanings of the terms may be understood by those skilled in the art according to specific situations.
In the description of the present invention, unless otherwise explicitly specified or limited, the term "connected" or the like, if appearing to indicate a connection relationship between the components, is to be understood broadly, for example, as being fixed or detachable or integral; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or may be connected through one or more other components or may be in an interactive relationship with one another. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The engine fastener detection system based on the cost balance convolutional neural network provided by the embodiment of the invention is used for carrying out classification, identification and detection on fasteners installed on an engine, and referring to fig. 1, the engine fastener detection system comprises:
the image acquisition module 1 is used for acquiring an image of a fastener to be detected, which is arranged on the engine;
the classification detection model training module 2 is used for training and forming a fastener classification detection model for performing classification detection on the fastener images by taking the fastener sample images as training samples and based on a convolution neural network;
the fastener classification detection module 3 is respectively connected with the classification detection model training module 2 and the image acquisition module 1, and is used for carrying out multi-scale image feature extraction on a fastener image based on a fastener classification detection model, and identifying and outputting a fastener target identification classification result of the fastener image based on the extracted multi-scale image feature related to the fastener image;
the identification error calculation module 4 is connected with the fastener classification detection module 3 and used for calculating a target position regression error and a target classification error of the fastener classification detection model and summing the target position regression error and the target classification error to obtain an identification error of the fastener classification detection model;
the classification detection model training module 2 is also connected with the identification error calculation module 4, and is used for training and updating the fastener classification detection model by taking the fastener sample image and the identification error as the double inputs of the convolutional neural network.
In the above technical solution, the fastener includes any one or more of a bolt, a nut, a screw, or a screw hole installed on the engine.
In the above technical solution, referring to fig. 2, a network structure of a convolutional neural network for training a fastener classification detection model includes a multi-scale feature extraction network 10 and a multi-scale feature fusion network 20 connected to the multi-scale feature extraction network 10, referring to fig. 3 and 4, where the multi-scale feature extraction network 10 includes:
the cascade feature extractor is used for extracting a plurality of feature maps with different scales corresponding to the fastener image in a cascade feature extraction mode, and specifically comprises the following steps:
seven feature extractors 101 which are sequentially cascaded, wherein the first feature extractor 101 takes a fastener image as input and outputs a feature map corresponding to the fastener image; each subsequent feature extractor 101 in the cascade takes the output of the preceding feature extractor 101 as its input to perform multi-scale image feature extraction on the fastener image, so that seven feature maps of different scales related to the fastener image are finally output;
the pooling network is connected with the cascade feature extractor and is used for pooling the partial feature map output by the cascade feature extractor, and the pooling network specifically comprises:
each maximum pooling unit 102 is correspondingly connected with one of the first to fourth feature extractors 101 which are sequentially cascaded, and is used for performing maximum pooling operation on feature maps extracted by the connected feature extractors 101;
the mean pooling unit 103 is connected with the fifth feature extractor 101 in the cascade and is used for performing a mean pooling operation on the feature map extracted by the fifth feature extractor 101.
in the above technical solution, preferably, the first feature extractor 101 and the second feature extractor 101, which are sequentially cascaded, respectively include two convolutional layers sequentially arranged in front and at the back;
the cascaded third and fourth feature extractors 101 respectively comprise three convolution layers which are sequentially arranged in the front and the back;
the fifth cascaded feature extractor 101 comprises four convolution layers arranged in sequence from front to back;
the sixth and seventh cascaded feature extractors 101 also respectively include two convolution layers arranged in sequence from front to back;
wherein each convolutional layer in the first feature extractor 101 comprises 64 filters;
each convolutional layer in the second feature extractor 101 in the cascade comprises 128 filters;
each convolutional layer in the cascaded third feature extractor 101 includes 256 filters;
512 filters are included in each convolutional layer in the cascaded fourth and fifth feature extractors 101.
More preferably, the size of each filter in each convolution layer in the first to fifth feature extractors 101 cascade-connected in sequence is 3 × 3 and the convolution step size is 1;
the size of the filter in the former convolutional layer in each of the sixth and seventh feature extractors 101 is 3 × 3 with a convolution step of 1, and the size of the filter in the latter convolutional layer is 1 × 1 with a convolution step of 2, so that the feature map output by each of the sixth and seventh feature extractors 101 is halved in size relative to its input.
In the above technical solution, preferably, the pooling core size of the maximum pooling unit 102 is 2 × 2 and the step size is 2;
the pooled kernel size of the mean pooling unit 103 is again preferably 2 x 2 and the step size is again preferably 2.
Referring to fig. 5, the multi-scale feature fusion network 20 includes:
a target position regression module 201, configured to perform target position prediction on a target on each feature map output by the multi-scale feature extraction network and associated with the fastener image, and fuse target position prediction results made for each feature map to form a fastener target position prediction result associated with the fastener image, with reference to fig. 6, the target position regression module 201 specifically includes:
a target position regression object selection unit 2011, configured to select a feature map serving as a target position regression object from feature maps output by the multi-scale feature extraction network;
a target position prediction unit 2012 connected to the target position regression object selection unit 2011 and configured to predict target positions of the targets on the feature maps selected as the target position regression object and output a target position prediction result for each feature map;
a target position prediction result fusion unit 2013, connected to the target position prediction unit 2012, for fusing the target position prediction results to form a fastener target position prediction result associated with the fastener image;
the object classification module 202 is connected to the object location regression module 201, and configured to perform recognition and classification on the predicted categories of the objects on the feature maps, and fuse the object recognition and classification results of the objects on the feature maps to form a fastener object recognition and classification result related to the fastener image, with reference to fig. 7, the object classification module 202 specifically includes:
the target classification unit 2021 is configured to perform recognition classification on the predicted categories of the targets on the feature maps to obtain target recognition classification results corresponding to the feature maps;
the target recognition and classification result fusion unit 2022 is connected to the target classification unit 2021, and is configured to fuse the target recognition and classification results to form a fastener target recognition and classification result associated with the fastener image.
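The patent does not state the fusion operator used by the fusion units 2013 and 2022; a natural reading is that the per-feature-map predictions are simply concatenated into one prediction set. The sketch below makes that assumption, so `fuse_predictions` and the zero-filled toy arrays are illustrative, not the patent's own method:

```python
import numpy as np

def fuse_predictions(per_map_boxes, per_map_scores):
    """Fuse per-feature-map outputs into a single prediction set.

    per_map_boxes:  list of (n_i, 4) arrays, one per feature map (x, y, h, w per frame)
    per_map_scores: list of (n_i, 5) arrays, one per feature map (five class scores)
    Returns (N, 4) boxes and (N, 5) scores, where N is the sum of the n_i.
    """
    return (np.concatenate(per_map_boxes, axis=0),
            np.concatenate(per_map_scores, axis=0))

# Two toy feature maps contributing 6 and 4 detection frames respectively.
boxes = [np.zeros((6, 4)), np.zeros((4, 4))]
scores = [np.zeros((6, 5)), np.zeros((4, 5))]
b, s = fuse_predictions(boxes, scores)
print(b.shape, s.shape)   # (10, 4) (10, 5)
```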
In the above technical solution, referring to fig. 8 specifically, the identification error calculation module 4 specifically includes:
a target position regression error calculation unit 41, configured to calculate a target position regression error when the fastener classification detection model regresses the target position corresponding to each target on the fastener image, according to the fastener target position prediction result;
a target classification error calculation unit 42, configured to calculate, according to the result of the fastener target identification and classification, a target classification error of the fastener classification detection model for performing target identification and classification on each target on the fastener image;
and the identification error calculation unit 43 is respectively connected with the target position regression error calculation unit 41 and the target classification error calculation unit 42, and is used for summing the target position regression error and the target classification error to calculate and obtain the identification error of the fastener classification detection model.
Referring to fig. 9, the target classification error calculation unit 42 specifically includes:
the positive sample cross entropy error calculation subunit 421 is configured to calculate, according to the fastener target identification classification result, a positive sample cross entropy error corresponding to a positive sample output by the fastener classification identification model;
the negative sample cross entropy error calculation subunit 422 is used for calculating a negative sample cross entropy error corresponding to the negative sample output by the fastener classification and identification model according to the fastener target identification and classification result;
a classification error weighting subunit 423 connected to the positive sample cross entropy error calculation subunit 421 and the negative sample cross entropy error calculation subunit 422, respectively, and configured to assign classification error weights corresponding to the positive sample cross entropy error and the negative sample cross entropy error;
and the target classification error calculating subunit 424 is connected to the classification error weighting subunit 423, and is configured to sum and calculate the weighted positive sample cross entropy error and the weighted negative sample cross entropy error to obtain a target classification error.
In the above technical solutions, the target position regression error is calculated using a smoothed absolute-value error (smooth L1 loss). Calculating a target position regression error with a smoothed absolute-value error is prior art and is not within the protection scope of the claimed invention, so it is not described in detail here.
The target classification error is calculated using a cross entropy loss function. Calculating a target classification error with a cross entropy loss function is likewise prior art and is not within the protection scope of the claims, so it is not described in detail here either.
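For reference, the two prior-art losses named above can be sketched as follows; the summation scheme and the `1e-12` numerical-stability guard are implementation assumptions, not details taken from the patent:

```python
import numpy as np

def smooth_l1(pred, target):
    """Smoothed absolute-value (smooth L1) error, summed over the frame parameters."""
    d = np.abs(pred - target)
    return np.sum(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5))

def cross_entropy(probs, label):
    """Cross entropy of one predicted class distribution against the true label.

    The 1e-12 guard against log(0) is an implementation assumption.
    """
    return -np.log(probs[label] + 1e-12)

print(smooth_l1(np.array([0.2, 1.8]), np.array([0.0, 0.0])))    # 0.5*0.2**2 + (1.8-0.5)
print(cross_entropy(np.array([0.1, 0.7, 0.1, 0.05, 0.05]), 1))  # -log(0.7)
```

The smooth L1 error is quadratic for small residuals and linear for large ones, which makes the position regression less sensitive to outlier boxes than a pure squared error.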
In the above technical solution, a method for predicting a target position corresponding to a target on a fastener image by using a multi-scale feature fusion network is briefly described as follows:
the engine fastener detection system provided by the embodiment of the invention does not fuse target position prediction results over all 7 feature maps of different image scales output by the multi-scale feature extraction network; it fuses the prediction results of only the 4 smallest-scale feature maps associated with the fastener image that the multi-scale feature extraction network outputs. On these 4 feature maps the system predicts, per 1 × 1 cell, 4, 6 and 6 detection frames of different scales and aspect ratios respectively, where each detection frame comprises four detection frame parameters: the coordinate position (x, y) of the detection frame center point, and the detection frame height h and width w;
first, the target position regression module applies a 3 × 3 convolution (convolution step size 1, padding 0) to the four feature maps and outputs n × 4 values, where n is the number of detection frames and "4" corresponds to the four detection frame parameters above.
Then, the target classification module recognizes and classifies the categories of the n detection frames (the convolution kernel size used is also 3 × 3, with convolution step size 1 and padding 0) and outputs n × 5 values, where n is the number of detection frames and "5" corresponds to the five predicted categories, namely bolt, nut, screw, screw hole, or image background.
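The relationship between feature-map resolution, per-cell detection frames, and the n × 4 and n × 5 output shapes can be checked with simple arithmetic. The map resolutions below are hypothetical (the patent gives per-cell frame counts such as 4 and 6 but not the map sizes), so treat this only as a shape sanity check:

```python
# Hypothetical feature-map sizes (H, W) with per-cell detection-frame counts k;
# the patent specifies counts like 4 and 6 per cell but not the map resolutions.
feature_maps = [(10, 10, 6), (5, 5, 6), (3, 3, 6), (1, 1, 4)]

# Each 1x1 cell of each map proposes k detection frames.
n = sum(h * w * k for h, w, k in feature_maps)

print(n)        # total number of detection frames n
print((n, 4))   # position regression output: four parameters (x, y, h, w) per frame
print((n, 5))   # classification output: five class scores per frame
```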
In addition, because the image background usually occupies most of the pixels in a fastener image during engine fastener detection, detection experience shows that of the 10^4 to 10^5 candidate positions in each fastener image processed by the detection system, only a few contain detection targets. This makes the numbers of positive and negative samples differ too greatly, which degrades the model and reduces detection accuracy. The classification error is therefore balanced by giving corresponding weights to the positive and negative sample cross entropy errors, so that the fastener classification detection model pays more attention to the recognition and classification of the positive samples.
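A minimal sketch of the weighted positive/negative cross entropy balancing described above follows; the weight values `w_pos` and `w_neg` are illustrative assumptions, since the patent states that weights are assigned but does not give their values:

```python
import numpy as np

def balanced_classification_error(pos_probs, neg_probs, w_pos=0.75, w_neg=0.25):
    """Sum the cross entropy errors of positive and negative samples after weighting.

    pos_probs: predicted probability of the true foreground class per positive sample
    neg_probs: predicted probability of the background class per negative sample
    w_pos/w_neg: illustrative weights; the patent assigns weights without stating values
    """
    ce_pos = -np.log(pos_probs + 1e-12).sum()
    ce_neg = -np.log(neg_probs + 1e-12).sum()
    return w_pos * ce_pos + w_neg * ce_neg

# Few positives vs. many easy negatives: weighting keeps the positives influential.
pos = np.array([0.6, 0.8])
neg = np.full(100, 0.99)
print(balanced_classification_error(pos, neg))
```

With equal weights, the 100 easy negatives would dominate the sum; the asymmetric weighting is what steers training toward the rare positive samples.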
The invention also provides an engine fastener detection method based on a cost-balanced convolutional neural network, which is implemented by applying the above engine fastener detection system and specifically comprises the following steps:
step S1, the engine fastener detection system collects the fastener image to be detected which is arranged on the engine;
step S2, the engine fastener detection system takes the fastener sample image as a training sample and, based on a convolutional neural network, trains and forms a fastener classification detection model for performing classification detection on the fastener image;
step S3, the engine fastener detection system extracts the multi-scale image characteristics of the fastener image based on the fastener classification detection model, and identifies and outputs the fastener target identification classification result of the fastener image based on the extracted multi-scale image characteristics related to the fastener image;
step S4, the engine fastener detection system calculates the target position regression error and the target classification error of the fastener classification detection model, and sums the target position regression error and the target classification error to obtain the identification error of the fastener classification detection model;
and step S5, the engine fastener detection system takes the fastener sample image and the recognition error as the double inputs of the convolutional neural network, and trains and updates the fastener classification detection model.
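Steps S2 through S5 form a train-evaluate-update loop. The toy skeleton below illustrates only that loop structure (compute an identification error as the sum of a position regression error and a classification error, then update the model from it); the linear model, random data, and subgradient update are stand-ins for illustration, not the patent's convolutional neural network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 8))      # stand-in for fastener sample images (steps S1/S2 input)
y = X @ rng.normal(size=(8, 4))   # stand-in ground-truth detection frame parameters
w = np.zeros((8, 4))              # stand-in model parameters

def identification_error(w):
    pred = X @ w                                   # step S3: model output
    position_regression_error = np.abs(pred - y).mean()
    target_classification_error = 0.1              # stand-in constant
    return position_regression_error + target_classification_error  # step S4: summed

before = identification_error(w)
for _ in range(200):              # step S5: feed the error back and update the model
    grad = X.T @ np.sign(X @ w - y) / len(X)
    w -= 0.05 * grad
after = identification_error(w)
print(after < before)             # the identification error decreased during training
```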
In conclusion, the method has strong robustness and adaptability against missed detection of engine fasteners, and achieves high detection accuracy.
It should be understood that the above-described embodiments are merely preferred embodiments of the invention and the technical principles applied thereto. It will be understood by those skilled in the art that various modifications, equivalents, changes, and the like can be made to the present invention. However, such variations are within the scope of the invention as long as they do not depart from the spirit of the invention. In addition, certain terms used in the specification and claims of the present application are not limiting, but are used merely for convenience of description.

Claims (10)

1. An engine fastener detection system based on a cost balanced convolutional neural network is used for carrying out classification, identification and detection on fasteners installed on an engine, and is characterized by comprising:
the image acquisition module is used for acquiring an image of a fastener to be detected, which is arranged on the engine;
the classification detection model training module is used for training and forming a fastener classification detection model for performing classification detection on the fastener images by taking the fastener sample images as training samples and based on a convolutional neural network;
the fastener classification detection module is respectively connected with the classification detection model training module and the image acquisition module and is used for carrying out multi-scale image feature extraction on the fastener image based on the fastener classification detection model and identifying and outputting a fastener target identification classification result of the fastener image based on the extracted multi-scale image feature related to the fastener image;
the identification error calculation module is connected with the fastener classification detection module and used for calculating a target position regression error and a target classification error of the fastener classification detection model and summing the target position regression error and the target classification error to obtain an identification error of the fastener classification detection model;
the classification detection model training module is also connected with the identification error calculation module and used for training and updating the fastener classification detection model by taking the fastener sample image and the identification error as the double inputs of the convolutional neural network.
2. The engine fastener detection system of claim 1, wherein the fastener comprises any one or more of a bolt, a nut, a screw, or a threaded bore mounted to the engine.
3. The engine fastener detection system of claim 1, wherein the network architecture of the convolutional neural network used to train the fastener classification detection model includes a multi-scale feature extraction network and a multi-scale feature fusion network connecting the multi-scale feature extraction network, wherein the multi-scale feature extraction network includes:
a cascade feature extractor for extracting a plurality of feature maps with different scales corresponding to the fastener image in a cascade feature extraction manner, wherein the cascade feature extractor specifically comprises:
seven feature extractors which are sequentially cascaded, wherein the first feature extractor takes the fastener image as input and outputs the feature map corresponding to the fastener image; each subsequent feature extractor in the cascade takes the output of the preceding feature extractor as input and performs multi-scale image feature extraction on the fastener image, finally outputting seven feature maps of different scales associated with the fastener image;
a pooling network connected to the cascaded feature extractor and configured to pool a part of the feature map output by the cascaded feature extractor, where the pooling network specifically includes:
each maximum pooling unit is correspondingly connected with one of the first to fourth feature extractors which are sequentially cascaded, and is used for performing maximum pooling operation on the feature graphs extracted by the connected feature extractors;
and the mean pooling unit is connected with a fifth feature extractor which is sequentially cascaded and is used for performing mean pooling operation on the feature map extracted by the fifth feature extractor.
4. The engine fastener detection system of claim 3, wherein the first said feature extractor and the second said feature extractor in the sequential cascade respectively comprise two convolutional layers arranged one after the other;
the third and fourth cascaded feature extractors respectively comprise three convolution layers which are sequentially arranged in the front and the back;
the fifth cascaded feature extractor comprises four convolution layers which are sequentially arranged in the front and the back;
the sixth and seventh cascaded feature extractors also respectively comprise two convolution layers which are sequentially arranged in the front and the back;
wherein each of the convolutional layers in the first feature extractor includes 64 filters therein;
a second of said feature extractors in cascade comprising 128 of said filters in each of said convolutional layers;
a third one of said feature extractors includes 256 filters in each of said convolutional layers;
512 filters are included in each of the convolutional layers in the fourth and fifth of the feature extractors.
5. The engine fastener detection system of claim 4, wherein each of said filters in each of said convolutional layers in the first through fifth said feature extractors cascaded in sequence is 3 x 3 in size and has a convolution step size of 1;
the size of the filter in the former one of the convolutional layers in the cascade of the sixth and seventh feature extractors is 3 × 3 and the convolution step size is 1, and the size of the filter in the latter one of the convolutional layers is 1 × 1 and the convolution step size is 2.
6. The engine fastener detection system of claim 3, wherein the pooling kernel size of the maximum pooling unit is 2 × 2 and the step size is 2;
the pooling kernel size of the mean pooling unit is likewise 2 × 2 and the step size is likewise 2.
7. The engine fastener detection system of claim 5, wherein the multi-scale feature fusion network comprises:
a target position regression module, configured to perform target position prediction on a target on each feature map output by the multi-scale feature extraction network and associated with the fastener image, and fuse target position prediction results made for each feature map to form a fastener target position prediction result associated with the fastener image, where the target position regression module specifically includes:
a target position regression object selection unit, configured to select the feature map serving as a target position regression object from the feature maps output by the multi-scale feature extraction network;
a target position prediction unit connected to the target position regression object selection unit, configured to predict a target position of the target on each feature map selected as a target position regression object, and output a target position prediction result for each feature map;
a target position prediction result fusion unit connected to the target position prediction unit and used for fusing each target position prediction result made for each feature map to form a fastener target position prediction result related to the fastener image;
a target classification module, connected to the target location regression module, configured to perform recognition classification on the predicted categories of the targets on the feature maps, and fuse target recognition classification results of the targets to form the fastener target recognition classification result associated with the fastener image, where the target classification module specifically includes:
the target classification unit is used for identifying and classifying the predicted categories of the targets on the characteristic graphs to obtain target identification and classification results corresponding to the characteristic graphs;
and the target identification and classification result fusion unit is connected with the target classification unit and is used for fusing each target identification and classification result to form the fastener target identification and classification result related to the fastener image.
8. The engine fastener detection system of claim 7, wherein the identification error calculation module specifically comprises:
a target position regression error calculation unit, configured to calculate, according to the fastener target position prediction result, the target position regression error when the fastener classification detection model regresses the target position corresponding to each of the targets on the fastener image;
the target classification error calculation unit is used for calculating the target classification error of the fastener classification detection model for performing target identification classification on each target on each fastener image according to the fastener target identification classification result;
and the identification error calculation unit is respectively connected with the target position regression error calculation unit and the target classification error calculation unit and is used for summing the target position regression error and the target classification error and calculating to obtain the identification error of the fastener classification detection model.
9. The engine fastener detection system of claim 8, wherein the target classification error calculation unit specifically includes:
the positive sample cross entropy error calculation subunit is used for calculating a positive sample cross entropy error corresponding to the positive sample output by the fastener classification and identification model according to the fastener target identification and classification result;
the negative sample cross entropy error calculation subunit is used for calculating a negative sample cross entropy error corresponding to the negative sample output by the fastener classification and identification model according to the fastener target identification and classification result;
the classification error weighting subunit is respectively connected with the positive sample cross entropy error calculation subunit and the negative sample cross entropy error calculation subunit and is used for assigning classification error weights corresponding to the positive sample cross entropy error and the negative sample cross entropy error;
and the target classification error calculation subunit is connected with the classification error weighting subunit and is used for summing the weighted positive sample cross entropy error and the weighted negative sample cross entropy error to obtain the target classification error.
10. An engine fastener detection method based on a cost balance convolutional neural network is realized by applying the engine fastener detection system as in any one of claims 1 to 9, and is characterized by specifically comprising the following steps of:
step S1, the engine fastener detection system collects the fastener image to be detected which is installed on the engine;
step S2, the engine fastener detection system takes the fastener sample image as a training sample, and based on the convolutional neural network, trains and forms the fastener classification detection model for performing classification detection on the fastener image;
step S3, the engine fastener detection system extracts the multi-scale image features of the fastener image based on the fastener classification detection model, and identifies and outputs the fastener target identification classification result of the fastener image based on the extracted multi-scale image features related to the fastener image;
step S4, the engine fastener detection system calculates the target position regression error and the target classification error of the fastener classification detection model, and sums the target position regression error and the target classification error to obtain the identification error of the fastener classification detection model;
and step S5, the engine fastener detection system takes the fastener sample image and the identification error as the double input of the convolutional neural network, and trains and updates the fastener classification detection model.
CN201911352939.3A 2019-12-25 2019-12-25 Engine fastener detection system and detection method based on cost balance Pending CN110992361A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911352939.3A CN110992361A (en) 2019-12-25 2019-12-25 Engine fastener detection system and detection method based on cost balance


Publications (1)

Publication Number Publication Date
CN110992361A true CN110992361A (en) 2020-04-10

Family

ID=70075252

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911352939.3A Pending CN110992361A (en) 2019-12-25 2019-12-25 Engine fastener detection system and detection method based on cost balance

Country Status (1)

Country Link
CN (1) CN110992361A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113049600A (en) * 2021-03-31 2021-06-29 上汽通用五菱汽车股份有限公司 Engine part wrong and neglected loading detection method and system based on visual detection
WO2022205502A1 (en) * 2021-04-01 2022-10-06 华中科技大学 Image classification model construction method, image classification method, and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875381A (en) * 2017-01-17 2017-06-20 同济大学 A kind of phone housing defect inspection method based on deep learning
CN106960195A (en) * 2017-03-27 2017-07-18 深圳市丰巨泰科电子有限公司 A kind of people counting method and device based on deep learning
CN107481231A (en) * 2017-08-17 2017-12-15 广东工业大学 A kind of handware defect classifying identification method based on depth convolutional neural networks
US20180157938A1 (en) * 2016-12-07 2018-06-07 Samsung Electronics Co., Ltd. Target detection method and apparatus
CN108230317A (en) * 2018-01-09 2018-06-29 北京百度网讯科技有限公司 Steel plate defect detection sorting technique, device, equipment and computer-readable medium
CN108742679A (en) * 2018-06-29 2018-11-06 上海联影医疗科技有限公司 Nodule detection device and method
CN109115501A (en) * 2018-07-12 2019-01-01 哈尔滨工业大学(威海) A kind of Civil Aviation Engine Gas path fault diagnosis method based on CNN and SVM
CN110136126A (en) * 2019-05-17 2019-08-16 东南大学 Cloth textured flaw detection method based on full convolutional neural networks
WO2019190405A1 (en) * 2018-03-29 2019-10-03 Agency For Science, Technology And Research Method and apparatus for detecting condition of a bolt on a bolted structure
CN110569843A (en) * 2019-09-09 2019-12-13 中国矿业大学(北京) Intelligent detection and identification method for mine target



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200410