CN112365497A - High-speed target detection method and system based on Trident Net and Cascade-RCNN structures - Google Patents


Info

Publication number
CN112365497A
CN112365497A
Authority
CN
China
Prior art keywords
network
target detection
training
data
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011405295.2A
Other languages
Chinese (zh)
Inventor
刘凤余
张琦
张平平
张冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhuofan Information Technology Co ltd
Original Assignee
Shanghai Zhuofan Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhuofan Information Technology Co ltd filed Critical Shanghai Zhuofan Information Technology Co ltd
Priority to CN202011405295.2A priority Critical patent/CN112365497A/en
Publication of CN112365497A publication Critical patent/CN112365497A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a high-speed target detection method based on the TridentNet and Cascade-RCNN structures. The method comprises: acquiring a data set of target detection images and performing enhancement processing on the images in the data set; constructing a neural network comprising a feature extraction network and a prediction network, wherein the feature extraction network fuses a plurality of backbone networks and includes a feature pyramid network, a deformable convolution network is fused into each backbone network, and the prediction network comprises a dual-branch structure; training the neural network on the enhanced data set, judging targets according to an IOU threshold during training, and thereby obtaining a target detection model; and performing target detection on an image to be detected through the target detection model. The invention also provides a system comprising modules corresponding to the steps of the method.

Description

High-speed target detection method and system based on TridentNet and Cascade-RCNN structures
Technical Field
The invention relates to the field of artificial intelligence and machine learning, in particular to the field of deep learning, and discloses a high-speed target detection method and system based on the TridentNet and Cascade-RCNN structures.
Background
In many application scenarios of target detection, the distance between the target and the detector (sensor) varies, so targets of the same class appear at different scales; detecting targets of different sizes at a single scale inevitably produces defects and low accuracy, which forces the use of multi-scale detection. Although conventional detection models can operate in a multi-scale mode, their preset templates give targets of different sizes different feature expression capabilities; for example, targets that are too large or too small are hard to detect accurately. To make the model's expression capability comparable across target sizes, TridentNet introduced a scale-aware parallel structure: it was the first to show that the receptive field affects objects of different sizes in a target detection task, it uses dilated convolutions to obtain feature maps with different receptive fields, and it shares parameters across branches to keep the parameter count and computation small.
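The receptive-field point above can be made concrete with a short calculation (not code from the patent; the four-layer stack and 3x3 kernel below are assumed values): three branches built from identical stride-1 convolutions that differ only in dilation rate see very different amounts of context while sharing the same weights.

```python
# Sketch: receptive field of a stack of identical stride-1 dilated convolutions.
# TridentNet's three branches share weights but use different dilation rates,
# so the same stack covers small, medium, and large context windows.

def receptive_field(num_layers, kernel=3, dilation=1):
    """Receptive field after num_layers stride-1 convolutions."""
    effective_kernel = dilation * (kernel - 1) + 1
    rf = 1
    for _ in range(num_layers):
        rf += effective_kernel - 1  # stride 1, so the sampling jump stays 1
    return rf

# Three weight-sharing branches over the same assumed 4-layer stack:
print({d: receptive_field(4, dilation=d) for d in (1, 2, 3)})
# dilation 1 -> 9, dilation 2 -> 17, dilation 3 -> 25
```

The branch with dilation 3 covers nearly three times the context of the dilation-1 branch, which is what lets one shared set of weights serve small, medium, and large targets.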
In the target detection task, the threshold chosen for the intersection-over-union (IOU) strongly influences detection: a higher threshold yields higher-quality samples. Blindly pursuing a high threshold, however, raises two problems: (1) the reduced number of samples induces overfitting; (2) using different thresholds during training and inference easily leads to a mismatch. Cascade-RCNN is a cascaded detection structure whose core idea is to train positive and negative samples stage by stage on different networks with a steadily rising threshold, so that the detector of each stage focuses on proposals whose IOU lies in a certain range; because the output IOU is generally larger than the input IOU, the detection quality improves from stage to stage.
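The IOU threshold rule described above can be sketched in a few lines; the box format (x1, y1, x2, y2) and the example thresholds are illustrative choices, not values fixed by this passage.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def is_positive(proposal, ground_truth, threshold):
    """A proposal counts as a positive sample when its IoU meets the threshold."""
    return iou(proposal, ground_truth) >= threshold

gt = (0, 0, 10, 10)
print(is_positive((0, 0, 10, 5), gt, 0.4))  # IoU = 0.5 -> True
print(is_positive((0, 0, 10, 5), gt, 0.6))  # IoU = 0.5 -> False
```

The same proposal flips from positive to negative as the threshold rises, which is exactly why a single high threshold starves the detector of positive samples while a rising cascade of thresholds does not.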
In order to obtain a more accurate target detection result by combining the advantages of TridentNet and Cascade-RCNN, this patent proposes a method that combines the two model structures, making full use of the advantages of TridentNet and Cascade-RCNN to solve the multi-scale problem and the IOU-threshold selection problem in target detection.
Disclosure of Invention
In order to at least partially solve the above problems, the present invention provides a method and a system for high-speed target detection based on the TridentNet and Cascade-RCNN structures. The method comprises the following steps:
acquiring a data set of a target detection image, and enhancing the image in the data set;
constructing a neural network, wherein the neural network comprises a feature extraction network and a prediction network, the feature extraction network fuses a plurality of backbone networks and comprises a feature pyramid network, a deformable convolution network is fused in each backbone network, and the prediction network comprises a double-branch structure;
training the neural network through the enhanced data set, and judging a target according to an IOU threshold value in the training process to obtain a target detection model;
and carrying out target detection on the image to be detected through the target detection model.
Further, the acquiring a data set of a target detection image and performing enhancement processing on the image in the data set includes:
step a1, performing global brightness enhancement processing on the brightness component of the image in the data set by using the following formula:
Figure BDA0002813892490000021
where L_γ(c, d) represents the brightness function after global brightness adjustment, γ represents the gamma change coefficient, I(c, d) represents the original brightness component of an image in the data set, q represents the standard deviation of the Gaussian function, π represents the circular constant, exp represents the exponential function, and (c, d) represents the coordinates of an image pixel in the data set;
step a2, after performing global brightness enhancement processing on the brightness component of the image in the data set, performing saturation enhancement processing according to the following formula:
Figure BDA0002813892490000031
where H' represents the enhanced saturation component, H represents the original saturation component of the image in the data set, ψ represents a tuning parameter, and T represents the average luminance of the illumination information.
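The patent's exact brightness and saturation formulas appear only as figure references, so the sketch below is an illustrative stand-in, not the patented formulas: a conventional gamma brightening for the luminance component and a saturation boost that grows as the average illumination T falls. The function names and the `gamma` and `psi` defaults are all assumptions.

```python
# Illustrative stand-ins for the two enhancement steps (the patent's own
# formulas are given only as figures): gamma brightening of luminance,
# then saturation scaling driven by average luminance t.

def enhance_luminance(l, gamma=0.8):
    """Globally brighten a luminance value in [0, 1]; gamma < 1 brightens."""
    return l ** gamma

def enhance_saturation(h, t, psi=0.5):
    """Scale the saturation component h up when average luminance t is low."""
    return min(1.0, h * (1.0 + psi * (1.0 - t)))

print(enhance_luminance(0.25, gamma=0.5))  # 0.5
print(enhance_saturation(1.0, 0.0))        # clamped to 1.0
```

The clamp in `enhance_saturation` mirrors the text's concern that boosting one channel must not push values out of range.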
Furthermore, the feature extraction network adopts the scale-aware parallel structure of TridentNet and, combined with the characteristics of the feature pyramid network (FPN), is divided into three paths that learn target features at different scales, so that the Trident part of the target detection model detects targets of different scales more effectively.
Furthermore, the backbone network is ResNet-18; the dual branches are an FC-head and a Conv-head, the FC-head serving as the classification network and the Conv-head as the regression network.
Further, the training the neural network through the enhanced data set, and determining the target according to the IOU threshold in the training process to obtain a target detection model, including:
acquiring the enhanced data set and removing redundant duplicate data;
accurately labeling the position and category of the data with existing labeling software, and generating label files;
matching each label file to its picture in the data set one by one, parsing the label files to generate tag data in txt format, and randomly splitting the data set into a training set, a validation set, and a test set;
training the neural network of the target detection model with the training-set data, validating and observing the target detection model with the validation-set data over continuous iterations, then testing the target detection model on the test-set data and analyzing its detection results.
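The random split described above can be sketched as follows; the 80/10/10 ratio and the fixed seed are illustrative choices, since the patent does not fix them.

```python
import random

# Sketch of the random train/validation/test split; ratios and seed are
# assumptions, not values taken from the patent.

def split_dataset(samples, train_frac=0.8, val_frac=0.1, seed=0):
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for repeatability
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```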
Further, before the training of the neural network in the target detection model using the training set data, the method further includes: the method comprises the following steps of carrying out normalization processing on training set data:
step A1, calculating the mean and variance of the samples in the training set according to the following formulas:
μ_a = (1/N) · Σ_{i=1}^{N} x_i

σ_a² = (1/N) · Σ_{i=1}^{N} (x_i − μ_a)²

where μ_a represents the mean of the sample data in the training set, σ_a² represents the variance of the sample data in the training set, and x_i represents the i-th sample in the training set, i = 1, 2, …, N;
step A2, the training set is normalized according to the following formula:

x̂_i = (x_i − μ_a) / √(σ_a² + ε)

where x̂_i represents the i-th sample in the normalized training set, and ε represents a small constant.
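Steps A1 and A2 amount to the following plain-Python normalization; the epsilon default is an assumed value.

```python
import math

# Steps A1-A2: subtract the training-set mean and divide by the standard
# deviation, with a small epsilon guarding against a zero denominator.

def normalize(samples, eps=1e-8):
    n = len(samples)
    mu = sum(samples) / n                          # step A1: mean
    var = sum((x - mu) ** 2 for x in samples) / n  # step A1: variance
    return [(x - mu) / math.sqrt(var + eps) for x in samples]

out = normalize([1.0, 2.0, 3.0])
print(out)  # roughly [-1.2247, 0.0, 1.2247]
```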
Further, when training the neural network, the training degree of the neural network is measured by the loss function Loss, as follows:
the loss function Loss is obtained according to the following formula:
Loss = R_cls[h] + R_loc[f]
where R_cls[h] represents the classification loss function and R_loc[f] represents the position loss function;
the classification loss function is expressed as:

R_cls[h] = Σ_i L_cls(h(x_i), y_i)

where h(x_i) represents the probability estimate of the class posterior distribution, y_i represents the class label, and L_cls represents the cross-entropy loss;
the position loss function is expressed as:

R_loc[f] = Σ_i L_loc(f(x_i, b_i), g_i)

where f(x_i, b_i) denotes the regression of the predicted box b_i of image region x_i toward the labeled ground-truth box g_i, and L_loc represents the bounding-box regression loss.
Further, the bounding-box regression loss L_loc uses the smooth L1 loss.
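The two loss terms can be sketched in plain Python. The patent names the losses but not their parametrization, so the smooth-L1 `beta` parameter and the binary form of the cross entropy are assumptions; the smooth-L1 formula itself is the standard one.

```python
import math

# Sketch of Loss = R_cls[h] + R_loc[f] for a single sample: cross entropy
# for L_cls and smooth L1 for L_loc (beta is an assumed parameter).

def cross_entropy(p, y):
    """Binary cross entropy for predicted probability p and label y in {0, 1}."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def smooth_l1(pred_box, gt_box, beta=1.0):
    """Smooth L1 over box coordinates: quadratic near zero, linear far out."""
    total = 0.0
    for p, g in zip(pred_box, gt_box):
        d = abs(p - g)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total

def detection_loss(p, y, pred_box, gt_box):
    """Loss = R_cls[h] + R_loc[f] for one sample."""
    return cross_entropy(p, y) + smooth_l1(pred_box, gt_box)

print(smooth_l1([0.0, 0.0, 10.0, 10.0], [0.5, 0.0, 10.0, 12.0]))  # 0.125 + 1.5 = 1.625
```

The piecewise form is what lets smooth L1 cap the gradient of large coordinate errors, matching the later remark that it "better prevents gradient explosion".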
Further, the performing target detection on the image to be detected through the target detection model includes:
acquiring the image to be detected, dividing it among three CNN branches in a scale-aware manner for feature extraction, each CNN branch using a ResNet-18 backbone for high-speed feature extraction;
concatenating the extracted features at equal size to obtain the concatenated features;
performing high-speed feature extraction and processing on the concatenated features with the ResNet-18 backbone, dividing the result into three sub-features, setting a different IOU threshold for each sub-feature, cascading the three-way border-regression feature maps, and taking the classification and border-regression results after the third cascade stage as the final target detection result.
Further, the system comprises:
the data enhancement module is used for acquiring a data set of a target detection image and enhancing the image in the data set;
the network construction module is used for constructing a neural network, wherein the neural network comprises a feature extraction network and a prediction network, the feature extraction network is fused with a plurality of backbone networks and comprises a feature pyramid network, each backbone network is fused with a deformable convolution network, and the prediction network comprises a double-branch structure;
the model acquisition module is used for training the neural network through the enhanced data set, and judging a target according to an IOU threshold value in the training process to obtain a target detection model;
and the target detection module is used for carrying out target detection on the image to be detected through the target detection model.
Compared with the prior art, the invention has the following beneficial effects. The invention provides a high-speed target detection method based on the TridentNet and Cascade-RCNN structures, which comprises: acquiring a data set of target detection images and enhancing the images in the data set; constructing a neural network comprising a feature extraction network and a prediction network, wherein the feature extraction network fuses a plurality of backbone networks and includes a feature pyramid network, a deformable convolution network is fused into each backbone network, and the prediction network comprises a dual-branch structure; training the neural network on the enhanced data set and judging targets according to an IOU threshold during training to obtain a target detection model; and performing target detection on the image to be detected through the target detection model. By designing a reasonable, high-speed multi-scale detection model, the accuracy and efficiency of target detection can be greatly improved.
The following description of the preferred embodiments for carrying out the present invention will be made in detail with reference to the accompanying drawings so that the features and advantages of the present invention can be easily understood.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings of the embodiments of the present invention will be briefly described below. Wherein the drawings are only for purposes of illustrating some embodiments of the invention and are not to be construed as limiting the invention to all embodiments thereof.
FIG. 1 is a flow chart of a high-speed target detection method based on TridentNet and Cascade-RCNN structures according to the present invention;
FIG. 2 is a block diagram of a high-speed target detection system based on TridentNet and Cascade-RCNN structures according to the present invention;
FIG. 3 is a diagram of the Trident-Cascade-RCNN neural network structure of the present invention;
FIG. 4 is an original test data diagram;
FIG. 5 is a diagram showing the detection result of the Trident-Cascade-RCNN dual-structure network of the present invention;
FIG. 6 is a diagram showing the results of detection by the Cascade-RCNN network;
fig. 7 is a graph of the detection results of the TridentNet network.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1 to 7, the technical problem to be solved by the present invention is to provide a method and a system for detecting a high-speed target based on a TridentNet and Cascade-RCNN structure, wherein the method comprises:
acquiring a data set of a target detection image, and enhancing the image in the data set;
constructing a neural network, wherein the neural network comprises a feature extraction network and a prediction network, the feature extraction network fuses a plurality of backbone networks and comprises a feature pyramid network, a deformable convolution network is fused in each backbone network, and the prediction network comprises a double-branch structure;
training the neural network through the enhanced data set, and judging a target according to an IOU threshold value in the training process to obtain a target detection model;
and carrying out target detection on the image to be detected through the target detection model.
The working principle of the technical scheme is as follows: first, a data set of target detection images is acquired and the images in the data set are enhanced; then a neural network is constructed, comprising a feature extraction network and a prediction network. The feature extraction network adopts the scale-aware parallel structure of TridentNet and, combined with the characteristics of the feature pyramid network (FPN), is divided into three paths that learn target features at different scales, so that the Trident part of the target detection model detects targets of different scales more effectively; the feature extraction network fuses a plurality of backbone networks and includes a feature pyramid network, a deformable convolution network is fused into each backbone network, and the prediction network comprises a dual-branch structure. The backbone network is ResNet-18; the dual branches are an FC-head and a Conv-head, the FC-head serving as the classification network and the Conv-head as the regression network. Next, the neural network is trained on the enhanced data set, and targets are judged according to an IOU threshold during training to obtain a target detection model; finally, target detection is performed on the image to be detected through the target detection model.
The beneficial effects of the above technical scheme are that: designing a reasonable, high-speed multi-scale target detection model according to the above scheme can greatly improve the accuracy and efficiency of target detection. The target detection model contains a neural network that, drawing on Faster-RCNN, uses an FPN (feature pyramid network) followed by an RPN (region proposal network) to obtain the feature maps of the candidate boxes, so the feature extraction network extracts features better, the target detection model trains better, overfitting is reduced, and the accuracy and robustness of the target detection model are greatly improved.
In an embodiment provided by the present invention, the acquiring a data set of a target detection image and performing enhancement processing on an image in the data set includes:
step a1, performing global brightness enhancement processing on the brightness component of the image in the data set by using the following formula:
Figure BDA0002813892490000081
where L_γ(c, d) represents the brightness function after global brightness adjustment, γ represents the gamma change coefficient, I(c, d) represents the original brightness component of an image in the data set, q represents the standard deviation of the Gaussian function, π represents the circular constant, exp represents the exponential function, and (c, d) represents the coordinates of an image pixel in the data set;
step a2, after performing global brightness enhancement processing on the brightness component of the image in the data set, performing saturation enhancement processing according to the following formula:
Figure BDA0002813892490000082
where H' represents the enhanced saturation component, H represents the original saturation component of the image in the data set, ψ represents a tuning parameter, and T represents the average luminance of the illumination information.
The beneficial effects of the above technical scheme are that: the above technique enhances the feature information of the image, and raising the brightness and saturation makes different regions of the image easier to distinguish, so the feature information becomes more sensitive and salient. The algorithm adjusts brightness with a Gaussian function; as the standard deviation of the Gaussian grows, image contrast increases but the brightness value decreases. To address this, the saturation component is also processed, strengthening the contrast, making the feature information of images in the database more pronounced, and facilitating feature extraction.
In an embodiment of the present invention, the training of the neural network through the enhanced data set, and the determining of the target according to the IOU threshold in the training process to obtain the target detection model, include:
acquiring the enhanced data set and removing redundant duplicate data;
accurately labeling the position and category of the data with existing labeling software, and generating label files;
matching each label file to its picture in the data set one by one, parsing the label files to generate tag data in txt format, and randomly splitting the data set into a training set, a validation set, and a test set;
training the neural network of the target detection model with the training-set data, validating and observing the target detection model with the validation-set data over continuous iterations, then testing the target detection model on the test-set data and analyzing its detection results.
The working principle of the technical scheme is as follows: first, the enhanced data set is acquired and redundant data are removed, using PCA (principal component analysis); second, the position and category of the data are accurately labeled with the existing labeling software labelImg, and label files are generated; next, each label file is matched to its picture in the data set one by one, the label files are parsed to generate tag data in txt format, and the data set is randomly split into a training set, a validation set, and a test set; finally, the neural network in the target detection model is trained with the training-set data, where before training the data must be normalized by the following steps:
step A1, calculating the mean and variance of the samples in the training set according to the following formulas:

μ_a = (1/N) · Σ_{i=1}^{N} x_i

σ_a² = (1/N) · Σ_{i=1}^{N} (x_i − μ_a)²

where μ_a represents the mean of the sample data in the training set, σ_a² represents the variance of the sample data in the training set, and x_i represents the i-th sample in the training set, i = 1, 2, …, N;
step A2, the training set is normalized according to the following formula:

x̂_i = (x_i − μ_a) / √(σ_a² + ε)

where x̂_i represents the i-th sample in the normalized training set, and ε represents a small constant that prevents a calculation error when the denominator would otherwise be zero. The normalized data must be reconstructed when the gradient is back-propagated:

y_i = ξ · x̂_i + β

where y_i represents the reconstructed sample, and ξ and β represent the parameters to be learned; in order not to change the distribution characteristics of the data, ξ and β should be calculated according to the following formulas:

ξ = √(σ_a² + ε)

β = μ_a. When training the neural network, the training degree of the neural network is measured by the loss function Loss, as follows:
the loss function Loss is obtained according to the following formula:
Loss = R_cls[h] + R_loc[f]
where R_cls[h] represents the classification loss function and R_loc[f] represents the position loss function;
the classification loss function is expressed as:

R_cls[h] = Σ_i L_cls(h(x_i), y_i)

where h(x_i) represents the probability estimate of the class posterior distribution, y_i represents the class label, and L_cls represents the cross-entropy loss;
the position loss function is expressed as:

R_loc[f] = Σ_i L_loc(f(x_i, b_i), g_i)

where f(x_i, b_i) denotes the regression of the predicted box b_i of image region x_i toward the labeled ground-truth box g_i, and L_loc represents the bounding-box regression loss, which uses the smooth L1 loss. The target detection model is validated and observed with the validation-set data over continuous iterations, then tested on the test-set data, and its detection results are analyzed. The target detection model is realized on a dual-structure network of TridentNet and Cascade-RCNN, whose two parts are denoted the Trident module and the Cascade module respectively. The Trident module adopts the scale-aware parallel structure and extracts CNN features along three paths, each CNN using a ResNet-18 backbone so as to balance feature-extraction capability and efficiency. The CNN structure uses dilated convolutions with different dilation rates to obtain feature maps under different receptive fields. The three branches share weight parameters and extract features of different scales according to receptive fields of different scales, so that the Trident part of the model extracts features of targets of different scales more effectively.
Then a concatenation operation is performed: the features are merged and sent into the next network module. The Cascade module divides the features output by the Trident module directly into three sub-networks according to different IOU thresholds: the detection boxes regressed by the first sub-network are input into the second sub-network for box classification and regression, the boxes regressed by the second sub-network are input into the third sub-network for box classification and regression, and the result after classification and regression by the third sub-network is the final network output. In the Cascade module, the IOU thresholds of the three sub-networks are set to 0.4, 0.6, and 0.8 respectively, according to practical trials, which yields the best average classification and regression precision. The test-set data can also be run separately through TridentNet and Cascade-RCNN, so that the detection results of the Cascade-RCNN network, the TridentNet network, and the TridentNet plus Cascade-RCNN dual structure can be analyzed and compared.
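The rising-threshold cascade described above, with the patent's stage thresholds 0.4, 0.6, and 0.8, can be sketched as follows. Real Cascade-RCNN also regresses each surviving box before passing it to the next stage; this sketch omits the regression step and shows only the stage-by-stage IOU filtering.

```python
# Sketch of the three-stage cascade filter with thresholds 0.4 -> 0.6 -> 0.8;
# the per-stage box regression of Cascade-RCNN is deliberately omitted.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def cascade_filter(proposals, gt, thresholds=(0.4, 0.6, 0.8)):
    surviving = list(proposals)
    for t in thresholds:  # each stage keeps only proposals above its threshold
        surviving = [p for p in surviving if iou(p, gt) >= t]
    return surviving

gt = (0, 0, 10, 10)
boxes = [(0, 0, 10, 10), (0, 0, 10, 9), (0, 0, 10, 5), (0, 0, 5, 5)]
print(cascade_filter(boxes, gt))  # [(0, 0, 10, 10), (0, 0, 10, 9)]
```

Each stage sees only proposals that already passed a looser threshold, which is why the output IOU of a stage is generally larger than its input IOU.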
The beneficial effects of the above technical scheme are that: by adopting the scale-aware parallel structure and extracting CNN features along three paths, favorable features can be extracted better and at high speed. Normalizing the training-set data before training makes the data conform to a normal distribution, which helps the loss function decrease, prevents overfitting, and lets the network parameters fit the training data better, greatly improving the precision and accuracy of the target detection model. The border-regression term of the loss function adopts the smooth L1 loss, which is based on the absolute value of the prediction difference and better prevents gradient explosion during iteration. When the loss falls to a certain value, the validation set and test set are used for validation and testing, and the quality of the model is measured by printing the mAP of the target detection model; detection and recognition are performed automatically by the computer without extra manual maintenance, which greatly raises the level of intelligence.
A high-speed object detection system based on the TridentNet and Cascade-RCNN architecture, the system comprising:
the data enhancement module is used for acquiring a data set of a target detection image and enhancing the image in the data set;
the network construction module is used for constructing a neural network, wherein the neural network comprises a feature extraction network and a prediction network, the feature extraction network is fused with a plurality of backbone networks and comprises a feature pyramid network, each backbone network is fused with a deformable convolution network, and the prediction network comprises a double-branch structure;
the model acquisition module is used for training the neural network through the enhanced data set, and judging a target according to an IOU threshold value in the training process to obtain a target detection model;
and the target detection module is used for carrying out target detection on the image to be detected through the target detection model.
The working principle of the technical scheme is as follows: firstly, the data enhancement module acquires a data set of target detection images and performs enhancement processing on the images in the data set. Then, the network construction module constructs a neural network comprising a feature extraction network and a prediction network. The feature extraction network borrows the scale-aware parallel structure of TridentNet and, combining the characteristics of the feature pyramid (FPN) network, is divided into three paths that learn target features at different scales, so that the Trident part of the target detection model achieves a better detection effect on targets of different scales; the feature extraction network fuses a plurality of backbone networks and comprises the feature pyramid network, a deformable convolution network is fused into each backbone network, and the prediction network comprises a double-branch structure. The backbone network is ResNet-18, and the two branches are FC-head and Conv-head, the FC-head serving as the classification network and the Conv-head as the regression network. Secondly, the model acquisition module trains the neural network with the enhanced data set, judging targets against an IOU threshold during training to obtain the target detection model. Finally, the target detection module performs target detection on the image to be detected through the target detection model.
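The scale-aware parallel idea borrowed from TridentNet can be sketched with a toy example (illustrative only; the actual model uses 2-D CNN layers): three branches share one set of convolution weights but apply them with different dilation rates, so each branch covers a different receptive field and thus a different object scale.

```python
# Hedged sketch of TridentNet's scale-aware parallel structure: a toy 1-D
# convolution stands in for the real 2-D CNN. All three branches reuse the
# SAME weights; only the dilation rate (and hence receptive field) differs.
def dilated_conv1d(x, w, dilation):
    """Valid-mode 1-D convolution of x with kernel w at the given dilation."""
    span = (len(w) - 1) * dilation
    return [sum(w[k] * x[i + k * dilation] for k in range(len(w)))
            for i in range(len(x) - span)]

signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
shared_w = [0.25, 0.5, 0.25]   # one weight set, shared across all branches

# dilation 1 -> small receptive field (small targets),
# dilation 3 -> large receptive field (large targets).
branches = {d: dilated_conv1d(signal, shared_w, d) for d in (1, 2, 3)}
```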
The beneficial effects of the above technical scheme are as follows: a reasonable, high-speed multi-scale target detection model designed according to the technical scheme can greatly improve the accuracy and efficiency of target detection. The target detection model comprises a neural network which, borrowing from Faster-RCNN, applies an FPN (Feature Pyramid Network) and an RPN (Region Proposal Network) in sequence to obtain the feature maps of the candidate boxes, so that the feature extraction network extracts features better, the target detection model is trained better, overfitting is reduced, and the accuracy and robustness of the target detection model are greatly improved.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A high-speed target detection method based on Trident Net and Cascade-RCNN structures is characterized in that: the method comprises the following steps:
acquiring a data set of a target detection image, and enhancing the image in the data set;
constructing a neural network, wherein the neural network comprises a feature extraction network and a prediction network, the feature extraction network fuses a plurality of backbone networks and comprises a feature pyramid network, a deformable convolution network is fused in each backbone network, and the prediction network comprises a double-branch structure;
training the neural network through the enhanced data set, and judging a target according to an IOU threshold value in the training process to obtain a target detection model;
and carrying out target detection on the image to be detected through the target detection model.
2. The method as claimed in claim 1, wherein the step of obtaining a data set of target detection images and performing enhancement processing on the images in the data set comprises:
step a1, performing global brightness enhancement processing on the brightness component of the image in the data set by using the following formula:
Figure FDA0002813892480000011
wherein L_γ(c, d) represents the brightness function after global brightness adjustment, γ represents the gamma transformation coefficient, I(c, d) represents the original brightness component of an image in the data set, q represents the standard deviation of the Gaussian function, π represents the circular constant, exp represents the exponential function, and (c, d) represents the coordinates of an image pixel in the data set;
step a2, after performing global brightness enhancement processing on the brightness component of the image in the data set, performing saturation enhancement processing according to the following formula:
Figure FDA0002813892480000012
where H' represents the enhanced saturation component, H represents the original saturation component of the image in the data set, ψ represents the parameter value, and T represents the average luminance of the illumination information.
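The patent's exact brightness formula survives only as an image placeholder above, so the following is a generic gamma-correction sketch of the same idea (an assumed form, not the patent's formula): with gamma below 1, dark luminance values are lifted while the extremes are preserved.

```python
# Hedged sketch of global brightness enhancement via gamma correction.
# This is NOT the patent's formula (which is only available as an image);
# it illustrates the standard gamma transform on a luminance channel.
def gamma_enhance(luma, gamma=0.5):
    """Map each luminance value in [0, 255] through 255 * (v/255)**gamma."""
    return [round((v / 255.0) ** gamma * 255.0) for v in luma]

dark_row = [16, 64, 128, 255]
print(gamma_enhance(dark_row))  # dark values are lifted; 0 and 255 are fixed points
```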
3. The method as claimed in claim 1, wherein the feature extraction network borrows the scale-aware parallel structure of TridentNet and, combining the characteristics of the feature pyramid (FPN) network, is divided into three paths learning target features at different scales, so that the Trident part of the target detection model achieves a better detection effect on targets of different scales.
4. The method as claimed in claim 1, wherein the backbone network is ResNet-18, the dual-branch structures are FC-head and Conv-head, the FC-head is a classification network, and the Conv-head is a regression network.
5. The method as claimed in claim 1, wherein the training of the neural network is performed by the enhanced data set, and the target is judged according to the IOU threshold during the training process to obtain the target detection model, and the method comprises:
acquiring an enhanced data set, and removing redundant repeated data;
accurately annotating the positions and categories of the data using existing annotation software, and generating annotation files;
matching the annotation files one-to-one with the picture data in the data set, then parsing the annotation files to generate tag data in txt format, and randomly splitting the data set into a training set, a validation set and a test set;
training the neural network in the target detection model with the training-set data, validating and observing the target detection model with the validation-set data during continuous iteration, then testing the target detection model on the test-set data and analyzing its detection results.
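The random split described above can be sketched as follows (the 8:1:1 ratio and the fixed seed are assumptions for illustration; the claim does not specify them):

```python
# Minimal sketch of randomly splitting a labelled data set into
# training, validation and test sets. Ratios and seed are assumed values.
import random

def split_dataset(samples, ratios=(0.8, 0.1, 0.1), seed=42):
    shuffled = samples[:]                       # leave the caller's list intact
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_val = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```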
6. The method of claim 5, wherein training the neural network in the target detection model using the training set data further comprises normalizing the training-set data as follows:
step A1, calculating the mean and variance of the samples in the training set according to the following formulas:

\mu_a = \frac{1}{N}\sum_{i=1}^{N} x_i

\sigma_a^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu_a)^2

wherein \mu_a represents the mean of the sample data in the training set, \sigma_a^2 represents the variance of the sample data in the training set, x_i represents the i-th sample datum in the training set, and i = 1, 2, …, N;
step A2, normalizing the training set according to the following formula:

\hat{x}_i = \frac{x_i - \mu_a}{\sqrt{\sigma_a^2 + \varepsilon}}

wherein \hat{x}_i represents the i-th sample datum in the normalized training set, and \varepsilon represents a minimal constant.
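A minimal Python sketch of this z-score normalization (the epsilon value 1e-8 is an assumption; the claim only calls it a minimal constant):

```python
# Sketch of the claim-6 normalization: subtract the sample mean, divide by
# the standard deviation, with a small epsilon for numerical stability.
import math

def normalize(samples, eps=1e-8):
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return [(x - mean) / math.sqrt(var + eps) for x in samples]

normed = normalize([2.0, 4.0, 6.0, 8.0])
# the normalized data has (approximately) zero mean and unit variance
```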
7. The method as claimed in claim 4, wherein measuring the training degree of the neural network according to the Loss function Loss during training of the neural network comprises:
the Loss function Loss is obtained according to the following formula:

Loss = R_{cls}[h] + R_{loc}[f]

wherein R_{cls}[h] represents the classification loss function and R_{loc}[f] represents the position loss function;
the classification loss function is expressed as:

R_{cls}[h] = \sum_i L_{cls}(h(x_i), y_i)

wherein h(x_i) represents the probability estimate of the posterior class distribution, y_i represents the class label, and L_{cls} represents the cross-entropy loss;
the position loss function is expressed as:

R_{loc}[f] = \sum_i L_{loc}(f(x_i, b_i), g_i)

wherein (f(x_i, b_i), g_i) represents the regression of the predicted box b_i of image region x_i to the annotated ground-truth box g_i, and L_{loc} represents the bounding-box regression loss.
8. The method as claimed in claim 7, wherein the bounding-box regression loss L_{loc} uses the smoothed L1 loss.
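The smoothed L1 (Huber-style) loss named in claim 8 can be sketched as follows (the transition point beta = 1.0 is an assumed value): it is quadratic near zero, so it is smooth at the origin, and linear for large errors, so gradients stay bounded.

```python
# Sketch of the smoothed L1 bounding-box regression loss: quadratic for
# |x| < beta, linear beyond it. beta = 1.0 is an assumed default.
def smooth_l1(x, beta=1.0):
    ax = abs(x)
    if ax < beta:
        return 0.5 * ax * ax / beta   # quadratic region, smooth at 0
    return ax - 0.5 * beta            # linear region, bounded gradient

print(smooth_l1(0.5))   # 0.125 : quadratic region
print(smooth_l1(3.0))   # 2.5   : linear region
```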
9. The method as claimed in claim 1, wherein the performing of the object detection on the image to be detected by the object detection model comprises:
acquiring an image to be detected, dividing it in a scale-aware manner into three CNN paths for feature extraction, each path performing high-speed feature extraction with a ResNet-18 backbone;
performing equal-size concatenation on the extracted features to obtain the concatenated features;
and performing high-speed feature extraction and processing on the concatenated features with a ResNet-18 backbone, dividing the result into three sub-features, setting a different IOU value for each sub-feature, performing a cascade operation on the three-way bounding-box regression feature maps, and taking the classification result and the bounding-box regression result after the third cascade stage as the final target detection result.
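The cascade with per-stage IOU thresholds can be sketched as follows (illustrative only; the thresholds 0.5/0.6/0.7 are assumed values taken from the original Cascade R-CNN paper, and the patent does not state them):

```python
# Hedged sketch of claim 9's cascade: an IoU function plus three stages
# with increasing IoU thresholds; each stage keeps only the proposals
# that meet its (stricter) threshold against the ground-truth box.
def iou(box_a, box_b):
    """Boxes are (x1, y1, x2, y2); returns intersection-over-union."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union else 0.0

def cascade_positives(proposals, gt, thresholds=(0.5, 0.6, 0.7)):
    """Return the surviving proposals after each cascade stage."""
    stages = []
    for t in thresholds:
        proposals = [p for p in proposals if iou(p, gt) >= t]
        stages.append(list(proposals))
    return stages

gt = (0, 0, 10, 10)
props = [(0, 0, 10, 10), (0, 0, 10, 6), (0, 0, 10, 4)]
stages = cascade_positives(props, gt)
# later stages keep fewer, better-aligned proposals
```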
10. A high-speed object detection system based on the TridentNet and Cascade-RCNN architecture, the system comprising:
the data enhancement module is used for acquiring a data set of a target detection image and enhancing the image in the data set;
the network construction module is used for constructing a neural network, wherein the neural network comprises a feature extraction network and a prediction network, the feature extraction network is fused with a plurality of backbone networks and comprises a feature pyramid network, each backbone network is fused with a deformable convolution network, and the prediction network comprises a double-branch structure;
the model acquisition module is used for training the neural network through the enhanced data set, and judging a target according to an IOU threshold value in the training process to obtain a target detection model;
and the target detection module is used for carrying out target detection on the image to be detected through the target detection model.
CN202011405295.2A 2020-12-02 2020-12-02 High-speed target detection method and system based on Trident Net and Cascade-RCNN structures Pending CN112365497A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011405295.2A CN112365497A (en) 2020-12-02 2020-12-02 High-speed target detection method and system based on Trident Net and Cascade-RCNN structures


Publications (1)

Publication Number Publication Date
CN112365497A true CN112365497A (en) 2021-02-12

Family

ID=74535914

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011405295.2A Pending CN112365497A (en) 2020-12-02 2020-12-02 High-speed target detection method and system based on Trident Net and Cascade-RCNN structures

Country Status (1)

Country Link
CN (1) CN112365497A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326924A (en) * 2021-06-07 2021-08-31 太原理工大学 Depth neural network-based key target photometric positioning method in sparse image
CN113591617A (en) * 2021-07-14 2021-11-02 武汉理工大学 Water surface small target detection and classification method based on deep learning
CN113780193A (en) * 2021-09-15 2021-12-10 易采天成(郑州)信息技术有限公司 RCNN-based cattle group target detection method and equipment
CN113869361A (en) * 2021-08-20 2021-12-31 深延科技(北京)有限公司 Model training method, target detection method and related device
CN115526874A (en) * 2022-10-08 2022-12-27 哈尔滨市科佳通用机电股份有限公司 Round pin of brake adjuster control rod and round pin split pin loss detection method
CN115527059A (en) * 2022-08-16 2022-12-27 贵州博睿科讯科技发展有限公司 Road-related construction element detection system and method based on AI (Artificial Intelligence) identification technology
CN115931359A (en) * 2023-03-03 2023-04-07 西安航天动力研究所 Turbine pump bearing fault diagnosis method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779330A (en) * 2012-06-13 2012-11-14 京东方科技集团股份有限公司 Image reinforcement method, image reinforcement device and display device
US20190095795A1 (en) * 2017-03-15 2019-03-28 Samsung Electronics Co., Ltd. System and method for designing efficient super resolution deep convolutional neural networks by cascade network training, cascade network trimming, and dilated convolutions
CN110852349A (en) * 2019-10-21 2020-02-28 上海联影智能医疗科技有限公司 Image processing method, detection method, related equipment and storage medium
CN111814755A (en) * 2020-08-18 2020-10-23 深延科技(北京)有限公司 Multi-frame image pedestrian detection method and device for night motion scene


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SIMA HAIFENG ET AL.: "Intelligent Computing Methods in Remote Sensing Image Classification", 31 January 2018 *
YANG DONGFANG ET AL.: "Application and Research of Mathematical Models in Ecology", 31 March 2019 *
LEI BANGJUN ET AL.: "Step-by-Step Explanation of Video Object Tracking Systems", 31 December 2015 *


Similar Documents

Publication Publication Date Title
CN112365497A (en) High-speed target detection method and system based on Trident Net and Cascade-RCNN structures
CN108830188B (en) Vehicle detection method based on deep learning
CN111179251B (en) Defect detection system and method based on twin neural network and by utilizing template comparison
EP3478728B1 (en) Method and system for cell annotation with adaptive incremental learning
CN113160192B (en) Visual sense-based snow pressing vehicle appearance defect detection method and device under complex background
CN106803247B (en) Microangioma image identification method based on multistage screening convolutional neural network
WO2022012110A1 (en) Method and system for recognizing cells in embryo light microscope image, and device and storage medium
CN109285139A (en) A kind of x-ray imaging weld inspection method based on deep learning
CN108564085B (en) Method for automatically reading of pointer type instrument
CN111815564B (en) Method and device for detecting silk ingots and silk ingot sorting system
CN109919934A (en) A kind of liquid crystal display panel defect inspection method based on the study of multi-source domain depth migration
CN106340016A (en) DNA quantitative analysis method based on cell microscope image
CN109284779A (en) Object detection method based on deep full convolution network
CN110599463B (en) Tongue image detection and positioning algorithm based on lightweight cascade neural network
CN113435407B (en) Small target identification method and device for power transmission system
CN116012291A (en) Industrial part image defect detection method and system, electronic equipment and storage medium
CN113205511B (en) Electronic component batch information detection method and system based on deep neural network
CN112613428B (en) Resnet-3D convolution cattle video target detection method based on balance loss
CN109840483A (en) A kind of method and device of landslide fissure detection and identification
CN115147363A (en) Image defect detection and classification method and system based on deep learning algorithm
CN112381806A (en) Double centromere aberration chromosome analysis and prediction method based on multi-scale fusion method
CN115294377A (en) System and method for identifying road cracks
CN110287970B (en) Weak supervision object positioning method based on CAM and covering
CN115439654A (en) Method and system for finely dividing weakly supervised farmland plots under dynamic constraint
CN109993728B (en) Automatic detection method and system for deviation of thermal transfer glue

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210212