CN113052184B - Target detection method based on two-stage local feature alignment - Google Patents
Target detection method based on two-stage local feature alignment
- Publication number
- CN113052184B CN113052184B CN202110270152.3A CN202110270152A CN113052184B CN 113052184 B CN113052184 B CN 113052184B CN 202110270152 A CN202110270152 A CN 202110270152A CN 113052184 B CN113052184 B CN 113052184B
- Authority
- CN
- China
- Prior art keywords
- feature
- training
- domain
- point
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a target detection method based on two-stage local feature alignment, which further improves the generalization performance of target detection algorithms typified by Faster R-CNN across different application scenarios. Existing feature-alignment-based target detection techniques generally suffer from two problems: first, the foreground target feature regions are located inaccurately in the early stage of network training; second, feature alignment is not performed per foreground object class, so the granularity of the algorithm is coarse. In the proposed two-stage method, the first training stage uses standard candidate boxes generated from a center-feature-point map to overcome the inaccurate localization of foreground feature regions; the second training stage aligns foreground target features according to their classification results, improving the fine granularity of the algorithm. The proposed method has a simple network structure and can be ported to other target detection algorithms.
Description
Technical Field
The invention relates to the field of transfer learning within deep learning, and concerns the application of a sub-technique of transfer learning, namely feature alignment (domain adaptation), to the target detection task.
Background
In recent years, deep learning has made breakthroughs in the field of object detection: one-stage detection networks represented by SSD and the YOLO series, and two-stage detection networks represented by Faster R-CNN, perform extremely well on the detection task. This excellent performance, however, rests on two conditions: first, a training data set with a large number of samples and complete labels; second, the feature spaces of the training scenario and the application scenario must be independent and identically distributed, or at least approximately similar in feature distribution. In practice, the application scenario (called the target domain in transfer learning) often differs from the training scenario of the network's samples (called the source domain). For example, a vehicle detector trained on images of roads taken in clear weather performs excellently in clear weather after training, but its detection performance drops sharply in rain and fog. In general, when the feature distributions of the source and target domains differ greatly, i.e. the inter-domain difference is large, the performance of a deep neural network degrades severely. This shows that the generalization performance of deep neural networks is generally poor, and target detection is no exception. Although a dedicated data set can be built for a specific application scenario, building an effective data set is very costly: a huge number of samples must be collected and then labeled manually.
In some specific application scenarios, such as military facilities of enemy army, medical imaging, etc., it is difficult to collect a sufficient number of samples for network training.
Inspired by the human ability to infer the general case from a single example when learning, transfer learning migrates knowledge from a source-domain data set to the target domain, so that a target detection network trained on the source domain can, at only a small cost, achieve better generalization when applied to a target domain whose feature space differs from the source domain's. The "knowledge" that transfer learning migrates is what the source and target domains have in common. Among current transfer learning algorithms, feature alignment (domain adaptation) works best. Its core idea is to reduce the inter-domain difference so that the features extracted by the detection network's feature extractor are domain-invariant: the extractor ignores differences between source and target domains in background and similar aspects and extracts the feature components common to both. Existing feature-alignment-based detection algorithms all adopt the Faster R-CNN network as the detection framework, and they share two problems in the alignment process: 1) the foreground target regions are selected with candidate boxes generated by the RPN, but early in training the RPN boxes have low accuracy and their corresponding feature regions contain a large amount of background noise, which negatively affects feature alignment; 2) foreground features are aligned without regard to their classes, so features of very different foreground objects are aligned with one another, which also degrades the alignment effect and leaves the algorithm with coarse granularity.
Because of these drawbacks, the performance gains of existing feature-migration-based detection algorithms have remained limited. The present method aims to remedy these shortcomings and to further exploit feature transfer learning to improve the generalization performance of target detection algorithms.
Disclosure of Invention
In order to overcome the shortcomings of existing algorithms, the invention provides a target detection algorithm based on two-stage local feature alignment. The algorithm uses Faster R-CNN as the target detection framework and, through a two-stage feature alignment scheme, addresses the low localization accuracy of RPN candidate boxes early in training, the excessive background noise, and the unclassified, coarse-grained feature alignment, thereby mitigating the poor generalization of detection networks caused by inter-domain differences (as shown in FIG. 1, a detection network trained on samples like those in panel (a) performs poorly on panel (b)).
The technical scheme adopted by the invention is as follows:
Step 1: the invention uses Faster R-CNN as the target detection framework. Its feature extraction backbone is VGG16-D, comprising a first convolution block, a first downsampling layer, a second convolution block, a second downsampling layer, a third convolution block, a third downsampling layer, a fourth convolution block, a fourth downsampling layer, and a fifth convolution block. The feature map output by the fifth convolution block is denoted F, with size M × N × 512;
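As a concrete illustration of the backbone geometry just described, the following helper (a hypothetical sketch, not part of the patent) computes the M × N × 512 size of the conv-5 feature map F, assuming each of the four downsampling layers halves the spatial resolution:

```python
def conv5_feature_map_size(height, width, channels=512, num_pool=4):
    """Spatial size of the conv-5 feature map F for a VGG16-D backbone.

    Each of the four downsampling (pooling) layers before the fifth
    convolution block halves the spatial resolution, so the output map
    is M x N x 512 with M = height // 16 and N = width // 16.
    """
    m, n = height, width
    for _ in range(num_pool):
        m //= 2  # each 2x2 pooling layer halves the height
        n //= 2  # ... and the width
    return m, n, channels
```

For example, a 600 × 1000 input yields a 37 × 62 × 512 feature map F.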
Step 2: this step comprises the RPN module, the RoI Head module, and the classification-regression module of the original Faster R-CNN. The RPN module generates candidate boxes and maps them onto the feature map F to obtain feature candidate regions A_i, where i indexes the i-th candidate region. The RoI Head module scales candidate regions of different sizes to 7 × 7, unfolds them into one-dimensional features, and passes them through two fully connected layers FC1 and FC2 to obtain the feature vector A'_i. A'_i is the input to the classification and regression module, which outputs the classification result F_i and the regression offsets O_i of the predicted box coordinates. In the two-stage feature alignment algorithm, the Faster R-CNN module participates in both the first and the second training stage;
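The RoI Head's rescaling of variable-size candidate regions to 7 × 7 can be sketched as follows for a single-channel region. This is a simplified nearest-neighbour subsampling written under assumption; the actual Faster R-CNN RoI pooling takes the maximum over each spatial bin:

```python
import numpy as np

def roi_to_7x7(region):
    """Scale one single-channel candidate feature region to 7 x 7 and
    unfold it into a one-dimensional 49-element vector, mimicking the
    RoI Head step before the fully connected layers FC1 and FC2."""
    h, w = region.shape
    rows = (np.arange(7) * h) // 7  # 7 evenly spaced row indices
    cols = (np.arange(7) * w) // 7  # 7 evenly spaced column indices
    pooled = region[np.ix_(rows, cols)]
    return pooled.reshape(-1)       # 49-dim vector, matching D_2's input size
```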
Step 3: this step is one of the core contributions of the patent. In the first training stage (the first 5 of 20 total training epochs), on the basis of step 2, the feature map F output by the fifth convolution block is additionally fed into a 1 × 1 × 512 convolution with a single output channel, producing a new feature map F' of size M × N × 1. A sigmoid operation is applied to every feature point of F', and points whose result is greater than or equal to λ (λ = 0.8) are kept, giving the pre-clustering feature center-point map C. The K-means++ algorithm is then used to decide which foreground object each retained point belongs to, dividing the points into Z classes. Within each class, the top 3 points by probability are kept as feature center points and all remaining points are set to 0, finally yielding the clustered feature center-point map P; each feature center point on P is denoted P_k;
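A minimal numerical sketch of the thresholding part of this step follows. The K-means++ grouping is omitted and all surviving points are treated as a single group, so this is an assumption-laden simplification rather than the patented procedure:

```python
import numpy as np

def center_point_map(f_prime, lam=0.8, top_n=3):
    """Sigmoid each point of the M x N map F', zero points whose
    probability is below lam, then keep only the top_n surviving
    points by probability (per-object grouping via K-means++ is
    omitted in this sketch)."""
    prob = 1.0 / (1.0 + np.exp(-f_prime))   # element-wise sigmoid
    prob[prob < lam] = 0.0                  # threshold at lambda = 0.8
    keep = np.argsort(prob, axis=None)[::-1][:top_n]
    out = np.zeros_like(prob)
    out.ravel()[keep] = prob.ravel()[keep]  # retain only the top points
    return out
```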
Step 4: this step is one of the core contributions of the patent. Based on step 3, for every center point P_k on the feature center-point map P, a set of standard feature candidate boxes of sizes 1 × 1, 3 × 3 and 5 × 5 (in units of feature points on F) is generated on the feature map F, each centered at P_k. The feature regions on F covered by these standard boxes are denoted B_k^1, B_k^3 and B_k^5. Within B_k^1, B_k^3 and B_k^5, any feature point whose value is less than 1/2 of the value of the center point on F is set to 0; that is, B_k^1, B_k^3 and B_k^5 undergo background suppression, producing the background-suppressed feature regions B'_k^1, B'_k^3 and B'_k^5;
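The background suppression inside a single standard box can be sketched as follows for a single-channel map (a hypothetical helper illustrating the rule, not the patented implementation):

```python
import numpy as np

def suppress_background(feature_map, center, size, delta=0.5):
    """Crop a size x size standard candidate box centred at `center`
    from a single-channel feature map and zero every value that falls
    below delta (= 1/2) of the centre point's value, as in step 4."""
    r, c = center
    half = size // 2
    patch = feature_map[r - half:r + half + 1, c - half:c + half + 1].copy()
    patch[patch < delta * feature_map[r, c]] = 0.0  # background suppression
    return patch
```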
Step 5: the background-suppressed feature regions B'_k^1, B'_k^3 and B'_k^5 generated in step 4 are transversely unfolded and concatenated into a one-dimensional feature vector L_k of size (35 × 3 × Z) × 1, which is input to the domain classifier D_1 to judge whether L_k comes from the source domain or the target domain. The structure of D_1 is three fully connected layers, denoted FC3, FC4 and FC5; FC3 has input dimension 35 × 3 and output dimension 1024, FC4 has input dimension 1024 and output dimension 1024, and FC5 has input dimension 1024 and output dimension 1. The domain classification loss is shown in equation (1):

L_img = -Σ_{i,j,k,u,v} [D_i log D_1(φ_{i,j,k}(u, v)) + (1 - D_i) log(1 - D_1(φ_{i,j,k}(u, v)))]   (1)

where D_i is the domain label (D_i = 0 means the feature region comes from the source domain, D_i = 1 means it comes from the target domain), and φ_{i,j,k}(u, v) denotes the feature vector at coordinates (u, v) in the feature region of the k-th (k ∈ {0, 1, 2}) standard box of the j-th center point in the i-th image sample;
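Equation (1) is a binary cross-entropy over the domain classifier's outputs. A generic sketch of such a loss, with the nested summation indices simplified to a flat vector of predictions:

```python
import numpy as np

def domain_bce_loss(preds, domain_labels):
    """Binary cross-entropy matching the shape of equations (1) and
    (7): D_i = 0 marks source-domain features, D_i = 1 target-domain
    features; `preds` are sigmoid outputs of the domain classifier."""
    p = np.clip(np.asarray(preds, float), 1e-7, 1 - 1e-7)  # avoid log(0)
    d = np.asarray(domain_labels, float)
    return float(-np.sum(d * np.log(p) + (1 - d) * np.log(1 - p)))
```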
Step 6: this step is one of the core contributions of the patent. During the last 15 training epochs, the contents of steps 3-5 no longer participate in training. On the basis of step 2, in each training round, among the feature vectors output by the RoI Head module, those whose classification result is class c in the source-domain samples update the per-round class feature f_c^{S,t}, and those classified as class c in the target-domain samples update f_c^{T,t}, as shown in formulas (2) and (3), where S and T indicate that the sample features come from the source and target domain respectively and t denotes the t-th training round. The per-round features f_c^{S,t} and f_c^{T,t} generated in the current round are then used, class by class, to update the accumulated features F_c^{S,t} and F_c^{T,t} of each foreground class from the start of the second training stage up to the current round t, after which feature alignment is performed; the update of the accumulated feature vectors is given by formulas (4) and (5);
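The exact update rules are fixed by formulas (2)-(5) of the patent. As a stand-in, under the assumption of a simple running mean per (domain, class) pair, the accumulation could look like:

```python
import numpy as np

class ClassFeatureBank:
    """Accumulated per-class foreground features, one slot per
    (domain, class) pair; a running mean is assumed here in place of
    the patent's formulas (4) and (5)."""
    def __init__(self):
        self.sums, self.counts = {}, {}

    def update(self, domain, cls, feat):
        key = (domain, cls)
        self.sums[key] = self.sums.get(key, 0.0) + np.asarray(feat, float)
        self.counts[key] = self.counts.get(key, 0) + 1

    def accumulated(self, domain, cls):
        key = (domain, cls)
        return self.sums[key] / self.counts[key]
```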
Step 7: the per-class accumulated features F_c^{S,t} and F_c^{T,t} output in step 6 at each training round are compared using the L_2 distance, which measures the difference between same-class target features of the source and target domains; the loss function L_cat is given by equation (6):

L_cat = Σ_c || F_c^{S,t} - F_c^{T,t} ||_2   (6)
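Under the reading that equation (6) sums the L2 distance over foreground classes, a sketch of such a loss:

```python
import numpy as np

def category_alignment_loss(src_feats, tgt_feats):
    """L_cat-style loss: sum, over classes present in both domains, of
    the L2 distance between the accumulated source and target features
    of the same class. Inputs map class id -> accumulated vector."""
    loss = 0.0
    for c in src_feats:
        if c in tgt_feats:
            diff = np.asarray(src_feats[c], float) - np.asarray(tgt_feats[c], float)
            loss += float(np.linalg.norm(diff))  # L2 distance per class
    return loss
```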
Step 8: the foreground target feature vectors p_{i,j} output by the RoI Head in each training round are input to the domain classifier D_2. The network structure of D_2 is likewise three fully connected layers, consistent with D_1 except that the input size of the first fully connected layer is 49. The loss function is shown in equation (7), where i indexes the i-th image sample and j the j-th foreground object:

L_ins = -Σ_{i,j} [D_i log D_2(p_{i,j}) + (1 - D_i) log(1 - D_2(p_{i,j}))]   (7)
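A forward pass shaped like D_2 (fully connected layers with ReLU between them and a sigmoid on the output, whose probability enters L_ins of equation (7)) can be sketched with hypothetical weights; the real input size is 49 and the hidden size 1024, but the helper below is dimension-agnostic:

```python
import numpy as np

def d2_forward(x, weights, biases):
    """Forward pass of an MLP domain classifier in the shape of D_2:
    ReLU after each hidden layer, sigmoid on the final logit; the
    output probability feeds the instance-level loss of equation (7)."""
    h = np.asarray(x, float)
    for w, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(h @ w + b, 0.0)       # ReLU hidden activation
    logit = h @ weights[-1] + biases[-1]
    return 1.0 / (1.0 + np.exp(-logit))      # sigmoid, output in (0, 1)
```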
Compared with the prior art, the invention has the beneficial effects that:
(1) In the local feature alignment process, locating the center point of each feature region first and the region itself second avoids introducing excessive background noise caused by the low accuracy of feature candidate regions early in training, reducing the adverse effect of background noise on feature alignment;
(2) In the local feature alignment process, foreground target feature regions are aligned class by class, which improves the fine granularity of the algorithm, further reduces the inter-domain difference of same-class foreground target features, and strengthens the feature alignment effect.
Drawings
FIG. 1: performance degradation of a target detection algorithm under inter-domain differences.
FIG. 2: schematic diagram of the target detection algorithm based on two-stage feature alignment.
FIG. 3: schematic diagram of the feature-region center points before clustering.
FIG. 4: schematic diagram of the feature-region center points after clustering.
FIG. 5: schematic diagram of the standard region features after background suppression.
FIG. 6: schematic diagram of the network structure of domain classifier D_1.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
First, the structure of the target detection algorithm based on two-stage local feature alignment is shown in FIG. 2. The method is divided into a Faster R-CNN module and a two-stage feature alignment module; the first alignment stage acts on the first 5 training epochs and the second on the remaining 15. The Faster R-CNN module participates in the whole training process, and its feature extraction network is VGG16, comprising 13 convolutional layers and 5 pooling layers.
In the first feature alignment stage, the feature map F produced by the feature extraction network is used. For each feature point on F, a 1 × 1 convolution followed by a sigmoid layer judges whether it is a feature center point; points with probability greater than 0.8 are retained, giving the pre-clustering center-point map C shown in FIG. 3. The center points in C are then clustered into Z groups, and from each group the top-k points by probability are selected, giving the clustered center-point map P. For each center point on P, standard candidate boxes of sizes 1 × 1, 3 × 3 and 5 × 5 are generated on the feature map F; the boxes generated for each center point are shown in FIG. 4. Taking 1/2 of the feature value at the center point as the background suppression threshold, background suppression is applied to the feature regions covered by the standard boxes, resetting values below the threshold to 0, as shown in FIG. 5. After suppression, the feature regions of the standard boxes of each center point are unfolded and concatenated into a 35 × 1 feature vector, which is input to the domain classifier D_1 for domain classification; the network structure of D_1 is shown in FIG. 6.
The accuracy of the standard candidate frame generated based on the feature center point is far higher than that of the RPN candidate frame in the early stage of training, and the feature region mapped by the standard candidate frame is subjected to background suppression, so that the problem of background noise introduced by the conventional feature alignment is solved.
In the second feature alignment stage, standard candidate boxes are no longer used. Instead, the RoI Head and the classification network yield the per-class feature vectors f_c^{S,t} and f_c^{T,t} of the source and target domains as in formulas (2) and (3), which are used, in the manner of formulas (4) and (5), to update the accumulated features F_c^{S,t} and F_c^{T,t} of each foreground class; these are then input to the domain classifier D_2. In the later stage of training, the accuracy of the RPN candidate boxes has risen to a satisfactory level, their size specifications are more varied, and they fit the foreground targets better than the standard boxes, so the later stage adopts RPN candidate boxes instead of standard ones. Aligning the feature vectors of the RPN boxes class by class improves the fine granularity of the algorithm and further reduces the feature difference of same-class targets between the two domains.
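The two-stage schedule described above (first 5 epochs for center-point alignment, remaining 15 for class-wise alignment) reduces to a simple switch; a sketch:

```python
def alignment_stage(epoch, first_stage_epochs=5, total_epochs=20):
    """Return which feature-alignment branch is active at a given
    0-indexed training epoch: center-point / standard-box alignment
    for the first 5 epochs, class-wise RPN-feature alignment for the
    remaining 15, per the schedule in the detailed description."""
    if epoch < first_stage_epochs:
        return "stage1"
    if epoch < total_epochs:
        return "stage2"
    return "done"
```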
The above are merely embodiments of the present invention. Unless stated otherwise, any feature disclosed in this specification may be replaced by alternative features serving equivalent or similar purposes; all of the disclosed features, or all of the method or process steps, may be combined in any way, except for combinations of mutually exclusive features and/or steps.
Claims (3)
1. A target detection method based on two-stage local feature alignment is characterized by comprising the following steps:
Step 1: the method uses Faster R-CNN as the target detection framework; the feature extraction backbone is VGG16-D, comprising a first convolution block, a first downsampling layer, a second convolution block, a second downsampling layer, a third convolution block, a third downsampling layer, a fourth convolution block, a fourth downsampling layer, and a fifth convolution block; the feature map output by the fifth convolution block is denoted F, and its size is M × N × 512;
Step 2: this step comprises the RPN module, the RoI Head module, and the classification-regression module of the original Faster R-CNN; the RPN module generates candidate boxes and maps them onto the feature map F to obtain feature candidate regions A_i, where i indexes the i-th candidate region; the RoI Head module scales candidate regions of different sizes to 7 × 7, unfolds them into one-dimensional features, and passes them through two fully connected layers FC1 and FC2 to obtain the feature vector A'_i; A'_i is the input to the classification and regression module, which outputs the classification result F_i and the regression offsets O_i of the predicted box coordinates; in the two-stage feature alignment algorithm, the Faster R-CNN module participates in both the first and the second training stage;
Step 3: training lasts 20 epochs in total, the first training stage being the first 5 epochs; in the first training stage, on the basis of step 2, the feature map F output by the fifth convolution block is additionally input into a 1 × 1 × 512 convolution with a single output channel, producing a new feature map F' of size M × N × 1; a sigmoid operation is applied to each feature point of F', and points whose result is greater than or equal to a threshold λ are kept, giving the pre-clustering feature center-point map C; a clustering algorithm then decides which foreground object each point specifically belongs to, dividing the points into Z classes; within each class, the top-k points by probability are kept as feature center points and the remaining points are set to 0, finally yielding the clustered feature center-point map P; each feature center point on P is denoted P_k;
Step 4: based on step 3, for each center point P_k on the feature center-point map P, standard feature candidate boxes of sizes 1 × 1, 3 × 3 and 5 × 5 (in units of feature points on F) are generated on the feature map F, each centered at P_k; the feature regions of these standard boxes mapped on F are denoted B_k^1, B_k^3 and B_k^5; within B_k^1, B_k^3 and B_k^5, any feature point whose value is less than δ times the value of the center point on F is set to 0, i.e. B_k^1, B_k^3 and B_k^5 undergo background suppression, producing the background-suppressed feature regions B'_k^1, B'_k^3 and B'_k^5;
Step 5: the background-suppressed feature regions B'_k^1, B'_k^3 and B'_k^5 generated in step 4 are transversely unfolded and concatenated into a one-dimensional feature vector L_k of size (35 × 3 × Z) × 1, which is input to the domain classifier D_1 to judge whether L_k belongs to the source domain or the target domain; the structure of D_1 is three fully connected layers, denoted FC3, FC4 and FC5; FC3 has input dimension 35 × 3 and output dimension 1024, FC4 has input dimension 1024 and output dimension 1024, and FC5 has input dimension 1024 and output dimension 1; the domain classification loss is as follows:

L_img = -Σ_{i,j,k,u,v} [D_i log D_1(φ_{i,j,k}(u, v)) + (1 - D_i) log(1 - D_1(φ_{i,j,k}(u, v)))]   (1)

where D_i is the domain label, D_i = 0 indicating that the feature region comes from the source domain and D_i = 1 that it comes from the target domain, and φ_{i,j,k}(u, v) denotes the feature vector at coordinates (u, v) in the feature region of the k-th standard feature candidate box of the j-th center point in the i-th image sample, with k ∈ {0, 1, 2};
Step 6: during the last 15 training epochs, the contents of steps 3-5 no longer participate in training; on the basis of step 2, in each training round, among the feature vectors output by the RoI Head module, those whose classification result is class c in the source-domain samples update the per-round class feature f_c^{S,t}, and those classified as class c in the target-domain samples update f_c^{T,t}, as shown in formulas (2) and (3), where S and T indicate that the sample features come from the source and target domain respectively and t denotes the t-th training round; the per-round features f_c^{S,t} and f_c^{T,t} generated in the current round are used, class by class, to update the accumulated features F_c^{S,t} and F_c^{T,t} of each foreground class from the start of the second training stage up to the current round t, after which feature alignment is performed; the update of the accumulated feature vectors is given by formulas (4) and (5);
Step 7: the per-class accumulated features F_c^{S,t} and F_c^{T,t} output in step 6 at each training round are compared using the L_2 distance, which measures the difference between same-class target features of the source and target domains, giving the loss function of equation (6):

L_cat = Σ_c || F_c^{S,t} - F_c^{T,t} ||_2   (6)
Step 8: the foreground target feature vectors p_{i,j} output by the RoI Head in each training round are input to the domain classifier D_2, giving the loss function of equation (7):

L_ins = -Σ_{i,j} [D_i log D_2(p_{i,j}) + (1 - D_i) log(1 - D_2(p_{i,j}))].
2. The target detection method according to claim 1, wherein in step 3 the threshold λ is 0.8 and the clustering algorithm is K-means++.
3. The target detection method according to claim 1, wherein in step 4 δ = 1/2, i.e. background feature points whose values fall below half of the center feature value are suppressed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110270152.3A CN113052184B (en) | 2021-03-12 | 2021-03-12 | Target detection method based on two-stage local feature alignment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113052184A CN113052184A (en) | 2021-06-29 |
CN113052184B true CN113052184B (en) | 2022-11-18 |
Family
ID=76512631
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110270152.3A Active CN113052184B (en) | 2021-03-12 | 2021-03-12 | Target detection method based on two-stage local feature alignment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113052184B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113343989B (en) * | 2021-07-09 | 2022-09-27 | 中山大学 | Target detection method and system based on self-adaption of foreground selection domain |
CN114399697A (en) * | 2021-11-25 | 2022-04-26 | 北京航空航天大学杭州创新研究院 | Scene self-adaptive target detection method based on moving foreground |
CN114821152B (en) * | 2022-03-23 | 2023-05-02 | 湖南大学 | Domain self-adaptive target detection method and system based on foreground-class perception alignment |
CN115131590B (en) * | 2022-09-01 | 2022-12-06 | 浙江大华技术股份有限公司 | Training method of target detection model, target detection method and related equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977812A (en) * | 2019-03-12 | 2019-07-05 | 南京邮电大学 | A kind of Vehicular video object detection method based on deep learning |
CN110321891A (en) * | 2019-03-21 | 2019-10-11 | 长沙理工大学 | A kind of big infusion medical fluid foreign matter object detection method of combined depth neural network and clustering algorithm |
CN111882055A (en) * | 2020-06-15 | 2020-11-03 | 电子科技大学 | Method for constructing target detection self-adaptive model based on cycleGAN and pseudo label |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109977790A (en) * | 2019-03-04 | 2019-07-05 | 浙江工业大学 | A kind of video smoke detection and recognition methods based on transfer learning |
CN110363122B (en) * | 2019-07-03 | 2022-10-11 | 昆明理工大学 | Cross-domain target detection method based on multi-layer feature alignment |
CN110516671B (en) * | 2019-08-27 | 2022-06-07 | 腾讯科技(深圳)有限公司 | Training method of neural network model, image detection method and device |
AU2019101224A4 (en) * | 2019-10-05 | 2020-01-16 | Shu, Zikai MR | Method of Human detection research and implement based on deep learning |
CN111861978B (en) * | 2020-05-29 | 2023-10-31 | 陕西师范大学 | Bridge crack example segmentation method based on Faster R-CNN |
CN111832513B (en) * | 2020-07-21 | 2024-02-09 | 西安电子科技大学 | Real-time football target detection method based on neural network |
CN112465752A (en) * | 2020-11-16 | 2021-03-09 | 电子科技大学 | Improved Faster R-CNN-based small target detection method |
- 2021
- 2021-03-12 Application CN202110270152.3A filed in China (CN); granted as patent CN113052184B, status Active
Also Published As
Publication number | Publication date |
---|---|
CN113052184A (en) | 2021-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113052184B (en) | Target detection method based on two-stage local feature alignment | |
CN107301383B (en) | Road traffic sign identification method based on Fast R-CNN | |
CN106096561B (en) | Infrared pedestrian detection method based on image block deep learning features | |
CN111460968B (en) | Unmanned aerial vehicle identification and tracking method and device based on video | |
CN107633226B (en) | Human body motion tracking feature processing method | |
CN111723693B (en) | Crowd counting method based on small sample learning | |
CN112395957B (en) | Online learning method for video target detection | |
KR102190527B1 (en) | Apparatus and method for automatic synthesizing images | |
CN106815323B (en) | Cross-domain visual retrieval method based on significance detection | |
CN111461039A (en) | Landmark identification method based on multi-scale feature fusion | |
CN113129335B (en) | Visual tracking algorithm and multi-template updating strategy based on twin network | |
Reis et al. | Combining convolutional side-outputs for road image segmentation | |
Liu et al. | A deep fully convolution neural network for semantic segmentation based on adaptive feature fusion | |
CN112990282B (en) | Classification method and device for fine-granularity small sample images | |
Yang et al. | A fast and effective video vehicle detection method leveraging feature fusion and proposal temporal link | |
CN111898665A (en) | Cross-domain pedestrian re-identification method based on neighbor sample information guidance | |
Maggiolo et al. | Improving maps from CNNs trained with sparse, scribbled ground truths using fully connected CRFs | |
Murugan et al. | Automatic moving vehicle detection and classification based on artificial neural fuzzy inference system | |
CN113627481A (en) | Multi-model combined unmanned aerial vehicle garbage classification method for smart gardens | |
Liang et al. | Multiple object tracking by reliable tracklets | |
Prabhakar et al. | Cdnet++: Improved change detection with deep neural network feature correlation | |
Saini et al. | Object Detection in Surveillance Using Deep Learning Methods: A Comparative Analysis | |
Zhang et al. | Bus passenger flow statistics algorithm based on deep learning | |
CN114067240A (en) | Pedestrian single-target tracking method based on online updating strategy and fusing pedestrian characteristics | |
Fu et al. | Foreground gated network for surveillance object detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||