CN109472298A - Deep bidirectional feature pyramid enhanced network for small-scale target detection - Google Patents
Deep bidirectional feature pyramid enhanced network for small-scale target detection
- Publication number
- CN109472298A (application CN201811219005.8A)
- Authority
- CN
- China
- Prior art keywords
- target
- output
- network
- scale
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The present invention relates to a deep bidirectional feature pyramid enhanced network for small-scale target detection, comprising: determining the backbone network at the encoding end of the network; designing a bottom-up feature pyramid; designing a top-down feature pyramid; a target detection sub-network, which adopts the two-stage detection strategy of faster-rcnn, namely a candidate box extraction stage and a target classification stage: in the RPN stage, a convolution with a 3x3 kernel is applied to the output feature map of each scale of the top-down feature pyramid to regress target boxes and predict the probability that each box is a target; the candidate target boxes remaining after screening are then ROI-pooled with the output feature map of the corresponding scale of the top-down feature pyramid; finally, two fully connected layers refine the boxes and classify the specific target category; and outputting the target detection result.
Description
Technical field
The invention belongs to target detection technology in fields such as computer vision, pattern recognition, deep learning and artificial intelligence. More specifically, it relates to techniques that use deep convolutional neural networks to detect the targets in a scene in an image or a video.
Background art
In deep-learning-based object detection, as overall detection performance has kept improving, the detection of small-scale objects has become the new bottleneck, and several new network structures have been proposed to improve it. The feature pyramid network (FPN) [1] is representative of these. FPN introduces the pyramid idea, widely used in traditional image processing, into the deep object detection framework, and achieves a large improvement in detecting objects across a wide range of scales, especially small-scale objects. The feature pyramid in FPN is constructed in a top-down manner and is integrated with the backbone network; it can be used in both single-stage and two-stage object detection methods. The feature pyramid structure in DSSD [2] is similar to that of FPN but is constructed in a more complex way, and is used for single-stage object detection. In BlitzNet [3], the authors use a feature pyramid structure to solve the multi-task problem of object detection and semantic segmentation simultaneously, again for single-stage object detection. In DSOD [4], the authors propose a bottom-up network structure based on dense connections, which fuses more shallow-layer network features in the forward pass. Although these methods improve small-object detection performance to some extent, they still fall short of the requirements of real scenes.
Most existing methods fuse the backbone feature of the previous scale with the feature of the current scale of the pyramid network through a skip-connection module. Some use a top-down feature pyramid structure and some use a bottom-up structure, but none makes full use of features at different scales and different semantic levels.
Recognizing objects over a large range of scales is a challenge for the field of computer vision. At present, object detection methods based on deep neural networks have achieved an overwhelming performance advantage in object detection. However, most existing object detection methods detect large-scale objects well but perform unsatisfactorily on small-scale objects. A common small-scale object detection problem is shown in Fig. 1. The reason is that as the network deepens, the resolution of the feature maps decreases accordingly, and the information of small-scale objects is gradually submerged in the contextual background during feature extraction. Yet scenarios such as autonomous driving place very strict performance requirements on small-scale object detection.
References:
[1] Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2017, July). Feature pyramid networks for object detection. In CVPR (Vol. 1, No. 2, p. 4).
[2] Fu, C. Y., Liu, W., Ranga, A., Tyagi, A., & Berg, A. C. (2017). DSSD: Deconvolutional single shot detector. arXiv preprint arXiv:1701.06659.
[3] Dvornik, N., Shmelkov, K., Mairal, J., & Schmid, C. (2017, October). BlitzNet: A real-time deep network for scene understanding. In ICCV 2017 - International Conference on Computer Vision (p. 11).
[4] Shen, Z., Liu, Z., Li, J., Jiang, Y. G., Chen, Y., & Xue, X. (2017, October). DSOD: Learning deeply supervised object detectors from scratch. In The IEEE International Conference on Computer Vision (ICCV) (Vol. 3, No. 6, p. 7).
Summary of the invention
To alleviate the problem that small-scale object information is gradually submerged by the background as the network deepens, the invention proposes a deep bidirectional feature pyramid enhanced network for small-scale target detection, so as to improve the scale robustness of object detection algorithms. The technical solution is as follows:
A deep bidirectional feature pyramid enhanced network for small-scale target detection, comprising:
(1) Determining the backbone network at the encoding end of the network: a residual network is used as the backbone network. The residual network contains 5 convolution modules, each of which begins with a pooling layer or a convolution with stride two (stride convolution).
(2) Designing the bottom-up feature pyramid: in constructing the bottom-up feature pyramid, each feature fusion operation is performed by element-wise addition of two feature paths. One path takes the output of the pooling layer or stride-two convolutional layer of the backbone network at the current scale and passes it through a convolutional layer with a 1x1 kernel, which fuses channel features and adjusts the channel dimension to a unified 256. The other path is the output of the previous feature fusion in the bottom-up pyramid structure after a convolutional layer with a 3x3 kernel, stride two and 256 output channels. Fusion is performed at three consecutive scales, starting from the third convolution module of the backbone network;
(3) Designing the top-down feature pyramid: in constructing the top-down feature pyramid, each feature fusion operation fuses three feature paths by element-wise addition. The first path is the output of the last layer of the backbone convolution module whose scale equals the output scale of the current fusion module, passed through a convolutional layer with a 1x1 kernel that fuses channel features and adjusts the channel dimension to a unified 256. The second path is the output of the feature fusion module on the bottom-up pyramid at 1/2 the output scale of the current fusion module, passed through a convolutional layer with a 3x3 kernel and 256 output channels and then upsampled by a factor of 2. The third path is the output of the feature fusion module on the top-down pyramid at 1/2 the output scale of the current fusion module, passed through a convolutional layer with a 3x3 kernel and 256 output channels and then upsampled by a factor of 2. In the top-down feature pyramid, fusion is performed at three consecutive scales, starting from the output of the last convolution module of the backbone network;
(4) Target detection sub-network: the two-stage detection strategy of faster-rcnn is adopted, consisting of a candidate box extraction stage and a target classification stage. In the RPN stage, a convolution with a 3x3 kernel is applied to the output feature map of each scale of the top-down feature pyramid to regress target boxes and predict the probability that each box is a target. The candidate target boxes remaining after screening are then ROI-pooled with the output feature map of the corresponding scale of the top-down feature pyramid, and finally two fully connected layers refine the boxes and classify the specific target category;
(5) Outputting the target detection result: given an input image, features are extracted by the backbone network and fused by the bottom-up and top-down feature pyramids, and candidate target boxes are extracted and classified on the fused feature maps of the top-down feature pyramid. The output position and scale of a target are obtained by taking the position information of the candidate target box output by the RPN stage and applying the position regression adjustment of the target classification stage; the category of the target is determined by the output of the target classification stage. By fusing the multi-scale feature space and the semantic space at the decoding end, a high-resolution prediction map is obtained; the prediction map is upsampled to the same scale as the image, yielding a pixel-level semantic segmentation map of the input image.
Compared with FPN, the proposed network structure fuses top-down and bottom-up pyramids at the same time, and can retain more small-scale object information from the detail-rich shallow layers. Because the bottom-up pyramid structure uses a small number of channels, it adds only a small amount of extra computation. The bottom-up pyramid fuses features in the forward part of the backbone network and provides additional paths for information to propagate, thereby reducing the loss of small-object information during information transfer. Each feature fusion module in the top-down feature pyramid draws on three information sources whose semantic levels all differ, which increases the diversity of information and helps retain the information of small-scale objects.
Brief description of the drawings
Fig. 1 illustrates the small-scale target detection problem. In the left image, the squatting person in dark clothing is missed; in the right image, the two innermost children are missed.
Fig. 2 depicts the deep bidirectional feature pyramid enhanced network for small-scale target detection proposed by the invention.
Fig. 3 depicts the feature fusion operations in the bottom-up feature pyramid and the top-down feature pyramid.
Fig. 4 illustrates the overall target detection network architecture.
Fig. 5 compares some experimental results of Resnet50-FPN and the PEN proposed by the invention.
Fig. 6 illustrates a general embodiment of the invention.
Specific embodiment
Enhance network, network knot the invention proposes a depth binary feature pyramid for the detection of small scaled target
Structure is as shown in Fig. 2, it is capable of the forward direction transmitting of Enhanced feature, the holding of especially small scaled target information.Mentioned network includes
One trunk convolutional neural networks and one semantically bottom-up (bottom-up) feature pyramid and one are semantically
The feature pyramid of top-down (top-down).Top-down feature pyramid includes that three fusions input source, is respectively
The previous scale of core network, the pyramidal current scale of top-down feature and bottom-up feature are pyramidal current
Scale.The Fusion Features module of pyramid structure is as shown in Figure 3.Fusion Features module packet in top-down feature pyramid
Include up-sampling, the operation that convolution sum corresponding element is added.In the case where the Fusion Features module in bottom-up feature pyramid includes
The operation that sampling, convolution sum corresponding element are added.In the present invention, similar to enhance using bottom-up feature pyramid
The top-down feature pyramid of FPN in network forward process so that keep richer small scaled target information.Enhancing
Binary feature pyramid network afterwards can be combined with target detection sub-network (such as fast-rcnn) as major network and be formed
Overall target detection network is as shown in Figure 4.Either bottom-up feature pyramid and top-down feature pyramid
Structure is all conducive to improve information loss of the small scaled target in depth network forward process, to improve small scaled target inspection
The performance of survey.
The invention mainly covers two aspects: the construction of the overall network and the learning of the network parameters. Each is described in detail below.
First is the construction of the overall network, which can be divided into four parts: the backbone network, the bottom-up feature pyramid, the top-down feature pyramid and the target detection sub-network.
Backbone network: in our implementation, the classical residual network is used as the backbone network. In practice, a suitable backbone can be chosen according to the requirements of the application scenario and the equipment. For example, scenarios with high speed requirements and limited computing power call for a lightweight backbone such as Resnet18, while scenarios with loose speed and equipment constraints but strict accuracy requirements can use a more complex backbone. Here we take Resnet50 as an example: the 50-layer residual network contains 5 convolution modules, each of which begins with a pooling layer or a convolution with stride two (stride convolution).
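Because each of the 5 convolution modules begins with a stride-two operation, the feature map resolution halves at every module. The following minimal sketch tabulates the resulting sizes; the function name, the 800x1344 input size and the round-up rule for odd sizes are illustrative assumptions, not details from the patent.

```python
def backbone_feature_sizes(height, width, num_modules=5):
    """Return the (height, width) of the feature map after each convolution module.

    Assumption: every module halves the resolution exactly once, and odd
    sizes are rounded up (as with padded stride-2 operations).
    """
    sizes = []
    for _ in range(num_modules):
        height = (height + 1) // 2
        width = (width + 1) // 2
        sizes.append((height, width))
    return sizes


# Hypothetical input resolution, chosen only for illustration.
sizes = backbone_feature_sizes(800, 1344)
for module_index, size in enumerate(sizes, start=1):
    print(f"module {module_index}: {size[0]} x {size[1]}")
```

Under these assumptions the last module works at 1/32 of the input resolution, which is why small targets occupy very few feature map cells there.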
Bottom-up feature pyramid: the bottom-up pyramid is constructed by the operation shown in the left part of Fig. 3. In constructing the bottom-up feature pyramid, each feature fusion operation is performed by element-wise addition of two feature paths. One path takes the output of the pooling layer or stride-two convolutional layer of the backbone network at the current scale and passes it through a convolutional layer with a 1x1 kernel, which fuses channel features and adjusts the channel dimension to a unified 256. The other path is the output of the previous feature fusion in the bottom-up pyramid structure after a convolutional layer with a 3x3 kernel, stride 2 and 256 output channels. In this way, fusion is performed at three consecutive scales, starting from the third convolution module of the backbone network.
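A single bottom-up fusion step can be sketched with NumPy as follows. This is only an illustration of the two-path addition described above: the helper names, the random weights, the 512 input channels, the zero padding and the absence of bias terms or activations are all assumptions, since the patent does not specify them.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolution. x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def conv3x3(x, weight, stride=1):
    """3x3 convolution with zero padding 1. weight: (C_out, C_in, 3, 3)."""
    _, height, width = x.shape
    out_h = (height - 1) // stride + 1
    out_w = (width - 1) // stride + 1
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], out_h, out_w))
    # Accumulate the contribution of each of the nine kernel taps.
    for i in range(3):
        for j in range(3):
            patch = padded[:, i:i + stride * out_h:stride, j:j + stride * out_w:stride]
            out += np.einsum("oc,chw->ohw", weight[:, :, i, j], patch)
    return out


def bottom_up_fuse(backbone_feat, prev_fused, w_lateral, w_down):
    """Element-wise sum of the 1x1 lateral path and the stride-2 3x3 path."""
    lateral = conv1x1(backbone_feat, w_lateral)      # unify channels to 256
    down = conv3x3(prev_fused, w_down, stride=2)     # halve the resolution
    return lateral + down


rng = np.random.default_rng(0)
channels = 256
backbone_feat = rng.standard_normal((512, 8, 8))        # current-scale backbone output (512 is illustrative)
prev_fused = rng.standard_normal((channels, 16, 16))    # previous fusion, at twice the resolution
w_lateral = rng.standard_normal((channels, 512)) * 0.01
w_down = rng.standard_normal((channels, channels, 3, 3)) * 0.01
fused = bottom_up_fuse(backbone_feat, prev_fused, w_lateral, w_down)
print(fused.shape)
```

The stride-2 convolution brings the previous fusion result down to the current scale, so the two paths have identical shapes before they are added.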
Top-down feature pyramid: the top-down feature pyramid is constructed by the operation shown in the right part of Fig. 3. In constructing the top-down feature pyramid, each feature fusion operation fuses three feature paths by element-wise addition. The first path is the output of the last layer of the backbone convolution module whose scale equals the output scale of the current fusion module, passed through a convolutional layer with a 1x1 kernel that fuses channel features and adjusts the channel dimension to a unified 256. The second path is the output of the feature fusion module on the bottom-up pyramid at 1/2 the output scale of the current fusion module, passed through a convolutional layer with a 3x3 kernel and 256 output channels and then upsampled by a factor of 2. The third path is the output of the feature fusion module on the top-down pyramid at 1/2 the output scale of the current fusion module, passed through a convolutional layer with a 3x3 kernel and 256 output channels and then upsampled by a factor of 2. In the top-down feature pyramid, fusion is performed at three consecutive scales, starting from the output of the last convolution module of the backbone network.
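The three-path fusion can be sketched in the same way. As before, this is a hedged illustration: the helper names, random weights and 512 backbone channels are invented for the example, and nearest-neighbour upsampling is an assumption because the patent only states that the upsampling factor is 2.

```python
import numpy as np


def conv1x1(x, weight):
    """1x1 convolution. x: (C_in, H, W), weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def conv3x3(x, weight):
    """Stride-1 3x3 convolution with zero padding 1. weight: (C_out, C_in, 3, 3)."""
    _, height, width = x.shape
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], height, width))
    for i in range(3):
        for j in range(3):
            patch = padded[:, i:i + height, j:j + width]
            out += np.einsum("oc,chw->ohw", weight[:, :, i, j], patch)
    return out


def upsample2x(x):
    """Nearest-neighbour upsampling by a factor of 2 (an assumed choice)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def top_down_fuse(backbone_feat, bottom_up_half, top_down_half, w_lateral, w_bu, w_td):
    """Element-wise sum of the backbone, bottom-up and top-down paths."""
    lateral = conv1x1(backbone_feat, w_lateral)               # path 1: same scale, 1x1 conv
    from_bottom_up = upsample2x(conv3x3(bottom_up_half, w_bu))  # path 2: half scale, 3x3 conv, x2
    from_top_down = upsample2x(conv3x3(top_down_half, w_td))    # path 3: half scale, 3x3 conv, x2
    return lateral + from_bottom_up + from_top_down


rng = np.random.default_rng(0)
channels = 256
backbone_feat = rng.standard_normal((512, 8, 8))          # 512 channels is illustrative
bottom_up_half = rng.standard_normal((channels, 4, 4))
top_down_half = rng.standard_normal((channels, 4, 4))
w_lateral = rng.standard_normal((channels, 512)) * 0.01
w_bu = rng.standard_normal((channels, channels, 3, 3)) * 0.01
w_td = rng.standard_normal((channels, channels, 3, 3)) * 0.01
fused = top_down_fuse(backbone_feat, bottom_up_half, top_down_half, w_lateral, w_bu, w_td)
print(fused.shape)
```

Both half-scale inputs are upsampled to the resolution of the backbone path before the addition, so all three terms share the shape (256, H, W).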
Target detection sub-network: as in FPN, the two-stage detection strategy of faster-rcnn is adopted, consisting of a candidate box extraction stage and a target classification stage. In the RPN (region proposal network) stage, a convolution with a 3x3 kernel is applied to the output feature map of each scale of the top-down feature pyramid to regress target boxes and predict the probability that each box is a target. The candidate target boxes remaining after screening are then ROI-pooled with the output feature map of the corresponding scale of the top-down feature pyramid, and finally two fully connected layers refine the boxes and classify the specific target category. There is, however, a difference from FPN here. In FPN, the fused feature at each pyramid scale is led out through a convolutional layer with a 3x3 kernel and then used for detection. In the proposed network, each fusion is immediately followed by a 3x3 convolutional layer, and detection is performed on the output feature map of that convolutional layer.
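The text does not say how a candidate box is matched to its "corresponding scale". One common choice, shown here purely as an assumption, is the level-assignment heuristic from the FPN paper [1], k = floor(k0 + log2(sqrt(w*h) / 224)). The level range 3 to 5 and the function name are likewise hypothetical.

```python
import math


def roi_pyramid_level(box_width, box_height, k0=4, canonical_size=224,
                      min_level=3, max_level=5):
    """Pick the pyramid level whose feature map a candidate box is pooled from.

    Implements the FPN heuristic k = floor(k0 + log2(sqrt(w * h) / canonical_size))
    and clamps the result to the available levels. The level range is an
    assumption; the patent only states that three scales are fused.
    """
    scale = math.sqrt(box_width * box_height)
    level = math.floor(k0 + math.log2(scale / canonical_size))
    return max(min_level, min(max_level, level))


for size in (32, 112, 224, 1000):
    print(size, roi_pyramid_level(size, size))
```

With this rule, smaller boxes are pooled from higher-resolution levels, which is where the enhanced pyramid is expected to preserve small-target detail.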
Next is the learning of the network parameters, which can be divided into the following three parts.
Preparation of training and test data: to demonstrate the effectiveness of the proposed network, a database must be selected and divided into a training set and a test set. The training set is used to learn the network parameters, and the test set is used to evaluate the overall performance of the network. Since we are concerned with small-scale target detection, the COCO dataset released by Microsoft is a good choice: it is already divided into training and test sets and provides a fairly objective evaluation standard. All we need to do is process the dataset into the input format our network requires and apply some data augmentation operations; this depends on the deep learning framework selected, such as caffe, tensorflow, caffe2, mxnet or pytorch. All of our experiments are based on this dataset.
Network initialization and training hyperparameter settings: we use a resnet50 model trained on the image recognition database Imagenet as the initial value of the backbone network parameters, and initialize the remaining parameters randomly. Training is carried out on a single NVIDIA TITAN X GPU. The training hyperparameters are: 20 passes over the dataset (epochs), an initial learning rate of 1e-2 that is reduced to 1/10 of its previous value after the 12th and the 18th epochs, and a batch size of 2.
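The learning rate schedule above can be written as a small step function. This is a sketch under the assumption that each drop multiplies the current rate by 1/10 and that epochs are counted from 1; the function and parameter names are illustrative.

```python
def learning_rate(epoch, base_lr=1e-2, milestones=(12, 18), gamma=0.1):
    """Learning rate used during a given 1-indexed epoch.

    Assumption: the rate is multiplied by gamma once each milestone
    epoch has finished.
    """
    drops = sum(1 for milestone in milestones if epoch > milestone)
    return base_lr * gamma ** drops


NUM_EPOCHS = 20
BATCH_SIZE = 2
schedule = [learning_rate(epoch) for epoch in range(1, NUM_EPOCHS + 1)]
for epoch, lr in enumerate(schedule, start=1):
    print(f"epoch {epoch:2d}: lr = {lr:.0e}")
```

Under this reading, epochs 1 to 12 use 1e-2, epochs 13 to 18 use 1e-3, and epochs 19 and 20 use 1e-4.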
Choice of training strategy: we use a two-stage training strategy. In the first stage, the backbone network parameters are fixed, and the parameters of the bottom-up and top-down feature pyramids and of the detection sub-network are adjusted until convergence. In the second stage, the whole network is fine-tuned together.
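The two-stage strategy amounts to choosing which parameter groups receive gradient updates in each stage. The sketch below is framework-independent; the group names are hypothetical labels for the four network parts listed earlier, not identifiers from any real library.

```python
# Hypothetical names for the four parts of the overall network.
PARAMETER_GROUPS = (
    "backbone",
    "bottom_up_pyramid",
    "top_down_pyramid",
    "detection_subnetwork",
)


def trainable_groups(stage):
    """Return the parameter groups that are updated in the given training stage."""
    if stage == 1:
        # Stage 1: the backbone stays fixed at its pretrained values.
        return tuple(group for group in PARAMETER_GROUPS if group != "backbone")
    if stage == 2:
        # Stage 2: the whole network is fine-tuned together.
        return PARAMETER_GROUPS
    raise ValueError(f"unknown training stage: {stage}")


for stage in (1, 2):
    print(stage, trainable_groups(stage))
```

In a real framework this selection would typically be applied by disabling gradient computation for the frozen groups during the first stage.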
Effect of the embodiment: with resnet50 as the backbone network for both, the overall performance of our proposed network (PEN for short) and FPN on the COCO dataset is compared in Table 1. It can be seen that the proposed PEN improves the scale robustness of detection, and in particular clearly improves the detection of small-scale targets.
Fig. 5 shows some comparison results between the network proposed by the invention (pyramid enhancement network, PEN for short) and FPN. With the same backbone network (such as Resnet50), the proposed PEN has a clear advantage over FPN in detecting small-scale targets (such as trains, people and cars).
Table 1: Comparison of test performance on the COCO minival dataset
Claims (1)
1. A deep bidirectional feature pyramid enhanced network for small-scale target detection, comprising:
(1) Determining the backbone network at the encoding end of the network: a residual network is used as the backbone network. The residual network contains 5 convolution modules, each of which begins with a pooling layer or a convolution with stride two (stride convolution).
(2) Designing the bottom-up feature pyramid: in constructing the bottom-up feature pyramid, each feature fusion operation is performed by element-wise addition of two feature paths. One path takes the output of the pooling layer or stride-two convolutional layer of the backbone network at the current scale and passes it through a convolutional layer with a 1x1 kernel, which fuses channel features and adjusts the channel dimension to a unified 256. The other path is the output of the previous feature fusion in the bottom-up pyramid structure after a convolutional layer with a 3x3 kernel, stride two and 256 output channels. Fusion is performed at three consecutive scales, starting from the third convolution module of the backbone network;
(3) Designing the top-down feature pyramid: in constructing the top-down feature pyramid, each feature fusion operation fuses three feature paths by element-wise addition. The first path is the output of the last layer of the backbone convolution module whose scale equals the output scale of the current fusion module, passed through a convolutional layer with a 1x1 kernel that fuses channel features and adjusts the channel dimension to a unified 256. The second path is the output of the feature fusion module on the bottom-up pyramid at 1/2 the output scale of the current fusion module, passed through a convolutional layer with a 3x3 kernel and 256 output channels and then upsampled by a factor of 2. The third path is the output of the feature fusion module on the top-down pyramid at 1/2 the output scale of the current fusion module, passed through a convolutional layer with a 3x3 kernel and 256 output channels and then upsampled by a factor of 2. In the top-down feature pyramid, fusion is performed at three consecutive scales, starting from the output of the last convolution module of the backbone network;
(4) Target detection sub-network: the two-stage detection strategy of faster-rcnn is adopted, consisting of a candidate box extraction stage and a target classification stage. In the RPN stage, a convolution with a 3x3 kernel is applied to the output feature map of each scale of the top-down feature pyramid to regress target boxes and predict the probability that each box is a target. The candidate target boxes remaining after screening are then ROI-pooled with the output feature map of the corresponding scale of the top-down feature pyramid, and finally two fully connected layers refine the boxes and classify the specific target category;
(5) Outputting the target detection result: given an input image, features are extracted by the backbone network and fused by the bottom-up and top-down feature pyramids, and candidate target boxes are extracted and classified on the fused feature maps of the top-down feature pyramid. The output position and scale of a target are obtained by taking the position information of the candidate target box output by the RPN stage and applying the position regression adjustment of the target classification stage; the category of the target is determined by the output of the target classification stage. By fusing the multi-scale feature space and the semantic space at the decoding end, a high-resolution prediction map is obtained; the prediction map is upsampled to the same scale as the image, yielding a pixel-level semantic segmentation map of the input image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811219005.8A CN109472298B (en) | 2018-10-19 | 2018-10-19 | Deep bidirectional feature pyramid enhanced network for small-scale target detection |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811219005.8A CN109472298B (en) | 2018-10-19 | 2018-10-19 | Deep bidirectional feature pyramid enhanced network for small-scale target detection |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109472298A true CN109472298A (en) | 2019-03-15 |
CN109472298B CN109472298B (en) | 2021-06-01 |
Family
ID=65664134
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811219005.8A Expired - Fee Related CN109472298B (en) | 2018-10-19 | 2018-10-19 | Deep bidirectional feature pyramid enhanced network for small-scale target detection |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109472298B (en) |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109858539A (en) * | 2019-01-24 | 2019-06-07 | 武汉精立电子技术有限公司 | A kind of ROI region extracting method based on deep learning image, semantic parted pattern |
CN109903339A (en) * | 2019-03-26 | 2019-06-18 | 南京邮电大学 | A kind of video group personage's position finding and detection method based on multidimensional fusion feature |
CN110084816A (en) * | 2019-03-21 | 2019-08-02 | 深圳大学 | Method for segmenting objects, device, computer readable storage medium and computer equipment |
CN110084124A (en) * | 2019-03-28 | 2019-08-02 | 北京大学 | Feature based on feature pyramid network enhances object detection method |
CN110334622A (en) * | 2019-06-24 | 2019-10-15 | 电子科技大学 | Based on the pyramidal pedestrian retrieval method of self-adaptive features |
CN110348384A (en) * | 2019-07-12 | 2019-10-18 | 沈阳理工大学 | A kind of Small object vehicle attribute recognition methods based on Fusion Features |
CN110378297A (en) * | 2019-07-23 | 2019-10-25 | 河北师范大学 | A kind of Remote Sensing Target detection method based on deep learning |
CN110580699A (en) * | 2019-05-15 | 2019-12-17 | 徐州医科大学 | Pathological image cell nucleus detection method based on improved fast RCNN algorithm |
CN111104962A (en) * | 2019-11-05 | 2020-05-05 | 北京航空航天大学青岛研究院 | Semantic segmentation method and device for image, electronic equipment and readable storage medium |
CN111242122A (en) * | 2020-01-07 | 2020-06-05 | 浙江大学 | Lightweight deep neural network rotating target detection method and system |
CN111460926A (en) * | 2020-03-16 | 2020-07-28 | 华中科技大学 | Video pedestrian detection method fusing multi-target tracking clues |
CN111539435A (en) * | 2020-04-15 | 2020-08-14 | 创新奇智(合肥)科技有限公司 | Semantic segmentation model construction method, image segmentation equipment and storage medium |
CN111695398A (en) * | 2019-12-24 | 2020-09-22 | 珠海大横琴科技发展有限公司 | Small target ship identification method and device and electronic equipment |
CN111898615A (en) * | 2020-06-16 | 2020-11-06 | 济南浪潮高新科技投资发展有限公司 | Feature extraction method, device, equipment and medium of object detection model |
CN112528976A (en) * | 2021-02-09 | 2021-03-19 | 北京世纪好未来教育科技有限公司 | Text detection model generation method and text detection method |
CN112634273A (en) * | 2021-03-10 | 2021-04-09 | 四川大学 | Brain metastasis segmentation system based on deep neural network and construction method thereof |
CN113011442A (en) * | 2021-03-26 | 2021-06-22 | 山东大学 | Target detection method and system based on bidirectional adaptive feature pyramid |
CN113111736A (en) * | 2021-03-26 | 2021-07-13 | 浙江理工大学 | Multi-stage characteristic pyramid target detection method based on depth separable convolution and fusion PAN |
CN113378815A (en) * | 2021-06-16 | 2021-09-10 | 南京信息工程大学 | Model for scene text positioning recognition and training and recognition method thereof |
CN113591872A (en) * | 2020-04-30 | 2021-11-02 | 华为技术有限公司 | Data processing system, object detection method and device |
WO2022246720A1 (en) * | 2021-05-24 | 2022-12-01 | 中国科学院深圳先进技术研究院 | Training method of surgical action identification model, medium and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102063623A (en) * | 2010-12-28 | 2011-05-18 | 中南大学 | Method for extracting image region of interest by combining bottom-up and top-down ways |
CN106845351A (en) * | 2016-05-13 | 2017-06-13 | 苏州大学 | It is a kind of for Activity recognition method of the video based on two-way length mnemon in short-term |
CN107391609A (en) * | 2017-07-01 | 2017-11-24 | 南京理工大学 | A kind of Image Description Methods of two-way multi-modal Recursive Networks |
CN107798691A (en) * | 2017-08-30 | 2018-03-13 | 西北工业大学 | A kind of unmanned plane independent landing terrestrial reference real-time detecting and tracking method of view-based access control model |
CN108171752A (en) * | 2017-12-28 | 2018-06-15 | 成都阿普奇科技股份有限公司 | A kind of sea ship video detection and tracking based on deep learning |
-
2018
- 2018-10-19 CN CN201811219005.8A patent/CN109472298B/en not_active Expired - Fee Related
Non-Patent Citations (4)
Title |
---|
LEI ZHU et al.: "Bidirectional Feature Pyramid Network with Recurrent Attention Residual Modules for Shadow Detection", 《COMPUTER VISION ECCV 2018》 * |
NIKITA DVORNIK et al.: "BlitzNet: A Real-Time Deep Network for Scene Understanding", 《ARXIV:1708.02813V1 [CS.CV]》 * |
TSUNG-YI LIN et al.: "Feature Pyramid Networks for Object Detection", 《ARXIV:1612.03144V2 [CS.CV]》 * |
韦燕凤 et al.: "Implementing Hausdorff distance matching with an edge pyramid structure", 《Journal of Computer-Aided Design & Computer Graphics》 * |
Cited By (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109858539A (en) * | 2019-01-24 | 2019-06-07 | 武汉精立电子技术有限公司 | An ROI extraction method based on a deep learning image semantic segmentation model |
WO2020186563A1 (en) * | 2019-03-21 | 2020-09-24 | 深圳大学 | Object segmentation method and apparatus, computer readable storage medium, and computer device |
CN110084816A (en) * | 2019-03-21 | 2019-08-02 | 深圳大学 | Object segmentation method and apparatus, computer readable storage medium, and computer device |
CN109903339A (en) * | 2019-03-26 | 2019-06-18 | 南京邮电大学 | A video group figure positioning detection method based on multi-dimensional fusion features |
CN109903339B (en) * | 2019-03-26 | 2021-03-05 | 南京邮电大学 | Video group figure positioning detection method based on multi-dimensional fusion features |
CN110084124A (en) * | 2019-03-28 | 2019-08-02 | 北京大学 | Feature-enhanced object detection method based on feature pyramid network |
CN110580699A (en) * | 2019-05-15 | 2019-12-17 | 徐州医科大学 | Pathological image cell nucleus detection method based on improved fast RCNN algorithm |
CN110334622A (en) * | 2019-06-24 | 2019-10-15 | 电子科技大学 | Pedestrian retrieval method based on adaptive feature pyramid |
CN110334622B (en) * | 2019-06-24 | 2022-04-19 | 电子科技大学 | Pedestrian retrieval method based on adaptive feature pyramid |
CN110348384A (en) * | 2019-07-12 | 2019-10-18 | 沈阳理工大学 | A small target vehicle attribute identification method based on feature fusion |
CN110348384B (en) * | 2019-07-12 | 2022-06-17 | 沈阳理工大学 | Small target vehicle attribute identification method based on feature fusion |
CN110378297A (en) * | 2019-07-23 | 2019-10-25 | 河北师范大学 | A remote sensing image target detection method based on deep learning |
CN110378297B (en) * | 2019-07-23 | 2022-02-11 | 河北师范大学 | Remote sensing image target detection method and device based on deep learning and storage medium |
CN111104962A (en) * | 2019-11-05 | 2020-05-05 | 北京航空航天大学青岛研究院 | Semantic segmentation method and device for image, electronic equipment and readable storage medium |
CN111104962B (en) * | 2019-11-05 | 2023-04-18 | 北京航空航天大学青岛研究院 | Semantic segmentation method and device for image, electronic equipment and readable storage medium |
CN111695398A (en) * | 2019-12-24 | 2020-09-22 | 珠海大横琴科技发展有限公司 | Small target ship identification method and device and electronic equipment |
CN111242122A (en) * | 2020-01-07 | 2020-06-05 | 浙江大学 | Lightweight deep neural network rotating target detection method and system |
CN111242122B (en) * | 2020-01-07 | 2023-09-08 | 浙江大学 | Lightweight deep neural network rotating target detection method and system |
CN111460926A (en) * | 2020-03-16 | 2020-07-28 | 华中科技大学 | Video pedestrian detection method fusing multi-target tracking clues |
CN111460926B (en) * | 2020-03-16 | 2022-10-14 | 华中科技大学 | Video pedestrian detection method fusing multi-target tracking clues |
CN111539435A (en) * | 2020-04-15 | 2020-08-14 | 创新奇智(合肥)科技有限公司 | Semantic segmentation model construction method, image segmentation equipment and storage medium |
CN113591872A (en) * | 2020-04-30 | 2021-11-02 | 华为技术有限公司 | Data processing system, object detection method and device |
WO2021218786A1 (en) * | 2020-04-30 | 2021-11-04 | 华为技术有限公司 | Data processing system, object detection method and apparatus thereof |
CN111898615A (en) * | 2020-06-16 | 2020-11-06 | 济南浪潮高新科技投资发展有限公司 | Feature extraction method, device, equipment and medium of object detection model |
CN112528976A (en) * | 2021-02-09 | 2021-03-19 | 北京世纪好未来教育科技有限公司 | Text detection model generation method and text detection method |
CN112634273A (en) * | 2021-03-10 | 2021-04-09 | 四川大学 | Brain metastasis segmentation system based on deep neural network and construction method thereof |
CN113111736A (en) * | 2021-03-26 | 2021-07-13 | 浙江理工大学 | Multi-stage characteristic pyramid target detection method based on depth separable convolution and fusion PAN |
CN113011442A (en) * | 2021-03-26 | 2021-06-22 | 山东大学 | Target detection method and system based on bidirectional adaptive feature pyramid |
WO2022246720A1 (en) * | 2021-05-24 | 2022-12-01 | 中国科学院深圳先进技术研究院 | Training method of surgical action identification model, medium and device |
CN113378815A (en) * | 2021-06-16 | 2021-09-10 | 南京信息工程大学 | Model for scene text positioning recognition and training and recognition method thereof |
CN113378815B (en) * | 2021-06-16 | 2023-11-24 | 南京信息工程大学 | Scene text positioning and identifying system and training and identifying method thereof |
Also Published As
Publication number | Publication date |
---|---|
CN109472298B (en) | 2021-06-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109472298A (en) | | Deep bidirectional feature pyramid enhancement network for small-scale target detection |
CN109902806B (en) | | Method for determining target bounding box of noise image based on convolutional neural network |
Tao et al. | | An object detection system based on YOLO in traffic scene |
CN110147797B (en) | | Sketch completion and recognition method and device based on generative adversarial network |
CN110837778A (en) | | Traffic police command gesture recognition method based on skeleton joint point sequence |
CN112163498B (en) | | Method for establishing pedestrian re-identification model with foreground guiding and texture focusing functions and application of method |
CN112084866A (en) | | Target detection method based on improved YOLO v4 algorithm |
CN110309723B (en) | | Driver behavior recognition method based on human body characteristic fine classification |
CN114758288B (en) | | Power distribution network engineering safety control detection method and device |
CN109543632A (en) | | A deep network pedestrian detection method guided by shallow-layer feature fusion |
CN112800937A (en) | | Intelligent face recognition method |
CN107085723A (en) | | A holistic license plate character recognition method based on a deep learning model |
CN107038442A (en) | | A license plate detection and holistic recognition method based on deep learning |
WO2023030182A1 (en) | | Image generation method and apparatus |
CN115810157A (en) | | Unmanned aerial vehicle target detection method based on lightweight feature fusion |
Song et al. | | A joint siamese attention-aware network for vehicle object tracking in satellite videos |
Xu et al. | | BANet: A balanced atrous net improved from SSD for autonomous driving in smart transportation |
CN115410078A (en) | | Low-quality underwater image fish target detection method |
CN116596966A (en) | | Segmentation and tracking method based on attention and feature fusion |
CN116503602A (en) | | Unstructured environment three-dimensional point cloud semantic segmentation method based on multi-level edge enhancement |
CN118015490A (en) | | Unmanned aerial vehicle aerial image small target detection method, system and electronic equipment |
CN109543519B (en) | | Deep segmentation-guided network for object detection |
Liu et al. | | Progressive context-dependent inference for object detection in remote sensing imagery |
Zhu et al. | | SCNet: A lightweight and efficient object detection network for remote sensing |
CN117593794A (en) | | Improved YOLOv7-tiny model and human face detection method and system based on model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210601; Termination date: 20211019 |