CN111753849A - Detection method and system based on compact aggregation feature and cyclic residual learning - Google Patents

Detection method and system based on compact aggregation features and cyclic residual learning

Info

Publication number
CN111753849A
CN111753849A
Authority
CN
China
Prior art keywords
detection
aggregation
convolution
features
saliency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010606592.7A
Other languages
Chinese (zh)
Inventor
化春键 (Hua Chunjian)
凌艳 (Ling Yan)
陈莹 (Chen Ying)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University
Priority to CN202010606592.7A
Publication of CN111753849A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462: Salient features, e.g. scale invariant feature transforms [SIFT]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/253: Fusion techniques of extracted features
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a detection method and system based on compact aggregation features and cyclic residual learning, belonging to the technical field of image processing. The system comprises a compact feature extraction module, an all-feature aggregation module, and a cyclic residual optimization module. The method comprises the following steps: extract compact convolution features, combining the output features of consecutive stages; apply an atrous spatial pyramid pooling module to the compact convolution features extracted from all layers to aggregate multi-layer external information; under a deep supervision mechanism, continuously refine the prediction through residual learning; test the whole cyclic residual network on three visual saliency detection datasets, after which the cyclic residual network based on compact aggregation features can be applied to visual saliency detection in natural images. The invention improves the effectiveness of visual saliency detection in complex scenes, strengthens the suppression of background noise, and enhances the continuity and integrity of detected regions.

Description

Detection method and system based on compact aggregation feature and cyclic residual learning
Technical Field
The invention relates to a detection method and a detection system based on compact aggregation features and cyclic residual learning, and belongs to the technical field of image processing.
Background
Visual saliency detection aims to detect the most distinctive target in a natural image and extract the complete target content. Because it helps reduce the complexity of computer understanding and analysis of natural images, visual saliency detection has become an important preprocessing step for many computer vision tasks, including image retrieval, visual tracking, scene classification, and pedestrian re-identification.
Traditional algorithms are based on the contrast or statistical information of handcrafted features such as color, brightness, and texture, and rely on prior knowledge accumulated by researchers. Convolutional neural networks, in contrast, can learn effective image features autonomously and quickly, and have large room for development in the field of image processing. As convolutional and pooling layers are stacked, the resolution of the high-level output features of the neural network gradually decreases while their semantic information is enhanced. However, when high-level features are directly applied to pixel-level visual saliency detection, the salient objects can be located accurately, but the features lack detailed information and the overall appearance is rough. The shallow high-resolution features of a convolutional neural network, on the other hand, have the advantage of retaining spatial detail information.
Disclosure of Invention
In order to improve the effectiveness of visual saliency detection in complex scenes, strengthen the suppression of background noise, and enhance the continuity and integrity of detected regions, the invention provides a detection system comprising: a compact feature extraction module, which applies a dense connection scheme to the last convolutional-layer features of the second to fifth stages of the ResNeXt101 network to aggregate effective information within each single layer; an all-feature aggregation module, which uses an ASPP (atrous spatial pyramid pooling) module to exchange and fuse cross-layer information among the layer features of different resolutions; and a cyclic residual optimization module, which reuses the compact aggregation features to continuously refine the predicted saliency map under a deep supervision mechanism.
Another objective of the present invention is to provide a cyclic residual visual saliency detection method based on compact aggregation features. First, compact convolution features are extracted from different levels of the base network; then all the multi-resolution features are aggregated; finally, the saliency map is continuously refined by cyclically learning residuals under a deep supervision mechanism. The method comprises the following steps:
S1: extract compact convolution features from the base ResNeXt101 network, combining the output features of consecutive stages in a densely connected manner, so as to cover a larger receptive field and fuse information within each single layer;
S2: compact aggregation of the basic features of a single layer alone neglects the fusion of information among features of different depths and resolutions in the deep neural network, which is detrimental to visual saliency detection; therefore, an atrous spatial pyramid pooling module is applied to the compact convolution features extracted from all layers to aggregate multi-layer external information;
S3: under a deep supervision mechanism, cyclically reuse the compact aggregation features, continuously refining the prediction through residual learning, with a suitable number of cycles determined experimentally;
S4: test the whole cyclic residual network on three visual saliency detection datasets; once testing is complete, the cyclic residual network based on compact aggregation features can be applied to visual saliency detection in natural images.
Optionally, S1 comprises:
First, for the 256-, 512-, 1024-, and 2048-channel features of the last convolutional layer of the second to fifth stages of the ResNeXt101 network, a convolution with kernel size 3 and 128 channels performs dimension reduction. The dimension-reduced features are multiplexed to each subsequent densely connected stage to guide the fusion of later information; that is, the input of each current stage is the concatenation of the features of all previous stages. A convolution with kernel size 3 and 64 channels is uniformly used to extract feature information, and the output of the compact feature extraction module is finally obtained by concatenating the dimension-reduced features with the intermediate outputs of the several stages.
Optionally, S2 comprises:
First, the compact convolution features extracted from all layers are concatenated and reduced in dimension by two convolutions with kernel size 3 and 256 channels. The result is fed into an atrous spatial pyramid pooling module for information fusion: it passes in parallel through one convolution with kernel size 1 and 128 channels; three convolutions with kernel size 3, dilation rates 2, 4, and 6, and 128 channels each; and a combination of global average pooling with a convolution of kernel size 1 and 128 channels. Finally, the concatenated features of the five paths pass through a convolution with kernel size 1 and 256 channels for aggregated dimension reduction, yielding the compact aggregation features (DAF). These features, compactly aggregated both within single layers and across multiple layers, have strong representational power and contain rich saliency cues.
Optionally, S3 comprises:
First, the compact aggregation features are obtained with S2, and an initial saliency map SM_0 is produced through a convolution with kernel size 1 and 1 channel. The aggregation features and the saliency map are then repeatedly concatenated and input to a residual convolution block (RCB) to learn the residual; the saliency map after the k-th cycle is SM_k = RCB(Cat(SM_{k-1}, DAF)) + SM_{k-1}, where RCB(·) comprises two convolutions with kernel size 3 and 128 channels and one convolution with kernel size 1 and 1 channel, and Cat(·) denotes concatenation of the input features along the channel dimension. After a suitable number of cycles K, the final saliency map is obtained through a sigmoid operation.
The whole network is trained with the standard cross-entropy loss

L = −Σ_{t=1}^{T} [GT_t · log(SM_t) + (1 − GT_t) · log(1 − SM_t)],

where SM_t and GT_t denote the saliency of the t-th pixel on the saliency map and on the ground-truth map respectively, T denotes the total number of pixels in the image, GT_t = 1 and GT_t = 0 mark salient and non-salient pixels respectively, and SM_t ∈ [0, 1] is the saliency predicted by the algorithm. The closer the saliency map is to the ground-truth map, the smaller the loss value. The deep supervision mechanism places constraints on intermediate parts of the network, driving the whole network toward finer learning and optimization. During the cycles, multiple saliency maps {SM_0, SM_1, …, SM_K} are generated, and the cross entropy between each output saliency map and the ground-truth map is computed, so the total loss is

L_total = Σ_{k=0}^{K} L(SM_k, GT).

During each cycle of residual optimization, both the input and the output are constrained by the loss, which further facilitates the learning and fusion of the deeply aggregated features. By minimizing the total loss, the network parameters are continuously refined to obtain the final model.
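As an illustration only, the deeply supervised total loss above could be computed as follows in PyTorch (the framework named in the embodiments); operating on pre-sigmoid predictions via binary_cross_entropy_with_logits is an implementation choice of this sketch, not something the text prescribes.

```python
import torch.nn.functional as F

def total_supervision_loss(maps, gt):
    """Deeply supervised total loss: the standard cross entropy is evaluated
    between every intermediate saliency map SM_k and the ground-truth map GT,
    then summed over k = 0..K. `maps` holds pre-sigmoid prediction tensors
    (an assumption of this sketch); `gt` is the binary ground-truth map."""
    return sum(F.binary_cross_entropy_with_logits(sm, gt) for sm in maps)
```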
The invention has the following beneficial effects:
the invention realizes tight aggregation in the single-layer interior and the multi-layer exterior of the convolutional network, continuously optimizes the obvious prediction result in a cyclic residual error mode, alleviates the problems of regional integrity and continuity in the current deep visual significance detection technology, and improves the detection accuracy and smoothness.
The invention designs a compact characteristic extraction module which is simple, effective and strong in portability aiming at single-layer characteristics in a basic framework, and can effectively enhance the reusability and continuity of the characteristics.
The invention utilizes the cavity space pyramid pooling module to effectively aggregate the closely extracted features of a plurality of levels and different resolutions, directly promotes the fusion of information among the features without layer convolution, and improves the result of visual saliency detection.
Drawings
Fig. 1 is a schematic diagram of the cyclic residual saliency detection network based on compact aggregation features according to the present invention.
Fig. 2 shows the compact feature extraction module for the interior of a single layer proposed by the present invention.
Fig. 3 shows the all-level feature aggregation module for the multi-layer exterior employed by the present invention.
Fig. 4 compares the detection results of the present invention and other deep visual saliency detection algorithms on public datasets.
Detailed Description
Embodiment 1:
A detection method based on compact aggregation features and cyclic residual learning, see Fig. 1, comprising the following steps:
Step 1: the public dataset MSRA10K, containing 10,000 natural RGB images and corresponding binary ground-truth maps, is used as the training set. To enhance the robustness of the network to image transformations and to mitigate overfitting, the training samples are augmented by random rotation, random cropping, and horizontal flipping, as sketched below.
For the MSRA10K dataset, see Cheng Mingming, "Global Contrast Based Salient Region Detection," IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 409-416.
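As a hedged illustration, the augmentation of step 1 could be realized with torchvision as follows; the rotation range and crop size are assumptions not specified above, and in practice the same geometric transform must be applied jointly to each image and its binary ground-truth map.

```python
from torchvision import transforms

# A possible realization of the augmentation in step 1 (random rotation,
# random cropping, horizontal flipping). The rotation range and crop size
# are illustrative assumptions; the patent does not specify them.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),
    transforms.RandomCrop(size=320, pad_if_needed=True),
    transforms.RandomHorizontalFlip(p=0.5),
])
```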
Step 2: referring to Fig. 1, the training samples are first fed into the ResNeXt101 base network with the fully connected layers removed, and convolution features with 256, 512, 1024, and 2048 channels are obtained from the last convolutional layer of the second to fifth stages respectively. The higher the level, the richer the semantic information of the convolution features, while the shallow features better retain detail and texture information.
For the ResNeXt101 network, see Xie Saining, "Aggregated Residual Transformations for Deep Neural Networks," IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5987-5995.
Step 3: referring to Fig. 2, for the features of different layers obtained in step 2, information within each single layer is exchanged through the compact feature extraction module. First, a convolution with kernel size 3 × 3 and 128 channels reduces the dimension of the original feature; the dimension-reduced feature is then multiplexed to each subsequent densely connected stage to guide the fusion of later information, so that the input of each current stage is the concatenation of the features of all previous stages; a convolution with kernel size 3 × 3 and 64 channels is uniformly used to extract feature information; and the final output is obtained by concatenating the dimension-reduced feature with the intermediate outputs of the several stages.
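A minimal PyTorch sketch of the compact feature extraction module described in step 3 follows; the number of densely connected stages is not fixed by the text, so the three stages used here are an assumption.

```python
import torch
import torch.nn as nn

class CompactFeatureExtraction(nn.Module):
    """Single-layer compact feature extraction (step 3): a 3x3 convolution
    reduces the backbone feature to 128 channels, densely connected 3x3
    stages with 64 channels each then reuse all earlier outputs, and the
    module output concatenates the reduced feature with every stage output.
    num_stages=3 is an assumption; the text does not fix it."""
    def __init__(self, in_channels, num_stages=3):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels, 128, kernel_size=3, padding=1)
        self.stages = nn.ModuleList(
            nn.Conv2d(128 + 64 * i, 64, kernel_size=3, padding=1)
            for i in range(num_stages))

    def forward(self, x):
        outputs = [self.reduce(x)]
        for stage in self.stages:
            # input of the current stage = cascade of all previous outputs
            outputs.append(stage(torch.cat(outputs, dim=1)))
        return torch.cat(outputs, dim=1)
```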
Step 4: the multi-layer external information of the four layers of convolution features with different resolutions obtained in step 3 is fused through the all-feature aggregation module, which consists of two convolutions with kernel size 3 × 3 and 256 channels followed by an atrous spatial pyramid pooling module. The atrous spatial pyramid pooling module feeds its input through five parallel paths: the first is a convolution with kernel size 1 × 1 and 128 channels; the middle three are convolutions with kernel size 3 × 3, dilation rates 2, 4, and 6 respectively, and 128 channels each; and the last applies global average pooling followed by a convolution with kernel size 1 × 1 and 128 channels. Finally, the output features of the five paths are concatenated and reduced by a convolution with kernel size 1 × 1 and 256 channels. The compact aggregation features obtained by compactly aggregating the single-layer internal and multi-layer external information have strong representational power and contain rich saliency cues.
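The all-feature aggregation module of step 4 might be sketched as follows; resizing the four stage features to a common resolution with bilinear interpolation before concatenation is an assumption made to keep the sketch self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AllFeatureAggregation(nn.Module):
    """Step 4: two 3x3 convolutions reduce the concatenated multi-level
    features to 256 channels, then a five-path atrous spatial pyramid
    pooling block fuses them and a 1x1 convolution restores 256 channels."""
    def __init__(self, in_channels):
        super().__init__()
        self.reduce = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=3, padding=1),
            nn.Conv2d(256, 256, kernel_size=3, padding=1))
        self.branch1 = nn.Conv2d(256, 128, kernel_size=1)
        # three atrous convolutions with dilation rates 2, 4, and 6
        self.branch2 = nn.Conv2d(256, 128, 3, padding=2, dilation=2)
        self.branch3 = nn.Conv2d(256, 128, 3, padding=4, dilation=4)
        self.branch4 = nn.Conv2d(256, 128, 3, padding=6, dilation=6)
        # global average pooling followed by a 1x1, 128-channel convolution
        self.branch5 = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(256, 128, kernel_size=1))
        self.fuse = nn.Conv2d(5 * 128, 256, kernel_size=1)

    def forward(self, features):
        # `features`: list of per-stage compact features; resizing them to a
        # common resolution is an assumption of this sketch
        size = features[0].shape[2:]
        x = torch.cat([F.interpolate(f, size=size, mode='bilinear',
                                     align_corners=False)
                       for f in features], dim=1)
        x = self.reduce(x)
        b5 = F.interpolate(self.branch5(x), size=x.shape[2:],
                           mode='bilinear', align_corners=False)
        out = torch.cat([self.branch1(x), self.branch2(x), self.branch3(x),
                         self.branch4(x), b5], dim=1)
        return self.fuse(out)  # compact aggregation features (DAF)
```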
Step 5: referring to Fig. 1, an initial saliency map is first obtained from the compact aggregation features through a convolution with kernel size 1 × 1 and 1 channel. The saliency map from the previous cycle and the compact aggregation features are then repeatedly concatenated and fed into a residual convolution block formed by two convolutions with kernel size 3 × 3 and 128 channels and one convolution with kernel size 1 × 1 and 1 channel, continuously refining the predicted saliency map; after a suitable number of cycles, the final saliency map is obtained by applying a sigmoid operation to the result.
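The cyclic residual refinement of step 5 could look like the following sketch, which keeps every intermediate saliency map for deep supervision during training; daf_channels=256 matches step 4, and num_cycles=6 anticipates the experiment of step 7.

```python
import torch
import torch.nn as nn

class CyclicResidualRefinement(nn.Module):
    """Step 5: predict an initial saliency map from the DAF with a 1x1
    convolution, then refine it K times with a residual convolution block,
    SM_k = RCB(Cat(SM_{k-1}, DAF)) + SM_{k-1}."""
    def __init__(self, daf_channels=256, num_cycles=6):
        super().__init__()
        self.num_cycles = num_cycles
        self.initial = nn.Conv2d(daf_channels, 1, kernel_size=1)
        # RCB: two 3x3 convolutions with 128 channels, one 1x1 with 1 channel
        self.rcb = nn.Sequential(
            nn.Conv2d(daf_channels + 1, 128, kernel_size=3, padding=1),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.Conv2d(128, 1, kernel_size=1))

    def forward(self, daf):
        sm = self.initial(daf)
        maps = [sm]                      # kept for deep supervision
        for _ in range(self.num_cycles):
            sm = self.rcb(torch.cat([sm, daf], dim=1)) + sm
            maps.append(sm)
        return torch.sigmoid(sm), maps   # final map and all intermediates
```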
Step 6: under the PyTorch deep learning framework, the whole network is trained with the stochastic gradient descent algorithm until the loss converges, and the optimal network model is saved.
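A minimal SGD training sketch consistent with step 6 is given below; the learning rate, momentum, batch size, and epoch count are assumptions, SaliencyNet stands for a hypothetical wrapper chaining the backbone with the modules sketched above, and train_set for the augmented MSRA10K dataset.

```python
import torch
from torch.utils.data import DataLoader

# Hypothetical assembly: `SaliencyNet` and `train_set` are placeholders for
# the full network and the augmented MSRA10K dataset; the hyperparameters
# below are assumptions, not values taken from the patent.
model = SaliencyNet()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
loader = DataLoader(train_set, batch_size=8, shuffle=True)

model.train()
for epoch in range(30):
    for image, gt in loader:
        optimizer.zero_grad()
        _, maps = model(image)                 # all intermediate maps
        loss = total_supervision_loss(maps, gt)
        loss.backward()                        # deep supervision on every cycle
        optimizer.step()
torch.save(model.state_dict(), 'best_model.pth')
```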
Step 7: to determine the number of cycles, the whole network is trained with different cycle counts and tested on the public DUT-OMRON dataset. Table 1 gives the results for three objective evaluation metrics, the F-measure, the MAE, and the S-measure; the detection effect improves to a certain extent as the number of cycles increases. The number of cycles was finally set to 6.
Table 1: Evaluation metric results of the present application on the DUT-OMRON dataset under different numbers of cycles
(The numerical results of Table 1 are provided only as an image in the original publication.)
For the DUT-OMRON dataset, see Radhakrishna Achanta, "Frequency-Tuned Salient Object Detection," IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 1597-1604.
Step 8: to demonstrate the performance advantage of the cyclic residual visual saliency detection network based on compact aggregation features, the present application is compared on the ECSSD, HKU-IS, and DUT-OMRON datasets with the current state-of-the-art methods RFCN, UCF, NLDF, GBR, MPFF, R3Net, and RefineNet. The objective evaluation metric values on the different test sets are shown in Table 2; the detection performance of the proposed method ranks among the best.
Table 2: Comparison of evaluation metrics between the present application and different algorithms on different test sets
(The numerical results of Table 2 are provided only as an image in the original publication.)
For a visual comparison on part of the test images, see Fig. 4: the first four rows of images are from the ECSSD dataset, the middle four rows from the HKU-IS dataset, and the last four rows from the DUT-OMRON dataset. Visually, the different algorithms can all detect parts of the salient regions effectively, but problems such as incomplete regions, unclear boundaries, and background interference remain. The detection results for the people, flowers, and objects in the first, fourth, and seventh rows show that the regions detected by the proposed method are more complete and smooth. The detection result for the fish in the third row shows that, even when the target closely resembles the background, the proposed method is more robust and localizes the target well. The detection result for the oranges in the fifth row shows that, under interference from same-category targets, the proposed method still localizes the target accurately and maintains high precision. The detection result for the balloon in the twelfth row shows that the proposed method can still accurately detect the salient target in a dark environment. Overall, the use of compact aggregation features effectively improves the integrity of the detected regions, suppresses background noise, and brings the spatial structure of the predicted saliency map closer to that of the ground-truth map.
For the ECSSD dataset, see Yan Qiong, "Hierarchical Saliency Detection," IEEE Conference on Computer Vision and Pattern Recognition, 2013, pp. 1155-1162.
For the HKU-IS dataset, see Li Guanbin, "Visual Saliency Based on Multiscale Deep Features," IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 5455-5463.
For RFCN, see Wang Linzhao, "Saliency Detection with Recurrent Fully Convolutional Networks," IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 825-841.
For UCF, see Zhang Pingping, "Learning Uncertain Convolutional Features for Accurate Saliency Detection," IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 212-221.
For NLDF, see Luo Zhiming, "Non-Local Deep Features for Salient Object Detection," IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6609-6617.
For GBR, see Tan Xin, "Saliency Detection by Deep Network with Boundary Refinement and Global Context," International Conference on Multimedia and Expo, 2018, pp. 1-6.
For MPFF, see Zhu Hengliang, "Multi-Path Feature Fusion Network for Saliency Detection," International Conference on Multimedia and Expo, 2018, pp. 1-6.
For R3Net, see Deng Zijun, "R3Net: Recurrent Residual Refinement Network for Saliency Detection," International Joint Conference on Artificial Intelligence, 2018, pp. 684-690.
For RefineNet, see Keren Fu, "Refinet: A Deep Segmentation Assisted Refinement Network for Salient Object Detection," IEEE Transactions on Multimedia, 2019, pp. 457-469.
Embodiment 2:
A detection system applied to the detection method of Embodiment 1, comprising a compact feature extraction module, an all-feature aggregation module, and a cyclic residual optimization module. The compact feature extraction module applies a dense connection scheme to the last convolutional-layer features of the second to fifth stages of the ResNeXt101 network to aggregate effective information within each single layer;
the all-feature aggregation module uses the ASPP module to exchange and fuse cross-layer information among the layer features of different resolutions;
and the cyclic residual optimization module reuses the compact aggregation features to continuously refine the predicted saliency map under a deep supervision mechanism.
Although the present invention has been described with reference to the preferred embodiments, it should be understood that various changes and modifications can be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (8)

1. A detection system, comprising:
a compact feature extraction module, which applies a dense connection scheme to the last convolutional-layer features of the second to fifth stages of the ResNeXt101 network to aggregate effective information within each single layer;
an all-feature aggregation module, which uses an ASPP module to exchange and fuse cross-layer information among the layer features of different resolutions; and
a cyclic residual optimization module, which reuses the compact aggregation features to continuously refine the predicted saliency map under a deep supervision mechanism.
2. A detection method, comprising the steps of:
S1: extracting compact convolution features from the base ResNeXt101 network, combining the output features of consecutive stages in a densely connected manner, so as to cover a larger receptive field and fuse information within each single layer;
S2: since compact aggregation of the basic features of a single layer alone neglects the fusion of information among features of different depths and resolutions in the deep neural network, which is detrimental to visual saliency detection, applying an atrous spatial pyramid pooling module to the compact convolution features extracted from all layers to aggregate multi-layer external information;
S3: under a deep supervision mechanism, cyclically reusing the compact aggregation features and continuously refining the prediction through residual learning, a suitable number of cycles being determined experimentally;
S4: testing the whole cyclic residual network on three visual saliency detection datasets, after which the cyclic residual network based on compact aggregation features can be applied to visual saliency detection in natural images.
3. The detection method according to claim 2, wherein in S1, for the last convolutional layer of the second to fifth stages of the ResNeXt101 network, with 256, 512, 1024, and 2048 channels respectively, dimension reduction is first performed with a convolution of kernel size 3 and 128 channels; the dimension-reduced features are multiplexed to each subsequent densely connected stage to guide the fusion of later information, i.e. the input of each current stage is the concatenation of the features of all previous stages; a convolution with kernel size 3 and 64 channels is uniformly used to extract feature information; and the output of the compact feature extraction module is finally obtained by concatenating the dimension-reduced features with the intermediate outputs of the several stages.
4. The detection method according to claim 2, wherein in S2, the compact convolution features extracted from all layers are first concatenated and reduced in dimension by two convolutions with kernel size 3 and 256 channels; the result is fed into an atrous spatial pyramid pooling module for information fusion, i.e. it passes in parallel through one convolution with kernel size 1 and 128 channels, three convolutions with kernel size 3, dilation rates 2, 4, and 6, and 128 channels each, and a combination of global average pooling with a convolution of kernel size 1 and 128 channels; and finally the concatenated features of the five paths pass through a convolution with kernel size 1 and 256 channels for aggregated dimension reduction, yielding the compact aggregation features.
5. The detection method according to claim 2, wherein in S3, the compact aggregation features are first obtained with S2, and an initial saliency map SM_0 is produced through a convolution with kernel size 1 and 1 channel; the aggregation features and the saliency map are then repeatedly input to the residual convolution block to learn the residual, the saliency map after the k-th cycle being SM_k = RCB(Cat(SM_{k-1}, DAF)) + SM_{k-1}, wherein RCB(·) comprises two convolutions with kernel size 3 and 128 channels and one convolution with kernel size 1 and 1 channel, and Cat(·) denotes concatenation of the input features along the channel dimension; and after a suitable number of cycles K, the final saliency map is obtained through a sigmoid operation.
6. The detection method according to claim 5, wherein the whole network is trained with the standard cross-entropy loss

L = −Σ_{t=1}^{T} [GT_t · log(SM_t) + (1 − GT_t) · log(1 − SM_t)],

wherein SM_t and GT_t denote the saliency of the t-th pixel on the saliency map and on the ground-truth map respectively, T denotes the total number of pixels in the image, GT_t = 1 and GT_t = 0 mark salient and non-salient pixels respectively, and SM_t ∈ [0, 1] denotes the saliency predicted by the algorithm.
7. The detection method according to claim 6, wherein during the cycles a plurality of saliency maps {SM_0, SM_1, …, SM_K} are generated, the cross entropy between each output saliency map and the ground-truth map is computed, and the total loss is L_total = Σ_{k=0}^{K} L(SM_k, GT).
8. The detection method according to claim 7, wherein during each cycle of residual optimization, both the input and the output are constrained by the loss.
CN202010606592.7A 2020-06-29 2020-06-29 Detection method and system based on compact aggregation feature and cyclic residual learning Pending CN111753849A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010606592.7A CN111753849A (en) 2020-06-29 2020-06-29 Detection method and system based on compact aggregation feature and cyclic residual learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010606592.7A CN111753849A (en) 2020-06-29 2020-06-29 Detection method and system based on compact aggregation feature and cyclic residual learning

Publications (1)

Publication Number Publication Date
CN111753849A true CN111753849A (en) 2020-10-09

Family

ID=72678047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010606592.7A Pending CN111753849A (en) 2020-06-29 2020-06-29 Detection method and system based on compact aggregation feature and cyclic residual learning

Country Status (1)

Country Link
CN (1) CN111753849A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108875777A (en) * 2018-05-03 2018-11-23 浙江大学 Kinds of fibers and blending rate recognition methods in textile fabric based on two-way neural network
CN109447976A (en) * 2018-11-01 2019-03-08 电子科技大学 A kind of medical image cutting method and system based on artificial intelligence
US20200026942A1 (en) * 2018-05-18 2020-01-23 Fudan University Network, System and Method for Image Processing
CN111275718A (en) * 2020-01-18 2020-06-12 江南大学 Clothes amount detection and color protection washing discrimination method based on significant region segmentation


Similar Documents

Publication Publication Date Title
CN111210443B (en) Deformable convolution mixing task cascading semantic segmentation method based on embedding balance
CN110210539B (en) RGB-T image saliency target detection method based on multi-level depth feature fusion
CN113052210B (en) Rapid low-light target detection method based on convolutional neural network
CN111291809B (en) Processing device, method and storage medium
CN111582316A (en) RGB-D significance target detection method
CN110569851B (en) Real-time semantic segmentation method for gated multi-layer fusion
CN110533041B (en) Regression-based multi-scale scene text detection method
CN111797841B (en) Visual saliency detection method based on depth residual error network
CN112597985A (en) Crowd counting method based on multi-scale feature fusion
CN113011329A (en) Pyramid network based on multi-scale features and dense crowd counting method
CN113033454B (en) Method for detecting building change in urban video shooting
CN112580480A (en) Hyperspectral remote sensing image classification method and device
CN114022408A (en) Remote sensing image cloud detection method based on multi-scale convolution neural network
CN111401380A (en) RGB-D image semantic segmentation method based on depth feature enhancement and edge optimization
CN110852199A (en) Foreground extraction method based on double-frame coding and decoding model
CN113850324A (en) Multispectral target detection method based on Yolov4
CN112580458A (en) Facial expression recognition method, device, equipment and storage medium
CN116740439A (en) Crowd counting method based on trans-scale pyramid convertors
CN116596966A (en) Segmentation and tracking method based on attention and feature fusion
CN114092540A (en) Attention mechanism-based light field depth estimation method and computer readable medium
Chua et al. Visual IoT: ultra-low-power processing architectures and implications
CN116229406B (en) Lane line detection method, system, electronic equipment and storage medium
CN113139544A (en) Saliency target detection method based on multi-scale feature dynamic fusion
CN117011655A (en) Adaptive region selection feature fusion based method, target tracking method and system
CN114743045B (en) Small sample target detection method based on double-branch area suggestion network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination