CN115424026A - End-to-end foggy day image multi-target detection model based on knowledge embedding - Google Patents
- Publication number
- CN115424026A (application CN202210960202.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- target detection
- sub
- network
- knowledge
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/90—Determination of colour characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/70—Labelling scene content, e.g. deriving syntactic or semantic representations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20016—Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention provides an end-to-end multi-target detection model for foggy-day images based on knowledge embedding, relating to the technical field of pattern recognition. The model comprises an image defogging sub-network, a target detection sub-network, and knowledge-embedding-based semantic association feature learning. The image defogging sub-network comprises a common module and a feature recovery module; the feature recovery module comprises an up-sampling sub-module, a multi-scale mapping sub-module, and an image generation sub-module. The method achieves high target detection accuracy even when multiple heterogeneous targets coexist in a scene and only a limited training set is available, benefits the understanding and application of foggy image scenes, and offers both high detection quality and high efficiency.
Description
Technical Field
The invention relates to the technical field of pattern recognition, and in particular to an end-to-end multi-target detection model for foggy-day images based on knowledge embedding.
Background
In complex foggy imaging scenes, the quality of images obtained by an image acquisition system is severely degraded, which hurts the performance of target detection algorithms and causes missed and false detections, in turn impairing the environment-perception capability of aerial, ground, or maritime unmanned platforms. Target detection methods for foggy scenes fall into two categories. The first is the two-stage approach: a non-associated "defog-then-detect" pipeline that first sharpens the foggy image with an image enhancement and restoration method and then runs a target detector. The first-stage defogging may introduce artifacts and color distortion, so it cannot improve detection accuracy for all images, and it is generally unsuitable for scenarios with strict real-time requirements. The second is the end-to-end approach: a defogging network and a target detection network are jointly optimized so that defogging and detection are performed simultaneously; shared feature extraction reduces the influence of image degradation and improves detection accuracy on foggy images.
End-to-end foggy-day target detection models are represented mainly by the KODNet and DONet networks. KODNet designs anchor-box aspect ratios inside a deep detection model to guide detection in real foggy scenes; DONet cascades a defogging model with a target detection model and learns them jointly, which effectively alleviates the difficulty and low accuracy of detection in fog while avoiding the artifacts, detail loss, and color distortion caused by standalone enhancement and restoration. However, datasets for end-to-end foggy-day detection are hard to collect and label; especially when many heterogeneous targets coexist in one scene, the annotations contain missed and wrong labels, which degrades detection performance. How to reach high detection accuracy with only a limited training set is therefore an urgent problem for multi-target detection in foggy images.
Disclosure of Invention
Technical problem to be solved
To address the deficiencies of the prior art, the invention provides a knowledge-embedding-based end-to-end multi-target detection model for foggy images, solving the problem of degraded target detection performance in foggy scenes.
(II) technical scheme
In order to realize the above purpose, the invention adopts the following technical scheme: a knowledge-embedding-based end-to-end multi-target detection model for foggy images comprises an image defogging sub-network, a target detection sub-network, and knowledge-embedding-based semantic association feature learning. The image defogging sub-network comprises a common module and a feature recovery module; the feature recovery module comprises an up-sampling sub-module, a multi-scale mapping sub-module, and an image generation sub-module.
Preferably, the fog-image defogging sub-network generates features that are shared with the detection sub-network during joint training to improve target detection accuracy under foggy conditions; the defogging sub-network is built on the atmospheric light-scattering model.
Preferably, the common module extracts features of the input image that carry important information for simultaneously learning visual enhancement, object recognition, and localization.
Preferably, the up-sampling sub-module accounts for the fact that, in the feature recovery sub-network, the output image must have the same size as the input image while the feature map extracted by the common module is one quarter the size of the input image.
Preferably, after the resolution of the feature f_C2 is increased by the up-sampling sub-module, the obtained feature map is transmitted to the multi-scale mapping sub-module for multi-scale feature extraction.
Preferably, the image generation sub-module is the last stage of the image restoration sub-network, and the scene restoration is completed through the image generation sub-module.
Preferably, the foggy-image target detection sub-network adopts RetinaNet as its backbone network. RetinaNet uses a feature pyramid network to provide a top-down pathway, and lateral connections allow higher-resolution layers to be built from semantically rich layers, greatly improving the detection accuracy of small targets in foggy scenes.
Preferably, the knowledge-embedding-based semantic association feature learning is a knowledge-guided method: prior knowledge of category-attribute association and category-category association is structurally expressed and embedded into a deep network model, so that features covering more comprehensive discriminative information are learned.
The working principle is as follows: first, the common module of the end-to-end multi-target detection model extracts features of the input image carrying important information for learning visual enhancement, target recognition, and localization, and the feature recovery module then repairs the fog-degraded image. Second, a RetinaNet structure extracts a feature map of the whole image, and a feature pyramid network (FPN) attached to the top of the RetinaNet structure builds multi-scale features, solving the problem of constructing features for targets at different scales. Finally, the knowledge-embedded feature learning and expression method addresses the few-sample situation, where the labeled samples cover only a few appearance patterns and the learned model has poor expressive and generalization ability, and it reduces the influence of missed and wrong labels in the dataset on multi-class target detection in foggy scenes.
(III) advantageous effects
The invention provides an end-to-end foggy day image multi-target detection model based on knowledge embedding. The method has the following beneficial effects:
the invention provides a knowledge-embedding-based end-to-end multi-target detection model for foggy images in which the defogging network and the detection network are jointly optimized: the image recovery result is reconstructed under the guidance of target detection information, and the detection network learns the structural detail and color features of targets recovered by defogging, thereby improving target detection accuracy.
Drawings
FIG. 1 is a flow chart of a network structure for detecting foggy day image targets according to the present invention;
FIG. 2 is a flow diagram of a knowledge-guided semantic feature learning framework of the present invention;
FIG. 3 is a flow chart of the target detection result of the knowledge-embedding-based end-to-end foggy day image multi-target detection model.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment is as follows:
as shown in fig. 1 to 3, an embodiment of the present invention provides a knowledge-embedding-based end-to-end multi-target detection model for foggy images, which includes an image defogging sub-network, a target detection sub-network, and knowledge-embedding-based semantic association feature learning. The image defogging sub-network includes a common module and a feature recovery module. The feature recovery module exists because the input-image features extracted by the common module may be degraded by fog, reducing target detection performance; to recover the features output by the common module, the feature recovery sub-network employs an FR module. The feature recovery module includes an up-sampling sub-module, a multi-scale mapping sub-module, and an image generation sub-module.
The fog-image defogging sub-network generates the feature f_C2, which is shared with the detection sub-network during joint training to improve target detection accuracy in fog; the sub-network is built on the atmospheric light-scattering model.
Image defogging is realized through the atmospheric scattering model, whose formula is:
I(x) = J(x)t(x) + α(1 − t(x)) (formula 1)
where I(x) is the observed foggy image, J(x) is the clear scene radiance, t(x) is the transmittance, and α is the global atmospheric light intensity value. To facilitate the estimation of the transmittance t(x) and the global atmospheric light intensity value α, the formula can be rewritten as:
J(x) = G(x)I(x) − G(x) + 1 (formula 2)
Here, the image defogging sub-network folds the transmittance t(x) and the atmospheric light intensity value α into the single variable G(x), which is estimated by the network model during visual enhancement.
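The equivalence of the two forms above can be checked numerically. The sketch below (all values hypothetical) inverts the scattering model directly and via the unified variable, assuming G(x) = [(I(x) − α)/t(x) + α − 1]/(I(x) − 1), the closed form that makes formula 2 algebraically identical to the direct inversion:

```python
import numpy as np

def dehaze_direct(I, t, alpha):
    """Invert the atmospheric scattering model I = J*t + alpha*(1 - t)."""
    return (I - alpha) / t + alpha

def dehaze_unified(I, t, alpha):
    """Same recovery via the unified variable G(x): J = G*I - G + 1."""
    G = ((I - alpha) / t + alpha - 1.0) / (I - 1.0)
    return G * I - G + 1.0

I = np.array([0.6, 0.4, 0.7])   # hazy pixel intensities (hypothetical)
t = np.array([0.5, 0.8, 0.3])   # per-pixel transmittance
alpha = 0.9                     # global atmospheric light intensity

J1 = dehaze_direct(I, t, alpha)
J2 = dehaze_unified(I, t, alpha)
assert np.allclose(J1, J2)      # both forms recover the same clear image
```

In the network itself G(x) is of course predicted by convolutional layers rather than computed in closed form; the point is only that estimating the single quantity G(x) suffices to recover J(x).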
The common module extracts features of the input image carrying important information for simultaneously learning visual enhancement, target identification, and localization. It is not designed independently but shares residual modules with the target detection sub-network to keep the structure simple. Specifically, 16 residual modules in the detection sub-network are divided into four residual stages (denoted Conv_2, Conv_3, Conv_4, and Conv_5, as shown in fig. 1). Features obtained from shallow layers contain more spatial information, which benefits visual enhancement, whereas the spatial information of deeper layers is lost during pooling; therefore the first 10 convolutional layers of the detection sub-network form the common module, with Conv_2 as its output. The feature map from the common module is passed both to the feature recovery module for visual enhancement and to Conv_3 for target detection.
In the feature recovery sub-network, the output image must have the same size as the input image, but the feature map extracted by the common module is one quarter of the input size; the sub-network therefore uses the up-sampling sub-module to match the input resolution. In deep-learning-based defogging research, bilinear interpolation has been applied to convolutional-neural-network defogging, where pooled feature maps are bilinearly up-sampled to produce the defogged output. Accordingly, the up-sampling sub-module first uses a convolutional layer to reduce the feature dimension (the number of channels is reduced by a factor of 7), and then uses bilinear interpolation to enlarge the feature map to the size of the input image.
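As a rough illustration of the up-sampling step, the following pure-NumPy sketch implements bilinear interpolation on a single-channel feature map; the channel-reducing convolution that precedes it in the real sub-module is omitted here:

```python
import numpy as np

def bilinear_upsample(f, scale):
    """Bilinearly interpolate a (H, W) feature map by an integer scale factor."""
    H, W = f.shape
    out_h, out_w = H * scale, W * scale
    # sample positions in source coordinates (pixel-center convention)
    ys = np.clip((np.arange(out_h) + 0.5) / scale - 0.5, 0, H - 1)
    xs = np.clip((np.arange(out_w) + 0.5) / scale - 0.5, 0, W - 1)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, H - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, W - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = f[y0][:, x0] * (1 - wx) + f[y0][:, x1] * wx
    bot = f[y1][:, x0] * (1 - wx) + f[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

f = np.ones((2, 2))
up = bilinear_upsample(f, 2)
assert up.shape == (4, 4)
assert np.allclose(up, 1.0)   # constant maps stay constant under interpolation
```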
After the resolution of the feature f_C2 is increased by the up-sampling sub-module, the obtained feature map is transmitted to the multi-scale mapping sub-module for multi-scale feature extraction, which is widely used in image defogging methods and effective for visibility enhancement. The multi-scale sub-module consists of four parallel convolutions — 1×1, 3×3, 5×5, and 7×7 — each with 4 output channels; a final 3×3 convolution estimates G(x), which encodes the transmittance t(x) and the global atmospheric light intensity value α.
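A minimal single-channel sketch of the parallel-convolution idea follows; the actual sub-module uses learned kernels with 4 channels per branch, while simple averaging kernels stand in here:

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D convolution for a single-channel map."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    H, W = x.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multiscale_map(x, kernels):
    """Run parallel convolutions of different kernel sizes and stack the maps."""
    return np.stack([conv2d_same(x, k) for k in kernels])

x = np.random.rand(8, 8)
# one averaging kernel per branch size: 1x1, 3x3, 5x5, 7x7
kernels = [np.ones((s, s)) / (s * s) for s in (1, 3, 5, 7)]
feats = multiscale_map(x, kernels)
assert feats.shape == (4, 8, 8)   # four scale-specific maps, same spatial size
```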
The image generation sub-module is the last stage of the image recovery sub-network and completes the scene restoration. It takes G(x) as input and uses an element-wise multiplication layer, a feature extraction layer, and an element-wise addition layer to compute the transformation of formula 2.
To train visibility enhancement, the image recovery sub-network uses the mean-squared-error (MSE) loss, which can be described as:
L_MSE = (1/n) Σ_{i=1}^{n} ||Ŷ_i − Y_i||² (formula 3)
where n is the image patch size, Y_i is the ground-truth recovered image, and Ŷ_i is the estimated recovered image. It is further emphasized that although the recovery sub-network can generate the defogged image directly, its goal is not to generate the input of the detection sub-network, but to use the fog-independent feature f_C2 from the common module to learn the visibility-enhancement task.
The foggy-image target detection sub-network adopts RetinaNet as its backbone network. RetinaNet uses a feature pyramid network (FPN) to provide a top-down pathway, and lateral connections allow higher-resolution layers to be built from semantically rich layers, greatly improving the detection accuracy of small targets in foggy scenes: deep layers contain rich semantic information but, after pooling, lack position information, and the lateral connections to the corresponding shallow layers enrich the position information and improve accuracy.
In order to detect targets efficiently, the detection sub-network follows a specific strategy: first, the RetinaNet structure extracts a feature map of the whole image; then, at the top of the structure, a feature pyramid network (FPN) builds multi-scale features, solving the problem of constructing features for targets at different scales; finally, a simplified fully convolutional network (FCN) attached to the FPN layers performs the multi-target recognition and localization task, completing target detection and bounding-box regression.
In order to train the object classification, the detection network adopts the focal loss, with λ_c as a balancing variable, described as
L_cls(p_c) = −λ_c (1 − p_c)^γ log(p_c) (formula 4)
Here λ_c ∈ [0, 1] applies to the target class +1 and 1 − λ_c to the target class −1; γ is a tunable focusing parameter (γ ≥ 0); and p_c is defined as
p_c = p if y = 1, otherwise p_c = 1 − p (formula 5)
where y ∈ {±1} is the determined reference class and p is the model-estimated probability of the class label y = 1.
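The focal loss of formulas 4 and 5 can be sketched directly; λ_c = 0.25 and γ = 2 are typical values assumed here, not values fixed by the source:

```python
import numpy as np

def focal_loss(p, y, lam=0.25, gamma=2.0):
    """Focal loss: L = -lambda_c * (1 - p_c)^gamma * log(p_c)."""
    p_c = np.where(y == 1, p, 1.0 - p)         # p_c per formula 5
    lam_c = np.where(y == 1, lam, 1.0 - lam)   # class-balancing weight
    return -lam_c * (1.0 - p_c) ** gamma * np.log(p_c)

p = np.array([0.9, 0.2, 0.6])   # estimated probability of class y = +1
y = np.array([1, -1, 1])        # ground-truth classes
losses = focal_loss(p, y)
assert np.all(losses >= 0)
# the (1 - p_c)^gamma factor down-weights easy examples:
# p=0.9 on a positive contributes far less than p=0.6 on a positive
assert losses[0] < losses[2]
```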
To localize targets, the detection network adopts the smooth-L1 loss between the predicted box k and the reference box g. The matched pairs between anchor boxes l and reference boxes g are denoted (l_m, g_m), m = 1, 2, …, q, where q is the number of matched pairs. For each matched anchor box, the reference-box regression target is defined as t_m, and the corresponding prediction as t̂_m, where x, y, ω, and h denote the center coordinates, width, and height of a box. The localization loss is then expressed as:
L_loc = Σ_{m=1}^{q} smooth_L1(t̂_m − t_m) (formula 6)
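A sketch of the localization loss follows, assuming a RetinaNet-style box encoding (offsets normalized by the anchor size, log-scale width and height); the source does not spell out the exact regression parameterization, so this encoding is an assumption:

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Smooth-L1 (Huber-style) loss used for bounding-box regression."""
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)

def box_to_regression(box, anchor):
    """Encode a box (x, y, w, h) relative to an anchor box."""
    ax, ay, aw, ah = anchor
    x, y, w, h = box
    return np.array([(x - ax) / aw, (y - ay) / ah,
                     np.log(w / aw), np.log(h / ah)])

anchor = (10.0, 10.0, 4.0, 4.0)   # hypothetical anchor box
gt     = (11.0, 9.0, 5.0, 3.0)    # matched reference (ground-truth) box
pred   = (11.5, 9.2, 4.8, 3.1)    # predicted box

loss = smooth_l1(box_to_regression(pred, anchor),
                 box_to_regression(gt, anchor)).sum()
assert loss >= 0.0
# a perfect prediction incurs zero loss
assert smooth_l1(np.zeros(4), np.zeros(4)).sum() == 0.0
```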
the semantic association feature learning based on knowledge embedding is a semantic association feature learning method adopting knowledge guidance, the prior knowledge of category-attribute association and category association is expressed in a structured mode, a deep network model is embedded, the feature covering more comprehensive judgment information is learned, in a foggy image target detection model, a semantic association feature learning model is connected in a main network RetinaNet, firstly, a knowledge graph of category-attribute association is built for each type of target, the attribute knowledge feature of the image is learned by using a graph propagation network, then, an attention mechanism of semantic association is introduced, the feature of attribute association for each type of target is guided and learned by using the attribute knowledge feature, the coexistence probability of different types of targets of the image is learned based on the attribute association feature, a K knowledge graph of category association is built according to the coexistence probability, and the semantic feature of context association is learned through graph propagation and interaction network.
Assume a foggy-day image scene has C object classes and K object attributes. For each class c, construct a category-attribute knowledge graph G_c = {V_c, A_c}, where V_c = {v_{c,0}, v_{c,1}, v_{c,2}, …, v_{c,K}} is the node set, v_{c,0} is the class-c node and v_{c,k} is the node of attribute k; A_c is the node association matrix, whose entry a_{c,i,j} gives the association probability of nodes i and j. Then construct a category-category knowledge graph G = {V, A}, V = {v_1, v_2, …, v_C}, where v_c is the class-c node and A is the node association matrix whose entry a_{ij} gives the coexistence probability of classes i and j.
Given a foggy image, the target detection sub-network first extracts a multi-scale global feature f. For the multiple target classes, a GloVe model extracts embeddings of class c and its K attributes, which initialize the corresponding category and attribute nodes of graph G_c. A graph convolutional network (GCN) is then introduced to explore information propagation and interaction among the nodes and to update the node features.
The adjacency matrix is initialized with the prior information A_c and then jointly optimized during training to learn the category-attribute relations. After L_c graph-convolution operations that deeply exchange and explore information among the graph nodes, H_c = {h_{c,0}, h_{c,1}, …, h_{c,K}} is obtained; the node features are concatenated and mapped to the attribute-knowledge expression of category c.
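One graph-convolution propagation step can be sketched as follows. The symmetric normalization D^{-1/2}(A + I)D^{-1/2} is the standard GCN propagation rule and is an assumption here, since the source does not specify the exact update:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One GCN step: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)                    # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

K = 4                      # number of attribute nodes (plus one category node)
d_in, d_out = 8, 8
rng = np.random.default_rng(0)
H = rng.normal(size=(K + 1, d_in))                 # category + attribute nodes
A = rng.random((K + 1, K + 1)); A = (A + A.T) / 2  # symmetric association matrix
W = rng.normal(size=(d_in, d_out))                 # learned weight (hypothetical)

H_new = gcn_layer(H, A, W)
assert H_new.shape == (K + 1, d_out)   # node features updated, same node count
```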
A knowledge-guided attention mechanism is then introduced: the attribute-knowledge expression x_c of each category c guides the learning of semantically attribute-associated features. Specifically, for each position (w, h) of the image feature f, the position feature is first fused with the corresponding knowledge expression to learn an importance factor for that position. Repeating this operation for every position yields an importance factor per position, which is normalized with a softmax function to obtain the final normalized importance factors. Finally, a weighted-average pooling operation produces the semantic attribute-associated feature of category c.
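The knowledge-guided attention can be sketched as below. The projection W_a and the dot-product fusion are assumptions, since the source only states that position features are fused with the knowledge expression to produce importance factors:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attribute_attention(f, x_c, W_a):
    """Score each spatial position of f against the category's
    attribute-knowledge vector, softmax-normalize, then pool."""
    Hh, Ww, D = f.shape
    pos = f.reshape(-1, D)                     # (H*W, D) position features
    scores = (pos * (x_c @ W_a)).sum(axis=1)   # one importance factor per position
    a = softmax(scores)                        # normalized importance factors
    return (a[:, None] * pos).sum(axis=0)      # weighted-average pooled feature

rng = np.random.default_rng(1)
f = rng.normal(size=(4, 4, 16))    # toy multi-scale global feature map
x_c = rng.normal(size=16)          # attribute-knowledge expression of class c
W_a = rng.normal(size=(16, 16))    # learned projection (hypothetical)

f_c = attribute_attention(f, x_c, W_a)
assert f_c.shape == (16,)   # one attribute-associated feature vector per class
```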
Performing this operation for all categories yields the features {f_1, f_2, …, f_C} associated with every category and its corresponding attributes, where the feature vector f_c mainly covers the regions associated with the attributes of category c.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (8)
1. An end-to-end foggy-day image multi-target detection model based on knowledge embedding comprises an image defogging sub-network, a target detection sub-network and semantic association feature learning based on knowledge embedding, and is characterized in that: the image defogging subnetwork comprises a public module and a feature recovery module, wherein the feature recovery module comprises an up-sampling submodule, a multi-scale mapping submodule and an image generation submodule.
2. The knowledge-embedding-based end-to-end foggy day image multi-target detection model as claimed in claim 1, wherein: the fog image defogging subnetwork is used for generating characteristics and sharing the characteristics with the detection subnetwork in the joint training process to improve the target detection precision under the fog condition, and the fog image defogging subnetwork is completed on the basis of an atmospheric light intensity scattering model.
3. The knowledge-embedding-based end-to-end foggy day image multi-target detection model as claimed in claim 1, wherein: the common module extracts features in the input image including important feature information for simultaneous learning of visual enhancement, target recognition and localization.
4. The knowledge-embedding-based end-to-end foggy day image multi-target detection model as claimed in claim 1, characterized in that: in the feature recovery sub-network, the output image has the same size as the input image, but the feature map extracted by the common module is one quarter the size of the input image; the up-sampling sub-module restores the resolution accordingly.
5. The knowledge-embedding-based end-to-end foggy day image multi-target detection model as claimed in claim 1, wherein: after the resolution of the feature f_C2 is increased by the up-sampling sub-module, the obtained feature map is transmitted to the multi-scale mapping sub-module for multi-scale feature extraction.
6. The knowledge-embedding-based end-to-end foggy day image multi-target detection model as claimed in claim 1, wherein: the image generation sub-module is the last stage of the image recovery sub-network, and scene restoration is completed through the image generation sub-module.
7. The knowledge-embedding-based end-to-end foggy day image multi-target detection model as claimed in claim 1, wherein: the image target detection sub-network model adopts RetinaNet as a backbone network for the foggy image target detection sub-network model, the RetinaNet provides a top-down line by utilizing a characteristic pyramid network, and the transverse connection enables a network layer with higher resolution to be constructed from an abundant semantic layer, so that the detection precision of small targets in foggy scenes is greatly improved.
8. The knowledge-embedding-based end-to-end foggy day image multi-target detection model as claimed in claim 1, characterized in that: the knowledge-embedding-based semantic association feature learning is a knowledge-guided semantic association feature learning method in which prior knowledge of category-attribute association and category-category association is structurally expressed and embedded into a deep network model, so that features covering more comprehensive discriminative information are learned.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210960202.5A CN115424026A (en) | 2022-08-11 | 2022-08-11 | End-to-end foggy day image multi-target detection model based on knowledge embedding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210960202.5A CN115424026A (en) | 2022-08-11 | 2022-08-11 | End-to-end foggy day image multi-target detection model based on knowledge embedding |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115424026A true CN115424026A (en) | 2022-12-02 |
Family
ID=84199103
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210960202.5A Pending CN115424026A (en) | 2022-08-11 | 2022-08-11 | End-to-end foggy day image multi-target detection model based on knowledge embedding |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115424026A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117253184A (en) * | 2023-08-25 | 2023-12-19 | 燕山大学 | Foggy day image crowd counting method guided by foggy priori frequency domain attention characterization |
CN117253184B (en) * | 2023-08-25 | 2024-05-17 | 燕山大学 | Foggy day image crowd counting method guided by foggy priori frequency domain attention characterization |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109934200B (en) | RGB color remote sensing image cloud detection method and system based on improved M-Net | |
CN109584248B (en) | Infrared target instance segmentation method based on feature fusion and dense connection network | |
WO2022111219A1 (en) | Domain adaptation device operation and maintenance system and method | |
CN110197505B (en) | Remote sensing image binocular stereo matching method based on depth network and semantic information | |
CN113610905B (en) | Deep learning remote sensing image registration method based on sub-image matching and application | |
CN113609896A (en) | Object-level remote sensing change detection method and system based on dual-correlation attention | |
CN112561876A (en) | Image-based pond and reservoir water quality detection method and system | |
CN115527123B (en) | Land cover remote sensing monitoring method based on multisource feature fusion | |
Zhou et al. | FSAD-Net: Feedback spatial attention dehazing network | |
CN116311254B (en) | Image target detection method, system and equipment under severe weather condition | |
CN113505726A (en) | Photovoltaic group string identification and positioning method in map | |
Su et al. | DLA-Net: Learning dual local attention features for semantic segmentation of large-scale building facade point clouds | |
CN114283137A (en) | Photovoltaic module hot spot defect detection method based on multi-scale characteristic diagram inference network | |
CN114972748A (en) | Infrared semantic segmentation method capable of explaining edge attention and gray level quantization network | |
CN114708313A (en) | Optical and SAR image registration method based on double-branch neural network | |
CN115410081A (en) | Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium | |
Sun et al. | IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes | |
Babu et al. | An efficient image dahazing using Googlenet based convolution neural networks | |
CN114463624A (en) | Method and device for detecting illegal buildings applied to city management supervision | |
Li et al. | Feature guide network with context aggregation pyramid for remote sensing image segmentation | |
Goncalves et al. | Guidednet: Single image dehazing using an end-to-end convolutional neural network | |
CN111079807A (en) | Ground object classification method and device | |
CN113506230B (en) | Photovoltaic power station aerial image dodging processing method based on machine vision | |
CN115661451A (en) | Deep learning single-frame infrared small target high-resolution segmentation method | |
CN115424026A (en) | End-to-end foggy day image multi-target detection model based on knowledge embedding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||