CN112686913A - Object boundary detection and object segmentation model based on boundary attention consistency - Google Patents
- Publication number
- CN112686913A CN112686913A CN202110028596.6A CN202110028596A CN112686913A CN 112686913 A CN112686913 A CN 112686913A CN 202110028596 A CN202110028596 A CN 202110028596A CN 112686913 A CN112686913 A CN 112686913A
- Authority
- CN
- China
- Prior art keywords
- model
- attention
- boundary
- obd
- pix2pix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention relates to a target boundary detection and target segmentation model based on boundary attention consistency. Its main technical features are: the model comprises two pix2pix models cascaded together, each consisting of a generator, a discriminator, and a loss function. The first pix2pix model is a target boundary detection (OBD) model; its detection result is superimposed on the original image and used as the input of the second pix2pix model, which is the target segmentation model. The design is reasonable: boundary attention consistency is introduced into the target boundary detection model to strengthen attention to the target boundary, so that an accurate target boundary is detected and a more accurate target segmentation result is obtained.
Description
Technical Field
The invention belongs to the technical field of computer vision and relates to a target boundary detection and target segmentation model, in particular to a target boundary detection and target segmentation model based on boundary attention consistency.
Background
In computer vision, accurate target segmentation differs from salient object detection of arbitrary objects: a specific target must be segmented from the background with high precision. It is applied, for example, to portrait segmentation in scene-change tasks and to organ segmentation before medical diagnosis. Although deep neural networks have significantly improved target segmentation performance, accurate segmentation in complex scenes remains very difficult due to background interference.
Through research on non-ideal segmentation at the boundary, it was found that the problem mostly occurs in areas where the target boundary is not obvious. This is because the local difference between the target and the background is so small that the model cannot distinguish the two from the extracted features. One possible solution is to raise boundary awareness by treating object boundary detection (OBD) as an auxiliary task for target segmentation. However, OBD receives insufficient attention in existing target segmentation models, since the target boundary occupies only a very small part of the whole image and contributes little to segmentation performance under a per-pixel loss function.
In existing target segmentation models, OBD is used only as a simple sub-network, trained with just the initial image and the ground-truth target boundary image. Such sub-networks are prone to overfitting and inaccurate OBD results, owing to the small proportion of target boundary pixels and the lack of supervision over the middle layers of the model. Guiding attention to the target boundary by supervising the middle layers of the OBD model therefore helps improve OBD accuracy. Investigation shows that most successful attention mechanisms are based on Class Activation Maps (CAMs). CAM is an effective way to enhance attention to label-related areas through image classification.
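As an illustrative sketch (not part of the invention), the CAM computation just mentioned, a weighted channel-wise combination of the final feature maps, can be written as follows; the function name and toy inputs are hypothetical:

```python
import numpy as np

def cam_attention(feature_maps, class_weights):
    # Class Activation Map: linearly combine the channels of the final
    # feature maps with the classifier weights of the target class,
    # then normalise the result to [0, 1] as an attention map.
    cam = np.tensordot(class_weights, feature_maps, axes=([0], [0]))  # (H, W)
    cam -= cam.min()
    if cam.max() > 0:
        cam /= cam.max()
    return cam

# Toy example: 4 feature channels of size 8x8.
rng = np.random.default_rng(0)
F = rng.random((4, 8, 8))
w = np.array([0.5, -0.2, 1.0, 0.1])
M = cam_attention(F, w)
```

The normalised map highlights the spatial locations whose features contribute most to the class score, which is the sense in which CAM localises label-related areas.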
However, because image-level classification provides only weak supervision, attention obtained with CAM still struggles to fall accurately on the label-related area. Researchers therefore proposed attention consistency under spatial transformation (TAC) to further constrain the attention area. TAC means that, in image classification, if the input image undergoes a spatial transformation, the attention area should follow the same transformation. Spatial transformations typically include rotation, flipping, cropping, and so on. However, TAC increases attention to the label-related region only indirectly, by requiring attention consistency of the input image under a chosen transformation, and experiments show that the consistency obtained differs markedly across transformations and their combinations. In other words, obtaining good consistency requires extensive experimentation to find a suitable transformation, so the consistency obtained under such indirect transformations is limited.
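The TAC requirement described above can be sketched as a simple residual check (illustrative only; the function name and toy maps are hypothetical):

```python
import numpy as np

def tac_gap(attention_map, transform, attention_of_transformed):
    # TAC residual: for a spatial transform T, the attention map of
    # T(image) should equal T(attention map of image). Returns the
    # mean absolute deviation between the two.
    return float(np.mean(np.abs(transform(attention_map) - attention_of_transformed)))

# Toy example with horizontal flip as the spatial transform.
flip = lambda m: m[:, ::-1]
M_img = np.arange(16.0).reshape(4, 4)   # attention map of the image
M_flip = flip(M_img)                     # attention map of the flipped image (ideal case)
```

A residual of zero means the attention follows the transformation perfectly; a large residual indicates the attention is not anchored to the label-related region.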
In summary, how to improve the accuracy of target boundary detection and target segmentation is a problem that needs to be solved urgently at present.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a target boundary detection and target segmentation model based on boundary attention consistency, solving the problem of inaccurate target boundary detection and target segmentation.
The technical problem to be solved by the invention is realized by adopting the following technical scheme:
A target boundary detection and target segmentation model based on boundary attention consistency comprises two pix2pix models cascaded together, each consisting of a generator, a discriminator, and a loss function. The first pix2pix model is an OBD model used to detect the target boundary; its detection result is superimposed on the original image and used as the input of the second pix2pix model, which is the target segmentation model used to generate the target segmentation result;
the generator of the OBD model comprises a twin network, an attention module, and a decoder. The twin network shares all parameters and takes the initial image A and its OBD detection result G(A) as two inputs; down-sampling and residual blocks yield the feature maps F_A and F_G(A) corresponding to the two branches. After pooling by a global average pooling layer (GAP) and a global max pooling layer (GMP), the feature maps F_A and F_G(A) are fed into a fully-connected layer with weights W for classification. The attention module computes the classification values by weighting the pooled feature maps, and extracts the attention maps M(A) and M(G(A)) of the initial image A and the OBD detection result G(A) by linearly combining the feature maps through channel-by-channel multiplication and summing them along the channel dimension. The classification loss and the consistency loss of the attention module jointly guide the encoder of the OBD model to extract target boundary features and pass them to the decoder to generate the OBD detection result;
the structure of a discriminator of the OBD model is the same as that of a discriminator in a conventional pix2pix model;
the loss function of the OBD model comprises an adversarial loss function Ladv for generating realistic target boundary images, an L1 loss function L1_G for maintaining stable generation, the classification loss function of the auxiliary classifier, and the boundary attention consistency loss function Latt;
the generator of the target segmentation model adopts the same structure as the conventional pix2pix model and is trained with boundary-enhanced images;
the structure of a discriminator of the target segmentation model is the same as that of a discriminator in a conventional pix2pix model;
the loss function of the target segmentation model comprises an adversarial loss function Ladv2 and an L1 loss function L1_G2, and adopts the least-squares GAN as the optimization objective.
Further, in the attention module of the OBD model, the object boundary is treated as a class attribute, and the initial image and the object boundary image are of the same class.
Further, in the attention module of the OBD model, the attention maps M(A) and M(G(A)) of the initial image and the transformed image are equal under the same OBD transformation.
Further, the adversarial loss functions Ladv and Ladv2 are respectively expressed as:

Ladv = E_{x~A}[log(1 - D(x, G(x)))^2] + E_{x~A, y~B}[log(D(x, y))^2]

Ladv2 = E_{x~A2}[log(1 - D2(x, G2(x)))^2] + E_{x~A2, y~B2}[log(D2(x, y))^2]

where G, G2 and D, D2 are the generators and discriminators of the two pix2pix models, respectively;

the L1 loss functions L1_G and L1_G2 are respectively expressed as:

L1_G = E_{x~A, y~B}[||G(x) - y||_1]

L1_G2 = E_{x~A2, y~B2}[||G2(x) - y||_1]

the classification loss function Lcls of the auxiliary classifier cg of the generator adopts the cross-entropy classification loss function;

the boundary attention consistency loss function Latt of the auxiliary classifier of the OBD model is expressed as:

Latt = E_{x~A}[||G(M(x)) - M(G(x))||_1]

where M(x) denotes the attention map of image x in the A domain, and G(x) and M(G(x)) denote the generated image and its attention map;

the above loss functions are integrated into two objective functions used to train the two pix2pix models:

L_OBD = α1·Ladv + α2·L1_G + α3·Lcls + α4·Latt

L_seg = Ladv2 + β·L1_G2

where α1 = 1, α2 = 1000, α3 = 10, α4 = 10, β = 10.
The invention has the following advantages and positive effects:
According to the invention, two pix2pix image translation models are cascaded together: the first pix2pix model detects the target boundary (OBD), its detection result is superimposed on the original image as the input of the second pix2pix model, and the second pix2pix model generates the target segmentation result. Boundary attention consistency is introduced into the target boundary detection model to strengthen attention to the target boundary, so that an accurate target boundary is detected and a more accurate target segmentation result is achieved.
Drawings
FIG. 1 is a schematic diagram of boundary attention consistency under OBD transform;
FIG. 2 is a schematic diagram of the structure of a generator of an OBD model;
FIG. 3 illustrates segmentation results of the invention on the PFCN data set.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings.
The design idea of the invention is as follows:
Current target segmentation models perform poorly in local areas where the target and background are similar; enhancing the target boundary is very helpful in solving this problem. The method treats target segmentation as a two-stage task realized with two cascaded pix2pix image translation models: the first detects the target boundary of the initial image on its own, and the second completes target segmentation on the boundary-enhanced image. The invention focuses on how object boundary detection (OBD) in the first stage improves target segmentation performance.
Generally, whether the attention area is reasonable reflects the performance of a model. For OBD, the objective is to map the initial image and the target boundary image to the same distribution. The target boundary is therefore the most reasonable attention area, since attention on it is direct evidence that the source domain (initial image) and the target domain (target boundary image) are mapped to the same distribution and that the OBD result is correct. The more attention focuses on the target boundary, the better the OBD model performs.
A straightforward way to increase attention on a desired area is full supervision of the attention map. However, under full supervision at the attention level, the model is difficult to train well, i.e. it under-fits, given the high demand of such supervision relative to the model's complexity. Another possible solution is weak supervision of the attention map through image-level classification. However, results with this approach show that attention cannot be accurately localized on the label-related area, i.e. overfitting occurs. Therefore, this application applies the following two measures to the attention map to improve attention to the target boundary.
(1) A CAM is introduced in the middle of the OBD generator as an attention module. In the attention module, the target boundary is treated as a class attribute, and the source domain (initial image) and the target domain (target boundary image) are classified into the same class. (2) The attention area is guided directly using boundary attention consistency (BAC) under the object boundary detection (OBD) transformation. BAC requires that when an initial image is converted to a target boundary map by OBD, as shown in the first row of FIG. 1, its attention map should likewise become the attention map of the target boundary image under the same OBD transformation, as shown in the last row of FIG. 1. To evaluate BAC, the same OBD transformation must be applied to the attention map of the initial image. However, unlike simple transformations such as flipping or rotation, applying the OBD transformation to an attention map is difficult. To solve this problem, the invention uses the OBD model itself to perform the transformation: the attention map of the original image is fed back into the OBD model, and its output is taken as the transformation result. The generator of the OBD model is shown in FIG. 2.
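The BAC evaluation in measure (2) can be sketched as follows, with G and M standing in for the OBD generator and the CAM attention extractor; both are illustrative callables here, not the trained networks of the invention:

```python
import numpy as np

def bac_loss(G, M, image):
    # Boundary attention consistency: feed the attention map of the
    # image back through the OBD model G, and compare the result with
    # the attention map of the OBD output. L1 deviation between the two.
    left = G(M(image))    # OBD transformation applied to the attention map
    right = M(G(image))   # attention map of the OBD result
    return float(np.mean(np.abs(left - right)))

# Toy stand-ins: linear G and M commute, so the BAC loss is zero.
G = lambda x: 2.0 * x          # toy "OBD transformation"
M = lambda x: x - x.mean()     # toy "attention extractor"
x = np.arange(9.0).reshape(3, 3)
```

When G and M do not commute (as with real networks early in training), the loss is positive and penalises the mismatch between G(M(A)) and M(G(A)).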
For convenience of explanation, the meanings of the following symbols are given first: A and B respectively denote the initial images and the ground-truth target boundary images used to train the OBD model; A2 denotes the boundary-enhanced image obtained by superimposing A and B; B2 denotes the ground-truth target segmentation result; A2 and B2 are used to train the target segmentation model in the second stage. cg is the auxiliary classifier in the OBD generator.
The target boundary detection and target segmentation model is formed by cascading two pix2pix image translation models. The first pix2pix model detects the target boundary (OBD); its detection result is superimposed on the original image as the input of the second pix2pix image translation model, which generates the target segmentation result. Boundary attention consistency is introduced into the target boundary detection model to strengthen attention to the target boundary, so that an accurate target boundary is detected. The conventional pix2pix model consists of a generator, a discriminator, an adversarial loss, and an L1 loss. The invention focuses on improving the generator of the OBD model; the discriminators of the OBD model and of the second pix2pix model have the same structure as the conventional pix2pix. The OBD generator receives an original image to be segmented and generates the corresponding target boundary detection result; the OBD discriminator receives the generated OBD result and the ground-truth target boundary and tries to distinguish them, pushing the generator to produce realistic OBD results. The generator of the second pix2pix model receives the image obtained by superimposing the OBD result on the original image and generates the target segmentation result; its discriminator receives the generated and the ground-truth segmentation results and tries to distinguish them, driving the generator to produce an accurate segmentation result. The loss functions of the two models comprise the adversarial loss and L1 loss of the pix2pix image translation model; in addition, the classification loss and the boundary attention consistency loss of the attention module are added to the generator of the OBD model. The generator structure of the OBD model is shown in FIG. 2, and the discriminators of the OBD model and the second pix2pix model are identical to the conventional pix2pix structure. The various components of the model are described in detail below.
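The two-stage cascade just described can be sketched as follows; the generator stand-ins are hypothetical, and the clipping range assumes image intensities normalised to [0, 1]:

```python
import numpy as np

def cascade_segment(image, obd_generator, seg_generator):
    # Stage 1: detect the target boundary with the first model.
    boundary = obd_generator(image)
    # Superimpose the boundary on the original image (A2 in the text).
    enhanced = np.clip(image + boundary, 0.0, 1.0)
    # Stage 2: segment the boundary-enhanced image with the second model.
    return seg_generator(enhanced)

# Toy stand-ins: a fixed diagonal "boundary" and a threshold "segmenter".
image = np.full((4, 4), 0.4)
boundary_gen = lambda im: np.eye(4)
seg_gen = lambda im: (im > 0.5).astype(float)
result = cascade_segment(image, boundary_gen, seg_gen)
```

In the toy run, only the boundary-enhanced pixels cross the segmenter's threshold, illustrating how the superimposed boundary steers the second stage.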
The following describes the respective parts of the present invention:
(1) The generator for detecting the target boundary is shown in FIG. 2 and comprises a twin network, an attention module cg, and a decoder. The twin network shares all parameters and takes the initial image A and its OBD result G(A) as two inputs, where G(·) denotes the generation process, i.e. the OBD transformation. Down-sampling and residual blocks yield the feature maps of the two branches, denoted F_A and F_G(A). GAP and GMP then pool the feature maps and feed them into a fully-connected layer with weights W for classification; the attention module cg computes the classification values by weighting the pooled feature maps. To enhance attention to the target boundary, the attention module cg classifies the two branches into the same class.
Meanwhile, the attention module cg extracts the attention maps of A and G(A), denoted M(A) and M(G(A)), by multiplying the feature maps channel by channel with the classifier weights and summing them along the channel dimension, where M(·) denotes the process of computing an attention map with CAM. By the consistency requirement, the attention maps M(A) and M(G(A)) of the initial image and the transformed image should be equal under the same OBD transformation, which can be expressed as:
G(M(A))=M(G(A)) (1)
the classification loss and the consistency loss of the attention module together guide an encoder of the OBD model to extract target boundary features and pass them to a decoder to generate an OBD result.
It should be noted that the inputs to the twin network are A and G(A) rather than the two domains A and B, for the following reasons. First, as training proceeds, G(A) approaches B, so the generation process itself realizes the OBD transformation. Second, using G(A) instead of B helps perform the OBD transformation with the very model whose attention is being shaped. Finally, the generation process can be seen as an extension of the spatial transformations in TAC, so maintaining attention consistency under the generation transformation is also reasonable.
To enforce the consistency of equation (1), the two sides of the equation are obtained along the two branches in FIG. 2. The first branch, drawn as a solid line, takes A as input and obtains the attention map M(A); M(A) is then fed back into the generator, whose output G(M(A)) represents the OBD transformation of the attention map. The other branch, drawn as a dotted line, takes the fed-back G(A) as input and obtains its attention map M(G(A)), the attention map of the OBD result. Finally, G(M(A)) and M(G(A)) are used to evaluate consistency.
(2) The generator of the target segmentation model uses the same structure as the conventional pix2pix model but is trained with boundary-enhanced images.
(3) Discriminators: the discriminators of the two pix2pix models have the same structure as in the conventional pix2pix. Each discriminator receives source/generated-target and source/ground-truth-target pairs, respectively, and tries to distinguish them so as to direct the generator to produce realistic target-domain images.
(4) Loss function
For the first pix2pix model, i.e. the OBD model of the invention, the objective function consists of four parts: the adversarial loss Ladv for generating realistic target boundary images, the L1 loss L1_G for maintaining stable generation, the classification loss Lcls of the auxiliary classifier, and the boundary attention consistency loss Latt.
For the second pix2pix model, the target segmentation model, the objective function is the same as in conventional pix2pix, comprising the adversarial loss Ladv2 and the L1 loss L1_G2. To keep training stable, the least-squares GAN is used as the optimization objective.
Adversarial losses Ladv and Ladv2: the adversarial losses of the two pix2pix models match the distribution of the source domain images to that of the target domain images:

Ladv = E_{x~A}[log(1 - D(x, G(x)))^2] + E_{x~A, y~B}[log(D(x, y))^2]   (2)

Ladv2 = E_{x~A2}[log(1 - D2(x, G2(x)))^2] + E_{x~A2, y~B2}[log(D2(x, y))^2]   (3)

where G, G2 and D, D2 are the generators and discriminators of the two pix2pix models, respectively.
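The text states that the least-squares GAN is used as the optimization objective; the sketch below shows the standard least-squares GAN discriminator and generator losses rather than a verbatim transcription of the patent's equations (names are hypothetical):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    # Least-squares GAN discriminator loss: push D(real) toward 1
    # and D(fake) toward 0. Inputs are arrays of discriminator outputs.
    return float(np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2))

def lsgan_g_loss(d_fake):
    # Least-squares GAN generator loss: push D(fake) toward 1.
    return float(np.mean((d_fake - 1.0) ** 2))
```

Replacing the log terms with squared deviations is what gives least-squares GAN its more stable gradients, which is why the text adopts it for stable training.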
L1 loss functions: as in the conventional pix2pix model, an L1 loss is applied to the generator to avoid model collapse and ensure stable generation. The L1 losses of the two pix2pix models are:

L1_G = E_{x~A, y~B}[||G(x) - y||_1]   (4)

L1_G2 = E_{x~A2, y~B2}[||G2(x) - y||_1]   (5)
Classification loss of the CAM: to enhance attention to the target boundary, the CAM of the OBD model classifies an A-domain image x and its OBD result G(x) into the same class. The classification loss Lcls uses the cross-entropy classification loss function, where cg is the auxiliary classifier of the generator.
Boundary attention consistency loss: by the definition of consistency, if an initial image is transformed by OBD into a target boundary map, its attention map should undergo the same OBD transformation. The consistency loss is defined with the absolute deviation as:

Latt = E_{x~A}[||G(M(x)) - M(G(x))||_1]   (7)

where M(x) denotes the attention map of image x in the A domain, and G(x) and M(G(x)) denote the generated image and its attention map. This consistency is a strong constraint on attention to the target boundary.
The complete objective functions: the above losses are integrated into two objective functions used to train the two pix2pix models:

L_OBD = α1·Ladv + α2·L1_G + α3·Lcls + α4·Latt

L_seg = Ladv2 + β·L1_G2

where α1 = 1, α2 = 1000, α3 = 10, α4 = 10, β = 10.
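The weighted combination of the losses with the stated coefficients can be sketched as follows; the grouping of the terms into two objectives is inferred from the text, and the function names are hypothetical:

```python
def obd_objective(l_adv, l1_g, l_cls, l_att,
                  a1=1.0, a2=1000.0, a3=10.0, a4=10.0):
    # Weighted sum of the four OBD losses with the weights stated in
    # the text: alpha1=1, alpha2=1000, alpha3=10, alpha4=10.
    return a1 * l_adv + a2 * l1_g + a3 * l_cls + a4 * l_att

def seg_objective(l_adv2, l1_g2, beta=10.0):
    # Second-stage objective: adversarial loss plus beta times the L1 loss.
    return l_adv2 + beta * l1_g2
```

The large weight on the L1 term mirrors conventional pix2pix practice, where pixel-wise fidelity dominates the adversarial signal during training.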
It should be emphasized that the embodiments described herein are illustrative rather than restrictive; the invention is therefore not limited to the embodiments given in the detailed description, and other embodiments derived by those skilled in the art from the technical solutions of the invention likewise fall within its scope.
Claims (4)
1. A target boundary detection and target segmentation model based on boundary attention consistency, comprising two pix2pix models, each consisting of a generator, a discriminator, and a loss function, characterized in that: the two pix2pix models are cascaded together; the first pix2pix model is an OBD model whose detection result is superimposed on the original image and used as the input of the second pix2pix model; and the second pix2pix model is a target segmentation model;
the generator of the OBD model comprises a twin network, an attention module, and a decoder. The twin network shares all parameters and takes the initial image A and its OBD detection result G(A) as two inputs; down-sampling and residual blocks yield the feature maps F_A and F_G(A) corresponding to the two branches. After pooling by a global average pooling layer (GAP) and a global max pooling layer (GMP), the feature maps F_A and F_G(A) are fed into a fully-connected layer with weights W for classification. The attention module computes the classification values by weighting the pooled feature maps, and extracts the attention maps M(A) and M(G(A)) of the initial image A and the OBD detection result G(A) by linearly combining the feature maps through channel-by-channel multiplication and summing them along the channel dimension. The classification loss and the consistency loss of the attention module jointly guide the encoder of the OBD model to extract target boundary features and pass them to the decoder to generate the OBD detection result;
the structure of a discriminator of the OBD model is the same as that of a discriminator in a conventional pix2pix model;
the loss function of the OBD model comprises an adversarial loss function Ladv for generating realistic target boundary images, an L1 loss function L1_G for maintaining stable generation, the classification loss function of the auxiliary classifier, and the boundary attention consistency loss function Latt;
the generator of the target segmentation model adopts the same structure as the conventional pix2pix model and is trained with boundary-enhanced images;
the structure of a discriminator of the target segmentation model is the same as that of a discriminator in a conventional pix2pix model;
the loss function of the target segmentation model comprises an adversarial loss function Ladv2 and an L1 loss function L1_G2, and adopts the least-squares GAN as the optimization objective.
2. The target boundary detection and target segmentation model based on boundary attention consistency of claim 1, wherein: in the attention module of the OBD model, the target boundary is treated as a class attribute, and the initial image and the target boundary image belong to the same class.
3. The target boundary detection and target segmentation model based on boundary attention consistency of claim 1, wherein: in the attention module of the OBD model, the attention maps M(A) and M(G(A)) of the initial image and the transformed image are equal under the same OBD transformation.
4. The target boundary detection and target segmentation model based on boundary attention consistency of claim 1, wherein:
the adversarial loss functions Ladv and Ladv2 are respectively expressed as:

Ladv = E_{x~A}[log(1 - D(x, G(x)))^2] + E_{x~A, y~B}[log(D(x, y))^2]

Ladv2 = E_{x~A2}[log(1 - D2(x, G2(x)))^2] + E_{x~A2, y~B2}[log(D2(x, y))^2]

where G, G2 and D, D2 are the generators and discriminators of the two pix2pix models, respectively;

the L1 loss functions L1_G and L1_G2 are respectively expressed as:

L1_G = E_{x~A, y~B}[||G(x) - y||_1]

L1_G2 = E_{x~A2, y~B2}[||G2(x) - y||_1]

the classification loss function Lcls of the auxiliary classifier cg of the generator adopts the cross-entropy classification loss function;

the boundary attention consistency loss function Latt of the auxiliary classifier of the OBD model is expressed as:

Latt = E_{x~A}[||G(M(x)) - M(G(x))||_1]

where M(x) denotes the attention map of image x in the A domain, and G(x) and M(G(x)) denote the generated image and its attention map;

the above loss functions are integrated into two objective functions used to train the two pix2pix models:

L_OBD = α1·Ladv + α2·L1_G + α3·Lcls + α4·Latt

L_seg = Ladv2 + β·L1_G2

where α1 = 1, α2 = 1000, α3 = 10, α4 = 10, β = 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110028596.6A CN112686913B (en) | 2021-01-11 | 2021-01-11 | Object boundary detection and object segmentation model based on boundary attention consistency |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112686913A true CN112686913A (en) | 2021-04-20 |
CN112686913B CN112686913B (en) | 2022-06-10 |
Family
ID=75457031
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110028596.6A Expired - Fee Related CN112686913B (en) | 2021-01-11 | 2021-01-11 | Object boundary detection and object segmentation model based on boundary attention consistency |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450366A (en) * | 2021-07-16 | 2021-09-28 | 桂林电子科技大学 | AdaptGAN-based low-illumination semantic segmentation method |
CN114037845A (en) * | 2021-11-30 | 2022-02-11 | 昆明理工大学 | Method and system for judging main direction of different-source image feature block based on GAN network |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110148142A (en) * | 2019-05-27 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Training method, device, equipment and the storage medium of Image Segmentation Model |
CN110287826A (en) * | 2019-06-11 | 2019-09-27 | 北京工业大学 | A kind of video object detection method based on attention mechanism |
CN110458133A (en) * | 2019-08-19 | 2019-11-15 | 电子科技大学 | Lightweight method for detecting human face based on production confrontation network |
CN110569905A (en) * | 2019-09-10 | 2019-12-13 | 江苏鸿信系统集成有限公司 | Fine-grained image classification method based on generation of confrontation network and attention network |
CN110738642A (en) * | 2019-10-08 | 2020-01-31 | 福建船政交通职业学院 | Mask R-CNN-based reinforced concrete crack identification and measurement method and storage medium |
CN111462126A (en) * | 2020-04-08 | 2020-07-28 | 武汉大学 | Semantic image segmentation method and system based on edge enhancement |
CN111914698A (en) * | 2020-07-16 | 2020-11-10 | 北京紫光展锐通信技术有限公司 | Method and system for segmenting human body in image, electronic device and storage medium |
CN111932479A (en) * | 2020-08-10 | 2020-11-13 | 中国科学院上海微系统与信息技术研究所 | Data enhancement method, system and terminal |
CN112016569A (en) * | 2020-07-24 | 2020-12-01 | 驭势科技(南京)有限公司 | Target detection method, network, device and storage medium based on attention mechanism |
CN112132844A (en) * | 2020-11-12 | 2020-12-25 | 福建帝视信息科技有限公司 | Recursive non-local self-attention image segmentation method based on lightweight |
Non-Patent Citations (5)
Title |
---|
PHILLIP ISOLA ET AL: "Image-to-Image Translation with Conditional Adversarial Networks", 《2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION》 * |
YI-HAO HUANG ET AL: "Illumination-Robust Object Coordinate Detection by Adopting Pix2Pix GAN for Training Image Generation", 《2019 INTERNATIONAL CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE》 * |
CAO JIANFANG ET AL: "Application of an improved GrabCut algorithm to ancient mural segmentation", 《JOURNAL OF HUNAN UNIVERSITY OF SCIENCE AND TECHNOLOGY (NATURAL SCIENCE EDITION)》 * |
ZHAN QILIANG ET AL: "An instance segmentation scheme combining multiple image segmentation algorithms", 《JOURNAL OF CHINESE COMPUTER SYSTEMS》 * |
ZHAO WENZHE AND QIN SHIYIN: "A new high-precision method for video object detection and segmentation", 《JOURNAL OF BEIJING UNIVERSITY OF AERONAUTICS AND ASTRONAUTICS》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113450366A (en) * | 2021-07-16 | 2021-09-28 | 桂林电子科技大学 | AdaptGAN-based low-illumination semantic segmentation method |
CN113450366B (en) * | 2021-07-16 | 2022-08-30 | 桂林电子科技大学 | AdaptGAN-based low-illumination semantic segmentation method |
CN114037845A (en) * | 2021-11-30 | 2022-02-11 | 昆明理工大学 | Method and system for judging the principal direction of heterogeneous image feature blocks based on a GAN network |
CN114037845B (en) * | 2021-11-30 | 2024-04-09 | 昆明理工大学 | Method and system for judging the principal direction of heterogeneous image feature blocks based on a GAN network |
Also Published As
Publication number | Publication date |
---|---|
CN112686913B (en) | 2022-06-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106127684A (en) | Image super-resolution enhancement method based on forward-backward recurrent convolutional neural networks | |
CN112686913B (en) | Object boundary detection and object segmentation model based on boundary attention consistency | |
Jiang et al. | Masked swin transformer unet for industrial anomaly detection | |
CN110889895B (en) | Face video super-resolution reconstruction method fusing single-frame reconstruction network | |
CN104504673A (en) | Visible light and infrared image fusion method and system based on NSST | |
CN113610732B (en) | Full-focus image generation method based on interactive countermeasure learning | |
Hu et al. | A two-stage unsupervised approach for low light image enhancement | |
Yue et al. | IENet: Internal and external patch matching ConvNet for web image guided denoising | |
Li et al. | MAFusion: Multiscale attention network for infrared and visible image fusion | |
Krishnan et al. | SwiftSRGAN: Rethinking super-resolution for efficient and real-time inference | |
Zhao et al. | Skip-connected deep convolutional autoencoder for restoration of document images | |
Lin et al. | Steformer: Efficient stereo image super-resolution with transformer | |
Xu et al. | Attention-guided polarization image fusion using salient information distribution | |
Zhao et al. | Adaptively attentional feature fusion oriented to multiscale object detection in remote sensing images | |
CN110766609B (en) | Depth-of-field map super-resolution reconstruction method for ToF camera | |
CN112884773B (en) | Target segmentation model based on target attention consistency under background transformation | |
Wu et al. | Review of imaging device identification based on machine learning | |
Dulhare et al. | A review on diversified mechanisms for multi focus image fusion | |
CN115841438A (en) | Infrared image and visible light image fusion method based on improved GAN network | |
Zhang et al. | Pooling Pyramid Vision Transformer for Unsupervised Monocular Depth Estimation | |
Zuo et al. | A2GSTran: Depth Map Super-resolution via Asymmetric Attention with Guidance Selection | |
Tian et al. | Improving novelty detection by self-supervised learning and channel attention mechanism | |
Wang et al. | Lowlight image enhancement based on unsupervised learning global-local feature modeling | |
Tan et al. | DBSwin: Transformer based dual branch network for single image deraining | |
Zhu et al. | Coord-FCN for same-class objects segmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 2022-06-10 |