CN112509046A - Weak supervision convolutional neural network image target positioning method - Google Patents

Weak supervision convolutional neural network image target positioning method

Info

Publication number
CN112509046A
Authority
CN
China
Prior art keywords
image
neural network
convolutional neural
layer
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011437759.8A
Other languages
Chinese (zh)
Other versions
CN112509046B (en)
Inventor
罗杨
濮希同
骆春波
徐加朗
张赟疆
韦仕才
许燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202011437759.8A
Publication of CN112509046A
Application granted
Publication of CN112509046B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/04 Inference or reasoning models
    • G06N 5/041 Abduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses a weakly supervised convolutional neural network image target positioning method, which comprises the following steps: establishing a convolutional neural network classification model with a batch normalization layer, training the classification model, and saving it after training; inputting the image to be positioned into the trained convolutional neural network classification model and acquiring the feature maps output by a deep convolutional layer; performing weighted fusion on the obtained feature maps to obtain a saliency map; converting the obtained saliency map into a heat map and superimposing the heat map on the input image to generate a composite image; and storing or visualizing the obtained composite image to obtain the target positioning image. The method uses the batch normalization scaling factors as the weights of the corresponding feature maps, and thereby solves the problems that existing methods are complex to implement, require information such as class confidence scores and gradients, and leave the working mechanism of the convolutional neural network model opaque.

Description

Weak supervision convolutional neural network image target positioning method
Technical Field
The invention relates to the field of image target localization, and in particular to a weakly supervised convolutional neural network image target positioning method.
Background
Convolutional neural networks first achieved breakthroughs in image classification and, owing to their outstanding feature extraction capability, have since been widely applied in many fields, such as image target positioning. When a convolutional neural network is applied to image classification, the image categories only need to be simply encoded, whereas in a target positioning task the target position in the image must be annotated in advance with a manually drawn bounding box. The target positioning task therefore requires stronger supervision and is more challenging than the image classification task.
One existing approach uses the weight parameters learned for the target class by a global average pooling layer to perform a weighted addition of the feature maps output by the last convolutional layer, obtaining a class activation map; the class activation map highlights the position region of the target in the image and is used for target positioning. Related researchers have combined the feedback information of the convolutional neural network: through back-propagation they compute the partial derivatives of the class score output by the network with respect to the feature maps of the last convolutional layer, use the global average pooling value of these partial derivatives as the weight of the corresponding feature map, obtain a class activation map, and thereby localize the image target. Building on this work, other researchers perform a pixel-wise weighted average of the feature-map gradients, use the resulting average as the weight of the corresponding feature map, and then carry out target positioning with the obtained class activation map.
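For reference only, the class activation map technique described above can be summarized by the following sketch; PyTorch, the tensor shapes, the class index and the variable names are illustrative assumptions, not part of the invention.

```python
# Illustrative sketch of the prior-art class activation map (CAM): feature maps of the last
# convolutional layer are weighted by the classifier weights learned for the target class
# behind the global average pooling layer (shapes and values are placeholder assumptions).
import torch

num_channels, h, w, num_classes = 512, 14, 14, 1000
feature_maps = torch.rand(num_channels, h, w)        # output of the last convolutional layer
fc_weights = torch.rand(num_classes, num_channels)   # class weights learned after global average pooling
target_class = 283                                   # hypothetical predicted class index

# CAM(x, y) = sum_i w_{c,i} * F_i(x, y): highlights the position region of the target.
cam = (fc_weights[target_class][:, None, None] * feature_maps).sum(dim=0)
```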
The prior art relies on the global average pooling layer and on the class confidence scores output by the convolutional neural network model; for a model that does not contain a global average pooling layer, its structure must be modified to add one. Other prior art computes the weight parameters of the feature maps through back-propagation, which introduces a large number of additional parameters, so the computational complexity is high and the implementation is complex. Moreover, the inference process of a convolutional neural network classification model predicts the image target class using forward propagation only, which is inconsistent with the back-propagation information flow used in that prior art, so the internal working mechanism of the convolutional neural network cannot be well explained.
Disclosure of Invention
In view of the above shortcomings of the prior art, the weakly supervised convolutional neural network image target positioning method provided by the invention solves the problems that conventional target positioning methods require bounding-box annotation information, back-propagation computation and model structure modification, and are therefore complex to implement; it also addresses the problem that the opaque working mechanism of the convolutional neural network model cannot be well explained.
In order to achieve the purpose of the invention, the invention adopts the technical scheme that:
A weakly supervised convolutional neural network image target positioning method comprises the following steps:
S1, establishing a convolutional neural network classification model with a batch normalization layer, training the classification model, and saving it after training;
S2, inputting the image to be positioned into the convolutional neural network classification model trained in step S1, and acquiring the feature maps output by a deep convolutional layer;
S3, performing weighted fusion on the feature maps obtained in step S2 to obtain a saliency map;
S4, converting the saliency map obtained in step S3 into a heat map, and superimposing the heat map on the original input image to generate a composite image;
and S5, storing or visualizing the composite image obtained in step S4 to obtain the target positioning image.
The invention has the beneficial effects that: firstly, the method effectively utilizes the feature information extracted by the convolutional neural network and can localize the target in the image without manually annotated position information, which greatly reduces the labor cost. Secondly, target positioning requires only the forward propagation of the classification model, so the complexity is low, the implementation is easy, and the method is closer to the inference process of the neural network. Finally, the method can, to a certain extent, explain the internal working mechanism of the convolutional neural network, allowing a deeper understanding of how the convolutional neural network works.
Further, step S2 includes the following substeps:
S21, obtaining the feature map output by the l-th convolutional layer from the input image I, expressed as:
X^l = H(I), \quad I \in \mathbb{R}^3
where X^l denotes the feature map output by the convolution of the l-th layer, and H(·) denotes the abstract function of the convolution computation from layer 1 to layer l of the convolutional neural network;
S22, performing batch normalization on the feature map X^l, expressed as:
\hat{X}_i^l = \mathrm{BN}(X_i^l) = \beta_i^l \cdot \dfrac{X_i^l - \mu_i}{\sqrt{\delta_i}} + b^l, \quad i = 1, 2, \ldots, N
where \hat{X}_i^l denotes the feature map of the i-th channel of the l-th layer after batch normalization, the subscript i denotes the channel index, BN(·) denotes the batch normalization computation, \beta_i^l denotes the scaling factor of the batch normalization of the l-th layer, \mu_i denotes the mean of the i-th channel feature map, \delta_i denotes the variance of the i-th channel feature map, b^l denotes the bias of the batch normalization of the l-th layer, and N denotes the total number of channels of the l-th layer feature maps;
S23, applying the ReLU activation function to the feature map for non-linear processing to obtain the output feature map of the l-th layer, expressed as:
F^l(x, y) = \mathrm{ReLU}\big(\hat{X}^l(x, y)\big) = \max\big(\hat{X}^l(x, y), 0\big)
where F^l(x, y) denotes the deep convolutional layer output feature map, ReLU(·) denotes the ReLU activation function, max(·, ·) denotes taking the maximum of its two arguments, and (x, y) denotes the spatial pixel coordinates of the feature map.
The beneficial effects of the above further scheme are: the target is localized using only the feature maps obtained by the forward-propagation computation of the convolutional neural network model, which makes full use of the feature extraction capability of the model and avoids the additional strong supervision annotation required by existing target positioning methods. The effective intermediate information obtained during the inference of the model is used for localization, which is more consistent with the operating mechanism of the model and helps to reasonably explain its internal working mechanism.
Further, the specific process of step S3 is:
the feature maps F_i^l obtained in step S2 are weighted and summed along the channel direction with the batch normalization scaling factors \beta_i^l corresponding to the convolutional layer to obtain a two-dimensional saliency map, expressed as:
S(x, y) = \sum_{i=1}^{N} \beta_i^l \, F_i^l(x, y)
where S(x, y) denotes the saliency map;
the beneficial effects of the above further scheme are: the existing parameters of the model are used directly as the weights for the weighted fusion of the feature map channels, so the model structure does not need to be changed, the class confidence scores predicted by the model are not needed, the weights do not need to be obtained through large-scale computation, and the target positioning speed is improved.
Further, step S4 includes the following substeps:
S41, filtering the negative values in the saliency map S obtained in step S3, and enlarging the saliency map S to the same size as the original image I using an interpolation algorithm, as follows:
R(x, y) = resize(max(S(x, y), 0))
where R(x, y) denotes the enlarged saliency map and resize(·) denotes the interpolation function;
S42, normalizing the values in the enlarged saliency map R(x, y), expressed as:
R'(x, y) = \dfrac{R(x, y) - \min(R)}{\max(R) - \min(R)}
where R'(x, y) denotes the normalized saliency map, R'(x, y) ∈ [0, 1], and max(·) and min(·) denote the maximum-value and minimum-value functions, respectively;
S43, converting the normalized saliency map R'(x, y) into a heat map and adding the heat map element by element to the original image to obtain the composite image, expressed as:
M(x, y) = H(x, y) + I(x, y)
where M(x, y) denotes the composite image, H(x, y) denotes the heat map generated from R'(x, y), I(x, y) denotes the original input image, and I(x, y) ∈ [0, 1].
The beneficial effects of the above further scheme are: filtering out negative values reduces the noise in the saliency map and effectively improves the accuracy of target positioning; enlarging the saliency map to the same size as the original image makes it possible to combine the features extracted by the model with the input image; normalization makes the target region in the saliency map more prominent and converts the saliency map into an image target localization mask; and combining the saliency map with the input image makes the pixel values of the target position region in the input image larger and more prominent, enabling effective localization.
Further, the specific process of step S5 is:
the values in the composite image M are mapped into the interval [0, 255], expressed as:
L(x, y) = \dfrac{M(x, y) - \min(M)}{\max(M) - \min(M)} \times 255
where L(x, y) denotes the mapped image and L(x, y) ∈ [0, 255];
finally, the mapped image L(x, y) is stored in an image format or visualized to obtain the required target-positioned image.
The beneficial effects of the above further scheme are: the localized composite image is mapped into a digital image format that a computer can recognize, which makes it convenient to display and observe the image target positioning result. With the visualized result, the internal operating mechanism of the model can be explained more intuitively and reasonably.
Drawings
FIG. 1 is a flow chart of a method for weakly supervised convolutional neural network image target localization of the present invention;
FIG. 2 is a flowchart of step S2 according to the present invention;
FIG. 3 is a flowchart of step S4 according to the present invention;
FIG. 4 is a schematic diagram of an implementation of the present invention.
Detailed Description
The following description of the embodiments of the present invention is provided to help those skilled in the art understand the invention, but it should be understood that the invention is not limited to the scope of these embodiments. For those skilled in the art, various changes may be made without departing from the spirit and scope of the invention as defined in the appended claims, and everything produced using the inventive concept falls within the scope of protection of the invention.
As shown in fig. 1, a method for positioning an image target of a weakly supervised convolutional neural network includes the following steps:
S1, establishing a convolutional neural network classification model with a batch normalization layer, training the classification model, and saving it after training;
specifically, a convolutional neural network classification model with a batch normalization layer is established, the training set of an image classification dataset is fed into the convolutional neural network, and the classification model is trained so that it learns to extract the deep features of the targets in the images; the trained convolutional neural network classification model is then saved;
S2, inputting the image to be positioned into the convolutional neural network classification model trained in step S1, and acquiring the feature maps output by a deep convolutional layer;
the image to be positioned, which contains the classified target, is input into the convolutional neural network classification model trained in S1, and the feature maps output by the last convolutional layer are obtained through computation;
because input images of different sizes produce feature maps of different spatial sizes at the last convolutional layer, the spatial size of the feature map used in this step needs to be larger than 7 × 7; when the spatial size of the feature map is too small, the target cannot be localized well. When the input image is small, a feature map output by an earlier convolutional layer may be used instead in order to ensure an appropriate feature map spatial size.
In the embodiment of the present invention, step S2, shown in FIG. 2, includes the following sub-steps:
S21, obtaining the feature map output by the l-th layer; assuming that the computation flow within the l-th layer follows the sequence convolution → batch normalization → ReLU, the feature map output by the convolution of the l-th layer is computed from the input image I as:
X^l = H(I), \quad I \in \mathbb{R}^3
where X^l denotes the feature map output by the convolution of the l-th layer, and H(·) denotes the abstract function of the convolution computation from layer 1 to layer l of the convolutional neural network;
S22, performing batch normalization on the feature map X^l, computed as:
\hat{X}_i^l = \mathrm{BN}(X_i^l) = \beta_i^l \cdot \dfrac{X_i^l - \mu_i}{\sqrt{\delta_i}} + b^l, \quad i = 1, 2, \ldots, N
where \hat{X}_i^l denotes the feature map of the i-th channel of the l-th layer after batch normalization, the subscript i denotes the channel index, BN(·) denotes the batch normalization computation, \beta_i^l denotes the scaling factor of the batch normalization of the l-th layer, \mu_i denotes the mean of the i-th channel feature map, \delta_i denotes the variance of the i-th channel feature map, b^l denotes the bias of the batch normalization of the l-th layer, and N denotes the total number of channels of the l-th layer feature maps. Each feature map channel is normalized by subtracting its mean and dividing by the square root of its variance, so the scaling factor \beta_i^l multiplied onto the corresponding normalized channel can represent the importance of the i-th feature map.
S23, applying the ReLU activation function to the feature map for non-linear processing to obtain the output feature map of the l-th layer, expressed as:
F^l(x, y) = \mathrm{ReLU}\big(\hat{X}^l(x, y)\big) = \max\big(\hat{X}^l(x, y), 0\big)
where F^l(x, y) denotes the deep convolutional layer output feature map to be obtained by the invention, ReLU(·) denotes the ReLU activation function, max(·, ·) denotes taking the maximum of its two arguments, and (x, y) denotes the spatial pixel coordinates of the feature map.
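By way of illustration, the following sketch shows one possible realization of steps S1 and S2 in PyTorch. The patent does not prescribe a framework or a particular network; torchvision's pretrained vgg16_bn standing in for the trained batch-normalized classification model, the torchvision version (≥ 0.13 for the weights argument), the preprocessing constants, the input path and all variable names are assumptions of this example.

```python
# Illustrative sketch of S1/S2 (assumptions: PyTorch, torchvision >= 0.13, vgg16_bn
# pretrained on ImageNet standing in for the trained BN-equipped classifier of S1).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

model = models.vgg16_bn(weights="IMAGENET1K_V1")   # S1: a trained classifier with BN layers
model.eval()

# S2: capture the feature map F^l output by a deep conv -> BN -> ReLU block via a forward hook.
# In vgg16_bn, features[-2] is the ReLU closing the last convolutional block.
captured = {}
def save_feature_map(module, inputs, output):
    captured["deep"] = output.detach()             # shape: (1, N, h, w)

model.features[-2].register_forward_hook(save_feature_map)

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
image = Image.open("input.jpg").convert("RGB")     # hypothetical image to be localized
x = preprocess(image).unsqueeze(0)

with torch.no_grad():                              # forward propagation only, no gradients
    model(x)

feature_map = captured["deep"][0]                  # (N, h, w); here N = 512, h = w = 14 > 7
```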
S3, performing weighted fusion on the feature maps obtained in step S2 to obtain a saliency map;
the scaling factors β of the batch normalization layer in the deep convolutional layer of the model are read and used as the weights of the feature maps obtained in step S2, and the feature maps are weighted and fused to obtain the saliency map;
In this embodiment of the present invention, the specific content of step S3 includes:
the feature maps F_i^l obtained in step S2 are weighted and summed along the channel direction with the batch normalization scaling factors \beta_i^l corresponding to the layer, giving a two-dimensional saliency map expressed as:
S(x, y) = \sum_{i=1}^{N} \beta_i^l \, F_i^l(x, y)
where S(x, y) denotes the saliency map;
the regions with larger values in the saliency map S correspond to the spatial position of the target in the original image.
S4, converting the saliency map obtained in step S3 into a heat map, and superimposing the heat map on the original input image to generate a composite image;
the saliency map is spatially enlarged to the same size as the original image using an interpolation algorithm, the enlarged saliency map is converted into a heat map, the heat map is superimposed on the original input image to generate a composite image, and the composite image is mapped to a reasonable range for storage or visualization to obtain the target positioning image.
In the embodiment of the present invention, step S4 shown in fig. 3 includes the following sub-steps:
s41, filtering the negative values in the saliency map S obtained in step S3, and enlarging the saliency map S to the same size as the original image I using an interpolation algorithm, as follows:
R(x,y)=resize(max(S(x,y),0))
wherein, R (x, y) represents an enlarged saliency map, resize (·) represents an interpolation function, which has the effect of enlarging the saliency map S (x, y) into R (x, y), and the size of the enlarged saliency map R (x, y) is the same as that of the input original image I;
s42, normalizing the values in the enlarged saliency map R (x, y) for thermodynamic map conversion, as:
Figure BDA0002829048680000081
wherein R '(x, y) represents the normalized saliency map, and R' (x, y) is ∈ [0,1], max (·) and min (·) represent functions for maximum and minimum values, respectively;
S43, converting the normalized saliency map R'(x, y) into a heat map, and then adding the heat map element by element to the original image to obtain the composite image matrix, expressed as:
M(x, y) = H(x, y) + I(x, y)
where M(x, y) denotes the composite image, H(x, y) denotes the heat map generated from R'(x, y), I(x, y) denotes the original input image, and I(x, y) ∈ [0, 1].
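A minimal sketch of step S4 is given below, assuming OpenCV and NumPy; bilinear interpolation via cv2.resize and the JET colormap via cv2.applyColorMap are implementation choices for this example, not requirements stated by the patent.

```python
# Illustrative sketch of S4 (assumes `saliency` from the S3 sketch, plus OpenCV and NumPy).
import cv2
import numpy as np

orig = cv2.imread("input.jpg")                               # hypothetical input path, BGR uint8
h, w = orig.shape[:2]

# S41: filter out negative values, then enlarge to the original image size by interpolation.
s = saliency.clamp(min=0).cpu().numpy()
r = cv2.resize(s, (w, h), interpolation=cv2.INTER_LINEAR)    # R(x, y) = resize(max(S, 0))

# S42: min-max normalize into [0, 1].
r_norm = (r - r.min()) / (r.max() - r.min() + 1e-8)

# S43: convert to a heat map and add it element by element to the original image in [0, 1].
heat = cv2.applyColorMap(np.uint8(255 * r_norm), cv2.COLORMAP_JET).astype(np.float32) / 255.0
composite = heat + orig.astype(np.float32) / 255.0           # M(x, y) = H(x, y) + I(x, y)
```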
S5, storing or visualizing the composite image obtained in step S4 to obtain the target positioning image.
Because the pixel values of a digital image satisfy p ∈ [0, 255], if the composite image M(x, y) is to be saved or displayed in an image format, the values in M(x, y) need to be mapped into the interval [0, 255].
In the embodiment of the present invention, the specific content of step S5 includes:
the values in the composite image M(x, y) are mapped into the interval [0, 255], expressed as:
L(x, y) = \dfrac{M(x, y) - \min(M)}{\max(M) - \min(M)} \times 255
where L(x, y) denotes the mapped image and L(x, y) ∈ [0, 255];
the mapped image L(x, y) is stored in an image format or visualized, and the image with the target localized is finally obtained; the implementation process of the invention is shown in FIG. 4.
It will be appreciated by those of ordinary skill in the art that the embodiments described herein are intended to help the reader understand the principles of the invention, and the scope of protection is not limited to these specifically recited examples and embodiments. Those skilled in the art can make various other specific modifications and combinations based on the technical teachings disclosed by the invention without departing from its essence, and these modifications and combinations remain within the scope of protection of the invention.

Claims (5)

1. A method for positioning an image target of a weakly supervised convolutional neural network is characterized by comprising the following steps:
S1, establishing a convolutional neural network classification model with a batch normalization layer, training the classification model, and saving it after training;
S2, inputting the image to be positioned into the convolutional neural network classification model trained in step S1, and acquiring the feature maps output by a deep convolutional layer;
S3, performing weighted fusion on the feature maps obtained in step S2 to obtain a saliency map;
S4, converting the saliency map obtained in step S3 into a heat map, and superimposing the heat map on the input image to generate a composite image;
and S5, storing or visualizing the composite image obtained in step S4 to obtain the target positioning image.
2. The weakly supervised convolutional neural network image target positioning method of claim 1, wherein step S2 comprises the following substeps:
S21, obtaining the feature map output by the l-th convolutional layer from the input image I, expressed as:
X^l = H(I), \quad I \in \mathbb{R}^3
where X^l denotes the feature map output by the convolution of the l-th layer, and H(·) denotes the abstract function of the convolution computation from layer 1 to layer l of the convolutional neural network;
S22, performing batch normalization on the feature map X^l, expressed as:
\hat{X}_i^l = \mathrm{BN}(X_i^l) = \beta_i^l \cdot \dfrac{X_i^l - \mu_i}{\sqrt{\delta_i}} + b^l, \quad i = 1, 2, \ldots, N
where \hat{X}_i^l denotes the feature map of the i-th channel of the l-th layer after batch normalization, the subscript i denotes the channel index, BN(·) denotes the batch normalization computation, \beta_i^l denotes the scaling factor of the batch normalization of the l-th layer, \mu_i denotes the mean of the i-th channel feature map, \delta_i denotes the variance of the i-th channel feature map, b^l denotes the bias of the batch normalization of the l-th layer, and N denotes the total number of channels of the l-th layer feature maps;
S23, applying the ReLU activation function to the feature map for non-linear processing to obtain the output feature map of the l-th layer, expressed as:
F^l(x, y) = \mathrm{ReLU}\big(\hat{X}^l(x, y)\big) = \max\big(\hat{X}^l(x, y), 0\big)
where F^l(x, y) denotes the deep convolutional layer output feature map, ReLU(·) denotes the ReLU activation function, max(·, ·) denotes taking the maximum of its two arguments, and (x, y) denotes the spatial pixel coordinates of the feature map.
3. The weakly supervised convolutional neural network image target positioning method of claim 2, wherein the specific process of step S3 is:
the feature maps F_i^l obtained in step S2 are weighted and summed along the channel direction with the batch normalization scaling factors \beta_i^l corresponding to the convolutional layer to obtain a two-dimensional saliency map, expressed as:
S(x, y) = \sum_{i=1}^{N} \beta_i^l \, F_i^l(x, y)
where S(x, y) denotes the saliency map.
4. The weakly supervised convolutional neural network image target positioning method of claim 3, wherein step S4 comprises the following substeps:
S41, filtering the negative values in the saliency map S obtained in step S3, and enlarging the saliency map S to the same size as the original image I using an interpolation algorithm, as follows:
R(x, y) = resize(max(S(x, y), 0))
where R(x, y) denotes the enlarged saliency map and resize(·) denotes the interpolation function;
S42, normalizing the values in the enlarged saliency map R(x, y), expressed as:
R'(x, y) = \dfrac{R(x, y) - \min(R)}{\max(R) - \min(R)}
where R'(x, y) denotes the normalized saliency map, R'(x, y) ∈ [0, 1], and max(·) and min(·) denote the maximum-value and minimum-value functions, respectively;
S43, converting the normalized saliency map R'(x, y) into a heat map and adding the heat map element by element to the original image to obtain the composite image, expressed as:
M(x, y) = H(x, y) + I(x, y)
where M(x, y) denotes the composite image, H(x, y) denotes the heat map generated from R'(x, y), I(x, y) denotes the original input image, and I(x, y) ∈ [0, 1].
5. The weakly supervised convolutional neural network image target positioning method of claim 4, wherein the specific process of step S5 comprises:
mapping the values in the composite image M(x, y) into the interval [0, 255], expressed as:
L(x, y) = \dfrac{M(x, y) - \min(M)}{\max(M) - \min(M)} \times 255
where L(x, y) denotes the mapped image and L(x, y) ∈ [0, 255];
finally, the mapped image L(x, y) is stored in an image format or visualized to obtain the required target-positioned image.
CN202011437759.8A 2020-12-10 2020-12-10 Weak supervision convolutional neural network image target positioning method Active CN112509046B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011437759.8A CN112509046B (en) 2020-12-10 2020-12-10 Weak supervision convolutional neural network image target positioning method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011437759.8A CN112509046B (en) 2020-12-10 2020-12-10 Weak supervision convolutional neural network image target positioning method

Publications (2)

Publication Number Publication Date
CN112509046A (en) 2021-03-16
CN112509046B (en) 2021-09-21

Family

ID=74970670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011437759.8A Active CN112509046B (en) 2020-12-10 2020-12-10 Weak supervision convolutional neural network image target positioning method

Country Status (1)

Country Link
CN (1) CN112509046B (en)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190019020A1 (en) * 2017-07-17 2019-01-17 Open Text Corporation Systems and methods for image based content capture and extraction utilizing deep learning neural network and bounding box detection training techniques
US20200125877A1 (en) * 2018-10-22 2020-04-23 Future Health Works Ltd. Computer based object detection within a video or image
CN110610475A (en) * 2019-07-07 2019-12-24 河北工业大学 Visual defect detection method of deep convolutional neural network
CN110570410A (en) * 2019-09-05 2019-12-13 河北工业大学 Detection method for automatically identifying and detecting weld defects
CN111008633A (en) * 2019-10-17 2020-04-14 安徽清新互联信息科技有限公司 License plate character segmentation method based on attention mechanism
CN111046964A (en) * 2019-12-18 2020-04-21 电子科技大学 Convolutional neural network-based human and vehicle infrared thermal image identification method
CN111563418A (en) * 2020-04-14 2020-08-21 浙江科技学院 Asymmetric multi-mode fusion significance detection method based on attention mechanism
CN111611999A (en) * 2020-05-22 2020-09-01 福建师范大学 Saliency detection method and terminal fusing small-size depth generation model
CN111882560A (en) * 2020-06-16 2020-11-03 北京工业大学 Lung parenchymal CT image segmentation method based on weighted full-convolution neural network
CN112036288A (en) * 2020-08-27 2020-12-04 华中师范大学 Facial expression recognition method based on cross-connection multi-feature fusion convolutional neural network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
SEUNGHAN YANG ET AL.: "Combinational Class Activation Maps for Weakly Supervised Object Localization", arXiv *
YONGXIANG HUANG ET AL.: "Evidence Localization for Pathology Images Using Weakly Supervised Learning", MICCAI 2019 *
仇鹏 (QIU Peng): "Weakly-labeled urban transportation sound recognition and detection based on a CRNN model", China Master's Theses Full-text Database, Engineering Science and Technology II *
谢晓蔚 (XIE Xiaowei) et al.: "Research on multi-target image detection with weakly supervised convolutional neural networks", Journal of Electronic Measurement and Instrumentation *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023088176A1 (en) * 2021-11-18 2023-05-25 International Business Machines Corporation Data augmentation for machine learning
CN115082657A (en) * 2022-04-14 2022-09-20 华南理工大学 Soft erasure-based weak supervision target positioning algorithm
CN114750164A (en) * 2022-05-25 2022-07-15 清华大学深圳国际研究生院 Transparent object grabbing method and system and computer readable storage medium

Also Published As

Publication number Publication date
CN112509046B (en) 2021-09-21

Similar Documents

Publication Publication Date Title
CN112509046B (en) Weak supervision convolutional neural network image target positioning method
CN110287849B (en) Lightweight depth network image target detection method suitable for raspberry pi
WO2021036059A1 (en) Image conversion model training method, heterogeneous face recognition method, device and apparatus
CN112132058B (en) Head posture estimation method, implementation system thereof and storage medium
CN110796143A (en) Scene text recognition method based on man-machine cooperation
CN112801182B (en) RGBT target tracking method based on difficult sample perception
CN111414954B (en) Rock image retrieval method and system
CN103714537A (en) Image saliency detection method
CN113177549B (en) Few-sample target detection method and system based on dynamic prototype feature fusion
CN113554032B (en) Remote sensing image segmentation method based on multi-path parallel network of high perception
WO2023151237A1 (en) Face pose estimation method and apparatus, electronic device, and storage medium
CN115424282A (en) Unstructured text table identification method and system
CN111524117A (en) Tunnel surface defect detection method based on characteristic pyramid network
CN116152254B (en) Industrial leakage target gas detection model training method, detection method and electronic equipment
CN104299241A (en) Remote sensing image significance target detection method and system based on Hadoop
CN117252904B (en) Target tracking method and system based on long-range space perception and channel enhancement
CN115311463A (en) Category-guided multi-scale decoupling marine remote sensing image text retrieval method and system
JP2020127194A (en) Computer system and program
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN112668662B (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN111368637B (en) Transfer robot target identification method based on multi-mask convolutional neural network
CN117454116A (en) Ground carbon emission monitoring method based on multi-source data interaction network
WO2024082602A1 (en) End-to-end visual odometry method and apparatus
CN116109682A (en) Image registration method based on image diffusion characteristics
CN116486203B (en) Single-target tracking method based on twin network and online template updating

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant