CN110675403A - Multi-instance image segmentation method based on coding auxiliary information

Multi-instance image segmentation method based on coding auxiliary information

Info

Publication number
CN110675403A
Authority
CN
China
Prior art keywords
network
loss
spectrum
segmentation
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910814122.7A
Other languages
Chinese (zh)
Other versions
CN110675403B (en)
Inventor
吴庆波
李辉
魏浩冉
吴晨豪
李宏亮
孟凡满
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN201910814122.7A
Publication of CN110675403A
Application granted
Publication of CN110675403B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-instance image segmentation method based on coding auxiliary information, belonging to the technical field of image coding and instance segmentation. The method is proposed to overcome the shortcoming of existing multi-instance image segmentation methods, which use only the original image for instance segmentation. From the input image, an image decoding algorithm yields luminance and chrominance macroblocks of different sizes together with the intra-frame prediction direction information; the resulting coding-unit scale spectrum and intra-frame prediction direction spectrum are then used as coding auxiliary information, so the information contained in the image is exploited more fully. The invention applies the long short-term memory (LSTM) network outside the fields of text classification and natural language processing, using it to fuse the scale spectrum, the direction spectrum and the original image, thereby improving the accuracy of multi-instance image segmentation.

Description

Multi-instance image segmentation method based on coding auxiliary information
Technical Field
The invention relates to the technical field of image coding and instance segmentation, in particular to a multi-instance image segmentation method based on coding auxiliary information.
Background
In the field of computer vision, traditional multi-instance segmentation methods only feed each image of the training set into a segmentation network. The information contained in the images is therefore not fully exploited, the segmentation results improve little, and the accuracy lags noticeably behind that of classification and detection tasks.
Because images contain a large amount of data, the images encountered in practice are compressed for convenient storage. For example, JPEG images are produced by the JPEG compression algorithm, and video is produced by the H.264/HEVC video compression algorithms. Video and image compression algorithms typically go through color-space conversion (RGB to YUV), sampling, block partitioning, the Discrete Cosine Transform (DCT), Zigzag ordering, quantization, and entropy coding. During image or video encoding, the coding and intra-frame prediction directions differ for luminance and chrominance macroblocks of different sizes. Since coding units of the same size, or blocks sharing the same intra-frame prediction direction, indicate strongly correlated information, combining the coding-unit scale information and the intra-frame prediction direction information during multi-instance segmentation takes more image information into account than the traditional approach and helps improve the performance of the multi-instance segmentation network.
Disclosure of Invention
The invention aims to address the shortcoming of existing multi-instance image segmentation methods, which use only the original image for instance segmentation, by providing a multi-instance image segmentation method based on coding auxiliary information. The invention extracts the coding-unit scale information and the intra-frame prediction direction information from the image to obtain the corresponding scale spectrum and direction spectrum as auxiliary information, and fuses the original image with the two feature spectra through a long short-term memory (LSTM) network to perform multi-instance image segmentation.
The invention relates to a multi-instance image segmentation method based on coding auxiliary information, which comprises the following steps:
step 1, setting a segmentation network based on a convolutional neural network:
the segmentation network comprises a feature pyramid network, a proposal generation network, a region-of-interest (ROI) extraction network, a mask prediction network and fully connected layers;
the feature pyramid network is used for feature extraction, and the obtained feature spectrum is input into the proposal generation network and the ROI extraction network respectively;
the proposal generation network is used for generating bounding box proposals and inputting them into the ROI extraction network;
the ROI extraction network is used for extracting regions of interest; classification and bounding-box regression are then performed on the ROIs through two fully connected layers respectively;
the mask prediction network comprises two branches connected to the output of the ROI extraction network;
wherein the first branch comprises four identical convolution layers and a deconvolution layer connected in sequence; the deconvolution layer outputs a feature spectrum of size M×M×K, where M is a preset value and K denotes the number of sample classes;
the second branch comprises two convolution layers, a fully connected layer and a reshape layer connected in sequence; the fully connected layer is used to obtain a 1×M² feature vector, and the reshape layer is used to generate an M×M×1 feature spectrum for predicting foreground and background;
the feature spectra of the two branches of the mask prediction network are concatenated to obtain a prediction mask of size M×M×(K+1);
and the loss function Loss of the segmentation network is Loss = loss_cls + loss_box + loss_mask, where loss_cls, loss_box and loss_mask respectively denote the classification loss, bounding-box regression loss and mask loss of the segmentation network;
step 2, carrying out convolutional neural network training processing on the segmentation network;
collecting training sample pictures and extracting fusion characteristics of the training sample pictures;
initializing the network parameters of the segmentation network and inputting the fusion features of the training sample pictures into the segmentation network; based on the classification output, the bounding-box regression output and the mask prediction network output, the differences from the ground-truth classes, bounding boxes and masks are computed to obtain the classification loss, bounding-box regression loss and mask loss of the segmentation network respectively, thereby obtaining the current loss function Loss;
when the change rate of the Loss function Loss does not exceed a preset threshold value, stopping training, and obtaining a trained segmented network based on the current network parameters of the segmented network;
step 3, extracting the fusion characteristics of the picture to be segmented, inputting the fusion characteristics into a trained segmentation network, and outputting a multi-instance image segmentation result of the picture to be segmented based on classification and frame regression;
the extraction mode of the fusion features is as follows:
performing image decoding (image and video decoding processing) on the picture from which the fusion features are to be extracted, to obtain luminance and chrominance macroblocks of different sizes and the intra-frame prediction direction information; different labels are assigned to coding units of different scales in the image to obtain the coding-unit scale spectrum; different labels are assigned to the different intra-frame prediction directions to obtain the intra-frame prediction direction spectrum of the image;
and the obtained scale spectrum and direction spectrum are fused with the original image through LSTM networks to obtain the fusion features:
the picture from which the fusion features are to be extracted is input into a first LSTM network, and a feature spectrum h1 is obtained from the output of the first LSTM network;
the feature spectrum h1 is concatenated with the coding-unit scale spectrum and input into a second LSTM network, and a feature spectrum h2 is obtained from the output of the second LSTM network;
the feature spectrum h2 is then concatenated with the intra-frame prediction direction spectrum and input into a third LSTM network, and the fusion features of the picture are obtained from the output of the third LSTM network.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
(1) From the input image, the image decoding algorithm yields luminance and chrominance macroblocks of different sizes and extracts the intra-frame prediction direction information, so the obtained coding-unit scale spectrum and intra-frame prediction direction spectrum serve as coding auxiliary information and the information contained in the image is exploited more fully.
(2) The invention applies the LSTM outside the fields of text classification and natural language processing, using it to fuse the scale spectrum, the direction spectrum and the original image, thereby improving the accuracy of multi-instance image segmentation.
Drawings
Fig. 1 is a diagram illustrating an implementation process of the present invention in an embodiment.
FIG. 2 is a schematic diagram of an LSTM feature fusion mode in an embodiment.
Fig. 3 is a block diagram schematically illustrating the structure of the segmentation network according to an embodiment.
Fig. 4 is a block diagram illustrating a Mask branch structure according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
The invention extracts the coding-unit scale information and the intra-frame prediction direction information of the image through image and video decoding algorithms; because the information contained in the image is exploited more fully, the shortcoming of traditional methods that perform instance segmentation using only the original image is avoided. The long short-term memory (LSTM) network is traditionally widely used in text classification and natural language processing; the present invention instead uses the LSTM to fuse different feature spectra.
Referring to fig. 1, the multi-instance image segmentation method based on coding auxiliary information of the present invention comprises four parts: image input, image decoding, LSTM feature fusion and segmentation. That is, the invention mainly comprises an image decoding module, an LSTM feature fusion module and a convolutional neural network multi-instance segmentation module, implemented as follows:
A. Image decoding module: for each training sample picture, luminance and chrominance macroblocks of different sizes (4 × 4, 8 × 8, 16 × 16) and the intra-frame prediction direction information (vertical, horizontal, DC, diagonal down-left, diagonal down-right, vertical-right, horizontal-down, vertical-left and horizontal-up modes) can be obtained through image and video compression decoding algorithms. Different labels (label) are assigned to the 4 × 4, 8 × 8 and 16 × 16 coding units in the image to obtain the coding-unit scale spectrum, and different labels (label) are assigned to the different intra-frame prediction directions to obtain the intra-frame prediction direction spectrum of the image.
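The short Python sketch below illustrates how the two auxiliary spectra might be rasterized from decoded block information. It is an illustrative assumption rather than the patent's decoder: the list of (x, y, size, mode) tuples is a hypothetical decoder interface, and the label values are arbitrary.

```python
# Minimal sketch (not the patent's decoder): given per-block size and intra-prediction
# mode reported by an image/video decoder, rasterize the two auxiliary label maps.
import numpy as np

SIZE_LABELS = {4: 1, 8: 2, 16: 3}                      # one label per coding-unit scale
MODE_LABELS = {m: i + 1 for i, m in enumerate(
    ["vertical", "horizontal", "dc", "diag_down_left", "diag_down_right",
     "vertical_right", "horizontal_down", "vertical_left", "horizontal_up"])}

def build_auxiliary_spectra(height, width, blocks):
    """Return (scale_spectrum, direction_spectrum) as HxW integer label maps."""
    scale_map = np.zeros((height, width), dtype=np.uint8)
    dir_map = np.zeros((height, width), dtype=np.uint8)
    for x, y, size, mode in blocks:                     # blocks: hypothetical decoder output
        scale_map[y:y + size, x:x + size] = SIZE_LABELS[size]
        dir_map[y:y + size, x:x + size] = MODE_LABELS[mode]
    return scale_map, dir_map

# Example: two decoded 8x8 blocks of an 8x16 image region.
blocks = [(0, 0, 8, "dc"), (8, 0, 8, "vertical")]
scale_spectrum, direction_spectrum = build_auxiliary_spectra(8, 16, blocks)
```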
B. LSTM feature fusion module: the scale spectrum and direction spectrum obtained by decoding the image are fused with the original image through LSTM networks. The LSTM is commonly used in the field of natural language processing and handles long-term dependencies well. An LSTM comprises a forget gate, an input gate and an output gate; the original image, the coding-unit scale spectrum and the intra-frame prediction direction spectrum are input into the LSTMs respectively, feature fusion is carried out under the control of the gating signals, and the fused result is finally obtained.
Referring to fig. 2, the specific process of feature fusion using LSTM is as follows:
First, the original image X1 is input into an LSTM network to obtain a feature spectrum h1; h1 is concatenated with the coding-unit scale spectrum X2 and the result is input into an LSTM network to obtain a feature spectrum h2; h2 is then concatenated with the intra-frame prediction direction spectrum X3 and the result is input into an LSTM network to obtain the final fused feature spectrum h3, which is fed to the segmentation network.
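A minimal PyTorch sketch of this three-stage cascade is given below. The patent does not fix how a 2-D image is fed to an LSTM; here each image row is treated as one time step (a convolutional LSTM would be another plausible reading), and the hidden size and channel counts are illustrative only.

```python
# Minimal sketch of the cascaded LSTM fusion, assuming each image row is one time step.
import torch
import torch.nn as nn

class CascadedLSTMFusion(nn.Module):
    def __init__(self, width, hidden=64):
        super().__init__()
        # One LSTM per fusion stage; input size = channels * width of one row.
        self.lstm1 = nn.LSTM(3 * width, hidden, batch_first=True)           # original image X1
        self.lstm2 = nn.LSTM(hidden + width, hidden, batch_first=True)      # h1 ++ scale spectrum X2
        self.lstm3 = nn.LSTM(hidden + width, hidden, batch_first=True)      # h2 ++ direction spectrum X3

    def forward(self, x1, x2, x3):
        # x1: (B, 3, H, W) image; x2, x3: (B, 1, H, W) label spectra.
        b, _, h, w = x1.shape
        rows = lambda t: t.permute(0, 2, 1, 3).reshape(b, h, -1)             # (B, H, C*W) row sequence
        h1, _ = self.lstm1(rows(x1))
        h2, _ = self.lstm2(torch.cat([h1, rows(x2)], dim=-1))
        h3, _ = self.lstm3(torch.cat([h2, rows(x3)], dim=-1))
        return h3                                                            # fused feature spectrum

fuser = CascadedLSTMFusion(width=64)
h3 = fuser(torch.randn(2, 3, 64, 64), torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64))
```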
C. A convolutional neural network multi-instance segmentation module: and (4) putting the fused features obtained by the operation B into a segmentation network for training. And then realizing multi-instance segmentation processing of the image to be segmented based on the trained segmentation network to obtain a corresponding segmentation result.
The framework of the convolutional neural net based segmentation network of the present invention is shown in fig. 3.
C1. FPN feature extraction: the fusion result (h3) is first input into the Feature Pyramid Network (FPN) for feature extraction to obtain a feature spectrum f of size H×W×C, which then passes through convolution layer CONV1 to obtain a feature spectrum f1 of size H1×W1×C1.
The feature extraction process of the feature pyramid network can be found in the document "Feature Pyramid Networks for Object Detection" (Kaiming He et al.).
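For reference, a minimal sketch of this stage using torchvision's off-the-shelf FeaturePyramidNetwork is shown below; the backbone levels and channel sizes are illustrative assumptions, not values taken from the patent.

```python
# Minimal sketch of FPN feature extraction with torchvision; channel sizes are illustrative.
from collections import OrderedDict
import torch
from torchvision.ops import FeaturePyramidNetwork

fpn = FeaturePyramidNetwork(in_channels_list=[64, 128, 256], out_channels=256)
features = OrderedDict(
    c3=torch.randn(1, 64, 64, 64),     # assumed backbone outputs at three scales
    c4=torch.randn(1, 128, 32, 32),
    c5=torch.randn(1, 256, 16, 16),
)
pyramid = fpn(features)                # OrderedDict of 256-channel maps, one per level
f = pyramid["c3"]                      # e.g. the finest-level feature spectrum
```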
C2. Proposal generation: to obtain effective proposals, 9 region proposals, i.e. bounding box proposals, are predicted for each point in the feature spectrum f1. To determine foreground and background, f1 passes through convolution layer CONV2 (convolution kernel size H1×W1×C1, 18 convolution kernels) to obtain a 1×18 output, and a normalized exponential function (Softmax) computes the probabilities that each of the 9 region proposals belongs to the foreground and the background; to obtain the center coordinates, length and width of the bounding boxes, f1 passes through convolution layer CONV3 (convolution kernel size H1×W1×C1, 36 convolution kernels) to obtain a 1×36 output.
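A minimal sketch of such a proposal head is given below. It follows the 18-channel objectness / 36-channel box layout (9 anchors × 2 classes and 9 anchors × 4 coordinates), but uses 1×1 convolutions for per-location prediction, which is a simplifying assumption rather than the kernel dimensions stated in the text.

```python
# Minimal sketch of the proposal heads: objectness scores with Softmax and box parameters.
import torch
import torch.nn as nn

class ProposalHead(nn.Module):
    def __init__(self, in_channels=256, num_anchors=9):
        super().__init__()
        self.cls_conv = nn.Conv2d(in_channels, num_anchors * 2, kernel_size=1)  # CONV2 analogue
        self.box_conv = nn.Conv2d(in_channels, num_anchors * 4, kernel_size=1)  # CONV3 analogue

    def forward(self, f1):
        b, _, h, w = f1.shape
        scores = self.cls_conv(f1).view(b, 9, 2, h, w)
        probs = torch.softmax(scores, dim=2)             # foreground/background probabilities
        boxes = self.box_conv(f1).view(b, 9, 4, h, w)    # center x, center y, width, height
        return probs, boxes

head = ProposalHead()
probs, boxes = head(torch.randn(1, 256, 32, 32))
```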
C3. ROI extraction: the features corresponding to the position coordinates of the generated bounding box proposals are extracted from the feature spectrum.
ROI Align divides the extracted features into N×N rectangular blocks; each rectangular block is further divided into 4 sub-blocks, the center points of the 4 sub-blocks are obtained by bilinear interpolation, and a max-pooling operation is applied over the 4 center points, finally yielding an N×N×256 region of interest (ROI). N is a preset value chosen according to the actual application scenario and requirements.
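A minimal sketch of this step with torchvision's roi_align is shown below: sampling_ratio=2 samples each bin at 2×2 bilinearly interpolated points, matching the 4 sub-blocks described above. Note one difference: torchvision averages the sampled points, whereas the text describes max pooling over them; the feature-map stride and box values here are illustrative assumptions.

```python
# Minimal sketch of ROI extraction with torchvision's roi_align.
import torch
from torchvision.ops import roi_align

feature_map = torch.randn(1, 256, 32, 32)                 # backbone/FPN output, stride assumed 16
proposals = [torch.tensor([[40.0, 60.0, 200.0, 220.0]])]  # one (x1, y1, x2, y2) box per image

N = 7                                                      # preset output resolution
rois = roi_align(feature_map, proposals, output_size=(N, N),
                 spatial_scale=1.0 / 16, sampling_ratio=2)
print(rois.shape)                                          # (num_boxes, 256, N, N)
```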
C4. Classification and bounding box regression: the region of interest is classified and bounding-box regressed through two fully connected layers with parameters (N×N×256 – 1024) and (1×1024 – 1024) respectively.
C5. Mask prediction: the first branch of the mask head first passes through 4 identical convolution layers (conv1, conv2, conv3 and conv4) with convolution kernel size 3×3×256, 256 convolution kernels, convolution stride set to 1 and padding set to 'SAME', so that the resolution of the feature spectrum stays unchanged; a deconvolution layer with factor 2 then outputs a feature spectrum of size M×M×K, where K denotes the number of sample classes;
the second branch connects the output of convolution layer conv3 to convolution layers conv4_FC (kernel size 3×3×256, 256 kernels, stride 1, padding 'SAME') and conv5_FC (kernel size 3×3×256, 128 kernels, stride 1, padding 'SAME'), and then passes through a fully connected layer FC (N×N×128 – M²) to obtain a 1×M² feature vector, which is reshaped into an M×M×1 feature spectrum used to predict background and foreground.
The feature spectra of the two branches are concatenated to obtain a mask of size M×M×(K+1), on which the mask loss is computed. A block diagram of the mask branch is shown in fig. 4.
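A minimal PyTorch sketch of this two-branch mask head follows: an FCN branch (four 3×3 convolutions plus a ×2 deconvolution giving M×M×K) and an FC branch tapping the conv3 output (conv4_FC, conv5_FC, a fully connected layer of size M² reshaped to M×M×1), concatenated into M×M×(K+1). Channel counts follow the text above; the values of M, K and the input ROI size are illustrative assumptions.

```python
# Minimal sketch of the two-branch mask head described above.
import torch
import torch.nn as nn

class MaskHead(nn.Module):
    def __init__(self, in_ch=256, num_classes=80, m=28):
        super().__init__()
        self.m = m
        def conv(cin, cout):
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())
        self.conv1, self.conv2, self.conv3 = conv(in_ch, 256), conv(256, 256), conv(256, 256)
        self.conv4 = conv(256, 256)
        self.deconv = nn.ConvTranspose2d(256, num_classes, 2, stride=2)    # factor-2 upsampling
        self.conv4_fc = conv(256, 256)
        self.conv5_fc = conv(256, 128)
        self.fc = nn.Linear((m // 2) * (m // 2) * 128, m * m)              # 1 x M^2 vector

    def forward(self, roi):                        # roi: (B, 256, M/2, M/2), e.g. 14x14
        x3 = self.conv3(self.conv2(self.conv1(roi)))
        mask_k = self.deconv(self.conv4(x3))       # first branch: (B, K, M, M)
        fc_in = self.conv5_fc(self.conv4_fc(x3))   # second branch taps the conv3 output
        mask_1 = self.fc(fc_in.flatten(1)).view(-1, 1, self.m, self.m)     # reshape to MxMx1
        return torch.cat([mask_k, mask_1], dim=1)  # concatenated: (B, K+1, M, M)

head = MaskHead(num_classes=80, m=28)
out = head(torch.randn(2, 256, 14, 14))            # torch.Size([2, 81, 28, 28])
```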
D. Loss function calculation. The loss function Loss of the whole network consists of three parts: the classification loss loss_cls, the bounding-box regression loss loss_box and the mask loss loss_mask. The overall network loss is therefore:
Loss = loss_cls + loss_box + loss_mask
Training of the segmentation network is driven by the loss function Loss; when Loss no longer changes significantly, i.e. its rate of change does not exceed a preset threshold, training is stopped, and the trained segmentation network is obtained from the current network parameters of the segmentation network.
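The sketch below illustrates the combined loss and the stopping rule based on its rate of change. The individual loss terms (cross-entropy, smooth L1, per-pixel binary cross-entropy) and the threshold value are stand-in assumptions, since the text only specifies the sum and the stopping criterion.

```python
# Minimal sketch of Loss = loss_cls + loss_box + loss_mask and the stopping rule.
import torch
import torch.nn.functional as F

def total_loss(cls_logits, cls_gt, box_pred, box_gt, mask_logits, mask_gt):
    loss_cls = F.cross_entropy(cls_logits, cls_gt)                      # classification loss
    loss_box = F.smooth_l1_loss(box_pred, box_gt)                       # bounding-box regression loss
    loss_mask = F.binary_cross_entropy_with_logits(mask_logits, mask_gt)  # mask loss
    return loss_cls + loss_box + loss_mask

def should_stop(prev_loss, curr_loss, threshold=1e-3):
    """Stop when the relative change of Loss no longer exceeds the preset threshold."""
    return abs(prev_loss - curr_loss) / max(prev_loss, 1e-12) <= threshold
```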
According to the invention, the image decoding algorithm applied to the input image yields luminance and chrominance macroblocks of different sizes and extracts the intra-frame prediction direction information, so the obtained coding-unit scale spectrum and intra-frame prediction direction spectrum serve as coding auxiliary information and the information contained in the image can be fully exploited. By using the LSTM outside the fields of text classification and natural language processing, and fusing the scale spectrum, direction spectrum and original image with the LSTM, the accuracy of multi-instance image segmentation can be improved.
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except mutually exclusive features and/or steps.

Claims (3)

1. A multi-instance image segmentation method based on coding auxiliary information, comprising the steps of:
step 1, setting a segmentation network based on a convolutional neural network:
the segmentation network comprises a feature pyramid network, a proposal generation network, a region-of-interest (ROI) extraction network, a mask prediction network and fully connected layers;
the feature pyramid network is used for feature extraction, and the obtained feature spectrum is input into the proposal generation network and the ROI extraction network respectively;
the proposal generation network is used for generating bounding box proposals and inputting them into the ROI extraction network;
the ROI extraction network is used for extracting regions of interest; classification and bounding-box regression are then performed on the ROIs through two fully connected layers respectively;
the mask prediction network comprises two branches connected to the output of the ROI extraction network;
wherein the first branch comprises four identical convolution layers and a deconvolution layer connected in sequence; the deconvolution layer outputs a feature spectrum of size M×M×K, where M is a preset value and K denotes the number of sample classes;
the second branch comprises two convolution layers, a fully connected layer and a reshape layer connected in sequence; the fully connected layer is used to obtain a 1×M² feature vector, and the reshape layer is used to generate an M×M×1 feature spectrum for predicting foreground and background;
the feature spectra of the two branches of the mask prediction network are concatenated to obtain a prediction mask of size M×M×(K+1);
and the loss function Loss of the segmentation network is Loss = loss_cls + loss_box + loss_mask, where loss_cls, loss_box and loss_mask respectively denote the classification loss, bounding-box regression loss and mask loss of the segmentation network;
step 2, carrying out convolutional neural network training processing on the segmentation network;
collecting training sample pictures and extracting fusion characteristics of the training sample pictures;
initializing the network parameters of the segmentation network and inputting the fusion features of the training sample pictures into the segmentation network; based on the classification output, the bounding-box regression output and the mask prediction network output, the differences from the ground-truth classes, bounding boxes and masks are computed to obtain the classification loss, bounding-box regression loss and mask loss of the segmentation network respectively, thereby obtaining the current loss function Loss;
when the change rate of the Loss function Loss does not exceed a preset threshold value, stopping training, and obtaining a trained segmented network based on the current network parameters of the segmented network;
step 3, extracting the fusion characteristics of the picture to be segmented, inputting the fusion characteristics into a trained segmentation network, and outputting a multi-instance image segmentation result of the picture to be segmented based on classification and frame regression;
the extraction mode of the fusion features is as follows:
performing image decoding processing on the picture from which the fusion features are to be extracted, to obtain luminance and chrominance macroblocks of different sizes and the intra-frame prediction direction information; different labels are assigned to coding units of different scales to obtain the coding-unit scale spectrum; different labels are assigned to the different intra-frame prediction directions to obtain the intra-frame prediction direction spectrum of the image;
and the obtained scale spectrum and direction spectrum are fused with the original image through LSTM networks to obtain the fusion features:
the picture from which the fusion features are to be extracted is input into a first LSTM network, and a feature spectrum h1 is obtained from the output of the first LSTM network;
the feature spectrum h1 is concatenated with the coding-unit scale spectrum and input into a second LSTM network, and a feature spectrum h2 is obtained from the output of the second LSTM network;
the feature spectrum h2 is then concatenated with the intra-frame prediction direction spectrum and input into a third LSTM network, and the fusion features of the picture are obtained from the output of the third LSTM network.
2. The method of claim 1, wherein, when the fusion features are extracted, the luminance and chrominance macroblocks have three sizes: 4×4, 8×8 and 16×16;
the intra-frame prediction directions include the vertical, horizontal, DC, diagonal down-left, diagonal down-right, vertical-right, horizontal-down, vertical-left and horizontal-up modes.
3. The method of claim 1, wherein the processing procedure of the ROI extraction network is:
extracting the corresponding features from the feature spectrum output by the feature pyramid network based on the bounding box proposals generated by the proposal generation network;
dividing the extracted features into N×N rectangular blocks, dividing each rectangular block into 4 sub-blocks, obtaining the center points of the 4 sub-blocks by bilinear interpolation, and performing a max-pooling operation on the 4 center points to obtain an N×N×256 region of interest, where N is a preset value.
CN201910814122.7A 2019-08-30 2019-08-30 Multi-instance image segmentation method based on coding auxiliary information Active CN110675403B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910814122.7A CN110675403B (en) 2019-08-30 2019-08-30 Multi-instance image segmentation method based on coding auxiliary information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910814122.7A CN110675403B (en) 2019-08-30 2019-08-30 Multi-instance image segmentation method based on coding auxiliary information

Publications (2)

Publication Number Publication Date
CN110675403A true CN110675403A (en) 2020-01-10
CN110675403B CN110675403B (en) 2022-05-03

Family

ID=69075852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910814122.7A Active CN110675403B (en) 2019-08-30 2019-08-30 Multi-instance image segmentation method based on coding auxiliary information

Country Status (1)

Country Link
CN (1) CN110675403B (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190138826A1 (en) * 2016-11-14 2019-05-09 Zoox, Inc. Spatial and Temporal Information for Semantic Segmentation
CN107123123A (en) * 2017-05-02 2017-09-01 电子科技大学 Image segmentation quality evaluating method based on convolutional neural networks
CN107403430A (en) * 2017-06-15 2017-11-28 中山大学 A kind of RGBD image, semantics dividing method
CN107909015A (en) * 2017-10-27 2018-04-13 广东省智能制造研究所 Hyperspectral image classification method based on convolutional neural networks and empty spectrum information fusion
CN110175974A (en) * 2018-03-12 2019-08-27 腾讯科技(深圳)有限公司 Image significance detection method, device, computer equipment and storage medium
CN108898137A (en) * 2018-05-25 2018-11-27 黄凯 A kind of natural image character identifying method and system based on deep neural network
CN109255298A (en) * 2018-08-07 2019-01-22 南京工业大学 Safety helmet detection method and system in dynamic background

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
DONGQING ZHANG: "A multi-level convolutional LSTM model for the segmentation of left ventricle myocardium in infarcted porcine cine MR images", 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018) *
FANMAN MENG: "Weakly Supervised Part Proposal Segmentation From Multiple Images", IEEE Transactions on Image Processing, Volume 26, Issue 8, Aug. 2017 *
JIANAN LI: "Scale-aware Fast R-CNN for Pedestrian Detection", IEEE Transactions on Multimedia *
WEN SHI: "Segmentation quality evaluation based on multi-scale convolutional neural networks", 2017 IEEE Visual Communications and Image Processing (VCIP) *
MENG FANMAN (孟凡满): "Research on the theory and methods of image co-segmentation" (图像的协同分割理论与方法研究), China Doctoral Dissertations Full-text Database (Information Science and Technology) *
DENG SHUO (邓朔): "Research on fast graph cut algorithms based on pre-segmentation information fusion" (基于预分割信息融合的快速图割算法研究), China Master's Theses Full-text Database (Information Science and Technology) *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111260666A (en) * 2020-01-19 2020-06-09 上海商汤临港智能科技有限公司 Image processing method and device, electronic equipment and computer readable storage medium
CN111260666B (en) * 2020-01-19 2022-05-24 上海商汤临港智能科技有限公司 Image processing method and device, electronic equipment and computer readable storage medium
CN111369565A (en) * 2020-03-09 2020-07-03 麦克奥迪(厦门)医疗诊断系统有限公司 Digital pathological image segmentation and classification method based on graph convolution network
CN111369565B (en) * 2020-03-09 2023-09-15 麦克奥迪(厦门)医疗诊断系统有限公司 Digital pathological image segmentation and classification method based on graph convolution network
CN111881981A (en) * 2020-07-29 2020-11-03 苏州科本信息技术有限公司 Mask coding-based single-stage instance segmentation method
CN112651982A (en) * 2021-01-12 2021-04-13 杭州智睿云康医疗科技有限公司 Image segmentation method and system based on image and non-image information
GB2606816A (en) * 2021-02-16 2022-11-23 Nvidia Corp Using neural networks to perform object detection, instance segmentation, and semantic correspondence from bounding box supervision
CN113870371A (en) * 2021-12-03 2021-12-31 浙江霖研精密科技有限公司 Picture color transformation device and method based on generation countermeasure network and storage medium

Also Published As

Publication number Publication date
CN110675403B (en) 2022-05-03

Similar Documents

Publication Publication Date Title
CN110675403B (en) Multi-instance image segmentation method based on coding auxiliary information
Hu et al. Towards coding for human and machine vision: A scalable image coding approach
CN111868751B (en) Using non-linear functions applied to quantization parameters in machine learning models for video coding
CN109886282B (en) Object detection method, device, computer-readable storage medium and computer equipment
CN112379231B (en) Equipment detection method and device based on multispectral image
Wang et al. Towards analysis-friendly face representation with scalable feature and texture compression
CN111311578A (en) Object classification method and device based on artificial intelligence and medical imaging equipment
DE202012013410U1 (en) Image compression with SUB resolution images
EP3718306B1 (en) Cluster refinement for texture synthesis in video coding
Lin et al. Generative adversarial network-based frame extrapolation for video coding
KR102342526B1 (en) Method and Apparatus for Video Colorization
CN114627269A (en) Virtual reality security protection monitoring platform based on degree of depth learning target detection
CN112200817A (en) Sky region segmentation and special effect processing method, device and equipment based on image
Zonglei et al. Deep compression: A compression technology for apron surveillance video
CN114067009A (en) Image processing method and device based on Transformer model
Li et al. ROI-based deep image compression with Swin transformers
WO2023020513A1 (en) Method, device, and medium for generating super-resolution video
JP4490351B2 (en) Inter-layer prediction processing method, inter-layer prediction processing apparatus, inter-layer prediction processing program, and recording medium therefor
Wang et al. A feature-based video transmission framework for visual IoT in fog computing systems
US20230135978A1 (en) Generating alpha mattes for digital images utilizing a transformer-based encoder-decoder
US20230342986A1 (en) Autoencoder-based segmentation mask generation in an alpha channel
CN115294429A (en) Feature domain network training method and device
CN114565764A (en) Port panorama sensing system based on ship instance segmentation
CN114040140B (en) Video matting method, device, system and storage medium
Jin et al. Fast QTBT partition algorithm for JVET intra coding based on CNN

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant