WO2018076212A1 - Scene semantic segmentation method based on a deconvolutional neural network - Google Patents

Scene semantic segmentation method based on a deconvolutional neural network

Info

Publication number
WO2018076212A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
neural network
picture
scene
semantic segmentation
Prior art date
Application number
PCT/CN2016/103425
Other languages
English (en)
Chinese (zh)
Inventor
黄凯奇
赵鑫
程衍华
Original Assignee
中国科学院自动化研究所
Priority date
Filing date
Publication date
Application filed by 中国科学院自动化研究所 (Institute of Automation, Chinese Academy of Sciences)
Priority to PCT/CN2016/103425
Publication of WO2018076212A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis

Definitions

  • The invention relates to the fields of pattern recognition, machine learning and computer vision, and in particular to a scene semantic segmentation method based on a deconvolutional neural network.
  • Scene semantic segmentation uses a computer to intelligently analyze an image and determine the object category to which each pixel belongs, such as floor, wall, person or chair.
  • Traditional scene semantic segmentation algorithms generally rely only on RGB (red, green and blue) information, which makes them susceptible to illumination changes, object color changes and background noise. They are therefore not robust in practical use, and their accuracy struggles to meet user needs.
  • The present invention is directed to the above problems in the prior art and proposes a scene semantic segmentation method based on a deconvolutional neural network, so as to improve the accuracy of scene semantic segmentation.
  • The scene semantic segmentation method based on a deconvolutional neural network of the present invention comprises the following steps:
  • Step S1: extracting a dense feature representation from the scene picture using a fully convolutional neural network;
  • Step S2: using a locally sensitive deconvolutional neural network together with the local affinity matrix of the picture, upsampling and refining the dense feature representation obtained in step S1 to obtain a score map of the picture, thereby achieving fine semantic segmentation of the scene.
  • The local affinity matrix is obtained by extracting SIFT (scale-invariant feature transform) features, SPIN (spin image, as in "Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes") features and gradient features of the picture, and then applying the ucm-gPb (contour detection and hierarchical image segmentation) algorithm.
  • The locally sensitive deconvolutional neural network is formed by stacking three modules multiple times: a locally sensitive unpooling layer, a deconvolution layer, and a locally sensitive average pooling layer.
  • The number of stackings is 2 or 3.
  • The output of the locally sensitive unpooling layer is obtained by a formula in which:
  • x represents the feature vector of a pixel in the input feature map;
  • (i, j) and (o, o) represent an arbitrary position and the center position in the local affinity matrix, respectively;
  • Y = {Y_(i,j)} is the feature map output by the unpooling layer.
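The formula itself appears only as an image in the published text. Purely as a hedged reconstruction inferred from the variable definitions above (an assumption, not the verbatim published expression), the unpooling rule can be stated as:

```latex
% Assumed form of the locally sensitive unpooling rule (illustration only):
% the feature vector x of a pooled pixel is propagated to output position (i, j)
% only when its binarized affinity with the window centre (o, o) equals 1.
Y_{i,j} = A\big((i,j),(o,o)\big)\cdot x,
\qquad A\big((i,j),(o,o)\big)\in\{0,1\}.
```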
  • The scene picture includes an RGB picture and a depth picture.
  • The method further includes step S3: fusing and optimizing the RGB score map and the depth score map through a switch gate fusion layer, thereby achieving finer scene semantic segmentation.
  • The switch gate fusion layer includes a concatenation layer, a convolution layer, and a normalization layer.
  • The convolution layer is implemented by a function in which P_rgb ∈ R^(c×h×w) is the score map predicted from the RGB data, P_depth ∈ R^(c×h×w) is the score map predicted from the depth data, W ∈ R^(c×2c×1×1) is the filter learned by the switch gate fusion layer, and C ∈ R^(c×h×w) is the contribution coefficient matrix output by the convolution.
  • The normalization layer is implemented by a sigmoid function (an S-shaped function, also referred to as an S-shaped growth curve).
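The claim describes the fusion layer only by its components. As a hedged illustration consistent with the concatenation layer, the 1×1 filter W and the sigmoid normalization defined above (an assumption, not the verbatim published expression), the contribution coefficients can be written as:

```latex
% Assumed composition of the switch gate fusion layer (illustration only):
% channel-wise concatenation, 1x1 convolution with the learned filter W, then sigmoid.
C = \sigma\!\big(W * [\,P_{\mathrm{rgb}}\,;\,P_{\mathrm{depth}}\,]\big),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}} .
```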
  • The locally sensitive deconvolutional neural network uses local low-level information to strengthen the sensitivity of the fully convolutional neural network to local edges, thereby obtaining more accurate scene segmentation. This effectively overcomes an inherent defect of the fully convolutional neural network, namely that aggregating a large amount of context information for scene segmentation blurs object edges.
  • By designing the switch gate fusion layer, the different roles of the RGB and depth modalities for different objects in different scenes can be learned automatically and effectively.
  • This dynamically adaptive contribution coefficient is superior to the undifferentiated treatment used by traditional algorithms and can further improve scene segmentation accuracy.
  • Figure 1 is a flow chart of one embodiment of the method of the present invention.
  • FIG. 2 is a schematic diagram of the fully convolutional neural network used for dense feature extraction in the present invention.
  • FIG. 3a is a schematic diagram of a locally sensitive deconvolutional neural network according to an embodiment of the present invention.
  • FIG. 3b is a schematic diagram of a locally sensitive unpooling layer and a locally sensitive average pooling layer according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a switch gate fusion layer according to an embodiment of the present invention.
  • A scene semantic segmentation method based on a deconvolutional neural network includes the following steps:
  • Step S1: extracting a low-resolution dense feature representation from the scene picture using the fully convolutional neural network;
  • Step S2: using a locally sensitive deconvolutional neural network together with the local affinity matrix of the picture, upsampling and refining the dense feature representation obtained in step S1 to obtain a score map of the picture, thereby achieving fine semantic segmentation of the scene.
  • The present invention employs a fully convolutional neural network to efficiently extract dense features of a picture, which may be an RGB picture and/or a depth picture.
  • Through multiple convolution, downsampling and max-pooling operations, the fully convolutional neural network aggregates context information to characterize each pixel in the picture, obtaining an RGB feature map S1 and/or a depth feature map S1.
  • However, the fully convolutional neural network yields a low-resolution feature map with very blurred edges.
  • The present invention therefore embeds low-level pixel information into the deconvolutional neural network to guide the training of the network.
  • The locally sensitive deconvolutional neural network performs upsampling learning and object edge refinement to obtain an RGB score map S2 and/or a depth score map S2, thereby achieving finer scene semantic segmentation.
  • In step S2, the similarity between each pixel in the picture and its neighboring pixels is first calculated to obtain a binarized local affinity matrix.
  • Specifically, SIFT, SPIN and gradient features of the RGB and depth pictures can be extracted, and the local affinity matrix is obtained using the ucm-gPb algorithm, as sketched below.
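The patent does not give this step as code. As a hedged sketch, and assuming the ucm-gPb step (or any contour/segmentation method) has already produced a per-pixel region label map, a binarized local affinity matrix over a k×k neighborhood could be built as follows; the function name, label-map input and array layout are illustrative assumptions, not part of the published method:

```python
import numpy as np

def binarized_local_affinity(labels: np.ndarray, k: int = 3) -> np.ndarray:
    """Hedged sketch: aff[di, dj, i, j] = 1 if pixel (i, j) and the neighbour at
    offset (di - k//2, dj - k//2) fall in the same region of the label map
    (assumed to come from ucm-gPb or a similar contour/segmentation step)."""
    h, w = labels.shape
    r = k // 2
    padded = np.pad(labels, r, mode="edge")           # replicate borders so every pixel has k*k neighbours
    aff = np.zeros((k, k, h, w), dtype=np.uint8)
    for di in range(k):
        for dj in range(k):
            neighbour = padded[di:di + h, dj:dj + w]  # neighbour plane at this offset
            aff[di, dj] = (neighbour == labels).astype(np.uint8)
    return aff
```

In this sketch the centre plane aff[r, r] is identically 1 (every pixel agrees with itself), while off-centre planes drop to 0 across region boundaries, which is what lets the later layers respect object edges.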
  • The network structure can include three modules: a locally sensitive unpooling layer, a deconvolution layer, and a locally sensitive average pooling layer.
  • The inputs of the locally sensitive unpooling layer are the feature map response of the previous layer and the local affinity matrix; the output is a feature map response at twice the resolution.
  • The main function of this layer is to learn to restore richer details of the original picture and produce segmentation with clearer object edges.
  • The output of the locally sensitive unpooling layer can be obtained by the formula described above, where x represents the feature vector of a pixel in the feature map; an illustrative sketch follows.
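As a hedged, runnable illustration of the behaviour described above, the sketch below assumes a 2× unpooling in which a pixel's feature vector is copied only to those cells of its 2×2 output block whose binarized affinity is 1; the function name and array layout are assumptions, not the published formula:

```python
import numpy as np

def locally_sensitive_unpool2x(feat: np.ndarray, aff: np.ndarray) -> np.ndarray:
    """Hedged sketch of locally sensitive unpooling.

    feat : (c, h, w)    low-resolution feature map from the previous layer.
    aff  : (2, 2, h, w) binary affinities between each 2x2 output cell and the
                        corresponding low-resolution pixel (1 keeps, 0 suppresses).
    Returns a (c, 2h, 2w) feature map at twice the resolution.
    """
    c, h, w = feat.shape
    out = np.zeros((c, 2 * h, 2 * w), dtype=feat.dtype)
    for di in range(2):
        for dj in range(2):
            # broadcast the (h, w) affinity plane over the c channels
            out[:, di::2, dj::2] = feat * aff[di, dj]
    return out
```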
  • The input of the deconvolution layer is the output of the preceding unpooling layer, and the output is a feature map response at the same resolution.
  • This layer is mainly used to smooth the feature map: the unpooling layer tends to produce many broken object edges, and the deconvolution process can learn to stitch these breaks together.
  • Deconvolution is the inverse operation of convolution, mapping each response value to multiple output responses; the response map after deconvolution becomes relatively smoother.
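The deconvolution filters themselves are learned during training and are not given in the text. The sketch below only illustrates the smoothing role described above, substituting a fixed k×k averaging kernel for the learned filters (an assumption made purely for illustration):

```python
import numpy as np

def smooth_like_deconv(feat: np.ndarray, k: int = 3) -> np.ndarray:
    """Fixed-kernel stand-in for the deconvolution layer's smoothing effect:
    a same-resolution k x k average applied channel-wise to stitch the broken
    edges left by unpooling (the real layer learns its filters instead)."""
    c, h, w = feat.shape
    r = k // 2
    padded = np.pad(feat, ((0, 0), (r, r), (r, r)), mode="edge")
    out = np.zeros_like(feat)
    for di in range(k):
        for dj in range(k):
            out += padded[:, di:di + h, dj:dj + w]
    return out / (k * k)
```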
  • The inputs of the locally sensitive average pooling layer are the output of the preceding deconvolution layer and the local affinity matrix; the output is a feature map response at the same resolution.
  • This layer is mainly used to obtain a more robust feature representation for each pixel while maintaining sensitivity to object edges.
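As a hedged sketch of that behaviour (names and array layout are assumptions), each pixel's feature can be averaged only over the neighbours whose binarized affinity with it is 1, so the average never mixes features across an object boundary:

```python
import numpy as np

def locally_sensitive_avg_pool(feat: np.ndarray, aff: np.ndarray) -> np.ndarray:
    """Hedged sketch of locally sensitive average pooling.

    feat : (c, h, w)    feature map from the preceding deconvolution layer.
    aff  : (k, k, h, w) binary affinities between each pixel and its k x k
                        neighbours; only neighbours with affinity 1 enter the average.
    Returns a (c, h, w) feature map of equal resolution.
    """
    c, h, w = feat.shape
    k = aff.shape[0]
    r = k // 2
    padded = np.pad(feat, ((0, 0), (r, r), (r, r)), mode="edge")
    acc = np.zeros_like(feat)
    cnt = np.zeros((h, w), dtype=feat.dtype)
    for di in range(k):
        for dj in range(k):
            acc += padded[:, di:di + h, dj:dj + w] * aff[di, dj]
            cnt += aff[di, dj]
    return acc / np.maximum(cnt, 1)   # guard against positions with no qualifying neighbour
```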
  • The invention stacks the locally sensitive unpooling layer, the deconvolution layer and the locally sensitive average pooling layer multiple times, gradually upsampling and refining the details of the scene segmentation to obtain a finer and more accurate segmentation result.
  • The number of stackings is 2 or 3: the more stackings, the finer and more accurate the segmentation, but the larger the amount of computation.
  • RGB color information and depth information describe different modalities of the objects in the scene.
  • RGB images describe the appearance, color and texture of an object, while depth data provides its spatial geometry, shape and size. Effectively fusing these two complementary sources of information can improve the accuracy of scene semantic segmentation.
  • Existing methods basically treat the data of the two modalities equivalently and cannot distinguish their different contributions when identifying different objects in different scenes.
  • The RGB score map and the depth score map obtained in steps S1 and S2 above are fused and optimized through the switch gate fusion layer to obtain a fused score map, thereby realizing more detailed scene semantic segmentation, as shown in FIG. 4.
  • The switch gate fusion layer can effectively measure the importance of RGB (appearance) and depth (shape) information for identifying different objects in different scenes.
  • The switch gate fusion layer of the present invention is mainly composed of a concatenation layer, a convolution layer and a normalization layer. It can automatically learn the weights of the two modalities, so that their complementary information is better exploited in scene semantic segmentation.
  • The features obtained by the RGB and depth networks are first concatenated through the concatenation layer.
  • Next comes the convolution operation, in which the convolution layer learns the weight matrix of the RGB and depth information.
  • The convolution process can be implemented as follows, where:
  • P_rgb ∈ R^(c×h×w) (a feature map of c channels, each of height h and width w) is the score map predicted from the RGB data;
  • P_depth ∈ R^(c×h×w) (parameter meanings as above) is the score map predicted from the depth data;
  • W ∈ R^(c×2c×1×1) (c sub-filters, each a 2c×1×1 three-dimensional matrix) is the filter learned by the switch gate fusion layer;
  • C ∈ R^(c×h×w) is the contribution coefficient matrix output by the convolution;
  • ⊙ denotes element-wise (point-wise) matrix multiplication. The weighted RGB and depth scores are added to give the final fused score map, and the semantic segmentation result is obtained from this final score map (a hedged sketch of the fusion follows).
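The fusion expressions themselves appear only as images in the published text. The sketch below is a hedged assumption consistent with the definitions above: channel-wise concatenation, a 1×1 convolution with the learned filter W (written as a per-pixel matrix product), sigmoid normalization to obtain the contribution coefficients C, and a weighted sum of the two score maps. In particular, weighting the depth score by (1 - C) is an assumption, not a quotation of the claim:

```python
import numpy as np

def sigmoid(z: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-z))

def switch_gate_fusion(p_rgb: np.ndarray, p_depth: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Hedged sketch of the switch gate fusion layer.

    p_rgb, p_depth : (c, h, w) score maps predicted from RGB and depth data.
    w              : (c, 2c) weights of the 1x1 convolution; a 1x1 spatial
                     kernel reduces to a per-pixel matrix multiplication.
    Returns the fused (c, h, w) score map.
    """
    concat = np.concatenate([p_rgb, p_depth], axis=0)       # concatenation layer -> (2c, h, w)
    gate = sigmoid(np.einsum("ij,jhw->ihw", w, concat))     # 1x1 conv + sigmoid -> C in (0, 1)
    # Assumed weighting: C scales the RGB score, (1 - C) the depth score,
    # and the two weighted scores are added to give the final fused map.
    return gate * p_rgb + (1.0 - gate) * p_depth

# The per-pixel class labels then follow from the fused score map, e.g.:
#   labels = switch_gate_fusion(p_rgb, p_depth, w).argmax(axis=0)
```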
  • The new locally sensitive deconvolutional neural network proposed by the present invention can be used for RGB-D indoor scene semantic segmentation.
  • The invention adapts well to illumination changes, background noise, small objects and occlusion in indoor scenes, and can more effectively exploit the complementarity of RGB and depth to obtain scene semantic segmentation that is more robust, more accurate, and better preserves object edges.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a scene semantic segmentation method based on a deconvolutional neural network. The method comprises the steps of: S1, extracting a dense feature representation for a scene picture using a fully convolutional neural network; and S2, performing upsampling learning and object edge refinement on the dense feature representation obtained in step S1 by means of a locally sensitive deconvolutional neural network using a local affinity matrix of the picture, so as to obtain a score map of the picture and thereby achieve refined scene semantic segmentation. By means of the locally sensitive deconvolutional neural network, the sensitivity of the fully convolutional neural network to local edges is strengthened using local low-level information, so that more accurate scene segmentation is obtained.
PCT/CN2016/103425 2016-10-26 2016-10-26 Scene semantic segmentation method based on a deconvolutional neural network WO2018076212A1

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/103425 WO2018076212A1 2016-10-26 2016-10-26 Scene semantic segmentation method based on a deconvolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/103425 WO2018076212A1 2016-10-26 2016-10-26 Scene semantic segmentation method based on a deconvolutional neural network

Publications (1)

Publication Number Publication Date
WO2018076212A1 2018-05-03

Family

ID=62023002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/103425 WO2018076212A1 2016-10-26 2016-10-26 Scene semantic segmentation method based on a deconvolutional neural network

Country Status (1)

Country Link
WO (1) WO2018076212A1 (fr)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105389798A (zh) * 2015-10-19 2016-03-09 西安电子科技大学 基于反卷积网络与映射推理网络的sar图像分割方法
CN105427313A (zh) * 2015-11-23 2016-03-23 西安电子科技大学 基于反卷积网络和自适应推理网络的sar图像分割方法
CN105608692A (zh) * 2015-12-17 2016-05-25 西安电子科技大学 基于反卷积网络和稀疏分类的极化sar图像分割方法
CN105488809A (zh) * 2016-01-14 2016-04-13 电子科技大学 基于rgbd描述符的室内场景语义分割方法

Cited By (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10964061B2 (en) 2017-10-06 2021-03-30 Nvidia Corporation Learning-based camera pose estimation from images of an environment
US10692244B2 (en) 2017-10-06 2020-06-23 Nvidia Corporation Learning based camera pose estimation from images of an environment
CN110874841B (zh) * 2018-09-04 2023-08-29 斯特拉德视觉公司 参照边缘图像的客体检测方法及装置
CN110874841A (zh) * 2018-09-04 2020-03-10 斯特拉德视觉公司 参照边缘图像的客体检测方法及装置
CN109543502A (zh) * 2018-09-27 2019-03-29 天津大学 一种基于深度多尺度神经网络的语义分割方法
CN109543502B (zh) * 2018-09-27 2023-06-06 天津大学 一种基于深度多尺度神经网络的语义分割方法
CN111192271A (zh) * 2018-11-14 2020-05-22 银河水滴科技(北京)有限公司 一种图像分割方法及装置
CN111192271B (zh) * 2018-11-14 2023-08-22 银河水滴科技(北京)有限公司 一种图像分割方法及装置
CN109522966B (zh) * 2018-11-28 2022-09-27 中山大学 一种基于密集连接卷积神经网络的目标检测方法
CN109522966A (zh) * 2018-11-28 2019-03-26 中山大学 一种基于密集连接卷积神经网络的目标检测方法
CN109785435A (zh) * 2019-01-03 2019-05-21 东易日盛家居装饰集团股份有限公司 一种墙体重建方法及装置
CN111488880A (zh) * 2019-01-25 2020-08-04 斯特拉德视觉公司 用于提高利用边缘损失来检测事件的分割性能的方法装置
CN111488880B (zh) * 2019-01-25 2023-04-18 斯特拉德视觉公司 用于提高利用边缘损失来检测事件的分割性能的方法装置
CN109902755A (zh) * 2019-03-05 2019-06-18 南京航空航天大学 一种用于xct切片的多层信息分享与纠正方法
CN110427953A (zh) * 2019-06-21 2019-11-08 中南大学 基于卷积神经网络和序列匹配的在变化环境中让机器人进行视觉地点识别的实现方法
CN110427953B (zh) * 2019-06-21 2022-11-29 中南大学 基于卷积神经网络和序列匹配的在变化环境中让机器人进行视觉地点识别的实现方法
CN110458939A (zh) * 2019-07-24 2019-11-15 大连理工大学 基于视角生成的室内场景建模方法
CN110458939B (zh) * 2019-07-24 2022-11-18 大连理工大学 基于视角生成的室内场景建模方法
CN110929613A (zh) * 2019-11-14 2020-03-27 上海眼控科技股份有限公司 一种智能交通违法审核的图像筛选算法
CN110826702A (zh) * 2019-11-18 2020-02-21 方玉明 一种多任务深度网络的异常事件检测方法
CN111259901A (zh) * 2020-01-13 2020-06-09 镇江优瞳智能科技有限公司 一种利用空间信息提升语义分割精度的高效方法
CN111242027A (zh) * 2020-01-13 2020-06-05 北京工业大学 一种融合语义信息的无监督学习场景特征快速提取方法
CN111242027B (zh) * 2020-01-13 2023-04-14 北京工业大学 一种融合语义信息的无监督学习场景特征快速提取方法
CN111311611A (zh) * 2020-02-17 2020-06-19 清华大学深圳国际研究生院 一种实时三维大场景多对象实例分割的方法
CN111311611B (zh) * 2020-02-17 2023-04-18 清华大学深圳国际研究生院 一种实时三维大场景多对象实例分割的方法
CN111563507B (zh) * 2020-04-14 2024-01-12 浙江科技学院 一种基于卷积神经网络的室内场景语义分割方法
CN111563507A (zh) * 2020-04-14 2020-08-21 浙江科技学院 一种基于卷积神经网络的室内场景语义分割方法
CN111723810B (zh) * 2020-05-11 2022-09-16 北京航空航天大学 一种场景识别任务模型的可解释性方法
CN111723810A (zh) * 2020-05-11 2020-09-29 北京航空航天大学 一种场景识别任务模型的可解释性方法
CN111931689B (zh) * 2020-08-26 2021-04-23 北京建筑大学 一种在线提取视频卫星数据鉴别特征的方法
CN111931689A (zh) * 2020-08-26 2020-11-13 北京建筑大学 一种在线提取视频卫星数据鉴别特征的方法
CN112085747B (zh) * 2020-09-08 2023-07-21 中科(厦门)数据智能研究院 一种基于局部关系指导的图像分割方法
CN112085747A (zh) * 2020-09-08 2020-12-15 中国科学院计算技术研究所厦门数据智能研究院 一种基于局部关系指导的图像分割方法
CN112164078B (zh) * 2020-09-25 2024-03-15 上海海事大学 基于编码器-解码器的rgb-d多尺度语义分割方法
CN112164078A (zh) * 2020-09-25 2021-01-01 上海海事大学 基于编码器-解码器的rgb-d多尺度语义分割方法
CN112381948B (zh) * 2020-11-03 2022-11-29 上海交通大学烟台信息技术研究院 一种基于语义的激光条纹中心线提取及拟合方法
CN112381948A (zh) * 2020-11-03 2021-02-19 上海交通大学烟台信息技术研究院 一种基于语义的激光条纹中心线提取及拟合方法
CN113239891A (zh) * 2021-06-09 2021-08-10 上海海事大学 基于深度学习的场景分类系统及方法
CN113658200A (zh) * 2021-07-29 2021-11-16 东北大学 基于自适应特征融合的边缘感知图像语义分割方法
CN113658200B (zh) * 2021-07-29 2024-01-02 东北大学 基于自适应特征融合的边缘感知图像语义分割方法
CN114332473A (zh) * 2021-09-29 2022-04-12 腾讯科技(深圳)有限公司 目标检测方法、装置、计算机设备、存储介质及程序产品
CN115496975B (zh) * 2022-08-29 2023-08-18 锋睿领创(珠海)科技有限公司 辅助加权数据融合方法、装置、设备及存储介质
CN115496975A (zh) * 2022-08-29 2022-12-20 锋睿领创(珠海)科技有限公司 辅助加权数据融合方法、装置、设备及存储介质
CN115546271B (zh) * 2022-09-29 2023-08-22 锋睿领创(珠海)科技有限公司 基于深度联合表征的视觉分析方法、装置、设备及介质
CN115546271A (zh) * 2022-09-29 2022-12-30 锋睿领创(珠海)科技有限公司 基于深度联合表征的视觉分析方法、装置、设备及介质
CN116051830B (zh) * 2022-12-20 2023-06-20 中国科学院空天信息创新研究院 一种面向跨模态数据融合的对比语义分割方法
CN116051830A (zh) * 2022-12-20 2023-05-02 中国科学院空天信息创新研究院 一种面向跨模态数据融合的对比语义分割方法
CN115953666A (zh) * 2023-03-15 2023-04-11 国网湖北省电力有限公司经济技术研究院 一种基于改进Mask-RCNN的变电站现场进度识别方法
CN115995002A (zh) * 2023-03-24 2023-04-21 南京信息工程大学 一种网络构建方法及城市场景实时语义分割方法

Similar Documents

Publication Publication Date Title
WO2018076212A1 Scene semantic segmentation method based on a deconvolutional neural network
CN107066916B (zh) 基于反卷积神经网络的场景语义分割方法
WO2020108358A1 Image inpainting method and apparatus, computer device, and storage medium
CN107578418B (zh) 一种融合色彩和深度信息的室内场景轮廓检测方法
CN106529447B (zh) 一种小样本人脸识别方法
CN109583340B (zh) 一种基于深度学习的视频目标检测方法
CN108875935B (zh) 基于生成对抗网络的自然图像目标材质视觉特征映射方法
CN109410168B (zh) 用于确定图像中的子图块类别的卷积神经网络的建模方法
WO2018000752A1 Monocular image depth estimation method based on multi-scale CNN and CRF
CN108564549B (zh) 一种基于多尺度稠密连接网络的图像去雾方法
CN111476710B (zh) 基于移动平台的视频换脸方法及系统
TW200834459A (en) Video object segmentation method applied for rainy situations
CN111046868B (zh) 基于矩阵低秩稀疏分解的目标显著性检测方法
CN110705634B (zh) 一种鞋跟型号识别方法、装置及存储介质
Huang et al. Automatic building change image quality assessment in high resolution remote sensing based on deep learning
CN112580661A (zh) 一种深度监督下的多尺度边缘检测方法
Feng et al. Low-light image enhancement algorithm based on an atmospheric physical model
Liu et al. Progressive complex illumination image appearance transfer based on CNN
CN111401209B (zh) 一种基于深度学习的动作识别方法
CN117593540A (zh) 一种基于智能化图像识别技术的压力性损伤分期识别方法
Zhao et al. Color channel fusion network for low-light image enhancement
CN115953330B (zh) 虚拟场景图像的纹理优化方法、装置、设备和存储介质
CN116993760A (zh) 一种基于图卷积和注意力机制的手势分割方法、系统、设备及介质
Yuan et al. Explore double-opponency and skin color for saliency detection
CN108765384B (zh) 一种联合流形排序和改进凸包的显著性检测方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16920114

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16920114

Country of ref document: EP

Kind code of ref document: A1