CN112101376A - Image processing method, apparatus, electronic device and computer readable medium
- Publication number
- CN112101376A (application number CN202010822524.4A)
- Authority
- CN
- China
- Prior art keywords
- feature map
- scale
- feature
- image
- map
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/32—Normalisation of the pattern dimensions
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The present invention provides an image processing method, an apparatus, an electronic device, and a computer-readable medium. The method comprises: acquiring an image to be processed and performing feature extraction on it to obtain multi-scale feature maps; performing enhancement processing on the parts of the multi-scale feature maps that correspond to a salient object, to obtain multi-scale enhanced feature maps; and performing image restoration on the multi-scale enhanced feature maps to obtain a salient object mask corresponding to the image to be processed. Because the parts corresponding to the salient object are enhanced, the features of the salient object are more prominent in the resulting multi-scale enhanced feature maps, so the salient object mask obtained after image restoration is more accurate. This alleviates the technical problem of poor accuracy in existing salient object segmentation methods.
Description
Technical Field
The present invention relates to the technical field of image processing, and in particular to an image processing method, an apparatus, an electronic device, and a computer-readable medium.
Background
Salient object segmentation is an important topic in computer vision, with wide application in fields such as mobile phone autofocus, autonomous driving, scene understanding, and image editing. Its purpose is to separate the pixels of salient objects in an image from the background pixels. Unlike traditional semantic segmentation, salient objects do not belong to a single object class and carry no semantic labels. However, salient objects tend to lie near the center of the image and to be rich in color, as shown in FIG. 1(a); FIG. 1(b) is a schematic diagram of the salient object segmentation result corresponding to FIG. 1(a).
Existing salient object segmentation methods fall into two main categories. The first analyzes the texture of the image to identify texture-rich regions, then uses a clustering method to separate objects from regions of uniform texture. This approach is limited by the clustering method and rarely achieves high accuracy. The second treats salient object segmentation as a standard object segmentation problem. Standard object segmentation, however, segments objects of preset classes, for example the people, cars, and dogs in an image. For a particular image these objects may not be salient, or not all of them may be salient, which leads to errors in the segmented salient objects.
In summary, existing salient object segmentation methods suffer from poor accuracy when processing images.
Summary of the Invention
In view of this, an object of the present invention is to provide an image processing method, an apparatus, an electronic device, and a computer-readable medium that alleviate the technical problem of poor accuracy in existing salient object segmentation methods.
In a first aspect, an embodiment of the present invention provides an image processing method, comprising: acquiring an image to be processed and performing feature extraction on the image to be processed to obtain multi-scale feature maps; performing enhancement processing on the parts of the multi-scale feature maps that correspond to a salient object, to obtain multi-scale enhanced feature maps; and performing image restoration on the multi-scale enhanced feature maps to obtain a salient object mask corresponding to the image to be processed.
Further, performing feature extraction on the image to be processed comprises: performing multi-layer downsampling on the image to be processed to obtain multi-scale original feature maps; and performing optimization processing on the multi-scale original feature maps to obtain the multi-scale feature maps.
Further, performing optimization processing on the multi-scale original feature maps comprises: performing first optimization processing on the target original feature maps among the multi-scale original feature maps to obtain first optimized feature maps, where the target original feature maps are the multi-scale original feature maps other than the highest-dimensional original feature map; performing second optimization processing on the highest-dimensional original feature map to obtain a second optimized feature map; and taking the first optimized feature maps and the second optimized feature map as the multi-scale feature maps.
Further, performing the first optimization processing on a target original feature map comprises: optimizing the target original feature map with a first optimization module to obtain a first initial optimized feature map, where the first optimization module comprises a preset number of first convolutional layers; and summing the first initial optimized feature map with its corresponding target original feature map to obtain the first optimized feature map.
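The first optimization processing reads as a residual block: a stack of convolutions whose output is added back to its input. The following is a minimal NumPy sketch under stated assumptions, not the patented implementation. The 3x3 kernel size, the zero padding, the absence of activation functions, and the names `conv3x3_same` and `first_optimization` are all illustrative choices that the text does not specify.

```python
import numpy as np


def conv3x3_same(fmap: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """3x3 convolution (cross-correlation) with zero padding.

    fmap is (C_in, H, W); weight is (C_out, C_in, 3, 3); output is (C_out, H, W).
    """
    _, h, w = fmap.shape
    padded = np.pad(fmap, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((weight.shape[0], h, w))
    # Accumulate the contribution of each of the nine kernel taps.
    for dy in range(3):
        for dx in range(3):
            patch = padded[:, dy:dy + h, dx:dx + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, dy, dx], patch)
    return out


def first_optimization(original: np.ndarray, weights: list) -> np.ndarray:
    """Apply a preset number of conv layers, then add the original map back."""
    x = original
    for weight in weights:  # one (C, C, 3, 3) kernel per "first convolutional layer"
        x = conv3x3_same(x, weight)
    return x + original  # the summation step described in the text
```

With all-zero kernels the convolution branch contributes nothing, so the function returns the original feature map unchanged, which is the usual motivation for the additive shortcut.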
Further, performing the second optimization processing on the highest-dimensional original feature map comprises: processing the highest-dimensional original feature map with a second optimization module to obtain an optimization weight, where the second optimization module comprises a second convolutional layer, a global pooling layer, and a Sigmoid function processing layer; multiplying the optimization weight by the highest-dimensional original feature map to obtain a second initial optimized feature map; and summing the second initial optimized feature map with the highest-dimensional original feature map to obtain the second optimized feature map.
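The second optimization processing resembles a channel-reweighting step followed by a residual sum. The sketch below is one possible reading, with several assumptions: the second convolutional layer is taken to be a 1x1 convolution (a C x C matrix), the global pooling is taken to be average pooling, and the weight is one scalar per channel. The name `second_optimization` is hypothetical.

```python
import numpy as np


def second_optimization(top: np.ndarray, conv_weight: np.ndarray) -> np.ndarray:
    """top is the highest-dimensional map, shape (C, H, W); conv_weight is (C, C)."""
    # Second convolutional layer (assumed 1x1): mixes channels at every pixel.
    conv_out = np.einsum("oc,chw->ohw", conv_weight, top)
    # Global pooling (assumed average): one value per channel.
    pooled = conv_out.mean(axis=(1, 2))
    # Sigmoid layer: squashes each pooled value into (0, 1) as the optimization weight.
    weight = 1.0 / (1.0 + np.exp(-pooled))
    # Product with the original map, then the residual sum.
    second_initial = weight[:, None, None] * top
    return second_initial + top
```

For example, an all-zero `conv_weight` gives a pooled value of 0 and a sigmoid weight of 0.5 for every channel, so the output is 1.5 times the input.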
Further, performing enhancement processing on the parts of the multi-scale feature maps that correspond to the salient object comprises: obtaining an initial position of the salient object from the multi-scale feature maps; cropping, according to the initial position and with at least two different expansion scales, the highest-dimensional feature map among the multi-scale feature maps to obtain a plurality of cropped feature maps, the cropped feature maps containing feature information of the salient object; taking one or more of the multi-scale feature maps as target feature maps and, treating each target feature map in turn as the current target feature map, calculating the correlation between the plurality of cropped feature maps and the current target feature map to obtain a plurality of correlation feature maps of the current target feature map in one-to-one correspondence with the cropped feature maps; and obtaining the enhanced feature map corresponding to the current target feature map from the plurality of correlation feature maps and the current target feature map.
Further, obtaining the initial position of the salient object from the multi-scale feature maps comprises: performing dimension reduction on the highest-dimensional feature map among the multi-scale feature maps to obtain a single-channel feature map; binarizing the single-channel feature map to obtain a single-channel binarized feature map; and determining the initial position of the salient object from the single-channel binarized feature map.
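As an illustration of these three steps, the sketch below reduces the map to one channel, thresholds it, and takes the bounding box of the foreground pixels as the initial position. The channel mean as the dimension-reduction step, the threshold of 0.5, the bounding-box representation, and the name `initial_position` are assumptions made for the example; the text does not fix any of them.

```python
import numpy as np


def initial_position(top: np.ndarray, threshold: float = 0.5):
    """Return (x0, y0, x1, y1), inclusive, or None if nothing is above threshold.

    top is the highest-dimensional feature map, shape (C, H, W).
    """
    single = top.mean(axis=0)    # dimension reduction to one channel (assumed: mean)
    binary = single > threshold  # binarization
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        return None
    # Tightest axis-aligned box around all foreground pixels.
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```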
Further, cropping the highest-dimensional feature map among the multi-scale feature maps according to the initial position and with at least two different expansion scales comprises: determining the pixel width and pixel height of the salient object from the initial position; determining an expanded pixel width and an expanded pixel height from the expansion scale, the pixel width, and the pixel height; and cropping the highest-dimensional feature map along the boundary obtained by expanding the initial position by the expanded pixel width and the expanded pixel height.
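A possible reading of the cropping step is sketched below. It assumes the initial position is an inclusive box `(x0, y0, x1, y1)`, that the expanded pixel width and height are the expansion scale times the object's width and height, that the margin is applied on every side, and that the crop is clipped to the map borders. These conventions and the name `crop_with_expansion` are illustrative.

```python
import numpy as np


def crop_with_expansion(top: np.ndarray, box: tuple, scale: float) -> np.ndarray:
    """Crop a (C, H, W) map around box after growing it by scale on each side."""
    x0, y0, x1, y1 = box
    width, height = x1 - x0 + 1, y1 - y0 + 1
    pad_w = int(round(scale * width))    # expanded pixel width (assumed formula)
    pad_h = int(round(scale * height))   # expanded pixel height (assumed formula)
    _, h, w = top.shape
    left, right = max(x0 - pad_w, 0), min(x1 + pad_w, w - 1)
    upper, lower = max(y0 - pad_h, 0), min(y1 + pad_h, h - 1)
    return top[:, upper:lower + 1, left:right + 1]


# At least two expansion scales give crops with different amounts of context.
feature = np.arange(2 * 10 * 12, dtype=float).reshape(2, 10, 12)
crops = [crop_with_expansion(feature, (4, 3, 7, 6), s) for s in (0.25, 0.5)]
```

Here the object is 4 pixels wide and high, so the two scales add margins of 1 and 2 pixels and produce crops of 6x6 and 8x8.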
Further, calculating the correlation between the plurality of cropped feature maps and the current target feature map comprises: scaling the plurality of cropped feature maps to a preset scale to obtain a plurality of cropped feature maps of the preset scale; sliding a window of the preset scale over the current target feature map; and, after each slide, multiplying the part of the feature map covered by the sliding window by each of the cropped feature maps of the preset scale, and obtaining from the products the plurality of correlation feature maps of the current target feature map in one-to-one correspondence with the cropped feature maps.
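The sliding-window product can be read as using each resized crop as a correlation kernel over the target map. The sketch below shows this for a single crop and rests on several assumptions: nearest-neighbour resizing, an odd preset size, zero padding so that the correlation map keeps the target's height and width, equal channel counts for the crop and the target, and summation of the elementwise products into one value per position. The function names are hypothetical.

```python
import numpy as np


def resize_nearest(fmap: np.ndarray, size: int) -> np.ndarray:
    """Resize a (C, H, W) map to (C, size, size) by nearest-neighbour sampling."""
    _, h, w = fmap.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return fmap[:, rows][:, :, cols]


def correlation_map(target: np.ndarray, crop: np.ndarray, size: int = 3) -> np.ndarray:
    """Slide a size x size window over target and correlate it with the crop."""
    kernel = resize_nearest(crop, size)  # cropped feature map at the preset scale
    _, h, w = target.shape
    pad = size // 2
    padded = np.pad(target, ((0, 0), (pad, pad), (pad, pad)))
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            window = padded[:, y:y + size, x:x + size]
            out[y, x] = np.sum(window * kernel)  # product, summed (assumed)
    return out
```

Calling `correlation_map` once per cropped feature map yields the one-to-one set of correlation feature maps for the current target feature map.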
Further, obtaining the enhanced feature map corresponding to the current target feature map from the plurality of correlation feature maps and the current target feature map comprises: multiplying each of the plurality of correlation feature maps by the current target feature map to obtain a plurality of first enhanced feature maps corresponding to the current target feature map; concatenating the plurality of first enhanced feature maps with the current target feature map to obtain a second enhanced feature map corresponding to the current target feature map; and acquiring a position enhancement feature map corresponding to the current target feature map and concatenating the second enhanced feature map with the position enhancement feature map to obtain the enhanced feature map corresponding to the current target feature map, where the scale of the position enhancement feature map is the same as that of the second enhanced feature map.
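The multiply-then-concatenate sequence can be sketched as follows. The sketch assumes each correlation feature map is a single-channel (H, W) array broadcast across the target's channels, and that concatenation happens along the channel axis; the name `enhance_target` is hypothetical.

```python
import numpy as np


def enhance_target(target: np.ndarray, corr_maps: list, position_maps: np.ndarray) -> np.ndarray:
    """target is (C, H, W); each corr map is (H, W); position_maps is (P, H, W)."""
    # First enhanced feature maps: one product per correlation feature map.
    first = [corr[None, :, :] * target for corr in corr_maps]
    # Second enhanced feature map: the products concatenated with the target itself.
    second = np.concatenate(first + [target], axis=0)
    # Final enhanced feature map: append the position enhancement channels.
    return np.concatenate([second, position_maps], axis=0)


target = np.ones((4, 5, 5))
corr_maps = [np.full((5, 5), 2.0), np.full((5, 5), 3.0)]
position_maps = np.zeros((2, 5, 5))
enhanced = enhance_target(target, corr_maps, position_maps)
```

With two correlation maps, four target channels, and two position channels, the result has 4 * 2 + 4 + 2 = 14 channels at the same height and width as the target.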
Further, acquiring the position enhancement feature map corresponding to the current target feature map comprises: determining the X-direction center line and the Y-direction center line of the salient object based on its initial position; setting the Y-direction center line to a first target value and varying the value linearly along the X direction to a second target value, to obtain an X-direction position enhancement feature map; setting the X-direction center line to the first target value and varying the value linearly along the Y direction to the second target value, to obtain a Y-direction position enhancement feature map; and taking the X-direction position enhancement feature map and the Y-direction position enhancement feature map as the position enhancement feature map.
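One way to picture the position enhancement feature maps is as two ramps that peak on the object's center lines. The sketch below assumes a first target value of 1 and a second target value of 0, takes the center lines from an inclusive bounding box, and lets the ramp reach the second value at the border farthest from the center line. These choices and the name `position_enhancement_maps` are illustrative.

```python
import numpy as np


def position_enhancement_maps(height: int, width: int, box: tuple,
                              first: float = 1.0, second: float = 0.0) -> np.ndarray:
    """Return a (2, height, width) array: the X-direction map, then the Y-direction map."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0  # center lines of the salient object
    dx = np.abs(np.arange(width) - cx)         # distance from the Y-direction center line
    dy = np.abs(np.arange(height) - cy)        # distance from the X-direction center line
    max_dx = dx.max() if dx.max() > 0 else 1.0
    max_dy = dy.max() if dy.max() > 0 else 1.0
    x_line = first + (second - first) * dx / max_dx  # linear change along X
    y_line = first + (second - first) * dy / max_dy  # linear change along Y
    x_map = np.tile(x_line, (height, 1))
    y_map = np.tile(y_line[:, None], (1, width))
    return np.stack([x_map, y_map])


maps = position_enhancement_maps(5, 7, (2, 1, 4, 3))
```

In this example the column x = 3 of the X-direction map and the row y = 2 of the Y-direction map hold the first target value, and the values fall off linearly toward the borders.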
Further, performing image restoration on the multi-scale enhanced feature maps comprises: upsampling the multi-scale enhanced feature maps to obtain the salient object mask corresponding to the image to be processed.
In a second aspect, an embodiment of the present invention further provides an image processing apparatus, comprising: a feature extraction unit configured to acquire an image to be processed and perform feature extraction on the image to be processed to obtain multi-scale feature maps; an enhancement processing unit configured to perform enhancement processing on the parts of the multi-scale feature maps that correspond to a salient object, to obtain multi-scale enhanced feature maps; and an image restoration unit configured to perform image restoration on the multi-scale enhanced feature maps to obtain a salient object mask corresponding to the image to be processed.
In a third aspect, an embodiment of the present invention provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of the method according to any one of the implementations of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable medium having non-volatile program code executable by a processor, where the program code causes the processor to perform the steps of the method according to any one of the implementations of the first aspect.
In the embodiments of the present invention, an image to be processed is first acquired and feature extraction is performed on it to obtain multi-scale feature maps; enhancement processing is then performed on the parts of the multi-scale feature maps that correspond to a salient object, to obtain multi-scale enhanced feature maps; finally, image restoration is performed on the multi-scale enhanced feature maps to obtain a salient object mask corresponding to the image to be processed. Because the parts corresponding to the salient object are enhanced, the features of the salient object are more prominent in the resulting multi-scale enhanced feature maps, so the salient object mask obtained after image restoration is more accurate. This alleviates the technical problem of poor accuracy in existing salient object segmentation methods.
Brief Description of the Drawings
To illustrate the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed for describing them are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1(a) is a schematic diagram of an image to be processed according to an embodiment of the present invention;
FIG. 1(b) is a schematic diagram of the salient object segmentation result corresponding to FIG. 1(a) according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an electronic device according to an embodiment of the present invention;
FIG. 3 is a flowchart of an image processing method according to an embodiment of the present invention;
FIG. 4 is an overall schematic diagram of the image processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the first optimization processing according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the second optimization processing according to an embodiment of the present invention;
FIG. 7 is a flowchart of the enhancement processing performed on the parts of the multi-scale feature maps that correspond to a salient object, according to an embodiment of the present invention;
FIG. 8 is a flowchart of determining the enhanced feature map corresponding to the current target feature map according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the position enhancement feature map according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of the enhancement processing performed on the parts of the multi-scale feature maps that correspond to a salient object, according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of the results of training and testing the image processing method of the present invention and existing salient object segmentation methods on multiple public datasets, according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of the visualized results of processing an image with the image processing method of the present invention and with existing salient object segmentation methods, according to an embodiment of the present invention;
FIG. 13 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention are described clearly and completely below with reference to the embodiments. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
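Example 1:
First, an electronic device 100 for implementing the embodiments of the present invention is described with reference to FIG. 2. The electronic device can be used to run the image processing method of the embodiments of the present invention.
As shown in FIG. 2, the electronic device 100 includes one or more processors 102, one or more memories 104, an input device 106, an output device 108, and a camera 110, which are interconnected by a bus system 112 and/or another form of connection mechanism (not shown). Note that the components and structure of the electronic device 100 shown in FIG. 2 are exemplary rather than limiting; the electronic device may have other components and structures as required.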
The processor 102 may be implemented in at least one hardware form among a digital signal processor (DSP), a field-programmable gate array (FPGA), a programmable logic array (PLA), and an application-specific integrated circuit (ASIC). The processor 102 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, or flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run the program instructions to implement the client functions (implemented by the processor) of the embodiments of the present invention described below and/or other desired functions. Various application programs and data, such as data used and/or generated by the application programs, may also be stored on the computer-readable storage medium.
The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, and a touch screen.
The output device 108 may output various information (for example, images or sound) to the outside (for example, to a user), and may include one or more of a display and a speaker.
The camera 110 is used to capture the image to be processed. The captured image is processed by the image processing method to obtain the salient object mask corresponding to the image to be processed. For example, the camera may capture an image desired by the user (such as a photo or a video), which is then processed by the image processing method to obtain the corresponding salient object mask. The camera may also store the captured image in the memory 104 for use by other components.
By way of example, the electronic device for implementing the image processing method according to the embodiments of the present invention may be implemented as a smart mobile terminal such as a smartphone or a tablet computer.
Example 2:
According to an embodiment of the present invention, an embodiment of an image processing method is provided. Note that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
FIG. 3 is a flowchart of an image processing method according to an embodiment of the present invention. As shown in FIG. 3, the method includes the following steps:
Step S302: acquire an image to be processed, and perform feature extraction on the image to be processed to obtain multi-scale feature maps.
In this embodiment, the multi-scale feature maps are feature maps of different sizes (that is, different heights and widths). Feature extraction may be downsampling by multiple convolutional layers: each downsampling layer that the image to be processed passes through yields a feature map of one scale. A feature map of a given scale comprises multiple sub-feature maps and is in practice a multi-channel matrix in which each channel is a two-dimensional matrix corresponding to one sub-feature map. The number of elements in each row of a channel's matrix is the width of the corresponding sub-feature map, and the number of elements in each column is its height.
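To make the shapes concrete, the sketch below builds a small pyramid of (channels, height, width) arrays. It substitutes 2x2 average pooling for the learned convolutional downsampling, purely to show how each layer produces one scale; in a real network the channel count would typically change from layer to layer as well. The names `downsample2x` and `multi_scale_maps` are hypothetical.

```python
import numpy as np


def downsample2x(fmap: np.ndarray) -> np.ndarray:
    """Halve the height and width of a (C, H, W) map by 2x2 average pooling."""
    c, h, w = fmap.shape
    h2, w2 = h // 2, w // 2
    # Drop an odd trailing row or column, then average each 2x2 block.
    blocks = fmap[:, :h2 * 2, :w2 * 2].reshape(c, h2, 2, w2, 2)
    return blocks.mean(axis=(2, 4))


def multi_scale_maps(image: np.ndarray, levels: int = 4) -> list:
    """Return one feature map per downsampling layer, finest scale first."""
    maps, current = [], image
    for _ in range(levels):
        current = downsample2x(current)
        maps.append(current)
    return maps


image = np.random.default_rng(0).random((3, 64, 48))  # (channels, height, width)
shapes = [m.shape for m in multi_scale_maps(image)]
```

For a 64x48 input, the four scales have heights and widths of 32x24, 16x12, 8x6, and 4x3; each channel slice `m[k]` is one two-dimensional sub-feature map.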
步骤S304,对多尺度的特征图中显著性物体所对应的部分进行强化处理,得到多尺度的强化特征图;Step S304, performing enhancement processing on the part corresponding to the salient object in the multi-scale feature map to obtain a multi-scale enhanced feature map;
发明人考虑到显著性物体只与其邻域的像素值相关,显著性物体在图像中越突出,分割得到的显著性物体蒙版的精度就越高。因此,在得到多尺度的特征图后,对多尺度的特征图中显著性物体所对应的部分进行强化处理,显著性物体对应的部分强化处理后,其特征更突显,这样在对多尺度的强化特征图进行图像还原后,得到的显著性物体蒙版会更加准确。The inventor considers that a salient object is only related to the pixel values of its neighborhood, and the more prominent the salient object is in the image, the higher the accuracy of the salient object mask obtained by segmentation. Therefore, after obtaining the multi-scale feature map, the parts corresponding to the salient objects in the multi-scale feature map are enhanced, and the features corresponding to the salient objects are more prominent after the part corresponding to the salient objects is enhanced. After enhancing the feature map for image restoration, the obtained salient object mask will be more accurate.
The enhancement processing is described in detail below and is not elaborated here.
Step S306: perform image restoration on the multi-scale enhanced feature maps, obtaining a salient object mask corresponding to the image to be processed.
Specifically, the enhanced feature maps of the various scales are upsampled, and during upsampling the enhanced feature maps of different scales are fused, yielding the salient object mask corresponding to the image to be processed.
In this embodiment of the present invention, an image to be processed is first acquired and feature extraction is performed on it to obtain multi-scale feature maps. The parts of the multi-scale feature maps corresponding to the salient object are then enhanced to obtain multi-scale enhanced feature maps. Finally, image restoration is performed on the multi-scale enhanced feature maps to obtain the salient object mask corresponding to the image to be processed. As the above description shows, after enhancement the features corresponding to the salient object are more prominent in the multi-scale enhanced feature maps, so the salient object mask obtained by image restoration is more accurate. This alleviates the technical problem of poor accuracy in existing salient object segmentation methods.
The above briefly introduces the image processing method of the present invention. The specific content involved is described in detail below.
In an optional embodiment of the present invention, the feature extraction performed on the image to be processed in step S302 includes the following processes (1)-(2):
(1) Perform multi-layer downsampling on the image to be processed to obtain multi-scale original feature maps.
Specifically, referring to FIG. 4, the multi-layer downsampling may be downsampling through multiple convolutional layers. Each downsampling layer yields an original feature map of one scale, so multi-scale original feature maps are obtained.
It should be noted that experiments show that, among the multi-scale original feature maps, the high-dimensional original feature maps (for example, those obtained from the third downsampling layer upward) have a large effect on the accuracy of image processing, whereas the low-dimensional original feature maps (for example, those obtained from the first and second downsampling layers) have little effect. To improve processing efficiency, the results of the first and second downsampling layers can therefore be ignored; that is, the subsequent processes do not operate on the original feature maps obtained from the first and second downsampling layers, as shown in FIG. 4.
(2) Perform optimization processing on the high-dimensional original feature maps among the multi-scale original feature maps to obtain the multi-scale feature maps.
Specifically, this includes the following processes:
21) Perform first optimization processing on the target original feature maps among the multi-scale original feature maps to obtain first optimized feature maps, where the target original feature maps are the high-dimensional original feature maps other than the highest-dimensional original feature map;
Referring to FIG. 4, the first optimization processing, that is, the SRB processing in FIG. 4, is applied to every high-dimensional original feature map except the highest-dimensional one. The first optimization processing (SRB processing) proceeds as follows: a first optimization module, which comprises a preset number of first convolutional layers, processes the target original feature map to obtain a first initial optimized feature map; the first initial optimized feature map is then added to its corresponding target original feature map to obtain the first optimized feature map.
In this embodiment of the present invention, the preset number of first convolutional layers may be two 3x3 convolutional layers. Referring to FIG. 5, the target original feature map passes through two 3x3 convolutional layers in series to give the first initial optimized feature map, which is then added to its corresponding target original feature map to give the first optimized feature map.
As explained above, a feature map is in essence a multi-dimensional matrix, so adding the first initial optimized feature map to its corresponding target original feature map amounts to adding the corresponding elements of the two multi-dimensional matrices.
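The first optimization processing described above can be sketched in NumPy as follows. This is a minimal illustration and not the patented implementation: the channel-first `(C, H, W)` layout, the zero padding, the absence of bias terms and activations, and the names `conv3x3` and `srb` are all assumptions made for the example.

```python
import numpy as np

def conv3x3(x, w):
    """3x3 convolution with zero padding (assumed).

    x: input of shape (C_in, H, W); w: kernels of shape (C_out, C_in, 3, 3).
    """
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for dy in range(3):
        for dx in range(3):
            # Contribution of kernel tap (dy, dx) to every output position.
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx],
                             xp[:, dy:dy + h, dx:dx + wd])
    return out

def srb(x, w1, w2):
    """Two 3x3 convolutions in series, then element-wise addition of the input."""
    initial = conv3x3(conv3x3(x, w1), w2)   # first initial optimized feature map
    return initial + x                      # first optimized feature map
```

Because the input is added back element-wise, both convolutions must preserve the channel count and spatial size of the target original feature map.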
22) Perform second optimization processing on the highest-dimensional original feature map among the multi-scale original feature maps to obtain a second optimized feature map;
Referring to FIG. 4, the second optimization processing, that is, the GRB processing in FIG. 4, is applied to the highest-dimensional original feature map. The second optimization processing (GRB processing) proceeds as follows: a second optimization module, which comprises a second convolutional layer, a global pooling layer and a Sigmoid function layer, processes the highest-dimensional original feature map to obtain optimization weights; the optimization weights are multiplied with the highest-dimensional original feature map to obtain a second initial optimized feature map; the second initial optimized feature map is then added to the highest-dimensional original feature map to obtain the second optimized feature map.
In this embodiment of the present invention, the second convolutional layer may be a 1x1 convolutional layer. Referring to FIG. 6, the highest-dimensional original feature map passes in turn through a 1x1 convolutional layer, a global pooling layer, a 1x1 convolutional layer and a Sigmoid function to give the optimization weights. The optimization weights are multiplied with the highest-dimensional original feature map, and the result is added to the highest-dimensional original feature map to give the second optimized feature map.
Likewise, the product operation multiplies the optimization weights with the elements of the two-dimensional matrices, and the sum operation adds the corresponding elements of the two-dimensional matrices.
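A minimal NumPy sketch of the second optimization processing follows. The text does not say which kind of global pooling is used or whether the 1x1 convolutions change the channel count; this example assumes global average pooling and channel-preserving convolutions without bias. The names `conv1x1` and `grb` and the `(C, H, W)` layout are illustrative only.

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: x has shape (C_in, H, W), w has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def grb(x, w1, w2):
    """1x1 conv -> global pooling -> 1x1 conv -> Sigmoid, then reweight and add."""
    y = conv1x1(x, w1)
    pooled = y.mean(axis=(1, 2), keepdims=True)    # assumed average pooling, (C, 1, 1)
    weights = 1.0 / (1.0 + np.exp(-conv1x1(pooled, w2)))   # optimization weights
    initial = weights * x                          # second initial optimized feature map
    return initial + x                             # second optimized feature map
```

Each channel receives a single weight in (0, 1), which is broadcast over every element of that channel's two-dimensional matrix.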
23) Take the first optimized feature maps and the second optimized feature map as the multi-scale feature maps.
The enhancement processing of the parts of the multi-scale feature maps corresponding to the salient object is described in detail below.
In an optional embodiment of the present invention, referring to FIG. 7, the enhancement processing of step S304 includes:
Step S701: obtain the initial position of the salient object from the multi-scale feature maps;
Specifically, this includes the following processes: perform dimensionality reduction on the highest-dimensional feature map among the multi-scale feature maps to obtain a single-channel feature map; binarize the single-channel feature map to obtain a single-channel binarized feature map; determine the initial position of the salient object from the single-channel binarized feature map.
Referring to FIG. 4, a convolutional layer reduces the dimensionality of the second optimized feature map produced by the GRB processing (that is, the highest-dimensional feature map among the multi-scale feature maps), giving a single-channel feature map (an N*M map with one channel). The single-channel feature map is then binarized to give a single-channel binarized feature map (the image shown below the GRB in FIG. 4). In this binarized feature map, the part marked 1 corresponds to the salient object, so the initial position of the salient object can be determined from the positions of the 1s. The initial position may be the position determined by the leftmost 1, the rightmost 1, the topmost 1 and the bottommost 1 in the single-channel binarized feature map.
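The binarization and the extraction of the initial position can be sketched as below. The threshold value of 0.5 and the name `initial_position` are assumptions for illustration, since the text does not specify how the binarization threshold is chosen. The box is returned as inclusive `(left, top, right, bottom)` indices.

```python
import numpy as np

def initial_position(single_channel, threshold=0.5):
    """Binarize an (H, W) map and return the box spanned by its extreme 1s."""
    binary = (single_channel > threshold).astype(np.uint8)
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        return binary, None          # no salient pixel found
    box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return binary, box
```

The pixel width and pixel height used in the later cropping step follow directly as `right - left + 1` and `bottom - top + 1`.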
Step S702: according to the initial position, crop the highest-dimensional feature map among the multi-scale feature maps at at least two different expansion scales to obtain multiple cropped feature maps, which contain feature information of the salient object;
Specifically, this includes the following processes: determine the pixel width and pixel height of the salient object from the initial position; determine the expansion pixel width and expansion pixel height from the expansion scale, the pixel width and the pixel height; in the highest-dimensional feature map, crop along the boundary obtained by expanding the initial position by the expansion pixel width and the expansion pixel height. Because each cropped feature map is cut out around the initial position of the object with some expansion, it contains the salient object and therefore its feature information.
For example, with an expansion scale of 10% and a salient object of 30*30 pixels, the expansion pixel width (10% of 30) is 3 pixels and the expansion pixel height (10% of 30) is 3 pixels. In the highest-dimensional feature map, the initial position is therefore expanded by 3 background pixels to the left and to the right and by 3 background pixels upward and downward before cropping. This expansion is needed because the region obtained by direct binarization is not very accurate and does not necessarily cover the whole salient object, so the region is extended outward to include all of it.
The expansion scale may also be 30%, 50% and so on; this embodiment of the present invention places no specific limit on it. Each expansion scale yields a different cropped feature map, so the number of expansion scales equals the number of cropped feature maps.
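The cropping at several expansion scales can be sketched as follows. Rounding the expansion to whole pixels and clamping the expanded box to the map borders are assumptions of this example; the text does not say how fractional expansions or boxes that reach the border are handled. The function names and the inclusive `(left, top, right, bottom)` box convention are illustrative.

```python
import numpy as np

def expanded_box(box, scale, map_w, map_h):
    """Expand an inclusive (left, top, right, bottom) box by `scale` on every side."""
    left, top, right, bottom = box
    dx = round((right - left + 1) * scale)   # expansion pixel width
    dy = round((bottom - top + 1) * scale)   # expansion pixel height
    return (max(left - dx, 0), max(top - dy, 0),
            min(right + dx, map_w - 1), min(bottom + dy, map_h - 1))

def crop_at_scales(feature, box, scales=(0.1, 0.3, 0.5)):
    """Return one cropped (C, h, w) map per expansion scale."""
    _, map_h, map_w = feature.shape
    crops = []
    for s in scales:
        l, t, r, b = expanded_box(box, s, map_w, map_h)
        crops.append(feature[:, t:b + 1, l:r + 1])
    return crops
```

For the 30*30 example above, a 10% scale widens the box by 3 pixels on each side, so the crop is 36*36 whenever the box is at least 3 pixels away from every border.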
Step S703: take one or more of the multi-scale feature maps as target feature maps and, treating each target feature map in turn as the current target feature map, compute the correlation between the multiple cropped feature maps and the current target feature map, obtaining multiple correlation feature maps of the current target feature map that correspond one-to-one to the multiple cropped feature maps;
The target feature maps may be one or more of the multi-scale feature maps; this embodiment of the present invention places no specific limit on them.
Specifically, this includes the following processes: scale the multiple cropped feature maps to a preset scale to obtain multiple cropped feature maps of the preset scale; treating each target feature map in turn as the current target feature map, slide a window of the preset scale over the current target feature map; after each slide, multiply the part of the feature map inside the sliding window with each of the cropped feature maps of the preset scale, and from the results of the product operations obtain the multiple correlation feature maps of the current target feature map that correspond one-to-one to the multiple cropped feature maps.
To make the process easier to follow, it is described below with a concrete example. Suppose the cropped feature map of the preset scale is a 32*32*64 map B and the current target feature map is a 64*64*128 map A.
To simplify the description, first consider a 32*32*1 map B1 and a 64*64*1 map A1. A 32*32 sliding window is moved over A1 step by step with a preset stride. Each slide yields a 32*32 block, which is multiplied with the 32*32 map B1 to give a new 32*32 block. Once all slides are complete, the correlation feature map of A1 and B1 is obtained; its size is 64*64*1.
To compute the correlation feature map of a 32*32*1 map B2 and a 64*64*128 map A2, a 32*32 sliding window is moved step by step with the preset stride over every channel of A2. Each slide yields a 32*32 block, which is multiplied with the 32*32 map B2 to give a new 32*32 block. Once the slides on all channels are complete, the correlation feature map of A2 and B2 is obtained; its size is 64*64*128.
To compute the correlation feature map of the 32*32*64 map B and the 64*64*128 map A, each of the 64 channels of B is processed against the 64*64*128 map A as described above, each giving one 64*64*128 correlation feature map. These 64 maps are then added together to give a final 64*64*128 correlation feature map, which is the correlation feature map of A and B.
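One possible reading of the sliding-window computation is sketched below. The text leaves the stride and the handling of overlapping windows unspecified; this example assumes a stride equal to the window size, so that the 32*32 blocks tile the target map exactly and the output has the same size as the target. The element-wise product, the `(C, H, W)` layout and the name `correlation_map` are likewise assumptions of the sketch.

```python
import numpy as np

def correlation_map(target, crop):
    """Correlate a (C_t, H, W) target map with a (C_c, h, w) cropped map.

    Assumes H and W are multiples of h and w and a stride equal to the window.
    """
    _, H, W = target.shape
    c_c, h, w = crop.shape
    if H % h or W % w:
        raise ValueError("window must tile the target map in this sketch")
    out = np.zeros(target.shape, dtype=float)
    for k in range(c_c):                      # one pass per channel of the crop
        for top in range(0, H, h):            # slide the window over the target
            for left in range(0, W, w):
                block = target[:, top:top + h, left:left + w]
                # Product of the block with one crop channel, on every target channel;
                # results of the c_c passes are added together.
                out[:, top:top + h, left:left + w] += block * crop[k]
    return out
```

With a 64*64*128 target and a 32*32*64 crop, the result has 128 channels of 64*64, matching the sizes given in the example above.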
Step S704: obtain the enhanced feature map corresponding to the current target feature map from the multiple correlation feature maps and the current target feature map.
Referring to FIG. 8, this specifically includes the following processes:
Step S801: multiply each of the multiple correlation feature maps with the current target feature map to obtain multiple first enhanced feature maps corresponding to the current target feature map;
This product operation strengthens the part of the current target feature map that corresponds to the salient object.
Step S802: concatenate the multiple first enhanced feature maps with the current target feature map to obtain a second enhanced feature map corresponding to the current target feature map;
For example, if the multiple first enhanced feature maps are two 64*64*128 feature maps (width W * height H * number of channels C), concatenating them with the 64*64*128 target feature map gives a 64*64*384 second enhanced feature map. Concatenation here means that the channel counts are added.
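Steps S801 and S802 can be sketched in NumPy as below. The channel-first `(C, H, W)` layout (the text writes sizes as W*H*C) and the name `second_enhanced_map` are assumptions for illustration.

```python
import numpy as np

def second_enhanced_map(target, correlation_maps):
    """Multiply each correlation map with the target, then concatenate on channels."""
    first_enhanced = [c * target for c in correlation_maps]    # step S801
    return np.concatenate(first_enhanced + [target], axis=0)   # step S802

target = np.ones((128, 64, 64))
corr = [np.full((128, 64, 64), 2.0), np.full((128, 64, 64), 3.0)]
second = second_enhanced_map(target, corr)
print(second.shape)   # (384, 64, 64): 128 + 128 + 128 channels
```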
Step S803: obtain the position enhancement feature map corresponding to the current target feature map, and concatenate the second enhanced feature map with the position enhancement feature map to obtain the enhanced feature map corresponding to the current target feature map, where the scale of the position enhancement feature map is the same as that of the second enhanced feature map.
The specific process is as follows:
a) On a target matrix, determine the center line of the salient object in the X direction and its center line in the Y direction from the initial position of the salient object. The target matrix is a single-channel matrix with the same scale as the second enhanced feature map, and each of its elements may be 0;
b) Set the Y-direction center line of the target matrix to a first target value and vary the values linearly along the X direction to a second target value, obtaining the X-direction position enhancement feature map;
c) Set the X-direction center line of the target matrix to the first target value and vary the values linearly along the Y direction to the second target value, obtaining the Y-direction position enhancement feature map;
d) Take the X-direction position enhancement feature map and the Y-direction position enhancement feature map as the position enhancement feature map.
The first target value may be 1 and the second target value may be 0. FIG. 9 is a schematic diagram of obtaining the position enhancement feature map from the initial position of the salient object.
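Steps a) to d) can be sketched as follows. The text does not state where the linear ramp reaches the second target value; this example assumes it is reached at the map border farther from the center line, and it reads the Y-direction center line as the vertical line through the horizontal center of the object. The name `position_maps` and the inclusive box convention are illustrative, and the sketch expects a map larger than one pixel in each direction.

```python
import numpy as np

def position_maps(height, width, box, first=1.0, second=0.0):
    """Return a (2, H, W) array: X-direction and Y-direction position maps."""
    left, top, right, bottom = box
    cx = (left + right) / 2.0        # column of the Y-direction center line
    cy = (top + bottom) / 2.0        # row of the X-direction center line
    # Normalized distance from the center line: 0 on the line, 1 at the far border.
    fx = np.abs(np.arange(width) - cx) / max(cx, width - 1 - cx)
    fy = np.abs(np.arange(height) - cy) / max(cy, height - 1 - cy)
    x_map = np.tile(first + (second - first) * fx, (height, 1))
    y_map = np.tile((first + (second - first) * fy)[:, None], (1, width))
    return np.stack([x_map, y_map])
```

The two maps have the same height and width as the second enhanced feature map, so they can be concatenated with it along the channel axis as step S803 requires.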
The enhancement processing described above (denoted LCB) is illustrated in FIG. 10. Positioning information for the part corresponding to the salient object (including its initial position, pixel width, pixel height, X-direction center line and Y-direction center line) is determined from the single-channel binarized feature map. Using this positioning information, the highest-dimensional feature map among the multi-scale feature maps is cropped at at least two different expansion scales to obtain multiple cropped feature maps, and the position enhancement feature map is determined from the same positioning information. The correlation feature maps between the multiple cropped feature maps and the current target feature map are then computed and multiplied with the current target feature map. The results are concatenated with the current target feature map and with its position enhancement feature map, giving the enhanced feature map corresponding to the current target feature map. Repeating this for each target feature map gives the enhanced feature map for every target feature map.
The enhanced feature map greatly strengthens the features of the salient object, so the salient object mask obtained by segmentation is more accurate.
In an optional embodiment of the present invention, upsampling the multi-scale enhanced feature maps to obtain the salient object mask corresponding to the image to be processed includes: fusing the enhanced feature maps of different scales to obtain the salient object mask corresponding to the image to be processed.
In this embodiment of the present invention, the upsampling and fusion may proceed as shown in FIG. 4. The enhanced feature map of layer 5 (that is, the feature map obtained from the layer-5 downsampling after GRB processing followed by LCB processing) first undergoes SRB processing (described above and not repeated here) and is then upsampled. The upsampled feature map is added to the enhanced feature map of layer 4, undergoes SRB processing and is upsampled again. The result is added to the enhanced feature map of layer 3 and undergoes SRB processing once more, and finally it is enlarged by a factor of 4 to give the salient object mask.
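The order of operations in this fusion can be sketched as below. Several details are assumptions of the example and are not stated in the text: each layer is taken to halve the resolution of the previous one, all enhanced maps are taken to have the same channel count so that they can be added, upsampling is nearest-neighbour, and `refine` is a placeholder for the SRB processing (identity by default). The final reduction to a single-channel mask is omitted.

```python
import numpy as np

def upsample(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map (assumed interpolation)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def restore(e3, e4, e5, refine=lambda x: x):
    """Fuse enhanced maps of layers 5, 4 and 3, then enlarge by a factor of 4."""
    x = refine(e5)                    # SRB on the layer-5 enhanced map
    x = refine(upsample(x, 2) + e4)   # upsample, add layer 4, SRB
    x = refine(upsample(x, 2) + e3)   # upsample, add layer 3, SRB
    return upsample(x, 4)             # final 4x enlargement
```

Passing a real SRB as `refine` would require one set of weights per stage; the placeholder keeps the sketch focused on the upsample, add and refine sequence.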
The inventors trained and tested the image processing method of the present invention (denoted LCANet) and existing salient object segmentation methods on several public datasets (the DUTS-TE, ECSSD, HKU-IS, PASCAL-S and DUT-OM datasets). The results, shown in FIG. 11, indicate that the salient object masks obtained with the image processing method of the present invention are more accurate (in FIG. 11, a larger maxF value indicates higher accuracy, and a smaller MAE value indicates higher accuracy). In addition, referring to FIG. 12, the visualized results also show that the image processing method of the present invention is more accurate. In FIG. 12, the GT column shows the manually annotated salient object segmentation results, the LCANet column shows the segmentation results of the present invention, and the other columns show the results of other methods (the corresponding method is marked below each column). The comparison shows that the results of the present method are closer to the manual annotations, which indicates that the method of the present invention is more accurate than the other existing methods.
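The text does not define the two metrics. The sketch below follows the definitions commonly used in salient object detection, which is an assumption here: MAE is the mean absolute difference between the predicted map and the ground-truth mask, and maxF is the largest F-measure over a sweep of binarization thresholds, with the weighting `beta2 = 0.3` that is conventional in that literature. Function names and the number of thresholds are illustrative.

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a prediction in [0, 1] and a binary mask."""
    return float(np.mean(np.abs(pred.astype(float) - gt.astype(float))))

def max_f_measure(pred, gt, beta2=0.3, steps=256):
    """Largest F-measure over `steps` thresholds in [0, 1) (assumed convention)."""
    gt = gt > 0.5
    best = 0.0
    for t in np.linspace(0.0, 1.0, steps, endpoint=False):
        mask = pred > t
        tp = np.logical_and(mask, gt).sum()
        if tp == 0:
            continue                  # precision and recall are both zero
        precision = tp / mask.sum()
        recall = tp / gt.sum()
        f = (1 + beta2) * precision * recall / (beta2 * precision + recall)
        best = max(best, float(f))
    return best
```

A perfect prediction gives an MAE of 0 and a maxF of 1, consistent with the reading of FIG. 11 that a larger maxF and a smaller MAE indicate higher accuracy.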
Embodiment 3:
An embodiment of the present invention further provides an image processing apparatus, which is mainly used to execute the image processing method provided above. The image processing apparatus is described in detail below.
FIG. 13 is a schematic diagram of an image processing apparatus according to an embodiment of the present invention. As shown in FIG. 13, the image processing apparatus mainly includes a feature extraction unit 10, an enhancement processing unit 20 and an image restoration unit 30, where:
the feature extraction unit is configured to acquire an image to be processed and perform feature extraction on it to obtain multi-scale feature maps;
the enhancement processing unit is configured to perform enhancement processing on the parts of the multi-scale feature maps corresponding to the salient object to obtain multi-scale enhanced feature maps;
the image restoration unit is configured to perform image restoration on the multi-scale enhanced feature maps to obtain a salient object mask corresponding to the image to be processed.
In this embodiment of the present invention, an image to be processed is first acquired and feature extraction is performed on it to obtain multi-scale feature maps. The parts of the multi-scale feature maps corresponding to the salient object are then enhanced to obtain multi-scale enhanced feature maps. Finally, image restoration is performed on the multi-scale enhanced feature maps to obtain the salient object mask corresponding to the image to be processed. As the above description shows, after enhancement the features corresponding to the salient object are more prominent in the multi-scale enhanced feature maps, so the salient object mask obtained by image restoration is more accurate. This alleviates the technical problem of poor accuracy in existing salient object segmentation methods.
Optionally, the feature extraction unit is further configured to: perform multi-layer downsampling on the image to be processed to obtain multi-scale original feature maps; and perform optimization processing on the multi-scale original feature maps to obtain the multi-scale feature maps.
Optionally, the feature extraction unit is further configured to: perform first optimization processing on the target original feature maps among the multi-scale original feature maps to obtain first optimized feature maps, where the target original feature maps are the feature maps other than the highest-dimensional original feature map; perform second optimization processing on the highest-dimensional original feature map to obtain a second optimized feature map; and take the first optimized feature maps and the second optimized feature map as the multi-scale feature maps.
Optionally, the feature extraction unit is further configured to: process the target original feature map with a first optimization module, which comprises a preset number of first convolutional layers, to obtain a first initial optimized feature map; and add the first initial optimized feature map to its corresponding target original feature map to obtain the first optimized feature map.
Optionally, the feature extraction unit is further configured to: process the highest-dimensional original feature map with a second optimization module, which comprises a second convolutional layer, a global pooling layer and a Sigmoid function layer, to obtain optimization weights; multiply the optimization weights with the highest-dimensional original feature map to obtain a second initial optimized feature map; and add the second initial optimized feature map to the highest-dimensional original feature map to obtain the second optimized feature map.
Optionally, the enhancement processing unit is further configured to: obtain the initial position of the salient object from the multi-scale feature maps; according to the initial position, crop the highest-dimensional feature map among the multi-scale feature maps at at least two different expansion scales to obtain multiple cropped feature maps, which contain feature information of the salient object; take one or more of the multi-scale feature maps as target feature maps and, treating each target feature map in turn as the current target feature map, compute the correlation between the multiple cropped feature maps and the current target feature map to obtain multiple correlation feature maps of the current target feature map that correspond one-to-one to the multiple cropped feature maps; and obtain the enhanced feature map corresponding to the current target feature map from the multiple correlation feature maps and the current target feature map.
Optionally, the enhancement processing unit is further configured to: perform dimensionality reduction on the highest-dimensional feature map among the multi-scale feature maps to obtain a single-channel feature map; binarize the single-channel feature map to obtain a single-channel binarized feature map; and determine the initial position of the salient object from the single-channel binarized feature map.
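A minimal sketch of this localization step is shown below. The text does not say how the channels are reduced or which threshold is used, so the channel mean, the threshold of 0.5, and the bounding-box representation `(x0, y0, x1, y1)` are all assumptions made for illustration, as are the function names.

```python
def reduce_to_single_channel(fmap):
    """Collapse [channel][row][col] to [row][col] by averaging channels (assumed)."""
    c = len(fmap)
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[sum(fmap[k][y][x] for k in range(c)) / c for x in range(w)]
            for y in range(h)]


def binarize(single, threshold=0.5):
    """Mark positions above the (assumed) threshold as 1, all others as 0."""
    return [[1 if v > threshold else 0 for v in row] for row in single]


def initial_position(binary):
    """Inclusive bounding box (x0, y0, x1, y1) of all nonzero positions,
    or None when the binarized map is empty."""
    ys = [y for y, row in enumerate(binary) if any(row)]
    xs = [x for row in binary for x, v in enumerate(row) if v]
    if not ys:
        return None
    return min(xs), min(ys), max(xs), max(ys)
```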
Optionally, the enhancement processing unit is further configured to: determine the pixel width and pixel height of the salient object from the initial position; determine an expansion pixel width and an expansion pixel height from the expansion scale, the pixel width, and the pixel height; and, in the highest-dimensional feature map, crop along the position obtained by expanding the initial position by the expansion pixel width and the expansion pixel height.
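The cropping step can be illustrated as follows. The exact expansion formula is not given in this paragraph, so the sketch assumes that each side of the box grows by `round(scale * width)` columns and `round(scale * height)` rows and that the result is clamped to the feature-map border. The function name `expand_and_crop` and the inclusive box format `(x0, y0, x1, y1)` are hypothetical.

```python
def expand_and_crop(fmap, box, scale):
    """Crop a [channel][row][col] feature map around an expanded bounding box."""
    x0, y0, x1, y1 = box
    h, w = len(fmap[0]), len(fmap[0][0])
    width, height = x1 - x0 + 1, y1 - y0 + 1
    # Expansion pixel width/height (assumed: scale times the object size).
    dx, dy = round(scale * width), round(scale * height)
    # Expand the box on every side, clamped to the map border.
    nx0, ny0 = max(0, x0 - dx), max(0, y0 - dy)
    nx1, ny1 = min(w - 1, x1 + dx), min(h - 1, y1 + dy)
    return [[row[nx0:nx1 + 1] for row in ch[ny0:ny1 + 1]] for ch in fmap]
```

Calling the function once per expansion scale, for example `[expand_and_crop(fmap, box, s) for s in (0.0, 0.5)]`, yields one cropped feature map per scale, each with a different amount of surrounding context.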
Optionally, the enhancement processing unit is further configured to: scale the plurality of cropped feature maps to a preset scale to obtain a plurality of cropped feature maps of the preset scale; slide a sliding window of the preset scale over the current target feature map; and, after each slide, multiply the portion of the feature map covered by the sliding window by each of the cropped feature maps of the preset scale, and obtain from the products the plurality of correlation feature maps of the current target feature map, in one-to-one correspondence with the cropped feature maps.
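The sliding-window correlation can be sketched as a cross-correlation in which the resized crop acts as the kernel. The sketch below assumes nearest-neighbor resizing, an odd preset window size, a stride of 1, zero padding so that the correlation map keeps the spatial size of the target, and summation of the element-wise products over all channels. None of these details is stated in the text, and the function names are hypothetical.

```python
def resize_nearest(fmap, size):
    """Resize every channel of [channel][row][col] to size x size (nearest neighbor)."""
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[[ch[y * h // size][x * w // size] for x in range(size)]
             for y in range(size)] for ch in fmap]


def correlation_map(target, kernel):
    """Slide a window of the kernel's size over the target and sum the
    element-wise products at each position (zero padding, stride 1)."""
    k = len(kernel[0])          # preset window size, assumed odd
    pad = k // 2
    h, w = len(target[0]), len(target[0][0])

    def at(c, y, x):
        # Positions outside the target contribute zero.
        return target[c][y][x] if 0 <= y < h and 0 <= x < w else 0.0

    return [[sum(kernel[c][i][j] * at(c, y + i - pad, x + j - pad)
                 for c in range(len(kernel))
                 for i in range(k) for j in range(k))
             for x in range(w)] for y in range(h)]
```

Running `correlation_map` once per resized crop gives one correlation map per crop, which matches the one-to-one correspondence described above.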
Optionally, the enhancement processing unit is further configured to: multiply each of the plurality of correlation feature maps by the current target feature map to obtain a plurality of first enhanced feature maps corresponding to the current target feature map; concatenate the first enhanced feature maps with the current target feature map to obtain a second enhanced feature map corresponding to the current target feature map; and obtain a position enhancement feature map corresponding to the current target feature map and concatenate the second enhanced feature map with the position enhancement feature map to obtain the enhanced feature map corresponding to the current target feature map, where the scale of the position enhancement feature map is the same as the scale of the second enhanced feature map.
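The fusion described here can be sketched as follows, assuming that each correlation map is a single-channel `[row][col]` map broadcast over the channels of the target, and that "concatenate" means concatenation along the channel axis. The channel ordering and the function name `enhance` are illustrative assumptions.

```python
def enhance(target, correlation_maps, position_maps):
    """Build the enhanced feature map for one target feature map.

    target:            [channel][row][col]
    correlation_maps:  list of [row][col] maps, same spatial size as target
    position_maps:     list of [row][col] maps, same spatial size as target
    """
    # First enhanced maps: each correlation map times every target channel.
    first = [[[[cm[y][x] * v for x, v in enumerate(row)]
               for y, row in enumerate(ch)]
              for ch in target]
             for cm in correlation_maps]
    # Second enhanced map: all first enhanced channels, then the target itself.
    second = [ch for fm in first for ch in fm] + list(target)
    # Final enhanced map: append the position enhancement channels.
    return second + list(position_maps)
```

With `C` target channels, `N` correlation maps, and `P` position maps, the result has `N * C + C + P` channels under these assumptions.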
Optionally, the enhancement processing unit is further configured to: determine the X-direction center line and the Y-direction center line of the salient object based on the initial position of the salient object; set the Y-direction center line to a first target value and vary the value linearly along the X direction to a second target value to obtain an X-direction position enhancement feature map; set the X-direction center line to the first target value and vary the value linearly along the Y direction to the second target value to obtain a Y-direction position enhancement feature map; and use the X-direction position enhancement feature map and the Y-direction position enhancement feature map as the position enhancement feature map.
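One way to read this construction is sketched below. The target values and the point at which the second value is reached are not specified in the text, so the sketch assumes a first target value of 1.0 on the center line, a second target value of 0.0, and a linear ramp that reaches the second value at the border farthest from the center line. The function names and the inclusive box format `(x0, y0, x1, y1)` are hypothetical.

```python
def axis_profile(length, center, first=1.0, second=0.0):
    """1-D ramp: `first` at `center`, changing linearly to `second`
    at the index farthest from the center (assumed end point)."""
    far = max(center, length - 1 - center) or 1
    return [first + (second - first) * abs(i - center) / far
            for i in range(length)]


def position_maps(height, width, box, first=1.0, second=0.0):
    """Return (x_map, y_map), each a [row][col] map of size height x width."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2   # center lines of the salient object
    px = axis_profile(width, cx, first, second)
    py = axis_profile(height, cy, first, second)
    x_map = [list(px) for _ in range(height)]        # varies along X only
    y_map = [[py[y]] * width for y in range(height)]  # varies along Y only
    return x_map, y_map
```

Both maps share the spatial size of the feature map they accompany, so they can be concatenated with it along the channel axis.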
Optionally, the image restoration unit is further configured to: upsample the multi-scale enhanced feature maps to obtain a salient object mask corresponding to the image to be processed.
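The upsampling step alone can be illustrated with the short sketch below. It deliberately omits how the multi-scale enhanced feature maps are fused and decoded, which this paragraph does not describe. It assumes the input has already been reduced to a single-channel score map, and it uses nearest-neighbor interpolation and a threshold of 0.5 purely for illustration. The function names are hypothetical.

```python
def upsample_nearest(score_map, out_h, out_w):
    """Nearest-neighbor upsampling of a [row][col] map to out_h x out_w."""
    h, w = len(score_map), len(score_map[0])
    return [[score_map[y * h // out_h][x * w // out_w] for x in range(out_w)]
            for y in range(out_h)]


def to_mask(score_map, out_h, out_w, threshold=0.5):
    """Upsample to the image size, then threshold into a binary mask."""
    up = upsample_nearest(score_map, out_h, out_w)
    return [[1 if v > threshold else 0 for v in row] for row in up]
```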
The image processing apparatus provided by this embodiment of the present invention has the same implementation principle and technical effects as the method embodiment in Embodiment 2 above. For brevity, for anything not mentioned in the apparatus embodiment, refer to the corresponding content in the foregoing method embodiment.
In another embodiment, a computer-readable medium having non-volatile program code executable by a processor is further provided, where the program code causes the processor to perform the steps of the method described in any of the embodiments in Embodiment 2 above.
In addition, in the description of the embodiments of the present invention, unless otherwise expressly specified and limited, the terms "mounted", "coupled", and "connected" should be understood broadly. For example, a connection may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through an intermediate medium, or internal communication between two elements. A person of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific circumstances.
In the description of the present invention, it should be noted that orientation or positional terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore should not be construed as limiting the present invention. In addition, the terms "first", "second", and "third" are used for descriptive purposes only and should not be construed as indicating or implying relative importance.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through communication interfaces, apparatuses, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the embodiments described above are merely specific implementations of the present invention, intended to illustrate its technical solutions rather than to limit them, and the protection scope of the present invention is not limited to them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that anyone familiar with this technical field may, within the technical scope disclosed by the present invention, still modify the technical solutions described in the foregoing embodiments, readily conceive of variations, or make equivalent replacements of some of their technical features. Such modifications, variations, or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (15)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010822524.4A CN112101376B (en) | 2020-08-14 | 2020-08-14 | Image processing method, device, electronic device and computer readable medium |
PCT/CN2021/092743 WO2022033088A1 (en) | 2020-08-14 | 2021-05-10 | Image processing method, apparatus, electronic device, and computer-readable medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010822524.4A CN112101376B (en) | 2020-08-14 | 2020-08-14 | Image processing method, device, electronic device and computer readable medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112101376A (en) | 2020-12-18 |
CN112101376B (en) | 2024-10-22 |
Family
ID=73753882
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010822524.4A Active CN112101376B (en) | 2020-08-14 | 2020-08-14 | Image processing method, device, electronic device and computer readable medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112101376B (en) |
WO (1) | WO2022033088A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022033088A1 (en) * | 2020-08-14 | 2022-02-17 | 北京迈格威科技有限公司 | Image processing method, apparatus, electronic device, and computer-readable medium |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110182517A1 (en) * | 2010-01-20 | 2011-07-28 | Duke University | Segmentation and identification of layered structures in images |
US20120050567A1 (en) * | 2010-09-01 | 2012-03-01 | Apple Inc. | Techniques for acquiring and processing statistics data in an image signal processor |
US20120050566A1 (en) * | 2010-09-01 | 2012-03-01 | Apple Inc. | Techniques for collection of auto-focus statistics |
US20120081580A1 (en) * | 2010-09-30 | 2012-04-05 | Apple Inc. | Overflow control techniques for image signal processing |
US20170116497A1 (en) * | 2015-09-16 | 2017-04-27 | Siemens Healthcare Gmbh | Intelligent Multi-scale Medical Image Landmark Detection |
CN109447990A (en) * | 2018-10-22 | 2019-03-08 | 北京旷视科技有限公司 | Image, semantic dividing method, device, electronic equipment and computer-readable medium |
CN109741293A (en) * | 2018-11-20 | 2019-05-10 | 武汉科技大学 | Significant detection method and device |
US20190205606A1 (en) * | 2016-07-21 | 2019-07-04 | Siemens Healthcare Gmbh | Method and system for artificial intelligence based medical image segmentation |
CN110097564A (en) * | 2019-04-04 | 2019-08-06 | 平安科技(深圳)有限公司 | Image labeling method, device, computer equipment and storage medium based on multi-model fusion |
CN111126258A (en) * | 2019-12-23 | 2020-05-08 | 深圳市华尊科技股份有限公司 | Image recognition method and related device |
CN111179193A (en) * | 2019-12-26 | 2020-05-19 | 苏州斯玛维科技有限公司 | Dermatoscope image enhancement and classification method based on DCNNs and GANs |
CN111435448A (en) * | 2019-01-11 | 2020-07-21 | 中国科学院半导体研究所 | Image saliency object detection method, device, equipment and medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9042648B2 (en) * | 2012-02-23 | 2015-05-26 | Microsoft Technology Licensing, Llc | Salient object segmentation |
CN109359654B (en) * | 2018-09-18 | 2021-02-12 | 北京工商大学 | Image segmentation method and system based on frequency tuning global saliency and deep learning |
CN109543701A (en) * | 2018-11-30 | 2019-03-29 | 长沙理工大学 | Vision significance method for detecting area and device |
CN110021031B (en) * | 2019-03-29 | 2023-03-10 | 中广核贝谷科技有限公司 | X-ray image enhancement method based on image pyramid |
CN112101376B (en) * | 2020-08-14 | 2024-10-22 | 北京迈格威科技有限公司 | Image processing method, device, electronic device and computer readable medium |
Non-Patent Citations (2)
Title |
---|
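Zhou Pengcheng; Gong Shengrong; Zhong Shan; Bao Zongming; Dai Xinghua: "Image semantic segmentation based on deep feature fusion", Computer Science (计算机科学), no. 02 * |
Li Xi; Xu Xiang; Li Jun: "Small object detection in remote sensing images for aviation flight safety", Aero Weaponry (航空兵器), no. 03 * |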
Also Published As
Publication number | Publication date |
---|---|
WO2022033088A1 (en) | 2022-02-17 |
CN112101376B (en) | 2024-10-22 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN109493350B (en) | Portrait segmentation method and device | |
WO2020034663A1 (en) | Grid-based image cropping | |
WO2019201035A1 (en) | Method and device for identifying object node in image, terminal and computer readable storage medium | |
CN108876792B (en) | Semantic segmentation method, device and system, and storage medium | |
CN109816011B (en) | Video key frame extraction method | |
CN111104962A (en) | Semantic segmentation method and device for image, electronic equipment and readable storage medium | |
CN105551036B (en) | A kind of training method and device of deep learning network | |
WO2022217876A1 (en) | Instance segmentation method and apparatus, and electronic device and storage medium | |
CN112990219B (en) | Method and device for image semantic segmentation | |
US10277806B2 (en) | Automatic image composition | |
CN109117846B (en) | Image processing method and device, electronic equipment and computer readable medium | |
CN109816659B (en) | Image segmentation method, device and system | |
JP6902811B2 (en) | Parallax estimation systems and methods, electronic devices and computer readable storage media | |
CN110163866A (en) | A kind of image processing method, electronic equipment and computer readable storage medium | |
CN109543685A (en) | Image, semantic dividing method, device and computer equipment | |
WO2020207134A1 (en) | Image processing method, device, apparatus, and computer readable medium | |
CN111832476A (en) | Layout analysis methods, reading aids, circuits and media | |
CN112419342A (en) | Image processing method, image processing device, electronic equipment and computer readable medium | |
WO2022033088A1 (en) | Image processing method, apparatus, electronic device, and computer-readable medium | |
WO2020077535A1 (en) | Image semantic segmentation method, computer device, and storage medium | |
CN115147606A (en) | Medical image segmentation method, device, computer equipment and storage medium | |
CN113628181A (en) | Image processing method, image processing device, electronic equipment and storage medium | |
US20230005104A1 (en) | Method and electronic device for performing ai based zoom of image | |
CN118470079A (en) | Monocular depth estimation device and method and electronic equipment | |
CN115272906A (en) | Video background portrait segmentation model and algorithm based on point rendering |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 2025-02-05. Address after: No. 257, 2nd Floor, Building 9, No. 2 Huizhu Road, Liangjiang New District, Yubei District, Chongqing, China 401123. Patentee after: Force Map New (Chongqing) Technology Co.,Ltd. Country or region after: China. Address before: 100086 316-318, Block A, Rongke Information Center, No. 2, South Academy of Sciences Road, Haidian District, Beijing. Patentee before: MEGVII (BEIJING) TECHNOLOGY Co.,Ltd. Country or region before: China |