WO2021104058A1 - Image segmentation method and apparatus, and terminal device - Google Patents

Image segmentation method and apparatus, and terminal device

Info

Publication number
WO2021104058A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
convolutional
convolutional layer
scale
layers
Prior art date
Application number
PCT/CN2020/128846
Other languages
French (fr)
Chinese (zh)
Inventor
司伟鑫
李才子
王琼
王平安
Original Assignee
中国科学院深圳先进技术研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Publication of WO2021104058A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform

Definitions

  • the multi-layer convolutional layer includes convolutional layers of multiple scales, and the convolutional layer of each scale includes multiple convolutional layers;
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, it implements the image segmentation method according to any implementation of the first aspect.
  • FIG. 10 is a schematic structural diagram of a computer to which the image segmentation method provided by an embodiment of the present application is applicable.
  • P9 is the output convolutional layer of the sub-tree
  • P6 is the parent node of the sub-tree
  • P9 is connected to each of P6, P7, and P8, and P6, P7, and P8 are connected in sequence; P9 and P6 are two non-adjacent convolutional layers, so the connection between them is a cross-layer connection (shown by the thin solid arrowed line in Figure 6), and P9 and P7 are likewise two non-adjacent convolutional layers joined by a cross-layer connection (also shown by the thin solid arrowed line in Figure 6).
  • Each convolutional layer in the next layer is connected to the convolutional layer of the same scale in the previous layer, where the next layer and the previous layer are two non-adjacent layers.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

An image segmentation method and apparatus, and a terminal device. The image segmentation method comprises: obtaining a target image to be segmented (101); performing convolution processing on said image by means of multiple convolutional layers (102), wherein the convolutional layers of the multiple convolutional layers are connected to one another, the convolutional layers in a first convolutional layer are sequentially connected in series, the convolutional layer of a first scale receives said image, the convolutional layers sequentially perform convolution down-sampling on said image, the convolutional layers in the last convolutional layer of the multiple convolution layers are sequentially connected in series, the convolutional layers sequentially perform convolution up-sampling on received feature information, and output a convolution processing result by means of the convolutional layer of the first scale; and performing image segmentation according to the convolution processing result (103). Compared with traditional neural networks, the method can greatly reduce the number of parameters, so that the calculation amount is reduced, and the performance and efficiency of the neural network are improved.

Description

Image segmentation method, apparatus, and terminal device
Technical field
This application belongs to the field of image processing technology, and in particular relates to an image segmentation method, an image segmentation apparatus, and a terminal device.
Background
Research on neural networks for image segmentation has produced many results. In most cases, however, humans can easily extract different information from an image across a range of spatial scales, obtaining image details and features from small regions up to large regions, whereas this remains a challenging task for computer equipment. Moreover, training a neural network involves a large number of parameters and a cumbersome process, so image segmentation with neural networks is costly and its accuracy is poor.
Summary of the invention
To overcome at least one problem in the related art, the embodiments of this application provide an image segmentation method, apparatus, and terminal device.
This application is implemented through the following technical solutions:
In a first aspect, an embodiment of this application provides an image segmentation method, including:
acquiring a target image to be segmented;
performing convolution processing on the target image to be segmented through multiple layers of convolutional layers, wherein the layers are connected to one another; the convolutional layers within the first layer are connected in series, the convolutional layer of the first scale receives the target image to be segmented, and the convolutional layers successively perform convolutional downsampling on the target image to be segmented; the convolutional layers within the last layer are connected in series, successively perform convolutional upsampling on the received feature information, and output the convolution processing result through the convolutional layer of the first scale; and
performing image segmentation according to the output result of the convolution processing.
In a possible implementation of the first aspect, the layers of the multiple layers of convolutional layers form a plurality of sequentially connected subtrees, each subtree includes at least two layers, and the parent node of each subtree is the aggregation of all previous subtrees.
In a possible implementation of the first aspect, each subtree includes an output layer; the output layer is connected to each of the other layers in the current subtree and to the parent node of the current subtree, and the parent node of the current subtree and the other layers in the current subtree are connected in sequence, where the parent node of the current subtree is the output layer of the previous subtree.
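The subtree connection rule described above can be sketched as a small graph builder. This is only an illustrative reading of the text, not the applicant's implementation: the function names `subtree_edges` and `build_subtrees` and the representation of layers as strings are assumptions, and the labels P6 to P9 are borrowed from the Figure 6 example given elsewhere in this document.

```python
def subtree_edges(parent, inner, output):
    """Return directed (source, target) connections for one subtree.

    parent: name of the parent node (output layer of the previous subtree)
    inner:  names of the other layers in this subtree, in order
    output: name of this subtree's output layer
    (All names are hypothetical labels used for illustration.)
    """
    chain = [parent] + list(inner)
    # Sequential connections: parent -> inner[0] -> inner[1] -> ...
    edges = [(a, b) for a, b in zip(chain, chain[1:])]
    # The output layer is connected to the parent and to every other layer.
    edges += [(node, output) for node in chain]
    return edges


def build_subtrees(first_parent, subtrees):
    """Chain subtrees so that each parent is the previous output layer."""
    edges, parent = [], first_parent
    for inner, output in subtrees:
        edges += subtree_edges(parent, inner, output)
        parent = output
    return edges


# Subtree with parent P6, other layers P7 and P8, and output layer P9.
example = subtree_edges("P6", ["P7", "P8"], "P9")
```

In this sketch, the pairs ("P6", "P9") and ("P7", "P9") correspond to cross-layer connections between non-adjacent layers, while ("P8", "P9") joins two adjacent layers.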
In a possible implementation of the first aspect, for two layers that have a connection relationship, each convolutional layer in the next layer is connected to the corresponding convolutional layer in the previous layer.
In a possible implementation of the first aspect, the multiple layers of convolutional layers include convolutional layers of multiple scales, and the convolutional layers of each scale include multiple convolutional layers;
each convolutional layer in the next layer being connected to the corresponding convolutional layer in the previous layer means that:
each convolutional layer in the next layer is connected to the convolutional layer of the same scale in the previous layer, where the next layer and the previous layer are two non-adjacent layers.
In a possible implementation of the first aspect, the multiple layers of convolutional layers include convolutional layers of multiple scales, and the convolutional layers of each scale include multiple convolutional layers;
each convolutional layer in the next layer being connected to the corresponding convolutional layer in the previous layer means that:
the convolutional layer of the current scale in the next layer is connected both to the convolutional layer of the current scale in the previous layer and to the convolutional layers of the scales adjacent to the current scale in the previous layer;
where the next layer and the previous layer are two adjacent layers.
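The two connection rules above (same scale only for non-adjacent layers; same scale plus neighboring scales for adjacent layers) can be illustrated with a short helper. This is a minimal sketch under stated assumptions: the function name `source_scales`, the numbering of scales from 1, and the reading of "adjacent scales" as the scales immediately above and below the current one are all illustrative choices, not taken from the application.

```python
def source_scales(scale, num_scales, adjacent_layers):
    """Scales in the previous layer that feed `scale` in the next layer.

    Scales are numbered 1..num_scales (1 is the first scale).
    adjacent_layers=True:  the two layers are adjacent, so the current
                           scale and its neighbouring scales are used.
    adjacent_layers=False: the two layers are non-adjacent, so only the
                           same scale is used.
    """
    if not 1 <= scale <= num_scales:
        raise ValueError("scale out of range")
    if not adjacent_layers:
        return [scale]
    candidates = (scale - 1, scale, scale + 1)
    # Drop neighbours that fall outside the available scales.
    return [s for s in candidates if 1 <= s <= num_scales]


print(source_scales(2, 4, adjacent_layers=True))   # [1, 2, 3]
print(source_scales(2, 4, adjacent_layers=False))  # [2]
```

At the first and last scales there is only one neighboring scale, so the helper returns two sources instead of three for adjacent layers.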
In a possible implementation of the first aspect, the convolutional upsampling or the convolutional downsampling is performed by nearest-neighbor interpolation.
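As an illustration of nearest-neighbor interpolation, the following sketch resizes a 2-D grid of values up or down. It covers only the resampling step, not the convolution that accompanies it in the method, and the function name `resize_nearest` and the floor-based index mapping are assumptions made for this example.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list of values with nearest-neighbour sampling.

    Each output pixel copies the input pixel whose index is
    floor(output_index * input_size / output_size).
    """
    in_h, in_w = len(image), len(image[0])
    return [
        [image[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


small = [[1, 2],
         [3, 4]]
up = resize_nearest(small, 4, 4)   # upsampling by a factor of 2
down = resize_nearest(up, 2, 2)    # downsampling back to 2 x 2
print(up)    # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
print(down)  # [[1, 2], [3, 4]]
```

Because nearest-neighbor interpolation only copies existing values, it introduces no learnable parameters of its own.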
In a second aspect, an embodiment of this application provides an image segmentation apparatus, including:
an image acquisition module, configured to acquire a target image to be segmented;
a convolution processing module, configured to perform convolution processing on the target image to be segmented through multiple layers of convolutional layers, wherein the layers are connected to one another; the convolutional layers within the first layer are connected in series, the convolutional layer of the first scale receives the target image to be segmented, and the convolutional layers successively perform convolutional downsampling on the target image to be segmented; the convolutional layers within the last layer are connected in series, successively perform convolutional upsampling on the received feature information, and output the convolution processing result through the convolutional layer of the first scale; and
a segmentation module, configured to perform image segmentation according to the convolution processing result.
In a third aspect, an embodiment of this application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the image segmentation method according to any implementation of the first aspect when executing the computer program.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, it implements the image segmentation method according to any implementation of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer program product that, when run on a terminal device, causes the terminal device to perform the image segmentation method according to any implementation of the first aspect.
It can be understood that, for the beneficial effects of the second to fifth aspects, reference may be made to the relevant description of the first aspect; details are not repeated here.
Compared with the prior art, the embodiments of this application have the following beneficial effects:
In the embodiments of this application, a target image to be segmented is acquired, and convolution processing is then performed on it through multiple layers of convolutional layers. The convolutional layers within the first layer are connected in series, the convolutional layer of the first scale receives the target image, and the convolutional layers successively perform convolutional downsampling on it; the convolutional layers within the last layer are connected in series, successively perform convolutional upsampling on the received feature information, and output the convolution processing result through the convolutional layer of the first scale. Image segmentation is then performed according to the output result of the convolution processing. Compared with a traditional neural network, this can greatly reduce the number of parameters, thereby reducing the amount of computation and improving the performance and efficiency of the neural network.
Further, the layers of the multiple layers of convolutional layers form a plurality of sequentially connected subtrees, each subtree includes at least two layers, and the parent node of each subtree is the aggregation of all previous subtrees, so that the features extracted by different layers of the neural network are fused with one another. If each feature pyramid is regarded as a single feature extractor, a feature pyramid closer to the input can be called a shallow pyramid, and one farther from the input a deep pyramid. Shallow pyramids are better at extracting low-level features, whereas the features in deep pyramids are mostly high-level semantic features. Fusing the two levels efficiently combines deep and shallow features, makes effective use of the information in feature pyramids at different levels, and thereby improves the accuracy of image segmentation.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit this specification.
Description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of this application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment of an image segmentation method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of an image segmentation method provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the parallel mechanism between the convolutional layers of a neural network provided by an embodiment of this application;
FIG. 4 is a schematic structural diagram of the multiple layers of convolutional layers provided by an embodiment of this application;
FIG. 5 is a schematic structural diagram of the multiple layers of convolutional layers provided by an embodiment of this application;
FIG. 6 is a schematic structural diagram of the multiple layers of convolutional layers provided by an embodiment of this application;
FIG. 7 is a schematic diagram of the connections between the intermediate layers of the multiple layers of convolutional layers provided by an embodiment of this application;
FIG. 8 is a schematic structural diagram of an image segmentation apparatus provided by an embodiment of this application;
FIG. 9 is a schematic structural diagram of a terminal device provided by an embodiment of this application;
FIG. 10 is a schematic structural diagram of a computer to which the image segmentation method provided by an embodiment of this application is applicable.
Detailed description
In the following description, specific details such as particular system structures and technologies are set out for the purpose of illustration rather than limitation, so that the embodiments of this application can be thoroughly understood. However, it should be clear to those skilled in the art that this application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, apparatuses, circuits, and methods are omitted so that unnecessary details do not obscure the description of this application.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
As used in this specification and the appended claims, the term "if" may be interpreted, depending on context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
In addition, in this specification and the appended claims, the terms "first", "second", "third", and so on are used only to distinguish between descriptions and are not to be understood as indicating or implying relative importance.
References in this specification to "one embodiment" or "some embodiments" mean that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of this application. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "including", "comprising", "having", and their variants all mean "including but not limited to", unless specifically emphasized otherwise.
Improving the multi-scale representation ability of neural networks is an important way to improve multi-tissue segmentation of cardiac MRI images. In computer vision, image pyramids are widely used in many forms and methods. Although research on neural networks for image segmentation has produced many results, in most cases segmentation networks have too many parameters: even though convolutional neural networks are built on parameter sharing, complex network designs bring a large number of parameters into gradient optimization, which produces a huge amount of computation. Moreover, although multi-scale fusion pyramid networks have shown good segmentation ability on cardiac MRI images, the overly long connection path between the input layer and the output layer leads to too little information exchange between the shallow pyramids and the deep pyramids; that is, the network's multi-scale feature representation is insufficient.
Based on the above problems, the embodiments of this application provide an image segmentation method, apparatus, and terminal device that use a neural network with a structure of multiple layers of convolutional layers. The convolutional layers within the first layer successively perform convolutional downsampling on the target image to be segmented, the convolutional layers within the last layer successively perform convolutional upsampling on the received feature information, and the intermediate layers are connected to one another. Compared with a traditional neural network, this can greatly reduce the number of parameters, thereby reducing the amount of computation and improving the performance and efficiency of the neural network.
Further, the layers of the multiple layers of convolutional layers form a plurality of sequentially connected subtrees, each subtree includes at least two layers, and the parent node of each subtree is the aggregation of all previous subtrees, so that the features extracted by different layers of the neural network are fused with one another. If each feature pyramid is regarded as a single feature extractor, a feature pyramid closer to the input can be called a shallow pyramid, and one farther from the input a deep pyramid. Shallow pyramids are better at extracting low-level features, whereas the features in deep pyramids are mostly high-level semantic features. Fusing the two levels efficiently combines deep and shallow features, makes effective use of the information in feature pyramids at different levels, and thereby improves the accuracy of image segmentation.
Specifically, a target image to be segmented can be acquired, and convolution processing is then performed on it through multiple layers of convolutional layers, where the layers are connected to one another. The convolutional layers within the first layer are connected in series, the convolutional layer of the first scale receives the target image, and the convolutional layers successively perform convolutional downsampling on it; the convolutional layers within the last layer are connected in series, successively perform convolutional upsampling on the received feature information, and output the convolution processing result through the convolutional layer of the first scale. Finally, image segmentation is performed according to the output result of the convolution processing.
For example, the embodiments of this application can be applied to the exemplary scenario shown in FIG. 1. In this scenario, the magnetic resonance scanning device 10 scans a part of the human body to obtain a scanned image of that part to be segmented, for example a heart image, and sends the scanned image to the image segmentation device 20. After obtaining the scanned image, the image segmentation device 20 takes it as the target image to be segmented, performs convolution processing on it through multiple layers of convolutional layers to obtain a convolution processing result, and then performs image segmentation according to that result.
It should be noted that the above application scenario is given as an example and does not limit the application scenarios of the embodiments of this application; in fact, the embodiments can also be applied in other scenarios. For example, in some other exemplary application scenarios, medical staff may select the target image to be segmented and send it to the image segmentation device.
To enable those skilled in the art to better understand the solutions of the present invention, the technical solutions in the embodiments of this application are described clearly and completely below with reference to FIG. 1. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 2 is a schematic flowchart of an image segmentation method provided by an embodiment of this application. Referring to FIG. 1, the image segmentation method is described in detail as follows:
In step 101, a target image to be segmented is acquired.
The target image to be segmented may be an image obtained by magnetic resonance imaging (MRI) of a part of the human body, for example an MRI image of the human heart.
Optionally, after the target image to be segmented is acquired, it may be preprocessed, and the subsequent steps are then performed on the preprocessed image. For example, the target image to be segmented may be an image of size m*n that contains the heart; the heart region may be identified and extracted from it to obtain a 128*128 image, on which the subsequent steps are then performed.
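The cropping part of this preprocessing can be sketched as follows. The text does not say how the heart region is identified, so the sketch simply assumes that a center point for the heart is already available from some earlier detection step; the function name `crop_centered` and its parameters are hypothetical.

```python
def crop_centered(image, center_row, center_col, size=128):
    """Crop a size x size window around a given centre.

    `image` is a 2-D list (m rows by n columns). The window is shifted
    as needed so that it stays entirely inside the image. The centre is
    assumed to come from a separate heart-localisation step that is not
    shown here.
    """
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size")
    top = min(max(center_row - size // 2, 0), height - size)
    left = min(max(center_col - size // 2, 0), width - size)
    return [row[left:left + size] for row in image[top:top + size]]


# Toy m*n image whose pixel values record their own coordinates.
m, n = 200, 256
image = [[(r, c) for c in range(n)] for r in range(m)]
patch = crop_centered(image, center_row=100, center_col=128)
print(len(patch), len(patch[0]))  # 128 128
```

Clamping the window keeps the output at 128*128 even when the assumed center lies close to the image border.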
In step 102, convolution processing is performed on the target image to be segmented through a multi-layer convolutional layer.
The layers of the multi-layer convolutional layer are connected to one another. The convolutional layers within the first layer are connected in series: the convolutional layer of the first scale receives the target image to be segmented, and the convolutional layers successively perform convolutional down-sampling on it. The convolutional layers within the last layer are also connected in series: they successively perform convolutional up-sampling on the received feature information, and the convolution processing result is output through the convolutional layer of the first scale.
To facilitate understanding of the above image segmentation method, the parallel multi-scale cross-fusion pyramid is introduced first.
The traditional U-Net structure extracts features at different levels by successively convolving and pooling the input image, and then restores the deep, strongly semantic feature maps step by step to the original image size through successive deconvolution operations. During this size restoration, the cross-connections (skip connections) greatly enhance the feature expression ability of the convolutional layers on the restoration path: the convolutional layers on the contracting path, which have good contour features, are fused with the strongly semantic convolutional layers on the expanding path, and pixel-by-pixel classification is finally completed. However, the U-Net structure still has shortcomings for the segmentation of cardiac MRI images. The cross-connection mechanism of U-Net only links convolutional layers of the same scale, so its feature fusion ability is insufficient. At the base and the apex of the segmentation target, and especially at the apex, the left and right ventricles and the myocardium are usually small, so U-Net shows clear defects in segmenting these positions. In addition, as U-Net goes from high-resolution to low-resolution convolutional layers, the number of feature maps doubles at each step, which makes the parameter count of the model large and places a burden on computing resources.
To avoid the shortcomings that the contracting-and-expanding design of U-Net has in multi-scale information fusion, to reduce the parameter count of the segmentation model, and to further improve multi-tissue segmentation of images, for example of cardiac MRI images, the embodiments of the present application propose a parallel cross-scale neural network structure.
The core of the parallel cross-scale neural network structure is the mutual fusion of convolutional-layer features across scales, which strengthens multi-scale information exchange between the convolutional layers of the neural network. A parallel multi-scale fusion unit is given by formula (1):
[Figure PCTCN2020128846-appb-000001]
In a parallel multi-scale fusion unit, suppose the output convolutional layer is [Figure PCTCN2020128846-appb-000002], which denotes the n-th convolutional layer of the (i+1)-th layer, and the input convolutional layers are [Figure PCTCN2020128846-appb-000003], which denote the n-th convolutional layer of the i-th layer and the convolutional layers of its two adjacent scales. The symbol ο denotes concatenation, and the function F(.) denotes a group of operations, which may include, for example, convolution, batch normalization, and activation. The three input convolutional layers contain features of different scales. The larger-scale feature map is down-sampled to the dimensions of the output convolutional layer [Figure PCTCN2020128846-appb-000004], and the smaller-scale feature map is up-sampled to the dimensions of the output convolutional layer [Figure PCTCN2020128846-appb-000005]. The two resampled convolutional layers are then concatenated along the channel axis with the input convolutional layer whose scale matches the output, and the concatenated result undergoes a 3*3 convolution, batch normalization, and ReLU activation. Up-sampling uses nearest-neighbor interpolation and down-sampling uses max pooling. After multi-scale feature fusion, the output convolutional layer has in effect extracted feature information at different scales. Note that when [Figure PCTCN2020128846-appb-000006] has no adjacent convolutional layer of larger scale or of smaller scale, there are only two input convolutional layers.
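The resampling and concatenation performed inside one parallel multi-scale fusion unit may be sketched as follows. This is an illustration under stated assumptions rather than the applicant's implementation: adjacent scales are assumed to differ by a factor of 2 in resolution, feature maps are plain nested lists, the function names are hypothetical, and the group of operations F(.) (3*3 convolution, batch normalization, ReLU) is deliberately left out so that only the max pooling, nearest-neighbor up-sampling, and channel-wise concatenation are shown.

```python
def max_pool_2x2(channel):
    """Down-sample one 2D channel by taking the maximum of each 2x2 block.

    Assumes the channel has even height and width.
    """
    return [
        [max(channel[r][c], channel[r][c + 1],
             channel[r + 1][c], channel[r + 1][c + 1])
         for c in range(0, len(channel[0]), 2)]
        for r in range(0, len(channel), 2)
    ]


def upsample_nearest_2x(channel):
    """Up-sample one 2D channel by repeating every pixel in a 2x2 block."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_inputs(same, larger=None, smaller=None):
    """Resample the neighbours to the output scale and concatenate by channel.

    Each argument is a feature map given as a list of 2D channels.
    `larger` has twice the resolution of `same` and `smaller` has half
    (the factor of 2 is an assumption). Either neighbour may be None,
    which covers the top and bottom scales with only two inputs.
    """
    fused = []
    if larger is not None:
        fused += [max_pool_2x2(ch) for ch in larger]
    fused += [[list(row) for row in ch] for ch in same]
    if smaller is not None:
        fused += [upsample_nearest_2x(ch) for ch in smaller]
    return fused


# One channel per input, with toy values.
larger = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
same = [[[0, 1], [2, 3]]]
smaller = [[[7]]]
fused = fuse_inputs(same, larger, smaller)
```

In a full unit, the concatenated channels in `fused` would then be passed through F(.) to produce the output convolutional layer.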
A parallel multi-scale cross pyramid unit contains an input feature pyramid P_i and an output feature pyramid P_{i+1}. FIG. 3 shows the connections between P_i and P_{i+1}. A parallel cross-scale fusion unit exists at every scale of the two pyramids, so that, for the output pyramid, the convolutional layer at each scale can receive information from the convolutional layer of the corresponding scale in the input pyramid and from all of its adjacent layers.
The parallel multi-scale fusion pyramid has three advantages for semantic segmentation: 1) compared with the traditional encoder-decoder structure, the parallel cross-scale fusion pyramid greatly enhances multi-scale feature fusion from the shallow feature layers to the deep feature layers; 2) repeatedly applying feature pyramids can enlarge the receptive field of the neural network exponentially, which benefits pixel-level classification; 3) convolutional layers of the same resolution can interact directly, which reduces the information loss caused by pooling followed by up-sampling.
In the embodiments of the present application, a cascaded parallel multi-scale fusion pyramid network (PCP-Net) is proposed by further simplifying the multi-scale fusion structure. The structure of PCP-Net is shown in FIG. 4; to keep the diagram concise, several identical parallel multi-scale connected pyramids are omitted from the figure. The overall structure consists of several cascaded feature pyramids, each containing multiple (for example, 5) convolutional layers of different scales. For the first pyramid on the left, the 5 convolutional layers of different scales are obtained by successively applying convolutional down-sampling to the input image. For the last layer, the 5 convolutional layers of different scales are obtained by up-sampling from low resolution to high resolution. The pyramids in between are connected to one another, for example according to the parallel multi-scale fusion rule.
In some embodiments, for two layers that are connected to each other, each convolutional layer of the next layer is connected to the corresponding convolutional layer of the previous layer.
Referring to FIG. 4, the multi-layer convolutional layer may include m layers, namely the 1st layer, the 2nd layer, ..., the (i-1)-th layer, the i-th layer, the (i+1)-th layer, ..., the (m-1)-th layer, and the m-th layer.
The convolutional layers within the 1st layer are connected in series: the topmost convolutional layer receives the target image to be segmented, and the convolutional layers successively perform convolutional down-sampling on it. The convolutional layers within the m-th layer are likewise connected in series: they successively perform convolutional up-sampling on the received feature information, and the convolution processing result is output through the topmost convolutional layer. The convolutional up-sampling or the convolutional down-sampling may be performed by nearest-neighbor interpolation.
In this embodiment, the multi-layer convolutional layer includes convolutional layers of multiple scales, and each scale includes multiple convolutional layers. In a possible implementation, connecting each convolutional layer of the next layer to the corresponding convolutional layer of the previous layer may mean: the convolutional layer of the current scale in the next layer is connected both to the convolutional layer of the current scale in the previous layer and to the convolutional layers of the scales adjacent to the current scale.
For example, referring to FIG. 4, the scale-1 convolutional layer in the i-th layer may be connected to the scale-1 and scale-2 convolutional layers in the (i-1)-th layer; the scale-2 convolutional layer in the i-th layer may be connected to the scale-1, scale-2, and scale-3 convolutional layers in the (i-1)-th layer; the scale-3 convolutional layer in the i-th layer may be connected to the scale-2, scale-3, and scale-4 convolutional layers in the (i-1)-th layer; the scale-4 convolutional layer in the i-th layer may be connected to the scale-3, scale-4, and scale-5 convolutional layers in the (i-1)-th layer; and the scale-5 convolutional layer in the i-th layer may be connected to the scale-4 and scale-5 convolutional layers in the (i-1)-th layer.
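The connection rule in this example can be written compactly. The following sketch, in which the function name `input_scales` is hypothetical, lists for each scale of the next layer the scales of the previous layer that feed it, assuming scales are numbered 1 to 5 as in the example.

```python
def input_scales(scale, num_scales=5):
    """Return the scales in the previous layer that feed `scale` in the next.

    Scales are numbered 1..num_scales. A scale receives input from the same
    scale and from its adjacent scales, where they exist.
    """
    if not 1 <= scale <= num_scales:
        raise ValueError("scale out of range")
    return [s for s in (scale - 1, scale, scale + 1) if 1 <= s <= num_scales]


# Mapping: scale in the i-th layer -> connected scales in the (i-1)-th layer.
connections = {s: input_scales(s) for s in range(1, 6)}
```

The top and bottom scales receive two inputs and every other scale receives three, which matches the enumeration above.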
It should be noted that in FIG. 4 every convolutional layer is followed by a batch normalization layer and a ReLU activation layer; for brevity, these are treated as part of the convolution operation in the figure.
FIG. 5 and FIG. 6 are schematic structural diagrams of the multi-layer convolutional layer provided by embodiments of the present application. Referring to FIG. 5 and FIG. 6, a hierarchically aggregated parallel multi-scale fusion pyramid network (APCP-Net) is proposed on the basis of the above PCP-Net. The layers of the multi-layer convolutional layer may form a plurality of sequentially connected subtrees; each subtree includes at least two layers, and the parent node of each subtree is the aggregation of all previous subtrees.
For example, the multi-layer convolutional layer includes 9 layers, denoted P_1, P_2, ..., P_9. P_1, P_2, and P_3 form a subtree, and P_3 aggregates the features of the two nodes P_1 and P_2; P_3, P_4, P_5, and P_6 form a subtree, and P_6 aggregates the features of P_3, P_4, and P_5; P_6, P_7, P_8, and P_9 form a subtree, and P_9 aggregates the features of P_6, P_7, and P_8. In other words, hierarchical aggregation fuses the layers stage by stage. This achieves better fusion of shallow and deep features without the excessive concatenation operations that dense connections require, making feature fusion across levels more efficient.
For example, each subtree includes an output layer. The output layer is connected to each of the other layers in the current subtree and to the parent node of the current subtree, and the parent node and the other layers of the current subtree are connected in sequence. The parent node of the current subtree is the output layer of the previous subtree.
For example, referring to FIG. 5 and FIG. 6, in the subtree formed by P_1, P_2, and P_3, P_3 is the output layer of the subtree, and the parent node of this first subtree can be regarded as P_1, which receives the target image to be segmented (Input). P_3 is connected to P_1 and to P_2, and P_1 is connected to P_2. P_3 and P_1 are two non-adjacent layers, so the connection between them is a cross-layer connection (shown by the thin solid arrows in FIG. 6).
In the subtree formed by P_3, P_4, P_5, and P_6, P_6 is the output layer of the subtree and P_3 is its parent node. P_6 is connected to each of P_3, P_4, and P_5, and P_3, P_4, and P_5 are connected in sequence. P_6 and P_3 are two non-adjacent layers, so the connection between them is a cross-layer connection (shown by the thin solid arrows in FIG. 6); P_6 and P_4 are also two non-adjacent layers, so the connection between them is likewise a cross-layer connection (shown by the thin solid arrows in FIG. 6).
In the subtree formed by P_6, P_7, P_8, and P_9, P_9 is the output layer of the subtree and P_6 is its parent node. P_9 is connected to each of P_6, P_7, and P_8, and P_6, P_7, and P_8 are connected in sequence. P_9 and P_6 are two non-adjacent layers, so the connection between them is a cross-layer connection (shown by the thin solid arrows in FIG. 6); P_9 and P_7 are also two non-adjacent layers, so the connection between them is likewise a cross-layer connection (shown by the thin solid arrows in FIG. 6).
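The subtree connections described for P_1 to P_9 can be enumerated with a short sketch. The representation below (each subtree as a list whose first entry is the parent node and whose last entry is the output layer) and the function name `subtree_edges` are assumptions chosen for illustration; the code only reproduces the layer-to-layer topology, not the per-scale connections.

```python
def subtree_edges(subtrees):
    """Build directed edges (source, target) between layers P_k.

    Each subtree is a list of layer indices whose first entry is the parent
    node and whose last entry is the output layer. Inside a subtree the
    non-output layers are chained in sequence, and every one of them also
    feeds the output layer.
    """
    edges = set()
    for nodes in subtrees:
        *inner, output = nodes
        for a, b in zip(inner, inner[1:]):
            edges.add((a, b))       # sequential connection
        for a in inner:
            edges.add((a, output))  # aggregation into the output layer
    return sorted(edges)


subtrees = [[1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
edges = subtree_edges(subtrees)
# A connection between non-adjacent layers is a cross-layer connection.
cross_layer = [(a, b) for a, b in edges if b - a > 1]
```

With these three subtrees, `cross_layer` contains exactly the pairs named in the text: P_1 to P_3, P_3 to P_6, P_4 to P_6, P_6 to P_9, and P_7 to P_9.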
In a possible implementation, connecting each convolutional layer of the next layer to the corresponding convolutional layer of the previous layer may mean: each convolutional layer of the next layer is connected to the convolutional layer of the same scale in the previous layer, where the next layer and the previous layer are two non-adjacent layers.
For example, the 9th layer P_9 and the 7th layer P_7 are two non-adjacent layers, but a connection exists between them. In this case, the scale-1 convolutional layer in P_9 [Figure PCTCN2020128846-appb-000007] may be connected to the scale-1 convolutional layer in P_7 [Figure PCTCN2020128846-appb-000008]; the scale-2 convolutional layer in P_9 [Figure PCTCN2020128846-appb-000009] may be connected to the scale-2 convolutional layer in P_7 [Figure PCTCN2020128846-appb-000010]; the scale-3 convolutional layer in P_9 [Figure PCTCN2020128846-appb-000011] may be connected to the scale-3 convolutional layer in P_7 [Figure PCTCN2020128846-appb-000012]; the scale-4 convolutional layer in P_9 [Figure PCTCN2020128846-appb-000013] may be connected to the scale-4 convolutional layer in P_7 [Figure PCTCN2020128846-appb-000014]; and the scale-5 convolutional layer in P_9 [Figure PCTCN2020128846-appb-000015] may be connected to the scale-5 convolutional layer in P_7 [Figure PCTCN2020128846-appb-000016].
It should be noted that other pairs of connected, non-adjacent layers may follow the connection relationship described above between the convolutional layers of the 9th layer P_9 and the 7th layer P_7; the details are not repeated here.
In another possible implementation, connecting each convolutional layer of the next layer to the corresponding convolutional layer of the previous layer may mean: the convolutional layer of the current scale in the next layer is connected both to the convolutional layer of the current scale in the previous layer and to the convolutional layers of the scales adjacent to the current scale, where the next layer and the previous layer are two adjacent layers.
Referring to FIG. 7, the i-th layer P_i and the (i+1)-th layer P_{i+1} are two adjacent layers with a connection between them, where 1≤i≤m-1 and m is the number of layers in the multi-layer convolutional layer. In this case, the scale-1 convolutional layer in P_{i+1} [Figure PCTCN2020128846-appb-000017] may be connected to the scale-1 convolutional layer [Figure PCTCN2020128846-appb-000018] and the scale-2 convolutional layer [Figure PCTCN2020128846-appb-000019] in P_i; the scale-2 convolutional layer in P_{i+1} [Figure PCTCN2020128846-appb-000020] may be connected to the scale-1 convolutional layer [Figure PCTCN2020128846-appb-000021], the scale-2 convolutional layer [Figure PCTCN2020128846-appb-000022], and the scale-3 convolutional layer [Figure PCTCN2020128846-appb-000023] in P_i; the scale-3 convolutional layer in P_{i+1} [Figure PCTCN2020128846-appb-000024] may be connected to the scale-2 convolutional layer [Figure PCTCN2020128846-appb-000025], the scale-3 convolutional layer [Figure PCTCN2020128846-appb-000026], and the scale-4 convolutional layer [Figure PCTCN2020128846-appb-000027] in P_i; the scale-4 convolutional layer in P_{i+1} [Figure PCTCN2020128846-appb-000028] may be connected to the scale-3 convolutional layer [Figure PCTCN2020128846-appb-000029], the scale-4 convolutional layer [Figure PCTCN2020128846-appb-000030], and the scale-5 convolutional layer [Figure PCTCN2020128846-appb-000031] in P_i; and the scale-5 convolutional layer in P_{i+1} [Figure PCTCN2020128846-appb-000032] may be connected to the scale-4 convolutional layer [Figure PCTCN2020128846-appb-000033] and the scale-5 convolutional layer [Figure PCTCN2020128846-appb-000034] in P_i.
It should be noted that FIG. 5 to FIG. 7 show only the convolutional layers. Every convolutional layer is followed by a batch normalization layer and a ReLU activation layer, which for brevity are treated as part of the convolution operation.
The purpose of a cross-layer connection is to propagate shallow contour information forward; multi-scale fusion is not required for it. Moreover, too much multi-scale fusion would greatly increase the number of parameters and make the model less compact.
Of course, in other embodiments, two layers joined by a cross-layer connection may also use the same connection pattern as that between the convolutional layers of the i-th layer P_i and the (i+1)-th layer P_{i+1}; the embodiments of the present application do not limit this.
In step 103, image segmentation is performed according to the convolution processing result.
After the target image to be segmented has been convolved through the multi-layer convolutional layer, image segmentation is performed according to the convolution processing result to obtain the image segmentation result.
In the above image segmentation method, the target image to be segmented is acquired and then convolved through the multi-layer convolutional layer. The convolutional layers within the first layer are connected in series: the convolutional layer of the first scale receives the target image to be segmented, and the convolutional layers successively perform convolutional down-sampling on it. The convolutional layers within the last layer are connected in series: they successively perform convolutional up-sampling on the received feature information, and the convolution processing result is output through the convolutional layer of the first scale. Image segmentation is then performed according to that output. Compared with a traditional neural network, this can greatly reduce the number of parameters, thereby reducing the amount of computation and improving the performance and efficiency of the neural network.
Further, the layers of the multi-layer convolutional layer form a plurality of sequentially connected subtrees, each subtree includes at least two layers, and the parent node of each subtree is the aggregation of all previous subtrees. The features extracted by different layers of the neural network are thus fused with one another. If each feature pyramid is regarded as a single feature extractor, a pyramid close to the input can be called a shallow pyramid and one far from the input a deep pyramid. Shallow pyramids are better at extracting low-level features, whereas the features in deep pyramids are mostly high-level semantic features. Hierarchically fusing the two fuses deep and shallow features efficiently, so that the information in feature pyramids at different levels is used effectively and the accuracy of image segmentation improves.
Experiments on the PCP-Net and APCP-Net structures used in the above image segmentation method are described below to verify the effectiveness of the two structures.
The Dice similarity coefficient and the Hausdorff distance on the validation data set are shown in Table 1.
Table 1 Experimental results of the PCP-Net structure
[Figure PCTCN2020128846-appb-000035]
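For reference, the two metrics reported in Table 1 can be computed as in the sketch below, which uses their standard definitions. The description does not state how the metrics were implemented (for example, whether the Hausdorff distance is taken over contour points or in physical units), so the pixel-coordinate representation and the function names are assumptions, and the sample masks are invented purely to exercise the code.

```python
import math


def dice_coefficient(pred, truth):
    """Dice similarity coefficient of two sets of foreground pixels."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        return 1.0  # convention chosen here for two empty masks
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


# Toy masks given as (row, col) coordinates of foreground pixels.
pred = [(0, 0), (0, 1), (1, 0), (1, 1)]
truth = [(0, 1), (1, 1), (0, 2), (1, 2)]
dice = dice_coefficient(pred, truth)
hd = hausdorff_distance(pred, truth)
```

A higher Dice coefficient and a lower Hausdorff distance both indicate closer agreement between the predicted and reference segmentations.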
The experimental results show that the PCP-Net structure outperforms the baseline model on the segmentation of the left ventricle and the myocardium, while the right-ventricle result is unchanged. That is, PCP-Net improves segmentation accuracy for the left ventricle and the myocardium but has no effect on the right ventricle. In fact, the shape of the right ventricle differs greatly between diastole and systole, and on some slices the target is absent altogether, so the target may vanish during feature down-sampling and repeated convolution. The linearly connected multi-level pyramid network therefore still has shortcomings in segmenting the right ventricle. In terms of computational complexity, the model has 0.278 million parameters, nearly 90% fewer than the baseline model. The validation results thus confirm that the PCP-Net structure not only improves segmentation accuracy but also greatly reduces model complexity.
Compared with the conventional U-Net, the PCP-Net structure has the following advantages:
1. The convolutional layers of the different pyramids, from high resolution to low resolution, are connected in parallel, and apart from the pyramids of the input and output layers the convolutional layers do not exchange information with one another. This design plays an important role in the hierarchical extraction of image features: the convolutional layers at each resolution focus on extracting features at their own level. It makes effective use of the pyramid's hierarchical feature extraction while keeping model complexity as low as possible, since the more feature maps are fused, the more they contribute to parameter growth;
2. Multiple feature pyramids are connected in parallel. Low-resolution deep semantic features are extracted efficiently, and high-resolution feature representations are maintained several times between input and output, which is very important for refining the contour of the segmentation target. Given the low resolution of MRI images in particular, this design better preserves and refines low-level contour features;
3. U-Net doubles the number of feature maps each time it halves the feature-map resolution, which inevitably causes a large increase in parameters. For MRI segmentation, however, there are few target classes and the gray values within the image vary smoothly, so representing high-level semantic features does not require many feature maps. By contrast, the blurred edges between tissues need a better feature representation if they are to be delineated well. The multi-level parallel multi-scale fusion network therefore drops the design in which the number of feature maps doubles as the resolution decreases, and uses the same number of feature maps at every resolution.
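The parameter argument in advantage 3 can be illustrated with a rough count. The following is a minimal sketch, assuming one 3x3 convolution per resolution level, a single-channel input, and hypothetical channel widths (64 doubling to 1024, versus a constant 64); these widths are illustrative assumptions and are not the configuration of PCP-Net or of the baseline model, so the totals do not correspond to the 0.278 million figure reported above.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k 2-D convolution."""
    return k * k * c_in * c_out + c_out


def chain_params(channels, c_in=1):
    """Total parameters of a chain of 3x3 convolutions, one per resolution level."""
    total = 0
    for c_out in channels:
        total += conv_params(c_in, c_out)
        c_in = c_out
    return total


# Hypothetical widths: doubling at each downsampling step versus constant width.
doubling = chain_params([64, 128, 256, 512, 1024])
constant = chain_params([64, 64, 64, 64, 64])

print(f"doubling widths: {doubling} parameters")
print(f"constant widths: {constant} parameters")
print(f"ratio: {doubling / constant:.1f}x")
```

Because the weight count of a convolution grows with the product of its input and output channel counts, most of the parameters in the doubling scheme sit in the lowest-resolution levels, which is where a constant width saves the most.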
Table 2 shows the experimental results of the APCP-Net structure on the validation set. Compared with PCP-Net, the hierarchical aggregation mechanism increases the Dice similarity coefficient of each tissue in systole, which indicates that hierarchical aggregation contributes significantly to fusing features of different depths. Moreover, because the hierarchical aggregation mechanism is efficient, it does not cause large parameter growth in APCP-Net: APCP-Net has 0.317 million parameters, close to the parameter count of PCP-Net.
Table 2 Experimental results of the APCP-Net structure
Figure PCTCN2020128846-appb-000036
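For reference, the Dice similarity coefficient reported for each tissue is conventionally defined as 2|A ∩ B| / (|A| + |B|), where A is the predicted region and B the ground-truth region of one class. The sketch below is an illustrative implementation over flat label sequences; the function name, the label encoding, and the choice to return 1.0 when a class is absent from both masks are assumptions, not details taken from the application.

```python
def dice_coefficient(pred, target, label):
    """Dice = 2|A ∩ B| / (|A| + |B|) for one class label over flat label sequences."""
    a = [p == label for p in pred]
    b = [t == label for t in target]
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    size = sum(a) + sum(b)
    # Assumed convention: a class absent from both masks counts as a perfect match.
    return 1.0 if size == 0 else 2.0 * intersection / size


# Toy example with hypothetical labels 0 (background), 1 and 2 (two tissues).
pred = [0, 1, 1, 2]
target = [0, 1, 2, 2]
for label in (0, 1, 2):
    print(label, dice_coefficient(pred, target, label))
```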
The hierarchical fusion mechanism lets shallow features keep propagating forward throughout feature fusion. Although downsampling in the first pyramid and the subsequent repeated fusion may cause the right ventricle, which occupies a small proportion of the image, to vanish, hierarchical fusion allows shallow features to be carried into the deep features and take part in the final prediction. This also demonstrates the need to fuse shallow and deep features, which matters most when segmenting small targets in images whose classes have an uneven pixel distribution.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Corresponding to the image segmentation method described in the foregoing embodiments, FIG. 8 shows a structural block diagram of the image segmentation apparatus provided in an embodiment of the present application. For ease of description, only the parts related to this embodiment are shown.
Referring to FIG. 8, the image segmentation apparatus in this embodiment may include an image acquisition module 201, a convolution processing module 202, and a segmentation module 203.
The image acquisition module 201 is configured to acquire a target image to be segmented;
The convolution processing module 202 is configured to perform convolution processing on the target image to be segmented through multiple layers of convolutional layers, where the layers of convolutional layers are connected to one another; the convolutional layers within the first layer are connected in series, the first-scale convolutional layer receives the target image to be segmented, and the convolutional layers successively perform convolutional downsampling on the target image to be segmented; the convolutional layers within the last layer are connected in series, successively perform convolutional upsampling on the received feature information, and output the convolution processing result through the first-scale convolutional layer;
The segmentation module 203 is configured to perform image segmentation according to the output result of the convolution processing.
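The division into modules 201 to 203 can be pictured as a simple three-stage pipeline. The sketch below only illustrates that data flow: the class name, the callables, and the per-pixel argmax used for the segmentation step are hypothetical stand-ins, and the toy "convolution" is not the multi-layer network described in this application.

```python
class ImageSegmentationApparatus:
    """Chains three callables standing in for modules 201, 202 and 203."""

    def __init__(self, acquire, convolve, segment):
        self.acquire = acquire    # stand-in for image acquisition module 201
        self.convolve = convolve  # stand-in for convolution processing module 202
        self.segment = segment    # stand-in for segmentation module 203

    def run(self, source):
        image = self.acquire(source)
        scores = self.convolve(image)
        return self.segment(scores)


def argmax_segment(scores):
    """Hypothetical segmentation step: pick the highest-scoring class per pixel."""
    return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in scores]


# Toy stand-ins: the "network" maps each pixel value v to two class scores.
apparatus = ImageSegmentationApparatus(
    acquire=lambda source: source,
    convolve=lambda image: [[[1.0 - v, v] for v in row] for row in image],
    segment=argmax_segment,
)
mask = apparatus.run([[0.1, 0.9], [0.8, 0.2]])
print(mask)
```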
Optionally, the layers of convolutional layers form a plurality of sequentially connected subtrees, each subtree includes at least two layers of convolutional layers, and the parent node of each subtree is the aggregation of all preceding subtrees.
Optionally, each subtree includes an output layer of convolutional layers, which is connected both to the other layers of convolutional layers in the current subtree and to the parent node of the current subtree, and the parent node of the current subtree and the other layers of convolutional layers in the current subtree are connected in sequence; the parent node of the current subtree is the output layer of convolutional layers of the previous subtree.
Optionally, for two layers of convolutional layers that are connected to each other, each convolutional layer in the next layer is connected to the corresponding convolutional layer in the previous layer.
In a possible implementation, the multiple layers of convolutional layers include convolutional layers of multiple scales, and there are multiple convolutional layers at each scale;
each convolutional layer in the next layer being connected to the corresponding convolutional layer in the previous layer means that:
each convolutional layer in the next layer is connected to the convolutional layer of the same scale in the previous layer, where the next layer and the previous layer are two non-adjacent layers of convolutional layers.
In a possible implementation, the multiple layers of convolutional layers include convolutional layers of multiple scales, and there are multiple convolutional layers at each scale;
each convolutional layer in the next layer being connected to the corresponding convolutional layer in the previous layer means that:
the convolutional layer of the current scale in the next layer is connected both to the convolutional layer of the current scale in the previous layer and to the convolutional layers of the scales adjacent to the current scale;
where the next layer and the previous layer are two adjacent layers of convolutional layers.
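The two connection rules above (same scale only between non-adjacent layers; same scale plus neighboring scales between adjacent layers) can be summarized in a few lines. This is a minimal sketch that assumes scales are numbered from 0 (highest resolution) and that "adjacent scale" means an index that differs by one; the function name and the indexing are illustrative assumptions.

```python
def source_scales(scale, num_scales, adjacent_layers):
    """Scales in the previous layer that feed the given scale in the next layer."""
    if not adjacent_layers:
        # Non-adjacent layers: connect only convolutional layers of the same scale.
        return [scale]
    # Adjacent layers: same scale plus the neighboring scales that exist.
    return [s for s in (scale - 1, scale, scale + 1) if 0 <= s < num_scales]


for s in range(4):
    print(s, source_scales(s, 4, adjacent_layers=True),
          source_scales(s, 4, adjacent_layers=False))
```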
For example, the convolutional upsampling or the convolutional downsampling may be performed by nearest-neighbor interpolation.
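As an illustration of the resampling step only, nearest-neighbor interpolation copies, for each output position, the value of the input position it maps back to. The sketch below works on a plain list-of-lists image; the function name and the floor-based index mapping are assumptions (libraries differ in how they align pixel centers), and the accompanying convolution is omitted.

```python
def resize_nearest(image, out_h, out_w):
    """Resize a 2-D list-of-lists image with nearest-neighbor interpolation."""
    in_h, in_w = len(image), len(image[0])
    return [
        # Map each output index back to an input index by integer scaling.
        [image[(r * in_h) // out_h][(c * in_w) // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


small = [[1, 2],
         [3, 4]]
up = resize_nearest(small, 4, 4)   # 2x upsampling
down = resize_nearest(up, 2, 2)    # 2x downsampling
print(up)
print(down)
```

The same function covers both directions: upsampling repeats values, and downsampling keeps one value per output cell.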
It should be noted that the information exchange between the above apparatuses/units, their execution processes, and related details are based on the same concept as the method embodiments of the present application. For their specific functions and technical effects, refer to the method embodiments; they are not repeated here.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the above division into functional units and modules is used as an example. In practical applications, the above functions may be assigned to different functional units or modules as required; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing them from one another and are not intended to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, refer to the corresponding process in the foregoing method embodiments; it is not repeated here.
An embodiment of the present application further provides a terminal device. Referring to FIG. 9, the terminal device 300 may include at least one processor 310, a memory 320, and a computer program that is stored in the memory 320 and executable on the at least one processor 310. When executing the computer program, the processor 310 implements the steps in any of the foregoing method embodiments, for example steps S101 to S103 in the embodiment shown in FIG. 2. Alternatively, when executing the computer program, the processor 310 implements the functions of the modules/units in the foregoing apparatus embodiments, for example the functions of modules 201 to 203 shown in FIG. 8.
For example, the computer program may be divided into one or more modules/units, which are stored in the memory 320 and executed by the processor 310 to carry out the present application. The one or more modules/units may be a series of computer program segments capable of performing specific functions, and the program segments are used to describe the execution of the computer program in the terminal device 300.
Those skilled in the art will understand that FIG. 9 is only an example of a terminal device and does not limit the terminal device, which may include more or fewer components than shown, combine certain components, or use different components, such as input/output devices, network access devices, and buses.
The processor 310 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 320 may be an internal storage unit of the terminal device, or an external storage device of the terminal device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card. The memory 320 is used to store the computer program and other programs and data required by the terminal device. The memory 320 may also be used to temporarily store data that has been output or is to be output.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. Buses may be divided into address buses, data buses, control buses, and so on. For ease of representation, the bus in the drawings of the present application is not limited to a single bus or a single type of bus.
The image segmentation method provided in the embodiments of the present application may be applied to terminal devices such as computers, tablet computers, notebook computers, netbooks, and personal digital assistants (PDAs). The embodiments of the present application place no restriction on the specific type of terminal device.
Take the case in which the terminal device is a computer as an example. FIG. 10 is a block diagram of part of the structure of a computer provided in an embodiment of the present application. Referring to FIG. 10, the computer includes components such as a communication circuit 410, a memory 420, an input unit 430, a display unit 440, an audio circuit 450, a wireless fidelity (WiFi) module 460, a processor 470, and a power supply 480. Those skilled in the art will understand that the computer structure shown in FIG. 10 does not limit the computer, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
The components of the computer are described in detail below with reference to FIG. 10:
The communication circuit 410 may be used to receive and send signals while sending and receiving information or during a call. In particular, it receives image samples sent by an image acquisition device and passes them to the processor 470 for processing; it also sends image acquisition instructions to the image acquisition device. Generally, the communication circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the communication circuit 410 may communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System of Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, and Short Messaging Service (SMS).
The memory 420 may be used to store software programs and modules. The processor 470 executes the various functional applications and data processing of the computer by running the software programs and modules stored in the memory 420. The memory 420 may mainly include a program storage area and a data storage area. The program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function); the data storage area may store data created through use of the computer (such as audio data or a phone book). In addition, the memory 420 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The input unit 430 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the computer. Specifically, the input unit 430 may include a touch panel 431 and other input devices 432. The touch panel 431, also called a touch screen, can collect touch operations performed by the user on or near it (for example, operations performed on or near the touch panel 431 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connected apparatus according to a preset program. Optionally, the touch panel 431 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the position of the user's touch, detects the signal produced by the touch operation, and passes the signal to the touch controller. The touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, and sends them to the processor 470; it can also receive and execute commands sent by the processor 470. The touch panel 431 may be implemented in several types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the touch panel 431, the input unit 430 may include other input devices 432. Specifically, the other input devices 432 may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and power keys), a trackball, a mouse, and a joystick.
The display unit 440 may be used to display information entered by the user or provided to the user, as well as the various menus of the computer. The display unit 440 may include a display panel 441, which may optionally be configured as a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch panel 431 may cover the display panel 441. When the touch panel 431 detects a touch operation on or near it, it passes the operation to the processor 470 to determine the type of touch event, and the processor 470 then provides the corresponding visual output on the display panel 441 according to the type of touch event. Although in FIG. 10 the touch panel 431 and the display panel 441 are two separate components that implement the input and output functions of the computer, in some embodiments the touch panel 431 and the display panel 441 may be integrated to implement the input and output functions of the computer.
The audio circuit 450 may provide an audio interface between the user and the computer. The audio circuit 450 may convert received audio data into an electrical signal and transmit it to a speaker, which converts it into a sound signal for output. Conversely, a microphone converts collected sound signals into electrical signals, which the audio circuit 450 receives and converts into audio data. The audio data is then output to the processor 470 for processing, after which it is sent through the communication circuit 410 to, for example, another computer, or output to the memory 420 for further processing.
WiFi is a short-range wireless transmission technology. Through the WiFi module 460, the computer can help the user send and receive email, browse web pages, and access streaming media, providing the user with wireless broadband Internet access. Although FIG. 10 shows the WiFi module 460, it is understood that the module is not an essential component of the computer and may be omitted as required without changing the essence of the invention.
The processor 470 is the control center of the computer. It connects the various parts of the computer through various interfaces and lines, and performs the computer's functions and processes data by running or executing the software programs and/or modules stored in the memory 420 and calling the data stored in the memory 420, thereby monitoring the computer as a whole. Optionally, the processor 470 may include one or more processing units. Preferably, the processor 470 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, and application programs, and the modem processor mainly handles wireless communication. It is understood that the modem processor may also not be integrated into the processor 470.
The computer further includes a power supply 480 (such as a battery) that supplies power to the components. Preferably, the power supply 480 may be logically connected to the processor 470 through a power management system, so that charging, discharging, and power consumption are managed through the power management system.
An embodiment of the present application further provides a computer-readable storage medium that stores a computer program. When the computer program is executed by a processor, the steps in the foregoing embodiments of the image segmentation method are implemented.
An embodiment of the present application provides a computer program product. When the computer program product runs on a mobile terminal, the mobile terminal implements the steps in the foregoing embodiments of the image segmentation method.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, all or part of the processes in the methods of the foregoing embodiments may be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium, and when executed by a processor it implements the steps of the foregoing method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or apparatus capable of carrying the computer program code to the photographing apparatus/terminal device, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium. Examples include a USB flash drive, a removable hard disk, a magnetic disk, and an optical disc. In some jurisdictions, according to legislation and patent practice, a computer-readable medium may not be an electrical carrier signal or a telecommunication signal.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the apparatus/network device embodiments described above are only illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments are only intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all fall within the protection scope of the present application.

Claims (10)

  1. An image segmentation method, characterized by comprising:
    acquiring a target image to be segmented;
    performing convolution processing on the target image to be segmented through multiple layers of convolutional layers, wherein the layers of convolutional layers are connected to one another; the convolutional layers within the first layer are connected in series, the first-scale convolutional layer receives the target image to be segmented, and the convolutional layers successively perform convolutional downsampling on the target image to be segmented; the convolutional layers within the last layer are connected in series, successively perform convolutional upsampling on the received feature information, and output the convolution processing result through the first-scale convolutional layer;
    performing image segmentation according to the convolution processing result.
  2. 如权利要求1所述的图像分割方法,其特征在于,所述多层卷积层的各层卷积层构成多个依次连接的子树,每个子树包括至少两层卷积层,且每个子树的父节点为之前所有子树的聚合。The image segmentation method according to claim 1, wherein each convolutional layer of the multi-layer convolutional layer constitutes a plurality of sequentially connected subtrees, and each subtree includes at least two convolutional layers, and each subtree includes at least two convolutional layers. The parent node of each subtree is the aggregation of all previous subtrees.
  3. The image segmentation method according to claim 2, characterized in that each subtree comprises an output layer of convolutional layers; the output layer of convolutional layers is connected respectively to the other layers of convolutional layers in the current subtree and to the parent node of the current subtree, and the parent node of the current subtree and the other layers of convolutional layers in the current subtree are connected in sequence; wherein the parent node of the current subtree is the output layer of convolutional layers of the previous subtree.
  4. The image segmentation method according to claim 2, characterized in that, for two layers of convolutional layers that are connected to each other, each convolutional layer of the next layer is connected to the corresponding convolutional layer in the previous layer.
  5. The image segmentation method according to claim 4, characterized in that the multiple layers of convolutional layers comprise convolutional layers of a plurality of scales, and the convolutional layers of each scale comprise a plurality of convolutional layers;
    the connection of each convolutional layer of the next layer to the corresponding convolutional layer in the previous layer is as follows:
    each convolutional layer of the next layer is connected to the convolutional layer of the same scale in the previous layer; wherein the next layer and the previous layer are two non-adjacent layers of convolutional layers.
  6. The image segmentation method according to claim 4, characterized in that the multiple layers of convolutional layers comprise convolutional layers of a plurality of scales, and the convolutional layers of each scale comprise a plurality of convolutional layers;
    the connection of each convolutional layer of the next layer to the corresponding convolutional layer in the previous layer is as follows:
    the convolutional layer of a current scale in the next layer is connected both to the convolutional layer of the current scale in the previous layer and to the convolutional layers of the scales adjacent to the current scale in the previous layer;
    wherein the next layer and the previous layer are two adjacent layers of convolutional layers.
  7. The image segmentation method according to claim 1, characterized in that the convolutional up-sampling or the convolutional down-sampling is performed by nearest-neighbor interpolation.
  8. An image segmentation apparatus, characterized by comprising:
    an image acquisition module, configured to acquire a target image to be segmented;
    a convolution processing module, configured to perform convolution processing on the target image to be segmented through multiple layers of convolutional layers; wherein the layers of the multiple layers of convolutional layers are connected to one another; the convolutional layers within the first layer are connected in series in sequence, the convolutional layer of the first scale receives the target image to be segmented, and the convolutional layers successively perform convolutional down-sampling on the target image to be segmented; the convolutional layers within the last layer of the multiple layers of convolutional layers are connected in series in sequence, successively perform convolutional up-sampling on the received feature information, and output a convolution processing result through the convolutional layer of the first scale;
    a segmentation module, configured to perform image segmentation according to the output result of the convolution processing.
  9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method according to any one of claims 1 to 7.
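
Claim 7 names nearest-neighbor interpolation as the resampling used for the convolutional up-sampling and down-sampling. The following is a minimal illustrative sketch of that resampling step only: the convolution itself is omitted, the function name `nearest_resize` is hypothetical, and the index mapping (source index = floor of destination index times input size over output size) is one common convention assumed here, not something the claims specify.

```python
def nearest_resize(feature, out_h, out_w):
    """Resize a 2-D feature map (a list of rows) by nearest-neighbor sampling.

    Assumption: each output position copies the input value at
    floor(dst * in_size / out_size); the claims do not fix this mapping.
    """
    in_h, in_w = len(feature), len(feature[0])
    resized = []
    for y in range(out_h):
        # Map the output row back to a source row, clamped to the valid range.
        src_y = min(in_h - 1, (y * in_h) // out_h)
        row = []
        for x in range(out_w):
            src_x = min(in_w - 1, (x * in_w) // out_w)
            row.append(feature[src_y][src_x])
        resized.append(row)
    return resized


# A 4x4 feature map at the first scale.
fmap = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]

down = nearest_resize(fmap, 2, 2)  # down-sample to the next (coarser) scale
up = nearest_resize(down, 4, 4)    # up-sample back to the first scale
```

In this sketch the same helper serves both directions: passing a smaller target size halves the resolution, as in the first layer of claim 1, and passing a larger one restores it, as in the last layer.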
PCT/CN2020/128846 2019-11-26 2020-11-13 Image segmentation method and apparatus, and terminal device WO2021104058A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911172775.6 2019-11-26
CN201911172775.6A CN111047602A (en) 2019-11-26 2019-11-26 Image segmentation method and device and terminal equipment

Publications (1)

Publication Number Publication Date
WO2021104058A1 (en) 2021-06-03

Family

ID=70233439

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/128846 WO2021104058A1 (en) 2019-11-26 2020-11-13 Image segmentation method and apparatus, and terminal device

Country Status (2)

Country Link
CN (1) CN111047602A (en)
WO (1) WO2021104058A1 (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361524A (en) * 2021-06-29 2021-09-07 北京百度网讯科技有限公司 Image processing method and device
CN116012688A (en) * 2023-03-27 2023-04-25 成都神鸟数据咨询有限公司 Image enhancement method for urban management evaluation system
CN116229336A (en) * 2023-05-10 2023-06-06 江西云眼视界科技股份有限公司 Video moving target identification method, system, storage medium and computer
CN117292067A (en) * 2023-11-24 2023-12-26 中影年年(北京)文化传媒有限公司 Virtual 3D model method and system based on scanning real object acquisition
CN117495884A (en) * 2024-01-02 2024-02-02 湖北工业大学 Steel surface defect segmentation method and device, electronic equipment and storage medium

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111047602A (en) * 2019-11-26 2020-04-21 中国科学院深圳先进技术研究院 Image segmentation method and device and terminal equipment
CN111723841A (en) * 2020-05-09 2020-09-29 北京捷通华声科技股份有限公司 Text detection method and device, electronic equipment and storage medium
CN111985634A (en) * 2020-08-21 2020-11-24 北京灵汐科技有限公司 Operation method and device of neural network, computer equipment and storage medium
CN113012220A (en) * 2021-02-02 2021-06-22 深圳市识农智能科技有限公司 Fruit counting method and device and electronic equipment
CN115223017B (en) * 2022-05-31 2023-12-19 昆明理工大学 Multi-scale feature fusion bridge detection method based on depth separable convolution
CN117635942B (en) * 2023-12-05 2024-05-07 齐鲁工业大学(山东省科学院) Cardiac MRI image segmentation method based on edge feature enhancement

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109598728A (en) * 2018-11-30 2019-04-09 腾讯科技(深圳)有限公司 Image partition method, device, diagnostic system and storage medium
CN109741331A (en) * 2018-12-24 2019-05-10 北京航空航天大学 A kind of display foreground method for segmenting objects
CN110223304A (en) * 2019-05-20 2019-09-10 山东大学 A kind of image partition method, device and computer readable storage medium based on multipath polymerization
CN111047602A (en) * 2019-11-26 2020-04-21 中国科学院深圳先进技术研究院 Image segmentation method and device and terminal equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109785336B (en) * 2018-12-18 2020-11-27 深圳先进技术研究院 Image segmentation method and device based on multipath convolutional neural network model
CN109801293B (en) * 2019-01-08 2023-07-14 平安科技(深圳)有限公司 Remote sensing image segmentation method and device, storage medium and server
CN110378913B (en) * 2019-07-18 2023-04-11 深圳先进技术研究院 Image segmentation method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CAIZI LI; QIANQIAN TONG; XIANGYUN LIAO; WEIXIN SI; SHU CHEN; QIONG WANG; ZHIYONG YUAN: "APCP-NET: Aggregated Parallel Cross-Scale Pyramid Network for CMR Segmentation", 2019 IEEE 16TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2019), 11 April 2019 (2019-04-11), pages 784 - 788, XP033576332, ISSN: 1945-8452, DOI: 10.1109/ISBI.2019.8759147 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113361524A (en) * 2021-06-29 2021-09-07 北京百度网讯科技有限公司 Image processing method and device
CN113361524B (en) * 2021-06-29 2024-05-03 北京百度网讯科技有限公司 Image processing method and device
CN116012688A (en) * 2023-03-27 2023-04-25 成都神鸟数据咨询有限公司 Image enhancement method for urban management evaluation system
CN116012688B (en) * 2023-03-27 2023-06-09 成都神鸟数据咨询有限公司 Image enhancement method for urban management evaluation system
CN116229336A (en) * 2023-05-10 2023-06-06 江西云眼视界科技股份有限公司 Video moving target identification method, system, storage medium and computer
CN116229336B (en) * 2023-05-10 2023-08-18 江西云眼视界科技股份有限公司 Video moving target identification method, system, storage medium and computer
CN117292067A (en) * 2023-11-24 2023-12-26 中影年年(北京)文化传媒有限公司 Virtual 3D model method and system based on scanning real object acquisition
CN117292067B (en) * 2023-11-24 2024-03-05 中影年年(北京)科技有限公司 Virtual 3D model method and system based on scanning real object acquisition
CN117495884A (en) * 2024-01-02 2024-02-02 湖北工业大学 Steel surface defect segmentation method and device, electronic equipment and storage medium
CN117495884B (en) * 2024-01-02 2024-03-22 湖北工业大学 Steel surface defect segmentation method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111047602A (en) 2020-04-21

Similar Documents

Publication Publication Date Title
WO2021104058A1 (en) Image segmentation method and apparatus, and terminal device
WO2021104060A1 (en) Image segmentation method and apparatus, and terminal device
TWI779238B (en) Image processing method and apparatus, electronic device, and computer-readable recording medium
WO2021036695A1 (en) Method and apparatus for determining image to be marked, and method and apparatus for training model
CN109753978B (en) Image classification method, device and computer readable storage medium
WO2020125623A1 (en) Method and device for live body detection, storage medium, and electronic device
WO2021196873A1 (en) License plate character recognition method and apparatus, electronic device, and storage medium
WO2021104089A1 (en) Image registration method and device as well as terminal device
Kou et al. Microaneurysms segmentation with a U-Net based on recurrent residual convolutional neural network
WO2021259393A2 (en) Image processing method and apparatus, and electronic device
Ren et al. Towards efficient medical lesion image super-resolution based on deep residual networks
WO2020125498A1 (en) Cardiac magnetic resonance image segmentation method and apparatus, terminal device and storage medium
US11354797B2 (en) Method, device, and system for testing an image
WO2021136368A1 (en) Method and apparatus for automatically detecting pectoralis major region in molybdenum target image
WO2022127111A1 (en) Cross-modal face recognition method, apparatus and device, and storage medium
WO2023202285A1 (en) Image processing method and apparatus, computer device, and storage medium
CN111680755A (en) Medical image recognition model construction method, medical image recognition device, medical image recognition medium and medical image recognition terminal
CN108876716A (en) Super resolution ratio reconstruction method and device
CN110399827A (en) A kind of Handwritten Numeral Recognition Method based on convolutional neural networks
CN111489318B (en) Medical image enhancement method and computer-readable storage medium
CN114419375B (en) Image classification method, training device, electronic equipment and storage medium
WO2021000495A1 (en) Image processing method and device
WO2022227193A1 (en) Liver region segmentation method and apparatus, and electronic device and storage medium
CN114064870B (en) Multi-mode-oriented conversation method and device, electronic equipment and storage medium
CN113032622A (en) Novel medical video image acquisition and data management system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20892186; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20892186; Country of ref document: EP; Kind code of ref document: A1)
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24.01.2023))
122 Ep: pct application non-entry in european phase (Ref document number: 20892186; Country of ref document: EP; Kind code of ref document: A1)