KR102204956B1

KR102204956B1 - Method for semantic segmentation and apparatus thereof

Info

Publication number: KR102204956B1
Application number: KR1020200011129A
Authority: KR
Inventors: 유인완
Original assignee: 주식회사 루닛
Priority date: 2020-01-30
Filing date: 2020-01-30
Publication date: 2021-01-19
Also published as: KR20200112646A

Abstract

A semantic segmentation method and apparatus capable of improving the accuracy of a segmentation result are provided. A semantic segmentation method according to some embodiments of the present disclosure may input a labeled image into a segmentation neural network to obtain segmentation information for the image, and may back-propagate a segmentation error for the segmentation information. Here, the segmentation neural network may be updated by additionally back-propagating a boundary error for the segmentation information. As a result, the performance of the segmentation neural network may be improved, and the accuracy of the segmentation result may be improved.

Description

Semantic segmentation method and apparatus therefor

The present disclosure relates to a semantic segmentation method and apparatus. More specifically, it relates to a method capable of improving the accuracy of a segmentation result when semantic segmentation is performed using a neural network, and to an apparatus supporting the method.

In the field of machine learning, semantic segmentation refers to the task of classifying every pixel of an image into one of a set of predefined semantic object classes. Because semantic segmentation makes pixel-wise predictions, it is also referred to as dense prediction. Semantic segmentation is typically performed using a deep neural network, and training a deep neural network requires a large number of images together with elaborate labels (e.g., pixel-wise class information) that contain the class, position, and shape information of the semantic objects.
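As an illustration of pixel-wise (dense) prediction, the following minimal sketch assigns each pixel the class with the highest score. The function name `argmax_per_pixel` and the `scores[y][x][c]` layout are assumptions made for this example and do not come from the disclosure.

```python
from typing import List

def argmax_per_pixel(scores: List[List[List[float]]]) -> List[List[int]]:
    """Return, for every pixel, the index of the class with the highest score.

    scores[y][x][c] is the (hypothetical) score of class c at pixel (y, x).
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]

# A 2x2 image with two classes (0 = background, 1 = object).
scores = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]
class_map = argmax_per_pixel(scores)
print(class_map)  # [[0, 1], [0, 1]]
```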

However, because of the human cost of labeling, it is very difficult to obtain an image set with elaborate labels. For example, in the medical domain, labeling is performed by medical specialists, so even simple labeling incurs significant cost. It is therefore practically impossible to obtain a large quantity of medical images that are labeled precisely down to the boundary (i.e., the shape) of a semantic object (e.g., a lesion).

To solve this problem, performing semantic segmentation using non-elaborate labels may be considered. However, when a deep neural network is trained using such labels, the accuracy of the segmentation result cannot be guaranteed. For example, when a deep neural network is trained using labels that do not include boundary information of the semantic objects, the shapes of the semantic objects in the segmentation result fail to follow their true shapes.

Korean Patent Application Publication No. 10-2018-0097944 (published September 3, 2018)

A technical problem to be solved through some embodiments of the present disclosure is to provide a semantic segmentation method capable of improving the accuracy of a segmentation result, and an apparatus supporting the method.

Another technical problem to be solved through some embodiments of the present disclosure is to provide a method of training a segmentation neural network so that it can accurately predict even the shape of a semantic object using labels that do not include boundary information of the semantic objects, and an apparatus supporting the method.

Yet another technical problem to be solved through some embodiments of the present disclosure is to provide a method capable of improving the performance of a segmentation neural network by accurately calculating a boundary error for a semantic object, and an apparatus supporting the method.

The technical problems of the present disclosure are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art of the present disclosure from the description below.

To solve the above technical problems, a semantic segmentation method according to some embodiments of the present disclosure is a semantic segmentation method performed by a computing device, and may include: inputting a labeled image into a segmentation neural network to obtain segmentation information for the image; calculating a segmentation error and a boundary error for the segmentation information; and updating the segmentation neural network by back-propagating the calculated segmentation error and the calculated boundary error.

In some embodiments, the label may not include boundary information indicating the shape of a semantic object.

In some embodiments, the segmentation error may be calculated based on a cross-entropy loss function, and the boundary error may be calculated based on an L1 loss function or an L2 loss function.
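The L1 and L2 losses mentioned above can be sketched as follows for two boundary maps of equal size. This is a minimal illustration, not the disclosed implementation: the function names, the use of nested lists for maps, and the choice of mean reduction are all assumptions.

```python
from typing import List

Map2D = List[List[float]]

def l1_loss(a: Map2D, b: Map2D) -> float:
    """Mean absolute difference between two equally sized maps."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def l2_loss(a: Map2D, b: Map2D) -> float:
    """Mean squared difference between two equally sized maps."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

# Two hypothetical 2x2 boundary maps with values in [0, 1].
map_a = [[0.0, 1.0], [1.0, 0.0]]
map_b = [[0.0, 0.5], [1.0, 1.0]]
print(l1_loss(map_a, map_b))  # 0.375
print(l2_loss(map_a, map_b))  # 0.3125
```

The L2 variant penalizes large per-pixel disagreements more heavily than the L1 variant, which is one common reason to choose between them.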

In some embodiments, calculating the boundary error may include: obtaining first boundary information for a semantic object from the image through an edge detection neural network; extracting second boundary information from the segmentation information using image processing logic; and calculating the boundary error based on the first boundary information and the second boundary information. Here, the edge detection neural network may also be updated through back-propagation of the calculated boundary error.
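One simple way to picture "image processing logic" that extracts boundary information from segmentation information is to mark every pixel whose class differs from that of a 4-connected neighbor. The sketch below does exactly that. The function name `extract_boundary` and the neighbor rule are assumptions chosen for illustration, since this passage does not specify the operator. A hard comparison like this is also not differentiable, so a real training setup would presumably use a differentiable operator instead.

```python
from typing import List

def extract_boundary(class_map: List[List[int]]) -> List[List[float]]:
    """Mark pixels that have at least one 4-connected neighbor of another class."""
    height, width = len(class_map), len(class_map[0])
    edge = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and class_map[ny][nx] != class_map[y][x]:
                    edge[y][x] = 1.0
                    break
    return edge

# A 4x4 segmentation map with a 2x2 object (class 1) in the center.
class_map = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
for row in extract_boundary(class_map):
    print(row)
```

Only the four corner pixels are left unmarked in this example, because every other pixel touches a pixel of the other class.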

In some embodiments, the edge detection neural network may include fewer layers or weight parameters than the segmentation neural network.

In some embodiments, calculating the boundary error based on the first boundary information and the second boundary information may include: inputting the image into an attention neural network to obtain attention information for the image; applying the attention information to the first boundary information to obtain third boundary information; and calculating the boundary error based on a difference between the second boundary information and the third boundary information. Here, the attention neural network may also be updated through back-propagation of the calculated boundary error.
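The attention step can be sketched as follows, assuming that "applying" attention means element-wise multiplication of an attention map with the first boundary map, and that the difference is measured with a mean L1 error. Both choices, and the names `apply_attention` and `boundary_error`, are assumptions for this example. The passage itself only states that the attention information is applied and that a difference is computed.

```python
from typing import List

Map2D = List[List[float]]

def apply_attention(first_boundary: Map2D, attention: Map2D) -> Map2D:
    """Weight each boundary value by the attention value at the same pixel."""
    return [
        [e * a for e, a in zip(edge_row, att_row)]
        for edge_row, att_row in zip(first_boundary, attention)
    ]

def boundary_error(second_boundary: Map2D, third_boundary: Map2D) -> float:
    """Mean absolute difference between the second and third boundary maps."""
    diffs = [
        abs(s - t)
        for s_row, t_row in zip(second_boundary, third_boundary)
        for s, t in zip(s_row, t_row)
    ]
    return sum(diffs) / len(diffs)

first = [[1.0, 1.0], [0.0, 1.0]]      # hypothetical edge detection output
attention = [[1.0, 0.0], [1.0, 0.5]]  # hypothetical attention map
second = [[1.0, 0.0], [0.0, 1.0]]     # hypothetical edges from the segmentation

third = apply_attention(first, attention)
print(third)                          # [[1.0, 0.0], [0.0, 0.5]]
print(boundary_error(second, third))  # 0.125
```

In this example the attention map suppresses the edge at the top-right pixel, which the segmentation-derived boundary map also lacks, so that pixel contributes no error.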

A semantic segmentation apparatus according to some embodiments of the present disclosure for solving the above technical problems may include a memory that stores one or more instructions, and a processor that, by executing the stored one or more instructions, inputs a labeled image into a segmentation neural network to obtain segmentation information for the image, calculates a segmentation error and a boundary error for the segmentation information, and updates the segmentation neural network by back-propagating the calculated segmentation error and the calculated boundary error.

In some embodiments, the processor may obtain first boundary information for a semantic object from the image through an edge detection neural network, extract second boundary information from the segmentation information using image processing logic, and calculate the boundary error based on the first boundary information and the second boundary information. Here, the weights of the edge detection neural network may also be updated through back-propagation of the boundary error.

In some embodiments, the processor may input the image into an attention neural network to obtain attention information for the image, apply the attention information to the first boundary information to generate third boundary information, and calculate the boundary error by comparing the second boundary information with the third boundary information. Here, the weights of the attention neural network may also be updated through back-propagation of the calculated boundary error.

A computer program according to some embodiments of the present disclosure for solving the above technical problems may be stored in a computer-readable recording medium in order to execute, in combination with a computing device: inputting a labeled image into a segmentation neural network to obtain segmentation information for the image; calculating a segmentation error and a boundary error for the segmentation information; and updating the segmentation neural network by back-propagating the calculated segmentation error and the calculated boundary error.

FIG. 1 is a diagram for explaining a semantic segmentation apparatus and a learning environment according to some embodiments of the present disclosure.
FIGS. 2 and 3 illustrate examples of coarse labels that may be referenced in various embodiments of the present disclosure.
FIG. 4 is an exemplary flowchart schematically illustrating the learning process of a semantic segmentation method according to some embodiments of the present disclosure.
FIG. 5 is a diagram for explaining a segmentation neural network learning process according to a first embodiment of the present disclosure.
FIG. 6 is a diagram for explaining a segmentation neural network learning process according to a second embodiment of the present disclosure.
FIG. 7 is a diagram for explaining a segmentation neural network learning process according to a third embodiment of the present disclosure.
FIG. 8 is a diagram for explaining a segmentation neural network learning process according to a fourth embodiment of the present disclosure.
FIG. 9 is an exemplary flowchart schematically illustrating the inference process of a semantic segmentation method according to some embodiments of the present disclosure.
FIG. 10 illustrates an exemplary computing device capable of implementing an apparatus according to various embodiments of the present disclosure.

Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. The advantages and features of the present disclosure, and the methods of achieving them, will become apparent with reference to the embodiments described below in detail together with the accompanying drawings. However, the technical idea of the present disclosure is not limited to the following embodiments and may be implemented in various different forms. The following embodiments are provided only to make the technical idea of the present disclosure complete and to fully inform those of ordinary skill in the art to which the present disclosure belongs of the scope of the present disclosure, and the technical idea of the present disclosure is defined only by the scope of the claims.

In assigning reference numerals to the components of each drawing, it should be noted that the same components are given the same numerals wherever possible, even when they appear in different drawings. In addition, in describing the present disclosure, a detailed description of a related known configuration or function is omitted when it is determined that it may obscure the subject matter of the present disclosure.

Unless otherwise defined, all terms (including technical and scientific terms) used in this specification may be used with the meanings commonly understood by those of ordinary skill in the art to which the present disclosure belongs. Terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are clearly and specifically defined. The terms used in this specification are for describing the embodiments and are not intended to limit the present disclosure. In this specification, the singular form also includes the plural form unless the context specifically states otherwise.

In addition, terms such as first, second, A, B, (a), and (b) may be used in describing the components of the present disclosure. These terms are used only to distinguish one component from another, and the nature, sequence, or order of the components is not limited by the terms. When a component is described as being "connected", "coupled", or "linked" to another component, the component may be directly connected or linked to the other component, but it should be understood that yet another component may be "connected", "coupled", or "linked" between the two components.

As used in this specification, "comprises" and/or "comprising" does not exclude the presence or addition of one or more other components, steps, operations, and/or elements in addition to the stated components, steps, operations, and/or elements.

Before describing this specification in detail, some terms used in it are clarified.

In this specification, a semantic object may mean an object that is the target of semantic segmentation. The classes of semantic objects may be defined in advance, and the background may also be defined as a semantic object.

In this specification, a fine label may mean a label having relatively higher accuracy or precision than a coarse label. For example, a fine label may be a label (e.g., pixel-wise class information) that includes class, position, and shape information for a semantic object.

In this specification, a coarse label may mean a label having relatively lower accuracy or precision than a fine label. For example, a fine label may include boundary information indicating the shape of a semantic object, whereas a coarse label may include no boundary information or only rough information about the shape of a semantic object (e.g., scribble information). For examples of coarse labels, refer to FIGS. 2 and 3.

In this specification, a segmentation neural network may mean a neural network used for semantic segmentation. Because a segmentation neural network may be implemented as a neural network of various forms or structures, the technical scope of the present disclosure is not limited by how the segmentation neural network is implemented.

In this specification, an edge detection neural network may mean a neural network used to detect boundaries in an image. Because an edge detection neural network may be implemented as a neural network of various forms or structures, the technical scope of the present disclosure is not limited by how the edge detection neural network is implemented. In addition, in the art, the term boundary may be used interchangeably with various terms such as outline, edge, and contour.

In this specification, an instruction refers to a series of computer-readable commands grouped by function, which is a component of a computer program and is executed by a processor.

Hereinafter, some embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.

FIG. 1 is an exemplary diagram illustrating a semantic segmentation apparatus 10 and a learning environment according to some embodiments of the present disclosure.

As shown in FIG. 1, the semantic segmentation apparatus 10 is a computing device capable of performing semantic segmentation on an image. More specifically, the semantic segmentation apparatus 10 may train a segmentation neural network using a labeled image set 11. The semantic segmentation apparatus 10 may also perform semantic segmentation on an unlabeled image 13 using the trained segmentation neural network. As a result, the semantic segmentation apparatus 10 may output segmentation information (e.g., 15, 17) for the image. The segmentation information may be, for example, a segmentation map 15 containing pixel-wise class information or a segmented image 17, but is not limited thereto. In the following description, for convenience, the semantic segmentation apparatus 10 is abbreviated as the segmentation apparatus 10.

The computing device may be a notebook, a desktop, a laptop, a server, or the like, but is not limited thereto and may include any type of device equipped with a computing function. For an example of the computing device, refer to FIG. 10.

Although FIG. 1 illustrates the segmentation apparatus 10 implemented as a single computing device, a first function of the segmentation apparatus 10 may be implemented on a first computing device and a second function of the segmentation apparatus 10 may be implemented on a second computing device. That is, the segmentation apparatus 10 may be composed of a plurality of computing devices. In addition, a plurality of computing devices may share the implementation of the first function or the second function.

The image set 11 is a training dataset composed of a plurality of images. Each image may be given a label for a semantic object, and the segmentation neural network may be trained using the given labels.

In various embodiments of the present disclosure, the labels of the image set 11 may include coarse labels.

In some embodiments, the coarse label may include only the class and position information of a semantic object, without boundary information for the semantic object. For example, the coarse label may include only the class of the semantic object and marking information indicating the position of the semantic object. An example is shown in FIG. 2. As illustrated in FIG. 2, the coarse label may include information marking the position of a cell 21 on an image 20 in the form of a point 23, and may not include boundary information for the cell 21.

In some other embodiments, the coarse label may include only the class, position information, and rough shape information of a semantic object. For example, the coarse label may include, instead of boundary information, scribble information indicating the rough shape of the semantic object. An example is shown in FIG. 3. As illustrated in FIG. 3, the coarse label may include scribble information 31, 33 for the semantic objects (e.g., cow, grass) included in an image 30, instead of boundary information of the semantic objects.

In some other embodiments, the coarse label may include bounding box information indicating a semantic object. Because various other types of coarse labels may exist, the technical scope of the present disclosure is not limited to a specific type of coarse label.
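The three kinds of coarse label described above (point marks, scribbles, and bounding boxes) could be represented with a small data structure such as the one below. This is only a sketch: the class name `CoarseLabel`, its field names, the coordinate conventions, and the sample coordinates are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]          # (x, y) pixel coordinates, assumed convention
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), assumed convention

@dataclass
class CoarseLabel:
    """A label carrying class and position, but no precise boundary."""
    class_name: str
    points: List[Point] = field(default_factory=list)            # point marks
    scribbles: List[List[Point]] = field(default_factory=list)   # rough strokes
    bounding_box: Optional[Box] = None

    def kind(self) -> str:
        """Report which type of coarse annotation this label carries."""
        if self.bounding_box is not None:
            return "bounding_box"
        if self.scribbles:
            return "scribble"
        if self.points:
            return "point"
        return "class_only"

cell = CoarseLabel("cell", points=[(12, 30)])
cow = CoarseLabel("cow", scribbles=[[(5, 5), (9, 7), (14, 8)]])
print(cell.kind(), cow.kind())  # point scribble
```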

According to various embodiments of the present disclosure, the segmentation apparatus 10 may train the segmentation neural network by additionally learning a boundary error for a semantic object in addition to the segmentation error. In this way, even when only coarse labels are given, the segmentation neural network can be trained to accurately predict even the shape of a semantic object. According to this embodiment, an elaborate segmentation result can be provided even when only coarse labels are given. Moreover, because the semantic segmentation task can be performed without using fine labels, labeling cost can be greatly reduced. This embodiment is described in detail below with reference to FIGS. 4 to 9.

So far, the operation and learning environment of the segmentation apparatus 10 according to some embodiments of the present disclosure have been described with reference to FIGS. 1 to 3. Hereinafter, a semantic segmentation method according to some embodiments of the present disclosure is described with reference to FIGS. 4 to 9.

Each step of the methods described below may be performed by a computing device. In other words, each step of the methods may be implemented as one or more instructions executed by a processor of a computing device. All steps included in the methods may be executed by a single physical computing device, or first steps of a method may be performed by a first computing device and second steps of the method may be performed by a second computing device. In the following, the description assumes that each step of the methods is performed by the segmentation apparatus 10 illustrated in FIG. 1. Accordingly, where the subject of an operation is omitted in the description of this embodiment, the operation may be understood to be performed by the illustrated apparatus 10. In addition, the order of the operations of the method according to this embodiment may of course be changed as needed, within the range in which the order can logically be changed.

FIG. 4 is an exemplary flowchart illustrating the learning process of a semantic segmentation method according to some embodiments of the present disclosure. However, this is only a preferred embodiment for achieving the object of the present disclosure, and some steps may of course be added or removed as needed.

As shown in FIG. 4, the semantic segmentation method begins at step S100, in which a labeled image set is obtained. The image set is composed of a plurality of images, and the following steps S200 to S500 may be performed per image or per mini-batch. In the following, the description assumes that learning is performed per image, but the technical scope of the present disclosure is not limited thereto.
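The choice between per-image and per-mini-batch processing can be pictured as a simple batching helper, where a batch size of 1 corresponds to per-image learning. The helper name `iter_batches` and the file names are hypothetical, and the loop body only prints where steps S200 to S500 would run.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

def iter_batches(items: Sequence[T], batch_size: int = 1) -> Iterator[List[T]]:
    """Yield consecutive groups of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

# Hypothetical image identifiers standing in for the labeled image set.
image_set = ["img_a", "img_b", "img_c", "img_d", "img_e"]

for batch in iter_batches(image_set, batch_size=2):
    # Steps S200 to S500 would be applied to this batch.
    print(batch)
# ['img_a', 'img_b']
# ['img_c', 'img_d']
# ['img_e']
```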

In some embodiments, the label may include a coarse label. That is, the image set may include one or more images given a coarse label. In addition, the coarse label may not include boundary information for a semantic object, and may instead include position information or rough shape information for the semantic object. For examples of the coarse label, refer to FIGS. 2 and 3.

In step S200, the image is input to the segmentation neural network. The segmentation neural network is the main neural network that performs the target task, namely semantic segmentation. As a result of the input, the segmentation neural network outputs segmentation information for the image. The segmentation information may be, for example, a segmentation map containing per-pixel class information on the semantic object, or an image produced by rendering that segmentation map. However, it is not limited thereto.

In step S300, a segmentation error is calculated for the obtained segmentation information. The segmentation error may be calculated based on the difference between the segmentation information and the label information given to the image (see FIGS. 5 to 8). The specific way of calculating the segmentation error may vary by embodiment.

In some embodiments, the segmentation error may be calculated based on a cross-entropy loss function. For example, the difference between the predicted class of the semantic object in the segmentation information and the ground-truth class in the label may be calculated by the cross-entropy loss function. The difference may be calculated per pixel.
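As an illustration only, the per-pixel cross-entropy described above could be sketched as follows. The nested-list layout (height x width x classes), the softmax over raw scores, and the averaging over pixels are assumptions made for this sketch; the disclosure does not specify them.

```python
import math

def softmax(scores):
    """Convert the raw class scores of one pixel into probabilities."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pixelwise_cross_entropy(score_map, label_map):
    """Mean cross-entropy over all pixels.

    score_map: H x W x C nested lists of raw class scores (assumed layout).
    label_map: H x W nested lists of integer ground-truth class indices.
    """
    total, count = 0.0, 0
    for score_row, label_row in zip(score_map, label_map):
        for scores, label in zip(score_row, label_row):
            total += -math.log(softmax(scores)[label])
            count += 1
    return total / count

# Toy 1 x 2 image with two classes.
scores = [[[2.0, 0.0], [0.0, 2.0]]]
good_loss = pixelwise_cross_entropy(scores, [[0, 1]])  # labels match the scores
bad_loss = pixelwise_cross_entropy(scores, [[1, 0]])   # labels contradict the scores
```

In this toy case the loss is small when the labels agree with the highest-scoring class and larger when they do not, which is the signal that backpropagation would use.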

In step S400, a boundary line error is calculated for the segmentation information. The boundary line error emphasizes errors in the shape of the semantic object and can be understood as a value used to help the segmentation neural network predict that shape more accurately. The specific way of calculating the boundary line error may vary by embodiment and is described in detail later with reference to FIGS. 5 to 8.

In step S500, the weights of the segmentation neural network are updated by backpropagating the segmentation error and the boundary line error.

In some embodiments, a weight is assigned to each of the segmentation error and the boundary line error, and the weights of the segmentation neural network may be updated in consideration of these error weights. In some examples, the weight assigned to the boundary line error may be determined by the precision of the shape information contained in the label. For instance, when the label information contains precise boundary line information on the semantic object, the weight assigned to the boundary line error may be set higher than when it contains scribble information. In another example, the weight assigned to the boundary line error may be determined by the proportion of coarse labels among the labels given to the image set. That is, the more coarse labels the image set contains, the higher the weight applied to the boundary line error. This is because, when coarse labels are numerous, the segmentation neural network can then be trained with more emphasis on the shape of the semantic object.
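A minimal sketch of the weighting idea follows. The weighted sum, the linear rule tying the boundary weight to the coarse-label share, and the names `boundary_weight_from_coarse_ratio`, `base`, and `max_extra` are all hypothetical; the disclosure states only that the weight may be raised as the share of coarse labels grows.

```python
def boundary_weight_from_coarse_ratio(num_coarse, num_total, base=0.5, max_extra=0.5):
    """Hypothetical rule: the boundary weight grows linearly with the coarse-label share."""
    ratio = num_coarse / num_total
    return base + max_extra * ratio

def total_error(seg_error, boundary_error, seg_weight, boundary_weight):
    """Weighted sum of the two errors (an assumed way of combining them for S500)."""
    return seg_weight * seg_error + boundary_weight * boundary_error

# 80 of 100 labels are coarse, so the boundary error is weighted more heavily.
w_boundary = boundary_weight_from_coarse_ratio(num_coarse=80, num_total=100)
combined = total_error(seg_error=0.4, boundary_error=0.2,
                       seg_weight=1.0, boundary_weight=w_boundary)
```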

In step S600, it is determined whether training should end. This may be determined based on a preset end condition. The end condition may be defined and set according to various criteria, such as the number of epochs, whether untrained data remains, and the performance of the neural network. Therefore, the technical scope of the present disclosure is not limited to a specific end condition.

In response to a determination that the end condition is not satisfied, steps S200 to S600 described above may be performed again. Otherwise, training of the segmentation neural network may end. Once the segmentation neural network has been trained, semantic segmentation (that is, the inference process) may be performed on an unlabeled image. This is described later with reference to FIG. 9.
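The loop formed by steps S200 to S600 can be sketched as below. The functions `train_step` and `evaluate` are hypothetical stand-ins for the forward pass, error calculation, and weight update (S200 to S500) and for a performance measurement; the epoch budget and target score are just two of the possible end conditions mentioned above.

```python
def should_stop(epoch, max_epochs, score=None, target_score=None):
    """S600: stop on an epoch budget or once a target performance is reached."""
    if epoch >= max_epochs:
        return True
    return score is not None and target_score is not None and score >= target_score

def train(image_set, train_step, evaluate, max_epochs, target_score=None):
    """Repeat S200-S500 over the image set until the end condition holds."""
    epoch = 0
    while True:
        for image, label in image_set:
            train_step(image, label)  # S200-S500 for one image
        epoch += 1
        if should_stop(epoch, max_epochs, evaluate(), target_score):
            return epoch

# Toy run: the stand-in evaluation score improves every epoch.
steps = []
scores = iter([0.5, 0.7, 0.95])
epochs_run = train(
    image_set=[("img0", "lbl0"), ("img1", "lbl1")],
    train_step=lambda image, label: steps.append(image),
    evaluate=lambda: next(scores),
    max_epochs=10,
    target_score=0.9,
)
```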

The training process for the segmentation neural network has been outlined above with reference to FIG. 4. As described, the segmentation neural network is trained using the boundary line error in addition to the segmentation error. The segmentation neural network can therefore be trained to predict the shape of the semantic object more accurately. That is, even when coarse labels containing inaccurate shape information on the semantic object are used, the segmentation neural network can provide segmentation information containing accurate shape information. As a result, the cost of labeling and the effort required to secure a training dataset can be greatly reduced.

Hereinafter, various embodiments of the present disclosure relating to the boundary line error and the training process of the segmentation neural network are described with reference to FIGS. 5 to 8.

FIG. 5 is a diagram illustrating a segmentation neural network training process according to a first embodiment of the present disclosure.

As shown in FIG. 5, in the first embodiment, a boundary line detection neural network 45 is used to calculate the boundary line error 47. The boundary line detection neural network 45 predicts boundary lines associated with the semantic object in the input image 40 and can be understood as an auxiliary neural network for training the segmentation neural network 41. The specific training process is described below.

As already described, the segmentation error 43 may be calculated based on the difference between the segmentation information 42 predicted by the segmentation neural network 41 and the label information 44. The weights of the segmentation neural network 41 may then be updated through backpropagation of the segmentation error 43.

The boundary line error 47 may be calculated based on the difference between first boundary line information 46 predicted by the boundary line detection neural network 45 and second boundary line information 48 extracted from the segmentation information 42. Here, the second boundary line information 48 may be extracted by applying image processing logic to the segmentation information 42. The image processing logic may include various kinds of edge detection logic, such as the Sobel operator. Therefore, the technical scope of the present disclosure is not limited to a specific kind of logic.
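As one possible reading of the image processing logic mentioned above, the following sketch applies the standard 3x3 Sobel kernels to a binary segmentation mask to obtain a boundary map. Treating the segmentation information as a single-class binary mask and leaving border pixels at zero are simplifications assumed here.

```python
def sobel_magnitude(img):
    """Gradient magnitude with 3x3 Sobel kernels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal gradient kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical gradient kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    v = img[y + dy - 1][x + dx - 1]
                    gx += kx[dy][dx] * v
                    gy += ky[dy][dx] * v
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# 5 x 5 mask: background (0) on the left, object (1) in the two right columns.
mask = [[0, 0, 0, 1, 1] for _ in range(5)]
edges = sobel_magnitude(mask)
```

The response is nonzero only in the columns adjacent to the transition from background to object, which is the kind of boundary map the second boundary line information 48 represents.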

In some embodiments, the boundary line error 47 may be calculated based on an L1 loss function or an L2 loss function. Here, the L1 loss function refers to least absolute errors, and the L2 loss function refers to least square errors. Various other loss functions may also be applied, so the technical scope of the present disclosure is not limited thereto.
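For concreteness, the two losses can be sketched on a pair of small boundary maps. Averaging over pixels (rather than summing) is an assumption of this sketch, and the map values are made up for illustration.

```python
def l1_loss(a, b):
    """Mean absolute difference between two equally sized 2-D maps."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def l2_loss(a, b):
    """Mean squared difference between two equally sized 2-D maps."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

# Stand-ins for the first (network-predicted) and second (extracted) boundary maps.
first_boundary = [[0.0, 1.0], [1.0, 0.0]]
second_boundary = [[0.0, 0.5], [1.0, 1.0]]
err_l1 = l1_loss(first_boundary, second_boundary)
err_l2 = l2_loss(first_boundary, second_boundary)
```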

As the boundary line error 47 is backpropagated, the weights of the segmentation neural network 41 and the boundary line detection neural network 45 may be updated. Because the boundary line error 47 is calculated based on the segmentation information 42, the boundary line detection neural network 45 can be trained to detect only the boundary lines associated with the semantic object in the input image (e.g., 40). In addition, the segmentation neural network 41 is trained to minimize the boundary line error 47, so that it can predict the shape information of the semantic object more precisely.

In some embodiments, the boundary line detection neural network 45 may be implemented as a shallower neural network than the segmentation neural network 41 (that is, one with fewer layers), or with fewer weight parameters. By the nature of the semantic segmentation task, the segmentation neural network 41 must extract high-level features associated with the semantic object and understand them deeply. It may therefore be desirable to implement the segmentation neural network 41 as a deep neural network. By contrast, the boundary line detection neural network 45 only extracts local features associated with the semantic object from the image and does not need a deep understanding of the semantic object. In terms of training cost and network performance, it may therefore be desirable to implement the boundary line detection neural network 45 as a relatively shallow neural network.

Meanwhile, in some embodiments, the relative influence of the segmentation error 43 and the boundary line error 47 on the training of the segmentation neural network 41 may change as training progresses. For example, as training progresses, the weight assigned to the boundary line error 47 may increase and the weight assigned to the segmentation error 43 may decrease. This is because the accuracy of the boundary line error 47 depends on how mature the training of the boundary line detection neural network 45 is, and that accuracy will increase as the boundary line detection neural network 45 matures.
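One simple way to realize such a schedule is a linear ramp over training steps, sketched below. The linear form, the start and end values, and the name `loss_weights` are hypothetical choices; the disclosure states only that the boundary weight may increase while the segmentation weight decreases.

```python
def loss_weights(step, total_steps, seg_range=(1.0, 0.5), boundary_range=(0.1, 1.0)):
    """Return (segmentation weight, boundary weight) for the given training step.

    Each weight moves linearly from the first to the second value of its range.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)  # clamp to [0, 1]
    seg_w = seg_range[0] + (seg_range[1] - seg_range[0]) * progress
    boundary_w = boundary_range[0] + (boundary_range[1] - boundary_range[0]) * progress
    return seg_w, boundary_w

start = loss_weights(0, 1000)       # early training: trust the label-based error more
middle = loss_weights(500, 1000)
end = loss_weights(1000, 1000)      # late training: the boundary error carries more weight
```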

FIG. 6 is a diagram illustrating a segmentation neural network training process according to a second embodiment of the present disclosure. In the following, content that overlaps with the preceding embodiment is omitted, and the description focuses on the differences.

As shown in FIG. 6, in the second embodiment, third boundary line information 59 extracted from the input image 50 may be used to calculate the boundary line error 57, in addition to the first boundary line information 56 predicted by the boundary line detection neural network 55 and the second boundary line information 58 extracted from the segmentation information 52. The extraction may be performed through image processing logic (e.g., edge detection logic such as the Sobel operator). The specific way of calculating the boundary line error 57 may vary by embodiment.

In some embodiments, the boundary line of the semantic object in the second boundary line information 58 may be corrected with reference to the third boundary line information 59. The boundary line error 57 may then be calculated based on the difference between the corrected boundary line and the first boundary line information 56. However, because the third boundary line information 59 may contain many boundary lines that are not associated with the semantic object, a noise removal process may additionally be performed on the third boundary line information 59 before the correction. The noise removal process may be performed in various ways. In some examples, the location information of the semantic object in the segmentation information 52 may be used to remove boundary lines not associated with the semantic object from the third boundary line information 59 (that is, only boundary lines closely associated with the semantic object are kept). In another example, the boundary line associated with the semantic object in the third boundary line information 59 may be corrected using the location and class information of the semantic object in the segmentation information 52 together with previously known shape information of the semantic object. Here, the shape information may include various information, such as the proportions of specific parts, in addition to the overall shape of the semantic object. In yet another example, the noise removal process described above may also be performed on the second boundary line information 58. According to this embodiment, the boundary line error 57 can be calculated more accurately by using boundary line information (56, 58, 59) detected or extracted in various ways. Accordingly, the segmentation neural network 51 can be trained to predict the shape information of the semantic object more precisely.

As in the first embodiment described above, the boundary line error 57 may be used to train the segmentation neural network 51 and the boundary line detection neural network 55, and the segmentation error 53 may be calculated based on the difference between the segmentation information 52 and the label information 54.

FIG. 7 is a diagram illustrating a segmentation neural network training process according to a third embodiment of the present disclosure. In the following, content that overlaps with the preceding embodiments is omitted, and the description focuses on the differences.

As shown in FIG. 7, in the third embodiment, an attention neural network 68 may be used to calculate the boundary line error 67 more accurately. The attention neural network 68 may be a neural network that outputs attention information for the input image 60. The attention information can be understood as weight information used to remove noise (e.g., predicted boundary lines that are not associated with the semantic object) from the first boundary line information 66-1. For example, the attention information may consist of per-pixel weight values, and noise removal may be performed by applying those per-pixel weight values to the first boundary line information 66-1. More specifically, when the attention information is applied to the first boundary line information 66-1, the values of boundary pixels associated with the semantic object are amplified and the values of unassociated boundary pixels are suppressed, so that the noise is removed.
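The per-pixel weighting can be sketched as an element-wise product. This is an assumed way of "applying" the attention information; the maps below are invented, and weights restricted to the range 0 to 1 only suppress values, so amplification would require weights above 1 or a later normalization.

```python
def apply_attention(boundary_map, attention_map):
    """Scale each boundary pixel by its attention weight (element-wise product)."""
    return [
        [b * a for b, a in zip(b_row, a_row)]
        for b_row, a_row in zip(boundary_map, attention_map)
    ]

# Predicted boundary strengths; the top-right pixel is an unrelated edge.
first_boundary = [[0.9, 0.8], [0.0, 0.7]]
# Hypothetical attention output: near 1 for object-related pixels, near 0 otherwise.
attention = [[1.0, 0.0], [1.0, 1.0]]
filtered = apply_attention(first_boundary, attention)
```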

The attention neural network 68 may be trained through the boundary line error 67. That is, through backpropagation of the boundary line error 67, the weights of the attention neural network 68 may be updated along with those of the segmentation neural network 61 and the boundary line detection neural network 65. In some other embodiments, however, the boundary line detection neural network 65 may be trained based on the error between the first boundary line information 66-1 and the second boundary line information 66-2 (that is, an error to which the attention information 69 has not been applied). Because the attention neural network 68 may be implemented with various network structures, the technical scope of the present disclosure is not limited by how the attention neural network 68 is implemented.

In some embodiments, the attention neural network 68 may be implemented as a deeper neural network than the boundary line detection neural network 65 (that is, one with more layers), or with more weight parameters. This is because the performance of the attention neural network 68 (that is, the accuracy of the attention information) tends to improve as the network becomes deeper.

As in the preceding embodiments, the segmentation error 63 may be calculated based on the difference between the segmentation information 62 and the label information 64, and the weights of the segmentation neural network 61 may be updated by backpropagating the segmentation error 63.

FIG. 8 is a diagram illustrating a segmentation neural network training method according to a fourth embodiment of the present disclosure. In the following, content that overlaps with the preceding embodiments is omitted, and the description focuses on the differences.

As shown in FIG. 8, in the fourth embodiment, no boundary line detection neural network is used, and the boundary line error 75 may be calculated based on first boundary line information 76 extracted from the input image 70 and second boundary line information 77 extracted from the segmentation information 72. Both extractions may be performed through image processing logic.

However, the first boundary line information 76 may contain many boundary lines that are not associated with the semantic object. Accordingly, in some embodiments of the present disclosure, a noise removal process may be performed on the first boundary line information 76 before the boundary line error 75 is calculated.

The noise removal process may be performed in various ways. In some examples, the location information of the semantic object in the segmentation information 72 may be used to remove boundary lines not associated with the semantic object from the first boundary line information 76 (that is, only boundary lines closely associated with the semantic object are kept). In another example, the boundary line associated with the semantic object in the first boundary line information 76 may be corrected using the location and class information of the semantic object in the segmentation information 72 together with previously known shape information of the semantic object. Here, the shape information may include various information, such as the proportions of specific parts, in addition to the overall shape of the semantic object.
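The location-based variant of noise removal could, for instance, keep only those edge pixels that lie within a small distance of the predicted object region. The square dilation, the `radius` parameter, and the function names below are hypothetical; the disclosure does not prescribe how proximity to the semantic object is measured.

```python
def dilate(mask, radius=1):
    """Grow a binary mask by `radius` pixels in every direction (square neighborhood)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        out[ny][nx] = 1
    return out

def filter_edges(edge_map, object_mask, radius=1):
    """Zero out edge pixels that are farther than `radius` from the predicted object."""
    near = dilate(object_mask, radius)
    return [
        [e if n else 0.0 for e, n in zip(e_row, n_row)]
        for e_row, n_row in zip(edge_map, near)
    ]

# The object occupies columns 0-1; image edges fire at column 2 (the true boundary)
# and at column 5 (an unrelated structure).
object_mask = [[1, 1, 0, 0, 0, 0] for _ in range(3)]
edge_map = [[0.0, 0.0, 1.0, 0.0, 0.0, 1.0] for _ in range(3)]
cleaned = filter_edges(edge_map, object_mask, radius=1)
```

Here the edge next to the predicted object survives while the distant one is dropped; a larger `radius` would tolerate a less precise object location at the cost of admitting more unrelated edges.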

In some embodiments, the noise removal process described above may also be performed on the second boundary line information 77.

As in the preceding embodiments, the segmentation error 73 may be calculated based on the difference between the segmentation information 72 and the label information 74, and the weights of the segmentation neural network 71 may be updated by backpropagating the segmentation error 73.

According to the fourth embodiment, no auxiliary neural network for boundary line detection needs to be trained. The computing cost of training can therefore be reduced.

The segmentation neural network training processes according to various embodiments of the present disclosure have been described above with reference to FIGS. 5 to 8. As described, the boundary line error may be calculated in various ways, and the segmentation neural network may be additionally trained on that boundary line error. In this way, a segmentation neural network can be built that predicts the shape information of the semantic object precisely even when only coarse labels are given. Because this also greatly reduces the cost of labeling, neural networks for semantic segmentation tasks can be built in a variety of domains (e.g., the medical domain).

Hereinafter, the inference process of the semantic segmentation method according to some embodiments of the present disclosure is described with reference to FIG. 9.

FIG. 9 is an exemplary flowchart illustrating the inference process. This is merely a preferred embodiment for achieving the object of the present disclosure, and steps may be added or removed as needed.

As shown in FIG. 9, the inference process is performed on an unlabeled image (S700) by inputting the image to the previously trained segmentation neural network (S800). The segmentation neural network then outputs segmentation information. For example, as shown in FIG. 1, an image 17 in which the semantic object is segmented, or a segmentation map 15 containing per-pixel class information on the semantic object, may be output.
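If the trained network emits one score per class for every pixel, a segmentation map of per-pixel class indices can be obtained by taking the highest-scoring class at each pixel. The score layout and the function name below are assumptions for illustration; the disclosure does not state the network's output format.

```python
def to_segmentation_map(score_map):
    """Pick the highest-scoring class index for every pixel.

    score_map: H x W x C nested lists of class scores (assumed output layout).
    """
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in row]
        for row in score_map
    ]

# Toy 2 x 2 output with two classes (0 = background, 1 = object).
network_output = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.6, 0.4], [0.3, 0.7]],
]
segmentation_map = to_segmentation_map(network_output)
```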

In some embodiments, a correction process may additionally be performed to improve the accuracy of the segmentation result, using boundary line information from the boundary line detection neural network (e.g., 45, 55, 65 in FIGS. 5 to 7). For example, correcting the segmentation information with reference to the boundary line information obtained from the boundary line detection neural network can further improve the accuracy of the segmentation result.

The inference process of the semantic segmentation method according to some embodiments of the present disclosure has been described above with reference to FIG. 9. Hereinafter, an exemplary computing device 100 capable of implementing an apparatus according to various embodiments of the present disclosure (e.g., the segmentation apparatus 10 of FIG. 1) is described with reference to FIG. 10.

FIG. 10 is a hardware configuration diagram illustrating the computing device 100.

As shown in FIG. 10, the computing device 100 may include one or more processors 110, a bus 150, a communication interface 170, a memory 130 into which a computer program executed by the processor 110 is loaded, and a storage 190 that stores a semantic segmentation computer program 191. FIG. 10 shows only the components relevant to the embodiments of the present disclosure. A person of ordinary skill in the art to which the present disclosure pertains will therefore recognize that other general-purpose components may be included in addition to those shown in FIG. 10.

The processor 110 controls the overall operation of each component of the computing device 100. The processor 110 may include a CPU (Central Processing Unit), an MPU (Micro Processor Unit), an MCU (Micro Controller Unit), a GPU (Graphic Processing Unit), or any type of processor well known in the technical field of the present disclosure. The processor 110 may also perform operations for at least one application or program for executing the methods or operations according to the embodiments of the present disclosure. The computing device 100 may include one or more processors.

The memory 130 stores various data, commands, and/or information. The memory 130 may load one or more programs 191 from the storage 190 in order to execute the various methods or operations according to the embodiments of the present disclosure. The memory 130 may be implemented as a volatile memory such as RAM, but the technical scope of the present disclosure is not limited thereto.

The bus 150 provides communication between the components of the computing device 100. The bus 150 may be implemented as various types of buses, such as an address bus, a data bus, and a control bus.

The communication interface 170 supports wired and wireless Internet communication of the computing device 100. The communication interface 170 may also support various communication methods other than Internet communication. To this end, the communication interface 170 may include a communication module well known in the technical field of the present disclosure. In some embodiments, the communication interface 170 may be omitted.

The storage 190 may store the one or more programs 191 in a non-transitory manner. The storage 190 may include a nonvolatile memory such as a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), or a flash memory, a hard disk, a removable disk, or any other type of computer-readable recording medium well known in the technical field to which the present disclosure belongs.

The computer program 191 may include one or more instructions that, when loaded into the memory 130, cause the processor 110 to perform methods according to various embodiments of the present disclosure. That is, the processor 110 may perform the methods/operations according to various embodiments of the present disclosure by executing the one or more instructions.

For example, the computer program 191 may include instructions for performing an operation of inputting a labeled image into a segmentation neural network to obtain segmentation information for the image, an operation of calculating a segmentation error and a boundary line error for the segmentation information, and an operation of updating the segmentation neural network by back-propagating the calculated segmentation error and the calculated boundary line error. In this case, the segmentation device 100 according to some embodiments of the present disclosure may be implemented through the computing device 100.
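
The error calculation described above can be illustrated with a minimal NumPy sketch. It is not the implementation of the present disclosure, and it rests on several assumptions introduced only for illustration: the coarse label marks unlabeled pixels with a hypothetical `IGNORE` value, the boundary line information is approximated by a normalized gradient magnitude (the claims also allow a boundary line detection neural network for the target image), the boundary line error uses an L1 loss, and the names `segmentation_error`, `boundary_error`, `total_error`, `w_seg`, and `w_bnd` are hypothetical. Back-propagation itself is not shown; in practice an automatic-differentiation framework would differentiate the combined error with respect to the network parameters.

```python
import numpy as np

IGNORE = -1  # hypothetical marker for pixels the coarse label leaves unlabeled


def softmax(logits):
    """Per-pixel softmax over the class axis of a (C, H, W) array."""
    shifted = logits - logits.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)


def segmentation_error(probs, coarse_label):
    """Cross-entropy averaged over the pixels that the coarse label covers."""
    mask = coarse_label != IGNORE
    if not mask.any():
        return 0.0
    rows, cols = np.nonzero(mask)
    # Probability assigned to the labeled class at each labeled pixel.
    picked = probs[coarse_label[rows, cols], rows, cols]
    return float(-np.log(picked + 1e-12).mean())


def edge_map(plane):
    """Gradient magnitude of a 2-D array scaled to [0, 1] (stand-in edge detector)."""
    gy, gx = np.gradient(plane.astype(float))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag


def boundary_error(probs, image, fg_class=1):
    """L1 distance between edges of the prediction and edges of the image."""
    first = edge_map(probs[fg_class])  # first boundary line information
    second = edge_map(image)           # second boundary line information
    return float(np.abs(first - second).mean())


def total_error(logits, coarse_label, image, w_seg=1.0, w_bnd=0.5):
    """Weighted sum of the segmentation error and the boundary line error."""
    probs = softmax(logits)
    return (w_seg * segmentation_error(probs, coarse_label)
            + w_bnd * boundary_error(probs, image))
```

In this sketch the segmentation error only sees the few pixels covered by the coarse label, while the boundary line error compares the whole predicted map with edges taken from the image, which is why the two terms complement each other. The weights `w_seg` and `w_bnd` correspond to assigning a weight to at least one of the two errors before the update.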

Various embodiments of the present disclosure, and the effects of those embodiments, have been described so far with reference to FIGS. 1 to 10. The effects according to the technical idea of the present disclosure are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the following description.

The technical idea of the present disclosure described so far with reference to FIGS. 1 to 10 may be implemented as computer-readable code on a computer-readable medium. The computer-readable recording medium may be, for example, a removable recording medium (a CD, a DVD, a Blu-ray disc, a USB storage device, or a removable hard disk) or a fixed recording medium (a ROM, a RAM, or a hard disk built into a computer). The computer program recorded on the computer-readable recording medium may be transmitted to another computing device over a network such as the Internet and installed on that computing device, and may thereby be used on that computing device.

Although all of the components constituting the embodiments of the present disclosure have been described above as being combined into one or as operating in combination, the technical idea of the present disclosure is not necessarily limited to these embodiments. That is, within the scope of the object of the present disclosure, all of the components may operate by being selectively combined into one or more.

Although the operations are shown in a specific order in the drawings, this should not be understood to mean that the operations must be executed in the specific order shown or in sequential order, or that all of the illustrated operations must be executed, in order to obtain a desired result. In certain situations, multitasking and parallel processing may be advantageous. Moreover, the separation of the various components in the embodiments described above should not be understood as requiring such separation, and it should be understood that the described program components and systems may generally be integrated together into a single software product or packaged into multiple software products.

Although embodiments of the present disclosure have been described above with reference to the accompanying drawings, those of ordinary skill in the art to which the present disclosure belongs will understand that the present disclosure may be implemented in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not limiting. The scope of protection of the present disclosure should be interpreted according to the claims below, and all technical ideas within a scope equivalent thereto should be interpreted as falling within the scope of rights of the technical idea defined by the present disclosure.

Claims

A semantic segmentation method performed by a computing device, the method comprising:
obtaining segmentation information on a semantic object included in a target image by using a segmentation neural network;
calculating a segmentation error related to the semantic object based on a difference between the obtained segmentation information and a coarse label corresponding to the target image;
calculating a boundary line error related to the semantic object based on a difference between first boundary line information obtained from the segmentation information and second boundary line information obtained from the target image; and
updating the segmentation neural network based on the segmentation error and the boundary line error.

The method of claim 1,
wherein calculating the boundary line error comprises:
obtaining the first boundary line information on the semantic object from the segmentation information; and
obtaining the second boundary line information on the semantic object from the target image.

The method of claim 1,
wherein the coarse label includes information on at least one of a location, a shape, and a class of the semantic object.

The method of claim 1,
wherein the coarse label is a label with lower precision than a fine label that includes boundary information related to at least one semantic object included in the target image.

The method of claim 1,
wherein the coarse label includes marking information indicating a location of the semantic object or scribble information on the semantic object.

The method of claim 1,
wherein calculating the segmentation error comprises calculating, as the segmentation error, a difference between a predicted class of the semantic object included in the obtained segmentation information and a ground-truth class included in the coarse label, based on a cross-entropy loss function.

The method of claim 1,
wherein calculating the boundary line error comprises calculating, as the boundary line error, a difference between the first boundary line information and the second boundary line information, based on an L1 loss function or an L2 loss function.

The method of claim 1,
wherein the updating comprises:
assigning a weight to at least one of the segmentation error and the boundary line error; and
updating the segmentation neural network by reflecting the assigned weight.

(Deleted)

The method of claim 1,
wherein calculating the boundary line error comprises obtaining the second boundary line information by inputting the target image into a boundary line detection neural network.

A semantic segmentation device comprising at least one processor and a memory,
wherein the memory stores one or more instructions, and
wherein the processor, by executing the one or more instructions:
obtains segmentation information on a semantic object included in a target image by using a segmentation neural network;
calculates a segmentation error related to the semantic object based on a difference between the obtained segmentation information and a coarse label corresponding to the target image;
calculates a boundary line error related to the semantic object based on a difference between first boundary line information obtained from the segmentation information and second boundary line information obtained from the target image; and
updates the segmentation neural network based on the segmentation error and the boundary line error.

The device of claim 11,
wherein the processor:
obtains the first boundary line information on the semantic object from the segmentation information; and
obtains the second boundary line information on the semantic object from the target image.

The device of claim 11,
wherein the coarse label includes information on at least one of a location, a shape, and a class of the semantic object.

The device of claim 11,
wherein the coarse label is a label with lower precision than a fine label that includes boundary information related to at least one semantic object included in the target image.

The device of claim 11,
wherein the coarse label includes marking information indicating a location of the semantic object or scribble information on the semantic object.

The device of claim 11,
wherein the processor calculates, as the segmentation error, a difference between a predicted class of the semantic object included in the obtained segmentation information and a ground-truth class included in the coarse label, based on a cross-entropy loss function.

The device of claim 11,
wherein the processor calculates, as the boundary line error, a difference between the first boundary line information and the second boundary line information, based on an L1 loss function or an L2 loss function.

The device of claim 11,
wherein the processor:
assigns a weight to at least one of the segmentation error and the boundary line error; and
updates the segmentation neural network by reflecting the assigned weight.

(Deleted)

The device of claim 11,
wherein the processor obtains the second boundary line information by inputting the target image into a boundary line detection neural network.