KR102427861B1

KR102427861B1 - Apparatus and method for generating underwater image data

Info

Publication number: KR102427861B1
Application number: KR1020200131291A
Authority: KR
Inventors: 강일범; 고한석; 구본화
Original assignee: 국방과학연구소
Priority date: 2020-10-12
Filing date: 2020-10-12
Publication date: 2022-08-01
Also published as: KR20220048300A

Abstract

The present disclosure relates to an apparatus for generating underwater image data using a generative adversarial network (GAN). The apparatus includes a pixel distribution predictor that extracts pixel distribution characteristics of an input image, a generator that generates underwater image data using the extracted pixel distribution characteristics, and a discriminator that determines whether the input image and the underwater image data are forged images. With these components, the apparatus can effectively generate a large amount of underwater image data.

Description

Apparatus and method for generating underwater image data

The present disclosure provides an apparatus and a method for generating underwater image data.

Underwater images are used for ocean exploration. For example, side-scan sonar is an important tool in various marine industries, such as seabed topography surveying and debris detection. Side-scan sonar data forms an underwater image line by line from the signals echoed back from a target or the terrain. Side-scan sonar images can be classified into high-frequency and low-frequency types according to the operating frequency, and can also be divided simply into seabed images and object images according to whether an object is present. Related research, such as underwater image recognition using side-scan sonar images, requires a large number of side-scan sonar images. Acquiring them requires extensive field work and actual experiments at sea, which are costly and time-consuming. Collecting and processing diverse data is also not easy. Because of the high cost and the time required, it is difficult to acquire a large number of side-scan sonar images.

Generative adversarial networks (GANs) are useful in many fields, such as image synthesis, content generation, and data augmentation, and are widely applied to sonar image simulation. A GAN exploits the strong feature extraction capability of a convolutional neural network (CNN), which means that it can generate realistic synthetic data from real data.

Accordingly, a method of generating sonar image data using a GAN is needed in order to obtain a large number of side-scan sonar images.

10-2020-0009852 A
10-2018-0065417 A

An object of the present disclosure is to provide an apparatus and a method for generating underwater image data. Another object is to provide a recording medium on which a program for executing the method on a computer is recorded. The technical problems to be solved by the present embodiments are not limited to those described above, and other technical problems may be inferred from the following embodiments.

As a technical means for achieving the above technical objects, a first aspect of the present disclosure may provide an apparatus for generating underwater image data, the apparatus including: a pixel distribution predictor that extracts pixel distribution characteristics of an input image; a generator that generates underwater image data using the extracted pixel distribution characteristics; and a discriminator that determines whether the input image and the underwater image data are forged images.

In addition, the pixel distribution predictor may learn the pixel distribution of the input image through unsupervised learning, extract the pixel distribution characteristics from the learned pixel distribution, and generate a random vector using the extracted pixel distribution characteristics.

In addition, the pixel distribution predictor may include at least one of a convolution layer and a fully-connected (FC) layer.

In addition, the generator may generate the underwater image data using a Pix2pixHD model.

In addition, the generator may include at least one of an upsampling convolution layer and a spade block that adds a segmentation map to the output of the upsampling convolution layer.

In addition, the spade block may include at least one of a convolution layer and a batch normalization layer.

In addition, the discriminator may use a Patch GAN (Generative Adversarial Network) structure.

In addition, the discriminator may determine whether the input image concatenated with a segmentation map along the channel dimension, and the underwater image data concatenated with the segmentation map along the channel dimension, are forged images.

In addition, the pixel distribution predictor may be trained through a loss function, and the loss function may be given by Equation 1 below.

[Equation 1]

(Here, the symbols of Equation 1 denote, in order: the loss function; the KL divergence function; the input image; the latent variable; the learned pixel distribution given the input image; and the standard Gaussian distribution.)

In addition, the objective function of the generator and the discriminator may be given by Equation 2 below.

[Equation 2]

(Here, the symbols of Equation 2 denote, in order: the GAN loss; the feature matching loss; the generator; the discriminator; the input image; and the semantic label map value corresponding to the size of the input image.)

A second aspect of the present disclosure may provide a method of generating underwater image data, the method including: extracting pixel distribution characteristics of an input image; generating underwater image data using the extracted pixel distribution characteristics; and determining whether the input image and the underwater image data are forged images.

A third aspect of the present disclosure may provide a recording medium on which a program for executing the method according to claim 11 on a computer is recorded.

A fourth aspect of the present disclosure may provide an apparatus for generating underwater image data, the apparatus including a memory and a processor, wherein the processor extracts pixel distribution characteristics of an input image, generates underwater image data using the extracted pixel distribution characteristics, and determines whether the input image and the underwater image data are forged images.

According to the present disclosure, underwater image data can be generated using a GAN. Specifically, by using a GAN, the generator and the discriminator are trained in competition with each other, and as training progresses the generator can produce underwater image data that increasingly resembles the input image. Underwater image data can also be generated effectively at different frequencies.

Because actual experiments at sea are not required, costs can be low and efficiency high, and images of excellent quality with fine detail can be produced.

FIG. 1 is a block diagram of an apparatus for generating underwater image data according to an embodiment.
FIG. 2 is a diagram illustrating a pixel distribution predictor according to an embodiment.
FIG. 3 is a diagram illustrating a generator and a discriminator according to an embodiment.
FIG. 4 is a diagram illustrating the structure of the spade block of FIG. 3.
FIG. 5 is a diagram comparing an input image with underwater image data.
FIG. 6 is a table describing the performance of the apparatus for generating underwater image data according to an embodiment.
FIG. 7 is a block diagram of an apparatus for generating underwater image data according to another embodiment.

The terms used in the present embodiments are, as far as possible, general terms in wide current use, selected in consideration of their functions in the embodiments. They may nevertheless vary with the intention of those skilled in the art, legal precedent, or the emergence of new technology. In certain cases the applicant has chosen a term arbitrarily, and its meaning is then described in detail in the relevant passage. The terms used in the present embodiments should therefore be defined on the basis of their meaning and the overall content of the embodiments, not simply by their names.

The present embodiments may be modified in various ways and may take various forms, so some embodiments are illustrated in the drawings and described in detail. This is not intended to limit the embodiments to a specific disclosed form; they should be understood to include all modifications, equivalents, and substitutes that fall within their spirit and technical scope. The terms used in this specification serve only to describe the embodiments and are not intended to limit them.

Unless otherwise defined, the terms used in the present embodiments have the same meaning as commonly understood by a person of ordinary skill in the art to which the embodiments belong. Terms defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art, and, unless explicitly defined in the present embodiments, should not be interpreted in an idealized or excessively formal sense.

Some embodiments of the present disclosure may be represented by functional block configurations and various processing steps. Some or all of these functional blocks may be implemented by any number of hardware and/or software components that perform specific functions. For example, the functional blocks of the present disclosure may be implemented by one or more microprocessors or by circuit configurations for a given function. The functional blocks may also be implemented in various programming or scripting languages, and as algorithms running on one or more processors. In addition, the present disclosure may employ conventional techniques for electronic configuration, signal processing, and/or data processing.

Terms including ordinal numbers, such as "first" or "second", may be used in this specification to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

In addition, the connecting lines or connecting members between the components shown in the drawings merely illustrate functional connections and/or physical or circuit connections. In an actual device, the connections between components may be realized by various functional, physical, or circuit connections that can be replaced or added.

Hereinafter, the present disclosure is described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of an apparatus for generating underwater image data according to an embodiment.

Referring to FIG. 1, the apparatus 100 for generating underwater image data may include a pixel distribution predictor 110, a generator 120, and a discriminator 130.

The pixel distribution predictor 110 may extract pixel distribution characteristics of the input image. The pixel distribution predictor 110 may be a pixel-level feature extractor, and by extracting pixel-level features it can effectively capture the style and content of the input image. The input image is the image input to the apparatus 100 and may be a side-scan sonar image. The pixel distribution characteristics may be the pixel distribution parameter vectors, namely a mean (μ) vector and a variance (σ²) vector.

In an embodiment, the pixel distribution predictor 110 may learn the pixel distribution of the input image through unsupervised learning, extract the pixel distribution characteristics from the learned pixel distribution, and generate a random vector using the extracted pixel distribution characteristics. Unsupervised learning is a machine learning approach in which a computer finds regularities in the inputs using training data that contains only input values.
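The text does not state how the random vector is drawn from the extracted characteristics. One common choice, assumed here and not confirmed by the source, is the reparameterized sampling used in variational autoencoders, where each element is the mean plus the standard deviation times standard normal noise. The function name `sample_latent` is hypothetical.

```python
import math
import random


def sample_latent(mean, variance, rng=None):
    """Draw z = mean + sqrt(variance) * eps with eps ~ N(0, 1), element-wise.

    Assumption: the random vector follows a diagonal Gaussian whose
    parameters are the extracted mean and variance vectors.
    """
    if len(mean) != len(variance):
        raise ValueError("mean and variance must have the same length")
    rng = rng or random.Random()
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, variance)]


# Example: a 4-dimensional random vector; the third element has zero variance,
# so it always equals its mean.
z = sample_latent([0.0, 0.5, -1.0, 2.0], [1.0, 0.25, 0.0, 4.0], random.Random(0))
```

Passing a seeded `random.Random` makes the draw reproducible, which is convenient when comparing generated images across runs.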

The pixel distribution predictor 110 may be trained through a loss function, which can be expressed by Equation 1 below.

[Equation 1]
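The equation itself is an image that is not reproduced in this text. A form consistent with the terms defined after it is the KL term familiar from variational autoencoders. The symbol names below (x for the input image, z for the latent variable, q(z | x) for the learned pixel distribution, p(z) for the standard Gaussian) are assumptions rather than the patent's own notation.

```latex
\mathcal{L}_{\mathrm{KLD}} = D_{\mathrm{KL}}\bigl(q(z \mid x)\,\|\,p(z)\bigr),
\qquad p(z) = \mathcal{N}(0, I)
```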

Here, the symbols of Equation 1 denote, in order: the loss function; the KL divergence function; the input image; the learned pixel distribution given the input image; and the standard Gaussian distribution. The remaining symbol is the latent variable, which may be the output of the pixel distribution predictor 110.

Using the KL divergence as the loss function, the difference between the pixel distribution generated from the input image and the pixel distribution generated from a random variable can be computed. Minimizing the KL divergence lets the predictor learn the pixel distribution while forcing it to follow a normal distribution.
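If the learned distribution is taken to be a diagonal Gaussian with the mean and variance vectors described earlier (an assumption; the text does not give the parameterization), the KL divergence to a standard Gaussian has a closed form. The sketch below computes it; the function name is hypothetical.

```python
import math


def kl_to_standard_normal(mean, variance):
    """KL( N(mean, diag(variance)) || N(0, I) ) in closed form.

    Per dimension: 0.5 * (variance + mean^2 - 1 - ln(variance)).
    """
    if len(mean) != len(variance):
        raise ValueError("mean and variance must have the same length")
    if any(v <= 0.0 for v in variance):
        raise ValueError("variances must be positive")
    return 0.5 * sum(v + m * m - 1.0 - math.log(v) for m, v in zip(mean, variance))


print(kl_to_standard_normal([0.0, 0.0], [1.0, 1.0]))  # 0.0: already standard normal
print(kl_to_standard_normal([1.0], [1.0]))            # 0.5: mean shifted by one
```

The loss is zero only when every mean is 0 and every variance is 1, which is the sense in which minimizing it pushes the learned distribution toward a normal distribution.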

The generator 120 may generate underwater image data using the extracted pixel distribution characteristics. Using the random vector produced by the pixel distribution predictor 110, the generator 120 may generate underwater image data similar in form to the input image. The underwater image data may be a side-scan sonar image generated by the apparatus for generating underwater image data. The generator 120 is described in detail below with reference to FIG. 3.

The discriminator 130 may determine whether the input image and the underwater image data are forged images. The generator 120 and the discriminator 130 may be trained in competition with each other. The discriminator 130 evaluates the underwater image data produced by the generator 120, and the generator 120 uses that evaluation to produce underwater image data that more closely resembles the input image. As training progresses, the underwater image data generated by the generator 120 may become increasingly similar to the input image.

FIG. 2 is a diagram illustrating a pixel distribution predictor according to an embodiment.

Referring to FIG. 2, the input image may pass through Conv (convolution) layers, a Reshape layer, and Linear layers.

In an embodiment, the pixel distribution predictor (for example, the pixel distribution predictor 110 of FIG. 1) may include at least one of a convolution layer and a fully-connected (FC) layer. The convolution layer corresponds to the Conv layer of FIG. 2 and performs a convolution operation. There may be a plurality of convolution layers, preferably six. In each convolution layer of FIG. 2, 3 × 3 denotes the size of the layer's kernel, and the symbol shown with it (missing from this text) indicates that the width and the height of the input image are each halved. For example, when an input image of size 100 × 100 passes through a convolution layer, its size may be reduced to 50 × 50. A second symbol in FIG. 2 (also missing from this text) indicates that the number of channels is increased because the size of the input image has been reduced.
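The halving described above can be checked with the usual convolution output-size formula. The sketch assumes a stride of 2 and a padding of 1 for the 3 × 3 kernel, neither of which is stated in the text, and the 256-pixel input is a hypothetical example.

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def encoder_sizes(size, num_layers=6):
    """Sizes after each of `num_layers` stride-2 convolutions (assumed settings)."""
    sizes = []
    for _ in range(num_layers):
        size = conv_output_size(size)
        sizes.append(size)
    return sizes


print(conv_output_size(100))  # 50, matching the 100 x 100 -> 50 x 50 example
print(encoder_sizes(256))     # [128, 64, 32, 16, 8, 4]
```

Under these assumptions, six layers shrink each side by a factor of 64, which is why the channel count is typically raised along the way to preserve capacity.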

The Reshape layer may change the shape of the convolution layers' output so that it can be fed to the Linear layer. For example, when the output of the convolution layers is two-dimensional, the Reshape layer may flatten it to one dimension so that it becomes the input of the Linear layer.
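As a minimal illustration of the reshape step, the sketch below flattens a nested `[channels][height][width]` list into a single one-dimensional list. The layout and the function name are assumptions made for the example.

```python
def flatten(feature_maps):
    """Flatten a [channels][height][width] nested list into one flat list."""
    return [value for channel in feature_maps for row in channel for value in row]


# Two 2 x 2 feature maps become one vector of length 8.
features = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
print(flatten(features))  # [1, 2, 3, 4, 5, 6, 7, 8]
```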

The pixel distribution predictor may include a plurality of FC layers, preferably two. The Linear layers of FIG. 2 correspond to the FC layers.

By passing the input image through the six convolution layers, the Reshape layer, and the Linear layers of the pixel distribution predictor, the pixel distribution characteristics (a mean (μ) vector and a variance (σ²) vector) may be extracted, and a random vector may be generated using the extracted pixel distribution characteristics.

FIG. 3 is a diagram illustrating a generator and a discriminator according to an embodiment.

Referring to FIG. 3, the apparatus 300 for generating underwater image data may include a pixel distribution predictor 310, a generator 320, and a discriminator 330. The generator 320 may generate underwater image data from the random vector output by the pixel distribution predictor 310, using a segmentation map. The underwater image data is an image similar in form to the input image and may be a side-scan sonar image generated by the apparatus 300. The apparatus 300, pixel distribution predictor 310, generator 320, and discriminator 330 of FIG. 3 correspond to the apparatus 100, pixel distribution predictor 110, generator 120, and discriminator 130 of FIG. 1, so redundant description is omitted.

In an embodiment, the generator 320 may generate the underwater image data using the Pix2pixHD model. Pix2pixHD is a method of synthesizing high-resolution, realistic images from a semantic label map using a GAN, and may refer to a GAN model with a special structure designed to generate HD images. Because the generator 320 starts the generation process from the random vector produced by the pixel distribution predictor 310, it may include only the upsampling convolution layers 321, 322, and 323 of Pix2pixHD.

The generator 320 may include at least one of upsampling convolution layers 321, 322, and 323 and spade blocks 324, 325, and 326. It may include a plurality of each, preferably seven upsampling convolution layers and seven spade blocks. The upsampling convolution layers and the spade blocks may be arranged alternately, one at a time, so that the pattern of an upsampling convolution layer followed by a spade block is repeated seven times. For example, the upsampling convolution layer 321, spade block 324, upsampling convolution layer 322, spade block 325, upsampling convolution layer 323, and spade block 326 may be arranged or executed in that order.
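The alternating arrangement can be written down as a simple layer plan. In the sketch below the layer names are hypothetical labels, and `trace_resolution` additionally assumes that each upsampling stage doubles the spatial size starting from a 4-pixel side; neither the factor nor the starting size is given in the text.

```python
def build_generator_plan(num_stages=7):
    """Return the layer order: an upsampling convolution followed by a spade block, repeated."""
    plan = []
    for stage in range(1, num_stages + 1):
        plan.append(f"upsampling_conv_{stage}")
        plan.append(f"spade_block_{stage}")
    return plan


def trace_resolution(start, num_stages=7, factor=2):
    """Side length after each stage, assuming every stage upsamples by `factor`."""
    sizes = []
    for _ in range(num_stages):
        start *= factor
        sizes.append(start)
    return sizes


print(build_generator_plan()[:4])
# ['upsampling_conv_1', 'spade_block_1', 'upsampling_conv_2', 'spade_block_2']
print(trace_resolution(4))  # [8, 16, 32, 64, 128, 256, 512]
```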

The spade blocks 324, 325, and 326 may add a segmentation map to the output of the upsampling convolution layers 321, 322, and 323. The inputs to each spade block are the segmentation map and the output of the immediately preceding upsampling convolution layer. For example, the inputs to the spade block 324 may be the segmentation map and the output of the upsampling convolution layer 321, and the inputs to the spade block 325 may be the segmentation map and the output of the upsampling convolution layer 322. The spade blocks 324, 325, and 326 are described in detail below with reference to FIG. 4.

The discriminator 330 may determine whether the input image and the underwater image data generated by the generator 320 are forged images.

In an embodiment, the discriminator 330 may determine whether the input image concatenated with the segmentation map along the channel dimension, and the underwater image data concatenated with the segmentation map along the channel dimension, are forged images. "Concatenate" in FIG. 3 denotes the concatenation operation, which joins two images into a single image. For example, the segmentation map and the input image may be concatenated along the channel dimension into one image, and the segmentation map and the underwater image data generated by the generator 320 may likewise be concatenated along the channel dimension into one image.
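Channel-wise concatenation stacks the channels of two images that share the same height and width. A minimal sketch, assuming images are stored as nested `[channels][height][width]` lists (an illustrative layout, not one specified by the text):

```python
def concat_channels(first, second):
    """Concatenate two [channels][height][width] images along the channel axis."""
    if not first or not second:
        raise ValueError("both images need at least one channel")
    shape_a = (len(first[0]), len(first[0][0]))
    shape_b = (len(second[0]), len(second[0][0]))
    if shape_a != shape_b:
        raise ValueError("images must share height and width")
    return first + second


# A one-channel segmentation map joined with a one-channel sonar image
# gives a two-channel input for the discriminator.
segmentation_map = [[[0, 1], [1, 0]]]
sonar_image = [[[0.2, 0.4], [0.6, 0.8]]]
pair = concat_channels(segmentation_map, sonar_image)
print(len(pair))  # 2
```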

The discriminator 330 may determine whether the input image concatenated with the segmentation map along the channel dimension, and the underwater image data concatenated with the segmentation map along the channel dimension, are forged images, and output a result value Y.

In an embodiment, the discriminator 330 may use a Patch GAN (Generative Adversarial Network) structure. Patch GAN decides whether an image is forged patch by patch, using patches of a specific size instead of the whole image, and then averages the results. Because the operation is performed by a sliding window that moves over patches and not over the entire area, the number of parameters may decrease, which can increase computation speed. In addition, the structure is not affected by the size of the image and is therefore flexible.
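The patch-then-average idea can be sketched with a sliding window over a single-channel image. The scoring function here is a hypothetical stand-in (mean intensity); in an actual Patch GAN the per-patch score would come from a small convolutional network.

```python
def patch_scores(image, patch, stride, score_patch):
    """Score every patch x patch window of a 2-D image with `score_patch`."""
    height, width = len(image), len(image[0])
    scores = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            window = [row[left:left + patch] for row in image[top:top + patch]]
            scores.append(score_patch(window))
    return scores


def patch_decision(image, patch, stride, score_patch):
    """Average the per-patch scores into one decision value."""
    scores = patch_scores(image, patch, stride, score_patch)
    if not scores:
        raise ValueError("patch is larger than the image")
    return sum(scores) / len(scores)


def mean_intensity(window):
    """Hypothetical stand-in scorer used only for this illustration."""
    values = [v for row in window for v in row]
    return sum(values) / len(values)


image = [[1.0, 1.0, 1.0, 1.0],
         [1.0, 1.0, 1.0, 1.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
print(patch_decision(image, patch=2, stride=2, score_patch=mean_intensity))  # 0.5
```

Because the same scorer is reused on every window, the parameter count does not depend on the image size, which matches the flexibility noted above.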

The objective function of the generator 320 and the discriminator 330 may consist of two loss functions: a GAN loss and a feature matching loss. The objective function can be expressed by Equation 2 below.

[Equation 2]

(The equation and its symbols were not preserved in this text.)

Here, the symbols of Equation 2 denote, in order, the GAN loss, the feature matching loss, the generator 320, the determining unit 330, the input image, and the semantic label map value corresponding to the size of the input image. The feature matching loss may help stabilize training by leading the generator 320 to produce realistic images at multiple resolutions, and the GAN loss may focus on learning the conditional distribution of the input image together with the segmentation map.
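As a rough illustration only, an objective that combines a GAN loss with a feature matching loss is often written in the following form in conditional-GAN work. The symbols G and D, the weight λ, and the min-max arrangement are assumptions taken from common practice, not the notation of the disclosure.

```latex
\min_{G} \Big( \big( \max_{D} \mathcal{L}_{\mathrm{GAN}}(G, D) \big)
  + \lambda \, \mathcal{L}_{\mathrm{FM}}(G, D) \Big)
```

In this form, the determining unit D is trained only on the GAN term, while the generator G is trained on both terms, with λ setting their relative weight.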

FIG. 4 is a diagram for explaining the structure of the spade block of FIG. 3.

Referring to FIG. 4, the spade block 400 of FIG. 4 may include a convolution layer and a batch normalization layer. The spade block 400 and the upsampling convolution layer 410 of FIG. 4 correspond to the spade blocks 324, 325, and 326 and the upsampling convolution layers 321, 322, and 323 of FIG. 3, respectively, so redundant description is omitted.

The inputs of the spade block 400 may be the segmentation map and the output of the preceding upsampling convolution layer. The preceding upsampling convolution layer means the upsampling convolution layer immediately before the spade block 400, whose output is fed into that spade block. The output of the spade block 400 may become the input of the next upsampling convolution layer 410. Including the spade block 400 in the generator allows the generator to additionally obtain the information of the segmentation map.

In an embodiment, the spade block 400 may include at least one of a convolution layer and a batch normalization layer. The batch normalization layer performs batch normalization, which may mean normalizing the input so that its mean becomes 0 and its variance becomes 1.
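The normalization step can be sketched on a plain list of values. The small constant `eps` is an assumption commonly added for numerical stability and is not mentioned in the disclosure; the function name `normalize` is likewise hypothetical.

```python
import math

def normalize(values, eps=1e-5):
    """Shift and scale values to zero mean and approximately unit variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    # eps (an assumption) guards against division by zero for constant inputs.
    return [(v - mean) / math.sqrt(var + eps) for v in values]

batch = [2.0, 4.0, 6.0, 8.0]
out = normalize(batch)
out_mean = sum(out) / len(out)
out_var = sum((v - out_mean) ** 2 for v in out) / len(out)
print(out_mean, out_var)
```

After normalization the mean is 0 and the variance is very close to 1; the small gap from exactly 1 comes from `eps`.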

The segmentation map may pass through the convolution layer of the spade block 400 and be converted into a scaling factor and a shifting factor for the batch normalization layer. The scaling factor may mean a scale adjustment parameter applied to the normalized data, and the shifting factor may mean a shift adjustment parameter applied to the normalized data.
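The way a segmentation map can drive the scaling and shifting of normalized data is sketched below. Several things are assumptions made for illustration: the per-class lookup table `CLASS_PARAMS` stands in for the convolution layer (it behaves like a 1x1 convolution over a one-hot label map), its values are made up, and the combination rule scale × normalized + shift is the usual affine form rather than a formula stated in the disclosure.

```python
import math

# Hypothetical per-class (scale, shift) table; it plays the role of a 1x1
# convolution applied to a one-hot segmentation map. Values are made up.
CLASS_PARAMS = {0: (1.0, 0.0), 1: (2.0, 0.5)}

def spade_modulate(features, seg_map, eps=1e-5):
    """Normalize a 2-D feature map, then scale and shift it per pixel."""
    flat = [v for row in features for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    std = math.sqrt(var + eps)
    out = []
    for feat_row, seg_row in zip(features, seg_map):
        out_row = []
        for value, label in zip(feat_row, seg_row):
            scale, shift = CLASS_PARAMS[label]  # chosen by the pixel's class
            out_row.append(scale * (value - mean) / std + shift)
        out.append(out_row)
    return out

features = [[1.0, 3.0], [5.0, 7.0]]
seg_map = [[0, 0], [1, 1]]
modulated = spade_modulate(features, seg_map)
print(modulated)
```

Pixels labeled 0 keep their normalized values, while pixels labeled 1 are stretched and offset, so the layout of the segmentation map survives the normalization.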

FIG. 5 is a diagram for comparing input images with underwater image data.

In FIG. 5, OURS HO (High frequency Object) may denote an underwater image containing an object measured at high frequency, and OURS LO (Low frequency Object) may denote an underwater image containing an object measured at low frequency. HO and LO may include seabed environments that are more complex than pure seabed environments such as canyons or ridges. The first row may show the segmentation map, the second row the input image, and the third row the underwater image data. High frequency and low frequency may cover the ranges readily understood by a person of ordinary skill in the art to which the present invention pertains.

The HO input image in FIG. 5 contains the seabed rocks marked in the segmentation map. The input image may consist of three regions: a highlighted region (facing the side-scan sonar signal), a shadow region (where the side-scan sonar signal is blocked by the highlighted region), and a background (the seabed region). Comparing the input image with the underwater image data shows that the underwater image data may reflect the three regions well while maintaining high image quality relative to the input image. The underwater image data may mean a side-scan sonar image generated by the apparatus for generating underwater image data according to the present disclosure.

LO in FIG. 5 shows underwater images generated at low frequency. Because a low-frequency side-scan sonar operates in deep-sea environments, LO may have relatively lower image quality and resolution than HO. Because the segmentation map can be added to the underwater image data, the underwater image data may be rendered realistically with fine structure and may contain a low level of noise.

FIG. 6 is a table for explaining the performance of the apparatus for generating underwater image data according to an embodiment.

In FIG. 6, Acc (Accuracy) means pixel accuracy, and mIoU (Mean Intersection over Union) means the mean intersection over union. Pixel accuracy is the proportion of correctly classified pixels. IoU may mean the overlap area between the segmentation result and the label map divided by the area of their union.
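The two metrics can be computed on small label grids as in the following sketch. The function names and the toy grids are hypothetical, and skipping classes that appear in neither map is an assumption about how the mean is taken.

```python
def pixel_accuracy(pred, label):
    """Fraction of pixels whose predicted class equals the labeled class."""
    pairs = [(p, t) for pr, lr in zip(pred, label) for p, t in zip(pr, lr)]
    return sum(p == t for p, t in pairs) / len(pairs)

def mean_iou(pred, label, classes):
    """Average over classes of intersection area divided by union area."""
    pairs = [(p, t) for pr, lr in zip(pred, label) for p, t in zip(pr, lr)]
    ious = []
    for c in classes:
        inter = sum(p == c and t == c for p, t in pairs)
        union = sum(p == c or t == c for p, t in pairs)
        if union:  # skip classes absent from both maps (an assumption)
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy 2x3 label map and segmentation result with classes 0 and 1.
label = [[0, 0, 1], [0, 1, 1]]
pred = [[0, 1, 1], [0, 1, 0]]

acc = pixel_accuracy(pred, label)
miou = mean_iou(pred, label, classes=[0, 1])
print(acc, miou)
```

Here four of six pixels match, so Acc is 4/6, and each class has an intersection of 2 pixels over a union of 4, so mIoU is 0.5.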

Acc and mIoU may be used as indicators for semantic segmentation because, when the underwater image data is sufficiently realistic, accurate predictions can be obtained using the underwater image data. The semantic segmentation methods PSPNet and FCN8s may be used to evaluate segmentation results on the set of underwater image data and the set of input images. In FIG. 6, generated data (%) may mean the proportion of underwater image data relative to input images. For example, generated data of 0% may mean that the proportion of underwater image data is 0%, and generated data of 50% may mean that the underwater image data and the input images each account for 50%.
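One way to read the generated data (%) setting is as the share of generated samples in a training set of fixed size. The sketch below follows that reading; the fixed total, the helper name `mix_training_set`, and the sample names are assumptions, since the disclosure does not say how the sets are assembled.

```python
def mix_training_set(real, generated, generated_pct, total):
    """Build a training list in which generated_pct percent of items are generated."""
    n_generated = round(total * generated_pct / 100)
    n_real = total - n_generated
    if n_generated > len(generated) or n_real > len(real):
        raise ValueError("not enough samples for the requested mix")
    return real[:n_real] + generated[:n_generated]

# Placeholder sample names; real data would be image and label-map pairs.
real = [f"real_{i}" for i in range(10)]
generated = [f"gen_{i}" for i in range(10)]

baseline = mix_training_set(real, generated, generated_pct=0, total=10)
half = mix_training_set(real, generated, generated_pct=50, total=10)
print(len(baseline), sum(name.startswith("gen_") for name in half))
```

With this reading, 0% reproduces a training set of input images only, and 50% gives equal numbers of input images and generated underwater image data.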

Acc and mIoU may serve as indicators of the reliability of the underwater image data. For example, if the values for generated data of 0% and 50% are similar, or if Acc is higher at 50% than at 0%, this may mean that the underwater image data is reliable.

When PSPNet is used, the metrics may improve as the proportion of underwater image data relative to input images increases, and the reliability of the underwater image data may be high when generated data is 50%. For example, at high frequency, Acc may be 87.62% and mIoU 65.95% when generated data is 0%, and Acc may be 90.95% and mIoU 73.84% when generated data is 50%. As another example, at low frequency, Acc may be 87.71% and mIoU 68.08% when generated data is 0%, and Acc may be 90.95% and mIoU 73.84% when generated data is 50%. Because Acc and mIoU are higher at generated data of 50% than at 0%, the underwater image data can be seen to be reliable.
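The reliability criterion can be applied to the high-frequency PSPNet figures quoted above, as in this short sketch. The function name `looks_reliable` and the tolerance of 0.5 percentage points used for "similar" are assumptions; the disclosure gives no numeric threshold.

```python
def looks_reliable(acc_0, acc_50, tolerance=0.5):
    """True if Acc at 50% generated data is higher than, or within tolerance of, Acc at 0%."""
    return acc_50 > acc_0 or abs(acc_50 - acc_0) <= tolerance

# High-frequency PSPNet figures quoted in the text (percent).
high_freq = {"acc_0": 87.62, "acc_50": 90.95, "miou_0": 65.95, "miou_50": 73.84}

acc_gain = round(high_freq["acc_50"] - high_freq["acc_0"], 2)
miou_gain = round(high_freq["miou_50"] - high_freq["miou_0"], 2)
reliable = looks_reliable(high_freq["acc_0"], high_freq["acc_50"])
print(acc_gain, miou_gain, reliable)
```

For these figures the gains are 3.33 percentage points in Acc and 7.89 percentage points in mIoU, so the criterion is met.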

When FCN8s is used, the reliability of the underwater image data may be high when generated data is below 50%. For example, at high frequency, Acc may be 88.72% and mIoU 69.19% when generated data is 0%; Acc may be 89.02% and mIoU 69.8% when generated data is 10%; and Acc may be 89.01% and mIoU 69.89% when generated data is 30%. Acc and mIoU at generated data of 10% and 30% may be higher than at 0%, with Acc highest at 10% and mIoU highest at 30%.

As another example, at low frequency, Acc may be 85.60% and mIoU 73.75% when generated data is 0%, and Acc may be 86.84% and mIoU 75.43% when generated data is 20%. Acc and mIoU at generated data of 20% may be higher than at 0%, and may be the highest at 20%. Because Acc and mIoU are higher at generated data of 10%, 20%, and 30% than at 0%, the underwater image data can be seen to be reliable. Accordingly, the apparatus according to the present disclosure may generate reliable underwater image data and thereby improve the quality of side-scan sonar images.

FIG. 7 is a block diagram of an apparatus for generating underwater image data according to another embodiment.

Referring to FIG. 7, the apparatus 700 for generating underwater image data may include a memory 710 and a processor 720. FIG. 7 shows only the components of the apparatus 700 that are related to the embodiment. Accordingly, a person of ordinary skill in the art will understand that other general-purpose components may be included in addition to those shown in FIG. 7.

The memory 710 is hardware that stores various data processed within the apparatus 700 for generating underwater image data. For example, the memory 710 may store the input image, the pixel distribution characteristic, the segmentation map, a random vector extracted from the pixel distribution characteristic, and the like. The memory 710 may include RAM (random access memory) such as DRAM (dynamic random access memory) or SRAM (static random access memory), ROM (read-only memory), EEPROM (electrically erasable programmable read-only memory), CD-ROM, Blu-ray or other optical disk storage, an HDD (hard disk drive), an SSD (solid state drive), or flash memory.

The processor 720 performs the overall functions for generating underwater image data described above with reference to FIGS. 1 to 6.

In an embodiment, the processor 720 may extract a pixel distribution characteristic of the input image, generate underwater image data using the extracted pixel distribution characteristic, and determine whether the input image and the underwater image data are forged images.

The present embodiments may also be implemented in the form of a recording medium on which a program, such as a program module executed by a computer, is recorded. A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and non-volatile media and removable and non-removable media. A computer-readable medium may also include both computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, other data in a modulated data signal, or other transport mechanisms, and include any information delivery medium.

In addition, in this specification, a "unit" may be a hardware component such as a processor or a circuit, and/or a software component executed by a hardware component such as a processor.

The foregoing description is provided for illustration, and a person of ordinary skill in the art to which this specification pertains will understand that it can readily be modified into other specific forms without changing the technical idea or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The description of the above embodiments is merely illustrative, and a person of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Accordingly, the true scope of protection of the invention should be determined by the appended claims, and all differences within a scope equivalent to what is recited in the claims should be construed as falling within the scope of protection defined by the claims.

100: apparatus for generating underwater image data 110: pixel distribution prediction unit
120: generator 130: determining unit
300: apparatus for generating underwater image data 310: pixel distribution prediction unit
320: generator 330: determining unit
321: upsampling convolution layer 322: upsampling convolution layer
323: upsampling convolution layer 324: spade block
325: spade block 326: spade block
400: spade block 410: upsampling convolution layer
700: apparatus for generating underwater image data 710: memory
720: processor

Claims

An apparatus for generating underwater image data, comprising:
a pixel distribution prediction unit configured to extract a pixel distribution characteristic of an input image;
a generator configured to generate underwater image data using the extracted pixel distribution characteristic; and
a determining unit configured to determine whether the input image and the underwater image data are forged images,
wherein the input image is a side-scan sonar image,
wherein the generator includes at least one of an upsampling convolution layer and a spade block that adds a segmentation map to an output of the upsampling convolution layer,
wherein the upsampling convolution layer or the spade block is provided in plurality,
wherein the spade block includes at least one of a convolution layer and a batch normalization layer, and
wherein the determining unit uses a Patch GAN (Generative Adversarial Network) structure.

The apparatus of claim 1,
wherein the pixel distribution prediction unit
learns a pixel distribution of the input image through unsupervised learning,
extracts the pixel distribution characteristic from the learned pixel distribution, and
generates a random vector using the extracted pixel distribution characteristic.

The apparatus of claim 1,
wherein the pixel distribution prediction unit
includes at least one of a convolution layer and an FC (Fully-Connected) layer.

The apparatus of claim 1,
wherein the generator
generates the underwater image data using a Pix2pixHD model.

(deleted)

The apparatus of claim 1,
wherein the determining unit
determines whether the input image concatenated with a segmentation map along the channel dimension and the underwater image data concatenated with the segmentation map along the channel dimension are forged images.

The apparatus of claim 1,
wherein the pixel distribution prediction unit is trained using a loss function, and
the loss function is given by Equation 1 below.
[Equation 1]
(The equation and its symbols were not preserved in this text.)
(Here, the symbols of Equation 1 denote, in order, the loss function, the KL divergence function, the input image, a latent variable, the pixel distribution learned from the input image, and the standard Gaussian distribution.)

The apparatus of claim 1,
wherein the objective function of the generator and the determining unit is given by Equation 2 below.
[Equation 2]
(The equation and its symbols were not preserved in this text.)
(Here, the symbols of Equation 2 denote, in order, the GAN loss, the feature matching loss, the generator, the determining unit, the input image, and the semantic label map value corresponding to the size of the input image.)

A method for generating underwater image data, the method comprising:
extracting a pixel distribution characteristic of an input image;
generating underwater image data using the extracted pixel distribution characteristic; and
determining whether the input image and the underwater image data are forged images,
wherein the input image is a side-scan sonar image,
wherein the generating of the underwater image data using the extracted pixel distribution characteristic includes at least one of an upsampling convolution layer and a spade block that adds a segmentation map to an output of the upsampling convolution layer,
wherein the upsampling convolution layer or the spade block is provided in plurality,
wherein the spade block includes at least one of a convolution layer and a batch normalization layer, and
wherein the determining of whether the input image and the underwater image data are forged images uses a Patch GAN structure.

A recording medium storing a program for causing a computer to execute the method according to claim 11.

An apparatus for generating underwater image data, comprising:
a memory; and
a processor,
wherein the processor is configured to
extract a pixel distribution characteristic of an input image,
generate underwater image data using the extracted pixel distribution characteristic, and
determine whether the input image and the underwater image data are forged images,
wherein the input image is a side-scan sonar image,
wherein generating the underwater image data using the extracted pixel distribution characteristic includes at least one of an upsampling convolution layer and a spade block that adds a segmentation map to an output of the upsampling convolution layer,
wherein the upsampling convolution layer or the spade block is provided in plurality,
wherein the spade block includes at least one of a convolution layer and a batch normalization layer, and
wherein determining whether the input image and the underwater image data are forged images uses a Patch GAN structure.