KR20210098656A

KR20210098656A - A rendering method based on machine learning driven adaptive sampling

Info

Publication number: KR20210098656A
Application number: KR1020200012452A
Authority: KR
Inventors: 윤성의; 유치 후오
Original assignee: 한국과학기술원
Priority date: 2020-02-03
Filing date: 2020-02-03
Publication date: 2021-08-11
Also published as: KR102291702B1

Abstract

Disclosed is a rendering method based on machine learning driven adaptive sampling. According to an embodiment, a rendering method performed by a rendering system comprises: training, using training data, a rendering network in which a first network and a second network are combined; reconstructing a radiance field of an image by learning the radiance field based on the trained first network and second network; and performing rendering to generate an unbiased image using the reconstructed radiance field.

Description

A RENDERING METHOD BASED ON MACHINE LEARNING DRIVEN ADAPTIVE SAMPLING

The description below relates to rendering techniques.

Global illumination is essential for rendering realistic images. Rendering high-quality images with classic path tracing takes a long time, but various methods exist to improve performance.

One widely used method is filtering, which denoises a noisy input. Filtering is very fast, but mathematically it cannot converge to the ground truth. Another widely used method is unbiased path guiding, which first reconstructs the light field in order to guide sampling in a manner similar to path tracing. Although slower than filtering, it is well suited to high-quality rendering, such as for design, physical simulation, and the generation of data sets for deep learning.

Both methods may benefit from efficiently reconstructing the light field or radiance field.

A rendering method and system based on adaptive sampling using machine learning may be provided.

A rendering method performed by a rendering system may include: training, using training data, a rendering network in which a first network and a second network are combined; reconstructing a radiance field of an image by learning the radiance field based on the trained first network and second network; and performing rendering to generate an unbiased image using the reconstructed radiance field.

The trained first network and second network are an R network and a deep-reinforcement-learning-based Q network, respectively. The step of reconstructing the radiance field may include performing adaptive sampling and refinement on the radiance field of the image. The radiance field may refer to the combination of the image space of the pixels processed in a tile and the directional space parameterizing the incident hemisphere of each pixel.

The step of reconstructing the radiance field may include uniformly dividing the radiance field and, for the field blocks obtained by dividing the directional space of the radiance field, applying the deep-reinforcement-learning-based Q network to obtain a quality value for an action of refining a field block to a resolution at or above a preset level or an action of increasing the samples of the field block by a preset multiple.

The step of reconstructing the radiance field may include evaluating and refining the radiance field by repeating the action a plurality of times, thereby obtaining a hierarchy of refined radiance field blocks.

The step of reconstructing the radiance field may include reconstructing the radiance field in 4D space using an R network composed of an image part and a direction part, which process the image space and the directional space, respectively. The image part of the R network takes geometry features and radiance features in the image space; the geometry features include normal, position, and depth, and the radiance features represent the incident radiance of a plurality of different blocks or directions. A plurality of feature maps are obtained as the output of the image part. The direction part of the R network rearranges the feature maps into the order of the directional space and then convolves them, individually for each pixel, with a directional-space CNN.

The training step may include constructing the rendering network in which the first network and the second network are combined, where the first network is an R network and the second network is a deep-reinforcement-learning-based Q network.

The training step may include training the first network using training data that includes features and references so as to output a reconstruction result, and training the second network using the features of the training data and a reward generated from the output reconstruction result, where the first network is an R network and the second network is a deep-reinforcement-learning-based Q network.

A rendering system may include: a training unit that trains, using training data, a rendering network in which a first network and a second network are combined; a reconstruction unit that reconstructs a radiance field of an image by learning the radiance field based on the trained first network and second network; and a rendering unit that performs rendering to generate an unbiased image using the reconstructed radiance field.

The trained first network and second network are an R network and a deep-reinforcement-learning-based Q network, respectively. The reconstruction unit may perform adaptive sampling and refinement on the radiance field of the image. The radiance field may refer to the combination of the image space of the pixels processed in a tile and the directional space parameterizing the incident hemisphere of each pixel.

The reconstruction unit may uniformly divide the radiance field and, for the field blocks obtained by dividing the directional space of the radiance field, apply the deep-reinforcement-learning-based Q network to obtain a quality value for an action of refining a field block to a resolution at or above a preset level or an action of increasing the samples of the field block by a preset multiple.

The reconstruction unit may evaluate and refine the radiance field by repeating the action a plurality of times, thereby obtaining a hierarchy of refined radiance field blocks.

The reconstruction unit may reconstruct the radiance field in 4D space using an R network composed of an image part and a direction part, which process the image space and the directional space, respectively. The image part of the R network takes geometry features and radiance features in the image space; the geometry features include normal, position, and depth, and the radiance features represent the incident radiance of a plurality of different blocks or directions. A plurality of feature maps are obtained as the output of the image part. The direction part of the R network rearranges the feature maps into the order of the directional space and then convolves them, individually for each pixel, with a directional-space CNN.

The training unit may construct the rendering network in which the first network and the second network are combined, where the first network is an R network and the second network is a deep-reinforcement-learning-based Q network.

The training unit may train the first network using training data that includes features and references so as to output a reconstruction result, and may train the second network using the features of the training data and a reward generated from the output reconstruction result, where the first network is an R network and the second network is a deep-reinforcement-learning-based Q network.

A rendering method and system based on adaptive sampling using machine learning may be provided. Specifically, an unbiased result may be obtained by training a rendering network (a deep-reinforcement-learning-based Q network and an R network), performing sampling and refinement of radiance samples through the trained rendering network, and then reconstructing the radiance field in 4D space.

FIG. 1 is a diagram for explaining the operation of a rendering system according to an embodiment.
FIG. 2 is a diagram for explaining the operation of training with training data in a rendering system according to an embodiment.
FIG. 3 is a diagram for describing the R network of a rendering system according to an embodiment.
FIG. 4 is a diagram for describing the Q network of a rendering system according to an embodiment.
FIG. 5 is a diagram for explaining refinement and resampling of a radiance field in a rendering system according to an embodiment.
FIG. 6 is a block diagram illustrating the configuration of a rendering system according to an embodiment.
FIG. 7 is a flowchart illustrating an adaptive-sampling-based rendering method using machine learning in a rendering system according to an embodiment.
FIG. 8 is an example for explaining reconstruction of a radiance field in a rendering system according to an embodiment.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining the operation of a rendering system according to an embodiment.

The 4D incident radiance field may be defined as the combination of the image space (the space of pixels or shading points) and the directional space (the space of the incident hemisphere centered on the average normal of a group of shading points).

The rendering system may partition the image space (pixels) into tiles and the directional space into bins called radiance field blocks. GPUs can efficiently perform network inference on batched, aligned inputs using single-instruction, multiple-data (SIMD) operations. Without loss of generality, a pixel tile may be taken to represent a single pixel. A radiance field block B^j is defined by a bounded domain of the directional space, namely by its elevation bounds and its azimuth bounds.

B^j itself defines only the bounds in directional space. It is nevertheless sometimes used as shorthand for the j-th partition of the directional space of a specific pixel, or for the j-th partition of the directional space shared by all pixels in a tile.

The rendering system may adaptively partition the directional space into a radiance field hierarchy with nodes of various sizes, using the Q network to recursively split the azimuth angle and the cosine-weighted zenith angle in half.
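The recursive halving described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Block` class, the coordinate `u` standing in for the cosine-weighted zenith coordinate, and the choice to halve both axes at once are assumptions made for the example.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Block:
    """One directional bin (hypothetical layout).

    u is an assumed [0, 1] parameterization of the cosine-weighted zenith
    angle; phi is the azimuth in [0, 2*pi).
    """
    u_lo: float
    u_hi: float
    phi_lo: float
    phi_hi: float

    def split(self):
        """Halve both the zenith coordinate and the azimuth, giving 4 children."""
        u_mid = 0.5 * (self.u_lo + self.u_hi)
        phi_mid = 0.5 * (self.phi_lo + self.phi_hi)
        return [
            Block(self.u_lo, u_mid, self.phi_lo, phi_mid),
            Block(self.u_lo, u_mid, phi_mid, self.phi_hi),
            Block(u_mid, self.u_hi, self.phi_lo, phi_mid),
            Block(u_mid, self.u_hi, phi_mid, self.phi_hi),
        ]

    def measure(self):
        """Area of the bin in (u, phi) parameter space."""
        return (self.u_hi - self.u_lo) * (self.phi_hi - self.phi_lo)


root = Block(0.0, 1.0, 0.0, 2.0 * math.pi)
children = root.split()
```

Because each split only subdivides an existing bin, the children always tile their parent exactly, so a hierarchy built this way covers the hemisphere without gaps or overlaps.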

The rendering system may reconstruct the incident radiance field per tile, rather than per pixel, as the input of the network. The hierarchy is therefore built on the hemisphere of the tile's local frame, which may be defined from the average normal of the pixels in the tile. After the radiance field is reconstructed in the average local frame, the result may be projected onto the individual frames of the pixels. In this setup, there may be special cases in which the hemisphere of an individual pixel contains incident directions that are not covered by the hemisphere of the average local frame (uncovered domains). Reconstructing such individual domains with the network may degrade runtime performance. The rendering system according to the embodiment may instead assign a uniform PDF to each uncovered domain so that sampling remains unbiased.
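As a rough illustration of the uncovered-domain case described above, the sketch below computes a tile's average normal and tests whether a direction lies in a pixel's own hemisphere but outside the hemisphere of the average frame. The function names are hypothetical, and the dot-product test is an assumed way of deciding coverage.

```python
import math


def normalize(v):
    """Return v scaled to unit length (v must be non-zero)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def average_normal(normals):
    """Unit-length mean of the per-pixel normals of one tile."""
    summed = [sum(n[k] for n in normals) for k in range(3)]
    return normalize(summed)


def is_uncovered(direction, pixel_normal, tile_normal):
    """True if direction is above the pixel's surface but below the tile's
    average frame, i.e. the guided PDF has no bin for it."""
    return dot(direction, pixel_normal) > 0.0 and dot(direction, tile_normal) <= 0.0


pixel_normals = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
tile_n = average_normal(pixel_normals)
```

A sampler built on this test would fall back to the uniform PDF for directions where `is_uncovered` returns `True` and use the reconstructed radiance field everywhere else.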

The rendering system may reconstruct the radiance field using the R network. For a CNN N, the radiance field reconstruction of an incident radiance block B^j is the output of N evaluated on X, the incident radiance samples in the domain of B^j, together with the auxiliary features (for example, position, normal, and depth), and parameterized by the trainable weights and biases of N. The output is, for example, the average incident radiance of B^j at a given pixel. During training, the network may be supplied with offline training data to optimize its weights and biases so that the error between the output and the ground truth (GT) is minimized.
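The relation described above can be written compactly. The following is a sketch only: the symbols $\hat{L}^{j}$ (network output for block $B^{j}$), $F$ (auxiliary features), $w$ (weights and biases), $L^{j}_{\mathrm{GT}}$ (ground truth), and the generic error measure $\ell$ are placeholder names assumed for illustration and are not taken from the original notation.

```latex
\hat{L}^{j} = N\!\left(X, F;\, w\right),
\qquad
w^{*} = \arg\min_{w} \sum_{j} \ell\!\left(\hat{L}^{j},\; L^{j}_{\mathrm{GT}}\right)
```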

Referring to FIG. 5, the rendering system may refine and resample the radiance field. In the embodiment, adaptive sampling is treated as a dynamic process in which actions that subdivide a radiance field block into smaller blocks or increase its number of samples are performed repeatedly. The trained Q network may evaluate the value of each action at runtime to guide this adaptive process.

The rendering system may perform a training process and a runtime rendering process. The GT data is the average value of the incident radiance field at a directional resolution of 256×256 and may be used to train the R network and the Q network. In training, the R network is trained first, and the trained R network is then used to train the Q network. To minimize the loss of the Q network, a reward may be evaluated from the difference between the reconstruction result output by the R network and the GT.
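One plausible way to turn "difference between the reconstruction result and the GT" into a reward is sketched below. Both the mean-squared-error measure and the definition of the reward as the error reduction achieved by an action are assumptions for illustration; the text above does not specify either.

```python
def mse(prediction, reference):
    """Mean squared error between two equal-length sequences."""
    return sum((p - r) ** 2 for p, r in zip(prediction, reference)) / len(reference)


def reward(recon_before, recon_after, ground_truth):
    """Assumed reward: how much an action reduced the R-network's error.

    recon_before / recon_after are R-network reconstructions before and
    after the action (refine or resample) was applied.
    """
    return mse(recon_before, ground_truth) - mse(recon_after, ground_truth)


gt = [1.0, 1.0]
r = reward([0.0, 0.0], [1.0, 0.0], gt)
```

Under this definition an action that brings the reconstruction closer to the GT earns a positive reward, and one that makes it worse earns a negative reward.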

Once all networks have been trained, the trained networks may be used for runtime rendering. The rendering system may split the image into tiles and use the same radiance field hierarchy for every pixel in a tile. The rendering system may repeatedly use the Q network to predict the values of the actions and thereby guide refinement or sampling of the radiance field hierarchy until a given stopping criterion is reached. The R network may then be used to reconstruct the leaf nodes of the incident radiance field hierarchy. If the application is filtering, the product of the BRDF and the reconstructed incident radiance field may be evaluated directly as a preview. If the desired result is an unbiased image, the incident radiance field may be combined with the BRDF to generate a PDF for the final rendering (guided path tracing).
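The runtime loop described above might look like the following sketch. The Q network is replaced by a hand-written stand-in (`toy_q`), a leaf is a hypothetical `(depth, samples)` pair, and both the greedy single-best-action policy and the "stop when no action has positive value" criterion are assumptions rather than the patented procedure.

```python
def run_adaptive_sampling(leaves, q_fn, max_steps=100, min_gain=0.0):
    """Greedily apply the highest-valued action until none is worthwhile.

    leaves: list of (depth, samples) tuples, one per leaf block.
    q_fn:   callable returning (q_refine, q_resample) for a leaf.
    """
    leaves = list(leaves)
    for _ in range(max_steps):
        best = None  # (q value, leaf index, action name)
        for idx, leaf in enumerate(leaves):
            q_refine, q_resample = q_fn(leaf)
            for action, q in (("refine", q_refine), ("resample", q_resample)):
                if best is None or q > best[0]:
                    best = (q, idx, action)
        if best is None or best[0] <= min_gain:
            break  # assumed stopping criterion
        _, idx, action = best
        depth, samples = leaves[idx]
        if action == "refine":
            # Split the block into four children that share its samples.
            leaves[idx:idx + 1] = [(depth + 1, samples // 4)] * 4
        else:
            # Resample: double the number of samples in the block.
            leaves[idx] = (depth, samples * 2)
    return leaves


def toy_q(leaf):
    """Stand-in for the trained Q network (purely illustrative values)."""
    depth, samples = leaf
    return (1.0 - depth, 0.5 if samples < 16 else -1.0)


result = run_adaptive_sampling([(0, 8)], toy_q)
```

With this toy value function the single root block is refined once and each of the four children is then resampled until it holds 16 samples, after which no action has positive value and the loop stops.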

Referring to FIG. 1: (a) The image space may be divided into tiles. (b) Within each tile, the directional space shared by the pixels is initialized as uniform blocks, and random samples may be scattered over each block. (c) The Q network may be used to evaluate the quality value of refining (splitting a block) or resampling (doubling the samples of a block) the directional space. (d) By taking the best action, the directional space may be refined into a radiance field hierarchy. The R network may then be used to reconstruct the incident radiance field.
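Steps (a) and (b) above can be sketched as below. The tile size of 16 pixels and the `(u, phi)` parameterization of the directional space are assumptions for the example; only the 4×4 initial grid is taken from the description.

```python
import math


def tile_bounds(width, height, tile_size):
    """Split an image into tiles; returns (x0, y0, x1, y1) with x1/y1 exclusive."""
    return [
        (x0, y0, min(x0 + tile_size, width), min(y0 + tile_size, height))
        for y0 in range(0, height, tile_size)
        for x0 in range(0, width, tile_size)
    ]


def uniform_blocks(grid=4):
    """Initial grid x grid directional bins as (u_lo, u_hi, phi_lo, phi_hi).

    u in [0, 1] is an assumed zenith coordinate, phi in [0, 2*pi) the azimuth.
    """
    du = 1.0 / grid
    dphi = 2.0 * math.pi / grid
    return [
        (r * du, (r + 1) * du, c * dphi, (c + 1) * dphi)
        for r in range(grid)
        for c in range(grid)
    ]


tiles = tile_bounds(500, 500, 16)
blocks = uniform_blocks()
```

Tiles on the right and bottom edges are clipped to the image, so a 500×500 image with 16-pixel tiles yields a 32×32 grid whose last row and column are only 4 pixels wide.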

FIG. 3 is a diagram for describing the R network of a rendering system according to an embodiment.

Referring to FIG. 3, an example of the image-direction network is shown. The subscript i and its set counterpart denote an individual pixel and the entire set of pixels, respectively. Likewise, j and its set counterpart denote a radiance field block B^j and the entire set of radiance field blocks, respectively. One group of feature maps holds the image-space auxiliary features, and another holds the radiance features of blocks B^1 through B^16. F_d denotes the directional feature maps corresponding to the radiance field blocks, and the final output is the predicted incident radiance. The first part of the image-direction network operates in image space, and the second part operates in the directional space of each individual pixel. The rearrangement operation reorganizes the output of the first part into the directional-space input.

The rendering system provides the R network used for reconstruction of the radiance field. This reconstruction poses several technical challenges and differs from image-space filtering in specific ways. First, the samples are scattered over many directions, so the input in each individual radiance field block is sparse. Second, direct convolution in the 4D light field space requires high memory and computation. To address these challenges, four different CNNs may be explored and compared: image, direction, image-direction, and direction-image. The image network uses image-space auxiliary features and performs image-space convolution, whereas the direction network operates with features and convolutions in directional space. The image-direction and direction-image networks perform 4D convolution by chaining image-space and directional-space convolutions in different orders. In conclusion, the image-direction network, which the embodiment finally uses as the R network, may achieve the lowest error for a given time budget.

The strategy chosen for reconstruction of the radiance field in the rendering system is the image-direction network. It first convolves the image-space auxiliary features together with the incident radiance field to learn directional feature maps, and then rearranges these feature maps into directional space and convolves them to exploit coherence in the directional space.

The image-direction network may be divided into an image part and a direction part. A feature map associated with a directional block and a pixel tile is denoted with the block index as a superscript and the pixel set as a subscript. The image part may take as input image-space auxiliary feature maps and radiance feature maps (the mean, variance, and gradient of the radiance). Its output is the directional-space feature maps. In this notation, the subscript denotes the entire set of pixels in image space and the superscript j refers to the j-th radiance field block B^j. For example, the feature map with superscript 1 is the one defined in image space for the first radiance field block B^1. Instead of relying on hand-designed features, the direction part takes the directional feature maps learned by the image part as input and convolves them to predict the radiance of all radiance field blocks simultaneously.

The initial input of the network may include image-space auxiliary feature maps and radiance feature maps. The image-space auxiliary features are mostly geometric features, for example surface normal (3 channels), position (3 channels), and depth (1 channel). The radiance features come from the sparse samples, for example the average incident radiance of each radiance block B^j. In addition, the per-pixel variance and image-space gradients of each feature are provided to facilitate training and to emphasize important details. For example, the geometric features have a total of 26 channels, as follows.

- 3 channels for the average normal, 1 channel for the average variance of the normal, and 6 channels for the gradient of the average normal

- 3 channels for the average position, 1 channel for the average variance of the position, and 6 channels for the gradient of the average position

- 1 channel for the average depth, 1 channel for the depth variance, and 2 channels for the gradient of the average depth

- 2 channels for the gradient of the average radiance over all blocks

The radiance features have a total of 4 channels, composed as follows.

- 3 channels for the radiance

- 1 channel for the average variance of the radiance

All of these features may be stacked as feature maps into a tensor in image space. The radiance features of a radiance field block may be initialized to 0 if the block contains no samples.
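The channel counts listed above can be checked with a few lines of arithmetic. The dictionary layout is illustrative only, and the final figure assumes that the 4 radiance channels of all 16 blocks are stacked alongside the geometric channels, which the text suggests but does not state explicitly.

```python
# Channel counts as listed in the description.
GEOMETRY = {
    "normal": {"mean": 3, "variance": 1, "gradient": 6},
    "position": {"mean": 3, "variance": 1, "gradient": 6},
    "depth": {"mean": 1, "variance": 1, "gradient": 2},
    "radiance_gradient_all_blocks": {"gradient": 2},
}
RADIANCE_PER_BLOCK = {"radiance": 3, "variance": 1}
NUM_BLOCKS = 16  # 4 x 4 radiance field blocks


def count_channels(groups):
    """Sum the channel counts of every feature in every group."""
    return sum(sum(group.values()) for group in groups.values())


geometry_channels = count_channels(GEOMETRY)
radiance_channels_per_block = sum(RADIANCE_PER_BLOCK.values())
# Assumption: all blocks' radiance features are stacked into one input tensor.
stacked_channels = geometry_channels + NUM_BLOCKS * radiance_channels_per_block
```

The per-group sums are 10 + 10 + 4 + 2 for the geometric features and 3 + 1 for the radiance features of one block, matching the totals of 26 and 4 given in the text.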

The first part of the image-direction network contains six standard convolutional hidden layers, each with 100 kernels of size 5×5. Using convolutional layers rather than fully connected layers helps prevent overfitting. This structure can be effective for image-space filtering problems. In each layer, trainable kernels are applied to the output of the previous layer, producing linear convolutions over nearby pixels. A trainable bias is added to the convolution result, and an element-wise nonlinear activation function is applied to yield the output of the current layer. The embodiment uses the rectified linear unit (ReLU) activation function for all networks. The six layers integrate the input tensor and extract the directional-space feature maps for each of the 4×4 radiance field blocks.
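The per-layer computation described above (kernel applied to nearby pixels, bias added, ReLU applied) is sketched below for a single input channel and a single kernel. Zero padding at the image border is an assumption, and a real layer would sum over all input channels and repeat this for each of its 100 kernels.

```python
def conv2d_bias_relu(image, kernel, bias):
    """Zero-padded 'same' 2D convolution (cross-correlation form), plus bias and ReLU.

    image:  list of rows of floats (H x W).
    kernel: square list of rows with odd side length (e.g. 5 x 5).
    """
    height, width = len(image), len(image[0])
    size = len(kernel)
    radius = size // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = bias
            for dy in range(size):
                for dx in range(size):
                    yy, xx = y + dy - radius, x + dx - radius
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += kernel[dy][dx] * image[yy][xx]
            out[y][x] = max(0.0, acc)  # ReLU
    return out


ones_image = [[1.0] * 3 for _ in range(3)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
activated = conv2d_bias_relu(ones_image, ones_kernel, bias=-4.0)
```

In the small example, the centre pixel sees all nine neighbours (9 − 4 = 5), an edge pixel sees six (6 − 4 = 2), and a corner pixel sees four (4 − 4 = 0), so the ReLU leaves the corners at zero.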

The second part of the network may integrate the feature maps to produce the final output. To perform convolution in directional space, the output of the image part is rearranged into a directional-space input tensor holding 12 directional features for each of the 4×4 radiance field blocks. Various numbers of features were tested, and 12 features were used to balance accuracy and runtime performance.
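The rearrangement described above can be sketched as a pure index permutation. The memory layouts are assumptions for illustration: the image part is taken to produce `feats[block][feature][y][x]`, and the direction part is taken to consume, for every pixel, a `[feature][row][col]` tensor over the 4×4 block grid with block index `j = row * 4 + col`.

```python
def rearrange_to_directional(feats, grid=4):
    """Reorder image-space feature maps into per-pixel directional tensors.

    Input:  feats[j][f][y][x]   (block j, feature f, pixel (y, x))
    Output: out[y][x][f][r][c]  with j = r * grid + c
    """
    n_blocks = len(feats)
    if n_blocks != grid * grid:
        raise ValueError("expected grid * grid blocks")
    n_feat = len(feats[0])
    height = len(feats[0][0])
    width = len(feats[0][0][0])
    return [
        [
            [
                [[feats[r * grid + c][f][y][x] for c in range(grid)] for r in range(grid)]
                for f in range(n_feat)
            ]
            for x in range(width)
        ]
        for y in range(height)
    ]


# Tiny example: 16 blocks, 12 features, 2 x 3 pixels; values encode their indices.
example = [
    [[[j * 1000 + f * 100 + y * 10 + x for x in range(3)] for y in range(2)]
     for f in range(12)]
    for j in range(16)
]
directional = rearrange_to_directional(example)
```

The function copies values without any arithmetic, so every output element is exactly one input element at a different position.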

Note that the rearrangement operation only reorders features, so the whole network remains fully differentiable and can be trained end to end. The network then convolves the input separately for each pixel using three standard convolutional hidden layers with 50 kernels of size 3×3, followed by a final output layer. The output of the network is the predicted or filtered incident radiance of the 16 radiance field blocks at each pixel, indexed by the set of radiance field blocks and by the pixel. Because the radiance field blocks differ from pixel to pixel, these radiance field blocks across different pixels may be regarded as a 4D radiance field. By treating these results as a PDF for Monte Carlo (MC) estimation, unbiased rendering may be performed. Networks other than the image-direction network, namely the image network, the direction network, and the direction-image network, may also be used.
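To illustrate how per-block radiance predictions can serve as a PDF for unbiased MC estimation, the sketch below builds discrete block probabilities, samples a block by inverting the CDF, and weights each sample by 1/probability. The function names and the small probability floor are assumptions; the floor keeps every block reachable, which is what keeps the estimator unbiased even when the prediction for a block is zero.

```python
import random


def build_pdf(block_radiance, floor=1e-3):
    """Turn predicted block radiances into probabilities that sum to 1."""
    weights = [max(r, 0.0) + floor for r in block_radiance]
    total = sum(weights)
    return [w / total for w in weights]


def sample_block(probs, u):
    """Pick a block index by inverting the discrete CDF; u is in [0, 1)."""
    acc = 0.0
    for j, p in enumerate(probs):
        acc += p
        if u < acc:
            return j
    return len(probs) - 1  # guard against floating-point round-off


def estimate_sum(values, probs, n, rng):
    """MC estimate of sum(values): draw j ~ probs and average values[j] / probs[j]."""
    total = 0.0
    for _ in range(n):
        j = sample_block(probs, rng.random())
        total += values[j] / probs[j]
    return total / n


true_values = [1.0, 3.0]
perfect = build_pdf(true_values, floor=0.0)
exact = estimate_sum(true_values, perfect, 100, random.Random(0))
rough = estimate_sum(true_values, [0.5, 0.5], 20000, random.Random(0))
```

When the probabilities are exactly proportional to the quantity being estimated, every sample returns the same value and the variance vanishes. With a mismatched PDF the estimate is still correct on average but noisier, which is why a more accurate reconstruction leads to faster convergence.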

To evaluate these reconstruction networks, they may be trained using the same training set and then compared under the same sample settings. FIG. 8 shows the reconstruction results for an individual radiance field block and for the mean of all blocks. The scene is a staircase rendered with 64 samples per pixel, a 4×4 radiance field resolution, and a 500×500 image resolution. The top row shows the reconstruction result of a single radiance field block, which may be used for path guiding. The bottom row shows the average over all radiance field blocks, including the block shown in the top row, which may be used for ambient occlusion or previews.

For the staircase scene of FIG. 8, which lies outside the training set, 64 samples per pixel, a 4×4 radiance field resolution, and a 500×500 image resolution may be used. The individual block results lie in radiance field space and can be used to synthesize PDFs for path guiding. The averaged result is an image-space result that can be used for radiance caching or previews. As the results show, the direction network and the image network produce artifacts or overly blurred details because each uses information from only one space, direction space or image space. The image-direction network and the direction-image network use both spaces and show improved results. Nevertheless, because the direction-image network outputs results directly in image space whereas the image-direction network outputs radiance field block results, the direction-image results can appear smoother at the same samples per pixel (SPP).

In an embodiment, the rendering system may select the image-direction network as the final reconstruction network. The following describes the proposed reinforcement-learning-based adaptive sampling method, which progressively improves overall rendering quality given the selected image-direction reconstruction network.

The rendering system may sample the radiance field through reinforcement learning. The R network may take as input a radiance field hierarchy with a fixed number of nodes (i.e., 4×4 nodes). However, a strategy that adaptively refines the hierarchy can improve adaptability to different scenarios. In an embodiment, a Q network based on deep reinforcement learning (DRL) may be used to adaptively guide the sampling and refinement of the radiance field hierarchy.

When building the hierarchy, two factors matter. The first is the hierarchy structure, i.e., how the radiance field is discretized. A higher grid resolution captures high-frequency lighting features more effectively but incurs overhead. The second is the adaptive sample distribution, which places more samples (i.e., a higher sample density) in noisy regions (blocks or nodes) to reduce reconstruction error. In an embodiment, a large amount of offline data may be used to train a DNN that guides the process. The refinement process is decomposed into a series of atomic actions, and DRL, an unsupervised learning technique, is adapted to train the Q network that guides the process and minimizes reconstruction error.

The adaptive sampling and subdivision approach can be treated as a dynamic process and solved with DRL. Specifically, deep Q-learning can be applied to select the next best sampling or subdivision action for a radiance field block in a given state. Q-learning generally involves an agent situated in a particular environment. Given a state s, the agent performs an action a, which leads to a new state s' and a reward r'. In this framework, a neural network learns from the rewards and can then predict the long-term value of actions in different states. In the adaptive sampling scenario of the embodiment, a neural network can be trained that receives global radiance field information (e.g., geometry information, radiance samples, and the radiance field hierarchy) as the input state and predicts the quality value of the next action as output. The action, quality value, state, and network structure are defined as follows.

In each epoch, two different actions are available for a radiance field block. The first resamples the block by doubling its sample density (the number of samples per block). The second refines the radiance field block into 4×4 new blocks by splitting each axis evenly and adds new samples so that the average number of samples per block is maintained. The sample density per block is maintained to prevent reconstruction quality from degrading because of sparse samples. The two actions affect the reconstruction result in different ways. Increasing the number of samples reduces the variance of the radiance features and thereby suppresses noise. Refinement raises the grid resolution so that high-frequency details can be captured.
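The two actions can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the `Block` type and its fields are hypothetical, and it assumes that "maintaining the average number of samples per block" means each of the 16 children receives the parent's sample count.

```python
from dataclasses import dataclass
from typing import List

SPLIT = 4  # each axis is split into 4, giving 4 x 4 = 16 children (as in the text)


@dataclass
class Block:
    """Hypothetical radiance field block over a square direction-space domain."""
    u0: float     # lower corner of the domain
    v0: float
    size: float   # edge length of the domain
    level: int    # depth in the hierarchy
    samples: int  # samples stored in this block


def resample(block: Block) -> Block:
    """Action 1: double the sample density of the block."""
    return Block(block.u0, block.v0, block.size, block.level, block.samples * 2)


def refine(block: Block) -> List[Block]:
    """Action 2: split the block into 4 x 4 children.

    Assumption: every child keeps the parent's sample count, so the total
    grows from n to 16 * n and the per-block density does not drop.
    """
    child = block.size / SPLIT
    return [
        Block(block.u0 + i * child, block.v0 + j * child, child,
              block.level + 1, block.samples)
        for j in range(SPLIT)
        for i in range(SPLIT)
    ]
```

Resampling leaves the partition unchanged and only reduces variance, whereas refinement changes the partition itself, which matches the distinction drawn in the paragraph above.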

A quality value Q and an action reward r are defined for each radiance field block. For a radiance field block B^j at a pixel (or tile), the quality value of an action a applied in state s^j can be defined by Equation 1 below.

Equation 1:
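The formula itself is not reproduced in this text. A form consistent with the definitions that follow would be the standard one-step Q-learning recursion, given here only as a sketch; the symbol \gamma for the decay parameter is an assumption.

```latex
Q(s^{j}, a) = r(s^{j}, a) + \gamma \max_{a'} Q(s'^{j}, a')
```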

where s^j and s'^j are the states before and after action a is taken, a' is a possible next action, r(s^j, a) is the reward of the action, and

is a decay parameter between 0 and 1; values near 0 and near 1 lead to short-sighted and far-sighted behavior, respectively. Intuitively, given a state s^j represented by the radiance field hierarchy and the sample distribution, the equation seeks the maximum quality value not only from action a itself but also from further steps, by recursively considering the possible next action a'. Evaluating the equation over many steps incurs significant overhead and makes it hard to learn the long-term effect of actions. The estimation problem is therefore simplified by estimating the Q value of an action from two additional steps, which limits the number of action combinations. Equation 2 is computed as follows.

Equation 2:

The reward r(s^j, a) can be defined as follows.

where E^j(s^j) is the reconstruction error of block B^j, i.e., the integral of the absolute difference between the reconstructed radiance and the ground-truth radiance over the domain of B^j. The scale of r(s^j, a) depends on the integration domain of B^j (a region of image space), called dom(j). Because making the network independent of this scale improves robustness, the normalized alternative reward

is used for training and inference, and the Q value is replaced by

Using the action-value function, the squared error between Q'(s^j, a) and its predicted value Q'_pre(s^j, a) can be used as the loss during training. At run time, Q^j(s^j, a) can be recovered for the adaptive sampling operation.

Given the definitions of the action, quality value, and state, the full learning algorithm is given in Algorithm 1.

The agent performs an action (e.g., resampling or refinement) and observes the state and reward of radiance field block B^j as input to the learning operation. First, the replay memory D and the state s are initialized. The replay memory stores a sequence of states, actions, and rewards. With the replay memory, updates to the Q network can be delayed so that long-term rewards are learned rather than only short-term effects.

In each learning iteration, the algorithm first computes the Q values of the two actions and performs the most valuable one. To improve the robustness of the algorithm and explore a wider range, with probability

the algorithm instead chooses at random between the two actions rather than taking the most valuable one. Once an action is taken, the reward r and the new state s' are recorded in the replay memory D for the delayed update of the network. At the end of the iteration, a

pair is selected at random from the memory to compute the loss and train the Q network. The replay memory decouples the learning step from the collection of experience, which reduces the dependence between sequential training samples and helps convergence.
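The exploration rule and the replay memory described above can be sketched as follows. This is a minimal illustration under assumptions: the names `choose_action`, `ReplayMemory`, and `epsilon` are hypothetical, Algorithm 1 itself is not reproduced in this text, and the Q values are passed in as plain numbers rather than produced by a network.

```python
import random
from collections import deque

ACTIONS = ("resample", "refine")  # the two actions available per block


def choose_action(q_values, epsilon, rng):
    """Take the most valuable action, or a random one with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_values[a])


class ReplayMemory:
    """Bounded store of (state, action, reward, next_state) transitions."""

    def __init__(self, capacity):
        # Once the capacity is reached, the oldest transitions are dropped.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng):
        """Draw transitions at random, breaking the order in which they occurred."""
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))
```

Drawing training transitions at random from the memory, rather than in the order they were generated, is what separates learning from experience collection.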

FIG. 4 is a diagram illustrating the Q network of the rendering system according to an embodiment.

The Q network takes as input information about pixel i and radiance field block B^j and predicts the rewards of the two actions that can be performed on B^j. Compared with the image-direction R network, the Q network uses image-space convolutions and fewer hidden layers for the sake of performance. During adaptive subdivision, the hierarchy information is used to access information from nearby radiance field blocks. Specifically, two hidden convolutional layers with 50 3×3 kernels first convolve the features of radiance field block B^j in image space, and a fully connected layer then predicts the final output. The inputs include the image-space auxiliary feature map, the radiance feature map of radiance field block B^j, the newly introduced hierarchy feature map, and the parent radiance feature map, which contains the radiance features of the parent radiance field block of B^j in order to capture surrounding direction-space information. All of these features are aligned in image space. G and R have the same meaning as described for the R network.

The newly introduced hierarchy feature describes the current state of the hierarchy within the extent of B^j, namely the number of samples and the maximum level of radiance field block B^j. The hierarchy level indicates the granularity of a node: for example, the root node is at level 0 and its child nodes are at level 1. The maximum hierarchy level of B^j is defined as the highest hierarchy level among the current radiance field blocks within the domain of B^j. Given a center pixel i, this feature indicates how the surrounding pixels have refined B^j. If the neighboring pixels have reached a high maximum hierarchy level within B^j, the center pixel should also examine B^j at a deeper level.

The R network uses several (e.g., seven) image-space convolutional layers to remove noise effectively, whereas the Q network uses fewer image-space convolutional layers, for two reasons. First, because DRL needs more data and time for training, the number of parameters must be kept to a moderate size. Second, because the Q network is evaluated frequently during adaptive sampling, its run-time cost must be considered. Accordingly, the training loss can be tested for several layer counts and three layers selected.

The following describes how the rendering system trains and uses the R network and the Q network. As shown in FIG. 1, given the trained networks, adaptive sampling and rendering comprise three steps. First, the trained Q network guides the adaptive sampling process and refines the radiance field blocks, producing a hierarchy of radiance field blocks. Second, the trained R network reconstructs the incident radiance from the hierarchy. Finally, the reconstruction result is applied to the final rendering.

FIG. 2 is a diagram illustrating the training operation using the training data.

Because the Q network depends on the R network, the R network is trained first. For example, assume that all variables in the network are initialized and then updated with a learning rate of 0.0001; the initializer and optimizer available in TensorFlow may be used. In the embodiment, the data set is assumed to contain 20 scenes. To enrich the training data set, about 300 scenes can be generated by randomly varying the views, lighting, materials, and textures of the scenes. To adapt the R network to a wide range of conditions, five values of samples per block (SPB) and of radiance field resolution can be selected at random per scene. The SPB is selected from (1, 2, 4, 8, 16), and the resolution of the radiance field blocks is selected from (16^1, 16^2, 16^3, 16^4, 16^5).

Once the R network has converged, it can be used to train the Q network; that is, the quality values can be computed on the same training data set. For example, nine examples from the training data set may be used. The Q network can be trained with the classical stochastic RL algorithm (Algorithm 1), whose replay memory simulates the real data distribution and avoids correlated data. However, the stochastic training process incurs overhead. To accelerate deep reinforcement training, a deterministic step can be added that pre-trains the Q network by increasing the resolution or sample density under different configurations; the pre-trained Q network is then fine-tuned with the stochastic algorithm (Algorithm 1).

The rendering system can perform adaptive sampling by applying the trained Q network. Pseudocode for adaptive sampling is given as Algorithm 2.

The image space is divided into 25×25 tiles, and the tiles can be processed in parallel. Because the Q network convolves the features of neighboring pixels, it is relatively efficient when its inputs are processed in batches. All pixels of a tile T are grouped together, and the average Q value of the pixels is used to decide the next action. Pixels of the same tile therefore share the same hierarchy H_T and can be fused at the same time. Although pixels of the same tile share the same hierarchy H_T, i.e., the same partitioning of the radiance field, each pixel has reconstructed incident radiance field values that differ from those of the other pixels, which gives each pixel in the tile its own tailored reconstruction. In the pseudocode, the subscript T indicates that a block or action is shared by the pixels of tile T.

The quality value Q_T^j of a tile is simply the sum of the quality values of all its pixels. The hierarchy is initialized by uniformly dividing the radiance field to depth 1 (4×4 = 16 radiance field blocks), and one sample is drawn uniformly in each block. The Q network is then used to predict the quality values of refining or resampling the blocks in H_T.

Each iteration of the adaptive sampling operation proceeds as follows. Using a heap keyed on the actions and their Q values, the top (i.e., most valuable) action a_T^j is selected and performed on tile T. If refinement is selected, B^j is replaced by the new blocks B_T^sub(j) with new quality values. If resampling is selected, only the quality value of B^j is updated after the action is performed.

Several rules are set to bound memory and computation cost and to keep the training data within a practical range. First, the maximum number of iterations per tile is set to 100 and the maximum depth of the hierarchy to 5; in other words, the smallest block is

of the radiance field. Second, the maximum sample density (samples per block) is 16. Third, when the quality value of a new action for

is smaller than

of the estimated irradiance, no new samples are distributed to

because the expected effect would be small.
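The heap-driven loop, together with the limits just listed, can be sketched as follows. This is a minimal illustration, not Algorithm 2 itself (which is not reproduced in this text). `predict_q` is a hypothetical stand-in for the trained Q network, blocks are reduced to `(depth, samples)` pairs, and `threshold` is an assumed absolute cut-off standing in for the irradiance-relative rule.

```python
import heapq
import itertools

MAX_ITERATIONS = 100  # iterations per tile (from the text)
MAX_DEPTH = 5         # maximum hierarchy depth (from the text)
MAX_DENSITY = 16      # maximum samples per block (from the text)
CHILDREN = 16         # one refinement yields 4 x 4 new blocks


def adaptive_sampling(predict_q, threshold):
    """Return the (depth, samples) blocks of one tile after adaptive sampling.

    predict_q(depth, samples) -> {"resample": q, "refine": q} is hypothetical.
    """
    order = itertools.count()  # tie-breaker: equal Q values pop in insertion order
    heap = []                  # entries: (-q, order, action, (depth, samples))
    finished = []              # blocks with no admissible action left

    def push(block):
        depth, samples = block
        q = predict_q(depth, samples)
        allowed = {}
        if samples * 2 <= MAX_DENSITY:
            allowed["resample"] = q["resample"]
        if depth < MAX_DEPTH:
            allowed["refine"] = q["refine"]
        if not allowed:
            finished.append(block)
            return
        action = max(allowed, key=allowed.get)
        # heapq is a min-heap, so the Q value is negated to pop the largest first.
        heapq.heappush(heap, (-allowed[action], next(order), action, block))

    for _ in range(CHILDREN):  # initial hierarchy: depth 1, one sample per block
        push((1, 1))

    for _ in range(MAX_ITERATIONS):
        if not heap or -heap[0][0] < threshold:
            break  # nothing left to do, or the expected gain is too small
        _neg_q, _order, action, (depth, samples) = heapq.heappop(heap)
        if action == "resample":
            push((depth, samples * 2))      # same block, doubled density
        else:
            for _child in range(CHILDREN):  # block replaced by its 16 children
                push((depth + 1, samples))
    return finished + [entry[3] for entry in heap]
```

Keeping exactly one heap entry per live block (its best admissible action) avoids stale entries after a block has been replaced; this is a design choice of the sketch, not something stated in the text.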

The rendering system reconstructs the radiance field blocks of hierarchy H_T using the image-direction R network. To produce a quick preview, the reconstructed incident radiance field can simply be used to evaluate and integrate the product of the incident radiance and the BRDF. To render an unbiased image, the reconstructed radiance field and the BRDF are treated as two PDFs for generating sampling directions. The two samplers are then combined through multiple importance sampling (MIS).

Specifically, MIS draws samples with two independent estimators, one from the BRDF and one from the reconstructed radiance field. The results of the two estimators are weighted and summed into the final rendering result, so that both the BRDF and the incident radiance are taken into account.
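The weighting of two estimators can be sketched on a one-dimensional integral. This is a minimal illustration under assumptions: the text does not name the weighting scheme, so the balance heuristic is used here as one common choice, and the integrand `f` and the densities `pdf_a` and `pdf_b` are toy stand-ins for the radiance-BRDF product and the two samplers.

```python
import math
import random


def balance_weight(pdf_this, pdf_other):
    """Balance-heuristic MIS weight for a sample drawn from `pdf_this`."""
    total = pdf_this + pdf_other
    return pdf_this / total if total > 0.0 else 0.0


def f(x):
    """Toy integrand on [0, 1]; its exact integral is 1."""
    return 3.0 * x * x


def pdf_a(x):
    """Sampler A: uniform density on [0, 1]."""
    return 1.0


def pdf_b(x):
    """Sampler B: density 2x on [0, 1], sampled as sqrt(u)."""
    return 2.0 * x


def mis_estimate(n, rng):
    """Average n MIS estimates, each using one sample per sampler."""
    total = 0.0
    for _ in range(n):
        xa = rng.random()             # sample from A
        xb = math.sqrt(rng.random())  # sample from B by inverting its CDF x^2
        total += balance_weight(pdf_a(xa), pdf_b(xa)) * f(xa) / pdf_a(xa)
        pb = pdf_b(xb)
        if pb > 0.0:                  # guard the division when xb == 0
            total += balance_weight(pb, pdf_a(xb)) * f(xb) / pb
    return total / n
```

Because the two weights at any point sum to one, the combined estimator remains unbiased while favoring whichever sampler has the higher density there.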

BRDF samples can be drawn analytically from the cumulative distribution function. Reconstructed radiance field samples are generated by first selecting a radiance field block from a discrete PDF and then sampling a point within the block in proportion to the cosine-weighted term. At present, the focus is on reconstructing a high-quality incident radiance field at the first-bounce shading point. Multi-bounce vertices are sparse and are therefore incompatible with the input format of the network. For multi-bounce vertices, one can fall back to standard multiple importance sampling or, when the lighting is complex, adopt another photon-guided method. For example, one million photons can simply be emitted to guide the sample directions: the unit square of the hemisphere is divided into separate regions, and the photon contributions in each region are accumulated for path guiding.
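The first stage of sampling the reconstructed radiance field, selecting a block from a discrete PDF, can be sketched as follows. This is a minimal illustration under assumptions: the function names are hypothetical, the block values are taken to be non-negative reconstructed radiance totals, and the cosine-weighted sampling of a point inside the chosen block is omitted.

```python
import bisect
import itertools


def build_cdf(block_radiance):
    """Normalize per-block radiance into a discrete PDF and its running CDF."""
    total = float(sum(block_radiance))
    if total <= 0.0:
        raise ValueError("at least one block must have positive radiance")
    pdf = [r / total for r in block_radiance]
    cdf = list(itertools.accumulate(pdf))
    cdf[-1] = 1.0  # guard against rounding so that every u in [0, 1) maps to a block
    return pdf, cdf


def select_block(cdf, u):
    """Return the index of the block whose CDF interval contains u in [0, 1)."""
    return bisect.bisect_right(cdf, u)
```

For example, with block values `[1, 2, 3, 4]` the PDF is `[0.1, 0.2, 0.3, 0.4]`, so a uniform random number of 0.5 falls in the third block (index 2). The selected block's probability is also what an MIS weight would need as the sampler's density, after dividing by the block's area.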

FIG. 6 is a block diagram illustrating the configuration of a rendering system according to an embodiment, and FIG. 7 is a flowchart illustrating a rendering method based on adaptive sampling using machine learning in the rendering system according to an embodiment.

The processor included in the rendering system 100 may include a training unit 610, a reconstruction unit 620, and a rendering performer 630. The processor and its components may control the rendering system to perform steps 710 to 730 of the rendering method based on adaptive sampling using machine learning shown in FIG. 7. The processor and its components may be implemented to execute instructions according to the code of an operating system and the code of at least one program contained in a memory. Here, the components of the processor may be representations of different functions performed by the processor according to control instructions provided by program code stored in the rendering system 100.

The processor may load into memory the program code stored in a program file for the rendering method based on adaptive sampling using machine learning. For example, when the program is executed in the rendering system 100, the processor may control the rendering system to load the program code from the program file into memory under the control of the operating system.

In step 710, the training unit 610 may use training data to train a rendering network in which a first network and a second network are combined. The training unit 610 may construct the rendering network in which the first network and the second network are combined. Here, the first network and the second network are the R network and the deep-reinforcement-learning-based Q network, respectively. The training unit 610 may train the first network using training data that includes features and references, thereby outputting a reconstruction result, and may train the second network using the features of the training data and the reward generated from the output reconstruction result.

In step 720, the reconstruction unit 620 may reconstruct the radiance field by learning the radiance field of an image based on the trained first network and second network. The reconstruction unit 620 may perform adaptive sampling and refinement on the radiance field of the image. The reconstruction unit 620 may divide the radiance field uniformly and, by applying the deep-reinforcement-learning-based Q network to the field blocks into which the direction space of the radiance field is divided, obtain quality values for the action of refining a field block to a resolution at or above a preset threshold and for the action of increasing the samples of a field block by a preset multiple (e.g., two). The reconstruction unit 620 may evaluate and refine the radiance field through a plurality of such actions to obtain a refined hierarchy of the radiance field. The reconstruction unit 620 may reconstruct the radiance field using the R network, which consists of an image part and a direction part for processing the image space and the direction space.

In step 730, the rendering performer 630 may perform rendering to generate an unbiased image using the reconstructed radiance field.

The apparatus described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may run an operating system (OS) and one or more software applications that run on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but those of ordinary skill in the art will appreciate that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, or computer storage medium or device in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

Although the embodiments have been described above with reference to limited embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described system, structure, apparatus, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

1. A rendering method performed by a rendering system, the method comprising:
training, using training data, a rendering network in which a first network and a second network are combined;
reconstructing a radiance field by learning the radiance field of an image based on the trained first network and second network; and
performing rendering to generate an unbiased image using the reconstructed radiance field.

2. The method of claim 1,
wherein the trained first network and second network are an R network and a deep reinforcement learning-based Q network, respectively,
wherein reconstructing the radiance field comprises performing adaptive sampling and refinement on the radiance field of the image, and
wherein the radiance field is a combination of an image space of pixels processed in tiles and a directional space parameterizing the incident hemisphere of each pixel.

3. The method of claim 2,
wherein reconstructing the radiance field comprises:
uniformly dividing the radiance field and, in a field block into which the directional space of the radiance field is divided by applying the deep reinforcement learning-based Q network, obtaining a quality value by performing an operation of either refining the field block to a resolution equal to or higher than a preset standard or increasing the number of samples of the field block by a preset multiple.

4. The method of claim 3,
wherein reconstructing the radiance field comprises:
evaluating and refining the radiance field by repeating the operation a plurality of times, thereby obtaining a hierarchical structure of refined radiance field blocks.
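
The repeated refine-or-resample operation described in claims 3 and 4 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `FieldBlock`, `apply_action`, `adaptive_refine`, and in particular `toy_q`, which is a hand-written stand-in for the trained Q network. The split into four children and the default multiple of 2 are also assumed, not taken from the source.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

REFINE, RESAMPLE = "refine", "resample"


@dataclass
class FieldBlock:
    """One block of the directional space of a tile (hypothetical layout)."""
    u0: float      # corner of the block in the parameterized hemisphere
    v0: float
    size: float    # edge length of the square block
    samples: int   # samples currently assigned to the block
    children: List["FieldBlock"] = field(default_factory=list)

    def leaves(self) -> List["FieldBlock"]:
        """Return the blocks at the bottom of the hierarchy."""
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def apply_action(block: FieldBlock, action: str, sample_multiple: int = 2) -> None:
    """Either split the block to a finer resolution or multiply its samples."""
    if action == REFINE:
        half = block.size / 2
        per_child = max(1, block.samples // 4)
        block.children = [
            FieldBlock(block.u0 + i * half, block.v0 + j * half, half, per_child)
            for j in (0, 1) for i in (0, 1)
        ]
    elif action == RESAMPLE:
        block.samples *= sample_multiple
    else:
        raise ValueError(f"unknown action: {action}")


def adaptive_refine(root: FieldBlock,
                    q_values: Callable[[FieldBlock], Dict[str, float]],
                    iterations: int) -> FieldBlock:
    """Repeat the operation; each pass applies the highest-valued action per leaf."""
    for _ in range(iterations):
        for leaf in root.leaves():
            quality = q_values(leaf)
            apply_action(leaf, max(quality, key=quality.get))
    return root


def toy_q(block: FieldBlock) -> Dict[str, float]:
    """Stand-in for the Q network (assumption): favors splitting large, well-sampled blocks."""
    return {REFINE: block.size * block.samples, RESAMPLE: 3.0}


root = adaptive_refine(FieldBlock(0.0, 0.0, 1.0, 16), toy_q, iterations=3)
print(len(root.leaves()), "leaf blocks")
```

In this sketch the hierarchy is simply the tree of `children`; a trained Q network would replace `toy_q` and would score each action from features of the block rather than from its size and sample count.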

5. The method of claim 1,
wherein reconstructing the radiance field comprises reconstructing the radiance field in a 4D space using an R network composed of an image part and a direction part for processing an image space and a directional space, respectively,
wherein the image part of the R network includes geometry features and radiance features in the image space,
wherein the geometry features include normals, positions, and depths,
wherein the radiance features represent incident radiance features of a plurality of different blocks or directions, and a plurality of feature maps are obtained as the output of the image part of the R network, and
wherein the direction part of the R network rearranges the feature maps in the order of the directional space and then convolves them individually with a directional-space CNN for each pixel.
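
The rearrangement step in the direction part can be sketched as follows. This is an illustration under stated assumptions, not the patented network: it assumes the image part outputs one feature map per directional block in row-major order, and it uses a single fixed 3x3 kernel with zero padding in place of a trained directional-space CNN. The function names are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def rearrange_to_direction_grids(feature_maps: List[Grid],
                                 dir_h: int, dir_w: int) -> List[List[Grid]]:
    """Turn maps[k][y][x] into grids[y][x][j][i], with k = j * dir_w + i (assumed order)."""
    if len(feature_maps) != dir_h * dir_w:
        raise ValueError("expected one feature map per directional block")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            [[feature_maps[j * dir_w + i][y][x] for i in range(dir_w)]
             for j in range(dir_h)]
            for x in range(width)
        ]
        for y in range(height)
    ]


def conv2d_same(grid: Grid, kernel: Grid) -> Grid:
    """Zero-padded 2D cross-correlation, the operation CNN layers compute."""
    h, w = len(grid), len(grid[0])
    kh, kw = len(kernel), len(kernel[0])
    off_y, off_x = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for j in range(h):
        for i in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    jj, ii = j + a - off_y, i + b - off_x
                    if 0 <= jj < h and 0 <= ii < w:
                        acc += kernel[a][b] * grid[jj][ii]
            out[j][i] = acc
    return out


def direction_part(feature_maps: List[Grid], dir_h: int, dir_w: int,
                   kernel: Grid) -> List[List[Grid]]:
    """Convolve each pixel's directional grid separately."""
    grids = rearrange_to_direction_grids(feature_maps, dir_h, dir_w)
    return [[conv2d_same(grid, kernel) for grid in row] for row in grids]


# Four feature maps (a 2x2 directional grid) over a 1x2 image.
maps = [[[float(k + 10 * x) for x in range(2)]] for k in range(4)]
box = [[1.0] * 3 for _ in range(3)]
print(direction_part(maps, 2, 2, box)[0][1])
```

The point of the rearrangement is that, after it, neighboring entries of each per-pixel grid are neighboring directions, so a small convolution kernel mixes information between adjacent directions rather than between unrelated channels.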

6. The method of claim 1,
wherein the training comprises configuring the rendering network in which the first network and the second network are combined, and
wherein the first network is an R network and the second network is a deep reinforcement learning-based Q network.

7. The method of claim 1,
wherein the training comprises:
outputting a reconstruction result as the first network is trained using training data including features and references, and training the second network using a reward generated using the output reconstruction result and the features of the training data, and
wherein the first network is an R network and the second network is a deep reinforcement learning-based Q network.
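
The training order in claim 7 (fit the reconstruction network on features and references, then use its reconstructions to produce rewards for the action-value network) can be sketched with toy models. All of the following are simplifying assumptions for illustration: the R network is reduced to a single scalar weight, the Q network to a linear function of two hypothetical state features, the reward to the reduction in absolute reconstruction error, and the discount factor to zero.

```python
from typing import Dict, List, Sequence, Tuple

ACTIONS = ("refine", "resample")


def reconstruct(w: float, feature: float) -> float:
    """Toy R network: a single scalar weight (assumption)."""
    return w * feature


def train_r(data: Sequence[Tuple[float, float]],
            epochs: int = 200, lr: float = 0.05) -> float:
    """Full-batch gradient descent on mean squared error over (feature, reference) pairs."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2.0 * (reconstruct(w, f) - ref) * f for f, ref in data) / len(data)
        w -= lr * grad
    return w


def reward(w: float, feature_before: float, feature_after: float,
           reference: float) -> float:
    """Reduction in absolute reconstruction error caused by an action (assumed reward)."""
    before = abs(reconstruct(w, feature_before) - reference)
    after = abs(reconstruct(w, feature_after) - reference)
    return before - after


# A transition is (state features, action, feature before, feature after, reference).
Transition = Tuple[Tuple[float, ...], str, float, float, float]


def train_q(w: float, transitions: Sequence[Transition],
            epochs: int = 200, lr: float = 0.1) -> Dict[str, List[float]]:
    """Toy Q network: Q(s, a) = theta[a] . s, regressed onto the reward (discount 0)."""
    dim = len(transitions[0][0])
    theta = {a: [0.0] * dim for a in ACTIONS}
    for _ in range(epochs):
        for state, action, f_before, f_after, ref in transitions:
            target = reward(w, f_before, f_after, ref)
            pred = sum(t * s for t, s in zip(theta[action], state))
            for k in range(dim):
                theta[action][k] += lr * (target - pred) * state[k]
    return theta


def greedy_action(theta: Dict[str, List[float]], state: Sequence[float]) -> str:
    """Pick the action with the highest predicted quality value."""
    return max(ACTIONS, key=lambda a: sum(t * s for t, s in zip(theta[a], state)))


# Hypothetical data: state = (resolution error, sampling error).
w = train_r([(1.0, 1.0), (2.0, 2.0)])
transitions = [
    ((0.5, 0.0), "refine", 1.5, 1.25, 1.0),
    ((0.5, 0.0), "resample", 1.5, 1.5, 1.0),
    ((0.0, 0.5), "resample", 1.5, 1.25, 1.0),
    ((0.0, 0.5), "refine", 1.5, 1.5, 1.0),
]
theta = train_q(w, transitions)
print(greedy_action(theta, (0.5, 0.0)), greedy_action(theta, (0.0, 0.5)))
```

The dependency between the two stages is the only part taken from the claim: the reward passed to `train_q` is computed from the output of the already-trained reconstruction model, so the quality of the second network depends on the first.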

8. A rendering system comprising:
a training unit configured to train, using training data, a rendering network in which a first network and a second network are combined;
a reconstruction unit configured to reconstruct a radiance field by learning the radiance field of an image based on the trained first network and second network; and
a rendering unit configured to perform rendering to generate an unbiased image using the reconstructed radiance field.

9. The rendering system of claim 8,
wherein the trained first network and second network are an R network and a deep reinforcement learning-based Q network, respectively,
wherein the reconstruction unit performs adaptive sampling and refinement on the radiance field of the image, and
wherein the radiance field is a combination of an image space of pixels processed in tiles and a directional space parameterizing the incident hemisphere of each pixel.

10. The rendering system of claim 9,
wherein the reconstruction unit uniformly divides the radiance field and, in a field block into which the directional space of the radiance field is divided by applying the deep reinforcement learning-based Q network, obtains a quality value by performing an operation of either refining the field block to a resolution equal to or higher than a preset standard or increasing the number of samples of the field block by a preset multiple.

11. The rendering system of claim 10,
wherein the reconstruction unit evaluates and refines the radiance field by repeating the operation a plurality of times, thereby obtaining a hierarchical structure of refined radiance field blocks.

12. The rendering system of claim 8,
wherein the reconstruction unit reconstructs the radiance field in a 4D space using an R network composed of an image part and a direction part for processing an image space and a directional space, respectively,
wherein the image part of the R network includes geometry features and radiance features in the image space,
wherein the geometry features include normals, positions, and depths,
wherein the radiance features represent incident radiance features of a plurality of different blocks or directions, and a plurality of feature maps are obtained as the output of the image part of the R network, and
wherein the direction part of the R network rearranges the feature maps in the order of the directional space and then convolves them individually with a directional-space CNN for each pixel.

13. The rendering system of claim 8,
wherein the training unit configures the rendering network in which the first network and the second network are combined, and
wherein the first network is an R network and the second network is a deep reinforcement learning-based Q network.

14. The rendering system of claim 8,
wherein the training unit outputs a reconstruction result as the first network is trained using training data including features and references, and trains the second network using a reward generated using the output reconstruction result and the features of the training data, and
wherein the first network is an R network and the second network is a deep reinforcement learning-based Q network.