KR20220076514A

KR20220076514A - arbitrary view creation

Info

Publication number: KR20220076514A
Application number: KR1020227015247A
Authority: KR
Inventors: 클래런스 추이; 마누 파르마; 브룩 아론 시턴; 히만슈 제인
Original assignee: 아웃워드, 인코포레이티드
Priority date: 2019-11-08
Filing date: 2020-11-05
Publication date: 2022-06-08
Also published as: WO2021092229A1; EP4055567A1; JP2022553844A; EP4055567A4

Abstract

Techniques for generating an arbitrary view or perspective of an ensemble scene are disclosed. In some embodiments, in response to a received request for a prescribed perspective of an ensemble scene comprising a plurality of assets, an output image of the ensemble scene for the requested prescribed perspective is generated based at least in part on combining at least a portion of an existing image of each of at least a subset of the plurality of assets.

Description

Arbitrary view generation

Cross-references to other applications

This application is a continuation-in-part of U.S. Patent Application No. 16/171,221, entitled ARBITRARY VIEW GENERATION, filed October 25, 2018, which is a continuation of U.S. Patent Application No. 15/721,421, now U.S. Patent No. 10,163,249, entitled ARBITRARY VIEW GENERATION, filed September 29, 2017, which is a continuation-in-part of U.S. Patent Application No. 15/081,553, now U.S. Patent No. 9,996,914, entitled ARBITRARY VIEW GENERATION, filed March 25, 2016, all of which are incorporated herein by reference for all purposes. U.S. Patent Application No. 15/721,421, now U.S. Patent No. 10,163,249, claims priority to U.S. Provisional Patent Application No. 62/541,607, entitled FAST RENDERING OF ASSEMBLED SCENES, filed August 4, 2017, which is incorporated herein by reference for all purposes.

This application claims priority to U.S. Provisional Patent Application No. 62/933,254, entitled QUANTIZED PERSPECTIVE CAMERA VIEWS, filed November 8, 2019, which is incorporated herein by reference for all purposes.

The present invention relates to arbitrary view generation.

Existing rendering techniques must balance the competing objectives of quality and speed. High quality rendering requires significant processing resources and time. However, slow rendering techniques are not acceptable in many applications, such as interactive, real-time applications. For such applications, lower quality but faster rendering techniques are generally favored. For example, rasterization is commonly employed by real-time graphics applications for relatively fast rendering, but at the expense of quality.

Thus, improved techniques that do not significantly compromise either quality or speed are needed.

A method is provided. The method comprises: receiving a request for a prescribed perspective of an ensemble scene comprising a plurality of assets; and generating an output image of the ensemble scene that approximates the requested prescribed perspective based at least in part on combining a single existing image of each of at least a subset of the plurality of assets. In the method, the request is received for an orthographic view of the ensemble scene. In the method, the orthographic view of the ensemble scene comprises combined orthographic views of the plurality of assets. The method further comprises selecting the single existing image of each of at least the subset of the plurality of assets. In the method, the selecting comprises selecting an exact match with the requested prescribed perspective. In the method, the selecting comprises selecting a closest or nearest available match to the requested prescribed perspective. In the method, the selecting comprises selecting based on a pose of an associated asset in the ensemble scene. In the method, the selecting comprises selecting a rotated existing image of an associated asset. In the method, the selecting comprises selecting a closest or nearest available match to the requested prescribed perspective based on a pose of an associated asset in the ensemble scene. In the method, generating the output image of the ensemble scene comprises scaling the single existing image of one or more of the subset of assets. In the method, generating the output image of the ensemble scene comprises resizing the single existing image of one or more of the subset of assets. In the method, generating the output image of the ensemble scene comprises determining a position in the ensemble scene at which to include the single existing image of each of at least the subset of assets. In the method, the combining comprises compositing. In the method, generating the output image of the ensemble scene comprises generating a view of at least one asset of the plurality of assets having the requested prescribed perspective. In the method, the view is generated using a plurality of existing images of the at least one asset. In the method, generating the output image of the ensemble scene comprises generating at least a portion of the ensemble scene to have the requested prescribed perspective. In the method, the at least one portion comprises a surface of the ensemble scene. In the method, the at least one portion comprises a structural element of the ensemble scene. In the method, the at least one portion comprises a global feature of the ensemble scene. The method further comprises globally relighting the generated output image of the ensemble scene. In the method, the output image comprises a frame of a video sequence.
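The pose-based selection of a single existing image per asset can be illustrated with a minimal sketch. It assumes, purely for illustration, that stored images are indexed by azimuth angle in degrees and that an asset's pose in the ensemble scene is a single yaw angle; the function names and the sign convention are hypothetical and are not taken from the disclosure.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def select_image_for_asset(stored_azimuths_deg, camera_azimuth_deg, asset_yaw_deg):
    """Pick the stored azimuth that best approximates the requested perspective.

    Assumption: rotating the asset by `asset_yaw_deg` in the scene is
    equivalent to viewing the unrotated asset from
    (camera azimuth - asset yaw). An exact match is returned when one
    exists; otherwise the nearest available azimuth is returned.
    """
    relative = (camera_azimuth_deg - asset_yaw_deg) % 360.0
    return min(stored_azimuths_deg, key=lambda az: angular_distance(az, relative))


# Hypothetical asset photographed every 45 degrees.
stored = [0, 45, 90, 135, 180, 225, 270, 315]
exact = select_image_for_asset(stored, camera_azimuth_deg=90, asset_yaw_deg=45)
nearest = select_image_for_asset(stored, camera_azimuth_deg=30, asset_yaw_deg=90)
```

Here `exact` resolves to a stored perspective directly, while `nearest` falls back to the closest available one, which is why the output image only approximates the requested perspective.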

A system is also disclosed. The system comprises: a processor configured to: receive a request for a prescribed perspective of an ensemble scene comprising a plurality of assets; and generate an output image of the ensemble scene that approximates the requested prescribed perspective based at least in part on combining a single existing image of each of at least a subset of the plurality of assets; and a memory coupled to the processor and configured to provide the processor with instructions.

A computer program product embodied in a non-transitory computer readable medium and comprising computer instructions is also disclosed, the computer instructions being for: receiving a request for a prescribed perspective of an ensemble scene comprising a plurality of assets; and generating an output image of the ensemble scene that approximates the requested prescribed perspective based at least in part on combining a single existing image of each of at least a subset of the plurality of assets.

Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
FIG. 1 is a high level block diagram illustrating an embodiment of a system for generating an arbitrary view of a scene.
FIG. 2 illustrates an example of a database asset.
FIG. 3 is a flow chart illustrating an embodiment of a process for generating an arbitrary perspective.
FIGS. 4A-4N illustrate examples of an embodiment of an application in which independent objects are combined to generate an ensemble or composite object.
FIG. 5 is a flow chart illustrating an embodiment of a process for generating an arbitrary ensemble view.

The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or as a specific component that is manufactured to perform the task. As used herein, the term 'processor' refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.

A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims, and the invention encompasses numerous alternatives, modifications, and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example, and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.

Techniques for generating an arbitrary view of a scene are disclosed. The paradigm described herein entails very low processing or computational overhead while still providing a high definition output, effectively eliminating the challenging trade-off between rendering speed and quality. The disclosed techniques are especially useful for very quickly generating a high quality output with respect to interactive, real-time graphics applications. Such applications rely on substantially immediately presenting a preferably high quality output in response to, and in accordance with, user manipulations of a presented interactive view or scene.

FIG. 1 is a high level block diagram illustrating an embodiment of a system 100 for generating an arbitrary view of a scene. As depicted, arbitrary view generator 102 receives a request for an arbitrary view as input 104, generates the requested view based on existing database assets 106, and provides the generated view as output 108 in response to the input request. In various embodiments, arbitrary view generator 102 may comprise a processor such as a central processing unit (CPU) or a graphics processing unit (GPU). The depicted configuration of system 100 in FIG. 1 is provided for the purposes of explanation. Generally, system 100 may comprise any other appropriate number and/or configuration of interconnected components that provide the described functionality. For example, in other embodiments, arbitrary view generator 102 may comprise a different configuration of internal components 110-116, arbitrary view generator 102 may comprise a plurality of parallel physical and/or virtual processors, database 106 may comprise a plurality of networked databases or a cloud of assets, etc.

Arbitrary view request 104 comprises a request for an arbitrary perspective of a scene. In some embodiments, the requested perspective of the scene does not already exist in an assets database 106 that includes other perspectives or viewpoints of the scene. In various embodiments, arbitrary view request 104 may be received from a process or a user. For example, input 104 may be received from a user interface in response to user manipulation of a presented scene or portion thereof, such as user manipulation of the camera viewpoint of a presented scene. As another example, arbitrary view request 104 may be received in response to a specification of a path of movement or travel within a virtual environment, such as a fly-through of a scene. In some embodiments, possible arbitrary views of a scene that may be requested are at least in part constrained. For example, a user may not be able to manipulate the camera viewpoint of a presented interactive scene to any random position but rather is constrained to certain positions or perspectives of the scene.

Database 106 stores a plurality of views of each stored asset. In the given context, an asset refers to a specific scene whose specification is stored in database 106 as a plurality of views. In various embodiments, a scene may comprise a single object, a plurality of objects, or a rich virtual environment. Specifically, database 106 stores a plurality of images corresponding to different perspectives or viewpoints of each asset. The images stored in database 106 comprise high quality photographs or photorealistic renderings. Such high definition, high resolution images that populate database 106 may be captured or rendered during offline processes or obtained from external sources. In some embodiments, corresponding camera characteristics are stored with each image stored in database 106. That is, camera attributes such as relative location or position, orientation, rotation, depth information, focal length, aperture, zoom level, etc., are stored with each image. Furthermore, camera lighting information such as shutter speed and exposure may also be stored with each image stored in database 106.
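One way to picture a database record that pairs each stored image with its camera characteristics is sketched below. The class names, field names, and units are assumptions made for illustration only; the disclosure does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class CameraInfo:
    """Camera attributes stored with an image (hypothetical field names)."""
    position: Tuple[float, float, float]         # relative location
    orientation_deg: Tuple[float, float, float]  # yaw, pitch, roll
    focal_length_mm: float
    aperture: Optional[float] = None
    zoom: float = 1.0
    shutter_speed_s: Optional[float] = None      # lighting information
    exposure: Optional[float] = None             # lighting information


@dataclass
class StoredView:
    image_path: str      # placeholder location of the photograph or rendering
    camera: CameraInfo


@dataclass
class Asset:
    name: str
    views: List[StoredView] = field(default_factory=list)


class AssetDatabase:
    """In-memory stand-in for database 106: many views per asset."""

    def __init__(self) -> None:
        self._assets: Dict[str, Asset] = {}

    def add_view(self, asset_name: str, view: StoredView) -> None:
        self._assets.setdefault(asset_name, Asset(asset_name)).views.append(view)

    def views_of(self, asset_name: str) -> List[StoredView]:
        return list(self._assets[asset_name].views)


db = AssetDatabase()
db.add_view("chair", StoredView("chair_000.png", CameraInfo((0.0, 0.0, -3.0), (0.0, 0.0, 0.0), 50.0)))
db.add_view("chair", StoredView("chair_005.png", CameraInfo((0.26, 0.0, -2.99), (5.0, 0.0, 0.0), 50.0)))
```

Keeping the camera record immutable reflects that it describes how an image was captured and should not change after the image is stored.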

In various embodiments, any number of different perspectives of an asset may be stored in database 106. FIG. 2 illustrates an example of a database asset. In the given example, seventy-three views corresponding to different angles around a chair object are captured or rendered and stored in database 106. The views may be captured, for example, by rotating a camera around the chair or rotating the chair in front of a camera. Relative object and camera location and orientation information is stored with each generated image. FIG. 2 specifically illustrates views of a scene comprising a single object. Database 106 may also store a specification of a scene comprising a plurality of objects or a rich virtual environment. In such cases, multiple views corresponding to different locations or positions in the scene or three-dimensional space are captured or rendered and stored in database 106 along with corresponding camera information. Generally, images stored in database 106 may comprise two or three dimensions and may comprise stills or frames of an animation or video sequence.

In response to a request 104 for an arbitrary view of a scene that does not already exist in database 106, arbitrary view generator 102 generates the requested arbitrary view from a plurality of other existing views of the scene stored in database 106. In the example configuration of FIG. 1, asset management engine 110 of arbitrary view generator 102 manages database 106. For example, asset management engine 110 may facilitate storage and retrieval of data in database 106. In response to a request 104 for an arbitrary view of a scene, asset management engine 110 identifies and obtains a plurality of other existing views of the scene from database 106. In some embodiments, asset management engine 110 retrieves all existing views of the scene from database 106. Alternatively, asset management engine 110 may select and retrieve a subset of the existing views, e.g., those that are closest to the requested arbitrary view. In such cases, asset management engine 110 is configured to intelligently select a subset of existing views from which pixels may be harvested to generate the requested arbitrary view. In various embodiments, multiple existing views may be retrieved by asset management engine 110 together or as and when they are needed by other components of arbitrary view generator 102.
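As a rough illustration of selecting the subset of existing views closest to a requested view, the sketch below ranks stored views by angular distance. Indexing views by a single azimuth angle and the parameter `k` are simplifying assumptions; the disclosure leaves the selection criterion open.

```python
def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def closest_views(view_azimuths_deg, requested_deg, k=2):
    """Return the k stored azimuths nearest the requested one, closest first."""
    ranked = sorted(view_azimuths_deg, key=lambda az: angular_distance(az, requested_deg))
    return ranked[:k]


# Hypothetical asset with four stored views.
subset = closest_views([0, 90, 180, 270], requested_deg=350, k=2)
```

Ordering the subset from nearest to farthest is convenient because views that need the smallest perspective change can then be consulted first.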

The perspective of each existing view retrieved by asset management engine 110 is transformed into the perspective of the requested arbitrary view by perspective transformation engine 112 of arbitrary view generator 102. As previously described, precise camera information is known and stored with each image stored in database 106. Thus, a perspective change from an existing view to the requested arbitrary view comprises a simple geometric mapping or transformation. In various embodiments, perspective transformation engine 112 may employ any one or more appropriate mathematical techniques to transform the perspective of an existing view into the perspective of an arbitrary view. In cases in which the requested view comprises an arbitrary view that is not identical to any existing view, the transformation of an existing view into the perspective of the arbitrary view will comprise at least some unmapped or missing pixels (i.e., at angles or positions introduced in the arbitrary view that are not present in the existing view).
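The disclosure does not name a specific mapping, so the following is only one possible sketch of such a geometric transformation: reprojecting a pixel with known depth from a source camera to a target camera under an idealized pinhole model. The camera dictionary keys (`position`, `yaw`, `f`, `cx`, `cy`) and the restriction to rotation about the vertical axis are assumptions made to keep the example short.

```python
import math


def yaw_rotate(point, yaw_rad):
    """Rotate a 3D point about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * x + s * z, y, -s * x + c * z)


def unproject(pixel, depth, cam):
    """Pixel plus depth in a camera -> 3D point in world coordinates."""
    u, v = pixel
    x = (u - cam["cx"]) * depth / cam["f"]
    y = (v - cam["cy"]) * depth / cam["f"]
    rx, ry, rz = yaw_rotate((x, y, depth), cam["yaw"])
    px, py, pz = cam["position"]
    return (rx + px, ry + py, rz + pz)


def project(world_point, cam):
    """World point -> pixel in a camera, or None if behind the camera."""
    px, py, pz = cam["position"]
    rel = (world_point[0] - px, world_point[1] - py, world_point[2] - pz)
    x, y, z = yaw_rotate(rel, -cam["yaw"])  # inverse rotation
    if z <= 0:
        return None  # not visible: this pixel stays unmapped
    return (cam["f"] * x / z + cam["cx"], cam["f"] * y / z + cam["cy"])


def reproject(pixel, depth, src_cam, dst_cam):
    """Map a source-view pixel into the requested view's image plane."""
    return project(unproject(pixel, depth, src_cam), dst_cam)


src = {"position": (0.0, 0.0, 0.0), "yaw": 0.0, "f": 100.0, "cx": 320.0, "cy": 240.0}
dst = dict(src, position=(1.0, 0.0, 0.0))  # camera shifted one unit to the right
mapped = reproject((320.0, 240.0), 2.0, src, dst)
```

Pixels that project outside the target image, or that return `None`, correspond to the unmapped or missing pixels mentioned above and must be supplied by other views or by interpolation.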

Pixel information from a single perspective-transformed existing view will not be able to populate all pixels of a different view. However, in many cases, most, if not all, pixels comprising a requested arbitrary view may be harvested from a plurality of perspective-transformed existing views. Merging engine 114 of arbitrary view generator 102 combines pixels from a plurality of perspective-transformed existing views to generate the requested arbitrary view. Ideally, all pixels comprising the arbitrary view are harvested from existing views. This may be possible, for example, if a sufficiently diverse set of existing views or perspectives of the asset under consideration is available and/or if the requested perspective is not too dissimilar from the existing perspectives.

Any appropriate techniques may be employed to combine or merge pixels from a plurality of perspective-transformed existing views to generate the requested arbitrary view. In one embodiment, a first existing view that is closest to the requested arbitrary view is selected and retrieved from database 106 and transformed into the perspective of the requested arbitrary view. Pixels are then harvested from this perspective-transformed first existing view and used to populate corresponding pixels in the requested arbitrary view. In order to populate pixels of the requested arbitrary view that were not available from the first existing view, a second existing view that includes at least some of these remaining pixels is selected and retrieved from database 106 and transformed into the perspective of the requested arbitrary view. Pixels that were not available from the first existing view are then harvested from this perspective-transformed second existing view and used to populate corresponding pixels in the requested arbitrary view. This process may be repeated for any number of additional existing views until all pixels of the requested arbitrary view have been populated, and/or until all existing views have been exhausted or a prescribed threshold number of existing views has already been used.
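The iterative harvesting described in this embodiment can be sketched as a greedy fill. The sketch assumes the views have already been perspective-transformed and ordered from closest to farthest, and that each is a grid in which `None` marks a pixel the view could not supply; the function name and the `max_views` parameter are hypothetical.

```python
def merge_views(transformed_views, max_views=None):
    """Fill the requested view from successive perspective-transformed views.

    Returns the merged grid and the number of pixels still unpopulated.
    Stops when every pixel is filled, when the views are exhausted, or
    when `max_views` views have been used.
    """
    if not transformed_views:
        raise ValueError("at least one view is required")
    height = len(transformed_views[0])
    width = len(transformed_views[0][0])
    output = [[None] * width for _ in range(height)]
    remaining = height * width
    used = 0
    for view in transformed_views:
        if remaining == 0 or (max_views is not None and used >= max_views):
            break
        for r in range(height):
            for c in range(width):
                # Earlier (closer) views take priority; later views only
                # fill pixels that are still missing.
                if output[r][c] is None and view[r][c] is not None:
                    output[r][c] = view[r][c]
                    remaining -= 1
        used += 1
    return output, remaining


first = [[1, None], [None, None]]
second = [[9, 2], [None, None]]
third = [[9, 9], [3, None]]
merged, missing = merge_views([first, second, third])
```

In this toy input one pixel is unavailable from every view, so `missing` is nonzero and that pixel would be left for interpolation.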

In some embodiments, a requested arbitrary view may include some pixels that are not available from any existing view. In such cases, interpolation engine 116 is configured to populate any remaining pixels of the requested arbitrary view. In various embodiments, any one or more appropriate interpolation techniques may be employed by interpolation engine 116 to generate these unpopulated pixels in the requested arbitrary view. Examples of interpolation techniques that may be employed include linear interpolation, nearest neighbor interpolation, etc. Interpolation of pixels introduces averaging or smoothing. Overall image quality may not be significantly affected by some interpolation, but excessive interpolation may introduce unacceptable blurriness. Thus, interpolation may be desired to be used sparingly. As previously described, interpolation is completely avoided if all pixels of the requested arbitrary view can be obtained from existing views. However, interpolation is introduced if the requested arbitrary view includes some pixels that are not available from any existing view. Generally, the amount of interpolation needed depends on the number of existing views available, the diversity of perspectives of the existing views, and/or how different the perspective of the arbitrary view is in relation to the perspectives of the existing views.
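As a minimal sketch of filling unpopulated pixels, the function below replaces each missing pixel with the average of its already-populated 4-connected neighbors and repeats until no further pixel can be filled. This neighbor-averaging scheme is an illustrative choice, not the technique prescribed by the disclosure, and it treats pixels as scalar intensities with `None` marking a hole.

```python
def fill_holes(image):
    """Fill None pixels by averaging populated 4-connected neighbors.

    Works on a copy. Holes far from any populated pixel are filled on
    later passes, once their neighbors have values. An image with no
    populated pixels is returned unchanged.
    """
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    while True:
        updates = {}
        for r in range(height):
            for c in range(width):
                if result[r][c] is not None:
                    continue
                neighbors = [
                    result[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < height and 0 <= cc < width and result[rr][cc] is not None
                ]
                if neighbors:
                    updates[(r, c)] = sum(neighbors) / len(neighbors)
        if not updates:
            break
        # Apply after the scan so each pass uses only values from earlier passes.
        for (r, c), value in updates.items():
            result[r][c] = value
    return result


filled = fill_holes([[10, None, 30]])
```

The averaging step makes the smoothing effect visible: the more pixels are produced this way, the blurrier the result, which is why interpolation is best used sparingly.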

With respect to the example depicted in FIG. 2, seventy-three views around a chair object are stored as existing views of the chair. An arbitrary view around the chair object that is different or unique from any of the stored views may be generated using a plurality of these existing views, with preferably minimal, if any, interpolation. However, generating and storing such an exhaustive set of existing views may not be efficient or desirable. In some cases, a significantly smaller number of existing views covering a sufficiently diverse set of perspectives may instead be generated and stored. For example, the seventy-three views of the chair object may be decimated into a small set of a handful of views around the chair object.

As previously mentioned, in some embodiments, possible arbitrary views that may be requested may at least in part be constrained. For example, a user may be restricted from moving a virtual camera associated with an interactive scene to certain positions. With respect to the given example of FIG. 2, possible arbitrary views that may be requested may be limited to arbitrary positions around the chair object but may not, for example, include arbitrary positions under the chair object, since insufficient pixel data exists for the bottom of the chair object. Such constraints on allowed arbitrary views ensure that a requested arbitrary view can be generated from existing data by arbitrary view generator 102.

Arbitrary view generator 102 generates and outputs the requested arbitrary view 108 in response to input arbitrary view request 104. The resolution or quality of the generated arbitrary view 108 is the same as or similar to the qualities of the existing views used to generate it, since pixels from those views are used to generate the arbitrary view. Thus, using high definition existing views in most cases results in a high definition output. In some embodiments, the generated arbitrary view 108 is stored in database 106 with other existing views of the associated scene and may subsequently be employed to generate other arbitrary views of the scene in response to future requests for arbitrary views. In cases in which input 104 comprises a request for an existing view in database 106, the requested view does not need to be generated from other views as described; instead, the requested view is retrieved via a simple database lookup and directly presented as output 108.
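The lookup-or-generate behavior, including storing a generated view for future requests, can be sketched as follows. The dictionary standing in for database 106, the perspective key, and the `generate` callback are all hypothetical placeholders.

```python
def get_view(stored_views, perspective, generate):
    """Return an existing view if present; otherwise generate and store it.

    `stored_views` maps a perspective key to an image, and `generate`
    is any callable that builds a view for a perspective that is not
    yet stored.
    """
    if perspective in stored_views:
        return stored_views[perspective]  # simple database lookup
    view = generate(perspective)
    stored_views[perspective] = view      # available for future requests
    return view


calls = []


def fake_generate(perspective):
    """Stand-in generator that records how often it is invoked."""
    calls.append(perspective)
    return "generated view at {} degrees".format(perspective)


views = {0: "stored view at 0 degrees"}
existing = get_view(views, 0, fake_generate)
new = get_view(views, 37, fake_generate)
repeat = get_view(views, 37, fake_generate)
```

After the first request for the 37 degree perspective, the repeat request is served from storage and the generator is not invoked again.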

Arbitrary view generator 102 may also be configured to generate an arbitrary ensemble view using the described techniques. That is, input 104 may comprise a request to combine a plurality of objects into a single custom view. In such cases, the aforementioned techniques are performed for each of the plurality of objects and combined to generate a single consolidated or ensemble view comprising the plurality of objects. Specifically, existing views of each of the plurality of objects are selected and retrieved from database 106 by asset management engine 110, the existing views are transformed into the perspective of the requested view by perspective transformation engine 112, pixels from the perspective-transformed existing views are used to populate corresponding pixels of the requested ensemble view by merging engine 114, and any remaining unpopulated pixels in the ensemble view are interpolated by interpolation engine 116. In some embodiments, the requested ensemble view may comprise a perspective that already exists for one or more of the objects constituting the ensemble. In such cases, the existing view of the object asset corresponding to the requested perspective is used to directly populate the pixels corresponding to that object in the ensemble view, instead of first generating the requested perspective from other existing views of the object.

As an example of an arbitrary ensemble view comprising a plurality of objects, consider a table object that is photographed or rendered independently of the chair object of FIG. 2. The chair object and the table object may be combined using the disclosed techniques to generate a single ensemble view of both objects. Thus, using the disclosed techniques, independently captured or rendered images or views of each of a plurality of objects can be consistently combined to generate a scene that comprises the plurality of objects and has a desired perspective. As previously described, depth information for each existing view is known. The perspective transformation of each existing view includes a depth transformation, allowing the plurality of objects to be appropriately positioned relative to one another in the ensemble view.
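One way to picture how known depth lets independently captured objects be positioned relative to one another is a nearest-depth merge of the already perspective-transformed views. The sketch below is an assumption for illustration (a z-buffer-style rule; the text does not name a specific merge rule), and the layer layout and names are hypothetical.

```python
from typing import List, Optional, Tuple

# A pixel is (depth, color) after transformation into the requested
# perspective, or None where the object does not cover that pixel.
Pixel = Optional[Tuple[float, str]]
Layer = List[List[Pixel]]


def merge_by_depth(layers: List[Layer]) -> Layer:
    """Combine per-object layers; at each pixel the nearest surface wins."""
    height, width = len(layers[0]), len(layers[0][0])
    merged: Layer = [[None] * width for _ in range(height)]
    for layer in layers:
        for row in range(height):
            for col in range(width):
                candidate = layer[row][col]
                if candidate is None:
                    continue
                current = merged[row][col]
                # Keep the candidate if the pixel is empty or it is closer
                # to the camera than what is already there.
                if current is None or candidate[0] < current[0]:
                    merged[row][col] = candidate
    return merged
```

For a one-row image in which the chair covers only the first pixel at depth 2.0 and the table covers both pixels at depth 3.0, the merged row shows the chair in front of the table at the first pixel and the table alone at the second.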

Generating an arbitrary ensemble view is not limited to combining a plurality of single objects into a custom view. Rather, a plurality of scenes having multiple objects, or a plurality of rich virtual environments, may similarly be combined into a custom ensemble view. For example, a plurality of separately and independently generated virtual environments, possibly from different content generation sources and possibly having different existing individual perspectives, may be combined into an ensemble view having a desired perspective. Thus, generally, arbitrary view generator 102 may be configured to consistently combine or reconcile a plurality of independent assets, possibly comprising different existing views, into an ensemble view having a desired, possibly arbitrary, perspective. Because all combined assets are normalized to the same perspective, a harmonious resulting ensemble view is generated. The possible arbitrary perspectives of the ensemble view may be constrained based on the existing views of the individual assets available to generate the ensemble view.

FIG. 3 is a flow chart illustrating an embodiment of a process for generating an arbitrary perspective. Process 300 may be employed, for example, by arbitrary view generator 102 of FIG. 1. In various embodiments, process 300 may be employed to generate an arbitrary view of a prescribed asset or an arbitrary ensemble view.

Process 300 starts at step 302, at which a request for an arbitrary perspective is received. In some embodiments, the request received at step 302 may comprise a request for an arbitrary perspective of a prescribed scene that is different from any existing and available perspectives of the scene. In such cases, for example, the arbitrary perspective request may be received in response to a requested change in perspective of a presented view of the scene. Such a change in perspective may be facilitated by changing or manipulating a virtual camera associated with the scene, such as by panning the camera, changing the focal length, changing the zoom level, etc. Alternatively, in some embodiments, the request received at step 302 may comprise a request for an arbitrary ensemble view. As one example, such an arbitrary ensemble view request may be received with respect to an application that allows a plurality of independent objects to be selected and provides a consolidated, perspective-corrected ensemble view of the selected objects.

At step 304, a plurality of existing images from which to generate at least a portion of the requested arbitrary perspective is retrieved from one or more associated asset databases. The plurality of retrieved images may be associated with a prescribed asset in the cases in which the request received at step 302 comprises a request for an arbitrary perspective of a prescribed asset, or may be associated with a plurality of assets in the cases in which the request received at step 302 comprises a request for an arbitrary ensemble view.

At step 306, each of the plurality of existing images retrieved at step 304 that has a different perspective is transformed into the arbitrary perspective requested at step 302. Each of the existing images retrieved at step 304 includes associated perspective information. The perspective of each image is defined by the camera characteristics associated with generating that image, such as relative position, orientation, rotation, angle, depth, focal length, aperture, zoom level, lighting information, etc. Since complete camera information is known for each image, the perspective transformation of step 306 comprises a simple mathematical operation. In some embodiments, step 306 also optionally includes a lighting transformation so that all images are consistently normalized to the same desired lighting conditions.
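To illustrate why known camera information makes the transformation a simple mathematical operation, the following sketch reprojects one pixel with known depth from a source camera into a target camera. It assumes an idealized pinhole model in which both cameras look down the +z axis without rotation; the `Camera` fields and function names are hypothetical, and a full treatment would add rotation and lens parameters.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Camera:
    """Hypothetical pinhole camera looking down +z with no rotation."""
    position: Vec3
    focal_px: float   # focal length expressed in pixels
    cx: float         # principal point, x
    cy: float         # principal point, y


def unproject(cam: Camera, u: float, v: float, depth: float) -> Vec3:
    """Recover the 3D point seen at pixel (u, v) with the given depth."""
    x = (u - cam.cx) * depth / cam.focal_px + cam.position[0]
    y = (v - cam.cy) * depth / cam.focal_px + cam.position[1]
    z = depth + cam.position[2]
    return (x, y, z)


def project(cam: Camera, point: Vec3) -> Optional[Tuple[float, float, float]]:
    """Project a 3D point into cam; returns (u, v, depth) or None if behind it."""
    x = point[0] - cam.position[0]
    y = point[1] - cam.position[1]
    z = point[2] - cam.position[2]
    if z <= 0:
        return None
    return (cam.focal_px * x / z + cam.cx, cam.focal_px * y / z + cam.cy, z)


def reproject(src: Camera, dst: Camera, u: float, v: float, depth: float):
    """Map a source pixel with known depth to its location in the target view."""
    return project(dst, unproject(src, u, v, depth))
```

The returned depth is what allows pixels from different existing images to be compared when they land on the same target pixel.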

At step 308, at least a portion of an image having the arbitrary perspective requested at step 302 is populated by pixels harvested from the perspective-transformed existing images. That is, pixels from a plurality of perspective-corrected existing images are employed to generate an image having the requested arbitrary perspective.

At step 310, it is determined whether the generated image having the requested arbitrary perspective is complete. If it is determined at step 310 that the generated image is not complete, it is determined at step 312 whether any more existing images are available from which any remaining unpopulated pixels of the generated image may be mined. If it is determined at step 312 that more existing images are available, one or more additional existing images are retrieved at step 314, and process 300 continues at step 306.

If it is determined at step 310 that the generated image having the requested arbitrary perspective is not complete and it is determined at step 312 that no more existing images are available, any remaining unpopulated pixels of the generated image are interpolated at step 316. Any one or more appropriate interpolation techniques may be employed at step 316.

If it is determined at step 310 that the generated image having the requested arbitrary perspective is complete, or after any remaining unpopulated pixels have been interpolated at step 316, the generated image having the requested arbitrary perspective is output at step 318. Process 300 subsequently ends.
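The control flow of process 300 can be sketched as a loop over batches of existing images. The sketch is illustrative only: the sparse-dictionary image representation, the `transform` callable standing in for step 306, and the neighbor-averaging interpolation are assumptions introduced here, since the text leaves the interpolation technique open.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Coord = Tuple[int, int]        # (x, y) pixel coordinate
Pixels = Dict[Coord, float]    # sparse image: only populated pixels are present


def average_of_neighbors(filled: Pixels, coord: Coord) -> float:
    """Hypothetical interpolation: mean of the populated 4-neighbors, else 0.0."""
    x, y = coord
    neighbors = ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
    values = [filled[n] for n in neighbors if n in filled]
    return sum(values) / len(values) if values else 0.0


def generate_view(width: int, height: int,
                  batches: Iterable[List[Pixels]],
                  transform: Callable[[Pixels], Pixels],
                  interpolate: Callable[[Pixels, Coord], float] = average_of_neighbors
                  ) -> Pixels:
    output: Pixels = {}
    for batch in batches:                                  # steps 304 and 314
        for image in batch:
            for (x, y), value in transform(image).items():  # step 306
                if 0 <= x < width and 0 <= y < height:
                    output.setdefault((x, y), value)        # step 308
        if len(output) == width * height:                  # step 310: complete
            return output                                  # step 318
    # Step 312 found no more existing images: interpolate what is left (316).
    known = dict(output)
    for y in range(height):
        for x in range(width):
            if (x, y) not in known:
                output[(x, y)] = interpolate(known, (x, y))
    return output                                          # step 318
```

Because `batches` may be a lazy iterator, additional existing images are only retrieved when the image is still incomplete, which mirrors the loop back from step 314 to step 306.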

As described, the disclosed techniques may be used to generate an arbitrary perspective based on other existing perspectives. Since camera information is preserved with each existing perspective, it is possible to normalize different existing perspectives into a common, desired perspective. A resulting image having the desired perspective can be constructed by mining pixels from the perspective-transformed existing images. The processing associated with generating an arbitrary perspective using the disclosed techniques is not only fast and nearly instantaneous but also results in a high-quality output, making the disclosed techniques particularly powerful for interactive, real-time graphics applications.

The disclosed techniques furthermore describe the generation of an arbitrary ensemble view comprising a plurality of objects by using available images or views of each of the plurality of objects. As described, perspective transformation and/or normalization allow pixels comprising independently captured or rendered images or views of the plurality of objects to be consistently combined into a desired arbitrary ensemble view.

In some embodiments, it may be desirable to first build or assemble a scene or ensemble view by selecting and positioning the content desired to be included in the scene or ensemble view. In some such cases, a plurality of objects may be stacked or combined like building blocks to create a composite object that constitutes the scene or ensemble view. Consider, for example, an interactive application, such as a visualization or modeling application, in which a plurality of independent objects are selected and appropriately placed, e.g., on a canvas, to create a scene or ensemble view. In such an application, arbitrary views of objects cannot be used to compose the scene or ensemble view because of the perspective distortions that arise from their associated focal lengths. Rather, prescribed object views that are substantially free of perspective distortion are employed, as described next.

In some embodiments, orthographic views of objects are employed to model or define a scene or ensemble view comprising a plurality of independent objects. An orthographic view comprises a parallel projection that is approximated by a (virtual) camera that has a relatively long focal length and is positioned at a distance from the subject of interest that is large relative to the size of the subject, so that the rays or projection lines are substantially parallel. Orthographic views have no or fixed depth and hence little or no perspective distortion. As such, orthographic views of objects may be employed like building blocks when specifying an ensemble scene or composite object. After an ensemble scene comprising an arbitrary combination of objects has been specified or defined using such orthographic views, the scene or its objects may be transformed into any desired camera perspective using the arbitrary view generation techniques previously described with respect to FIGS. 1-3.
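The claim that a distant, long-focal-length camera approximates a parallel projection can be checked numerically. In the sketch below, which is an illustration under a simplified one-axis pinhole model with hypothetical function names, the focal length grows in proportion to the camera distance so that the on-screen scale stays fixed; the depth-dependent term then vanishes as the distance grows.

```python
def perspective_x(x: float, z: float, distance: float, scale: float = 1.0) -> float:
    """Image x-coordinate of a point (x, z) seen by a pinhole camera.

    The camera sits `distance` in front of the plane z = 0, and its focal
    length is scale * distance so that points on that plane keep a fixed size.
    """
    focal = scale * distance
    return focal * x / (distance + z)


def orthographic_x(x: float, z: float, scale: float = 1.0) -> float:
    """Parallel projection: depth z has no effect on the image coordinate."""
    return scale * x


# A point 0.5 units behind the reference plane, 1 unit off-axis.
near = perspective_x(1.0, 0.5, distance=1.0)      # strong perspective effect
far = perspective_x(1.0, 0.5, distance=1.0e6)     # nearly parallel rays
flat = orthographic_x(1.0, 0.5)
```

With the camera one unit away the point is drawn noticeably closer to the axis than in the parallel projection, whereas at a distance of a million units the two projections agree to within about one part in a million.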

In some embodiments, the plurality of views of an asset stored in database 106 of system 100 of FIG. 1 includes one or more orthographic views of the asset. Such orthographic views may be captured (e.g., photographed or scanned) or rendered from a three-dimensional polygon mesh model. Alternatively, an orthographic view may be generated from other views of the asset available in database 106 according to the arbitrary view generation techniques described with respect to FIGS. 1-3.

FIGS. 4A-4N illustrate examples of an embodiment of an application in which independent objects are combined to generate an ensemble or composite object or scene. Specifically, FIGS. 4A-4N illustrate an example of a furniture building application in which various independent seating components are combined to generate different sectional configurations.

FIG. 4A illustrates an example of perspective views of three independent seating components: a left-arm chair, an armless loveseat, and a right-arm chaise. Each of the perspective views in the example of FIG. 4A has a focal length of 25 mm. As can be seen, the resulting perspective distortions prevent stacking the components next to one another, i.e., placing the components side by side, which may be desired when building a sectional configuration comprising the components.

FIG. 4B illustrates an example of orthographic views of the same three components of FIG. 4A. As depicted, the orthographic views of the objects are modular or block-like and can be stacked or placed side by side. However, depth information is substantially lost in the orthographic views. As can be seen, all three components appear to have the same depth in the orthographic views, despite the actual differences in depth that are visible in FIG. 4A, particularly with respect to the chaise.

FIG. 4C illustrates an example of combining the orthographic views of the three components of FIG. 4B to specify a composite object. That is, FIG. 4C illustrates generating an orthographic view of a sectional via side-by-side placement of the orthographic views of the three components of FIG. 4B. As depicted in FIG. 4C, the bounding boxes of the orthographic views of the three seating components fit perfectly next to one another to create the orthographic view of the sectional. That is, the orthographic views of the components facilitate user-friendly manipulation of the components in the scene as well as accurate placement.

FIGS. 4D and 4E each illustrate an example of transforming the orthographic view of the composite object of FIG. 4C into an arbitrary camera perspective using the arbitrary view generation techniques previously described with respect to FIGS. 1-3. That is, in each of the examples of FIGS. 4D and 4E, the orthographic view of the composite object is transformed into a typical camera perspective that accurately portrays depth. As depicted, the depth of the chaise relative to the chair and the loveseat, which was lost in the orthographic views, is visible in the perspective views of FIGS. 4D and 4E.

FIGS. 4F, 4G, and 4H illustrate examples of a plurality of orthographic views of the left-arm chair, the armless loveseat, and the right-arm chaise, respectively. As previously described, any number of different views or perspectives of an asset may be stored in database 106 of system 100 of FIG. 1. Each of the sets of FIGS. 4F-4H comprises twenty-five orthographic views, corresponding to different angles around the asset, that are independently captured or rendered and stored in database 106 and from which any arbitrary view of any combination of the objects may be generated. In the furniture building application, for example, top views may be useful for ground placement, while front views may be useful for wall placement. In some embodiments, to maintain a more compact reference data set, only a prescribed number of orthographic views, from which any arbitrary view of the asset may be generated, are stored for an asset in database 106.

FIGS. 4I-4N illustrate various examples of generating arbitrary views or perspectives of arbitrary combinations of objects. Specifically, each of FIGS. 4I-4N illustrates generating an arbitrary perspective or view of a sectional comprising a plurality of independent seating objects or components. Each arbitrary view may be generated, for example, by using the arbitrary view generation techniques previously described with respect to FIGS. 1-3 to transform one or more orthographic (or other) views of the objects comprising the ensemble view or composite object into the arbitrary view, harvest pixels to populate the arbitrary view, and interpolate any remaining missing pixels.

As previously described, each image or view of an asset in database 106 may be stored with corresponding metadata, such as relative object and camera location and orientation information as well as lighting information. The metadata may be generated when rendering a view from a three-dimensional polygon mesh model of the asset, when imaging or scanning the asset (in which case depth and/or surface normal data may be estimated), or a combination of both.

A prescribed view or image of an asset comprises pixel intensity values (e.g., RGB values) for each pixel comprising the image as well as various metadata parameters associated with each pixel. In some embodiments, one or more of the red, green, and blue (RGB) channels or values of a pixel may be employed to encode the pixel metadata. The pixel metadata may include, for example, information about the relative location or position (e.g., x, y, and z coordinate values) of the point in three-dimensional space that projects at that pixel. Furthermore, the pixel metadata may include information about the surface normal vector (e.g., the angles made with the x, y, and z axes) at that position. Moreover, the pixel metadata may include texture mapping coordinates (e.g., u and v coordinate values). In such cases, the actual pixel value at a point is determined by reading the RGB values at the corresponding coordinates in a texture image.
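The per-pixel metadata and the texture-coordinate indirection described above might be represented as in the following sketch. The record layout, the convention that u and v lie in [0, 1], and the nearest-texel sampling are assumptions made for illustration; the text does not prescribe a storage format or sampling rule.

```python
from dataclasses import dataclass
from typing import List, Tuple, TypeVar

Texel = TypeVar("Texel")


@dataclass(frozen=True)
class PixelMeta:
    """Hypothetical per-pixel metadata record."""
    position: Tuple[float, float, float]   # 3D point that projects at this pixel
    normal: Tuple[float, float, float]     # surface normal at that point
    uv: Tuple[float, float]                # texture coordinates, assumed in [0, 1]


def sample_texture(texture: List[List[Texel]], uv: Tuple[float, float]) -> Texel:
    """Nearest-texel lookup: u selects the column, v selects the row."""
    rows, cols = len(texture), len(texture[0])
    col = min(int(uv[0] * cols), cols - 1)
    row = min(int(uv[1] * rows), rows - 1)
    return texture[row][col]


def resolve_pixel(meta: PixelMeta, texture: List[List[Texel]]) -> Texel:
    """The actual pixel value is read from the texture at the stored coordinates."""
    return sample_texture(texture, meta.uv)
```

Because the pixel stores coordinates rather than a color, resolving the same record against a different texture image of the same dimensions yields the re-textured value without touching the geometry metadata.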

The surface normal vectors facilitate modifying or varying the lighting of a generated arbitrary view or scene. More specifically, re-lighting a scene comprises scaling pixel values based on how well the surface normal vectors of the pixels match the direction of a newly added, removed, or otherwise altered light source, which may at least in part be quantified, for example, by the dot product of the light direction and the normal vectors of the pixels. Specifying pixel values via texture mapping coordinates facilitates modifying or varying the texture of a generated arbitrary view or scene, or a part thereof. More specifically, the texture can be changed by simply swapping or replacing the referenced texture image with another texture image having the same dimensions.
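The dot-product scaling used for re-lighting can be sketched as follows. This assumes a simple diffuse (Lambertian-style) weighting with a single light, 8-bit RGB values, and a light direction that points from the surface toward the light; these choices and the function names are illustrative assumptions rather than the disclosed implementation.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]
RGB = Tuple[int, int, int]


def normalize(v: Vec3) -> Vec3:
    """Scale a non-zero vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return (v[0] / length, v[1] / length, v[2] / length)


def relight(rgb: RGB, normal: Vec3, to_light: Vec3, intensity: float = 1.0) -> RGB:
    """Scale a pixel by how well its surface normal faces the light."""
    n, l = normalize(normal), normalize(to_light)
    alignment = sum(a * b for a, b in zip(n, l))   # dot product, cosine of the angle
    weight = max(0.0, alignment) * intensity       # surfaces facing away get no light
    r, g, b = (min(255, round(c * weight)) for c in rgb)
    return (r, g, b)
```

A pixel whose normal points straight at the light keeps its value, a pixel at 45 degrees is scaled by about 0.707, and a pixel perpendicular to or facing away from the light goes to black under this single-light model.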

The disclosed arbitrary view generation techniques are effectively based on perspective transformation and/or lookup operations of relatively low computational cost. An arbitrary (ensemble) view may be generated by simply selecting the correct pixels and appropriately populating the arbitrary view being generated with those pixels. In some cases, pixel values may optionally be scaled, e.g., if lighting is being adjusted. The low storage and processing overhead of the disclosed techniques facilitates fast, real-time or on-demand generation of arbitrary views of complex scenes whose quality is comparable to that of the high-definition reference views from which they are generated.

As described, in some embodiments, assembling an ensemble or composite object or scene comprises specifying the plurality of objects or assets comprising the ensemble using orthographic views. Orthographic views facilitate accurate placement and alignment of the plurality of objects or assets in the ensemble scene. The orthographic view of the ensemble scene may then be transformed into any arbitrary camera perspective, for example, to generate any desired or requested perspective. Transforming the ensemble view into a prescribed camera perspective may comprise individually transforming each of the plurality of objects or assets comprising the ensemble scene into the prescribed perspective using the previously described techniques. Although the previously described techniques for generating an arbitrary ensemble view are relatively efficient, even greater efficiency may be desirable in certain applications in which it is advantageous to generate an output nearly instantaneously, or at least very quickly, with a latency penalty that is barely detectable by the end user, such as applications that provide interactive, real-time experiences to users.

In some embodiments, further improvements in efficiency may be facilitated at least in part by eliminating the processing associated with transforming (e.g., orthographic or other views of) most of the plurality of objects or assets comprising the ensemble scene into a prescribed arbitrary perspective. Instead, when generating an output ensemble view or image representing the prescribed arbitrary perspective, the available existing view of an object or asset that is nearest or closest to the prescribed arbitrary perspective, given the prescribed position and orientation of that object or asset in the ensemble scene, is used for that object or asset. In most cases, the resulting output ensemble view does not have a perfectly accurate perspective but provides an approximation that is acceptable for many applications, with substantially lower latency than generating an output with a perfectly accurate perspective. Generating an approximation of an arbitrary ensemble view for an arbitrary camera pose by using a maximally quantized subset of the already existing reference views of one or more of the objects or assets comprising the ensemble is described in further detail next.

FIG. 5 is a high-level flow chart illustrating an embodiment of a process for generating an arbitrary ensemble view. In some embodiments, process 500 is employed to efficiently generate an output image of an ensemble scene based at least in part on appropriately combining or compositing a single best-matching existing view of each of at least one or more, if not most or all, of the objects or assets comprising the ensemble scene.

Process 500 starts at step 502, at which a request for a prescribed perspective of an ensemble scene is received. The requested prescribed perspective of the ensemble scene comprises a camera perspective or pose selected or specified for the ensemble scene and may, in general, comprise any arbitrary view. An arbitrary view in the given context comprises any desired view or perspective of a scene whose specification or camera pose is not known in advance of being requested. The ensemble scene comprises a combined view of a plurality of independent objects or assets. Generally, the specification of an independent object or asset comprises a set of existing reference images or views of the individual object or asset having different camera perspectives and corresponding metadata, one or more of which may be used to generate or specify the portion of the ensemble scene associated with that object or asset. In some embodiments, the request of step 502 is received from an interactive mobile or web-based application that facilitates manipulation of the camera angle or pose in ensemble scene space and/or placement of a plurality of objects or assets to create a composite or ensemble scene. For example, the request may be received from a visualization or modeling application or an augmented reality (AR) application. In some embodiments, the request of step 502 is received with respect to an orthographic view of the ensemble scene, since orthographic views facilitate easier manipulation, placement, and alignment of the plurality of objects or assets comprising the ensemble scene.

At step 504, a closest or nearest matching existing reference image or view is selected for each of at least a subset of the one or more objects or assets comprising the ensemble scene. Step 504 may be performed serially and/or in parallel for the individual or independent objects or assets comprising the ensemble scene. In some embodiments, only one, i.e., a single, existing reference image or view is selected for an object or asset: the one that best matches the requested prescribed perspective given the pose of that object or asset in ensemble scene space. The ensemble scene space comprises an ensemble scene coordinate system having a prescribed origin defined in any appropriate manner, such as the center (e.g., the center of mass) of the ensemble scene. To select a closest matching existing reference image or view for an object or asset at step 504, the position and orientation, or pose, of that object or asset with respect to the ensemble scene coordinate system is determined and then translated, transformed, or correlated to an equivalent pose in the object's or asset's own coordinate system, with which its existing reference images or views are associated. Thus, a simple camera metric computation of relatively low computational complexity, based on the requested perspective and the relative object or asset pose in the ensemble scene, is performed so that a closest matching existing reference image or view can be selected at step 504.
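The selection of step 504 can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the `ReferenceView` record, the helper names, the use of a rotation matrix for the asset pose, and the choice of angular distance as the matching measure are all assumptions made for illustration. The sketch expresses the requested camera position in the asset's own coordinate system and then picks the stored view whose camera direction differs from it by the smallest angle.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceView:
    """Hypothetical record for one existing reference image of an asset."""
    name: str
    direction: tuple  # unit vector from asset origin toward the camera, asset coordinates


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def yaw_rotation(degrees):
    """Asset-to-scene rotation about the vertical (z) axis, as a 3x3 row-major matrix."""
    a = math.radians(degrees)
    return ((math.cos(a), -math.sin(a), 0.0),
            (math.sin(a), math.cos(a), 0.0),
            (0.0, 0.0, 1.0))


def camera_direction_in_asset_frame(camera_pos, asset_pos, asset_rot):
    """Express the asset-to-camera direction in the asset's own coordinate system."""
    offset = tuple(c - p for c, p in zip(camera_pos, asset_pos))
    # Multiplying by the transpose of asset_rot inverts a pure rotation.
    local = tuple(sum(asset_rot[r][c] * offset[r] for r in range(3)) for c in range(3))
    return normalize(local)


def angle_between(u, v):
    """Angle in degrees between two unit vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


def select_closest_view(views, camera_pos, asset_pos, asset_rot):
    """Return the reference view with the smallest angular error, and that error."""
    wanted = camera_direction_in_asset_frame(camera_pos, asset_pos, asset_rot)
    best = min(views, key=lambda view: angle_between(view.direction, wanted))
    return best, angle_between(best.direction, wanted)


# Four hypothetical reference views taken at 90-degree steps around an asset.
VIEWS = [
    ReferenceView("az000", (1.0, 0.0, 0.0)),
    ReferenceView("az090", (0.0, 1.0, 0.0)),
    ReferenceView("az180", (-1.0, 0.0, 0.0)),
    ReferenceView("az270", (0.0, -1.0, 0.0)),
]
```

Because each asset is handled independently, a call such as `select_closest_view(VIEWS, camera_pos, asset_pos, yaw_rotation(90.0))` could be issued per asset either serially or in parallel.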

One or more criteria and/or thresholds may be defined to determine or identify a closest matching existing reference image or view for an object or asset. In some cases, an existing reference image or view is selected at step 504 only if one or more such thresholds are satisfied. Ideally, an exact match is found and selected at step 504. In some cases, however, the selection criteria and/or thresholds may not be satisfied, such as when the available existing reference image dataset is too sparse (for example, the available existing reference images or views of the object or asset differ too much from the requested perspective) or when no reference images or views are available for the object or asset. In some such cases, a closest matching placeholder or ghost image or view of the object or asset is selected at step 504 instead. Such a placeholder image or view represents the shape of the object or asset but lacks other attributes such as textures and optical properties. In some embodiments, a set of placeholder images spanning a sufficiently dense set of possible views around an object or asset (e.g., angles covering 360 degrees around the object or asset) is generated and stored for each unique object shape using a rendering technique of relatively low computational complexity. The placeholders are then used when fully rendered versions of an object or asset are unavailable or exhibit unacceptable deviations from the requested perspective.
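The threshold test and placeholder fallback described above might look like the following Python sketch. It simplifies views to azimuth angles on a single orbit around the asset, and the function names and the 10-degree default threshold are assumptions chosen for illustration; the description does not specify a threshold value.

```python
def angular_gap(a_deg, b_deg):
    """Smallest absolute difference between two azimuth angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def pick_view(requested_az, rendered_az, placeholder_az, max_gap_deg=10.0):
    """Prefer a fully rendered view; fall back to a placeholder (ghost) view.

    rendered_az:    azimuths of available fully rendered reference images
    placeholder_az: azimuths of the dense, cheaply rendered placeholder set
    max_gap_deg:    assumed acceptance threshold, not taken from the source
    """
    if rendered_az:
        best = min(rendered_az, key=lambda az: angular_gap(az, requested_az))
        if angular_gap(best, requested_az) <= max_gap_deg:
            return ("rendered", best)
    # No rendered view is available or close enough: use the nearest placeholder.
    best = min(placeholder_az, key=lambda az: angular_gap(az, requested_az))
    return ("placeholder", best)


# Hypothetical data: sparse rendered views, placeholders every 5 degrees.
RENDERED = [0, 90, 180, 270]
PLACEHOLDERS = list(range(0, 360, 5))
```

With this data, a request near an existing rendered view returns that view, while a request midway between rendered views falls back to the shape-only placeholder closest to the requested angle.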

At step 506, an output image of the ensemble scene is generated for the requested prescribed perspective at least in part by appropriately combining or compositing the closest matching existing reference images or views, selected at step 504, of the objects or assets comprising the ensemble scene. Step 506 may include appropriately scaling or resizing the closest matching existing reference image or view selected for an object or asset and/or determining the location or position in the ensemble view at which to paste or composite it. In most cases, the generated output image of the ensemble scene closely approximates the requested prescribed perspective. Because most of the objects or assets comprising the ensemble scene are represented in the output image by their nearest or closest available existing poses rather than being rigorously rendered or generated, these objects or assets are not perfectly perspective correct. That is, in most cases, these objects or assets do not have the requested prescribed perspective in the output image unless an exact match is found among the available existing images or views. Although the vanishing points of these objects or assets do not all converge to the same point in the output image, in most cases the objects or assets are offset or skewed by a sufficiently small amount (e.g., a few degrees) that the human visual system is fooled into perceiving the output image as being, for the most part, perspective correct.
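The scaling and pasting of step 506 can be sketched in Python as follows. This is an illustrative simplification rather than the disclosed method: images are plain nested lists with `None` marking transparent pixels, scaling is nearest-neighbor, and layers are drawn far-to-near using a per-layer depth value, which is an assumption the description does not state.

```python
def scale_nearest(image, factor):
    """Resize a 2D list of pixels by `factor` using nearest-neighbor sampling."""
    height, width = len(image), len(image[0])
    new_h = max(1, round(height * factor))
    new_w = max(1, round(width * factor))
    return [
        [image[min(height - 1, int(y / factor))][min(width - 1, int(x / factor))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def composite(canvas_size, layers):
    """Paste each selected reference image into the ensemble view.

    canvas_size: (width, height) of the output image
    layers:      list of (image, scale, (left, top), depth) tuples; `depth` is a
                 hypothetical distance from the camera used only for draw order
    """
    width, height = canvas_size
    canvas = [[None] * width for _ in range(height)]
    # Draw the farthest layer first so nearer assets overwrite it.
    for image, scale, (left, top), _depth in sorted(
            layers, key=lambda layer: layer[3], reverse=True):
        scaled = scale_nearest(image, scale)
        for y, row in enumerate(scaled):
            for x, pixel in enumerate(row):
                cx, cy = left + x, top + y
                # Skip transparent pixels and anything outside the canvas.
                if pixel is not None and 0 <= cx < width and 0 <= cy < height:
                    canvas[cy][cx] = pixel
    return canvas
```

A production system would more likely use an imaging library with alpha blending; the nested-list form is used here only to keep the sketch self-contained.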

Coherence of the output image of the ensemble scene is further facilitated by generating at least some portions of the ensemble scene in a globally consistent or similar manner, which further aids human interpretation of the output image as substantially visually accurate. For example, one or more objects or assets comprising the ensemble scene and/or (flat or other) surfaces, structural elements, global features, etc., comprising the ensemble scene may be rigorously rendered or generated to be perspective correct, i.e., to have the requested prescribed perspective rather than an approximation of it. For example, if the ensemble scene comprises a space such as a room, the walls, ceiling, floor, rugs, wall hangings, etc., may be generated using the camera pose of the requested perspective and thus be accurately represented in the output image of the ensemble scene generated at step 506. Moreover, the output image of the ensemble scene may include a global light position that, upon relighting using available metadata such as surface normal vectors, affects all portions of the scene in a similar and consistent manner. Thus, by generating some portions of the ensemble view in a global or perspective-correct manner and representing most of the independent objects comprising the ensemble view by their best approximations, an output that in many cases is nearly indistinguishable from a perfectly perspective-correct version is generated. In some cases, slight distortions may be visible but remain acceptable in certain applications that do not require a perfectly accurate view, such as mood board applications or space/room planning applications, in which a designer or user benefits from seeing an ensemble of objects or assets together regardless of exact perspective. Furthermore, the repository or database of available existing images or views grows over time, and thus the disclosed techniques will continue to generate outputs that increasingly accurately represent requested prescribed perspectives. In the best case, exact matches are found for all objects or assets and used to generate an output image that actually has the requested prescribed perspective rather than an approximation of it.
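As one way to picture the global relighting mentioned above, the following Python sketch applies a single light to every pixel of a composited image using per-pixel surface normals. The Lambertian (diffuse) shading model, the use of a light direction rather than a light position, the ambient term, and all names are assumptions made for illustration; the description only states that surface normal metadata may be used for relighting.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def relight(albedo, normals, light_dir, ambient=0.1):
    """Shade every pixel with the same light so all assets are lit consistently.

    albedo:    2D list of base intensities in [0, 1]
    normals:   2D list of surface normal vectors, one per pixel
    light_dir: vector pointing from the surface toward the (assumed) light
    ambient:   assumed constant fill light so unlit pixels are not fully black
    """
    light = normalize(light_dir)
    shaded = []
    for albedo_row, normal_row in zip(albedo, normals):
        row = []
        for base, normal in zip(albedo_row, normal_row):
            n = normalize(normal)
            # Lambertian term: cosine of the angle between normal and light.
            diffuse = max(0.0, sum(a * b for a, b in zip(n, light)))
            row.append(min(1.0, base * (ambient + (1.0 - ambient) * diffuse)))
        shaded.append(row)
    return shaded
```

Because the same `light_dir` is used for every pixel regardless of which asset the pixel came from, moving the light changes the shading of all parts of the scene together.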

Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.

Claims

A method comprising:
receiving a request for a prescribed perspective of an ensemble scene comprising a plurality of assets; and
generating an output image of the ensemble scene approximating the requested prescribed perspective based at least in part on combining a single existing image of each of at least a subset of the plurality of assets.

The method of claim 1, wherein the request is received for an orthographic view of the ensemble scene.

The method of claim 2, wherein the orthographic view of the ensemble scene comprises combined orthographic views of the plurality of assets.

The method of claim 1, further comprising selecting the single existing image of each of at least the subset of the plurality of assets.

The method of claim 4, wherein the selecting comprises selecting an exact match with the requested prescribed perspective.

The method of claim 4, wherein the selecting comprises selecting a closest or nearest available match with the requested prescribed perspective.

The method of claim 4, wherein the selecting comprises selecting based on poses of associated assets in the ensemble scene.

The method of claim 4, wherein the selecting comprises selecting an existing rotated image of an associated asset.

The method of claim 4, wherein the selecting comprises selecting a closest or nearest available match with the requested prescribed perspective based on a pose of an associated asset in the ensemble scene.

The method of claim 1, wherein generating the output image of the ensemble scene comprises scaling the single existing image of one or more of the subset of assets.

The method of claim 1, wherein generating the output image of the ensemble scene comprises resizing the single existing image of one or more of the subset of assets.

The method of claim 1, wherein generating the output image of the ensemble scene comprises determining a position in the ensemble scene at which to include the single existing image of each of at least the subset of assets.

The method of claim 1, wherein the combining comprises compositing.

The method of claim 1, wherein generating the output image of the ensemble scene comprises generating a view of at least one of the plurality of assets having the requested prescribed perspective.

The method of claim 14, wherein the view is generated using a plurality of existing images of the at least one asset.

The method of claim 1, wherein generating the output image of the ensemble scene comprises generating at least a portion of the ensemble scene to have the requested prescribed perspective.

The method of claim 16, wherein the portion comprises a surface of the ensemble scene.

The method of claim 16, wherein the portion comprises structural elements of the ensemble scene.

The method of claim 16, wherein the portion comprises a global feature of the ensemble scene.

The method of claim 1, further comprising globally relighting the generated output image of the ensemble scene.

The method of claim 1, wherein the output image comprises a frame of a video sequence.

A system, comprising:
a processor configured to:
receive a request for a prescribed perspective of an ensemble scene comprising a plurality of assets; and
generate an output image of the ensemble scene approximating the requested prescribed perspective based at least in part on combining a single existing image of each of at least a subset of the plurality of assets; and
a memory coupled to the processor and configured to provide the processor with instructions.

A computer program product embodied in a non-transitory computer-readable medium and comprising computer instructions for:
receiving a request for a prescribed perspective of an ensemble scene comprising a plurality of assets; and
generating an output image of the ensemble scene approximating the requested prescribed perspective based at least in part on combining a single existing image of each of at least a subset of the plurality of assets.